New scrapers:
- GBICS.com (BigCommerce, GBP prices, 10 categories, 78 products)
- Juniper HCT (Next.js SSR parser, 475 transceivers with specs/EOL)
- SFPcables.com (Magento store, 16 categories, 78 products)
- Fluxlight (BigCommerce, 6 pages, 118 products)
- Champion ONE (compatible vendor scraper)

Scraper fixes:
- 10Gtek: rewritten to parse HTML spec tables (152 products)
- Flexoptix: fix price extraction from Magento Hyva HTML
- Register all scrapers in CLI (--gbics, --juniper, --sfpcables, etc.)

Hype Cycle Engine enhancements:
- Data-driven enrichment from scraped vendor/price data
- Revenue lifecycle prediction (peak year, decline, revenue index)
- Regional adoption model (NA, China, APAC, Europe, RoW with lag coefficients)
- New API endpoints: /enriched, /lifecycle, /regional/:tech

DB growth: 89 → 1,168 transceivers, 0 → 416 prices, 6 vendors
Qdrant: 1,162 products embedded with nomic-embed-text
Research: Norton-Bass model, standards-to-market timelines, hype signals

# Hype Cycle Signal Research: Quantifiable Data Inputs for Automatic Technology Positioning

**Date:** 2026-03-28
**For:** Transceiver Intelligence Platform (TIP) — Hype Cycle Engine
**Status:** Deep Research Complete — Ready for Implementation Planning

---

## Executive Summary

This document catalogs **10 quantifiable signal categories** that can feed the TIP Hype Cycle Engine to automatically position optical transceiver technologies (400G, 800G, 1.6T, QSFP-DD, OSFP, silicon photonics, coherent pluggable, co-packaged optics, etc.) on a Norton-Bass-derived hype cycle.

**Key finding:** A composite of 5-6 signals provides robust positioning. No single signal is sufficient alone. The recommended **Phase 1 implementation** (3 signals, all free, all validated) can be built in ~2 weeks.

---

## Signal Catalog

---

### 1. PATENT DATA (Technology Innovation Signal)

**What it measures:** R&D investment intensity, innovation velocity, technology maturation
**Hype cycle relevance:** Patents LEAD actual market adoption by 3-5 years. Patent filing surges correlate with "Technology Trigger" and early "Peak of Inflated Expectations."

#### Data Source: USPTO PatentsView API (migrating to data.uspto.gov March 2026)

| Attribute | Detail |
|-----------|--------|
| **API URL** | `https://search.patentsview.org/api/v1/patent/` |
| **Auth** | API key required (header `X-Api-Key`). Keys are free, but issuance of new keys is temporarily suspended during the migration to data.uspto.gov |
| **Rate Limit** | 45 requests/minute |
| **Update Frequency** | Quarterly |
| **Cost** | Free (CC BY 4.0 license) |
| **Python Library** | `requests` (REST API), `patentsview2` (R package, no maintained Python equivalent) |
| **Implementation Complexity** | 2/5 |

#### Relevant CPC Classes for Optical Transceivers

| CPC Class | Description |
|-----------|-------------|
| **H04B10** | Transmission systems employing electromagnetic waves other than radio waves (optical communication) |
| **G02B6** | Light guides; structural details of fibre-optic arrangements |
| **H01S5** | Semiconductor lasers (VCSELs, DFB, EML — core transceiver components) |
| **H04J14** | Optical multiplex systems (WDM, DWDM) |
| **G02F1** | Devices or arrangements for the control of light intensity (modulators) |

#### Queryable Metrics

1. **Patent Filing Velocity** — Count of new patent applications per CPC class per quarter
2. **Patent Grant Rate** — Ratio of grants to applications (maturity indicator)
3. **Citation Velocity** — How quickly new patents cite each other (hot field indicator)
4. **Technology Cycle Time (TCT)** — Median age of citations (shorter = faster-moving field)
5. **Assignee Concentration** — Herfindahl index of patent holders (few holders = early stage; many = maturation)

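Two of the metrics above reduce to short formulas once patent records are in hand. The following is a minimal sketch, assuming each patent has already been reduced to an assignee name and a list of cited-patent years; the function names and input shapes are illustrative and not part of the PatentsView API.

```python
from collections import Counter
from statistics import median


def herfindahl_index(assignees):
    """Herfindahl index of assignee shares: sum of squared shares, in (0, 1].

    `assignees` holds one assignee name per patent (hypothetical input shape).
    Values near 1 mean a few holders dominate; values near 0 mean many holders.
    """
    counts = Counter(assignees)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no patents supplied")
    return sum((n / total) ** 2 for n in counts.values())


def technology_cycle_time(citing_year, cited_years):
    """Median age in years of the patents cited by one patent (TCT)."""
    if not cited_years:
        raise ValueError("patent cites no prior patents")
    return median(citing_year - year for year in cited_years)
```

For example, `herfindahl_index(["A", "A", "B", "C"])` gives 0.375 (shares 0.5, 0.25, 0.25), and a 2024 patent citing patents from 2020, 2022 and 2015 has a TCT of 4 years.
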
#### Example Query (PatentsView Search API)

```
GET https://search.patentsview.org/api/v1/patent/
?q={"_and":[{"_begins":{"cpc_at_issue.cpc_subclass_id":"H04B10"}},{"_gte":{"patent_date":"2024-01-01"}},{"_text_any":{"patent_abstract":"transceiver 400G 800G QSFP OSFP"}}]}
&f=["patent_id","patent_date","patent_title","assignees.assignee_organization"]
&o={"size":100}
```

Response includes `total_hits` for counting.

#### Academic Validation

- **BIMATEM method** (Manrique-Castillo et al., Scientometrics 2018): Patent records of mature technologies display **logistic growth** behavior. Fitting logistic curves to patent counts per technology enables TRL assignment.
- **Gao et al. (2013)**: Using multiple patent-based indicators with a nearest-neighbour classifier for technology life cycle stage classification.
- **Technology Cycle Time**: Kayal's TCT indicator — median citation age predicts technology maturity phase.

#### Correlation with Hype Cycle Position

- **High filing velocity + low grant rate** = Technology Trigger / early Peak
- **Peak filing count reached** = Peak of Inflated Expectations
- **Declining filings + rising citations** = Trough / early Slope
- **Stable filings + high citation density** = Plateau of Productivity

---

### 2. ACADEMIC PUBLICATION METRICS (Knowledge Creation Signal)

**What it measures:** Scientific research intensity, knowledge maturation
**Hype cycle relevance:** Publication counts follow a logistic S-curve. The inflection point of the S-curve corresponds roughly to the transition from Peak to Trough.

#### Data Source: Semantic Scholar API (VALIDATED — working)

| Attribute | Detail |
|-----------|--------|
| **API URL** | `https://api.semanticscholar.org/graph/v1/paper/search/bulk` |
| **Auth** | None required (public). API key available for a dedicated rate limit |
| **Rate Limit** | 1000 req/sec pool shared by all unauthenticated users; 1 req/sec dedicated with a free API key |
| **Update Frequency** | Continuous (near real-time) |
| **Cost** | Free |
| **Coverage** | ~200 million papers across all disciplines |
| **Python Library** | `semanticscholar` (PyPI) or direct `requests` |
| **Implementation Complexity** | 1/5 |

#### Validated Paper Counts (tested 2026-03-28)

| Technology | Total Papers | Maturity Signal |
|------------|-------------|-----------------|
| silicon photonics transceiver | 905 | Mature (deep research base) |
| 100G transceiver | 144 | Late maturity |
| 400G transceiver | 100 | Growth phase |
| 200G transceiver | 43 | Moderate |
| coherent pluggable optics | 40 | Growth phase |
| 800G transceiver | 39 | Early growth |
| QSFP-DD optical | 26 | Emerging |
| OSFP transceiver | 11 | Very early |
| 1.6T transceiver optical | 10 | Pre-commercial |

#### Year-by-Year Trend (400G transceiver, validated)

| Year | Papers | Signal |
|------|--------|--------|
| 2018 | 10 | Early research |
| 2019 | 7 | Stable |
| 2020 | 7 | Stable |
| 2021 | 9 | Slight increase |
| 2022 | 15 | Growth spike |
| 2023 | 6 | Decline |
| 2024 | 8 | Recovery |
| 2025 | 12 | Resurgence |

This pattern (spike in 2022, decline 2023, recovery 2024-25) maps well to the 400G transition from Peak to Slope of Enlightenment.

#### Key Metrics to Extract

1. **Annual paper count** per technology keyword
2. **Rate of change** (first derivative — acceleration/deceleration)
3. **Citation count distribution** — highly cited papers = foundational work = maturation
4. **Author diversity** — many unique authors = broad interest = growth phase
5. **Venue distribution** — OFC/ECOC papers vs. general journals

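As an illustration of the first two metrics, the sketch below computes the year-over-year change and the peak year from the validated 400G counts in the table above. The function names are illustrative, and the code assumes the input covers consecutive years.

```python
# Annual paper counts for "400G transceiver", copied from the table above.
PAPERS_400G = {2018: 10, 2019: 7, 2020: 7, 2021: 9,
               2022: 15, 2023: 6, 2024: 8, 2025: 12}


def year_over_year(counts):
    """First difference of annual counts, keyed by the later year."""
    years = sorted(counts)
    return {year: counts[year] - counts[prev]
            for prev, year in zip(years, years[1:])}


def peak_year(counts):
    """Year with the highest paper count (earliest year wins ties)."""
    return min(counts, key=lambda year: (-counts[year], year))


if __name__ == "__main__":
    print(year_over_year(PAPERS_400G))
    print(peak_year(PAPERS_400G))
```

On this data the largest rise is +6 into 2022 and the largest drop is -9 into 2023, which is the spike-then-decline pattern described above.
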
#### Supplementary Source: IEEE Xplore

- URL: `https://ieeexploreapi.ieee.org/api/v1/search/articles`
- API key required (free for research)
- Specifically covers OFC, ECOC, CLEO proceedings
- Higher signal quality for optical networking specifically

---

### 3. GOOGLE TRENDS (Public Interest / Hype Proxy)

**What it measures:** Search interest as a proxy for market attention and hype
**Hype cycle relevance:** Google Trends data directly models the "hype" component. Academic validation exists (Jun 2012, van Lente 2013).

#### Data Source: Google Trends via pytrends (VALIDATED — working)

| Attribute | Detail |
|-----------|--------|
| **API** | Unofficial (Google Trends web scraping via pytrends) |
| **Auth** | None |
| **Rate Limit** | ~10 requests/minute (unofficial, subject to blocking) |
| **Update Frequency** | Real-time (weekly/monthly granularity) |
| **Cost** | Free |
| **Python Library** | `pytrends` (PyPI, v4.9.2) |
| **Implementation Complexity** | 1/5 |

#### Validated Data (tested 2026-03-28)

**Batch 1 — Form Factors & Speeds (relative to each other):**

| Technology | Current Interest | Peak Value | Peak Date | Trajectory |
|------------|-----------------|------------|-----------|------------|
| silicon photonics | 100 (reference) | 100 | 2026-03 | Rising strongly |
| OSFP | 34 | 45 | 2024-05 | Peaked, declining |
| 800G transceiver | 10 | 10 | 2026-02 | Rising |
| QSFP-DD | 8 | 10 | 2025-11 | Declining from peak |
| 400G transceiver | 2 | 3 | 2025-12 | Low/stable (mature) |

**Batch 2 — Emerging Technologies:**

| Technology | Current Interest | Peak Value | Peak Date | Trajectory |
|------------|-----------------|------------|-----------|------------|
| co-packaged optics | 100 (reference) | 100 | 2026-03 | Rising strongly |
| coherent optics | 45 | 45 | 2026-03 | Rising |
| 1.6T ethernet | 5 | 14 | 2025-08 | Peaked, declining |
| 100G transceiver | 5 | 8 | 2026-02 | Low/stable |

#### Key Observations

- **OSFP peaked May 2024** — consistent with 802.3df approval (Feb 2024) driving peak hype
- **QSFP-DD declining from Nov 2025 peak** — market settling
- **co-packaged optics and silicon photonics surging** — current hype leaders
- **400G transceiver at floor** — fully mature, no hype left (Plateau of Productivity)
- **1.6T peaked Aug 2025** then declined — possible "Peak of Inflated Expectations" → Trough

#### Implementation Notes

- Normalize by comparing technologies against each other (relative index)
- Use monthly granularity for trend detection
- Calculate: peak detection, slope analysis, time-since-peak
- Combine with absolute volume signals (paper counts) since Google Trends is relative only
- **Limitation:** B2B niche terms have low search volumes — use broader terms ("silicon photonics" not "silicon photonics transceiver module QSFP-DD800")

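The three calculations named above (peak detection, slope analysis, time-since-peak) can be sketched on a plain list of monthly interest values, oldest first. This is a minimal illustration that does not call pytrends; the function names, the 6-month default window and the sample series are assumptions for the example.

```python
def peak_index(series):
    """Index of the highest value (first occurrence on ties)."""
    return max(range(len(series)), key=series.__getitem__)


def months_since_peak(series):
    """Number of monthly samples between the peak and the latest sample."""
    return len(series) - 1 - peak_index(series)


def recent_slope(series, window=6):
    """Least-squares slope (index points per month) over the last `window` samples."""
    ys = series[-window:]
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples to fit a slope")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


if __name__ == "__main__":
    interest = [10, 20, 45, 40, 38, 36, 34]  # hypothetical monthly values
    print(peak_index(interest), months_since_peak(interest),
          recent_slope(interest, window=4))
```

A series that is several months past its peak with a negative recent slope matches the "Peaked, declining" trajectory label used in the tables above.
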
#### Academic Validation

- **Jun (2012)**: "An empirical study of users' hype cycle based on search traffic" — validated Google Trends hype cycle matching for hybrid cars (*Scientometrics* 91(1), pp. 81-99)
- **van Lente, Spitters & Peine (2013)**: "Comparing technological hype cycles: Towards a theory" (*Technological Forecasting and Social Change* 80(8))
- **Choi & Varian (2010)**: "Predicting the Present with Google Trends" (foundational paper on search data as predictor)
- **Caveat**: Medeiros et al. (arXiv 2021) document preprocessing requirements for reliable forecasting from Trends data

---

### 4. NEWS/MEDIA VOLUME (Hype Amplification Signal)

**What it measures:** Trade press and media coverage volume and sentiment
**Hype cycle relevance:** News volume directly measures the "hype" dimension. Sentiment analysis distinguishes Peak (positive) from Trough (negative/absent).

#### Data Source A: GDELT DOC 2.0 API (VALIDATED — working, limited for niche B2B)

| Attribute | Detail |
|-----------|--------|
| **API URL** | `https://api.gdeltproject.org/api/v2/doc/doc` |
| **Auth** | None |
| **Rate Limit** | Reasonable (no published limit) |
| **Update Frequency** | Every 15 minutes |
| **Cost** | Free |
| **Coverage** | 100+ languages, 65 translated, millions of sources |
| **History** | Last 3 months reliably (older data not guaranteed) |
| **Python Library** | `gdeltdoc` (PyPI) or `gdeltPyR` (PyPI) |
| **Implementation Complexity** | 2/5 |

**Limitation for TIP:** GDELT covers general news very well but B2B optical transceiver coverage is sparse. Testing showed only 1 article for "400G optical" in 3 months. Better for broader terms like "silicon photonics" or "data center optics."

#### Data Source B: NewsAPI.org

| Attribute | Detail |
|-----------|--------|
| **API URL** | `https://newsapi.org/v2/everything` |
| **Free Tier** | 100 requests/day, 1-month history, 24h delay, dev-only |
| **Paid** | From $40/month |
| **Python** | `requests` (simple REST) |
| **Implementation Complexity** | 1/5 |

#### Data Source C: Trade Press RSS/Scraping (RECOMMENDED for optical)

Monitor these sources directly (Crawlee/Playwright — already in TIP architecture):

| Source | URL | Relevance |
|--------|-----|-----------|
| LightReading | lightreading.com | Primary (optical networking) |
| Fierce Telecom | fiercetelecom.com | Primary |
| Datacenter Dynamics | datacenterdynamics.com | Primary |
| SDxCentral | sdxcentral.com | Primary |
| Lightwave Online | lightwaveonline.com | Primary (optical specific) |
| Gazettabyte | gazettabyte.com | High (standards/specs) |
| Converge Digest | convergedigest.com | Moderate |
| Semiconductor Today | semiconductor-today.com | Moderate (component level) |

#### Metrics to Extract

1. **Article count per technology per month** (volume)
2. **Sentiment score** using VADER (lightweight) or FinBERT (more accurate)
3. **Source diversity** — how many different outlets cover the topic
4. **Headline vs. mention** — is the technology the headline or just mentioned?

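Metrics 1, 3 and 4 above can be computed with plain keyword matching over scraped articles. The sketch below assumes a hypothetical record shape (`source`, `published` as `YYYY-MM-DD`, `title`, `body`) rather than TIP's actual schema, and leaves sentiment scoring to a separate step.

```python
from collections import defaultdict


def news_metrics(articles, keyword):
    """Monthly article count, outlet diversity and headline share for one keyword.

    `articles` is a list of dicts with the hypothetical keys
    'source', 'published' (YYYY-MM-DD), 'title' and 'body'.
    """
    kw = keyword.lower()
    monthly = defaultdict(int)
    sources = set()
    matched = 0
    headline_hits = 0
    for article in articles:
        in_title = kw in article["title"].lower()
        in_body = kw in article["body"].lower()
        if not (in_title or in_body):
            continue
        matched += 1
        monthly[article["published"][:7]] += 1  # bucket by YYYY-MM
        sources.add(article["source"])
        headline_hits += in_title
    return {
        "monthly_counts": dict(monthly),
        "source_diversity": len(sources),
        "headline_share": headline_hits / matched if matched else 0.0,
    }
```

Substring matching is deliberately naive here; a production version would likely need term boundaries and synonym lists per technology.
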
#### Sentiment Analysis Tools

| Tool | Type | Cost | Accuracy | Speed |
|------|------|------|----------|-------|
| VADER | Rule-based | Free | Good for general | Very fast |
| FinBERT | Transformer | Free | Best for financial/tech | Moderate |
| Ollama (qwen2.5:14b) | LLM | Free (local) | Very good | Slow |
| TextBlob | Rule-based | Free | Basic | Very fast |

**Recommendation:** Use VADER for initial scoring, Ollama for nuanced classification on flagged articles.

---

### 5. VENDOR COUNT / SKU PROLIFERATION (Market Adoption Signal)

**What it measures:** Market entry velocity, competitive maturation, commoditization
**Hype cycle relevance:** This is THE strongest signal for distinguishing Slope of Enlightenment from Plateau of Productivity. Directly measurable from TIP's own scraper data.

#### Data Source: TIP's Own Scraper Database (ZERO ADDITIONAL COST)

| Attribute | Detail |
|-----------|--------|
| **Source** | TIP price_observations + vendor tables |
| **Auth** | Internal |
| **Update Frequency** | Real-time (5-15 min scraper intervals) |
| **Cost** | Already being collected |
| **Implementation Complexity** | 1/5 (data already exists) |

#### Metrics

1. **Vendor Count per Technology** — How many vendors sell a given form factor/speed
   - 1-3 vendors = Technology Trigger / early Peak
   - 4-10 vendors = Peak / early Slope
   - 10-30 vendors = Slope of Enlightenment
   - 30+ vendors = Plateau of Productivity

2. **SKU Growth Rate** — New product listings per month
   - Accelerating = Growth phase
   - Decelerating = Maturation
   - Flat = Plateau

3. **Price Coefficient of Variation (CV)** — Standard deviation / mean of prices across vendors
   - High CV (>0.5) = Early market, pricing uncertainty
   - Medium CV (0.2-0.5) = Competitive market
   - Low CV (<0.2) = Commodity market (Plateau)

4. **Price Decline Rate** — $/Gbps over time
   - Steep decline = Growth → Slope transition
   - Gradual decline = Slope
   - Flat = Plateau

5. **Geographic Vendor Distribution** — Chinese vendors entering = commoditization signal

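The vendor-count bands and price CV thresholds above translate directly into code. The sketch below is one possible reading: it uses the population standard deviation, and it assigns the overlapping boundary values (10 and 30 vendors) to the lower band, which is an assumption chosen to agree with the "Vendor count > 30" plateau rule in the phase classification table.

```python
from statistics import mean, pstdev


def price_cv(prices):
    """Coefficient of variation: population std dev / mean of vendor prices."""
    avg = mean(prices)
    if avg <= 0:
        raise ValueError("mean price must be positive")
    return pstdev(prices) / avg


def cv_band(cv):
    """Map a price CV to the market bands listed above."""
    if cv > 0.5:
        return "early market"
    if cv >= 0.2:
        return "competitive market"
    return "commodity market"


def vendor_count_phase(vendor_count):
    """Map a vendor count to a hype cycle phase (boundary handling is an assumption)."""
    if vendor_count <= 3:
        return "Technology Trigger / early Peak"
    if vendor_count <= 10:
        return "Peak / early Slope"
    if vendor_count <= 30:
        return "Slope of Enlightenment"
    return "Plateau of Productivity"
```

For example, three vendor prices of 50, 100 and 150 give a CV of about 0.41, which falls in the competitive band.
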
#### Why This Signal is Critical

This is **the only signal that directly measures actual market behavior** rather than proxies (search interest, papers, patents). Combined with price data, it provides ground truth for hype cycle calibration.

---

### 6. STANDARDS PROGRESS (Technology Readiness Signal)

**What it measures:** Standardization maturity as proxy for technology readiness
**Hype cycle relevance:** Standards progress is a LEADING indicator. "Study group formed" precedes market by 3-5 years.

#### Standards Phase Mapping to Hype Cycle

| Standards Phase | Typical Duration | Hype Cycle Phase |
|----------------|-----------------|------------------|
| Call for Interest / Study Group | 6-12 months | Pre-Trigger |
| Task Force Formed | 0 (milestone) | Technology Trigger |
| First Draft | 12-18 months | Peak of Inflated Expectations |
| Working Group Ballot | 6-12 months | Peak → Trough transition |
| Sponsor Ballot | 3-6 months | Trough → Slope |
| Standard Published | 0 (milestone) | Slope of Enlightenment |
| First Amendment | 12-24 months | Plateau of Productivity |

#### Current State (validated 2026-03-28)

| Technology | Standard | Status | Hype Phase Inference |
|------------|----------|--------|---------------------|
| 400G Ethernet | IEEE 802.3bs | Published Dec 2017 | Plateau |
| 800G Ethernet (100G/lane) | IEEE 802.3df | Published Feb 2024 | Slope of Enlightenment |
| 800G Ethernet (200G/lane) | IEEE 802.3dj | In progress, target Jul 2026 | Peak → Trough |
| 1.6T Ethernet | IEEE 802.3dj | In progress, target Jul 2026 | Peak of Inflated Expectations |
| 3.2T Ethernet | OIF/MSA discussions | Study group phase | Pre-Trigger |
| 400ZR Coherent | OIF IA published Apr 2020 | Published | Late Slope |

#### Trackable Standards Bodies

| Body | What to Track | URL |
|------|--------------|-----|
| **IEEE 802.3** | Task force status, ballot dates | ieee802.org/3/ |
| **OIF** | Implementation Agreements (IAs), CMIS versions | oiforum.com/technical-work/implementation-agreements-ias/ |
| **QSFP-DD MSA** | Spec revisions (now at QSFP-DD1600) | qsfp-dd.com |
| **OSFP MSA** | Spec revisions (now at Rev 5.21) | osfpmsa.org |
| **100G Lambda MSA** | FR/LR specs | 100glambda.com |

#### Implementation

- Maintain a manually-curated `standards_progress` table
- Update quarterly (standards move slowly)
- Each standard gets a numeric score: 0 (no activity) → 10 (published + amendments)
- **Implementation Complexity:** 2/5 (manual curation, low frequency)

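One way to turn the standards phases into the 0-10 score described above is a fixed lookup. Only the endpoints (0 for no activity, 10 for published with amendments) come from this document; the intermediate values and the phase keys below are illustrative assumptions that would need calibration.

```python
# Phase keys follow the standards phase mapping table; scores between the
# documented endpoints (0 and 10) are assumed, not derived from data.
STANDARDS_PHASE_SCORE = {
    "no_activity": 0,
    "study_group": 2,
    "task_force_formed": 3,
    "first_draft": 5,
    "working_group_ballot": 6,
    "sponsor_ballot": 7,
    "published": 8,
    "amended": 10,
}


def standards_score(phase):
    """Return the 0-10 score for a curated standards phase label."""
    try:
        return STANDARDS_PHASE_SCORE[phase]
    except KeyError:
        raise ValueError(f"unknown standards phase: {phase!r}") from None


def standards_signal(phase):
    """Rescale the 0-10 score to the 0-100 range used for other signals."""
    return standards_score(phase) * 10
```

Keeping the mapping in a single dictionary makes the quarterly manual review a matter of updating one phase label per standard.
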
---

### 7. JOB MARKET SIGNALS (Demand/Deployment Signal)

**What it measures:** Actual hiring demand for technology-specific skills
**Hype cycle relevance:** Job posting surges lag the Peak by 12-18 months and correlate with Slope of Enlightenment.

#### Data Sources

| Source | Cost | API | Quality |
|--------|------|-----|---------|
| **TheirStack** | Free tier available | REST API | Best (deduplication, 324k ATS platforms) |
| **FlyByAPIs** | Free (200 req/month) | RapidAPI | Good (Google Jobs index) |
| **Sumble** | Free 500 credits/month | REST API | Good (LinkedIn + hiring signals) |
| **LinkedIn Talent** | Enterprise ($$$) | Partner only | Best but inaccessible |
| **Indeed Job Sync** | Free (partner) | REST API | Posting-focused, not search |

**Recommended:** TheirStack or FlyByAPIs for free tier.

#### Metrics

1. **Job posting count** per technology keyword per month
2. **Job posting velocity** — rate of change
3. **Salary range** — higher salaries = talent scarcity = early adoption
4. **Geographic distribution** — US/EU = early; APAC = maturation

#### Implementation Complexity: 3/5

---

### 8. SOCIAL MEDIA / COMMUNITY SIGNALS (Practitioner Interest)

**What it measures:** Operator and engineer discussion intensity
**Hype cycle relevance:** Community buzz leads deployment by 6-12 months.

#### Data Sources

| Source | API | Cost | Python Library |
|--------|-----|------|----------------|
| **Reddit** (r/networking, r/homelab, r/datacenter) | Reddit API via PRAW | Free | `praw` |
| **NANOG mailing list** | No API (scrape archives) | Free | `requests` + `beautifulsoup4` |
| **LinkedIn** | No public search API | N/A | N/A |

#### Reddit via PRAW

- Free Reddit API access (60 req/min)
- Search subreddits by keyword, filter by time
- Count posts + comments mentioning technology terms
- **PRAWtools** provides keyword alerts and subreddit statistics
- Limitation: search listings return at most ~1,000 posts per query

#### NANOG Mailing List

- Archives available at `nanog.org/nanog-mailing-list/list-archives/` and `marc.info`
- Monthly text file downloads available
- ETH Zurich thesis (Gehri 2021) demonstrated NLP topic modeling and sentiment analysis on 89,000+ NANOG emails
- No API — requires scraping or bulk download
- Highly relevant for optical networking technology adoption signals

#### Metrics

1. **Post/email count per technology per month**
2. **Engagement ratio** (comments/votes per post)
3. **Sentiment** (positive deployment reports vs. complaints)
4. **Question vs. statement ratio** (questions = early adoption; statements = maturity)

#### Implementation Complexity: 3/5

---

### 9. EARNINGS CALL / FINANCIAL SIGNALS (Enterprise Adoption Signal)

**What it measures:** How often public companies mention technologies in financial disclosures
**Hype cycle relevance:** Earnings call mentions are a LAGGING indicator that confirms Slope of Enlightenment → Plateau transition.

#### Data Source A: SEC EDGAR EFTS (VALIDATED — working, 899 filings found)

| Attribute | Detail |
|-----------|--------|
| **API URL** | `https://efts.sec.gov/LATEST/search-index` |
| **Auth** | None (free public API) |
| **Rate Limit** | ~10 requests/second (fair use) |
| **Update Frequency** | Real-time (new filings indexed immediately) |
| **Cost** | Free |
| **Coverage** | All SEC filings since ~1993 |
| **Python Library** | `requests` (direct) or `sec-api` (paid wrapper) |
| **Implementation Complexity** | 2/5 |

**Validated result:** Query for `"optical transceiver" OR "400G" OR "800G optics"` returned **899 filings** across 10-K, 10-Q, and 8-K forms.

#### Data Source B: Financial Modeling Prep (FMP)

| Attribute | Detail |
|-----------|--------|
| **API URL** | `https://financialmodelingprep.com/api/v3/earning_call_transcript/{SYMBOL}` |
| **Auth** | API key (free tier available) |
| **Cost** | Free tier, paid plans from $29/month |
| **Coverage** | Full earnings call transcripts for public companies |
| **Python Library** | `requests` |
| **Implementation Complexity** | 2/5 |

#### Target Companies for Optical Transceiver Mentions

| Ticker | Company | Relevance |
|--------|---------|-----------|
| COHR | Coherent Corp (formerly II-VI/Finisar) | Transceiver manufacturer |
| LITE | Lumentum | Laser/transceiver manufacturer |
| CSCO | Cisco | Network equipment + transceivers |
| JNPR | Juniper Networks | Network equipment |
| ANET | Arista Networks | Datacenter switching |
| AVGO | Broadcom | Transceiver silicon |
| INTC | Intel (Altera) | Silicon photonics |
| CIEN | Ciena | Coherent optics |
| INFN | Infinera | Coherent optics |
| AAOI | Applied Optoelectronics | Transceiver manufacturer |

#### Metrics

1. **Mention frequency** — count of technology term mentions per earnings call
2. **Mention sentiment** — positive/negative context around mentions
3. **First mention** — when a company first mentions a technology (leading indicator)
4. **Revenue attribution** — when companies break out revenue by technology generation

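Mention frequency, the first metric above, is a term count over transcript text. The sketch below is a minimal version, assuming the transcript is already available as a plain string; the boundary rules (no adjacent letters, digits, or a leading decimal point) are an assumption to keep "400G" from matching inside "1400G".

```python
import re


def mention_counts(transcript, terms):
    """Count case-insensitive, boundary-aware mentions of each term."""
    counts = {}
    for term in terms:
        pattern = re.compile(
            r"(?<![A-Za-z0-9.])" + re.escape(term) + r"(?![A-Za-z0-9])",
            re.IGNORECASE,
        )
        counts[term] = len(pattern.findall(transcript))
    return counts


if __name__ == "__main__":
    sample = ("Demand for 800G remains strong. 800G and 1.6T shipments grew; "
              "400G was flat.")  # hypothetical transcript excerpt
    print(mention_counts(sample, ["400G", "800G", "1.6T"]))
```

Tracking the earliest call in which a count becomes non-zero would give the "first mention" metric from the same function.
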
---

### 10. COMPOSITE SIGNAL ALGORITHM

#### Academic Foundation

**Ren (2015)**: "An Approach for Predicting Hype Cycle Based on Machine Learning" (CEUR-WS Vol-1437, IPAMIN 2015)
- Used SKNN (improved K-Nearest Neighbor) classifier
- Features extracted from paper data and patent data
- Achieved **67.24% precision, 68.46% recall** classifying technologies into 5 hype cycle phases
- Noted accuracy drops in phases 4-5 due to small training samples

**BIMATEM (Manrique-Castillo et al., Scientometrics 2018)**:
- Combines **three data streams**: scientific papers (logistic growth), patents (logistic growth), news (hype-type curve)
- Fits a logistic growth curve to paper/patent counts
- Fits a hype-type curve to news counts
- Assigns TRL (Technology Readiness Level) based on curve position
- Applied successfully to additive manufacturing technologies

**Composite Early Warning Index (CEWI) approach** (financial crisis literature):
- Uses PCA to synthesize diverse variables into a single latent factor
- Applicable to combining patent, publication, trends, and market signals

#### Recommended Algorithm: Weighted Multi-Signal Scoring

```
HypeScore(tech, t) = w1 * Patent_Signal(tech, t)
                   + w2 * Publication_Signal(tech, t)
                   + w3 * Trends_Signal(tech, t)
                   + w4 * News_Signal(tech, t)
                   + w5 * Vendor_Signal(tech, t)
                   + w6 * Standards_Signal(tech, t)
                   + w7 * Earnings_Signal(tech, t)
                   + w8 * Jobs_Signal(tech, t)
```

#### Signal Time Horizons and Weights

| Signal | Lead/Lag | Suggested Weight | Update Freq |
|--------|----------|-----------------|-------------|
| Patents | Leads by 3-5 years | 0.10 | Quarterly |
| Publications | Leads by 1-3 years | 0.10 | Monthly |
| Google Trends | Real-time | 0.20 | Monthly |
| News Volume | Real-time | 0.10 | Weekly |
| **Vendor Count/Price** | **Real-time** | **0.25** | **Daily** |
| Standards Progress | Leads by 2-4 years | 0.10 | Quarterly |
| Earnings Calls | Lags by 6-12 months | 0.10 | Quarterly |
| Job Postings | Lags by 12-18 months | 0.05 | Monthly |

**Vendor Count/Price gets the highest weight** because it is the only direct market measurement.

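The weighted sum with the suggested weights from the table can be sketched as follows. The signal keys are illustrative names, inputs are assumed to be already normalized to 0-100, and renormalizing the weights when some signals are missing (for example during Phase 1, when only three signals exist) is an assumption rather than something this document specifies.

```python
# Suggested weights from the table above; the dictionary keys are illustrative.
WEIGHTS = {
    "patents": 0.10,
    "publications": 0.10,
    "trends": 0.20,
    "news": 0.10,
    "vendor": 0.25,
    "standards": 0.10,
    "earnings": 0.10,
    "jobs": 0.05,
}


def hype_score(signals, weights=WEIGHTS):
    """Weighted average of the available 0-100 signals for one technology.

    Signals missing from `signals` are skipped and the remaining weights are
    renormalized so the result stays on the 0-100 scale (an assumption).
    """
    available = {name: w for name, w in weights.items() if name in signals}
    if not available:
        raise ValueError("no weighted signals supplied")
    total_weight = sum(available.values())
    weighted = sum(w * signals[name] for name, w in available.items())
    return weighted / total_weight
```

With only Phase 1 inputs such as `{"trends": 80, "vendor": 40, "publications": 60}`, the three weights (0.20, 0.25, 0.10) are rescaled to sum to 1 and the score is about 58.2.
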
#### Phase Classification Approach

1. **Normalize each signal** to 0-100 scale per technology
2. **Calculate rate of change** (first derivative) for each signal
3. **Calculate acceleration** (second derivative) for trend detection
4. **Apply phase classification rules:**

| Phase | Signal Pattern |
|-------|---------------|
| **Technology Trigger** | Patents rising, Publications starting, Trends near zero, Vendors 0-3, Standard in study group |
| **Peak of Inflated Expectations** | Trends peaking, News volume peaking, Publications rising fast, Vendors 3-8, Sentiment highly positive |
| **Trough of Disillusionment** | Trends declining, News declining, Sentiment negative, Vendors may decrease, Publications slowing |
| **Slope of Enlightenment** | Vendors growing steadily, Price CV declining, Earnings mentions increasing, Jobs increasing, Standards published |
| **Plateau of Productivity** | All signals stable, Price CV < 0.2, Vendor count > 30, Publications steady, Standards have amendments |

5. **Optional ML layer:** Train a Random Forest or Gradient Boosting classifier on known technology trajectories (100G, 40G, 10G historical data as training set)

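Steps 1-3 and one of the rules from the table can be sketched with discrete differences. This is a minimal illustration: min-max scaling is one possible normalization, the `tolerance` used to decide that a signal is "stable" is a hypothetical parameter, and only the Plateau rule is shown.

```python
def normalize_0_100(series):
    """Min-max scale a series to 0-100 (a flat series maps to all zeros)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [100.0 * (value - lo) / (hi - lo) for value in series]


def diff(series):
    """Discrete derivative: change between consecutive samples."""
    return [b - a for a, b in zip(series, series[1:])]


def is_plateau(price_cv, vendor_count, latest_rates, tolerance=1.0):
    """Plateau rule from the table: CV < 0.2, more than 30 vendors, stable signals.

    `latest_rates` holds the most recent first-derivative value of each
    normalized signal; `tolerance` (points per period) is an assumed cutoff.
    """
    stable = all(abs(rate) <= tolerance for rate in latest_rates)
    return price_cv < 0.2 and vendor_count > 30 and stable


if __name__ == "__main__":
    scaled = normalize_0_100([2, 4, 6])
    rate = diff(scaled)          # first derivative
    acceleration = diff(rate)    # second derivative
    print(scaled, rate, acceleration)
```

The remaining phases would follow the same pattern, each rule combining signal levels with the sign of their first and second differences.
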
#### Norton-Bass Integration

The composite signal feeds into the Norton-Bass multigenerational diffusion model:
- **p (innovation coefficient)** ← derived from patent/publication velocity
- **q (imitation coefficient)** ← derived from vendor count growth rate + Google Trends
- **M (market potential)** ← derived from addressable port count in deployed switches
- **tau (generation introduction time)** ← derived from IEEE standard publication date
- **Python:** `scipy.optimize.curve_fit` with Bass model function, or `bassmodeldiffusion` package (PyPI)

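For reference, the two-generation form of the Norton-Bass model can be written with the standard library alone. The sketch below uses the Bass cumulative adoption fraction F(t) and the substitution equations from Norton & Bass (1987), with p and q shared across generations; all parameter values in the usage example are hypothetical, and fitting them (for instance with `scipy.optimize.curve_fit`) is left out.

```python
import math


def bass_cdf(t, p, q):
    """Bass cumulative adoption fraction F(t); zero before introduction."""
    if t <= 0:
        return 0.0
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def norton_bass_two_gen(t, p, q, m1, m2, tau2):
    """Units in use for generation 1 and 2 at time t.

    S1(t) = m1 * F(t) * (1 - F(t - tau2))
    S2(t) = F(t - tau2) * (m2 + m1 * F(t))
    where tau2 is the introduction time of generation 2.
    """
    f1 = bass_cdf(t, p, q)
    f2 = bass_cdf(t - tau2, p, q)
    s1 = m1 * f1 * (1.0 - f2)
    s2 = f2 * (m2 + m1 * f1)
    return s1, s2


if __name__ == "__main__":
    # Hypothetical parameters: 400G as generation 1, 800G as generation 2.
    for year in (3, 8, 15):
        print(year, norton_bass_two_gen(year, p=0.03, q=0.4,
                                        m1=100.0, m2=150.0, tau2=5.0))
```

Before `tau2` the second generation is zero, and in the long run the second generation absorbs both market potentials (`m1 + m2`), which is the substitution behavior the model is meant to capture.
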
---

## Prioritized Implementation Plan

### Phase 1: Quick Wins (Week 1-2) — HIGH VALUE, LOW EFFORT

| # | Signal | API | Cost | Complexity | Why First |
|---|--------|-----|------|------------|-----------|
| 1 | **Google Trends** | pytrends | Free | 1/5 | Already validated, immediate hype measurement |
| 2 | **Vendor Count/Price** | Internal DB | Free | 1/5 | Data already being collected by TIP scrapers |
| 3 | **Semantic Scholar** | REST API | Free | 1/5 | Already validated, publication trend curves |

**Deliverable:** Basic hype cycle positioning for all tracked technologies using 3 signals.

### Phase 2: Depth Signals (Week 3-4) — HIGH VALUE, MODERATE EFFORT

| # | Signal | API | Cost | Complexity |
|---|--------|-----|------|------------|
| 4 | **SEC EDGAR EFTS** | REST API | Free | 2/5 |
| 5 | **Standards Progress** | Manual curation | Free | 2/5 |
| 6 | **Trade Press Scraping** | Crawlee (existing) | Free | 2/5 |

**Deliverable:** 6-signal composite with financial and standards validation.

### Phase 3: Extended Signals (Week 5-8) — MODERATE VALUE, HIGHER EFFORT

| # | Signal | API | Cost | Complexity |
|---|--------|-----|------|------------|
| 7 | **USPTO Patents** | PatentsView | Free (need API key) | 2/5 |
| 8 | **Reddit/PRAW** | Reddit API | Free | 3/5 |
| 9 | **Job Postings** | TheirStack/FlyByAPIs | Free tier | 3/5 |
| 10 | **Earnings Transcripts** | FMP | Free tier | 2/5 |

**Deliverable:** Full 10-signal composite with ML phase classifier.

### Phase 4: ML Calibration (Week 9-12)

1. Collect historical data for training technologies (10G, 40G, 100G — known trajectories)
2. Train Random Forest classifier on multi-signal features
3. Validate against known Gartner positioning (where available)
4. Implement Norton-Bass curve fitting with signal-derived parameters
5. Build confidence scoring and uncertainty quantification

---

## Key Python Dependencies

```
# Phase 1
pytrends==4.9.2       # Google Trends
semanticscholar       # Paper counts
requests              # General HTTP
scipy                 # Curve fitting (Norton-Bass)
numpy                 # Numerical
pandas                # Data manipulation

# Phase 2
beautifulsoup4        # HTML parsing (trade press)
vaderSentiment        # Sentiment analysis

# Phase 3
praw                  # Reddit API
bassmodeldiffusion    # Bass model fitting

# Phase 4
scikit-learn          # Random Forest, PCA
xgboost               # Gradient boosting
```

---

## Signal Correlation Summary

| Signal | Free? | Real-time? | Validated? | Hype Correlation | Implementation |
|--------|-------|-----------|------------|-----------------|---------------|
| Google Trends | Yes | Yes | YES | HIGH (academic proof) | 1/5 |
| Vendor Count/Price | Yes | Yes | YES (own data) | HIGHEST (direct) | 1/5 |
| Semantic Scholar | Yes | Yes | YES | MODERATE-HIGH | 1/5 |
| SEC EDGAR EFTS | Yes | Yes | YES | MODERATE | 2/5 |
| News/Trade Press | Yes | Weekly | Partial | HIGH | 2/5 |
| Standards Progress | Yes | Quarterly | YES | HIGH (leading) | 2/5 |
| Patents (USPTO) | Yes | Quarterly | Not yet (API key needed) | MODERATE-HIGH | 2/5 |
| Reddit/PRAW | Yes | Daily | Not yet | LOW-MODERATE | 3/5 |
| Job Postings | Free tier | Daily | Not yet | MODERATE | 3/5 |
| Earnings Calls | Free tier | Quarterly | Not yet | MODERATE | 2/5 |

---

## References

### Academic Papers
- Manrique-Castillo et al. (2018). "A bibliometric method for assessing technological maturity: the case of additive manufacturing." *Scientometrics* 117(3).
- Ren, Z. (2015). "An Approach for Predicting Hype Cycle Based on Machine Learning." CEUR-WS Vol-1437.
- Jun, S.P. (2012). "An empirical study of users' hype cycle based on search traffic." *Scientometrics* 91(1), 81-99.
- van Lente, H., Spitters, C., & Peine, A. (2013). "Comparing technological hype cycles." *Technological Forecasting and Social Change* 80(8).
- Gao, L. et al. (2013). "Technology life cycle analysis method based on patent documents." *Technological Forecasting and Social Change*.
- Huang et al. (2022). "Technology life cycle analysis: From the dynamic perspective of patent citation networks." *Technological Forecasting and Social Change*.
- Choi, H. & Varian, H. (2010). "Predicting the Present with Google Trends." SSRN.
- Dedehayir, O. & Steinert, M. (2016). "The hype cycle model: A review and future directions." *Technological Forecasting and Social Change* 108(C).
- Norton, J.A. & Bass, F.M. (1987). "A diffusion theory model of adoption and substitution for successive generations of high-technology products." *Management Science* 33(9).
- Gehri, L. (2021). "NANOG Mailing List Analysis." ETH Zurich Semester Thesis.

### API Documentation
- PatentsView Search API: https://search.patentsview.org/docs/
- Semantic Scholar API: https://api.semanticscholar.org/api-docs
- GDELT DOC API: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/
- SEC EDGAR EFTS: https://efts.sec.gov/LATEST/search-index
- Financial Modeling Prep: https://site.financialmodelingprep.com/developer/docs
- Google Trends (pytrends): https://pypi.org/project/pytrends/
- Reddit (PRAW): https://praw.readthedocs.io/
- IEEE 802.3dj Task Force: https://www.ieee802.org/3/dj/index.html
- OIF Implementation Agreements: https://www.oiforum.com/technical-work/implementation-agreements-ias/

### Python Libraries
- `pytrends`: https://pypi.org/project/pytrends/
- `semanticscholar`: https://pypi.org/project/semanticscholar/
- `gdeltdoc`: https://pypi.org/project/gdeltdoc/
- `praw`: https://pypi.org/project/praw/
- `bassmodeldiffusion`: https://github.com/marmiskarian/bassmodeldiffusion
- `vaderSentiment`: https://pypi.org/project/vaderSentiment/