New scrapers:
- GBICS.com (BigCommerce, GBP prices, 10 categories, 78 products)
- Juniper HCT (Next.js SSR parser, 475 transceivers with specs/EOL)
- SFPcables.com (Magento store, 16 categories, 78 products)
- Fluxlight (BigCommerce, 6 pages, 118 products)
- Champion ONE (compatible vendor scraper)

Scraper fixes:
- 10Gtek: rewritten to parse HTML spec tables (152 products)
- Flexoptix: fix price extraction from Magento Hyva HTML
- Register all scrapers in CLI (--gbics, --juniper, --sfpcables, etc.)

Hype Cycle Engine enhancements:
- Data-driven enrichment from scraped vendor/price data
- Revenue lifecycle prediction (peak year, decline, revenue index)
- Regional adoption model (NA, China, APAC, Europe, RoW with lag coefficients)
- New API endpoints: /enriched, /lifecycle, /regional/:tech

DB growth: 89 → 1,168 transceivers, 0 → 416 prices, 6 vendors
Qdrant: 1,162 products embedded with nomic-embed-text
Research: Norton-Bass model, standards-to-market timelines, hype signals
Hype Cycle Signal Research: Quantifiable Data Inputs for Automatic Technology Positioning
Date: 2026-03-28
For: Transceiver Intelligence Platform (TIP), Hype Cycle Engine
Status: Deep Research Complete, Ready for Implementation Planning
Executive Summary
This document catalogs 10 quantifiable signal categories that can feed the TIP Hype Cycle Engine to automatically position optical transceiver technologies (400G, 800G, 1.6T, QSFP-DD, OSFP, silicon photonics, coherent pluggable, co-packaged optics, etc.) on a Norton-Bass-derived hype cycle.
Key finding: A composite of 5-6 signals provides robust positioning. No single signal is sufficient alone. The recommended Phase 1 implementation (3 signals, all free, all validated) can be built in ~2 weeks.
Signal Catalog
1. PATENT DATA (Technology Innovation Signal)
What it measures: R&D investment intensity, innovation velocity, technology maturation
Hype cycle relevance: Patents LEAD actual market adoption by 3-5 years. Patent filing surges correlate with "Technology Trigger" and early "Peak of Inflated Expectations."
Data Source: USPTO PatentsView API (migrating to data.uspto.gov March 2026)
| Attribute | Detail |
|---|---|
| API URL | https://search.patentsview.org/api/v1/patent/ |
| Auth | API key required (header X-Api-Key). Keys are free, but issuance of new keys is temporarily suspended during the migration to data.uspto.gov |
| Rate Limit | 45 requests/minute |
| Update Frequency | Quarterly |
| Cost | Free (CC BY 4.0 license) |
| Python Library | requests (REST API), patentsview2 (R package, no maintained Python equivalent) |
| Implementation Complexity | 2/5 |
Relevant CPC Classes for Optical Transceivers
| CPC Class | Description |
|---|---|
| H04B10 | Transmission systems employing electromagnetic waves other than radio waves (optical communication) |
| G02B6 | Light guides; structural details of fibre-optic arrangements |
| H01S5 | Semiconductor lasers (VCSELs, DFB, EML — core transceiver components) |
| H04J14 | Optical multiplex systems (WDM, DWDM) |
| G02F1 | Devices or arrangements for the control of light intensity (modulators) |
Queryable Metrics
- Patent Filing Velocity — Count of new patent applications per CPC class per quarter
- Patent Grant Rate — Ratio of grants to applications (maturity indicator)
- Citation Velocity — How quickly new patents cite each other (hot field indicator)
- Technology Cycle Time (TCT) — Median age of citations (shorter = faster-moving field)
- Assignee Concentration — Herfindahl index of patent holders (few holders = early stage; many = maturation)
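Two of the metrics above reduce to short calculations once patent records are in hand. The following is a minimal sketch, assuming each patent record supplies an assignee name, a grant year, and the years of the patents it cites; the function names and sample values are hypothetical and not part of any PatentsView client.

```python
from collections import Counter
from statistics import median


def herfindahl_index(assignees):
    """Sum of squared patent shares per assignee (1.0 = single holder)."""
    counts = Counter(assignees)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one patent")
    return sum((n / total) ** 2 for n in counts.values())


def technology_cycle_time(citing_year, cited_years):
    """Median age, in years, of the patents cited by one patent."""
    return median(citing_year - year for year in cited_years)


# Hypothetical inputs: one assignee name per patent, and one citing patent.
sample_assignees = ["A", "A", "B", "C"]
hhi = herfindahl_index(sample_assignees)
tct = technology_cycle_time(2025, [2015, 2019, 2022])
```

A falling index over successive quarters would indicate more holders entering the field, which the list above reads as maturation.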
Example Query (PatentsView Search API)
GET https://search.patentsview.org/api/v1/patent/
?q={"_and":[{"_begins":{"cpc_at_issue.cpc_subclass_id":"H04B10"}},{"_gte":{"patent_date":"2024-01-01"}},{"_text_any":{"patent_abstract":"transceiver 400G 800G QSFP OSFP"}}]}
&f=["patent_id","patent_date","patent_title","assignees.assignee_organization"]
&o={"size":100}
Response includes total_hits for counting.
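The same query can be assembled programmatically so that the CPC class, start date, and keywords vary per technology. This is a sketch only: it mirrors the field names used in the example query above without verifying them against the migrated API, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://search.patentsview.org/api/v1/patent/"


def build_patent_params(cpc_prefix, since, keywords, size=100):
    """Return the q/f/o query parameters as JSON-encoded strings."""
    q = {"_and": [
        {"_begins": {"cpc_at_issue.cpc_subclass_id": cpc_prefix}},
        {"_gte": {"patent_date": since}},
        {"_text_any": {"patent_abstract": keywords}},
    ]}
    f = ["patent_id", "patent_date", "patent_title",
         "assignees.assignee_organization"]
    o = {"size": size}
    return {"q": json.dumps(q), "f": json.dumps(f), "o": json.dumps(o)}


def build_patent_url(cpc_prefix, since, keywords, size=100):
    """Return the full GET URL with percent-encoded parameters."""
    return BASE_URL + "?" + urlencode(
        build_patent_params(cpc_prefix, since, keywords, size))


params = build_patent_params("H04B10", "2024-01-01",
                             "transceiver 400G 800G QSFP OSFP")
url = build_patent_url("H04B10", "2024-01-01",
                       "transceiver 400G 800G QSFP OSFP")
# To send it (needs a key): requests.get(url, headers={"X-Api-Key": key})
```

Building the parameters separately from the URL keeps the query structure inspectable in tests without network access.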
Academic Validation
- BIMATEM method (Manrique-Castillo et al., Scientometrics 2018): Patent records of mature technologies display logistic growth behavior. Fitting logistic curves to patent counts per technology enables TRL assignment.
- Gao et al. (2013): Using multiple patent-based indicators with a nearest-neighbour classifier for technology life cycle stage classification.
- Technology Cycle Time: Kayal's TCT indicator — median citation age predicts technology maturity phase.
Correlation with Hype Cycle Position
- High filing velocity + low grant rate = Technology Trigger / early Peak
- Peak filing count reached = Peak of Inflated Expectations
- Declining filings + rising citations = Trough / early Slope
- Stable filings + high citation density = Plateau of Productivity
2. ACADEMIC PUBLICATION METRICS (Knowledge Creation Signal)
What it measures: Scientific research intensity, knowledge maturation
Hype cycle relevance: Publication counts follow a logistic S-curve. The inflection point of the S-curve corresponds roughly to the transition from Peak to Trough.
Data Source: Semantic Scholar API (VALIDATED — working)
| Attribute | Detail |
|---|---|
| API URL | https://api.semanticscholar.org/graph/v1/paper/search/bulk |
| Auth | None required (public). API key available for higher rate limits |
| Rate Limit | 1000 req/sec shared across all unauthenticated users; 1 req/sec with a free API key |
| Update Frequency | Continuous (near real-time) |
| Cost | Free |
| Coverage | ~200 million papers across all disciplines |
| Python Library | semanticscholar (PyPI) or direct requests |
| Implementation Complexity | 1/5 |
Validated Paper Counts (tested 2026-03-28)
| Technology | Total Papers | Maturity Signal |
|---|---|---|
| silicon photonics transceiver | 905 | Mature (deep research base) |
| 100G transceiver | 144 | Late maturity |
| 400G transceiver | 100 | Growth phase |
| 200G transceiver | 43 | Moderate |
| coherent pluggable optics | 40 | Growth phase |
| 800G transceiver | 39 | Early growth |
| QSFP-DD optical | 26 | Emerging |
| OSFP transceiver | 11 | Very early |
| 1.6T transceiver optical | 10 | Pre-commercial |
Year-by-Year Trend (400G transceiver, validated)
| Year | Papers | Signal |
|---|---|---|
| 2018 | 10 | Early research |
| 2019 | 7 | Stable |
| 2020 | 7 | Stable |
| 2021 | 9 | Slight increase |
| 2022 | 15 | Growth spike |
| 2023 | 6 | Decline |
| 2024 | 8 | Recovery |
| 2025 | 12 | Resurgence |
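This pattern (spike in 2022, decline in 2023, recovery in 2024-25) is consistent with 400G moving from the Peak toward the Slope of Enlightenment, although with annual counts this small (6-15 papers) it is a weak signal on its own.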
Key Metrics to Extract
- Annual paper count per technology keyword
- Rate of change (first derivative — acceleration/deceleration)
- Citation count distribution — highly cited papers = foundational work = maturation
- Author diversity — many unique authors = broad interest = growth phase
- Venue distribution — OFC/ECOC papers vs. general journals
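The first two metrics can be illustrated with the 400G year-by-year counts from the table above. This is a minimal sketch using plain year-over-year differences as the "rate of change"; the function names are hypothetical.

```python
# Annual paper counts for "400G transceiver", copied from the table above.
papers_400g = {2018: 10, 2019: 7, 2020: 7, 2021: 9,
               2022: 15, 2023: 6, 2024: 8, 2025: 12}


def year_over_year_change(counts):
    """Map each year (except the first) to its change versus the prior year."""
    years = sorted(counts)
    return {year: counts[year] - counts[prev]
            for prev, year in zip(years, years[1:])}


def peak_year(counts):
    """Return the year with the highest paper count."""
    return max(counts, key=counts.get)


delta = year_over_year_change(papers_400g)
peak = peak_year(papers_400g)
```

Here `delta[2022]` is +6 and `delta[2023]` is -9, which is the spike-then-decline pattern described for 400G.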
Supplementary Source: IEEE Xplore
- URL: https://ieeexploreapi.ieee.org/api/v1/search/articles
- API key required (free for research)
- Specifically covers OFC, ECOC, CLEO proceedings
- Higher signal quality for optical networking specifically
3. GOOGLE TRENDS (Public Interest / Hype Proxy)
What it measures: Search interest as a proxy for market attention and hype
Hype cycle relevance: Google Trends data directly models the "hype" component. Academic validation exists (Jun 2012, van Lente 2013).
Data Source: Google Trends via pytrends (VALIDATED — working)
| Attribute | Detail |
|---|---|
| API | Unofficial (Google Trends web scraping via pytrends) |
| Auth | None |
| Rate Limit | ~10 requests/minute (unofficial, subject to blocking) |
| Update Frequency | Real-time (weekly/monthly granularity) |
| Cost | Free |
| Python Library | pytrends (PyPI, v4.9.2) |
| Implementation Complexity | 1/5 |
Validated Data (tested 2026-03-28)
Batch 1 — Form Factors & Speeds (relative to each other):
| Technology | Current Interest | Peak Value | Peak Date | Trajectory |
|---|---|---|---|---|
| silicon photonics | 100 (reference) | 100 | 2026-03 | Rising strongly |
| OSFP | 34 | 45 | 2024-05 | Peaked, declining |
| 800G transceiver | 10 | 10 | 2026-02 | Rising |
| QSFP-DD | 8 | 10 | 2025-11 | Declining from peak |
| 400G transceiver | 2 | 3 | 2025-12 | Low/stable (mature) |
Batch 2 — Emerging Technologies:
| Technology | Current Interest | Peak Value | Peak Date | Trajectory |
|---|---|---|---|---|
| co-packaged optics | 100 (reference) | 100 | 2026-03 | Rising strongly |
| coherent optics | 45 | 45 | 2026-03 | Rising |
| 1.6T ethernet | 5 | 14 | 2025-08 | Peaked, declining |
| 100G transceiver | 5 | 8 | 2026-02 | Low/stable |
Key Observations
- OSFP peaked May 2024 — consistent with 802.3df approval (Feb 2024) driving peak hype
- QSFP-DD declining from Nov 2025 peak — market settling
- co-packaged optics and silicon photonics surging — current hype leaders
- 400G transceiver at floor — fully mature, no hype left (Plateau of Productivity)
- 1.6T peaked Aug 2025 then declined — possible "Peak of Inflated Expectations" → Trough
Implementation Notes
- Normalize by comparing technologies against each other (relative index)
- Use monthly granularity for trend detection
- Calculate: peak detection, slope analysis, time-since-peak
- Combine with absolute volume signals (paper counts) since Google Trends is relative only
- Limitation: B2B niche terms have low search volumes — use broader terms ("silicon photonics" not "silicon photonics transceiver module QSFP-DD800")
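The peak detection, slope analysis, and time-since-peak steps listed above can be sketched on a plain monthly series, independent of how the data is fetched. The sample values below are hypothetical rather than real Google Trends output, and the 3-month slope window is an assumption.

```python
def linear_slope(values):
    """Least-squares slope of values against their index (units per month)."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def analyze_trend(series, window=3):
    """series: chronological list of (month, interest) pairs on a 0-100 scale."""
    months = [m for m, _ in series]
    values = [v for _, v in series]
    peak_idx = max(range(len(values)), key=lambda i: values[i])
    return {
        "peak_month": months[peak_idx],
        "peak_value": values[peak_idx],
        "months_since_peak": len(values) - 1 - peak_idx,
        "recent_slope": linear_slope(values[-window:]),
    }


# Hypothetical monthly interest values for one search term.
sample = [("2024-01", 20), ("2024-02", 30), ("2024-03", 45),
          ("2024-04", 40), ("2024-05", 36), ("2024-06", 34)]
result = analyze_trend(sample)
```

A negative `recent_slope` combined with a growing `months_since_peak` is the pattern the observations above describe as "peaked, declining".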
Academic Validation
- Jun (2012): "An empirical study of users' hype cycle based on search traffic" — validated Google Trends hype cycle matching for hybrid cars (Scientometrics 91(1), pp. 81-99)
- van Lente, Spitters & Peine (2013): "Comparing technological hype cycles: Towards a theory" (Technological Forecasting and Social Change 80(8))
- Choi & Varian (2010): "Predicting the Present with Google Trends" (foundational paper on search data as predictor)
- Caveat: Medeiros et al. (arXiv 2021) document preprocessing requirements for reliable forecasting from Trends data
4. NEWS/MEDIA VOLUME (Hype Amplification Signal)
What it measures: Trade press and media coverage volume and sentiment
Hype cycle relevance: News volume directly measures the "hype" dimension. Sentiment analysis distinguishes Peak (positive) from Trough (negative/absent).
Data Source A: GDELT DOC 2.0 API (VALIDATED — working, limited for niche B2B)
| Attribute | Detail |
|---|---|
| API URL | https://api.gdeltproject.org/api/v2/doc/doc |
| Auth | None |
| Rate Limit | Reasonable (no published limit) |
| Update Frequency | Every 15 minutes |
| Cost | Free |
| Coverage | 100+ languages, 65 translated, millions of sources |
| History | Last 3 months reliably (older data not guaranteed) |
| Python Library | gdeltdoc (PyPI) or gdeltPyR (PyPI) |
| Implementation Complexity | 2/5 |
Limitation for TIP: GDELT covers general news very well but B2B optical transceiver coverage is sparse. Testing showed only 1 article for "400G optical" in 3 months. Better for broader terms like "silicon photonics" or "data center optics."
Data Source B: NewsAPI.org
| Attribute | Detail |
|---|---|
| API URL | https://newsapi.org/v2/everything |
| Free Tier | 100 requests/day, 1-month history, 24h delay, dev-only |
| Paid | From $40/month |
| Python | requests (simple REST) |
| Implementation Complexity | 1/5 |
Data Source C: Trade Press RSS/Scraping (RECOMMENDED for optical)
Monitor these sources directly (Crawlee/Playwright — already in TIP architecture):
| Source | URL | Relevance |
|---|---|---|
| LightReading | lightreading.com | Primary (optical networking) |
| Fierce Telecom | fiercetelecom.com | Primary |
| Datacenter Dynamics | datacenterdynamics.com | Primary |
| SDxCentral | sdxcentral.com | Primary |
| Lightwave Online | lightwaveonline.com | Primary (optical specific) |
| Gazettabyte | gazettabyte.com | High (standards/specs) |
| Converge Digest | convergedigest.com | Moderate |
| Semiconductor Today | semiconductor-today.com | Moderate (component level) |
Metrics to Extract
- Article count per technology per month (volume)
- Sentiment score using VADER (lightweight) or FinBERT (more accurate)
- Source diversity — how many different outlets cover the topic
- Headline vs. mention — is the technology the headline or just mentioned?
Sentiment Analysis Tools
| Tool | Type | Cost | Accuracy | Speed |
|---|---|---|---|---|
| VADER | Rule-based | Free | Good for general | Very fast |
| FinBERT | Transformer | Free | Best for financial/tech | Moderate |
| Ollama (qwen2.5:14b) | LLM | Free (local) | Very good | Slow |
| TextBlob | Rule-based | Free | Basic | Very fast |
Recommendation: Use VADER for initial scoring, Ollama for nuanced classification on flagged articles.
5. VENDOR COUNT / SKU PROLIFERATION (Market Adoption Signal)
What it measures: Market entry velocity, competitive maturation, commoditization
Hype cycle relevance: This is THE strongest signal for distinguishing Slope of Enlightenment from Plateau of Productivity. Directly measurable from TIP's own scraper data.
Data Source: TIP's Own Scraper Database (ZERO ADDITIONAL COST)
| Attribute | Detail |
|---|---|
| Source | TIP price_observations + vendor tables |
| Auth | Internal |
| Update Frequency | Real-time (5-15 min scraper intervals) |
| Cost | Already being collected |
| Implementation Complexity | 1/5 (data already exists) |
Metrics
- Vendor Count per Technology: How many vendors sell a given form factor/speed
  - 1-3 vendors = Technology Trigger / early Peak
  - 4-10 vendors = Peak / early Slope
  - 10-30 vendors = Slope of Enlightenment
  - 30+ vendors = Plateau of Productivity
- SKU Growth Rate: New product listings per month
  - Accelerating = Growth phase
  - Decelerating = Maturation
  - Flat = Plateau
- Price Coefficient of Variation (CV): Standard deviation / mean of prices across vendors
  - High CV (>0.5) = Early market, pricing uncertainty
  - Medium CV (0.2-0.5) = Competitive market
  - Low CV (<0.2) = Commodity market (Plateau)
- Price Decline Rate: $/Gbps over time
  - Steep decline = Growth → Slope transition
  - Gradual decline = Slope
  - Flat = Plateau
- Geographic Vendor Distribution: Chinese vendors entering = commoditization signal
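The vendor-count bands and price CV thresholds above translate directly into lookup functions. This is a minimal sketch under two assumptions that the source does not settle: the population standard deviation is used for CV, and the overlapping boundary values (10 and 30 vendors) are assigned to the lower band. Function names are hypothetical.

```python
from statistics import mean, pstdev


def price_cv(prices):
    """Coefficient of variation: population std dev divided by mean."""
    return pstdev(prices) / mean(prices)


def cv_band(cv):
    """Map a price CV to the market description used in the list above."""
    if cv > 0.5:
        return "early market"
    if cv >= 0.2:
        return "competitive market"
    return "commodity market"


def vendor_count_phase(n_vendors):
    """Map a vendor count to a phase; boundary handling is an assumption."""
    if n_vendors < 1:
        return "No vendors observed"
    if n_vendors <= 3:
        return "Technology Trigger / early Peak"
    if n_vendors <= 10:
        return "Peak / early Slope"
    if n_vendors <= 30:
        return "Slope of Enlightenment"
    return "Plateau of Productivity"


# Hypothetical prices for one product class across three vendors.
cv = price_cv([50.0, 100.0, 150.0])
band = cv_band(cv)
```

In practice the price list would come from the price_observations table, grouped by form factor and speed.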
Why This Signal is Critical
This is the only signal that directly measures actual market behavior rather than proxies (search interest, papers, patents). Combined with price data, it provides ground truth for hype cycle calibration.
6. STANDARDS PROGRESS (Technology Readiness Signal)
What it measures: Standardization maturity as proxy for technology readiness
Hype cycle relevance: Standards progress is a LEADING indicator. "Study group formed" precedes market by 3-5 years.
Standards Phase Mapping to Hype Cycle
| Standards Phase | Typical Duration | Hype Cycle Phase |
|---|---|---|
| Call for Interest / Study Group | 6-12 months | Pre-Trigger |
| Task Force Formed | 0 | Technology Trigger |
| First Draft | 12-18 months | Peak of Inflated Expectations |
| Working Group Ballot | 6-12 months | Peak → Trough transition |
| Sponsor Ballot | 3-6 months | Trough → Slope |
| Standard Published | 0 | Slope of Enlightenment |
| First Amendment | 12-24 months | Plateau of Productivity |
Current State (validated 2026-03-28)
| Technology | Standard | Status | Hype Phase Inference |
|---|---|---|---|
| 400G Ethernet | IEEE 802.3bs | Published Dec 2017 | Plateau |
| 800G Ethernet (100G/lane) | IEEE 802.3df | Published Feb 2024 | Slope of Enlightenment |
| 800G Ethernet (200G/lane) | IEEE 802.3dj | In progress, target Jul 2026 | Peak → Trough |
| 1.6T Ethernet | IEEE 802.3dj | In progress, target Jul 2026 | Peak of Inflated Expectations |
| 3.2T Ethernet | OIF/MSA discussions | Study group phase | Pre-Trigger |
| 400ZR Coherent | OIF IA published Apr 2020 | Published | Late Slope |
Trackable Standards Bodies
| Body | What to Track | URL |
|---|---|---|
| IEEE 802.3 | Task force status, ballot dates | ieee802.org/3/ |
| OIF | Implementation Agreements (IAs), CMIS versions | oiforum.com/technical-work/implementation-agreements-ias/ |
| QSFP-DD MSA | Spec revisions (now at QSFP-DD1600) | qsfp-dd.com |
| OSFP MSA | Spec revisions (now at Rev 5.21) | osfpmsa.org |
| 100G Lambda MSA | FR/LR specs | 100glambda.com |
Implementation
- Maintain a manually-curated standards_progress table
- Update quarterly (standards move slowly)
- Each standard gets a numeric score: 0 (no activity) → 10 (published + amendments)
- Implementation Complexity: 2/5 (manual curation, low frequency)
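One way to encode the curated table is a pair of lookups keyed by standards phase. The phase names and hype-phase mapping below come from the phase-mapping table above; only the endpoints of the 0-10 scale are given in this document, so every intermediate score here is a hypothetical placeholder to be tuned.

```python
# Numeric scores per standards phase. Only 0 and 10 are stated in the
# document; the values in between are illustrative assumptions.
STANDARDS_SCORE = {
    "No activity": 0,
    "Call for Interest / Study Group": 1,
    "Task Force Formed": 3,
    "First Draft": 4,
    "Working Group Ballot": 6,
    "Sponsor Ballot": 7,
    "Standard Published": 9,
    "First Amendment": 10,
}

# Hype cycle phase per standards phase, copied from the mapping table.
STANDARDS_HYPE_PHASE = {
    "Call for Interest / Study Group": "Pre-Trigger",
    "Task Force Formed": "Technology Trigger",
    "First Draft": "Peak of Inflated Expectations",
    "Working Group Ballot": "Peak → Trough transition",
    "Sponsor Ballot": "Trough → Slope",
    "Standard Published": "Slope of Enlightenment",
    "First Amendment": "Plateau of Productivity",
}


def standards_signal(phase):
    """Return (score, inferred hype phase) for one curated table row."""
    score = STANDARDS_SCORE[phase]
    return score, STANDARDS_HYPE_PHASE.get(phase, "Unknown")


score, inferred = standards_signal("First Draft")
```

Multiplying the score by 10 would put it on the same 0-100 scale as the other signals.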
7. JOB MARKET SIGNALS (Demand/Deployment Signal)
What it measures: Actual hiring demand for technology-specific skills
Hype cycle relevance: Job posting surges lag the Peak by 12-18 months and correlate with Slope of Enlightenment.
Data Sources
| Source | Cost | API | Quality |
|---|---|---|---|
| TheirStack | Free tier available | REST API | Best (deduplication, 324k ATS platforms) |
| FlyByAPIs | Free (200 req/month) | RapidAPI | Good (Google Jobs index) |
| Sumble | Free 500 credits/month | REST API | Good (LinkedIn + hiring signals) |
| LinkedIn Talent | Enterprise ($$$) | Partner only | Best but inaccessible |
| Indeed Job Sync | Free (partner) | REST API | Posting-focused, not search |
Recommended: TheirStack or FlyByAPIs for free tier.
Metrics
- Job posting count per technology keyword per month
- Job posting velocity — rate of change
- Salary range — higher salaries = talent scarcity = early adoption
- Geographic distribution — US/EU = early; APAC = maturation
Implementation Complexity: 3/5
8. SOCIAL MEDIA / COMMUNITY SIGNALS (Practitioner Interest)
What it measures: Operator and engineer discussion intensity
Hype cycle relevance: Community buzz leads deployment by 6-12 months.
Data Sources
| Source | API | Cost | Python Library |
|---|---|---|---|
| Reddit (r/networking, r/homelab, r/datacenter) | Reddit API via PRAW | Free | praw |
| NANOG mailing list | No API (scrape archives) | Free | requests + beautifulsoup4 |
| No public search API | N/A | N/A |
Reddit via PRAW
- Free Reddit API access (60 req/min)
- Search subreddits by keyword, filter by time
- Count posts + comments mentioning technology terms
- PRAWtools provides keyword alerts and subreddit statistics
- Limitation: 1,000 post search window
NANOG Mailing List
- Archives available at nanog.org/nanog-mailing-list/list-archives/ and marc.info
- Monthly text file downloads available
- ETH Zurich thesis (Gehri 2021) demonstrated NLP topic modeling and sentiment analysis on 89,000+ NANOG emails
- No API — requires scraping or bulk download
- Highly relevant for optical networking technology adoption signals
Metrics
- Post/email count per technology per month
- Engagement ratio (comments/votes per post)
- Sentiment (positive deployment reports vs. complaints)
- Question vs. statement ratio (questions = early adoption; statements = maturity)
Implementation Complexity: 3/5
9. EARNINGS CALL / FINANCIAL SIGNALS (Enterprise Adoption Signal)
What it measures: How often public companies mention technologies in financial disclosures
Hype cycle relevance: Earnings call mentions are a LAGGING indicator that confirms Slope of Enlightenment → Plateau transition.
Data Source A: SEC EDGAR EFTS (VALIDATED — working, 899 filings found)
| Attribute | Detail |
|---|---|
| API URL | https://efts.sec.gov/LATEST/search-index |
| Auth | None (free public API) |
| Rate Limit | ~10 requests/second (fair use) |
| Update Frequency | Real-time (new filings indexed immediately) |
| Cost | Free |
| Coverage | All SEC filings since ~1993 |
| Python Library | requests (direct) or sec-api (paid wrapper) |
| Implementation Complexity | 2/5 |
Validated result: Query for "optical transceiver" OR "400G" OR "800G optics" returned 899 filings across 10-K, 10-Q, and 8-K forms.
Data Source B: Financial Modeling Prep (FMP)
| Attribute | Detail |
|---|---|
| API URL | https://financialmodelingprep.com/api/v3/earning_call_transcript/{SYMBOL} |
| Auth | API key (free tier available) |
| Cost | Free tier, paid plans from $29/month |
| Coverage | Full earnings call transcripts for public companies |
| Python Library | requests |
| Implementation Complexity | 2/5 |
Target Companies for Optical Transceiver Mentions
| Ticker | Company | Relevance |
|---|---|---|
| COHR | Coherent Corp (formerly II-VI/Finisar) | Transceiver manufacturer |
| LITE | Lumentum | Laser/transceiver manufacturer |
| CSCO | Cisco | Network equipment + transceivers |
| JNPR | Juniper Networks | Network equipment |
| ANET | Arista Networks | Datacenter switching |
| AVGO | Broadcom | Transceiver silicon |
| INTC | Intel (Altera) | Silicon photonics |
| CIEN | Ciena | Coherent optics |
| INFN | Infinera | Coherent optics |
| AAOI | Applied Optoelectronics | Transceiver manufacturer |
Metrics
- Mention frequency — count of technology term mentions per earnings call
- Mention sentiment — positive/negative context around mentions
- First mention — when a company first mentions a technology (leading indicator)
- Revenue attribution — when companies break out revenue by technology generation
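Mention frequency is the simplest of these metrics to compute once a transcript or filing is available as text. The sketch below counts case-insensitive, whole-token matches; the transcript snippet is invented for illustration and the function name is hypothetical.

```python
import re


def count_mentions(text, terms):
    """Count case-insensitive whole-token occurrences of each term."""
    counts = {}
    for term in terms:
        # Lookarounds stop "800G" from matching inside e.g. "2800G".
        pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)",
                             re.IGNORECASE)
        counts[term] = len(pattern.findall(text))
    return counts


# Invented snippet, not taken from any real earnings call.
sample = ("We saw strong 800G demand. 400G remains steady, and 800G "
          "shipments doubled. 1.6T samples ship next year.")
mentions = count_mentions(sample, ["400G", "800G", "1.6T"])
```

Dividing each count by the transcript's word count would make calls of different lengths comparable.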
10. COMPOSITE SIGNAL ALGORITHM
Academic Foundation
Ren (2015): "An Approach for Predicting Hype Cycle Based on Machine Learning" (CEUR-WS Vol-1437, IPAMIN 2015)
- Used SKNN (improved K-Nearest Neighbor) classifier
- Features extracted from paper data and patent data
- Achieved 67.24% precision, 68.46% recall classifying technologies into 5 hype cycle phases
- Noted accuracy drops in phases 4-5 due to small training samples
BIMATEM (Manrique-Castillo et al., Scientometrics 2018):
- Combines three data streams: scientific papers (logistic growth), patents (logistic growth), news (hype-type curve)
- Fits logistic growth curves to paper/patent counts
- Fits a hype-type curve to news counts
- Assigns TRL (Technology Readiness Level) based on curve position
- Applied successfully to additive manufacturing technologies
Composite Early Warning Index (CEWI) approach (financial crisis literature):
- Uses PCA to synthesize diverse variables into a single latent factor
- Applicable to combining patent, publication, trends, and market signals
Recommended Algorithm: Weighted Multi-Signal Scoring
HypeScore(tech, t) = w1 * Patent_Signal(tech, t)
+ w2 * Publication_Signal(tech, t)
+ w3 * Trends_Signal(tech, t)
+ w4 * News_Signal(tech, t)
+ w5 * Vendor_Signal(tech, t)
+ w6 * Standards_Signal(tech, t)
+ w7 * Earnings_Signal(tech, t)
+ w8 * Jobs_Signal(tech, t)
Signal Time Horizons and Weights
| Signal | Lead/Lag | Suggested Weight | Update Freq |
|---|---|---|---|
| Patents | Leads by 3-5 years | 0.10 | Quarterly |
| Publications | Leads by 1-3 years | 0.10 | Monthly |
| Google Trends | Real-time | 0.20 | Monthly |
| News Volume | Real-time | 0.10 | Weekly |
| Vendor Count/Price | Real-time | 0.25 | Daily |
| Standards Progress | Leads by 2-4 years | 0.10 | Quarterly |
| Earnings Calls | Lags by 6-12 months | 0.10 | Quarterly |
| Job Postings | Lags by 12-18 months | 0.05 | Monthly |
Vendor Count/Price gets the highest weight because it is the only direct market measurement.
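The weighted sum can be sketched with the suggested weights from the table above. One assumption is added here that the source does not specify: when a signal is missing (for example in Phase 1, where only three signals exist), the remaining weights are renormalized so the score stays on the 0-100 scale. The dictionary keys are hypothetical names.

```python
# Suggested weights from the table above; they sum to 1.0.
WEIGHTS = {
    "patents": 0.10,
    "publications": 0.10,
    "trends": 0.20,
    "news": 0.10,
    "vendor": 0.25,
    "standards": 0.10,
    "earnings": 0.10,
    "jobs": 0.05,
}


def hype_score(signals, weights=WEIGHTS):
    """Weighted mean of the available signals, each on a 0-100 scale.

    Signals absent from `signals` are skipped and the remaining
    weights renormalized (an assumption, not stated in the source).
    """
    used = {name: w for name, w in weights.items() if name in signals}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no weighted signals supplied")
    return sum(signals[name] * w for name, w in used.items()) / total_weight


# Hypothetical Phase 1 inputs: three signals only.
phase1 = hype_score({"trends": 80, "vendor": 40, "publications": 60})
```

With all eight signals present the renormalization is a no-op and the function reduces to the formula above.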
Phase Classification Approach
- Normalize each signal to 0-100 scale per technology
- Calculate rate of change (first derivative) for each signal
- Calculate acceleration (second derivative) for trend detection
- Apply phase classification rules:
| Phase | Signal Pattern |
|---|---|
| Technology Trigger | Patents rising, Publications starting, Trends near zero, Vendors 0-3, Standard in study group |
| Peak of Inflated Expectations | Trends peaking, News volume peaking, Publications rising fast, Vendors 3-8, Sentiment highly positive |
| Trough of Disillusionment | Trends declining, News declining, Sentiment negative, Vendors may decrease, Publications slowing |
| Slope of Enlightenment | Vendors growing steadily, Price CV declining, Earnings mentions increasing, Jobs increasing, Standards published |
| Plateau of Productivity | All signals stable, Price CV < 0.2, Vendor count > 30, Publications steady, Standards have amendments |
- Optional ML layer: Train a Random Forest or Gradient Boosting classifier on known technology trajectories (100G, 40G, 10G historical data as training set)
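The normalization, derivative, and rule steps above can be sketched as follows. The rules are a heavily reduced reading of the signal-pattern table: the vendor and price CV thresholds for the Plateau come from that table, while the rule ordering, the reduced input set, and the "trends near zero" cutoff of 10 are assumptions for illustration.

```python
def normalize_0_100(values):
    """Min-max scale a series to 0-100 (a flat series maps to all zeros)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def diff(values):
    """First difference; apply twice for the second difference."""
    return [b - a for a, b in zip(values, values[1:])]


def classify_phase(vendors, vendor_growth, price_cv, trends_level,
                   trends_slope, standard_published):
    """Rule-based phase guess from a handful of signals (illustrative)."""
    if vendors > 30 and price_cv < 0.2:
        return "Plateau of Productivity"
    if standard_published and vendor_growth > 0:
        return "Slope of Enlightenment"
    if vendors <= 3 and trends_level < 10:  # cutoff of 10 is an assumption
        return "Technology Trigger"
    if trends_slope < 0:
        return "Trough of Disillusionment"
    return "Peak of Inflated Expectations"


scaled = normalize_0_100([2, 4, 6])
acceleration = diff(diff([1, 4, 9, 16]))
phase = classify_phase(vendors=15, vendor_growth=2, price_cv=0.3,
                       trends_level=20, trends_slope=-1,
                       standard_published=True)
```

A trained classifier, as suggested in the optional ML step, would replace `classify_phase` while keeping the same normalized inputs.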
Norton-Bass Integration
The composite signal feeds into the Norton-Bass multigenerational diffusion model:
- p (innovation coefficient) ← derived from patent/publication velocity
- q (imitation coefficient) ← derived from vendor count growth rate + Google Trends
- M (market potential) ← derived from addressable port count in deployed switches
- tau (generation introduction time) ← derived from IEEE standard publication date
- Python: scipy.optimize.curve_fit with a Bass model function, or the bassmodeldiffusion package (PyPI)
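For reference, the two-generation form of the Norton-Bass model can be written in a few lines. This sketch follows the 1987 formulation in which both generations share p and q; the parameter values are hypothetical, and in practice a function like this would be passed to scipy.optimize.curve_fit to estimate them from observed data.

```python
import math


def bass_cdf(t, p, q):
    """Cumulative adoption fraction F(t) of the Bass model (0 for t <= 0)."""
    if t <= 0:
        return 0.0
    e = math.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + (q / p) * e)


def norton_bass_two_gen(t, p, q, m1, m2, tau2):
    """Adoption of generation 1 and 2 at time t.

    m1, m2: market potentials; tau2: introduction time of generation 2.
    Generation 2 draws on its own potential plus generation-1 adopters.
    """
    f1 = bass_cdf(t, p, q)
    f2 = bass_cdf(t - tau2, p, q)
    s1 = m1 * f1 * (1.0 - f2)
    s2 = f2 * (m2 + m1 * f1)
    return s1, s2


# Hypothetical parameters, e.g. generation 1 = 400G, generation 2 = 800G.
P, Q, M1, M2, TAU2 = 0.03, 0.4, 1000.0, 500.0, 5.0
early = norton_bass_two_gen(3.0, P, Q, M1, M2, TAU2)
late = norton_bass_two_gen(100.0, P, Q, M1, M2, TAU2)
```

Before `tau2` the second generation is zero and the first follows a plain Bass curve; long after it, the first generation's adopters have migrated to the second.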
Prioritized Implementation Plan
Phase 1: Quick Wins (Week 1-2) — HIGH VALUE, LOW EFFORT
| # | Signal | API | Cost | Complexity | Why First |
|---|---|---|---|---|---|
| 1 | Google Trends | pytrends | Free | 1/5 | Already validated, immediate hype measurement |
| 2 | Vendor Count/Price | Internal DB | Free | 1/5 | Data already being collected by TIP scrapers |
| 3 | Semantic Scholar | REST API | Free | 1/5 | Already validated, publication trend curves |
Deliverable: Basic hype cycle positioning for all tracked technologies using 3 signals.
Phase 2: Depth Signals (Week 3-4) — HIGH VALUE, MODERATE EFFORT
| # | Signal | API | Cost | Complexity |
|---|---|---|---|---|
| 4 | SEC EDGAR EFTS | REST API | Free | 2/5 |
| 5 | Standards Progress | Manual curation | Free | 2/5 |
| 6 | Trade Press Scraping | Crawlee (existing) | Free | 2/5 |
Deliverable: 6-signal composite with financial and standards validation.
Phase 3: Extended Signals (Week 5-8) — MODERATE VALUE, HIGHER EFFORT
| # | Signal | API | Cost | Complexity |
|---|---|---|---|---|
| 7 | USPTO Patents | PatentsView | Free (need API key) | 2/5 |
| 8 | Reddit/PRAW | Reddit API | Free | 3/5 |
| 9 | Job Postings | TheirStack/FlyByAPIs | Free tier | 3/5 |
| 10 | Earnings Transcripts | FMP | Free tier | 2/5 |
Deliverable: Full 10-signal composite with ML phase classifier.
Phase 4: ML Calibration (Week 9-12)
- Collect historical data for training technologies (10G, 40G, 100G — known trajectories)
- Train Random Forest classifier on multi-signal features
- Validate against known Gartner positioning (where available)
- Implement Norton-Bass curve fitting with signal-derived parameters
- Build confidence scoring and uncertainty quantification
Key Python Dependencies
# Phase 1
pytrends==4.9.2 # Google Trends
semanticscholar # Paper counts
requests # General HTTP
scipy # Curve fitting (Norton-Bass)
numpy # Numerical
pandas # Data manipulation
# Phase 2
beautifulsoup4 # HTML parsing (trade press)
vaderSentiment # Sentiment analysis
# Phase 3
praw # Reddit API
bassmodeldiffusion # Bass model fitting
# Phase 4
scikit-learn # Random Forest, PCA
xgboost # Gradient boosting
Signal Correlation Summary
| Signal | Free? | Real-time? | Validated? | Hype Correlation | Implementation |
|---|---|---|---|---|---|
| Google Trends | Yes | Yes | YES | HIGH (academic proof) | 1/5 |
| Vendor Count/Price | Yes | Yes | YES (own data) | HIGHEST (direct) | 1/5 |
| Semantic Scholar | Yes | Yes | YES | MODERATE-HIGH | 1/5 |
| SEC EDGAR EFTS | Yes | Yes | YES | MODERATE | 2/5 |
| News/Trade Press | Yes | Weekly | Partial | HIGH | 2/5 |
| Standards Progress | Yes | Quarterly | YES | HIGH (leading) | 2/5 |
| Patents (USPTO) | Yes | Quarterly | Not yet (API key needed) | MODERATE-HIGH | 2/5 |
| Reddit/PRAW | Yes | Daily | Not yet | LOW-MODERATE | 3/5 |
| Job Postings | Free tier | Daily | Not yet | MODERATE | 3/5 |
| Earnings Calls | Free tier | Quarterly | Not yet | MODERATE | 2/5 |
References
Academic Papers
- Manrique-Castillo et al. (2018). "A bibliometric method for assessing technological maturity: the case of additive manufacturing." Scientometrics 117(3).
- Ren, Z. (2015). "An Approach for Predicting Hype Cycle Based on Machine Learning." CEUR-WS Vol-1437.
- Jun, S.P. (2012). "An empirical study of users' hype cycle based on search traffic." Scientometrics 91(1), 81-99.
- van Lente, H., Spitters, C., & Peine, A. (2013). "Comparing technological hype cycles." Technological Forecasting and Social Change 80(8).
- Gao, L. et al. (2013). "Technology life cycle analysis method based on patent documents." Technological Forecasting and Social Change.
- Huang et al. (2022). "Technology life cycle analysis: From the dynamic perspective of patent citation networks." Technological Forecasting and Social Change.
- Choi, H. & Varian, H. (2010). "Predicting the Present with Google Trends." SSRN.
- Dedehayir, O. & Steinert, M. (2016). "The hype cycle model: A review and future directions." Technological Forecasting and Social Change 108(C).
- Norton, J.A. & Bass, F.M. (1987). "A diffusion theory model of adoption and substitution for successive generations of high-technology products." Management Science 33(9).
- Gehri, L. (2021). "NANOG Mailing List Analysis." ETH Zurich Semester Thesis.
API Documentation
- PatentsView Search API: https://search.patentsview.org/docs/
- Semantic Scholar API: https://api.semanticscholar.org/api-docs
- GDELT DOC API: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/
- SEC EDGAR EFTS: https://efts.sec.gov/LATEST/search-index
- Financial Modeling Prep: https://site.financialmodelingprep.com/developer/docs
- Google Trends (pytrends): https://pypi.org/project/pytrends/
- Reddit (PRAW): https://praw.readthedocs.io/
- IEEE 802.3dj Task Force: https://www.ieee802.org/3/dj/index.html
- OIF Implementation Agreements: https://www.oiforum.com/technical-work/implementation-agreements-ias/
Python Libraries
- pytrends: https://pypi.org/project/pytrends/
- semanticscholar: https://pypi.org/project/semanticscholar/
- gdeltdoc: https://pypi.org/project/gdeltdoc/
- praw: https://pypi.org/project/praw/
- bassmodeldiffusion: https://github.com/marmiskarian/bassmodeldiffusion
- vaderSentiment: https://pypi.org/project/vaderSentiment/