transceiver-db/RESEARCH-hype-cycle-signals.md
Rene Fichtmueller c6308e93c0 feat: massive scraper expansion + hype cycle engine + lifecycle prediction
2026-03-28 02:30:19 +13:00


Hype Cycle Signal Research: Quantifiable Data Inputs for Automatic Technology Positioning

Date: 2026-03-28
For: Transceiver Intelligence Platform (TIP), Hype Cycle Engine
Status: Deep Research Complete, Ready for Implementation Planning


Executive Summary

This document catalogs 10 quantifiable signal categories that can feed the TIP Hype Cycle Engine to automatically position optical transceiver technologies (400G, 800G, 1.6T, QSFP-DD, OSFP, silicon photonics, coherent pluggable, co-packaged optics, etc.) on a Norton-Bass-derived hype cycle.

Key finding: A composite of 5-6 signals provides robust positioning. No single signal is sufficient alone. The recommended Phase 1 implementation (3 signals, all free, all validated) can be built in ~2 weeks.


Signal Catalog


1. PATENT DATA (Technology Innovation Signal)

What it measures: R&D investment intensity, innovation velocity, technology maturation
Hype cycle relevance: Patents LEAD actual market adoption by 3-5 years. Patent filing surges correlate with "Technology Trigger" and early "Peak of Inflated Expectations."

Data Source: USPTO PatentsView API (migrating to data.uspto.gov March 2026)

| Attribute | Detail |
|---|---|
| API URL | https://search.patentsview.org/api/v1/patent/ |
| Auth | API key required (header X-Api-Key). Keys are free, but issuance of new keys is temporarily suspended during the migration to data.uspto.gov |
| Rate Limit | 45 requests/minute |
| Update Frequency | Quarterly |
| Cost | Free (CC BY 4.0 license) |
| Python Library | requests (REST API), patentsview2 (R package, no maintained Python equivalent) |
| Implementation Complexity | 2/5 |

Relevant CPC Classes for Optical Transceivers

| CPC Class | Description |
|---|---|
| H04B10 | Transmission systems employing electromagnetic waves other than radio waves (optical communication) |
| G02B6 | Light guides; structural details of fibre-optic arrangements |
| H01S5 | Semiconductor lasers (VCSELs, DFB, EML: core transceiver components) |
| H04J14 | Optical multiplex systems (WDM, DWDM) |
| G02F1 | Devices or arrangements for the control of light intensity (modulators) |

Queryable Metrics

  1. Patent Filing Velocity — Count of new patent applications per CPC class per quarter
  2. Patent Grant Rate — Ratio of grants to applications (maturity indicator)
  3. Citation Velocity — How quickly new patents cite each other (hot field indicator)
  4. Technology Cycle Time (TCT) — Median age of citations (shorter = faster-moving field)
  5. Assignee Concentration — Herfindahl index of patent holders (few holders = early stage; many = maturation)
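
Two of these metrics reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names and the sample inputs are hypothetical, and it assumes assignee names and citation years have already been pulled from the patent records.

```python
from collections import Counter
from statistics import median


def herfindahl_index(assignees):
    """Sum of squared patent shares per assignee.

    1.0 means a single holder; values near 0 mean many small holders.
    """
    counts = Counter(assignees)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no patents supplied")
    return sum((n / total) ** 2 for n in counts.values())


def technology_cycle_time(citing_year, cited_years):
    """Median age, in years, of the patents cited by one citing patent."""
    if not cited_years:
        raise ValueError("no citations supplied")
    return median(citing_year - year for year in cited_years)


# Hypothetical inputs, for illustration only.
sample_assignees = ["VendorA", "VendorA", "VendorB", "VendorC"]
hhi = herfindahl_index(sample_assignees)          # shares 0.5, 0.25, 0.25
tct = technology_cycle_time(2025, [2015, 2019, 2021, 2023])
```

A high index (few holders) points to an early-stage field and a short cycle time to a fast-moving one, matching the interpretation given in the list above.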

Example Query (PatentsView Search API)

GET https://search.patentsview.org/api/v1/patent/
?q={"_and":[{"_begins":{"cpc_at_issue.cpc_subclass_id":"H04B10"}},{"_gte":{"patent_date":"2024-01-01"}},{"_text_any":{"patent_abstract":"transceiver 400G 800G QSFP OSFP"}}]}
&f=["patent_id","patent_date","patent_title","assignees.assignee_organization"]
&o={"size":100}

Response includes total_hits for counting.
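
The query above can be assembled programmatically. This is a minimal sketch that only builds the request parameters; `build_patent_params` is a hypothetical helper, and the field names are copied from the example query rather than verified against the current API schema.

```python
import json

PATENTSVIEW_URL = "https://search.patentsview.org/api/v1/patent/"


def build_patent_params(cpc_prefix, since, keywords, size=100):
    """Return the q/f/o query-string parameters as JSON-encoded strings."""
    query = {
        "_and": [
            {"_begins": {"cpc_at_issue.cpc_subclass_id": cpc_prefix}},
            {"_gte": {"patent_date": since}},
            {"_text_any": {"patent_abstract": " ".join(keywords)}},
        ]
    }
    fields = [
        "patent_id",
        "patent_date",
        "patent_title",
        "assignees.assignee_organization",
    ]
    return {
        "q": json.dumps(query),
        "f": json.dumps(fields),
        "o": json.dumps({"size": size}),
    }


params = build_patent_params(
    "H04B10", "2024-01-01", ["transceiver", "400G", "800G", "QSFP", "OSFP"]
)
# To send it (needs network access and a key), something like:
#   requests.get(PATENTSVIEW_URL, params=params, headers={"X-Api-Key": key})
```

Keeping parameter construction separate from the HTTP call makes the query easy to unit-test and to reuse once the API moves to data.uspto.gov.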

Academic Validation

  • BIMATEM method (Manrique-Castillo et al., Scientometrics 2018): Patent records of mature technologies display logistic growth behavior. Fitting logistic curves to patent counts per technology enables TRL assignment.
  • Gao et al. (2013): Using multiple patent-based indicators with a nearest-neighbour classifier for technology life cycle stage classification.
  • Technology Cycle Time: Kayal's TCT indicator — median citation age predicts technology maturity phase.

Correlation with Hype Cycle Position

  • High filing velocity + low grant rate = Technology Trigger / early Peak
  • Peak filing count reached = Peak of Inflated Expectations
  • Declining filings + rising citations = Trough / early Slope
  • Stable filings + high citation density = Plateau of Productivity

2. ACADEMIC PUBLICATION METRICS (Knowledge Creation Signal)

What it measures: Scientific research intensity, knowledge maturation
Hype cycle relevance: Publication counts follow a logistic S-curve. The inflection point of the S-curve corresponds roughly to the transition from Peak to Trough.

Data Source: Semantic Scholar API (VALIDATED, working)

| Attribute | Detail |
|---|---|
| API URL | https://api.semanticscholar.org/graph/v1/paper/search/bulk |
| Auth | None required (public). API key available for a dedicated rate limit |
| Rate Limit | 1000 req/sec shared across all unauthenticated users; 1 req/sec dedicated with a free API key |
| Update Frequency | Continuous (near real-time) |
| Cost | Free |
| Coverage | ~200 million papers across all disciplines |
| Python Library | semanticscholar (PyPI) or direct requests |
| Implementation Complexity | 1/5 |

Validated Paper Counts (tested 2026-03-28)

| Technology | Total Papers | Maturity Signal |
|---|---|---|
| silicon photonics transceiver | 905 | Mature (deep research base) |
| 100G transceiver | 144 | Late maturity |
| 400G transceiver | 100 | Growth phase |
| 200G transceiver | 43 | Moderate |
| coherent pluggable optics | 40 | Growth phase |
| 800G transceiver | 39 | Early growth |
| QSFP-DD optical | 26 | Emerging |
| OSFP transceiver | 11 | Very early |
| 1.6T transceiver optical | 10 | Pre-commercial |

Year-by-Year Trend (400G transceiver, validated)

| Year | Papers | Signal |
|---|---|---|
| 2018 | 10 | Early research |
| 2019 | 7 | Stable |
| 2020 | 7 | Stable |
| 2021 | 9 | Slight increase |
| 2022 | 15 | Growth spike |
| 2023 | 6 | Decline |
| 2024 | 8 | Recovery |
| 2025 | 12 | Resurgence |

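This pattern (spike in 2022, decline in 2023, recovery in 2024-25) is consistent with 400G moving from the Peak toward the Slope of Enlightenment, although annual counts this small (6-15 papers) are noisy and should be read alongside the other signals.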

Key Metrics to Extract

  1. Annual paper count per technology keyword
  2. Rate of change (first derivative — acceleration/deceleration)
  3. Citation count distribution — highly cited papers = foundational work = maturation
  4. Author diversity — many unique authors = broad interest = growth phase
  5. Venue distribution — OFC/ECOC papers vs. general journals
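
As a sketch of metrics 1 and 2, the year-by-year 400G counts listed earlier can be differenced directly. The helper names are hypothetical, and the code assumes one count per consecutive year.

```python
# Annual paper counts for "400G transceiver" from the validated table.
papers_400g = {
    2018: 10, 2019: 7, 2020: 7, 2021: 9,
    2022: 15, 2023: 6, 2024: 8, 2025: 12,
}


def year_over_year(counts):
    """First difference: change versus the previous year in the mapping."""
    years = sorted(counts)
    return {y: counts[y] - counts[prev] for prev, y in zip(years, years[1:])}


def second_difference(counts):
    """Acceleration: the year-over-year change of the year-over-year change."""
    return year_over_year(year_over_year(counts))


delta = year_over_year(papers_400g)
accel = second_difference(papers_400g)
peak_year = max(papers_400g, key=papers_400g.get)
```

With counts this small, a smoothed series (for example a two- or three-year rolling mean) would likely give steadier derivatives; that choice is left open here.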

Supplementary Source: IEEE Xplore

  • URL: https://ieeexploreapi.ieee.org/api/v1/search/articles
  • API key required (free for research)
  • Specifically covers OFC, ECOC, CLEO proceedings
  • Higher signal quality for optical networking specifically

3. GOOGLE TRENDS (Market Attention Signal)

What it measures: Search interest as a proxy for market attention and hype
Hype cycle relevance: Google Trends data directly models the "hype" component. Academic validation exists (Jun 2012, van Lente 2013).

| Attribute | Detail |
|---|---|
| API | Unofficial (Google Trends web scraping via pytrends) |
| Auth | None |
| Rate Limit | ~10 requests/minute (unofficial, subject to blocking) |
| Update Frequency | Real-time (weekly/monthly granularity) |
| Cost | Free |
| Python Library | pytrends (PyPI, v4.9.2) |
| Implementation Complexity | 1/5 |

Validated Data (tested 2026-03-28)

Batch 1 — Form Factors & Speeds (relative to each other):

| Technology | Current Interest | Peak Value | Peak Date | Trajectory |
|---|---|---|---|---|
| silicon photonics | 100 (reference) | 100 | 2026-03 | Rising strongly |
| OSFP | 34 | 45 | 2024-05 | Peaked, declining |
| 800G transceiver | 10 | 10 | 2026-02 | Rising |
| QSFP-DD | 8 | 10 | 2025-11 | Declining from peak |
| 400G transceiver | 2 | 3 | 2025-12 | Low/stable (mature) |

Batch 2 — Emerging Technologies:

| Technology | Current Interest | Peak Value | Peak Date | Trajectory |
|---|---|---|---|---|
| co-packaged optics | 100 (reference) | 100 | 2026-03 | Rising strongly |
| coherent optics | 45 | 45 | 2026-03 | Rising |
| 1.6T ethernet | 5 | 14 | 2025-08 | Peaked, declining |
| 100G transceiver | 5 | 8 | 2026-02 | Low/stable |

Key Observations

  • OSFP peaked May 2024 — consistent with 802.3df approval (Feb 2024) driving peak hype
  • QSFP-DD declining from Nov 2025 peak — market settling
  • co-packaged optics and silicon photonics surging — current hype leaders
  • 400G transceiver at floor — fully mature, no hype left (Plateau of Productivity)
  • 1.6T peaked Aug 2025 then declined — possible "Peak of Inflated Expectations" → Trough

Implementation Notes

  • Normalize by comparing technologies against each other (relative index)
  • Use monthly granularity for trend detection
  • Calculate: peak detection, slope analysis, time-since-peak
  • Combine with absolute volume signals (paper counts) since Google Trends is relative only
  • Limitation: B2B niche terms have low search volumes — use broader terms ("silicon photonics" not "silicon photonics transceiver module QSFP-DD800")
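
The peak detection, slope analysis, and time-since-peak calculations could look roughly like the sketch below. It assumes the monthly series has already been fetched (for example with pytrends) into a chronological list; `trend_features`, the three-month slope window, and the sample values are all illustrative assumptions, not validated data.

```python
def trend_features(series, window=3):
    """Summarize a chronological list of (month_label, interest) pairs."""
    values = [value for _, value in series]
    # Index of the first occurrence of the maximum value.
    peak_idx = max(range(len(values)), key=values.__getitem__)

    # Least-squares slope over the last `window` points (units: index/month).
    recent = values[-window:]
    n = len(recent)
    x_mean = (n - 1) / 2
    y_mean = sum(recent) / n
    denom = sum((i - x_mean) ** 2 for i in range(n))
    slope = (
        sum((i - x_mean) * (y - y_mean) for i, y in enumerate(recent)) / denom
        if denom else 0.0
    )

    peak_value = values[peak_idx]
    return {
        "peak_month": series[peak_idx][0],
        "peak_value": peak_value,
        "months_since_peak": len(values) - 1 - peak_idx,
        "recent_slope": slope,
        "current_vs_peak": values[-1] / peak_value if peak_value else 0.0,
    }


# Hypothetical monthly values, for illustration only.
sample = [
    ("2025-06", 6), ("2025-07", 10), ("2025-08", 14),
    ("2025-09", 11), ("2025-10", 8), ("2025-11", 5),
]
features = trend_features(sample)
```

A negative `recent_slope` together with a growing `months_since_peak` is the pattern the observations above describe as "peaked, declining".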

Academic Validation

  • Jun (2012): "An empirical study of users' hype cycle based on search traffic" — validated Google Trends hype cycle matching for hybrid cars (Scientometrics 91(1), pp. 81-99)
  • van Lente, Spitters & Peine (2013): "Comparing technological hype cycles: Towards a theory" (Technological Forecasting and Social Change 80(8))
  • Choi & Varian (2010): "Predicting the Present with Google Trends" (foundational paper on search data as predictor)
  • Caveat: Medeiros et al. (arXiv 2021) document preprocessing requirements for reliable forecasting from Trends data

4. NEWS/MEDIA VOLUME (Hype Amplification Signal)

What it measures: Trade press and media coverage volume and sentiment
Hype cycle relevance: News volume directly measures the "hype" dimension. Sentiment analysis distinguishes Peak (positive) from Trough (negative/absent).

Data Source A: GDELT DOC 2.0 API (VALIDATED, working; limited for niche B2B)

| Attribute | Detail |
|---|---|
| API URL | https://api.gdeltproject.org/api/v2/doc/doc |
| Auth | None |
| Rate Limit | Reasonable (no published limit) |
| Update Frequency | Every 15 minutes |
| Cost | Free |
| Coverage | 100+ languages, 65 translated, millions of sources |
| History | Last 3 months reliably (older data not guaranteed) |
| Python Library | gdeltdoc (PyPI) or gdeltPyR (PyPI) |
| Implementation Complexity | 2/5 |

Limitation for TIP: GDELT covers general news very well but B2B optical transceiver coverage is sparse. Testing showed only 1 article for "400G optical" in 3 months. Better for broader terms like "silicon photonics" or "data center optics."

Data Source B: NewsAPI.org

| Attribute | Detail |
|---|---|
| API URL | https://newsapi.org/v2/everything |
| Free Tier | 100 requests/day, 1-month history, 24h delay, dev-only |
| Paid | From $40/month |
| Python | requests (simple REST) |
| Implementation Complexity | 1/5 |

Data Source C: Trade Press (direct scraping)

Monitor these sources directly (Crawlee/Playwright, already in the TIP architecture):

| Source | URL | Relevance |
|---|---|---|
| LightReading | lightreading.com | Primary (optical networking) |
| Fierce Telecom | fiercetelecom.com | Primary |
| Datacenter Dynamics | datacenterdynamics.com | Primary |
| SDxCentral | sdxcentral.com | Primary |
| Lightwave Online | lightwaveonline.com | Primary (optical specific) |
| Gazettabyte | gazettabyte.com | High (standards/specs) |
| Converge Digest | convergedigest.com | Moderate |
| Semiconductor Today | semiconductor-today.com | Moderate (component level) |

Metrics to Extract

  1. Article count per technology per month (volume)
  2. Sentiment score using VADER (lightweight) or FinBERT (more accurate)
  3. Source diversity — how many different outlets cover the topic
  4. Headline vs. mention — is the technology the headline or just mentioned?

Sentiment Analysis Tools

| Tool | Type | Cost | Accuracy | Speed |
|---|---|---|---|---|
| VADER | Rule-based | Free | Good for general | Very fast |
| FinBERT | Transformer | Free | Best for financial/tech | Moderate |
| Ollama (qwen2.5:14b) | LLM | Free (local) | Very good | Slow |
| TextBlob | Rule-based | Free | Basic | Very fast |

Recommendation: Use VADER for initial scoring, Ollama for nuanced classification on flagged articles.
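
One way to read this recommendation is a two-stage triage: score everything cheaply, then escalate only the ambiguous articles. The sketch below assumes "flagged" means a near-neutral score, which is an interpretation rather than something stated above. The scorer is injected so the example runs without third-party packages; `toy_score`, its word lists, and the 0.3 band are hypothetical stand-ins.

```python
def triage(articles, score_fn, ambiguous_band=0.3):
    """Split articles into (scored, flagged) using a sentiment score in [-1, 1].

    Articles whose absolute score falls inside the band are flagged for a
    slower, more nuanced classifier.
    """
    scored, flagged = [], []
    for article in articles:
        score = score_fn(article["text"])
        bucket = flagged if abs(score) < ambiguous_band else scored
        bucket.append({**article, "score": score})
    return scored, flagged


# Stand-in scorer for illustration only. With vaderSentiment installed, the
# scorer would instead be something like:
#   SentimentIntensityAnalyzer().polarity_scores(text)["compound"]
POSITIVE = {"breakthrough", "record", "ramp"}
NEGATIVE = {"delay", "shortage", "recall"}


def toy_score(text):
    words = text.lower().split()
    hits = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return max(-1.0, min(1.0, hits / 2))


articles = [
    {"id": 1, "text": "record 800G ramp announced"},
    {"id": 2, "text": "vendor ships new module"},
    {"id": 3, "text": "supply shortage causes delay"},
]
scored, flagged = triage(articles, toy_score)
```

Only the `flagged` list would be passed on to the local LLM, which keeps the slow step proportional to the number of unclear articles.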


5. VENDOR COUNT / SKU PROLIFERATION (Market Adoption Signal)

What it measures: Market entry velocity, competitive maturation, commoditization
Hype cycle relevance: This is THE strongest signal for distinguishing Slope of Enlightenment from Plateau of Productivity. Directly measurable from TIP's own scraper data.

Data Source: TIP's Own Scraper Database (ZERO ADDITIONAL COST)

| Attribute | Detail |
|---|---|
| Source | TIP price_observations + vendor tables |
| Auth | Internal |
| Update Frequency | Real-time (5-15 min scraper intervals) |
| Cost | Already being collected |
| Implementation Complexity | 1/5 (data already exists) |

Metrics

  1. Vendor Count per Technology: how many vendors sell a given form factor/speed
    • 1-3 vendors = Technology Trigger / early Peak
    • 4-10 vendors = Peak / early Slope
    • 11-30 vendors = Slope of Enlightenment
    • More than 30 vendors = Plateau of Productivity
  2. SKU Growth Rate: new product listings per month
    • Accelerating = Growth phase
    • Decelerating = Maturation
    • Flat = Plateau
  3. Price Coefficient of Variation (CV): standard deviation / mean of prices across vendors
    • High CV (>0.5) = Early market, pricing uncertainty
    • Medium CV (0.2-0.5) = Competitive market
    • Low CV (<0.2) = Commodity market (Plateau)
  4. Price Decline Rate: $/Gbps over time
    • Steep decline = Growth → Slope transition
    • Gradual decline = Slope
    • Flat = Plateau
  5. Geographic Vendor Distribution: Chinese vendors entering = commoditization signal
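
The thresholds in this list translate into a few small functions. This is a sketch under stated assumptions: the function names are hypothetical, the boundary counts of 10 and 30 vendors are assigned to the lower bucket, and the CV uses the population standard deviation (a sample standard deviation would be an equally defensible choice).

```python
from statistics import mean, pstdev


def vendor_phase(vendor_count):
    """Map a vendor count to the phase buckets listed above."""
    if vendor_count <= 0:
        return "No market yet"
    if vendor_count <= 3:
        return "Technology Trigger / early Peak"
    if vendor_count <= 10:
        return "Peak / early Slope"
    if vendor_count <= 30:
        return "Slope of Enlightenment"
    return "Plateau of Productivity"


def price_cv(prices):
    """Coefficient of variation: standard deviation divided by mean."""
    return pstdev(prices) / mean(prices)


def cv_band(cv):
    """Classify a CV value using the 0.2 and 0.5 thresholds."""
    if cv > 0.5:
        return "early market"
    if cv >= 0.2:
        return "competitive market"
    return "commodity market"


def usd_per_gbps(price_usd, speed_gbps):
    """Normalize a module price by its line rate."""
    return price_usd / speed_gbps


# Hypothetical prices for one SKU class across three vendors.
prices = [400.0, 500.0, 600.0]
cv = price_cv(prices)
```

In practice the price list would come from the price_observations table, filtered to one form factor and speed so that the CV compares like with like.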

Why This Signal is Critical

This is the only signal that directly measures actual market behavior rather than proxies (search interest, papers, patents). Combined with price data, it provides ground truth for hype cycle calibration.


6. STANDARDS PROGRESS (Technology Readiness Signal)

What it measures: Standardization maturity as proxy for technology readiness
Hype cycle relevance: Standards progress is a LEADING indicator. "Study group formed" precedes market by 3-5 years.

Standards Phase Mapping to Hype Cycle

| Standards Phase | Typical Duration | Hype Cycle Phase |
|---|---|---|
| Call for Interest / Study Group | 6-12 months | Pre-Trigger |
| Task Force Formed | 0 (milestone) | Technology Trigger |
| First Draft | 12-18 months | Peak of Inflated Expectations |
| Working Group Ballot | 6-12 months | Peak → Trough transition |
| Sponsor Ballot | 3-6 months | Trough → Slope |
| Standard Published | 0 (milestone) | Slope of Enlightenment |
| First Amendment | 12-24 months | Plateau of Productivity |

Current State (validated 2026-03-28)

| Technology | Standard | Status | Hype Phase Inference |
|---|---|---|---|
| 400G Ethernet | IEEE 802.3bs | Published Dec 2017 | Plateau |
| 800G Ethernet (100G/lane) | IEEE 802.3df | Published Feb 2024 | Slope of Enlightenment |
| 800G Ethernet (200G/lane) | IEEE 802.3dj | In progress, target Jul 2026 | Peak → Trough |
| 1.6T Ethernet | IEEE 802.3dj | In progress, target Jul 2026 | Peak of Inflated Expectations |
| 3.2T Ethernet | OIF/MSA discussions | Study group phase | Pre-Trigger |
| 400ZR Coherent | OIF IA (published Apr 2020) | Published | Late Slope |

Trackable Standards Bodies

| Body | What to Track | URL |
|---|---|---|
| IEEE 802.3 | Task force status, ballot dates | ieee802.org/3/ |
| OIF | Implementation Agreements (IAs), CMIS versions | oiforum.com/technical-work/implementation-agreements-ias/ |
| QSFP-DD MSA | Spec revisions (now at QSFP-DD1600) | qsfp-dd.com |
| OSFP MSA | Spec revisions (now at Rev 5.21) | osfpmsa.org |
| 100G Lambda MSA | FR/LR specs | 100glambda.com |

Implementation

  • Maintain a manually-curated standards_progress table
  • Update quarterly (standards move slowly)
  • Each standard gets a numeric score: 0 (no activity) → 10 (published + amendments)
  • Implementation Complexity: 2/5 (manual curation, low frequency)
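
A curated score table might be as simple as the mapping below. Only the endpoints (0 for no activity, 10 for published plus amendments) come from the scoring rule above; every intermediate value, the phase keys, and the rescaling to 0-100 are assumptions that would need calibration.

```python
# Phase -> score on the 0-10 scale. Intermediate values are placeholders.
STANDARDS_SCORE = {
    "no_activity": 0,
    "study_group": 2,
    "task_force": 3,
    "first_draft": 5,
    "working_group_ballot": 6,
    "sponsor_ballot": 8,
    "published": 9,
    "amended": 10,
}


def standards_signal(phase):
    """Rescale the 0-10 standards score to the 0-100 range used by composites."""
    if phase not in STANDARDS_SCORE:
        raise KeyError(f"unknown standards phase: {phase}")
    return STANDARDS_SCORE[phase] * 10


# Rows whose phase is stated in the current-state table.
curated = {
    "IEEE 802.3bs": "published",
    "IEEE 802.3df": "published",
}
signals = {name: standards_signal(phase) for name, phase in curated.items()}
```

Because the table is updated by hand each quarter, rejecting unknown phase names early helps catch typos in the curated data.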

7. JOB MARKET SIGNALS (Demand/Deployment Signal)

What it measures: Actual hiring demand for technology-specific skills
Hype cycle relevance: Job posting surges lag the Peak by 12-18 months and correlate with Slope of Enlightenment.

Data Sources

| Source | Cost | API | Quality |
|---|---|---|---|
| TheirStack | Free tier available | REST API | Best (deduplication, 324k ATS platforms) |
| FlyByAPIs | Free (200 req/month) | RapidAPI | Good (Google Jobs index) |
| Sumble | Free 500 credits/month | REST API | Good (LinkedIn + hiring signals) |
| LinkedIn Talent | Enterprise ($$$) | Partner only | Best but inaccessible |
| Indeed Job Sync | Free (partner) | REST API | Posting-focused, not search |

Recommended: TheirStack or FlyByAPIs for free tier.

Metrics

  1. Job posting count per technology keyword per month
  2. Job posting velocity — rate of change
  3. Salary range — higher salaries = talent scarcity = early adoption
  4. Geographic distribution — US/EU = early; APAC = maturation

Implementation Complexity: 3/5


8. SOCIAL MEDIA / COMMUNITY SIGNALS (Practitioner Interest)

What it measures: Operator and engineer discussion intensity
Hype cycle relevance: Community buzz leads deployment by 6-12 months.

Data Sources

| Source | API | Cost | Python Library |
|---|---|---|---|
| Reddit (r/networking, r/homelab, r/datacenter) | Reddit API via PRAW | Free | praw |
| NANOG mailing list | No API (scrape archives) | Free | requests + beautifulsoup4 |
| LinkedIn | No public search API | N/A | N/A |

Reddit via PRAW

  • Free Reddit API access (60 req/min)
  • Search subreddits by keyword, filter by time
  • Count posts + comments mentioning technology terms
  • PRAWtools provides keyword alerts and subreddit statistics
  • Limitation: 1,000 post search window

NANOG Mailing List

  • Archives available at nanog.org/nanog-mailing-list/list-archives/ and marc.info
  • Monthly text file downloads available
  • ETH Zurich thesis (Gehri 2021) demonstrated NLP topic modeling and sentiment analysis on 89,000+ NANOG emails
  • No API — requires scraping or bulk download
  • Highly relevant for optical networking technology adoption signals

Metrics

  1. Post/email count per technology per month
  2. Engagement ratio (comments/votes per post)
  3. Sentiment (positive deployment reports vs. complaints)
  4. Question vs. statement ratio (questions = early adoption; statements = maturity)
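
Metrics 1 and 4 can be approximated with a simple heuristic once posts are collected. The sketch below treats any title ending in a question mark as a question, which is a crude assumption; the record layout (`created`, `title`) and the sample posts are hypothetical.

```python
from collections import defaultdict


def monthly_question_ratio(posts):
    """Return {YYYY-MM: (post_count, share_of_titles_ending_in_question_mark)}."""
    totals = defaultdict(int)
    questions = defaultdict(int)
    for post in posts:
        month = post["created"][:7]          # "YYYY-MM-DD" -> "YYYY-MM"
        totals[month] += 1
        if post["title"].rstrip().endswith("?"):
            questions[month] += 1
    return {m: (totals[m], questions[m] / totals[m]) for m in sorted(totals)}


# Hypothetical posts, for illustration only.
posts = [
    {"created": "2025-10-03", "title": "Anyone running 800G OSFP in production?"},
    {"created": "2025-10-19", "title": "Is 800G DR8 interop stable yet?"},
    {"created": "2025-11-02", "title": "Our 800G rollout: lessons learned"},
    {"created": "2025-11-20", "title": "800G optics pricing thread"},
]
ratios = monthly_question_ratio(posts)
```

A falling question share over successive months would suggest the shift from early adoption toward maturity that metric 4 describes.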

Implementation Complexity: 3/5


9. EARNINGS CALL / FINANCIAL SIGNALS (Enterprise Adoption Signal)

What it measures: How often public companies mention technologies in financial disclosures
Hype cycle relevance: Earnings call mentions are a LAGGING indicator that confirms Slope of Enlightenment → Plateau transition.

Data Source A: SEC EDGAR EFTS (VALIDATED, working; 899 filings found)

| Attribute | Detail |
|---|---|
| API URL | https://efts.sec.gov/LATEST/search-index |
| Auth | None (free public API) |
| Rate Limit | ~10 requests/second (fair use) |
| Update Frequency | Real-time (new filings indexed immediately) |
| Cost | Free |
| Coverage | EDGAR filings date back to ~1993; full-text search covers 2001 onward |
| Python Library | requests (direct) or sec-api (paid wrapper) |
| Implementation Complexity | 2/5 |

Validated result: Query for "optical transceiver" OR "400G" OR "800G optics" returned 899 filings across 10-K, 10-Q, and 8-K forms.
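
A query like the validated one could be built as follows. This is a sketch, not a documented client: the parameter names (`q`, `forms`, `dateRange`, `startdt`, `enddt`) and the `hits.total.value` response path are assumptions based on how the public full-text search page behaves, and the date range is a placeholder.

```python
from urllib.parse import urlencode

EFTS_URL = "https://efts.sec.gov/LATEST/search-index"


def build_efts_url(terms, forms, start, end):
    """Build a full-text search URL that ORs together quoted phrases."""
    query = " OR ".join(f'"{term}"' for term in terms)
    params = {
        "q": query,
        "forms": ",".join(forms),
        "dateRange": "custom",
        "startdt": start,
        "enddt": end,
    }
    return EFTS_URL + "?" + urlencode(params)


def total_hits(response_json):
    """Extract the filing count from an Elasticsearch-style response body."""
    return response_json["hits"]["total"]["value"]


url = build_efts_url(
    ["optical transceiver", "400G", "800G optics"],
    ["10-K", "10-Q", "8-K"],
    "2025-01-01",
    "2025-12-31",
)
# When sending the request, SEC asks automated clients to identify
# themselves with a descriptive User-Agent header.
```

Running the same query per quarter and storing `total_hits` gives a time series of filing mentions that can be normalized like the other signals.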

Data Source B: Financial Modeling Prep (FMP)

| Attribute | Detail |
|---|---|
| API URL | https://financialmodelingprep.com/api/v3/earning_call_transcript/{SYMBOL} |
| Auth | API key (free tier available) |
| Cost | Free tier, paid plans from $29/month |
| Coverage | Full earnings call transcripts for public companies |
| Python Library | requests |
| Implementation Complexity | 2/5 |

Target Companies for Optical Transceiver Mentions

| Ticker | Company | Relevance |
|---|---|---|
| COHR | Coherent Corp (formerly II-VI/Finisar) | Transceiver manufacturer |
| LITE | Lumentum | Laser/transceiver manufacturer |
| CSCO | Cisco | Network equipment + transceivers |
| JNPR | Juniper Networks (acquired by HPE in 2025; historical filings only) | Network equipment |
| ANET | Arista Networks | Datacenter switching |
| AVGO | Broadcom | Transceiver silicon |
| INTC | Intel (Altera) | Silicon photonics |
| CIEN | Ciena | Coherent optics |
| INFN | Infinera (acquired by Nokia in 2025; historical filings only) | Coherent optics |
| AAOI | Applied Optoelectronics | Transceiver manufacturer |

Metrics

  1. Mention frequency — count of technology term mentions per earnings call
  2. Mention sentiment — positive/negative context around mentions
  3. First mention — when a company first mentions a technology (leading indicator)
  4. Revenue attribution — when companies break out revenue by technology generation

10. COMPOSITE SIGNAL ALGORITHM

Academic Foundation

Ren (2015): "An Approach for Predicting Hype Cycle Based on Machine Learning" (CEUR-WS Vol-1437, IPAMIN 2015)

  • Used SKNN (improved K-Nearest Neighbor) classifier
  • Features extracted from paper data and patent data
  • Achieved 67.24% precision, 68.46% recall classifying technologies into 5 hype cycle phases
  • Noted accuracy drops in phases 4-5 due to small training samples

BIMATEM (Manrique-Castillo et al., Scientometrics 2018):

  • Combines three data streams: scientific papers (logistic growth), patents (logistic growth), news (hype-type curve)
  • Fits a logistic growth curve to paper/patent counts
  • Fits a hype-type curve to news counts
  • Assigns TRL (Technology Readiness Level) based on curve position
  • Applied successfully to additive manufacturing technologies

Composite Early Warning Index (CEWI) approach (financial crisis literature):

  • Uses PCA to synthesize diverse variables into a single latent factor
  • Applicable to combining patent, publication, trends, and market signals

Proposed composite score, a weighted sum of the per-signal scores:

HypeScore(tech, t) = w1 * Patent_Signal(tech, t)
                   + w2 * Publication_Signal(tech, t)
                   + w3 * Trends_Signal(tech, t)
                   + w4 * News_Signal(tech, t)
                   + w5 * Vendor_Signal(tech, t)
                   + w6 * Standards_Signal(tech, t)
                   + w7 * Earnings_Signal(tech, t)
                   + w8 * Jobs_Signal(tech, t)

Signal Time Horizons and Weights

| Signal | Lead/Lag | Suggested Weight | Update Freq |
|---|---|---|---|
| Patents | Leads by 3-5 years | 0.10 | Quarterly |
| Publications | Leads by 1-3 years | 0.10 | Monthly |
| Google Trends | Real-time | 0.20 | Monthly |
| News Volume | Real-time | 0.10 | Weekly |
| Vendor Count/Price | Real-time | 0.25 | Daily |
| Standards Progress | Leads by 2-4 years | 0.10 | Quarterly |
| Earnings Calls | Lags by 6-12 months | 0.10 | Quarterly |
| Job Postings | Lags by 12-18 months | 0.05 | Monthly |

Vendor Count/Price gets the highest weight because it is the only direct market measurement.
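
The weighted sum with the suggested weights can be sketched as below. One assumption is added here that the text does not specify: when only some signals are available (as in a three-signal first phase), the weights of the present signals are renormalized so the score stays on the same 0-100 scale. The key names are hypothetical.

```python
# Suggested weights from the table above; they sum to 1.0.
WEIGHTS = {
    "patents": 0.10,
    "publications": 0.10,
    "trends": 0.20,
    "news": 0.10,
    "vendor": 0.25,
    "standards": 0.10,
    "earnings": 0.10,
    "jobs": 0.05,
}


def hype_score(signals):
    """Weighted mean of the supplied 0-100 signals.

    Weights are renormalized over the signals actually present (assumption).
    """
    unknown = set(signals) - set(WEIGHTS)
    if unknown:
        raise KeyError(f"unknown signals: {sorted(unknown)}")
    total_weight = sum(WEIGHTS[name] for name in signals)
    if total_weight == 0:
        raise ValueError("no signals supplied")
    weighted = sum(WEIGHTS[name] * value for name, value in signals.items())
    return weighted / total_weight


# Hypothetical normalized inputs for a three-signal run.
partial = hype_score({"trends": 80, "vendor": 40, "publications": 60})
```

Without renormalization a three-signal score could never exceed 55, which would make scores from different implementation phases hard to compare.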

Phase Classification Approach

  1. Normalize each signal to 0-100 scale per technology
  2. Calculate rate of change (first derivative) for each signal
  3. Calculate acceleration (second derivative) for trend detection
  4. Apply phase classification rules:

| Phase | Signal Pattern |
|---|---|
| Technology Trigger | Patents rising, Publications starting, Trends near zero, Vendors 0-3, Standard in study group |
| Peak of Inflated Expectations | Trends peaking, News volume peaking, Publications rising fast, Vendors 3-8, Sentiment highly positive |
| Trough of Disillusionment | Trends declining, News declining, Sentiment negative, Vendors may decrease, Publications slowing |
| Slope of Enlightenment | Vendors growing steadily, Price CV declining, Earnings mentions increasing, Jobs increasing, Standards published |
| Plateau of Productivity | All signals stable, Price CV < 0.2, Vendor count > 30, Publications steady, Standards have amendments |

  5. Optional ML layer: Train a Random Forest or Gradient Boosting classifier on known technology trajectories (100G, 40G, 10G historical data as training set)
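
Steps 1 to 3 and one of the rules can be sketched with plain lists. The normalization here is min-max over each technology's own history, and "stable" is taken to mean that no period-over-period change exceeds a tolerance of 5 points; both are assumptions, as are the function names.

```python
def normalize_0_100(values):
    """Min-max scale a series to 0-100; a flat series maps to 50 (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [50.0] * len(values)
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def diff(values):
    """Finite difference between consecutive points."""
    return [b - a for a, b in zip(values, values[1:])]


def is_plateau(price_cv, vendor_count, signal_series, tolerance=5.0):
    """Plateau rule: all signals stable, price CV < 0.2, vendor count > 30."""
    stable = all(
        abs(step) <= tolerance for series in signal_series for step in diff(series)
    )
    return stable and price_cv < 0.2 and vendor_count > 30


# Hypothetical raw signal history for one technology.
raw = [2, 4, 8, 6]
normalized = normalize_0_100(raw)
velocity = diff(normalized)        # first derivative
acceleration = diff(velocity)      # second derivative
```

The remaining rows of the rule table would follow the same shape: each phase becomes a predicate over normalized levels, velocities, and the vendor and price metrics.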

Norton-Bass Integration

The composite signal feeds into the Norton-Bass multigenerational diffusion model:

  • p (innovation coefficient) ← derived from patent/publication velocity
  • q (imitation coefficient) ← derived from vendor count growth rate + Google Trends
  • M (market potential) ← derived from addressable port count in deployed switches
  • tau (generation introduction time) ← derived from IEEE standard publication date
  • Python: scipy.optimize.curve_fit with Bass model function, or bassmodeldiffusion package (PyPI)
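
For reference, the two-generation form of the model from Norton & Bass (1987) can be written in a few lines. The sketch assumes both generations share the same p and q, as in the original paper; the function names and parameter values are hypothetical and not fitted to any data.

```python
import math


def bass_cdf(t, p, q):
    """Cumulative Bass adoption fraction F(t); zero before introduction."""
    if t <= 0:
        return 0.0
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def norton_bass_two_gen(t, p, q, m1, m2, tau2):
    """Installed base of generations 1 and 2 at time t.

    m1, m2: incremental market potentials; tau2: introduction time of gen 2.
    """
    f1 = bass_cdf(t, p, q)
    f2 = bass_cdf(t - tau2, p, q)
    s1 = m1 * f1 * (1.0 - f2)          # gen 1 loses adopters to gen 2
    s2 = f2 * (m2 + m1 * f1)           # gen 2 gains its own plus migrated ones
    return s1, s2


# Hypothetical parameters, for illustration only.
P, Q, M1, M2, TAU2 = 0.03, 0.4, 100.0, 150.0, 4.0
before = norton_bass_two_gen(3.0, P, Q, M1, M2, TAU2)
during = norton_bass_two_gen(8.0, P, Q, M1, M2, TAU2)
late = norton_bass_two_gen(200.0, P, Q, M1, M2, TAU2)
```

A vectorized version of the same functions could be passed to scipy.optimize.curve_fit to estimate p, q, and the market potentials from observed shipment or listing data.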

Prioritized Implementation Plan

Phase 1: Quick Wins (Week 1-2) — HIGH VALUE, LOW EFFORT

| # | Signal | API | Cost | Complexity | Why First |
|---|---|---|---|---|---|
| 1 | Google Trends | pytrends | Free | 1/5 | Already validated, immediate hype measurement |
| 2 | Vendor Count/Price | Internal DB | Free | 1/5 | Data already being collected by TIP scrapers |
| 3 | Semantic Scholar | REST API | Free | 1/5 | Already validated, publication trend curves |

Deliverable: Basic hype cycle positioning for all tracked technologies using 3 signals.

Phase 2: Depth Signals (Week 3-4) — HIGH VALUE, MODERATE EFFORT

| # | Signal | API | Cost | Complexity |
|---|---|---|---|---|
| 4 | SEC EDGAR EFTS | REST API | Free | 2/5 |
| 5 | Standards Progress | Manual curation | Free | 2/5 |
| 6 | Trade Press Scraping | Crawlee (existing) | Free | 2/5 |

Deliverable: 6-signal composite with financial and standards validation.

Phase 3: Extended Signals (Week 5-8) — MODERATE VALUE, HIGHER EFFORT

| # | Signal | API | Cost | Complexity |
|---|---|---|---|---|
| 7 | USPTO Patents | PatentsView | Free (need API key) | 2/5 |
| 8 | Reddit/PRAW | Reddit API | Free | 3/5 |
| 9 | Job Postings | TheirStack/FlyByAPIs | Free tier | 3/5 |
| 10 | Earnings Transcripts | FMP | Free tier | 2/5 |

Deliverable: Full 10-signal composite with ML phase classifier.

Phase 4: ML Calibration (Week 9-12)

  1. Collect historical data for training technologies (10G, 40G, 100G — known trajectories)
  2. Train Random Forest classifier on multi-signal features
  3. Validate against known Gartner positioning (where available)
  4. Implement Norton-Bass curve fitting with signal-derived parameters
  5. Build confidence scoring and uncertainty quantification

Key Python Dependencies

# Phase 1
pytrends==4.9.2          # Google Trends
semanticscholar          # Paper counts
requests                 # General HTTP
scipy                    # Curve fitting (Norton-Bass)
numpy                    # Numerical
pandas                   # Data manipulation

# Phase 2
beautifulsoup4           # HTML parsing (trade press)
vaderSentiment           # Sentiment analysis

# Phase 3
praw                     # Reddit API
bassmodeldiffusion       # Bass model fitting

# Phase 4
scikit-learn             # Random Forest, PCA
xgboost                  # Gradient boosting

Signal Correlation Summary

| Signal | Free? | Real-time? | Validated? | Hype Correlation | Implementation |
|---|---|---|---|---|---|
| Google Trends | Yes | Yes | YES | HIGH (academic proof) | 1/5 |
| Vendor Count/Price | Yes | Yes | YES (own data) | HIGHEST (direct) | 1/5 |
| Semantic Scholar | Yes | Yes | YES | MODERATE-HIGH | 1/5 |
| SEC EDGAR EFTS | Yes | Yes | YES | MODERATE | 2/5 |
| News/Trade Press | Yes | Weekly | Partial | HIGH | 2/5 |
| Standards Progress | Yes | Quarterly | YES | HIGH (leading) | 2/5 |
| Patents (USPTO) | Yes | Quarterly | Not yet (API key needed) | MODERATE-HIGH | 2/5 |
| Reddit/PRAW | Yes | Daily | Not yet | LOW-MODERATE | 3/5 |
| Job Postings | Free tier | Daily | Not yet | MODERATE | 3/5 |
| Earnings Calls | Free tier | Quarterly | Not yet | MODERATE | 2/5 |

References

Academic Papers

  • Manrique-Castillo et al. (2018). "A bibliometric method for assessing technological maturity: the case of additive manufacturing." Scientometrics 117(3).
  • Ren, Z. (2015). "An Approach for Predicting Hype Cycle Based on Machine Learning." CEUR-WS Vol-1437.
  • Jun, S.P. (2012). "An empirical study of users' hype cycle based on search traffic." Scientometrics 91(1), 81-99.
  • van Lente, H., Spitters, C., & Peine, A. (2013). "Comparing technological hype cycles." Technological Forecasting and Social Change 80(8).
  • Gao, L. et al. (2013). "Technology life cycle analysis method based on patent documents." Technological Forecasting and Social Change.
  • Huang et al. (2022). "Technology life cycle analysis: From the dynamic perspective of patent citation networks." Technological Forecasting and Social Change.
  • Choi, H. & Varian, H. (2010). "Predicting the Present with Google Trends." SSRN.
  • Dedehayir, O. & Steinert, M. (2016). "The hype cycle model: A review and future directions." Technological Forecasting and Social Change 108(C).
  • Norton, J.A. & Bass, F.M. (1987). "A diffusion theory model of adoption and substitution for successive generations of high-technology products." Management Science 33(9).
  • Gehri, L. (2021). "NANOG Mailing List Analysis." ETH Zurich Semester Thesis.

API Documentation

Python Libraries