The platform

One continuous engine. Four pillars. Five stages. Ship-ready strategy at the end.

Theia is a single integrated intelligence engine — not a dashboard, not a research wave. Every piece runs continuously. Every output is queryable. Every claim is source-traceable.

The five-stage pipeline.

Each stage has a single responsibility and a stable output schema. Downstream stages never re-compute upstream work.

Collect

DataForSEO for SERP, AI Overview citations, Labs keyword expansion. Oxylabs for deep-scrape of reviews, YouTube transcripts, web articles, retailer pages. JungleScout / Stackline for marketplace sales. GfK and Canon 1P feeds where licensed. 8,000+ classified deep-web sources for B2B/industrial.

Enrich

Claude Haiku extracts features, benefits, use cases, comparisons and sentiment from every source — in the source language. Budget-guarded ($50 cap per batch). Distinctive-keyword scoping prevents runaway cost.
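The budget guard can be pictured as a running cost check that stops a batch before it crosses the cap. A minimal sketch, assuming a hypothetical per-document cost estimate; `BudgetGuard`, `run_batch` and the cost figures are illustrative names and values, not Theia's actual implementation:

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when the next call would push a batch past its cap."""


@dataclass
class BudgetGuard:
    cap_usd: float = 50.0   # per-batch cap quoted in the text
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        # Refuse the charge before spending, so the cap is never crossed.
        projected = self.spent_usd + cost_usd
        if projected > self.cap_usd:
            raise BudgetExceeded(
                f"batch would reach ${projected:.2f}, cap is ${self.cap_usd:.2f}"
            )
        self.spent_usd = projected


def run_batch(doc_costs, guard):
    """Process documents in order until the guard stops the batch.

    doc_costs: hypothetical estimated extraction cost per document, in USD.
    Returns the number of documents processed.
    """
    processed = 0
    for cost in doc_costs:
        try:
            guard.charge(cost)
        except BudgetExceeded:
            break
        processed += 1  # the extraction call itself would happen here
    return processed
```

Checking the projected spend before each call, rather than after, is what keeps a single expensive document from overshooting the cap.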

Structure

Bipartite keyword × product graphs. Leiden Surprise clustering finds market segments. HHI-weighted distinctiveness names them. Cross-language harmonisation maps raw labels to canonical properties. The result: a stable, queryable graph.
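To make the naming step concrete: the Herfindahl-Hirschman Index (HHI) is the sum of squared shares, so a keyword whose weight sits in one segment scores 1.0 and a keyword spread evenly over two segments scores 0.5. The sketch below assumes segment membership already exists (for example from the clustering step) and uses a hypothetical scoring rule, keyword weight in the segment multiplied by the keyword's HHI; the function names and the rule itself are assumptions for illustration:

```python
from collections import defaultdict


def hhi(weights):
    """Herfindahl-Hirschman Index: sum of squared shares, between 0 and 1."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum((w / total) ** 2 for w in weights)


def name_segments(edges, membership):
    """Pick the most distinctive keyword for each segment.

    edges: (keyword, product, weight) triples from the bipartite graph.
    membership: product -> segment id, assumed to come from clustering.
    Scoring rule (an assumption): weight in segment * keyword HHI.
    """
    per_keyword = defaultdict(lambda: defaultdict(float))
    for keyword, product, weight in edges:
        per_keyword[keyword][membership[product]] += weight

    best = {}
    for keyword, by_segment in per_keyword.items():
        concentration = hhi(list(by_segment.values()))
        for segment, weight in by_segment.items():
            score = weight * concentration
            if segment not in best or score > best[segment][1]:
                best[segment] = (keyword, score)
    return {segment: keyword for segment, (keyword, _) in best.items()}
```

A generic term such as "camera" carries high weight everywhere, so its HHI is low and it loses to a narrower term that is concentrated in one segment.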

Strategise

Four agents in sequence: L1 Category Brief (what the market values), L2 Perception Report (how products perform), L3 Situation Analysis (what to do), L4 Content Generation (ship-ready listings, PDPs, briefs).

Converse

MCP-exposed intelligence. Connect Claude Desktop, Cursor or a custom agent. Ask natural-language questions against the full repository. Sourced answers in seconds.

The four pillars.

Mapped to the customer journey: Demand → Visibility → Sales → Perception (and back). Most platforms cover one. Theia connects all four into the same graph.

Demand

Search impressions & estimated volume

Google (GSC, Trends, Keyword Planner), Amazon search volume, distinctive keywords per segment

Visibility

Click-weighted share of voice

Google rankings, Amazon rankings, AI Overview citation share, CTR-curve weighted
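Click-weighted share of voice can be sketched as: estimate clicks per ranking as search volume times the click-through rate for that position, sum per brand, then divide by the total. The CTR values below are placeholders, not measured figures, and `share_of_voice` is a hypothetical helper name:

```python
# Placeholder CTR curve: expected click-through rate by ranking position.
CTR_BY_POSITION = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.06, 5: 0.04}


def share_of_voice(rankings, volumes, ctr=CTR_BY_POSITION):
    """Click-weighted share of voice per brand.

    rankings: keyword -> list of (position, brand) results.
    volumes: keyword -> estimated searches in the period.
    Positions missing from the CTR curve contribute zero clicks.
    """
    clicks = {}
    for keyword, results in rankings.items():
        for position, brand in results:
            estimated = volumes[keyword] * ctr.get(position, 0.0)
            clicks[brand] = clicks.get(brand, 0.0) + estimated
    total = sum(clicks.values())
    if total == 0:
        return {}
    return {brand: c / total for brand, c in clicks.items()}
```

Weighting by the CTR curve means a position-1 ranking on a high-volume keyword counts for far more than a position-5 ranking on a niche one, which a plain count of rankings would miss.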

Sales

Units, revenue, share — daily/weekly

1P: Vendor Central + Canon internal. 3P: Stackline (Amazon), GfK (all channels)

Perception

Feature sentiment & trajectory

Reviews, YouTube, Reddit, web articles, BazaarVoice, AI Overviews — multi-source, multi-language

From data to deck: the L1-L4 strategy chain.

The strategy agent chain is the delivery surface. Each agent reads from pre-computed structured tables — never from raw text — which makes them fast, cheap, and reproducible.

Category Brief

What the market values: pain points, growth levers, audience segments, defining properties.

Perception Report

How your products perform: feature sentiment, trajectories, competitive leaderboard.

Situation Analysis

What to do: priorities per product, brand vs market gaps, recommended actions with evidence.

Content Generation

Ship: Amazon listings, brand PDPs, retailer-specific PDPs, content briefs, marketing copy.

Read the deep-dive: strategy agent chain.

Conversational, by design.

Theia exposes the intelligence repository through MCP (Model Context Protocol). Connect Claude Desktop, Cursor, or your own agent. Ask in natural language. Get sourced answers in seconds.

Learn more about MCP →

// example query via MCP

"Which 5 vendors are most cited for GigE Vision compliance in EU machine-vision forums over the last 6 months?"

→ sourced answer in 4.2s
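Under the hood, an MCP client sends tool invocations as JSON-RPC 2.0 messages with the method `tools/call`. A rough sketch of what such a request could look like; the tool name `query_repository` and its `question` argument are hypothetical, since the page does not list Theia's actual tool names:

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument; the real server defines its own.
request = build_tool_call(
    1,
    "query_repository",
    {
        "question": (
            "Which 5 vendors are most cited for GigE Vision compliance "
            "in EU machine-vision forums over the last 6 months?"
        )
    },
)
print(json.dumps(request, indent=2))
```

In practice Claude Desktop, Cursor or a custom agent builds this message on the user's behalf; the user only types the question.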

Eight principles behind the engine.

The opinions baked into the architecture. Each one is a choice that distinguishes Theia from another dashboard or another research wave.

Search engines have already segmented your market.

Continuous, planetary-scale query-to-result matching IS market segmentation. Reading it beats inventing your own.

Four pillars, never three.

Demand, Visibility, Sales, Perception. Any platform claiming to do market intelligence with fewer is selling you one slice.

Continuous beats quarterly.

Weekly refresh is the new floor. Quarterly waves discard 90% of the signal that actually moves brands.

Native language, then harmonise.

German extraction stays in German until the canonical mapping step. Translation-first pipelines lose 80% of the signal.
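One way to picture "native language, then harmonise": extraction keeps the raw label in its source language, and only a later lookup maps it to a canonical property. The mapping table and the `harmonise` function below are illustrative assumptions, not Theia's schema:

```python
# Hypothetical canonical mapping: (language, raw label) -> canonical property.
CANONICAL = {
    ("de", "akkulaufzeit"): "battery_life",
    ("en", "battery life"): "battery_life",
    ("fr", "autonomie"): "battery_life",
    ("de", "bildqualität"): "image_quality",
}


def harmonise(extractions, mapping=CANONICAL):
    """Group native-language mentions under canonical properties.

    extractions: (language, raw_label, sentiment) tuples.
    Unmapped labels are returned separately for review rather than guessed.
    """
    grouped, unmapped = {}, []
    for language, label, sentiment in extractions:
        key = mapping.get((language, label.strip().lower()))
        if key is None:
            unmapped.append((language, label))
        else:
            grouped.setdefault(key, []).append((language, label, sentiment))
    return grouped, unmapped
```

Because the raw label travels with every record, a reviewer can always trace a canonical property back to the original German, French or English wording.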

LLMs for extraction. Math for connections.

Features and sentiment need language understanding. Graph edges need cosine similarity and Leiden clustering — not LLM judgement.
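The "math for connections" half can be sketched with plain cosine similarity: each keyword is a sparse vector of product weights, and an edge is kept only when two vectors point in a similar direction. The threshold of 0.5 and the function names are assumptions for illustration; a clustering step such as Leiden would then run on the resulting edge list:

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_edges(vectors, threshold=0.5):
    """Return (u, v, similarity) for every pair at or above the threshold.

    vectors: name -> {feature: weight}. The threshold is an assumed value.
    """
    names = sorted(vectors)
    edges = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            similarity = cosine(vectors[u], vectors[v])
            if similarity >= threshold:
                edges.append((u, v, similarity))
    return edges
```

The same inputs always give the same edges, which is the reproducibility argument for keeping LLM judgement out of this step.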

Fixed entities, not LLM-discovered ontology.

The schema is curated by domain experts. LLMs do the extraction they are good at. This is what makes the graph stable.

Trajectory matters more than level.

Sentiment 0.69 is meaningless. Sentiment improving from 0.41 to 0.69 over 12 months is a strategy.
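Trajectory can be expressed as a slope. A minimal sketch using an ordinary least-squares fit over (month, sentiment) points; `trend_per_month` is a hypothetical helper, and a production version would likely also account for sample size and noise:

```python
def trend_per_month(series):
    """Least-squares slope of sentiment over time.

    series: list of (month_index, sentiment) points.
    Returns the change in sentiment per month.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two points to measure a trend")
    mean_x = sum(x for x, _ in series) / n
    mean_y = sum(y for _, y in series) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in series)
    if sxx == 0:
        raise ValueError("all points share the same month")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in series)
    return sxy / sxx
```

With the figures from the text, a move from 0.41 to 0.69 over 12 months is a slope of about +0.023 per month, while a product that sat at 0.69 throughout has a slope of zero: same level, very different story.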

Strategy ships from the same engine as the data.

L1-L4 agents read the structured intelligence layer. The deck and the SQL come from the same place.

See the engine on your category.

A 30-minute walkthrough on your market with Pascal. Or a one-pack pilot that you scale from there.

Book a walkthrough

See pricing