Mapping Architecture & Semantic Indexing
This page covers how the unified revenue operations schema maps to the existing systems of record, how the semantic indexing pipeline processes it, and how the ReAct agent uses the indexed ontology to serve queries and take actions.
Schema-First Architecture
The revenue operations ontology is schema-first: the unified schema, not the FDW (foreign data wrapper) foreign tables, is the primary knowledge source. FDW serves as a mapping resolution layer that annotates which schema entities have live database connections.
flowchart TD
subgraph SchemaOverlay [Unified Schema - rev-ops-schema.yaml]
SF["Salesforce Entities<br/>Deal_Opportunity, Account_Profile, Contact..."]
GONG["Gong/Email Entities - virtual<br/>Conversation_Record, Email_Thread, Calendar_Event..."]
ORA["Oracle Finance Entities - virtual<br/>AR_Invoice, Revenue_Schedule, Payment..."]
PROD["Product Entities - virtual<br/>Subscription, Product_Usage..."]
UNS["Unstructured - virtual<br/>Contract_Record, Competitive_Signal..."]
end
subgraph FDWLayer [FDW Mapping Resolution]
DISC["FDW Discovery Service<br/>pg_foreign_server + information_schema"]
MATCH["Match schema fdw_table<br/>to live foreign tables"]
end
subgraph Status [Mapping Status]
MAPPED["MAPPED<br/>Live FDW table exists<br/>Queryable via SQL"]
VIRTUAL["VIRTUAL<br/>Data via integration<br/>Not directly queryable"]
UNMAPPED["UNMAPPED<br/>Expected FDW table<br/>not yet connected"]
end
SF --> MATCH
DISC --> MATCH
MATCH --> MAPPED
GONG --> VIRTUAL
ORA --> VIRTUAL
PROD --> VIRTUAL
UNS --> VIRTUAL
Entity Mapping Status
| Status | Meaning | Count | Example |
|---|---|---|---|
| Mapped | Live FDW foreign table exists; entity is queryable via the `query_table` tool | ~6 | Deal_Opportunity, Account_Profile, Contact, Opportunity_Line_Item |
| Virtual | Entity data flows via integration sync; not directly queryable via FDW | ~28 | Conversation_Record, AR_Invoice, Revenue_Schedule, Subscription |
| Unmapped | Schema defines the entity but no FDW table or integration is connected yet | ~4 | Deal_Velocity, Pipeline_Snapshot (derived entities) |
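
The three statuses above can be read as a small classification rule. The following is a minimal sketch of that rule, not the actual `fdw_mapping_resolver.py` logic: the `SchemaEntity` shape, its `fdw_table` and `virtual` fields, and the table names used in the usage lines are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MappingStatus(Enum):
    MAPPED = "mapped"
    VIRTUAL = "virtual"
    UNMAPPED = "unmapped"


@dataclass(frozen=True)
class SchemaEntity:
    """Hypothetical, simplified view of one entity in the unified schema."""
    name: str
    fdw_table: Optional[str] = None  # assumed field: expected foreign table name
    virtual: bool = False            # assumed flag: data arrives via integration sync


def resolve_status(entity: SchemaEntity, live_tables: set[str]) -> MappingStatus:
    """Classify one entity against the set of live foreign tables."""
    if entity.virtual:
        return MappingStatus.VIRTUAL
    if entity.fdw_table is not None and entity.fdw_table in live_tables:
        return MappingStatus.MAPPED
    # Declared in the schema, but nothing is connected yet.
    return MappingStatus.UNMAPPED


# Usage with made-up table names.
live = {"sf_opportunity"}
deal = SchemaEntity("Deal_Opportunity", fdw_table="sf_opportunity")
invoice = SchemaEntity("AR_Invoice", virtual=True)
velocity = SchemaEntity("Deal_Velocity", fdw_table="deal_velocity")
statuses = {e.name: resolve_status(e, live) for e in (deal, invoice, velocity)}
```

In a real deployment the `live_tables` set would presumably come from the FDW Discovery Service rather than a literal.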
Semantic RAG Pipeline (12 Steps)
The pipeline follows the same 12-step process as the other vertical ontologies (Steps 1A, 1B, and 1C, then Steps 2 through 10):
flowchart LR
subgraph Extraction [Extraction Phase]
S1A["Step 1A<br/>Unified Schema Extract<br/>~40 entities from *-schema.yaml"]
S1B["Step 1B<br/>FDW Mapping Resolution<br/>Match to live FDW tables"]
S1C["Step 1C<br/>Legacy FDW Extract<br/>Non-schema FDW tables"]
S2["Step 2<br/>Policy Extract<br/>6 policies from *.md"]
S3["Step 3<br/>Workflow Extract<br/>8 workflows from *.yaml"]
S4["Step 4<br/>Integration Extract<br/>6 integrations from *.yaml"]
end
subgraph Processing [Processing Phase]
S5["Step 5<br/>Normalize + Dedupe<br/>Merge into OntoBundle"]
S6["Step 6<br/>Enrich<br/>MEDDPICC/ASC606-aware"]
S7["Step 7<br/>Chunk + Embed"]
end
subgraph Loading [Loading Phase]
S8["Step 8<br/>Load pgvector"]
S9["Step 9<br/>Load Apache AGE Graph"]
S10["Step 10<br/>Validate - scenarios"]
end
S1A --> S1B --> S1C --> S5
S2 --> S5
S3 --> S5
S4 --> S5
S5 --> S6 --> S7 --> S8 --> S9 --> S10
Step Details
| Step | File | What It Does |
|---|---|---|
| 1A | `unified_schema_extractor.py` | Reads `rev-ops-schema.yaml`; creates OntoDocuments for every entity with MEDDPICC element, ASC 606 step, SaaS metric, and FDW mapping status annotations |
| 1B | `fdw_mapping_resolver.py` | Queries FDWDiscoveryService to match schema entities to live FDW foreign tables; annotates as mapped/virtual/unmapped |
| 1C | `fdw_extractor.py` | Original FDW extractor for non-schema tables (backward compatibility) |
| 2 | `policy_extractor.py` | Auto-discovers all `*.md` from `enterprise-knowledge/policies/` (includes 6 rev-ops policies) |
| 3 | `workflow_extractor.py` | Auto-discovers all `*.yaml` from `enterprise-knowledge/workflows/` (includes 8 rev-ops workflows) |
| 4 | `integration_extractor.py` | Auto-discovers all `*.yaml` from `enterprise-knowledge/integrations/` (includes 6 rev-ops integrations) |
| 5 | `normalizer.py` | Merges all extracted documents; deduplicates by ID; merges relationships and structured_metadata on collision |
| 6 | `enricher.py` | Schema entities: auto-enriched with MEDDPICC element, ASC 606 step, SaaS metric, and FDW status. Policies/workflows/integrations: LLM-enriched via gpt-4o-mini |
| 7 | `chunker.py` | 1 document = 1 chunk; batch embedded (20/batch) via OpenAI text-embedding-3-small |
| 8 | `vector_loader.py` | Upserted to pgvector `control_plane_embeddings` with content_type: `onto_schema`, `onto_policy`, `onto_workflow`, `onto_integration` |
| 9 | `graph_loader.py` | Nodes (Entity) and edges (triggers/syncs_to/constrained_by/depends_on/validates) merged into Apache AGE `enterprise_onto` graph |
| 10 | `validator.py` | Black-box test scenarios validating retrieval quality across 6 dimensions |
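
Step 5's "deduplicate by ID, merge on collision" behavior can be illustrated with a short sketch. This is not the code in `normalizer.py`: the `OntoDocument` fields shown here, the tuple form of a relationship, and the choice that later metadata values win on a key conflict are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable


@dataclass
class OntoDocument:
    """Hypothetical, reduced document shape used only for this sketch."""
    id: str
    content: str
    relationships: list[tuple[str, str]] = field(default_factory=list)  # (edge_type, target_id)
    structured_metadata: dict[str, Any] = field(default_factory=dict)


def normalize(documents: Iterable[OntoDocument]) -> list[OntoDocument]:
    """Deduplicate by ID; on collision, merge relationships and metadata."""
    merged: dict[str, OntoDocument] = {}
    for doc in documents:
        existing = merged.get(doc.id)
        if existing is None:
            # Copy the containers so the inputs are never mutated.
            merged[doc.id] = OntoDocument(
                doc.id, doc.content,
                list(doc.relationships), dict(doc.structured_metadata),
            )
            continue
        for rel in doc.relationships:
            if rel not in existing.relationships:
                existing.relationships.append(rel)
        # Assumption: on a key conflict the later document's value wins.
        existing.structured_metadata.update(doc.structured_metadata)
    return list(merged.values())


# Usage: the same entity seen by two extractors (values are illustrative).
from_schema = OntoDocument("Deal_Opportunity", "schema text",
                           [("depends_on", "Account_Profile")], {"fdw_status": "mapped"})
from_fdw = OntoDocument("Deal_Opportunity", "fdw text",
                        [("depends_on", "Account_Profile"), ("syncs_to", "Contact")],
                        {"source": "fdw"})
bundle = normalize([from_schema, from_fdw])
```

The first document seen keeps its content in this sketch; a real normalizer might instead prefer the schema-derived text, which the source does not specify.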
What Gets Indexed
| Source | Content Type | Approx Count |
|---|---|---|
| Unified schema (Salesforce + Gong + Oracle + Product + Documents + Market) | `onto_schema` | ~40 |
| Policies (6 rev-ops) | `onto_policy` | ~40+ (split by section) |
| Workflows (8 rev-ops) | `onto_workflow` | ~8 |
| Integrations (6 rev-ops) | `onto_integration` | ~6 |
| Total | — | ~95+ |
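
Because Step 7 treats one document as one chunk and embeds 20 chunks per batch, the counts above translate directly into a number of embedding calls. The sketch below works through that arithmetic using the approximate lower-bound counts from the table; the `batched` helper is a hypothetical stand-in, not a function from `chunker.py`.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 20) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Approximate lower-bound counts taken from the table above.
approx_counts = {
    "onto_schema": 40,
    "onto_policy": 40,
    "onto_workflow": 8,
    "onto_integration": 6,
}
total_chunks = sum(approx_counts.values())      # one document = one chunk
chunk_ids = list(range(total_chunks))           # placeholder chunk identifiers
batches = list(batched(chunk_ids, batch_size=20))
```

With these figures there are 94 chunks, which fall into five batches (four full batches of 20 and one of 14). Any additional policy sections beyond the "~40+" estimate extend the final batch or add another.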
ReAct Agent and Tools
The ReAct agent uses the indexed ontology to answer questions and take actions. The flow is: Search ontology -> Reason with policies -> Execute actions -> Validate compliance.
Tool Inventory
Read Tools
| Tool | Domain | What It Does |
|---|---|---|
| `search_enterprise_knowledge` | Core | Hybrid vector + graph search across all ontology types |
| `search_schema_knowledge` | Core | Vector search over FDW table definitions |
| `discover_tables` / `discover_columns` / `query_table` | Core | FDW table discovery and parameterized SQL queries |
| `check_policy_compliance` | Governance | Validates proposed actions against indexed policies |
| `get_deal_360` | RevOps | Assemble unified deal context: CRM + conversations + emails + product fit + competitive |
| `get_pipeline_health` | RevOps | Query pipeline coverage, velocity, concentration, and forecast accuracy |
| `get_account_health` | RevOps | Retrieve account health: usage, NPS, renewal status, expansion signals |
| `get_revenue_position` | RevOps | Query revenue recognition status, AR aging, and leakage metrics |
| `get_forecast_accuracy` | RevOps | Compare AI forecast vs. rep forecast vs. actuals over time |
Write Tools
| Tool | Risk Level | What It Does |
|---|---|---|
| `update_deal_score` | LOW_RISK_WRITE | Update deal health score and MEDDPICC assessment in Salesforce |
| `create_retention_task` | LOW_RISK_WRITE | Create retention/renewal task for CSM in Salesforce |
| `update_forecast_category` | LOW_RISK_WRITE | Update deal forecast category (Commit/Best Case/Pipeline) |
| `create_billing_schedule` | HIGH_RISK_WRITE | Create billing schedule in Oracle Finance (triggers invoicing) |
| `approve_discount` | HIGH_RISK_WRITE | Approve non-standard discount (triggers pricing policy check) |
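
One way to make use of the risk levels above is to gate high-risk writes behind a policy check before they run. The sketch below shows that idea only; the document does not say how the agent enforces risk levels, so the `may_execute` function, the gating rule, and the `policy_check` callback signature are all assumptions. The tool names and risk levels are taken from the table.

```python
from enum import Enum
from typing import Callable


class RiskLevel(Enum):
    LOW_RISK_WRITE = "LOW_RISK_WRITE"
    HIGH_RISK_WRITE = "HIGH_RISK_WRITE"


# Tool names and risk levels as listed in the Write Tools table.
WRITE_TOOL_RISK = {
    "update_deal_score": RiskLevel.LOW_RISK_WRITE,
    "create_retention_task": RiskLevel.LOW_RISK_WRITE,
    "update_forecast_category": RiskLevel.LOW_RISK_WRITE,
    "create_billing_schedule": RiskLevel.HIGH_RISK_WRITE,
    "approve_discount": RiskLevel.HIGH_RISK_WRITE,
}


def may_execute(tool_name: str, policy_check: Callable[[str], bool]) -> bool:
    """Assumed gating rule: high-risk writes need a passing policy check."""
    risk = WRITE_TOOL_RISK[tool_name]  # raises KeyError for unknown tools
    if risk is RiskLevel.HIGH_RISK_WRITE:
        return policy_check(tool_name)
    return True


# Usage with a stub check that rejects everything.
def deny_all(tool_name: str) -> bool:
    return False


low_risk_allowed = may_execute("update_deal_score", deny_all)
high_risk_allowed = may_execute("approve_discount", deny_all)
```

In practice the callback would likely wrap `check_policy_compliance`, whose real signature is not given in this document.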
End-to-End ReAct Flow
sequenceDiagram
participant User
participant Agent as ReAct Agent
participant RAG as Ontology Search
participant Policy as Policy Check
participant SoR as System of Record
User->>Agent: "What deals are at risk of slipping this quarter and why?"
Agent->>RAG: search_enterprise_knowledge("pipeline risk slip quarter deals")
RAG-->>Agent: Deal_Opportunity schema + Deal_Velocity + forecast-governance-policy + pipeline-risk workflow
Agent->>SoR: get_pipeline_health(period="Q2-2026", risk_filter="high_slip")
SoR-->>Agent: 12 deals with slip risk >0.6, total $4.2M at risk
Agent->>Agent: REASON: Top 3 deals have stalled stages + declining email engagement + no exec meeting in 30 days
Agent->>SoR: get_deal_360(opp_id="OPP-2026-1847")
SoR-->>Agent: MEDDPICC 42/100, champion silent 18 days, competitor detected, no next step committed
Agent->>User: 12 deals at risk ($4.2M). Top risk: OPP-1847 ($680K) — MEDDPICC score 42, champion silent 18 days, Competitor X detected in last call. Recommend: exec sponsor outreach + competitive battle card + re-qualify decision criteria.
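
The sequence above follows the usual ReAct pattern: decide on an action, call a tool, record the observation, repeat. A minimal sketch of that loop follows. The `react_loop` and `scripted_decide` functions are hypothetical, the tools are stubs returning values copied from the diagram, and the `query` keyword for `search_enterprise_knowledge` is an assumed parameter name. A real agent would use an LLM in place of the scripted decision function.

```python
from typing import Any, Callable, Optional

Action = tuple[str, dict[str, Any]]   # (tool name, keyword arguments)
Observation = tuple[str, Any]         # (tool name, tool result)
DecideFn = Callable[[str, list[Observation]], Optional[Action]]


def react_loop(decide: DecideFn, tools: dict[str, Callable[..., Any]],
               question: str, max_steps: int = 5) -> list[Observation]:
    """Alternate between choosing an action and observing its result."""
    observations: list[Observation] = []
    for _ in range(max_steps):
        action = decide(question, observations)
        if action is None:  # the decision function has enough to answer
            break
        name, kwargs = action
        observations.append((name, tools[name](**kwargs)))
    return observations


# Stub tools: canned results mirroring the diagram, not real integrations.
STUB_TOOLS: dict[str, Callable[..., Any]] = {
    "search_enterprise_knowledge": lambda query: ["Deal_Opportunity", "Deal_Velocity"],
    "get_pipeline_health": lambda period, risk_filter: {"deals_at_risk": 12},
    "get_deal_360": lambda opp_id: {"opp_id": opp_id, "meddpicc_score": 42},
}

# The three tool calls from the diagram, replayed in order.
SCRIPT: list[Action] = [
    ("search_enterprise_knowledge", {"query": "pipeline risk slip quarter deals"}),
    ("get_pipeline_health", {"period": "Q2-2026", "risk_filter": "high_slip"}),
    ("get_deal_360", {"opp_id": "OPP-2026-1847"}),
]


def scripted_decide(question: str, observations: list[Observation]) -> Optional[Action]:
    """Stand-in for LLM reasoning: returns the next scripted action, if any."""
    step = len(observations)
    return SCRIPT[step] if step < len(SCRIPT) else None


trace = react_loop(scripted_decide, STUB_TOOLS,
                   "What deals are at risk of slipping this quarter and why?")
```

The `max_steps` cap is a common safeguard against a decision function that never terminates; whether the actual agent uses one is not stated here.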
UI Integration
Data Plane Page
- Vertical selector filters data sources by domain (All / Rev-Ops / Supply Chain / CRM)
- Each source node shows ontology entity count and MEDDPICC element coverage
- Source cards display FDW mapping status (mapped / virtual) and entity count badges
Control Plane Page
- Semantic Layer tab shows vertical-level stats (Rev-Ops: 40 entities, 8 workflows, 6 policies, 6 integrations)
- MEDDPICC element distribution badges + ASC 606 step coverage badges
- Knowledge Formation and Semantic Explorer tabs support system and standard filtering
Reasoning Page
- ReAct Tools tab organizes tools into Read Tools and Write Tools with domain badges (Core / RevOps / CRM / Governance)
- AI Copilot system prompt includes revenue operations context and tool selection strategy
Configuration
The pipeline is configured via SemanticRagConfig:
| Parameter | Default | Purpose |
|---|---|---|
| `schema_dir` | `enterprise-knowledge/` | Directory containing `*-schema.yaml` files |
| `policy_path` | `enterprise-knowledge/policies/` | Directory with policy Markdown files |
| `workflow_path` | `enterprise-knowledge/workflows/` | Directory with workflow YAML files |
| `integration_path` | `enterprise-knowledge/integrations/` | Directory with integration YAML files |
| `skip_unified_schema` | `false` | Skip Step 1A (unified schema extraction) |
| `skip_fdw_mapping` | `false` | Skip Step 1B (FDW mapping resolution) |
| `enrich_with_llm` | `true` | Enable LLM enrichment for non-schema docs |
| `skip_graph` | `false` | Skip Apache AGE graph loading |
Trigger reindex via: POST /api/v1/control-plane/reindex
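
For orientation, the parameters and defaults above can be mirrored in a small dataclass, and the reindex call can be built with the standard library. This is a sketch only: the dataclass below is a stand-in that copies the table, not the project's real `SemanticRagConfig`, and the `build_reindex_request` helper, the base URL, and the absence of a request body or auth headers are assumptions.

```python
from dataclasses import dataclass
from urllib.request import Request


@dataclass
class SemanticRagConfig:
    """Stand-in mirroring the documented parameters and defaults."""
    schema_dir: str = "enterprise-knowledge/"
    policy_path: str = "enterprise-knowledge/policies/"
    workflow_path: str = "enterprise-knowledge/workflows/"
    integration_path: str = "enterprise-knowledge/integrations/"
    skip_unified_schema: bool = False   # skip Step 1A
    skip_fdw_mapping: bool = False      # skip Step 1B
    enrich_with_llm: bool = True        # LLM enrichment for non-schema docs
    skip_graph: bool = False            # skip Apache AGE graph loading


def build_reindex_request(base_url: str) -> Request:
    """Build (but do not send) the POST request for the reindex endpoint."""
    url = f"{base_url.rstrip('/')}/api/v1/control-plane/reindex"
    return Request(url, method="POST")


# Usage: a run that skips graph loading; the host below is a placeholder.
config = SemanticRagConfig(skip_graph=True)
request = build_reindex_request("http://localhost:8000/")
# urllib.request.urlopen(request) would send it once a server is reachable.
```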