Cut AI Costs by 70% with Intelligent Routing
Smart query routing avoids expensive web searches when context is sufficient. Save money without sacrificing quality.
The Cost Problem
AI APIs are expensive. A single web search can cost $0.01-0.05, and LLM inference adds more. For agents making thousands of requests, costs spiral quickly.
Where Money Goes
| Operation | Cost |
|---|---|
| Web Search | $0.01-0.05 per query |
| LLM Inference | $0.0001-0.01 per 1K tokens |
| Total per Agent Request | $0.02-0.10 |
At 10,000 requests/month, that works out to $200-1,000/month.
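The monthly figure is just request volume multiplied by the per-request range from the table. A minimal sketch of that arithmetic, where `monthlyCostRange` and `PER_REQUEST_COST` are hypothetical names used only for illustration:

```typescript
// Per-request cost range taken from the table above (USD).
const PER_REQUEST_COST = { low: 0.02, high: 0.10 };

// Hypothetical helper: projects a monthly cost range from request volume.
function monthlyCostRange(requestsPerMonth: number): { low: number; high: number } {
  return {
    low: requestsPerMonth * PER_REQUEST_COST.low,
    high: requestsPerMonth * PER_REQUEST_COST.high,
  };
}

const estimate = monthlyCostRange(10000);
console.log(`$${estimate.low.toFixed(0)}-${estimate.high.toFixed(0)}/month`);
```

Swapping in your own volume gives a quick baseline to compare against once routing is enabled.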
The Routing Solution
Deep Research API's intelligent router analyzes each query and sends it down one of three paths, running a web search only when the query needs one:
CHAT Path (Near-Free)
For casual queries that don't need web search:
// Query: "What's 2+2?"
// Router: No search needed
// Cost: $0.0001 (LLM only)
const response = await deepResearch({
  query: "What's 2+2?"
})
// Uses cached LLM, no web search
// Cost: ~$0.0001
CONTEXT Path (Near-Free)
For queries answerable from provided context:
// Query: "What's in my document?"
// Router: Answer from context
// Cost: $0.0001 (LLM only)
const response = await deepResearch({
  query: "What's in my document?",
  context: userDocument
})
// Uses provided context, no web search
// Cost: ~$0.0001
RESEARCH Path (Paid)
Only when web search is actually needed:
// Query: "What's Tesla's current stock price?"
// Router: Web search required
// Cost: $0.01-0.05 (search + LLM)
const response = await deepResearch({
  query: "What's Tesla's current stock price?"
})
// Performs web search + LLM
// Cost: $0.01-0.05
How Routing Works
Intent Classification
The router classifies each query from its structure and any provided context, returning an intent, a confidence score, and a short reason:
interface RouterAnalysis {
  intent: "CHAT" | "CONTEXT" | "RESEARCH"
  confidence: number
  reason: string
}
const analysis: RouterAnalysis = {
  intent: "CONTEXT",
  confidence: 0.95,
  reason: "Answerable from provided context"
}
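If an analysis object ever reaches your code as untyped JSON, a runtime check can catch malformed values before you act on them. The sketch below is an assumption-heavy illustration: `isRouterAnalysis` is a hypothetical helper, and the 0 to 1 range for `confidence` is inferred from the 0.95 example, not confirmed by the API.

```typescript
type Intent = "CHAT" | "CONTEXT" | "RESEARCH";

interface RouterAnalysis {
  intent: Intent;
  confidence: number;
  reason: string;
}

const INTENTS: readonly string[] = ["CHAT", "CONTEXT", "RESEARCH"];

// Hypothetical runtime guard; the real API may already validate its own output.
function isRouterAnalysis(value: unknown): value is RouterAnalysis {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.intent === "string" &&
    INTENTS.indexOf(v.intent) !== -1 &&
    typeof v.confidence === "number" &&
    // Assumption: confidence is a probability-like score in [0, 1].
    v.confidence >= 0 &&
    v.confidence <= 1 &&
    typeof v.reason === "string"
  );
}

const parsed: unknown = JSON.parse(
  '{"intent":"CONTEXT","confidence":0.95,"reason":"Answerable from provided context"}'
);
if (isRouterAnalysis(parsed)) {
  console.log(parsed.intent, parsed.confidence);
}
```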
Decision Logic
if (analysis.intent === "CHAT") {
  // Casual query - use fast LLM
  routeTo("chat_path")
} else if (analysis.intent === "CONTEXT") {
  // Context available - skip search
  routeTo("context_path")
} else if (analysis.intent === "RESEARCH") {
  // Web search needed
  routeTo("research_path")
}
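The same decision can be written as a lookup table, which also makes it easy to experiment with a confidence threshold. This is only a sketch: `choosePath`, `PATH_FOR_INTENT`, and the 0.7 cutoff are hypothetical, and falling back to the research path on low confidence is an assumed policy, not documented router behavior.

```typescript
type Intent = "CHAT" | "CONTEXT" | "RESEARCH";
type Path = "chat_path" | "context_path" | "research_path";

interface RouterAnalysis {
  intent: Intent;
  confidence: number;
  reason: string;
}

// Same mapping as the if/else chain, expressed as a lookup table.
const PATH_FOR_INTENT: Record<Intent, Path> = {
  CHAT: "chat_path",
  CONTEXT: "context_path",
  RESEARCH: "research_path",
};

// Assumed cutoff: below this, prefer the search-backed path to avoid a cheap wrong answer.
const MIN_CONFIDENCE = 0.7;

function choosePath(analysis: RouterAnalysis, minConfidence: number = MIN_CONFIDENCE): Path {
  if (analysis.confidence < minConfidence) return "research_path";
  return PATH_FOR_INTENT[analysis.intent];
}

console.log(
  choosePath({ intent: "CONTEXT", confidence: 0.95, reason: "Answerable from provided context" })
);
```

A table keeps the intent-to-path mapping in one place, and the compiler flags a missing entry if a new intent is ever added to the union.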
Real-World Savings
Example 1: Customer Support Bot
// Without routing: Every query = $0.02
100 queries/day × $0.02 = $2/day
// With routing:
- 60 queries × $0.0001 (chat) = $0.006
- 40 queries × $0.0001 (context) = $0.004
- 0 queries × $0.02 (research) = $0
Total: $0.01/day
// Savings: 99.5%
Example 2: Document QA System
// Without routing: Every query = $0.02
1000 queries/day × $0.02 = $20/day
// With routing:
- 950 queries × $0.0001 (context) = $0.095
- 50 queries × $0.02 (research) = $1
Total: $1.095/day
// Savings: 94.5%
Example 3: Research Assistant
// Without routing: Every query = $0.02
500 queries/day × $0.02 = $10/day
// With routing:
- 200 queries × $0.0001 (chat) = $0.02
- 300 queries × $0.02 (research) = $6
Total: $6.02/day
// Savings: 39.8%
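All three examples follow the same arithmetic: multiply each path's query count by its per-query cost, then compare the sum with paying the flat $0.02 for every query. A small sketch for running the numbers on your own traffic mix follows; `routingSavings` and the constant names are hypothetical, and the prices are the illustrative ones used in the examples.

```typescript
type Path = "chat" | "context" | "research";

// Per-query costs used in the examples above (USD).
const COST_PER_QUERY: Record<Path, number> = {
  chat: 0.0001,
  context: 0.0001,
  research: 0.02,
};

// Cost per query when every request goes through web search.
const UNROUTED_COST = 0.02;

interface SavingsReport {
  unrouted: number;   // daily cost without routing
  routed: number;     // daily cost with routing
  savingsPct: number; // percentage saved by routing
}

function routingSavings(mix: Partial<Record<Path, number>>): SavingsReport {
  let totalQueries = 0;
  let routed = 0;
  for (const path of Object.keys(COST_PER_QUERY) as Path[]) {
    const count = mix[path] ?? 0;
    totalQueries += count;
    routed += count * COST_PER_QUERY[path];
  }
  const unrouted = totalQueries * UNROUTED_COST;
  const savingsPct = unrouted === 0 ? 0 : ((unrouted - routed) / unrouted) * 100;
  return { unrouted, routed, savingsPct };
}

// Example 2: Document QA System
const docQa = routingSavings({ context: 950, research: 50 });
console.log(docQa.routed.toFixed(3), docQa.savingsPct.toFixed(1));
```

The savings depend almost entirely on the share of queries that still need research, which is why Example 3 saves far less than the other two.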
Best Practices
Provide Context
Include relevant context whenever you have it, so the router can answer from it instead of searching the web:
const response = await deepResearch({
  query: "What's the status of my order #12345?",
  context: orderHistory
})
// Routes to CONTEXT (~$0.0001, LLM only) instead of RESEARCH ($0.01-0.05)
Batch Queries
Combine multiple queries when possible:
const results = await deepResearch({
  queries: [
    "Order #12345 status",
    "Order #12346 status",
    "Order #12347 status"
  ]
})
// Single API call, shared context
Monitor Routing
Check which paths your queries use:
const response = await deepResearch({ query: "..." })
console.log("Routed to:", response.meta?.routed_to)
console.log("Search skipped:", response.meta?.search_skipped)
// Optimize based on routing patterns
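To see patterns across many requests, the per-response metadata can be tallied into a simple summary. The sketch below assumes only the two `meta` fields shown above; `summarizeRouting` is a hypothetical helper, and the `routed_to` values in the usage example are placeholders because the exact strings the API returns are not specified here.

```typescript
// Shape assumed from the snippet above; the real response may carry more fields.
interface RoutedResponse {
  meta?: { routed_to?: string; search_skipped?: boolean };
}

interface RoutingSummary {
  byPath: Record<string, number>; // request count per routed_to value
  searchSkippedRate: number;      // fraction of requests that skipped web search
}

function summarizeRouting(responses: RoutedResponse[]): RoutingSummary {
  const byPath: Record<string, number> = {};
  let skipped = 0;
  for (const r of responses) {
    // Responses without routing metadata are counted under "unknown".
    const path = r.meta?.routed_to ?? "unknown";
    byPath[path] = (byPath[path] ?? 0) + 1;
    if (r.meta?.search_skipped) skipped += 1;
  }
  const searchSkippedRate = responses.length === 0 ? 0 : skipped / responses.length;
  return { byPath, searchSkippedRate };
}

// Placeholder data; real routed_to values may differ.
const summary = summarizeRouting([
  { meta: { routed_to: "CONTEXT", search_skipped: true } },
  { meta: { routed_to: "RESEARCH", search_skipped: false } },
  { meta: { routed_to: "CONTEXT", search_skipped: true } },
  {},
]);
console.log(summary.byPath, summary.searchSkippedRate);
```

A low skip rate on traffic you expected to be context-answerable is a hint that requests are missing the `context` field.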
Get Started
Start saving on AI costs today.