Glossary

LLM Router

A system that directs incoming requests to the most appropriate language model based on query characteristics, optimizing for cost and performance.

What is an LLM Router?

LLM routers analyze each incoming request and select which model should handle it, balancing factors such as query complexity, required capabilities, latency requirements, and cost. Simple questions route to fast, inexpensive models, while complex reasoning tasks go to more capable but expensive ones. This optimization reduces costs while maintaining quality by avoiding overuse of premium models for tasks that simpler models can handle.

Routing decisions may consider query intent, length, domain, required context window, or real-time performance metrics. Advanced routers learn from outcomes, improving routing accuracy over time. Challenges include the latency overhead of routing decisions and ensuring routing logic doesn't become a single point of failure. Some routers cascade through models, starting with simple ones and escalating if confidence is low.
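A minimal sketch of a rule-based router along these lines is shown below. The model names, tier thresholds, and the keyword-based complexity heuristic are all illustrative assumptions, not a real routing API; production routers typically use learned classifiers or embedding similarity rather than keyword matching.

```python
# Hypothetical rule-based LLM router: score each query's complexity,
# then pick the cheapest tier whose capability matches that score.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_query: float  # USD, illustrative figures
    latency_ms: int

SMALL = Model("local-small", 0.0001, 2)
MEDIUM = Model("cloud-mid", 0.01, 200)
LARGE = Model("gpt-4", 0.05, 800)

def complexity_score(query: str) -> float:
    """Crude heuristic: longer queries and reasoning keywords score higher."""
    score = min(len(query.split()) / 50, 1.0)
    if any(kw in query.lower() for kw in ("why", "explain", "compare", "design")):
        score += 0.4
    return min(score, 1.0)

def route(query: str) -> Model:
    """Map the complexity score onto three cost/capability tiers."""
    score = complexity_score(query)
    if score < 0.3:
        return SMALL
    if score < 0.7:
        return MEDIUM
    return LARGE

print(route("What are your opening hours?").name)  # routes to local-small
```

A cascading router would instead call `SMALL` first and escalate to `MEDIUM` or `LARGE` only when the cheaper model reports low confidence, trading a possible extra round trip for lower average cost.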

Example

An agent platform uses an LLM router that sends simple FAQ questions to a small local model (2ms latency, $0.0001/query), moderately complex queries to a mid-size cloud model (200ms, $0.01/query), and complex reasoning or creative tasks to GPT-4 (800ms, $0.05/query), reducing average cost by 60% while maintaining quality.
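The 60% figure can be sanity-checked with a back-of-envelope calculation using the per-tier prices above. The 20/50/30 traffic split is a hypothetical mix chosen for illustration; the actual savings depend entirely on the platform's real query distribution.

```python
# Back-of-envelope cost check for the three tiers in the example.
# The traffic mix is an assumed illustration, not measured data.
costs = {"small": 0.0001, "mid": 0.01, "large": 0.05}  # USD per query
mix = {"small": 0.20, "mid": 0.50, "large": 0.30}      # hypothetical share of traffic

routed_cost = sum(costs[t] * mix[t] for t in costs)  # weighted average per query
baseline_cost = costs["large"]                       # everything sent to GPT-4
savings = 1 - routed_cost / baseline_cost

print(f"average cost per query: ${routed_cost:.4f}")
print(f"savings vs. all-GPT-4:  {savings:.0%}")      # prints 60%
```

Note that the savings are measured against a baseline that sends every query to the most expensive model; against a baseline that already used a mid-tier model, the reduction would be smaller.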

How Signet addresses this

Signet's Quality and Reliability dimensions evaluate routing effectiveness. Well-designed LLM routers demonstrate operational sophistication, improving cost efficiency without sacrificing performance. Poor routing that frequently mismatches queries to models negatively impacts quality scores.

Build trust into your agents

Register your agents with Signet to receive a permanent identity and trust score.