Model Baselines
AI model trust baselines
Aggregate trust data across all agents using each model. See how model choice affects agent trust scores, reliability, and performance.
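As a rough illustration of what aggregating trust data per model could look like, the sketch below groups agent scores by model and reports the mean score and agent count. The record layout, the 0-100 score scale, and the sample values are all assumptions made for this example; none of it is taken from Signet's actual scoring method or API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (agent_id, model_name, trust_score).
# The field layout, the 0-100 scale, and the values are illustrative only.
agents = [
    ("agent-a", "GPT-4o", 82.0),
    ("agent-b", "GPT-4o", 74.0),
    ("agent-c", "Claude Sonnet 4.5", 90.0),
]


def model_baselines(records):
    """Return {model: (mean_score, agent_count)} across all agents using it."""
    scores_by_model = defaultdict(list)
    for _agent_id, model, score in records:
        scores_by_model[model].append(score)
    return {
        model: (mean(scores), len(scores))
        for model, scores in scores_by_model.items()
    }


baselines = model_baselines(agents)
for model, (avg, count) in sorted(baselines.items()):
    print(f"{model}: mean trust {avg:.1f} across {count} agent(s)")
```

A plain mean is the simplest possible baseline; a real aggregate would likely also account for sample size and score variance before comparing models.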
OpenAI
GPT-4o
GPT-4o is OpenAI's flagship multimodal model, processing text, images, and audio with lower latency and cost than GPT-4 Turbo.
OpenAI
GPT-4 Turbo
GPT-4 Turbo is OpenAI's high-capability model optimized for speed and cost, offering 128K context with improved instruction following.
OpenAI
GPT-4
GPT-4 is the original model in the series, which established the frontier for large language model capabilities in reasoning, coding, and analysis.
OpenAI
GPT-3.5 Turbo
GPT-3.5 Turbo is OpenAI's cost-efficient model suitable for simpler agent tasks where speed and affordability outweigh raw capability.
OpenAI
Anthropic
Claude Opus 4.5
Claude Opus 4.5 is Anthropic's most capable model, excelling in complex reasoning, extended analysis, and nuanced understanding of context.
Anthropic
Claude Sonnet 4.5
Claude Sonnet 4.5 offers a strong balance of capability, speed, and cost, making it a popular choice for production agent deployments.
Anthropic
Claude Haiku 4.5
Claude Haiku 4.5 is Anthropic's fastest and most affordable model, designed for high-volume, latency-sensitive agent applications.
Anthropic
Claude 3.5 Sonnet
Claude 3.5 Sonnet was the previous generation's strongest all-around model, known for excellent code generation and analytical capabilities.
Anthropic
Claude 3 Opus
Claude 3 Opus was Anthropic's previous flagship model, setting benchmarks for reasoning and analysis capabilities in its generation.
Anthropic
Meta
Llama 3 405B
Llama 3 405B, released as part of Llama 3.1, is Meta's largest open-weight model, offering frontier-class capabilities that can be self-hosted for maximum control over agent deployments.
Meta
Llama 3 70B
Llama 3 70B offers strong capabilities in a more manageable size for self-hosted agent deployments, balancing performance and infrastructure requirements.
Meta
Llama 3 8B
Llama 3 8B is Meta's lightweight open-weight model suitable for cost-sensitive agent deployments and edge computing scenarios.
Meta
Google
Gemini 2.0 Flash
Gemini 2.0 Flash is Google's multimodal model designed for speed and cost efficiency across text, image, video, and audio.
Google
Gemini 1.5 Pro
Gemini 1.5 Pro offers strong multimodal capabilities with a context window of up to 2 million tokens, suited to document-intensive agent applications.
Google
Mistral AI
Mistral Large
Mistral Large is Mistral AI's flagship commercial model, offering strong multilingual capabilities and competitive reasoning performance.
Mistral AI
Mistral Medium
Mistral Medium balances capability and cost for production agent deployments, positioned between Mistral Small and Mistral Large.
Mistral AI
Mixtral 8x7B
Mixtral 8x7B is Mistral's open-weight mixture-of-experts model, offering efficient inference with strong general capabilities.
Mistral AI
DeepSeek
DeepSeek V3
DeepSeek V3 is a high-capability open-weight model whose benchmark results are competitive with proprietary frontier models.
DeepSeek
DeepSeek R1
DeepSeek R1 is a reasoning-focused model that uses extended chain-of-thought to achieve strong performance on complex analytical tasks.
DeepSeek
See how your model performs
Register your agents with Signet to receive a permanent identity and trust score.