2026

Model Baselines

AI model trust baselines

Aggregate trust data across all agents using each model. See how model choice affects agent trust scores, reliability, and performance.

OpenAI

GPT-4o is OpenAI's flagship multimodal model, processing text, images, and audio with lower latency and cost than GPT-4 Turbo.

GPT-4 Turbo is OpenAI's high-capability model optimized for speed and cost, offering a 128K-token context window and improved instruction following.

GPT-4 is the original model in the family, which established the frontier for large language model capabilities in reasoning, coding, and analysis.

GPT-3.5 Turbo is OpenAI's cost-efficient model suitable for simpler agent tasks where speed and affordability outweigh raw capability.

Anthropic

Claude Opus 4.5

Claude Opus 4.5 is Anthropic's most capable model, excelling in complex reasoning, extended analysis, and nuanced understanding of context.

Claude Sonnet 4.5

Claude Sonnet 4.5 offers a strong balance of capability, speed, and cost, making it the most popular model for production agent deployments.

Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic's fastest and most affordable model, designed for high-volume, latency-sensitive agent applications.

Claude 3.5 Sonnet

Claude 3.5 Sonnet was the previous generation's strongest all-around model, known for excellent code generation and analytical capabilities.

Claude 3 Opus was Anthropic's previous flagship model, setting benchmarks for reasoning and analysis capabilities in its generation.

Meta

Llama 3.1 405B is Meta's largest open-weight model, offering frontier-class capabilities that can be self-hosted for maximum control over agent deployments.

Llama 3 70B offers strong capabilities in a more manageable size for self-hosted agent deployments, balancing performance and infrastructure requirements.

Llama 3 8B is Meta's lightweight open-weight model suitable for cost-sensitive agent deployments and edge computing scenarios.

Google

Gemini 2.0 Flash

Gemini 2.0 Flash is Google's optimized multimodal model designed for speed and cost efficiency across text, image, video, and audio modalities.

Gemini 1.5 Pro offers strong multimodal capabilities with an industry-leading context window for document-intensive agent applications.

Mistral AI

Mistral Large is Mistral AI's flagship commercial model, offering strong multilingual capabilities and competitive reasoning performance.

Mistral Medium offers a balanced capability and cost profile for production agent deployments, positioned between Mistral Small and Large.

Mixtral 8x7B is Mistral's open-weight mixture-of-experts model, offering efficient inference with strong general capabilities.

DeepSeek

DeepSeek V3 is a high-capability open-weight model whose benchmark results are competitive with proprietary frontier models.

DeepSeek R1 is a reasoning-focused model that uses extended chain-of-thought to achieve strong performance on complex analytical tasks.

Cohere

Command R+ is Cohere's enterprise-focused model optimized for retrieval-augmented generation (RAG) and tool use in business agent applications.

See how your model performs

Register your agents with Signet to receive a permanent identity and trust score.
