Mistral AI

Models Benchmarking

Project for the Software Engineer Internship Application

Select Models

Choose which Mistral AI models to include in the benchmark

Premier Models

Mistral Medium 3.1

mistral-medium-2508

Our frontier-class multimodal model, released August 2025, with improved tone and performance.

128k · v25.08

Magistral Medium 1.1

magistral-medium-2507

Our frontier-class reasoning model released July 2025.

40k · v25.07

Codestral 2508

codestral-2508

Our cutting-edge language model for coding, released at the end of July 2025. It specializes in low-latency, high-frequency tasks.

256k · v25.08

Devstral Medium

devstral-medium-2507

An enterprise-grade text model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.

128k · v25.07

Ministral 3B

ministral-3b-2410

World's best edge model.

128k · v24.10

Ministral 8B

ministral-8b-2410

Powerful edge model with extremely high performance/price ratio.

128k · v24.10

Mistral Large 2.1

mistral-large-2411

Our top-tier large model for high-complexity tasks. The latest version was released November 2024.

128k · v24.11

Mistral Small 2

mistral-small-2407

Our updated small version, released September 2024.

32k · v24.07

Open Models

Magistral Small 1.1

magistral-small-2507

Our small reasoning model released July 2025.

40k · v25.07

Mistral Small 3.2

mistral-small-2506

An update to our previous small model, released June 2025.

128k · v25.06

Devstral Small 1.1

devstral-small-2507

An update to our open-source model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.

128k · v25.07

Mistral Small 3.1

mistral-small-2503

A new leader in the small models category with image understanding capabilities, released March 2025.

128k · v25.03

Pixtral 12B

pixtral-12b-2409

A 12B model with image understanding capabilities in addition to text.

128k · v24.09

Mistral Nemo 12B

open-mistral-nemo

Our best multilingual open-source model, released July 2024.

128k · v24.07

Test Prompt

This will send the same prompt to all selected models
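
As a rough illustration of what sending one prompt to every selected model could look like, here is a minimal Python sketch. It is not the project's actual implementation: the names fan_out, send, and fake_send are hypothetical, and the send callable stands in for whatever client code really calls each model. Only the model identifiers are taken from the list above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fan_out(
    prompt: str,
    model_ids: List[str],
    send: Callable[[str, str], str],
) -> Dict[str, str]:
    """Send the same prompt to every selected model; return responses keyed by model id.

    `send(model_id, prompt)` is a hypothetical callable supplied by the caller.
    """
    if not model_ids:
        # Nothing selected, so there is nothing to benchmark.
        return {}

    results: Dict[str, str] = {}
    # One worker per model so slow models do not block fast ones.
    with ThreadPoolExecutor(max_workers=len(model_ids)) as pool:
        futures = {m: pool.submit(send, m, prompt) for m in model_ids}
        for model_id, future in futures.items():
            try:
                results[model_id] = future.result()
            except Exception as exc:
                # Record the failure for this model and keep the others.
                results[model_id] = f"error: {exc}"
    return results


def fake_send(model_id: str, prompt: str) -> str:
    """Stand-in for a real API call (assumption: used here only for demonstration)."""
    return f"[{model_id}] echo: {prompt}"


if __name__ == "__main__":
    selected = ["mistral-medium-2508", "codestral-2508"]
    for model_id, response in fan_out("Hello", selected, fake_send).items():
        print(model_id, "->", response)
```

Passing the sender in as a parameter keeps the sketch independent of any particular client library, and a failure from one model is stored as that model's result instead of aborting the whole run.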

Model Comparison

Compare responses from different Mistral AI models

Model | Response | Judge 1 (Unbiased) | Judge 2 (Biased)
No models selected. Use the model selector to add models for benchmarking.