DeepSeek · Released January 23, 2025 · Synced Apr 7, 2026

DeepSeek: R1 Distill Llama 70B

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). The distillation transfers much of R1's reasoning ability into the smaller Llama base, yielding strong scores across math and coding benchmarks.
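The page does not publish DeepSeek's distillation recipe, but the general pattern it describes is standard: sample responses (including the chain-of-thought) from the R1 teacher, then use those prompt/response pairs as supervised fine-tuning data for the Llama student. A minimal sketch of the data-prep step, assuming the `openai` Python client pointed at OpenRouter's OpenAI-compatible endpoint; the prompt list, output file, and overall recipe are illustrative assumptions, not DeepSeek's actual pipeline:

```python
# Sketch of distillation data prep, NOT DeepSeek's published pipeline.
# Assumes the `openai` client against an OpenAI-compatible endpoint;
# the seed prompts and output path are placeholders.
import json
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

prompts = [  # hypothetical seeds; a real run would use a large corpus
    "Prove that the square root of 2 is irrational.",
    "How many integers between 1 and 1000 are divisible by 3 or 5?",
]

with open("distill_sft.jsonl", "w") as f:
    for p in prompts:
        resp = client.chat.completions.create(
            model="deepseek/deepseek-r1",  # the teacher model
            messages=[{"role": "user", "content": p}],
        )
        # Store the teacher's full output (reasoning plus answer) as the
        # target the student is later fine-tuned to reproduce.
        record = {"prompt": p, "response": resp.choices[0].message.content}
        f.write(json.dumps(record) + "\n")
```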

Reasoning

Why it stands out

- A 131K-token context window handles long documents and multi-turn conversations without truncation.
- Reasoning capability positions it for multi-step analysis and chain-of-thought tasks (see the sketch after this list).
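A minimal sketch exercising both strengths at once, assuming the model is served behind an OpenAI-compatible API; the OpenRouter slug is inferred from this page's URL scheme and should be verified, and the document and question are placeholders:

```python
# Sketch: long-document reasoning with the distilled model.
# Endpoint, model slug, and input file are assumptions, not prescriptions.
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

long_doc = open("contract.txt").read()  # can run far past typical 8K limits

resp = client.chat.completions.create(
    model="deepseek/deepseek-r1-distill-llama-70b",
    messages=[{
        "role": "user",
        "content": f"{long_doc}\n\nList every termination clause, then "
                   "reason step by step about whether clause 7 conflicts "
                   "with clause 12.",
    }],
)
# R1-style distilled models typically emit their chain-of-thought
# before the final answer.
print(resp.choices[0].message.content)
```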

What to watch

- No tool-use capability is currently tracked, which limits its fit for agentic or function-calling patterns.
- Text-only input: image or audio workflows require a separate model in the pipeline.
- No benchmark score is currently tracked here, so evaluate with task-specific testing alongside pricing and capability data (a sketch follows this list).
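One way to do that task-specific testing, as a minimal sketch: run the model over a handful of labeled cases from your own workload and report exact-match accuracy. The cases and grading rule below are placeholders, not a standard benchmark, and the endpoint follows the same assumed OpenRouter pattern as the earlier sketches:

```python
# Tiny task-specific eval: substring-match accuracy on hand-written cases.
# Cases, grading rule, and endpoint are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

def ask_model(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="deepseek/deepseek-r1-distill-llama-70b",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

cases = [  # (prompt, expected substring) pairs from YOUR workload
    ("What is 17 * 24? Answer with the number only.", "408"),
    ("Name the capital of Australia. One word.", "Canberra"),
]

hits = sum(expected in ask_model(q) for q, expected in cases)
print(f"accuracy: {hits}/{len(cases)}")
```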

Release timeline

Tracked events for DeepSeek: R1 Distill Llama 70B.


Release

DeepSeek: R1 Distill Llama 70B entered the tracked catalog

January 23, 2025

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). The distillation yields high performance across multiple benchmarks, including:

- AIME 2024 pass@1: 70.0
- MATH-500 pass@1: 94.5
- CodeForces rating: 1633

By fine-tuning on DeepSeek R1's outputs, the 70B student reaches performance competitive with larger frontier models.
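For context on those figures: pass@1 is the probability that a single sampled solution is correct. When a benchmark draws n samples per problem and c of them pass, the standard unbiased estimator (popularized by the HumanEval/Codex paper) generalizes this to any budget k. A small reference implementation, offered as background math rather than DeepSeek's exact evaluation code:

```python
# Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k), averaged over problems.
# Background math (HumanEval/Codex-style), not DeepSeek's eval harness.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """n = samples drawn per problem, c = samples that passed, k = budget."""
    if n - c < k:
        return 1.0  # every size-k subset must contain at least one pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 16 samples on one problem, 11 passed -> pass@1 = 11/16
print(pass_at_k(16, 11, 1))  # 0.6875
print(pass_at_k(16, 11, 4))  # chance at least one of 4 samples passes
```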



Recent changes

Launch · Jan 23

DeepSeek launched DeepSeek: R1 Distill Llama 70B
