by Google DeepMind

Gemma 3 27B.

multimodal open weights laptop+ 27B params 27B active 128K ctx Transformer Quality 67.1
Cheapest input
$0.08/M
on OpenRouter
Cheapest output
$0.16/M
on OpenRouter
Fastest
45 tok/s
on OpenRouter
Smallest GPU
1× Nvidia RTX 4090
Capability snapshot

What it's best at.

Math 89.0
Coding 81.0
General knowledge 76.2
Reasoning 67.5

Scores normalised against benchmark ceilings (100 = perfect). Coloured by tier — coral 80+ frontier, lavender 65+ strong, sage 50+ solid, slate below.

Benchmarks

Published scores.

Benchmark Score Source
GPQA 42.4 official ↗
MATH 89.0 official ↗
MMLU 76.2 official ↗
MMMU 64.9 official ↗
MMLU-Pro 67.5 official ↗
HumanEval 81.0 official ↗
Leaderboard standing

Independent rankings.

Artificial Analysis Quality Index
55.0
Composite of reasoning + coding + tool-use benchmarks
View on Artificial Analysis ↗
Description

About Gemma 3 27B.

Gemma 3 27B is Google's open-weight model in the Gemma family — distilled from the Gemini architecture, released under the Gemma Terms of Use (commercial use allowed with attribution). The 3 series introduced native multimodal support (vision input) and a 128K context window — both firsts for an open-weight Google model. Runs comfortably on a single H100 (FP16) or RTX 5090 (INT4). Strong on multilingual tasks (140+ languages) and competitive with Llama 3.3 70B at half the parameter count.

Architecture

How it's built.

Architecture
Transformer
Mixture of Experts — 27B params active per token out of 27B total.
Trained on
14.0T tokens
519 tokens per parameter — well above the Chinchilla optimum.
Knowledge cutoff
Aug 2024
223 days from cutoff to release.
Context window

How much it can remember.

128K tokens ≈ 96,000 English words
4K 32K 128K 1M
Max output per call: 8K tokens
Capabilities

What it can do.

Vision input
· Audio input
· Video input
Function calling
Tool use
JSON mode
Streaming
Fine-tuning