Rohan Paul and Ethan Mollick highlight Gemma 4 running fully offline on phones at high token rates, while Mollick cautions that small on-device models may still struggle with true agentic workflows requiring judgment and self-correction.
Incredible possibilities for on-device small models.
Here @adrgrondin is running Google’s Gemma 4 E2B on iPhone 17 Pro.
~40 tok/s with MLX, optimized for Apple Silicon
Fully offline with thinking mode.
I am impressed by Gemma 4; there is a lot of power in an on-device model running at these speeds.
But I am not convinced you can get real agentic workflows out of a small on-device model.