AI News

Independent coverage of the latest AI tool updates, releases, and comparisons.

© 2026 AI News. Independent editorial coverage. Not affiliated with any AI company.

Meta Ships Llama 4: Scout Fits on One GPU, Maverick Beats GPT-4o

Llama 4 introduces MoE architecture with three models. Scout has a 10M token context window. Maverick's 128 experts beat GPT-4o on LMArena. Behemoth is still training.

Maya Johnson

Saturday, April 5, 2025

Meta released the Llama 4 family on April 5, 2025, marking a fundamental architecture shift: Llama 4 is the first Llama generation built on a Mixture of Experts (MoE) architecture. Scout and Maverick shipped as downloadable models, and a third, Behemoth, was announced while still in training. Each targets a different scale point, according to Meta AI.
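For readers new to MoE, the toy sketch below shows the general idea: a router scores every expert for each token, and only the top-scoring experts actually run. This is a generic top-k routing illustration, not Meta's implementation. The dimensions, the random weights, and the helper names (`make_expert`, `moe_layer`) are all hypothetical.

```python
import math
import random

def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def make_expert(rng, dim):
    """Hypothetical expert: a single dim x dim linear map with random weights."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

def moe_layer(token, router, experts, top_k=1):
    """Route one token vector to its top_k experts and mix their outputs."""
    gate = softmax(matvec(router, token))  # one probability per expert
    chosen = sorted(range(len(experts)), key=lambda i: gate[i], reverse=True)[:top_k]
    norm = sum(gate[i] for i in chosen)    # renormalise over the chosen experts
    out = [0.0] * len(token)
    for i in chosen:
        expert_out = matvec(experts[i], token)  # only chosen experts do any work
        weight = gate[i] / norm
        out = [o + weight * v for o, v in zip(out, expert_out)]
    return out, chosen

rng = random.Random(0)
DIM, NUM_EXPERTS = 4, 16  # toy sizes; 16 mirrors Scout's expert count only loosely
experts = [make_expert(rng, DIM) for _ in range(NUM_EXPERTS)]
router = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [0.5, -0.2, 0.1, 0.9]
output, used = moe_layer(token, router, experts, top_k=1)
print(f"experts used for this token: {used} of {NUM_EXPERTS}")
```

The point of the design is that compute per token scales with the experts that run (the "active" parameters), while the full set of experts (the "total" parameters) still has to be stored.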

Llama 4 Scout: 10M Context on One GPU

Scout is the practical breakthrough. With 17B active parameters spread across 16 experts, it fits on a single H100 GPU (Meta's figure assumes Int4 quantization) while offering a 10M token context window, roughly 50x a 200K-token window and nearly 80x a 128K-token one. That's enough to process entire codebases, book-length documents, or months of conversation history in a single prompt.
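A quick back-of-the-envelope check makes the single-GPU claim more concrete. The sketch below estimates weight memory only, under two assumptions that are not in this article: Scout's total parameter count of 109B (the figure from Meta's announcement) and an 80 GB H100. It ignores the KV cache, activations, and runtime overhead, so treat it as a lower bound.

```python
def weight_memory_gb(total_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return total_params * bits_per_param / 8 / 1e9

SCOUT_TOTAL_PARAMS = 109e9  # assumption: total parameters, per Meta's announcement
H100_MEMORY_GB = 80         # assumption: the 80 GB H100 variant

for bits in (16, 8, 4):
    gb = weight_memory_gb(SCOUT_TOTAL_PARAMS, bits)
    verdict = "fits" if gb < H100_MEMORY_GB else "does not fit"
    print(f"{bits:>2}-bit weights: {gb:6.1f} GB -> {verdict} in {H100_MEMORY_GB} GB")
```

Under these assumptions only the 4-bit case comes in under 80 GB. All experts have to be resident in memory even though only 17B parameters are active per token, which is why the total count, not the active count, drives this estimate.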

The 10M context window is particularly significant for enterprise applications that need to reason over massive document collections without retrieval-augmented generation (RAG) pipelines.
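To get a feel for what fits in a 10M token window, here is a rough sizing sketch. It assumes about 4 characters per token, a common rule of thumb for English text and code rather than a property of Llama 4's tokenizer, so a real count should come from the model's own tokenizer. The function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 10_000_000  # Scout's stated context window
CHARS_PER_TOKEN = 4                 # assumption: rough heuristic, not Llama 4's tokenizer

def estimate_tokens(texts, chars_per_token=CHARS_PER_TOKEN):
    """Estimate the token count of a collection of strings from character length."""
    total_chars = sum(len(t) for t in texts)
    return -(-total_chars // chars_per_token)  # ceiling division

def fits_in_context(texts, window=CONTEXT_WINDOW_TOKENS, reserve=0):
    """Check whether the texts, plus tokens reserved for the reply, fit the window."""
    return estimate_tokens(texts) + reserve <= window

# Stand-in documents; in practice these would be file contents read from disk.
docs = ["def add(a, b):\n    return a + b\n" * 1000, "README " * 5000]
print("estimated tokens:", estimate_tokens(docs))
print("fits in 10M window:", fits_in_context(docs, reserve=4096))
```

By the same heuristic, 10M tokens is on the order of 40 million characters, which is why whole repositories and book-length documents become plausible single-prompt inputs.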

Llama 4 Maverick: Competing With Closed Models

Maverick keeps the same 17B active parameters as Scout but spreads them across 128 experts (400B total parameters). An experimental chat version scored an Elo of 1,417 on LMArena, ahead of GPT-4o and Gemini 2.0 Flash, which makes it one of the first openly available models to outscore leading closed-source models on that leaderboard.
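What does a lead on an Elo-style leaderboard mean in practice? The sketch below uses the standard Elo expected-score formula to turn a rating gap into a head-to-head win probability. It assumes that formula is a fair approximation of LMArena's scale; the article does not describe LMArena's exact methodology, and the opponent gaps shown are hypothetical, not the rivals' real ratings.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

MAVERICK_RATING = 1417  # score quoted in the article

# Hypothetical rating gaps, chosen only to show how the scale behaves.
for gap in (10, 25, 50, 100):
    p = expected_win_rate(MAVERICK_RATING, MAVERICK_RATING - gap)
    print(f"{gap:>3}-point lead -> expected win rate {p:.1%}")
```

The takeaway is that small rating gaps translate into win rates only slightly above 50%, so a narrow leaderboard lead is better read as "roughly on par or a bit ahead" than as a decisive margin.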

Maverick is natively multimodal, handling text, image, and video, and was pre-trained on more than 30T tokens across 200 languages. That is double the training data used for Llama 3.

Behemoth: Still Training

Llama 4 Behemoth was announced but has not yet been released. At 288B active parameters and approximately 2T total parameters, Meta reports that it already outperforms GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro on several STEM benchmarks despite still being in training.
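With all three models on the table, the active-versus-total split can be compared directly. The sketch below computes the share of parameters used per token. Maverick's and Behemoth's figures come from this article; Scout's 109B total is an assumption taken from Meta's announcement, and Behemoth's 2T is approximate.

```python
# (active parameters, total parameters) per model
MODELS = {
    "Scout": (17e9, 109e9),     # assumption: 109B total is not stated in this article
    "Maverick": (17e9, 400e9),  # from the article
    "Behemoth": (288e9, 2e12),  # from the article; total is approximate
}

def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that run for any single token."""
    return active / total

for name, (active, total) in MODELS.items():
    share = active_fraction(active, total)
    print(f"{name:<9} {active / 1e9:>5.0f}B active of {total / 1e9:>5.0f}B total -> {share:.1%}")
```

On these numbers Maverick is the sparsest of the three, running only a few percent of its weights per token, which helps explain how it can share Scout's per-token compute budget while storing far more parameters.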

The Open Source Statement

All released Llama 4 models are open-source, continuing Meta's strategy of undermining the commercial moat of closed-source providers. By March 2025, Llama had passed 1 billion cumulative downloads, making it one of the most widely deployed AI model families to date.

Llama 4 is used in government (GSA partnership for federal agencies), military (expanded to NATO allies and Five Eyes+ nations), and space (deployed on the International Space Station via a partnership with Booz Allen and HPE).

LlamaCon: The Ecosystem Event

A few weeks after the model launch, Meta held LlamaCon (April 29), where it announced the Llama API (limited preview), performance partnerships with Cerebras and Groq for faster inference, security tools (Llama Guard 4, LlamaFirewall), and the Meta AI app.

Our Take

Llama 4 is Meta's strongest argument that open-source AI can compete with and beat closed-source models. Maverick beating GPT-4o is a milestone — it means the best freely available model now outperforms what was the best model in the world just a year ago. Scout's 10M context window on a single GPU is the kind of practical innovation that enterprises actually need. The question is whether Behemoth, when it ships, can compete with Claude Opus and GPT-5 at the frontier.

FAQ

What is Llama 4 Scout?

Llama 4 Scout is Meta's efficient open-source model with 17B active parameters, 16 experts, and a 10M token context window. It fits on a single H100 GPU.

How does Llama 4 Maverick compare to GPT-4o?

Maverick beat GPT-4o and Gemini 2.0 Flash on LMArena with an Elo score of 1,417. It has 400B total parameters with 128 experts.

Is Llama 4 open source?

Yes, all released Llama 4 models are open-source. Llama has surpassed 1 billion cumulative downloads as of March 2025.

What is Llama 4 Behemoth?

Behemoth is the largest Llama 4 model with 288B active parameters and ~2T total parameters. It was announced but not yet released as of April 2025, already outperforming GPT-4.5 on STEM benchmarks during training.
