"The Game Can't Be Done" Leaderboard is sponsored by the companies it ranks

AI models are multiplying rapidly, and competition is intensifying. With so many players crowded into the space, which is best? And who decides? Arena (formerly LM Arena) has emerged as the de facto public leaderboard for frontier LLMs, influencing funding, launches, and PR cycles. In just seven months, the startup has gone from a PhD research project at the University of California, Berkeley, to a $1.7 billion valuation.

Equity host Rebecca Bellan speaks with Arena co-founders Anastasios Angelopoulos and Wei-Lin Chiang about how their platform has become the go-to leaderboard for frontier AI models, and how they’re trying to build a neutral benchmark even as companies like OpenAI, Google, and Anthropic back the project.

They detail how Arena works and why it’s harder to game than static benchmarks, what “structural neutrality” actually means, why Claude is currently at the top of expert leaderboards in legal and medical use cases, and how the company is expanding beyond chat to benchmarking agents, coding, and real-world tasks with new enterprise products.

Subscribe to Equity on YouTube, Apple Podcasts, Overcast, Spotify, and all the casts. You can also follow Equity on X and Threads at @EquityPod.
