The number of artificial intelligence models is growing rapidly, and competition is intensifying. With so many players crowded into the space, who will be the best? And who decides? Arena (formerly LM Arena) has emerged as the de facto public leaderboard for frontier LLMs, influencing funding, launches, and PR cycles. In just seven months, the startup has grown from a PhD research project at the University of California, Berkeley, to a company valued at $1.7 billion.
Equity host Rebecca Bellan speaks with Arena co-founders Anastasios Angelopoulos and Wei-Lin Chiang about how their platform has become the go-to leaderboard for frontier AI models, and how they’re trying to build a neutral benchmark even as companies like OpenAI, Google, and Anthropic back the project.
They detail how Arena works and why it’s harder to game than static benchmarks, what “structural neutrality” actually means, why Claude is currently at the top of expert leaderboards in legal and medical use cases, and how the company is expanding beyond chat to benchmarking agents, coding, and real-world tasks with new enterprise products.
Subscribe to Equity on YouTube, Apple Podcasts, Overcast, Spotify, and all the casts. You can also follow Equity on X and Threads at @EquityPod.
