WhistleBuzz – Smart News on AI, Business, Politics & Global Trends

Microsoft built a fake marketplace to test its AI agent – and it failed in a surprising way

By Editor-In-Chief | November 5, 2025


On Wednesday, Microsoft researchers released a new simulation environment designed to test AI agents, along with new research showing that current agentic models may be vulnerable to manipulation. The study, conducted in collaboration with Arizona State University, raises new questions about how well AI agents perform when working unsupervised, and how quickly AI companies can deliver on their promises of an agentic future.

The simulation environment, which Microsoft calls the “Magentic Marketplace,” is built as a synthetic platform for experimenting with AI agent behavior. In a typical experiment, a customer agent might try to order dinner according to a user’s instructions, while agents representing different restaurants compete to win the order.

The team’s first experiments involved 100 separate customer-side agents interacting with 300 business-side agents. Because the marketplace’s code is open source, other groups should be able to adapt it to run new experiments or reproduce the findings.

Ece Kamar, managing director of Microsoft Research’s AI Frontiers Lab, said this type of research will be important for understanding the capabilities of AI agents. “There are real questions about how the world changes when these agents work together and talk to each other and negotiate with each other,” Kamar said. “We want to understand these things deeply.”

The initial research examined a mix of leading models, including GPT-4o, GPT-5, and Gemini-2.5-Flash, and found some surprising weaknesses. In particular, the researchers identified several techniques that businesses could use to manipulate customer agents into buying their products. They also observed a marked drop in efficiency as customer agents were given more options to choose from, which overwhelmed the agents’ attention space.

“We want these agents to help us work through a lot of options,” Kamar says. “And we find that the current models are actually overwhelmed by too many options.”

The agents also ran into trouble when asked to work together toward a common goal, apparently unsure which agent should play which role in the collaboration. Performance improved when the models were given more explicit instructions on how to collaborate, but the researchers still saw the models’ inherent capabilities as in need of improvement.


“You can instruct the models, telling them what to do step by step,” Kamar says. “But if you’re essentially testing collaborative capabilities, you would expect these models to have those capabilities by default.”


