WhistleBuzz – Smart News on AI, Business, Politics & Global Trends

Microsoft built a fake marketplace to test its AI agent – and it failed in a surprising way

By Editor-In-Chief | November 5, 2025


On Wednesday, Microsoft researchers released a new simulation environment designed to test AI agents, along with new research showing that current agentic models may be vulnerable to manipulation. The study, conducted in collaboration with Arizona State University, raises new questions about how well AI agents perform when working unsupervised, and how quickly AI companies can deliver on their promises of an agentic future.

The simulation environment, which Microsoft calls “Magentic Marketplace,” is built as a synthetic platform for experimenting with AI agent behavior. In a typical experiment, a customer agent might try to order dinner according to a user’s instructions, while agents representing different restaurants compete to win the order.

The team’s first experiments involved 100 separate customer-side agents interacting with 300 business-side agents. Because the marketplace’s source code is open, other groups should find it straightforward to adapt the code to run new experiments or reproduce the results.

Ece Kamar, managing director of Microsoft Research’s AI Frontiers Lab, said this type of research will be important for understanding the capabilities of AI agents. “There are real questions about how the world changes when these agents work together and talk to each other and negotiate with each other,” Kamar said. “We want to understand these things deeply.”

The initial research examined a mix of leading models, including GPT-4o, GPT-5, and Gemini-2.5-Flash, and found some surprising weaknesses. In particular, the researchers identified several techniques that businesses could use to manipulate customer agents into buying their products. They also observed that efficiency fell as a customer agent was given more options to choose from, which overwhelmed the agent’s attention space.

“We want these agents to help us work through a lot of options,” Kamar said. “And we find that the current model is actually overwhelmed by too many options.”

Agents also ran into trouble when asked to work together toward a common goal, apparently unsure which agent should play which role in the collaboration. Performance improved when the models were given more explicit instructions on how to collaborate, but the researchers still saw the models’ inherent capabilities as needing improvement.

“You can instruct a model step-by-step, just like you would teach a model,” Kamar said. “But if you’re essentially testing collaborative features, you would expect these models to have those features by default.”


