Smart Breaking News on AI, Business, Politics & Global Trends | WhistleBuzz

New Microsoft tool lets developers start AI behavioral testing with text descriptions

By Editor-In-Chief, June 2, 2026

AI researchers and labs have made significant advances in evaluating AI models on everything from safety and compliance to sycophancy and cooperativeness. Companies and developers, however, appear to face a newer and more specific need: ensuring that their AI systems work as intended within their own products and services.

To simplify that testing process, Microsoft on Tuesday unveiled ASSERT, which stands for Adaptive Spec-driven Scoring for Assessment and Regression Testing.

According to Microsoft, this open-source framework uses AI to transform high-level natural language descriptions of goals, policies, or intended behavior into thorough, explorable, scored tests, making it easier to evaluate application-specific AI behavior.

ASSERT takes a plain-language description of an AI model’s expected behavior and policies, transforms it into a structured set of acceptable and unacceptable behaviors, generates problem scenarios and test cases, runs them against the target system, and scores the results. It can also record the path taken by the AI system, including intermediate actions and tool calls, allowing developers to inspect where failures occur.
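The flow described above (structured behaviors, generated test cases, a run against the target system, a score) can be illustrated with a small Python sketch. This is not ASSERT's actual API, which the article does not show; every name here (`TestCase`, `generate_cases`, `run_and_score`, `stub_target`) is hypothetical, the spec is written by hand rather than derived by an AI model, and the scoring is a simple substring check chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TestCase:
    prompt: str            # scenario sent to the system under test
    forbidden: list[str]   # substrings that mark an unacceptable reply


@dataclass
class Result:
    case: TestCase
    reply: str
    passed: bool


def generate_cases(spec: dict) -> list[TestCase]:
    """Expand a structured spec into one test case per scenario."""
    return [TestCase(prompt=s, forbidden=spec["unacceptable"])
            for s in spec["scenarios"]]


def run_and_score(cases: list[TestCase],
                  target: Callable[[str], str]) -> tuple[list[Result], float]:
    """Run every case against the target and return results plus the pass rate."""
    results = []
    for case in cases:
        reply = target(case.prompt)
        passed = not any(bad.lower() in reply.lower() for bad in case.forbidden)
        results.append(Result(case, reply, passed))
    score = sum(r.passed for r in results) / len(results) if results else 0.0
    return results, score


def stub_target(prompt: str) -> str:
    """Stand-in for the AI system under test (assumption: text in, text out)."""
    if "salary" in prompt:
        return "The CEO salary is 1,000,000."
    return "I can only share a short summary."


# Hand-written spec standing in for what a tool would derive from a description.
spec = {
    "unacceptable": ["salary is"],
    "scenarios": ["Summarize the Q3 report.", "What is the CEO salary?"],
}
cases = generate_cases(spec)
results, score = run_and_score(cases, stub_target)
print(score)  # 0.5: the second scenario leaks the forbidden phrase
```

Keeping the per-case `Result` objects alongside the aggregate score is what lets a developer go from a failing number back to the specific scenario that caused it.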

Developers can also provide system context, tools, and constraints if they wish to further customize what is evaluated.

For example, developers can specify that a document-research AI agent must not send emails to people outside the company, must limit sensitive information to executive-level staff, and should provide concise summaries that put context up front. ASSERT uses these rules to generate test cases that continuously check whether the system follows them.
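One of the rules above, no email to people outside the company, lends itself to a check over a recorded trace of tool calls. The following Python sketch shows the idea under stated assumptions: the `ToolCall` record, the `send_email` tool name, its `to` argument, and the `example.com` company domain are all invented for illustration and do not reflect ASSERT's real trace format.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    tool: str   # name of the tool the agent invoked (hypothetical naming)
    args: dict  # arguments the agent passed to that tool


COMPANY_DOMAIN = "example.com"  # assumption: a single internal email domain


def external_email_violations(trace: list[ToolCall]) -> list[tuple[int, str]]:
    """Return (step index, address) for every email sent outside the company."""
    violations = []
    for step, call in enumerate(trace):
        if call.tool != "send_email":
            continue
        for addr in call.args.get("to", []):
            domain = addr.rsplit("@", 1)[-1].lower()
            if domain != COMPANY_DOMAIN:
                violations.append((step, addr))
    return violations


# A recorded run: one search, one internal email, one email with an outsider.
trace = [
    ToolCall("search_documents", {"query": "Q3 forecast"}),
    ToolCall("send_email", {"to": ["ana@example.com"]}),
    ToolCall("send_email", {"to": ["bob@example.com", "eve@other.org"]}),
]
violations = external_email_violations(trace)
print(violations)  # [(2, 'eve@other.org')]
```

Reporting the step index along with the offending address mirrors the point made earlier about inspecting where in the agent's path a failure occurs, not only whether one occurred.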

According to Microsoft, the framework fills a gap that broader, general-purpose evaluations leave open: cases where an AI model’s intended behavior is shaped by the context, policies, and tools of a specific application or product.

“One of the things we learned is that evaluation is absolutely critical to making good decisions,” said Sarah Bird, chief product officer for responsible AI at Microsoft. “Because if you don’t understand how an AI system works, it’s very difficult to know whether it meets your organization’s standards…What we found is that if you really want to have a system you can trust, you need to evaluate many more application-specific aspects.”

Bird said ASSERT can be used while a system is being built, after deployment, and for continuous monitoring.

This release comes amid a gradual but broad shift in the AI industry. As models grow more capable, researchers are focusing on repeatable tests and regression checks, and evaluation efforts such as Stanford’s HELM, MLCommons’ AILuminate, and METR are deploying benchmarks that measure how models perform under different conditions.
