Copilot Studio Evaluations Failing with "Error" on all test cases

I'm running into an issue with Copilot Studio Evaluations and am trying to determine whether it is a service issue, since it is happening in multiple client environments. Is anyone else experiencing issues with Copilot Studio Evaluations?
 

Tested:

  • Large and small evaluation datasets
  • A single "hello" -> "hello" test case
  • Multiple agents
  • An empty agent with no knowledge sources, topics, or actions

The agents work correctly in Test Chat, but Evaluations always return Error for both General Quality and Compare Meaning.

I'm seeing the same behavior in multiple client environments/tenants, which makes me wonder whether this is a broader Copilot Studio issue.

 

Best,

Carl Mesias

[Attached screenshot: eval issue.jpg]
  • Suggested answer
    Sunil Kumar Pashikanti (Moderator)

    I tested this scenario in my tenant to validate whether this is a broader issue.

    I ran evaluations on an unpublished agent using a very simple test case (“hello → hello”) and tried both:
    • Single response
    • Conversation (preview)
    Both evaluations completed successfully and returned 100% pass.

    Based on this, evaluations are working correctly in at least some environments, which suggests this is not a global platform outage.
     
    Given your results:
    • Failing even on a simple test case
    • Reproducing across multiple agents and datasets
    • Test Chat working fine
    This points more towards an environment or tenant-specific issue, rather than the agent configuration itself.

    A few things worth checking:
    • Try in a different region/environment (if available)
    • Have another user in the same tenant run the same test
    • Compare across tenants where you’re seeing the issue
    • Check if there are any service health advisories during the timeframe
    If the issue is consistent in specific tenants but not in others, it is likely a backend problem scoped to those tenants rather than a global one, and a support ticket with the Session ID and timestamps would help Microsoft narrow it down.

    In my case, the same minimal test worked fine, so it may help you isolate whether this is tenant-specific vs broader.
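
The isolation checks above can be sketched as a small triage helper. This is only an illustration of the reasoning, not a Copilot Studio API: `EvalObservation`, `classify_scope`, and the tenant and environment names are all hypothetical, and the pass/fail results are assumed to be recorded by hand after running the same minimal "hello" test in each place.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EvalObservation:
    """One manually recorded evaluation run (hypothetical structure)."""
    tenant: str
    environment: str
    passed: bool  # False when the run returned "Error"


def classify_scope(observations: Iterable[EvalObservation]) -> str:
    """Guess how widely the failure is scoped from the recorded runs."""
    obs = list(observations)
    if not obs:
        return "no data"
    failures = [o for o in obs if not o.passed]
    if not failures:
        return "no failures observed"
    if len(failures) == len(obs):
        return "fails everywhere tested"

    # Tenants that only fail vs. tenants that only pass: tenant-level scope.
    failing_tenants = {o.tenant for o in failures}
    passing_tenants = {o.tenant for o in obs if o.passed}
    if failing_tenants.isdisjoint(passing_tenants):
        return "tenant-specific"

    # Same tenant both passes and fails: check whether environments differ.
    failing_envs = {(o.tenant, o.environment) for o in failures}
    passing_envs = {(o.tenant, o.environment) for o in obs if o.passed}
    if failing_envs.isdisjoint(passing_envs):
        return "environment-specific"
    return "user-specific or intermittent"


# Placeholder data mirroring this thread: two client tenants fail,
# a third tenant passes the same minimal test.
observations = [
    EvalObservation("client-a", "default", passed=False),
    EvalObservation("client-b", "default", passed=False),
    EvalObservation("other-tenant", "default", passed=True),
]
print(classify_scope(observations))  # tenant-specific
```

A "tenant-specific" or "environment-specific" result is the case where a support ticket is most useful, since the agent configuration has already been ruled out by the minimal test.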
     
    [Screenshots: Single response and Conversation evaluation results]
     

    Sunil Kumar Pashikanti, Moderator
    Blog: https://sunilpashikanti.com/posts/
  • Suggested answer
    11manish

    Based on your testing, this looks more like a Copilot Studio Evaluations service issue than an agent configuration problem.
    • Review Microsoft Service Health for Copilot Studio/AI services.
    • Capture:
      • Environment IDs
      • Evaluation run IDs
      • Correlation IDs (if available)
      • Timestamps of failures
    • Open a Microsoft Support ticket if the issue persists, as you'll have a strong reproducible case demonstrating that the problem occurs across multiple tenants and with minimal test data.
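
One way to keep the details listed above together for a support ticket is a small record per failed run. This is a minimal sketch under stated assumptions: `EvalFailure` and `format_ticket_summary` are hypothetical names, the IDs shown are placeholders to be copied by hand from the Copilot Studio UI, and nothing here calls a Microsoft API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class EvalFailure:
    """Details of one failed evaluation run (hypothetical structure)."""
    environment_id: str
    evaluation_run_id: str
    timestamp_utc: datetime
    correlation_id: Optional[str] = None  # only if the UI exposes one


def format_ticket_summary(failures: List[EvalFailure]) -> str:
    """Build a plain-text summary, oldest failure first."""
    environments = {f.environment_id for f in failures}
    lines = [
        f"Evaluation failures: {len(failures)} run(s) "
        f"across {len(environments)} environment(s)"
    ]
    for f in sorted(failures, key=lambda f: f.timestamp_utc):
        correlation = f.correlation_id or "n/a"
        lines.append(
            f"- {f.timestamp_utc.isoformat()} | env={f.environment_id} "
            f"| run={f.evaluation_run_id} | correlation={correlation}"
        )
    return "\n".join(lines)


# Placeholder values only; replace with the real IDs and timestamps.
failures = [
    EvalFailure("env-placeholder-2", "run-placeholder-2",
                datetime(2025, 1, 1, 10, 0, tzinfo=timezone.utc)),
    EvalFailure("env-placeholder-1", "run-placeholder-1",
                datetime(2025, 1, 1, 9, 30, tzinfo=timezone.utc),
                correlation_id="corr-placeholder-1"),
]
summary = format_ticket_summary(failures)
print(summary)
```

Recording timestamps in UTC avoids ambiguity when the failures span tenants in different regions.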
