Copilot Studio - General

Create test cases to evaluate your agent (preview)

Hello,
 
My customer is a US Department of Defense organization using M365 in GCCH. We are designing and building applications and automations for them using the Power Platform, AI Builder, Copilot Studio agents, and custom-coded agentic AI. Earlier this week, I noticed this feature in our Copilot Studio agents: "Evaluate your agent's performance (preview) - Microsoft Copilot Studio | Microsoft Learn", and we wanted to use it for this customer.
 

Agent Name: Side-by-Side

Feature: Run a Regression Test with New Test Set

Where: Open Agent > Analytics Tab > See current Test Sets or create New Test Set

Result: All 4 tests resulted in Error

 

Test Name                    | Test Method     | How Created                         | Test Start | Test Finish | Test Duration | Test Result
-----------------------------|-----------------|-------------------------------------|------------|-------------|---------------|------------
Evaluate Side-by-Side Test 1 | General Quality | Test Set: 10 random generated tests | 21:16 UTC  | 04:47 UTC   | 7 h 31 min    | Error
Evaluate Side-by-Side        | General Quality | 2 hand-created questions            | 13:06 UTC  | 13:50 UTC   | 44 min        | Error
CSV Test Set                 | General Quality | Imported CSV: 2 created questions   | 14:02 UTC  | 15:32 UTC   | 1 h 30 min    | Error
Similarity Test              | Similarity      | 1 hand-created question             | 15:44 UTC  | 16:29 UTC   | 45 min        | Error
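
For anyone cross-checking the reported durations, here is a minimal sketch that recomputes them from the start and finish times in the table. It assumes every time is UTC and that no run lasted 24 hours or more (the first run crosses midnight). The helper name `run_duration` is hypothetical and is not part of Copilot Studio.

```python
from datetime import datetime, timedelta


def run_duration(start: str, finish: str) -> timedelta:
    """Elapsed time between two HH:MM clock times, assuming the run took < 24 h."""
    fmt = "%H:%M"
    delta = datetime.strptime(finish, fmt) - datetime.strptime(start, fmt)
    # A negative difference means the run finished after midnight.
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta


# Start/finish times (UTC) copied from the table in the post.
runs = {
    "Evaluate Side-by-Side Test 1": ("21:16", "04:47"),
    "Evaluate Side-by-Side": ("13:06", "13:50"),
    "CSV Test Set": ("14:02", "15:32"),
    "Similarity Test": ("15:44", "16:29"),
}

for name, (start, finish) in runs.items():
    print(f"{name}: {run_duration(start, finish)}")
```

Under those assumptions the computed values agree with the durations reported in the table, so each run took between 44 minutes and 7.5 hours before ending in Error.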

 

 

 

Also noted: the UI will not allow me to create a connection to my account so that the Test Set runs under my UPN, and it will not allow me to click Use test Account. The drop-down remains blank after clicking.

 

 

 

Further Observation: 

  • I can only run one test at a time. If I attempt to start a new test while another test is in progress, it throws an error: "There was a problem starting the evaluation. Please try again."

 

Conclusion: 

It appears this new preview feature has been released by the product group (PG), but only in a partially functional state.

We didn't want to open a ticket because this is a preview feature, but we wanted to send you a demand signal: this feature is wanted and would be used immediately if you can get the service fully functional.

Thank you

  • Michael E. Gernaey (Super User 2025 Season 2)
     
    As this is in preview, I would not attempt to use it, especially for a government customer, or really for any customer.
     
     
