Power Platform Community Forum Thread Details

Hi everyone,

I'm building agents in Microsoft Copilot Studio for a university environment and running into a significant gap around input filtering and security guardrails. Hoping the community or someone from the product team can point me in the right direction or explain the best practice for handling this.

What I'm trying to do

I need to block users from entering sensitive or harmful content across my entire agent. We want to block things like personal identifiers (SSN, passport numbers), financial data, credentials, HIPAA-protected health information, FERPA-protected student records, prompt injection attempts, and harmful content. When a blocked term is detected, the agent should respond with an appropriate message and stop processing.

What I've tried

I built a standalone Guardrails topic using a `ConditionGroup` with Power Fx `Find()` expressions checking a global variable (`Global.GuardrailInput`). Each calling topic sets that variable before invoking the Guardrails topic via `BeginDialog`. It works, but has serious limitations:

- I have to manually add a `SetVariable` + `BeginDialog` call after **every** `Question` node in **every** sub-topic
- With 9 sub-topics and multiple questions each (in one agent), that's a large maintenance burden
- Every new topic I add in the future requires the same manual wiring
- There's no single place to update guardrail rules when new threat patterns emerge

My question

What are the best practices for handling this? Is there a cleaner native approach I'm missing? Ideally something like a global pre-processing layer that intercepts every user message before topic routing without having to edit each topic.
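
As a language-neutral illustration of the kind of layer being asked about, here is a minimal Python sketch. It is not Copilot Studio code and does not show any product API: `RULES`, `blocked_category`, `guarded`, and `route_to_topic` are all hypothetical names, and the two rule categories are just samples. The point is only that the check wraps the router once, so individual topics need no wiring.

```python
from typing import Callable, Optional

# Hypothetical rule table: category -> lowercase phrases (sample entries only).
RULES: dict[str, list[str]] = {
    "Prompt Injection": ["ignore instructions", "developer mode"],
    "Credentials & Passwords": ["password", "api key"],
}


def blocked_category(message: str) -> Optional[str]:
    """Return the first category whose phrase occurs in the message, else None."""
    lowered = message.lower()
    for category, phrases in RULES.items():
        if any(phrase in lowered for phrase in phrases):
            return category
    return None


def guarded(route: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a router so every message is checked once, before any topic sees it."""
    def handle(message: str) -> str:
        category = blocked_category(message)
        if category is not None:
            # Respond with a refusal and stop processing.
            return f"Sorry, I can't help with that ({category})."
        return route(message)
    return handle


def route_to_topic(message: str) -> str:
    # Stand-in for normal topic routing (assumption, not a real API).
    return f"routed: {message}"


handle_message = guarded(route_to_topic)
```

With this shape, adding a tenth topic changes nothing in the guardrail, and updating `RULES` is the single place where new patterns are added.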

Here's what we're trying to detect and block:

Personal Identifiers: SSN, date of birth, driver's license, passport number, tax ID, EIN, ITIN
Financial Data: Credit card, debit card, CVV, bank account, routing number, wire transfer, PayPal, Venmo, Bitcoin, crypto, IBAN, SWIFT
Credentials & Passwords: Password, passcode, PIN, API key, access token, private key, login credentials, two-factor, authentication code, security token
Medical & Health Records: Medical records, health records, diagnosis, prescription, HIPAA, patient data, disability records, insurance claims
Student Privacy (FERPA): Student grades, transcripts, FERPA, GPA, student ID, disciplinary records, student address/phone/email, enrollment status
Personnel & HR Records: Salary, payroll, performance reviews, personnel files, HR records, termination, background checks, W2, I-9, tax forms
Network & Cybersecurity: Hack, exploit, vulnerability, malware, ransomware, phishing, SQL injection, DDoS, brute force, zero day, privilege escalation
Prompt Injection: Ignore instructions, pretend you are, jailbreak, override, bypass, system prompt, reveal instructions, developer mode, DAN mode
Harmful Content: Weapons, explosives, illegal drugs, trafficking, child abuse, self harm, suicide, how to kill/harm, bomb
Privacy Violations: Spy on, track someone, dox, doxx, personal information of, home address of, phone number of
Political & Religious: Political party, vote for, election fraud, propaganda, religious doctrine, convert to
Legal Advice: Legal advice, lawsuit, attorney, court case, criminal record, arrest record, sue
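
Short terms in a list like this (for example "PIN" or "sue") are prone to false positives under plain substring matching, since "sue" also occurs inside "issue". The Python sketch below is one possible way to hold the categories as data and match on word boundaries instead. It is an illustration only, not Power Fx or a Copilot Studio feature; the function names are hypothetical and only three categories with a subset of terms are shown.

```python
import re

# Subset of the categories listed above, kept as data so rules live in one place.
RULES = {
    "Credentials & Passwords": ["password", "PIN", "API key"],
    "Legal Advice": ["legal advice", "lawsuit", "sue"],
    "Privacy Violations": ["dox", "doxx", "home address of"],
}


def compile_rules(rules):
    """Build one case-insensitive, word-bounded regex per category."""
    compiled = {}
    for category, terms in rules.items():
        # Longest terms first so multi-word phrases are tried before short ones.
        ordered = sorted(terms, key=len, reverse=True)
        alternatives = "|".join(re.escape(term) for term in ordered)
        compiled[category] = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    return compiled


COMPILED = compile_rules(RULES)


def matched_categories(text):
    """Return every category with at least one whole-word match in the text."""
    return [category for category, pattern in COMPILED.items() if pattern.search(text)]
```

The trade-off is that whole-word matching also skips inflected forms such as "passwords", so each variant that matters would need its own entry.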

Thanks in advance.


Hello, very nice initiative. To answer, let's take a step back :) (I tried this myself a year ago for R&D purposes.)

Every Microsoft AI product already ships with a lot of built-in guardrails and forbidden subjects (Harmful Content, for example).

So most of these subjects cannot be answered by the AI, by design.

A few examples: https://learn.microsoft.com/en-us/microsoft-365/copilot/microsoft-365-copilot-privacy ; https://learn.microsoft.com/en-us/microsoft-365/copilot/harmful-content-protection-copilot-chat ; I couldn't find all the links again, but the protection is already built in.

For the other part, such as Personal Identifiers: the tool to look at for preventing that data from being stored is Microsoft Purview. It lives inside the tenant, and its job is to protect data.

With that in mind, managing all of these cases in the agent will be very difficult, will slow the agent down a lot, and will not be 100% reliable. This isn't meant to be handled on the agent side, but by data storage controls and the internal guardrails.

At this point there are a few options if you can't use those solutions or still want the agent to do this:

-> Stop using Copilot Studio and move to a Microsoft Foundry agent: there you have access to the guardrail configuration for everything you want. It's made for this.

-> Stay in Copilot Studio and use a topic to detect every case you want (or a child agent for detection on long prompts). There is no 100% coverage at the moment; the tech was not built for this, because it's part of Power Platform and already has a lot of guardrails.

-> Last option, but it will cost a lot of money: I did what you want with AI Builder (called a "prompt" inside the tool). When you create a topic, you can change the topic trigger so it fires every time a message is about to be sent. You can then take the question and send it to a custom AI Builder prompt containing all your rules, and it will apply your rules plus Microsoft's rules. Why it's not a real-life solution: the cost. You pay on every prompt, in real time. If money is not a problem, it's a solution, but it will slow the agent down a lot.
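
To show why the cost of that last option scales with traffic, here is a rough Python sketch of the flow. It is only an illustration: `stub_classifier` stands in for the custom AI Builder prompt, `stub_agent` stands in for the rest of the agent, and both, along with `MeteredGuard` and `Verdict`, are hypothetical names rather than any real API. It assumes one classifier call per incoming message.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    allowed: bool
    reason: str = ""


class MeteredGuard:
    """Calls a classifier once per incoming message and counts the calls."""

    def __init__(self, classify: Callable[[str], Verdict]):
        self.classify = classify
        self.calls = 0

    def handle(self, message: str, agent: Callable[[str], str]) -> str:
        self.calls += 1  # assumption: each call here is one paid prompt
        verdict = self.classify(message)
        if not verdict.allowed:
            return f"Blocked: {verdict.reason}"
        return agent(message)


def stub_classifier(message: str) -> Verdict:
    # Placeholder for the rules prompt; a real one would be a model call.
    if "ssn" in message.lower():
        return Verdict(False, "personal identifier")
    return Verdict(True)


def stub_agent(message: str) -> str:
    # Placeholder for the agent's normal answer.
    return f"answer to: {message}"
```

Every message, harmless or not, increments `calls` and waits for the classifier before the agent answers, which is where both the cost and the added latency come from.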

Important information to take into account:

-> Even if you detect everything you want to detect, it's too late in the process: the message is already in the conversation log.

-> Microsoft AI doesn't train on data from the tenant. Conversation logs and inference data are not sent to OpenAI for training.

-> Dangerous subjects are blocked by Microsoft's responsible AI policy, by design.

-> It's a Copilot Studio agent: the agent only stores the data you explicitly store. So if a student sends a personal ID, it will be in the conversation log and nowhere else (unless you added something to store it, but that's another story, and Purview is there for that).

I hope my insight from previous experience on this helps :) If so, please mark this post as verified (turn it green with the verify answer button). It's very important for the community and for search engines :)