Copilot Studio - General
Suggested answer

How to let my LLM bot interpret queries but only answer from uploaded sources?

I’m building a custom assistant in a no-code LLM bot platform for an internal classification task.  
The bot needs to answer only from two uploaded sources (an XML hierarchy file and a PDF manual).  
Here’s the problem:  
 
- If I turn **model knowledge ON**, it’s great at interpreting vague or messy user input, but it starts inventing answers that don’t exist in my sources.  
- If I turn **model knowledge OFF**, it sticks to the sources but fails at interpreting queries — it often says “not sure” even when the match clearly exists.  
- If I keep **model knowledge ON** and give it strict instructions to only answer from the sources, it still sometimes ignores that and blends in hallucinated results.  
 
What I *want* is:  
1. Use LLM reasoning only to interpret and normalize the user’s query (synonyms, typos, related terms).  
2. Always pull actual answers strictly from my uploaded XML/PDF.  
3. If there’s no exact match, go up one level in the XML hierarchy and try again.  
4. If still nothing, say “no match found” — never fabricate.  
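In rough Python, the flow I'm after looks like this (everything here is a toy stand-in with invented names, not the platform's actual API):

```python
# Toy sketch of the desired flow; all data and function names are hypothetical.

# Minimal stand-in for the uploaded XML hierarchy: child node -> parent node.
HIERARCHY = {
    "laser printers": "printers",
    "inkjet printers": "printers",
    "printers": "office hardware",
}

# Stand-in for source-grounded answers keyed by hierarchy node.
SOURCE_ANSWERS = {
    "printers": "See manual section 3: printer classification.",
    "office hardware": "See manual section 1: hardware overview.",
}

def normalize(query: str) -> str:
    """Step 1: the ONLY place the LLM would act -- here a trivial stand-in
    that fixes one known typo/synonym."""
    synonyms = {"lazer printers": "laser printers"}
    q = query.strip().lower()
    return synonyms.get(q, q)

def answer(query: str) -> str:
    node = normalize(query)                 # 1. interpret only
    if node in SOURCE_ANSWERS:              # 2. exact match in sources
        return SOURCE_ANSWERS[node]
    parent = HIERARCHY.get(node)            # 3. climb one level, retry once
    if parent and parent in SOURCE_ANSWERS:
        return SOURCE_ANSWERS[parent]
    return "no match found"                 # 4. refuse, never fabricate

print(answer("Lazer Printers"))   # -> "See manual section 3: printer classification."
print(answer("coffee machines"))  # -> "no match found"
```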
 
Has anyone managed to set up their bot so that it can use the LLM’s semantic abilities **only** for interpreting queries, but keep the actual output 100% source-grounded?  
Looking for any working patterns, prompt engineering tips, or configuration tricks that have worked for you.
 
  • Suggested answer
    ChristenC
     
    We ran into a pretty similar trade-off:
    Model Knowledge ON = great interpretation but blended answers; OFF = faithful to sources but poor understanding. What worked for us is a simple pattern:
    Interpret > Retrieve > Answer-or-Refuse.
     

    Context: I’m using a no-code LLM bot as a phone tech assistant. It must answer only from an XML hierarchy + PDF manuals. After a rather painful amount of trial and error, I landed on the settings and configuration detailed below. Since then, I have been getting source-cited answers (often with part numbers) and clean refusals when nothing matches.

    Initial setup and configuration:
    • Generative AI: ON (for query interpretation only)
    • Web/browse: OFF
    • Grounding: Restrict to uploaded sources only
    • Citations: Required; if no citation is available, the bot must refuse
     
    If I’m understanding your end goal correctly, these are the instructions I would use when setting up the agent. If you do copy them, please adapt them to your agent’s specific scope.

    System/Instructions
    "Goal: Use the model’s semantic ability to interpret user input (synonyms, typos, related terms). Do not invent facts.
    Retrieval policy:
    • Search the uploaded XML and PDF only.
    • Prefer exact node/section matches; if none, climb one level up the XML hierarchy and try again once.
    • If still no match, reply: “No match found in sources.”

    Answer policy:
    • Compose answers from retrieved snippets; paraphrase is fine but stay faithful.
    • Include citation(s): for XML, include the node path/XPath; for PDFs, include page/section.
    • If confidence is low or no snippet supports the claim, refuse with the “no match” message.
    • Disallowed: using the model’s internal knowledge, the web, or unstated assumptions."
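To make the retrieval policy concrete, here's a toy sketch using Python's standard `xml.etree`. The XML shape and tag names are invented, and this shows one way to interpret the "climb one level" rule (fall back to the enclosing category when the exact item isn't found); it is not the platform's actual retrieval mechanism:

```python
# Illustrative only: invented XML shape, invented attribute names.
import xml.etree.ElementTree as ET

DOC = """
<catalog>
  <category name="phones">
    <category name="smartphones">
      <item name="screen replacement" answer="See PDF p.12"/>
    </category>
  </category>
</catalog>
"""

root = ET.fromstring(DOC)
# ElementTree has no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}

def node_path(el):
    """Citation string: slash-joined names from root down to the element."""
    parts = []
    while el is not None:
        parts.append(el.get("name", el.tag))
        el = parents.get(el)
    return "/".join(reversed(parts))

def retrieve(term):
    # Prefer an exact item match.
    for item in root.iter("item"):
        if item.get("name") == term:
            return item.get("answer"), node_path(item)
    # No exact match: try one level up -- a category containing items.
    for cat in root.iter("category"):
        if cat.get("name") == term:
            item = cat.find(".//item")
            if item is not None:
                return item.get("answer"), node_path(item)
    # Still nothing: signal a refusal to the caller.
    return None, None

ans, cite = retrieve("screen replacement")
print(ans, "| cited at:", cite)
no_ans, _ = retrieve("toasters")
print(no_ans or "No match found in sources.")
```

The `node_path` string doubles as the XPath-style citation the answer policy asks for.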
    Practical tips that helped
    • Disable web search to prevent leakage.
    • Teach the agent how to read each file type: e.g., “For Excel, look in sheet ‘Y’, column ‘X’…; for XML, use attributes A/B; for PDF, prefer headings and tables.” (Short, explicit rules beat vague instructions.)
    • Child agents: Create one child agent for the XML and one for the PDF, and set the priority so the XML child is queried first, then the PDF. (This reduced blending for us and made citations cleaner.)
    • Refusal first, then answer: Have the bot check “Do I have a supporting snippet?” prior to composing an answer. If no, refuse.
    • Output contract: Ask for a fixed format: short answer > supporting bullets with citations > “no match found” when appropriate.
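The output contract can be enforced by something as small as a formatter that refuses whenever there are no supporting snippets. A toy Python sketch, with invented data shapes:

```python
# Hypothetical shapes: 'snippets' is a list of (text, citation) pairs
# pulled from the uploaded sources; an empty list forces the refusal.

def render(short_answer, snippets):
    if not snippets:
        return "No match found in sources."
    lines = [short_answer]
    for text, cite in snippets:
        lines.append(f"- {text} [{cite}]")
    return "\n".join(lines)

print(render("Use toner cartridge TN-423.",
             [("Replacement toner: TN-423", "manual.pdf p.41")]))
print(render("Anything", []))  # -> "No match found in sources."
```

Because the refusal check runs before any answer text is composed, the bot can't "blend" a fabricated claim into an uncited response.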

     
    That combination let us keep the LLM’s semantic interpretation while keeping outputs 100% grounded in the XML/PDF. Hope this helps!

