Power Apps - AI Builder
Suggested answer

Extract Text Accurate Results

Posted on by 26
I am creating a solution to index invoices. I am not using the out-of-the-box (OOB) invoice feature in Power Automate/AI Builder because users would need to train it on invoices. Instead, we use a custom prompt to extract the values from a PDF invoice. My problem is that the prompt I am using produces very inconsistent results.
 
For example:
Prompt: Voucher BU (String) If the value has “MSC” in front, remove “MSC” and return only the remaining value.
Result: MSC150
Invoice: MSC50501
Expected return: 50501
 
Prompt: Department Number (String) Extract only the numeric portion of the department number. Must start with “Department Number” or “Dept. No”. If not present, leave blank.
Result: 8640
Invoice: Department NO 8606
Expected return: 8606
 
Prompt: PO Number (String) Extract only if explicitly labeled “PO Number” or “Purchase Order Number”. Ignore fields labeled Order #, CUS P/O #, Customer PO, GAS P/O #, or any variation of these. If the PO Number is fewer than 10 characters, leave blank. Do not infer PO Number from unrelated fields.
Result: 879583 
Invoice: 879583
Expected return: ""
 
I have reconstructed my prompt several times and it's hit or miss. If I can't make this more stable by the end of November, the business will want to keep using a very expensive OCR system, even though I know this solution can work. If I don't get consistently accurate results, my business users will lose faith in this product.
 
Copilot recommends adding conditional logic to my flow or using regex. My flow already has 7 conditions, and I don't want it to get bogged down.
 
I am looking for recommendations and tips because I don't know where else to turn. 
 
Thanks,
 
Shera H. 
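
For readers weighing the regex route mentioned in the question, here is a minimal Python sketch of what deterministic post-processing could look like for two of the fields. It is not an AI Builder or Power Automate API. The function names, the accepted label variants, and the idea of running this against the extracted text are all assumptions made for illustration.

```python
import re

# Assumed label variants: "Department Number", "Department NO", "Dept. No".
# The numeric part is captured only when one of these labels precedes it.
DEPT_PATTERN = re.compile(
    r"\b(?:Department\s+(?:Number|No)|Dept\.?\s*No)\b\.?\s*[:#]?\s*(\d+)",
    re.IGNORECASE,
)


def department_number(text: str) -> str:
    """Return the digits after a department label, or "" if no label is found."""
    match = DEPT_PATTERN.search(text)
    return match.group(1) if match else ""


def po_number(value: str, min_length: int = 10) -> str:
    """Blank out a PO Number that is shorter than min_length characters."""
    value = value.strip()
    return value if len(value) >= min_length else ""


print(department_number("Department NO 8606"))  # 8606
print(repr(po_number("879583")))                # ''
```

A check like this cannot repair a value the model misread, but it can enforce the length and label rules without relying on the prompt.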
 
 
 
  • Suggested answer
    SpongYe
    5,603 Super User 2025 Season 2
     
    It all comes down to prompt engineering: clear, specific prompts produce consistently better outcomes.
    Something that often helps is to include about five varied examples (positive and negative) in the prompt.
    Here is my example of the prompt for Voucher BU:
    You are an extraction engine. Return only JSON that conforms to the schema.
    
    Target field
    
    voucherBU: From the invoice text, locate the Voucher BU value. If the value begins with the literal MSC (case-insensitive), remove exactly that leading MSC and any immediate spaces, then return the remaining characters. If it does not begin with MSC, return the value unchanged. If no Voucher BU value is present, return "". Do not infer or fabricate values.
    
    Important details
    
    Only strip MSC when it is a prefix of the value. Do not remove MSC appearing in the middle or after other characters.
    
    Do not remove any other letters; do not remove numbers.
    
    Trim surrounding whitespace.
    
    Few-shot examples
    
    Input: Voucher BU: MSC50501 → {"voucherBU":"50501"}
    
    Input: Voucher BU MSC 150 → {"voucherBU":"150"}
    
    Input: Voucher BU: 8640 → {"voucherBU":"8640"}
    
    Input: Voucher BU: XMSC50501 (MSC not at start) → {"voucherBU":"XMSC50501"}
    
    Input: voucher bu: msc 000123 → {"voucherBU":"000123"}
    
    Document text:
    {invoice_text}

    I hope this helps and gives you other ideas.
    Good luck! 
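
The Voucher BU rule and the five few-shot examples in the reply above can also be written down as a deterministic reference, which is handy for spot-checking what the prompt returns. The following is a sketch only: the function name `extract_voucher_bu` and both regular expressions are assumptions, and it expects the label and value to sit on one line of plain text.

```python
import re

# "Voucher BU" label (any case), optional colon, then the rest of the line.
VOUCHER_LINE = re.compile(r"voucher\s+bu\s*:?\s*(.*)", re.IGNORECASE)
# A leading "MSC" (any case) plus any spaces directly after it.
MSC_PREFIX = re.compile(r"^MSC\s*", re.IGNORECASE)


def extract_voucher_bu(text: str) -> dict:
    """Apply the prefix-stripping rule; return "" when no label is present."""
    match = VOUCHER_LINE.search(text)
    if not match:
        return {"voucherBU": ""}
    value = match.group(1).strip()
    return {"voucherBU": MSC_PREFIX.sub("", value).strip()}


# The five few-shot inputs from the prompt, with their expected values.
FEW_SHOT = [
    ("Voucher BU: MSC50501", "50501"),
    ("Voucher BU MSC 150", "150"),
    ("Voucher BU: 8640", "8640"),
    ("Voucher BU: XMSC50501", "XMSC50501"),
    ("voucher bu: msc 000123", "000123"),
]

for text, expected in FEW_SHOT:
    assert extract_voucher_bu(text) == {"voucherBU": expected}
```

Running the same inputs through the prompt and comparing against this reference is one way to measure how consistent the prompt is before and after a change.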

     

  • SheraHintzen
    26
    Thank you for that example. That is very helpful. I will update my prompt and see if that helps. Here is my entire prompt that I am using for this process. The other properties were populating correctly so I didn't include them in my original post. Can a prompt in AI Builder be too long?
     

    Task

    Extract and organize key details from invoices with 90% or higher confidence.

    Sections Required

    Vendor Name (String)

    - Extract the name of the company issuing the invoice (typically found in the header or footer of the document, not in the "Accounts Payable" section).

    - Remove all special characters except the ampersand (&).

    Department Number (String)

    - Extract only the numeric portion of the department number.

    - Must start with “Department Number” or “Dept. No” or “Department NO”. If not present, leave blank.

    Invoice Amount (Decimal)

    - Extract as a decimal value.

    Invoice Date (Date)

    - Format as mm/dd/yyyy.

    Invoice Number (String)

    - If not present, leave blank.

    PO Number (String)

    - Extract only if explicitly labeled “PO Number” or “Purchase Order Number”.

    - Ignore fields labeled Order #, CUS P/O #, Customer PO, GAS P/O #, or any variation of these.

    - If the PO Number is fewer than 10 characters, leave blank.

    - Do not infer PO Number from unrelated fields.

    Contract Number (String)

    - May be labeled “Statement of Service” or “Customer PO”.

    - If PO Number is present, leave Contract Number blank.

    Voucher BU (String)

    - Extract only the numeric portion of the Voucher BU.

    - May start with “MSC”.

    Additional Rules

    - PO Number and Contract Number are not the same as Agreement Number.

    - If PO Number is present leave Contract Number blank.

    - Output must be structured in valid JSON format.

    Data Source

    <Invoice PDF>

    Output Format

    Return the extracted data in this JSON structure:

    {

      "Vendor Name": "",

      "Department Number": "",

      "Invoice Amount": 0.00,

      "Invoice Date": "mm/dd/yyyy",

      "Invoice Number": "",

      "PO Number": "",

      "Contract Number": "",

      "Voucher BU": ""

    }
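
Since the prompt fixes an output structure, one option is to validate each response against it before the values are used. The sketch below is a hypothetical Python check, not part of AI Builder. The field names come from the JSON structure above, while the type mapping, the function name `validate_invoice`, and the wording of the messages are assumptions.

```python
import json
from datetime import datetime

# Field names are taken from the JSON structure; the types are an assumption.
EXPECTED_FIELDS = {
    "Vendor Name": str,
    "Department Number": str,
    "Invoice Amount": (int, float),
    "Invoice Date": str,
    "Invoice Number": str,
    "PO Number": str,
    "Contract Number": str,
    "Voucher BU": str,
}


def validate_invoice(raw: str) -> list:
    """Return a list of problems; an empty list means the output passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        elif isinstance(data[name], bool) or not isinstance(data[name], expected_type):
            problems.append(f"wrong type for field: {name}")

    # A non-empty date must parse as month/day/year.
    date = data.get("Invoice Date")
    if isinstance(date, str) and date:
        try:
            datetime.strptime(date, "%m/%d/%Y")
        except ValueError:
            problems.append("Invoice Date is not a month/day/year date")

    # Rule from the prompt: Contract Number stays blank when PO Number is present.
    if data.get("PO Number") and data.get("Contract Number"):
        problems.append("PO Number and Contract Number are both populated")
    return problems
```

An empty list means the response matched the expected shape. Anything else can be logged or routed for manual review.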

     

     

     

    I did add a condition to my flow for the PO Number just in case the prompt fails.

     

    {
      "type": "If",
      "expression": {
        "and": [
          {
            "equals": [
              "@not(empty(body('Parse_JSON')?['PO Number']))",
              true
            ]
          },
          {
            "equals": [
              "@length(body('Parse_JSON')?['PO Number'])",
              10
            ]
          }
        ]
      }
    }
     
    This seems to help. 
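
One detail may be worth double-checking: the flow condition above keeps a PO Number only when it is exactly 10 characters long, while the prompt says to blank it only when it is fewer than 10 characters. The Python sketch below mirrors both readings to show where they diverge. The function names are hypothetical, and if PO Numbers are always exactly 10 characters the two checks agree.

```python
def passes_flow_condition(po_number) -> bool:
    """Mirror of the flow condition: non-empty and exactly 10 characters."""
    return bool(po_number) and len(po_number) == 10


def passes_prompt_rule(po_number) -> bool:
    """The prompt's wording: rejected only when fewer than 10 characters."""
    return bool(po_number) and len(po_number) >= 10


for candidate in ["879583", "1234567890", "123456789012"]:
    print(candidate, passes_flow_condition(candidate), passes_prompt_rule(candidate))
# 879583 False False
# 1234567890 True True
# 123456789012 False True
```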
