Has anyone figured out how to run automated tests against Copilot? Since there are some knobs to tweak, I'd like to ask it a set of questions and objectively gauge how the results change with each tweak. It would also be good to be able to make objective comparisons as Copilot evolves and new versions come out.