Nice work! How do you verify the correctness of the generated exercises and explanations? To me this looks like the biggest risk in becoming a user: what if my _teacher_ teaches me subtle nonsense that I can't easily detect, since I'm learning and unfamiliar with the material (even if it's only in 1-2% of cases)? Human teachers make mistakes too, but an LLM can't _understand_ that it made one... So how do you solve this issue of trust?
They don't verify; they just ship an LLM app, and the user suffers when the information is wrong. Most of the time it's correct, but sometimes it isn't. One way to verify correctness is to ask a bigger model, like OpenAI o3, to check the output.
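As a rough illustration, a second-pass check is only a few lines. This is a minimal sketch, assuming the OpenAI Python SDK and that a model named "o3" is available to your API key; the prompt wording is made up:

```python
# Minimal sketch of a "bigger model as fact-checker" pass.
# Assumes the OpenAI Python SDK (pip install openai) and an API key in the
# environment; the model name "o3" and the prompt are illustrative.
from openai import OpenAI

client = OpenAI()

def verify(material: str) -> str:
    """Ask a second model to fact-check generated teaching material."""
    resp = client.chat.completions.create(
        model="o3",  # assumed; substitute whatever strong model you have
        messages=[{
            "role": "user",
            "content": (
                "Fact-check the following explanation for a student. "
                "List any factual errors, or reply 'OK' if you find none.\n\n"
                + material
            ),
        }],
    )
    return resp.choices[0].message.content

print(verify("The capital of Australia is Sydney."))
```

It won't catch everything, and it costs an extra call per exercise, but it does knock down the error rate.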
I’m not sure how the solution to “LLMs lie” is “more LLM”. I’ve personally had o3 tell me things like “I ran this query against version 0.10 of DuckDB and verified that it works” when the query contains functions that don’t exist in DuckDB, or “this version gets better PageSpeed Insights results” when I know it can’t check those. It happens surprisingly often, and the failures are super obvious, but they’ve made me seriously wary of any information it gives me that I can’t verify purely through logic.
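When a claim is mechanically checkable, I'd rather check it than ask another model. Here's a hypothetical sketch for the DuckDB case, assuming the duckdb Python package: actually run the generated query and let DuckDB reject functions that don't exist.

```python
# Sketch of mechanical verification: instead of trusting "I ran this
# query and it works", actually run it. Assumes the duckdb Python
# package (pip install duckdb); the queries below are illustrative.
import duckdb

def query_runs(sql: str) -> bool:
    """Return True if DuckDB can parse, bind, and execute the query."""
    try:
        duckdb.sql(sql).fetchall()  # in-memory DB; unknown functions raise
        return True
    except duckdb.Error as e:
        print(f"Query rejected: {e}")
        return False

print(query_runs("SELECT made_up_function(42)"))  # False: no such function
print(query_runs("SELECT 42 AS answer"))          # True
```

It doesn't prove the query computes the right thing, but it kills the whole class of "confidently cites functions that don't exist" errors.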