Out of interest: does the resulting data get used by the LLM, or is the LLM just generating the SQL, with the query executed and the results returned separately?
PM on the project here. The results from the query are generally not used by the LLM. In agent mode, though, during query planning, the agent may retrieve a sample of the data to improve the precision of the queries. For example, it may get the distinct values from a dimension table to resolve a filter condition from a natural-language statement.
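To make that last example concrete, here is a rough sketch of what such a lookup could look like. It is not the extension's actual code: the table and column names (`dim_region`, `region_name`, `fact_sales`) and the helper `resolve_filter_value` are hypothetical, and plain fuzzy string matching from Python's `difflib` stands in for whatever matching the agent really does.

```python
import difflib
import sqlite3


def resolve_filter_value(conn, table, column, phrase):
    """Map a user's phrase to the closest stored value in a dimension column."""
    # Identifiers cannot be bound as parameters; this sketch assumes they come
    # from trusted schema metadata, never from the user's text.
    rows = conn.execute(f'SELECT DISTINCT "{column}" FROM "{table}"').fetchall()
    by_lower = {str(v).lower(): v for (v,) in rows if v is not None}
    # Fuzzy matching is a stand-in for the agent's own resolution step.
    match = difflib.get_close_matches(phrase.lower(), list(by_lower), n=1, cutoff=0.6)
    return by_lower[match[0]] if match else None


# Hypothetical dimension table, held in memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_region (region_name TEXT)")
conn.executemany(
    "INSERT INTO dim_region VALUES (?)",
    [("North America",), ("EMEA",), ("Asia Pacific",)],
)

value = resolve_filter_value(conn, "dim_region", "region_name", "north america")
# The resolved literal is bound as a parameter rather than pasted into the SQL.
sql = "SELECT SUM(amount) FROM fact_sales WHERE region_name = ?"
params = (value,)
```

In a flow like this, only the distinct values of the one dimension column are read during planning, not the result set of the final query.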
Thanks. I worry about these kinds of tools connecting to production databases, especially considering how easy it is to switch out LLM endpoints. Where that data is going, how it is retained, the context, etc. becomes a bit of a privacy nightmare.
Absolutely valid concern. Our extension connects to LLMs through GitHub Copilot. GitHub Copilot is a Microsoft product and offers a variety of enterprise plans, which enable your IT department to approve what can be used for what kind of data. This gives you a clear path towards compliance with your enterprise requirements.
Makes sense. Appreciate the responses. Honestly though, as a person outside the US, I'm removing my dependence on US companies' IT tools and infrastructure (GitHub, VS Code, AWS, etc.), enterprise or otherwise. Congrats on the project though.