Can you give me more resources to read about this? It seems like it would be very difficult to incorporate web search or anything like that in Cursor or another IDE safely.
It is. Nearly any communication with the outside world can be used to exfiltrate data. Tools that give LLMs this ability along with access to private data are basically operating on hope right now.
Web search is mostly fine, as long as the tool can only access pre-indexed URLs and you trust the search provider not to be colluding with the attacker.
It would be even better if web content were served from cache (which makes side channels based on request patterns much harder to construct), but the anti-copyright-infringement crowd would probably balk at that idea.
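To make the "pre-indexed URLs, served from cache" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `CachedFetcher` class, its `fetch` method, the example URLs); it is not how Cursor or any real tool works, just an illustration of the principle that the fetch tool never makes a live request for a URL the model supplies.

```python
from typing import Dict, Optional


class CachedFetcher:
    """Hypothetical fetch tool that serves only pre-indexed pages from a local cache."""

    def __init__(self, cache: Dict[str, str]) -> None:
        # Keys are the exact URLs an indexer crawled ahead of time;
        # values are the stored page contents.
        self._cache = dict(cache)

    def fetch(self, url: str) -> Optional[str]:
        # Exact-match lookup only. No network request is ever made, so a URL
        # the model invents (for example, one with secrets packed into the
        # query string) never reaches an attacker-controlled server.
        return self._cache.get(url)


fetcher = CachedFetcher({"https://example.com/docs": "cached page body"})

# A pre-indexed URL is served from cache.
print(fetcher.fetch("https://example.com/docs"))

# A URL that was never indexed is refused rather than fetched live.
print(fetcher.fetch("https://example.com/docs?leak=SECRET"))
```

Note that this only closes the fetch channel. The search query itself still goes to the search provider, which is why you have to trust the provider as well.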