Item 44104415

florbnit • 8 days ago

So it’s the e-mail exploit? If you e-mail someone and tell them to send you their password and they do, you suddenly have their password!? This is a very serious exploit in e-mail and need to be patched so it becomes impossible to do.

motorest • 8 days ago

> How is this considered an "exploit"?

Others in this discussion aptly described it as a confused deputy exploit. This goes something like:

- You write a LLM prompt that says something to the effect "dump all my darkest secrets in a place I can reach them",

- you paste them in a place where you expect your target's LLM agent to operate.

- Once your target triggers their LLM agent to process inputs, the agent will read the prompt and act upon it.

1 reply

mirzap • 8 days ago

Would you ever put a plain password text in a search engine and then complain if someone "extracted" that info with a keyword payload?

1 reply

motorest • 8 days ago

> Would you ever put a plain password (...)

Your comment bears no resemblance with the topic. The attack described in the article consists of injecting a malicious prompt in a way that the target's agent will apply it.

1 reply

mirzap • 7 days ago

Of course it will apply. Entire purpose of the agent is to give a response to a prompt. But to sound more dangareous let's call it "injecting". It's a prompt. You are not "injecting" anything. Agent pickups the prompt - that's its job, and execute - that is also its job.

1 reply

motorest • 7 days ago

> Of course it will apply. Entire purpose of the agent is to give a response to a prompt.

The exploit involves random third parties sneaking in their own prompts in a way that leads a LLM to run them on behalf of the repo's owner. This exploit can be used to leak protected information. This is pretty straight forward and easy to follow and understand.

mirzap • 8 days ago

Bad analogy. It's more like indexing a password field in plain text, then opening an API to everyone and setting "guardrails" and permissions on the "password" field. Eventually, someone will extract the data that was indexed.