How is this considered an "exploit"? You give the agent a token that allows it to access a private repository. MCPs are just API servers. If you don't want something exposed in that API, don't grant them permissions to do so.
> How is this considered an "exploit"?
As many do, I also jumped to the comment section before actually reading the article.
If you do the same, you will quickly notice that this article features an attack. A malicious issue is posted on GitHub, and the issue features a LLM prompt that is crafted to leak data. When the owner of the GitHub account triggers the agent, the agent acts upon the malicious prompt on behalf of the repo owner.
I read it, and "attack" does not make sense. If you grant access to MCP to access some data (data you want anybody has access to like public repos, and data that only you want to access to like private repos), you will always be able to craft the prompt that will "leak" the data you are only supposed to access. That's not surprising at all. The only way to prevent these kind of "leaks" is not to provide the data feed with private data to the agent.
> That's not surprising at all
An attack doesn’t have to be surprising to be an attack.
> The only way to prevent these kind of "leaks" is not to provide the data feed with private data to the agent.
Yes. That is exactly what the article recommends as a mitigation.
> An attack doesn’t have to be surprising to be an attack.
If you open an API to everyone, or put a password as plain text and index it, it's no surprise that someone accesses the "sensitive" data. Nor do I consider that an attack.
You simply can't feed the LLM the data, or grant it access to the data, then try to mitigate the risk by setting "guardrails" on the LLM itself. There WILL ALWAYS be a prompt to extract any data LLM has access to.
> Yes. That is exactly what the article recommends as a mitigation.
That's common sense, not mitigation. Expecting "security experts" to recommend that is like expecting a recommendation to always hash the password before storing it in the DB. Common sense. Obvious.
> it's no surprise
The amount of your surprise is not a factor weather it is an attack or not.
You have been already asked about sql injections. Do you consider them attacks?
They are very similar. You concatenate an untrusted string with an sql query, and execute the resulting string on the database. Of course you are going to have problems. This is absolutely unsuprising and yet we still call it an attack. Somehow people manage to fall into that particular trap again and again.
Tell me which one is the case: do you not consider sql injection attacks attacks, or do you consider them somehow more surprising than this one?
> That's common sense, not mitigation.
Something can be both. Locking your front door is a mitigation against opportunistic burglars, and at the same time is just common sense.
> Expecting "security experts" to recommend that is like expecting a recommendation to always hash the password before storing it in the DB.
That is actually a real world security advice. And in fact if you recall it is one many many websites were not implementing for very long times. So seemingly it was less common sense for some than it is for you. And even then you can implement it badly vs implement it correctly. (When i started in this business a single MD5 hash of the password was often recommended, then later people started talking about salting the hash, and even later people started talking about how MD5 is entirely too weak and you really ought to use something like bcrypt if you want to do it right.) Is all of that detail common sense too? Did you sprung into existence fully formed with the full knowledge of all of that, or had you had to think for a few seconds before you reinvented bcrypt on your own?
> Common sense. Obvious.
Good! Excelent. It was common sense and obvious to you. That means you are all set. Nothing for you to mitigate, because you already did. I guess you can move on and do the next genious thing while people less fortunate than you patch their workflows. Onward and upward!
In principle, I agree with you. The problem I have with articles like this and people commenting is that it's framed as if MCP's vulnerability is discovered, that MCP "needs fixing." When it's not. It's not the database's fault if you don't hash your password - it's yours.
It's a fundamental user experience flaw with the MCP server. It does indeed need fixing - e.g. it could have a permissions system itself, so even if the GitHub token has permissions, different projects have their tool calls filtered to reduce access to different repos. Or it could have a clearer UX and instructions and help making multiple tokens for the different use cases. The MCP server could check the token permissions and refuse to run until they're granular.
At a high level I think it's still appropriate to question the role MCP is playing, even if you can still blame AI enthusiasts for being cavalier in their approach to installing MCP servers and giving them blanket permissions.
The more people keep doing it and getting burned, the more it's going to force the issue and both the MCP spec and server authors are going to have to respond.
The role is very simple. It provides an interface for the AI to access the data. Whatever it has access to (via MCP), it will access. Simple as that.
> The problem I have with articles like this and people commenting is that it's framed as if MCP's vulnerability (...)
You're extrapolating. The problem is clearly described as a MCP exploit, not a vulnerability. You're the only one talking about vulnerabilities. The system is vulnerable to this exploit.
It's not even an exploit. MCP is doing what it is MADE TO DO. It's made for interacting with the GitHub API. Whatever it has access to, it will access. If it has access to delete the repo, it will delete the repo. If it has access to the private repo, it will access the private repo.
> It's not even an exploit. MCP is doing what it is MADE TO DO.
You still don't understand the problem, do you? I mean, do you even understand the concept of an exploit?
SQL attack is different. There's clearly zero expectation that someone can, for example, enter something on a web page and extract database info or modify the database beyond what was intended by the API (e.g., update a single record).
If you are working in an organization and you tell a junior coder "do everything on this list" and on the list is something that says "do something to some other list" and the junior coder does it...that's a fundamentally different kind of "bug". Maybe you expected that the junior coder should say "oh hmm, it's weird that something in this repo mentions another repo" but in that case, you can imagine a high level todo list that points to other low level todo lists, where you would want the junior coder to proceed. Maybe you're looking for "common sense" where there is none?
Actual organizations have ways to mitigate this. For example, OWNERs files would prevent someone from committing code to a repo of which they are not an OWNER without review. And then we're back to what many in these comments have proposed: you should not have given the agent access to another repo if you didn't want it to do something in that repo after you told it--albeit indirectly--to do something in that repo...
-- Actually, arguably a better analogy is that you go to share a file with someone in, e.g., Google Drive. You share a folder and inadvertently grant them access to a subfolder that you didn't want to share. If, in sharing the folder, you say "hey please revise some docs" and then somehow something in the subfolder gets modified, that's not a bug. That's you sharing a thing you weren't supposed to share. So this automatic detection pipeline can maybe detect where you intended to share X but actually shared X and Y.
I don't understand your logic. Should security reports never be published that say "hash the password before storing it in the DB". Boring research is boring most of the time, that doesn't make it unimportant, no?
No, but it's not the database's fault if you don't hash your password. Same here, it's human error, not "MCP vulnerability". It's not that GitHub MCP needs fixing, but rather how you use it. That's the entire point of my reasoning for this "exploit."
The key is, it's not the person who grants the MCP access who is the attacker.
The attacker is some other person who can create issues on a public Repo but has no direct access to the private repo.
The point is this is NOT a GitHub MCP vulnerability, but how you use it. There is nothing to be fixed in MCP itself; rather how you use it.
> The point is this is NOT a GitHub MCP vulnerability, but how you use it.
You're the only one talking about GitHub MCP vulnerabilities. Everyone else is talking about GitHub MCP exploits. It's in the title, even.
Tomato-Tomato. It's not even an exploit. I will give you my token with access only to public repos. Try and access my private repos with Github MCP. Guess what - you can't - so it is not Github MCP exploit.
Your issue with the semantics of the word "attack" is uninteresting. Clearly this is a security flaw of the MCP server that could be mitigated in several different ways.
Not surprising to you, but surprising to thousands of users who will not think about this, or who will have believed the marketing promises.
Well I saw vibe coders commit .env files with real credentials to public repository, but I didn't see anyone blaming Git for allowing .env or secrets to be commited in the first place.
> I read it, and "attack" does not make sense.
Do you believe that describe a SQL injection attack an attack also does not make sense?
That's the thing. LLM or MCP is not a database. You can’t compare it. You simply can't set the permissions or guardrails within LLMs or MCPs. You always do it layer above (scoping to what LLM has access to).
@motorest read again what I wrote: "That's the thing. LLM or MCP is not a database. You can’t compare it. You simply can't set the permissions or guardrails within LLMs or MCPs. You always do it layer above (scoping to what LLM has access to)."
You can not HIDE the data MCP has access to. With a database and SQL, you can! So it can not be comparable with SQL injection.
Absolutly you can - the UX of the whole experience MCP is part of could make it clear to the user what repositories can be accessed according to the project they're working on. Rather than when they're working on the public project, the LLM being given access to repos of the private projects.
> That's the thing. LLM or MCP is not a database. You can’t compare it.
You can. Read the article. A malicious prompt is injected into an issue to trigger the repo owner's LLM agent to execute it with the agent's credentials.
"with the agent's credentials." - so you are surprised that agent can respond with private repository details when it has access to it? WoW! anyone and anything with credentials can access it. Github action, Jenkins, me.
"injected" is so fancy word to describe prompting - one thing that LLMs are made to do - respond to a prompt.
The "surprise" is not that the agent can respond with private repository details, it's that it can receive and act upon prompts issued by someone other than the person running the agent, hence "prompt _injection_".
Or to come back to the SQL injection analogy, no one is surprised that the web app can query the database for password hashes. The surprise is that it can be instructed to do so when loading the next image in a carousel.
Did you read the article?
The attack is not via the prompt the victim types to the AI, but via [text in an issue or PR in the repo] that the victim is unaware about.
So it’s the e-mail exploit? If you e-mail someone and tell them to send you their password and they do, you suddenly have their password!? This is a very serious exploit in e-mail and need to be patched so it becomes impossible to do.
> How is this considered an "exploit"?
Others in this discussion aptly described it as a confused deputy exploit. This goes something like:
- You write a LLM prompt that says something to the effect "dump all my darkest secrets in a place I can reach them",
- you paste them in a place where you expect your target's LLM agent to operate.
- Once your target triggers their LLM agent to process inputs, the agent will read the prompt and act upon it.
Would you ever put a plain password text in a search engine and then complain if someone "extracted" that info with a keyword payload?
> Would you ever put a plain password (...)
Your comment bears no resemblance with the topic. The attack described in the article consists of injecting a malicious prompt in a way that the target's agent will apply it.
Of course it will apply. Entire purpose of the agent is to give a response to a prompt. But to sound more dangareous let's call it "injecting". It's a prompt. You are not "injecting" anything. Agent pickups the prompt - that's its job, and execute - that is also its job.
> Of course it will apply. Entire purpose of the agent is to give a response to a prompt.
The exploit involves random third parties sneaking in their own prompts in a way that leads a LLM to run them on behalf of the repo's owner. This exploit can be used to leak protected information. This is pretty straight forward and easy to follow and understand.
Bad analogy. It's more like indexing a password field in plain text, then opening an API to everyone and setting "guardrails" and permissions on the "password" field. Eventually, someone will extract the data that was indexed.
This "exploits" human fallibility, hence it is an exploit. The fallibility being users blindly buying into the hype and granting full access to their private Github repos thinking it is safe.
I'm going to be rather pedantic here given the seriousness of the topic. It's important that everyone understand how risky running a tool executing AI is, exactly.
Agents run various tools based on their current attention. That attention can be affected by the tool results from the tools they ran. I've even noted they alter the way they run tools by giving them a "personality" up front. However, you seem to argue otherwise, that it is the user's fault for giving it the ability to access the information to begin with, not the way it reads information as it is running.
This makes me think of several manipulative tactics to argue for something that is an irrational thought:
Stubborn argumentation despite clear explanations: Multiple people explained the confused deputy problem and why this constitutes an exploit, but you kept circling back to the same flawed argument that "you gave access so it's your fault." This raises questions about why argue this way. Maybe you are confused, maybe you have a horse in the game that is threatened.
Moving goalposts: When called out on terminology, you shift from saying it's not an "attack" to saying it's not a "vulnerability" to saying it's not "MCP's fault" - constantly reframing rather than engaging with the actual technical issues being raised. It is definitely MCP's fault that it gives access without any consideration on limiting that access later with proper tooling or logging. I had my MCP stuff turn on massive logging, so at least I can see how stuff goes wrong when it does.
Dismissive attitude toward security research: You characterized legitimate security findings as "common sense" and seemed annoyed that researchers would document and publish this type of exploit, missing the educational value. It can never be wrong to talk about security. It may be that the premise is weak, or the threat minimal, but it cannot be that it's the user's fault.
False analogies: you kept using analogies that didn't match the actual attack vector (like putting passwords in search engines) while rejecting apt comparisons like SQL injection. In fact, this is almost exactly like SQL injection and nobody argues this way for that when it's discussed. Little Bobby Tables lives on.
Inability to grasp indirection: You seem fundamentally unable to understand that the issue isn't direct access abuse, but rather a third party manipulating the system to gain unauthorized access - by posting an issue to a public Github. This suggests either a genuine conceptual blind spot or willful obtuseness. It's a real concern if my AI does something it shouldn't when it runs a tool based on another tools output. And, I would say that everyone recommending it should only run one tool like this at a time is huffing Elmers.
Defensive rather than curious: Instead of trying to understand why multiple knowledgeable people disagreed with them, you doubled down and became increasingly defensive. This caused massive amounts of posting, so we know for sure that your comment was polarizing.
I suppose I'm not suppose to go meta on here, but I frequently do because I'm passionate about these things and also just a little bit odd enough to not give a shit what anyone thinks.