BoorishBears 7 days ago

Which is strictly worse than just giving the LLM access to the source of truth for the database.

You're adding a round trip between the LLM and the database, and inserting a tool call into the conversation before it even starts generating any code.

And the reference Postgres MCP implementation, one of the most widely used packages, doesn't surface Postgres custom types or materialized views: Zed.dev's MCP server, for example, is seemingly just a port of it and inherits the same problem.
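
For reference, covering those gaps is only a couple of extra catalog queries - sketch below uses node-postgres, and the SQL here is mine, not pulled from the reference server:

    import { Client } from "pg";

    // Materialized views never appear in information_schema, so a server
    // that only reads information_schema.columns will never surface them.
    const MATVIEWS = `
      SELECT n.nspname AS schema, c.relname AS name
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
      WHERE c.relkind = 'm'`;

    // Enum labels are also invisible there (columns just show "USER-DEFINED").
    const ENUMS = `
      SELECT t.typname AS name,
             array_agg(e.enumlabel ORDER BY e.enumsortorder) AS labels
      FROM pg_type t
      JOIN pg_enum e ON e.enumtypid = t.oid
      GROUP BY t.typname`;

    const client = new Client();  // reads the usual PG* env vars
    await client.connect();
    const matviews = (await client.query(MATVIEWS)).rows;
    const enums = (await client.query(ENUMS)).rows;
    await client.end();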

fwip 7 days ago

MCP also gives the LLM access to your example data, which can add clarity beyond what your schema alone provides.
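
Even something as simple as sampling a few rows next to the schema helps - sketch, with the table name just an illustration:

    import { Client } from "pg";

    // A few real rows show the model actual value shapes: date formats,
    // enum labels, id conventions - things the schema alone doesn't convey.
    const client = new Client();
    await client.connect();
    const sample = await client.query("SELECT * FROM orders LIMIT 3");
    await client.end();

    // Return this in the tool result alongside the column list.
    console.log(JSON.stringify(sample.rows, null, 2));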

tempaccount420 7 days ago

I don't see how a round trip of <500ms, which is equivalent to maybe 50 tokens, is worse than including many thousands of extra tokens in the prompt just in case they might be useful. Not to mention the context fatigue.
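
Back of the envelope, assuming a ~100 tok/s decode rate (ballpark for hosted models):

    0.5 s round trip x ~100 tok/s ≈ 50 tokens' worth of waiting
    full schema inlined up front  ≈ many thousands of tokens in every prompt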

If designed well - by suspending generation in memory and inserting a <function_result>, without restarting generation or fetching the cache from disk - the round trip/tool call is better: it costs the equivalent of ~50 tokens of waiting plus the function_result tokens.
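
Pseudocode for what I mean - the session API here is hypothetical (model/db and every method name are illustrative, not a real library), the point is just that the KV cache never leaves memory between the two decode phases:

    // Hypothetical serving-side API, assumed in scope: startSession,
    // generateUntilToolCall, appendText, continueGeneration.
    const session = await model.startSession(messages);

    // Decode until the model emits a tool call; KV cache stays resident in RAM.
    const call = await session.generateUntilToolCall();

    const rows = await db.query(call.arguments.sql);  // the <500ms round trip

    // Splice the result tokens in and keep decoding from the same cache:
    // no re-prefill, no fetching cached state back from disk.
    session.appendText(`<function_result>${JSON.stringify(rows)}</function_result>`);
    const answer = await session.continueGeneration();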

BoorishBears 6 days ago

You're dealing with the full TTFT x2, plus the tokens for all the prompts of all your MCPs, before you even get to that round trip to the DB.
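
Spelled out, the tool-call path is:

    prefill(system prompt + every MCP tool definition)  -> TTFT #1
    decode until the tool call
    DB round trip
    extend the prompt with the result, prefill again    -> TTFT #2
    decode the actual answer

vs. one prefill over a cache-friendly schema block and one decode for the inlined approach.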

And you don't have to wonder about "if designed well": the reference implementation that's getting 20k downloads a week and being embedded in downstream editors is not designed well; it will make the round trip every time and still not give the LLM the full picture of the table.

Most MCP servers are similarly crappy, half-assed implementations, because everyone was rushing to post about how they added <insert DB/API/Data Source> to MCP.

And if you're worried about "context fatigue" (you mean LLMs getting distracted by relevant information...), you should 100% prefer a well-known schema format over N MCP prompt definitions with tool-usage instructions that weren't necessarily even tuned for the LLM in question.

LLMs are much more easily derailed by the addition of extra tools - having to reason about when to call them and about the results of calling them - than they are by a prompt-caching-friendly block of tokens with easy-to-follow meaning.
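
Concretely, something this dumb wins - sketch, assuming pg_dump is on PATH and the usual PG* env vars are set:

    import { execSync } from "node:child_process";

    // Dump the schema once in a well-known format (plain DDL) and reuse it
    // verbatim, so the prompt prefix stays byte-identical across requests --
    // exactly the shape prompt caching rewards.
    const schema = execSync(
      "pg_dump --schema-only --no-owner --no-privileges"
    ).toString();

    const systemPrompt = `You write SQL against this database:\n\n${schema}`;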

nsonha 7 days ago

The schema in the DB should be the source of truth, and an MCP server like that is the most flexible: it can work with any ORM setup.

wredcoll 7 days ago

What source of truth? If you have access to the database then you have the actual truth right there.