The problem with shallow technical knowledge and, even worse, talking about it with confidence like the author does, is that it can propagate misinformation and lose a ton of nuance.
For example, the author talks very confidently about indexes and draws a few conclusions, but they aren't as correct as his confidence suggests.
> That an index is only useful if it matches the actual query your application is making. If you index on “name plus email” and then start querying on “name plus address”, your index won’t be used and you’re back to the full table scan described in (1)
Not true. If you have single column indexes on both name and email, it could use the two indexes, though not as efficient as a single two-column index. If you query "name plus email" and the index is "name, email, age" then it could use the index.
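To make the second point checkable, here is a minimal sketch using SQLite, chosen only because it ships with Python; other planners may decide differently. The `people` table, the index name, and the `plan_for` helper are made up for illustration. It asks the planner how it would run a "name plus email" query when the only index is on `(name, email, age)`.

```python
import sqlite3


def plan_for(conn, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, name TEXT, email TEXT, age INTEGER, address TEXT)"
)
# Hypothetical three-column index; the query below only constrains the first two.
conn.execute("CREATE INDEX idx_name_email_age ON people (name, email, age)")

prefix_plan = plan_for(
    conn,
    "SELECT * FROM people WHERE name = ? AND email = ?",
    ("alice", "alice@example.com"),
)
print(prefix_plan)
```

The printed plan should name `idx_name_email_age`, since the query matches a leading prefix of the index columns. Combining two single-column indexes is a separate optimization that this sketch does not exercise.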
> That indexes massively speed up reads, but must slow down database writes and updates, because each change must also be made again in the index “dictionary”
Must? No. The performance cost might be imperceptible and not material at all. If you have a ton of indexes, sure, but not if you have a reasonable number.
Shallow technical knowledge is fine but you should also have the humility to acknowledge that when you're dispensing said shallow knowledge. It can also lead to pretty bad engineering decisions if you think your shallow knowledge is enough.
For that matter, if you index `(name, email)` and query with `name` and `address` as the predicates, unless there's a better option, or the table is tiny, there's an excellent chance the planner will use that index to narrow down the initial result set to filter.
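A rough way to check this for yourself, sketched with SQLite because it ships with Python (MySQL and Postgres may pick differently depending on statistics and table size). The `people` table, the index name, and the `explain` helper are hypothetical.

```python
import sqlite3


def explain(conn, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, name TEXT, email TEXT, address TEXT)"
)
# The index from the quoted example: "name plus email".
conn.execute("CREATE INDEX idx_name_email ON people (name, email)")

# The query from the quoted example: "name plus address".
partial_plan = explain(
    conn,
    "SELECT * FROM people WHERE name = ? AND address = ?",
    ("alice", "1 Main St"),
)
print(partial_plan)
```

In this setup the plan should show a search on `idx_name_email` constrained by `name` only; the `address` predicate is then applied as a filter on the rows that the index lookup returns, not a full table scan.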
> Not true. If you have single column indexes on both name and email, it could use the two indexes, though not as efficient as a single two-column index. If you query "name plus email" and the index is "name, email, age" then it could use the index.
See, there are databases that implement clever optimizations like this, but those are going to vary widely by database and you would need some domain expertise with that system to know if such optimizations are working. By contrast, this mental model does help you ensure that you can create indexes that are actually helpful in the vast majority of databases.
So I think the author's mental model is working out pretty well for him here, honestly.
MySQL and Postgres both support it; that's a huge percentage of what most devs are ever going to encounter. MSSQL and Oracle may do so as well, but I'm not familiar with those beyond some trivial usage.
More to the point, this lack of knowledge will almost certainly drive people to over-index, which harms performance.
The point is that there's misinformation in the things that he's saying. His level of confidence exceeds the value of the information he is disseminating. If he were the lead engineer of a project, would he make bad decisions because of stuff like this? My guess is yes.
It's definitely an overly broad generalization. But these mental models would still improve how many product engineers work with databases, and all it costs is a very simple explanation.
I think the interesting question is like, if I have X amount of time and mental bandwidth to learn about a technology, what's the most helpful lossy compression of concepts that fits?
> I think the interesting question is like, if I have X amount of time and mental bandwidth to learn about a technology, what's the most helpful lossy compression of concepts that fits?
You cannot possibly know if your condensed version is accurate or sufficient if you can't point to the author of it and definitively state that they knew the original material well enough to summarize it.
I also continue to push back on the idea that [backend] devs shouldn't need to know SQL extremely well, and by extension, their RDBMS vendor's implementation of the spec. You have to know your main programming language to get the job; why should the part of the stack that stores and retrieves all information for your application somehow be lesser? If you don't want to learn it, then you don't get to write queries and design schemas, period. Access them via an API that's been designed by domain experts, otherwise you're putting everyone at risk.
> You cannot possibly know if your condensed version is accurate or sufficient if you can't point to the author of it and definitively state that they knew the original material well enough to summarize it.
I don't understand how this relates to the article. The author is not presenting himself as an expert on database indices, nor is the purpose of the article to educate people on database indices. If anything, he's illustrating techniques for dealing with technology when you're not an expert.
> I also continue to push back on the idea that [backend] devs shouldn't need to know SQL extremely well
The article isn't about any particular technology or type of software engineer, either -- this is just an example. We all have to use technologies that we're not world-class experts in, and part of being a professional engineer is learning ways to deal with that sad reality.
--
Edit to add: I do feel a certain sympathy/resonance for your claim that people should be really competent at the tools/tech they're using, and it's strictly "better" for everyone. But we also live in a world where the complexity and depth of software stacks are increasing rapidly, and developers often have to prioritize breadth over depth. (And yes, I have seen a lot of people shoot themselves in the foot with poorly informed use of databases :/)
His entire article is about how to be a "good engineer", and "In my view, good engineering requires having reliable shallow intuitions about how things work."
I think he's wrong, but more importantly I think he needs to be humble enough to admit that it's not "good" engineering; it's probably bad. You can get by with it if you're at a startup and just need to get stuff done, but don't start teaching people and writing blog posts on a topic you only know shallowly.
I was replying to your statement, not the article. Did I miss that quote in it?
> We all have to use technologies that we're not world-class experts in
I’m not asking people to be experts, just to know how it works. If you write software that communicates over TCP, you should know how TCP works. If you write software that uses a deque, you should know how a deque works. Etc.
Re: the real world, perhaps that’s a good indication that we shouldn’t be unnecessarily increasing complexity, and should use boring technology instead.
> I’m not asking people to be experts, just to know how it works
But if you "know how something works" in detail, such that you fully understand its workings and behavior, you're pretty much an expert. To really know how a database works is a project that takes hundreds of hours of dedicated study, and the deeper you look, the more nuance you find. Otherwise, you'll inevitably make the kinds of flawed generalizations that you dislike about the OP's mental model.
As I say, I have sympathy for your argument. I have spent a lot of time studying databases, I've contributed some patches to Postgres, I like understanding how things really work. But the reality is: full-stack development today is fractally complex. There are MANY components that each might require hundreds of hours to understand, and it's actually not economically valuable for you or your employer to rabbit-hole down each one before you start using it. You need to be able to pick up the key idea of a technology, using the appropriate resources, without fully studying it out.
--
I think that perhaps we understand the article differently. I think you understand it as a tradeoff between "understanding a system a little bit" vs. "understanding deeply," in which case, sure, it's easy to argue that we should all understand technologies deeply. But I think the real tradeoff is for beginners -- "understanding only the apparent outer workings of a system" vs. "having a first-order model of the components that lead to that behavior." Going one level down is the first step to going all the way, and there is a big difference in even going one level down.
> full-stack development today is fractally complex. There are MANY components that each might require hundreds of hours to understand
This is precisely why I maintain that the entire notion of full stack engineering is flawed. It’s absurd to think that one person should be able to meaningfully understand front end, backend, networking, and infra. Even if you abstract away networking and infra (spoiler, you’ve just kicked the can down the road), I’d argue that expecting someone to be good at frontend and backend is ridiculous. Maybe if the industry didn’t have such insane abstractions and frameworks, it would be doable, but that’s not how it is.