> [Clarc] is much faster and also will easily let HN run on multiple cores
This was all running on a single core??
Modern CPUs are crazy fast. 4chan was serving 4 million users with a single server, a ten-year-old version of PHP, and like 10,000 lines of spaghetti code. If you pay even basic attention to code quality, profiling, and optimization, you can serve a huge number of users with a fraction of a CPU core.
I/O tends to be the bottleneck (disk IOPS and throughput; network connections, IOPS, and throughput). HN only serves text, so that's mostly an easy problem.
I still can't wrap my head around how the industry's conventional wisdom for working around that problem is to add even more slow network I/O dependencies.
4chan is a special case, because all of its content pages are static HTML files, served by nginx and rewritten on the server every time someone makes a post. There's nothing dynamic, everyone is served the exact same page, which makes it much easier to scale.
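For illustration, here's a minimal sketch of that regenerate-on-write pattern (hypothetical paths, types, and markup, not 4chan's actual code; the point is just that the expensive rendering happens once per post, and nginx serves the resulting file as-is):

```ts
// regenerate-on-write: render the whole thread to a static HTML file
// every time a post is added; nginx then serves the file directly.
// (Hypothetical sketch -- WEB_ROOT, Post and the markup are made up.)
import { writeFile, rename } from "node:fs/promises";

interface Post { author: string; body: string; } // body assumed pre-escaped

const WEB_ROOT = "/var/www/board"; // assumed nginx document root

function renderThread(threadId: number, posts: Post[]): string {
  const items = posts
    .map(p => `<li><b>${p.author}</b>: ${p.body}</li>`)
    .join("\n");
  return `<html><body><h1>Thread #${threadId}</h1><ul>\n${items}\n</ul></body></html>`;
}

export async function addPost(threadId: number, posts: Post[], post: Post) {
  posts.push(post);
  const html = renderThread(threadId, posts);
  // write to a temp file and rename so readers never see a half-written page
  const target = `${WEB_ROOT}/thread/${threadId}.html`;
  await writeFile(`${target}.tmp`, html);
  await rename(`${target}.tmp`, target);
}
```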
It's not a special case at all. 20 years ago this was standard architecture (hell, HN still caches static versions of pages for logged-out users).
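Roughly what that logged-out caching looks like, as a toy sketch (hypothetical, plain node:http with a short-TTL in-memory cache; not HN's actual implementation):

```ts
// serve one cached render to every logged-out visitor, render per-request
// for logged-in users (hypothetical sketch; renderFrontPage and the toy
// cookie check are made up)
import { createServer } from "node:http";

const TTL_MS = 30_000;
let cached: { html: string; at: number } | null = null;

function renderFrontPage(user?: string): string {
  return `<html><body>Hello ${user ?? "guest"}</body></html>`;
}

createServer((req, res) => {
  const user = req.headers.cookie?.match(/user=(\w+)/)?.[1]; // toy session check
  let html: string;
  if (!user) {
    if (!cached || Date.now() - cached.at > TTL_MS) {
      cached = { html: renderFrontPage(), at: Date.now() };
    }
    html = cached.html;           // every logged-out visitor gets the same bytes
  } else {
    html = renderFrontPage(user); // logged-in users get a fresh render
  }
  res.writeHead(200, { "content-type": "text/html" });
  res.end(html);
}).listen(8080);
```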
No, what changed is that the industry devolved into over-reliance on mountains of 'frameworks' and other garbage that no single person fully understands.
Things have gotten worse, not better.
The "this won't scale" dogma pushed by cloud providers via frameworks has actually scared people into believing they really need a lot more resources than they actually do to display information on the web.
It's really dumbfounding that most devs fell for it even as raw computing power has gotten drastically cheaper.
I was having a conversation with some younger devs about hosting websites for our photography hobbies. One was convinced hosting the photos on your own domain would bankrupt you in bandwidth costs. It's wild.
I very much enjoyed the Vercel fanboys posting their enormous bills on Twitter, and then daring people to explain how they could possibly run it on, you know, a server for anything close to the price.
I took the bait once and analyzed a $5000 bill. IIRC, it worked out to about the compute provided by an RPi 4. “OK, but what about when your site explodes in popularity?” “I dunno, take the other $4900 and buy more RPis?”
Sounds like the real web scale was all of the AWS bills we paid along the way
Static HTML and caching aren't special cases by any means, but a message board where literally nothing changes between users certainly seems like a special case, even twenty years ago. You don't need that in order to make a site run fast, of course, but that limitation certainly simplifies things.
I worked at a company near the top of https://en.wikipedia.org/wiki/List_of_the_largest_software_c... for a while. It was extremely common for web services to use only about 1/20th of a CPU core's timeshare. These were dynamic web services/APIs. (We did have to allocate more CPU than that in practice to improve I/O latency, but that was to let the CPU idle so it could quickly react to incoming network traffic.)
This was many years ago on hardware several times slower than the current generation of servers.
... which, again, shows just how much power you can get out of a 10 year old server if you're not being a sucker for the "latest and greatest" resume-driven-development crap.
Just look at New Reddit, it's an insane GraphQL abomination.
Every time a dev discovers how tremendously bloated and slow modern software is, an angel gets its wings.
Modern CPUs are stupid fast when you use them the right way. You can take scale-up surprisingly far before being forced to scale out, even when that scale out is something as modest as running on multiple cores.
Based on context, you are insinuating that a discussion board like HN _can_ be hard on the CPU alone? If so, how? My guess would _also_ be that the CPU would have little to do by itself, and that I/O would take the brunt?
Negotiating TLS handshakes is one way. But I'd imagine the rest is largely IO-bound like you said.
It still puts into perspective what a big pile of dogshit consumer software has become, when stuff like this comes as a surprise. Also, the last time I checked, Let's Encrypt ran on a single system. As did the Diablo 2 server. (I love reading about these anecdotes.)
For every incremental change in HW performance, there is an order-of-magnitude regression in SW performance.
If nothing else, handling interrupts from the NIC to pull packets out of its receive buffer, though that should usually be isolated to a couple of cores.
Also, re: I/O, the CPU usually also has to handle interrupts there, as well as whatever the application might be doing with that I/O.
> If nothing else, handling interrupts from the NIC to pull packets out of its receive buffer,
Interrupts? Interrupts? We don't need no stinking interrupts! https://docs.kernel.org/networking/napi.html#poll
Yet GitHub can't show more than a dozen comments on the same page, requiring you to click "view more" to bring them in 10 at a time.
HN is an island of sanity in a sad world.
In fairness, HN wouldn't show more than what, twenty-ish thread roots at a time, requiring you to click "more" to bring in more... which could contain the same set of thread roots you'd been looking at, depending on upvote activity.
(I assume that this update has removed that HN restriction, but haven't bothered to go look to verify this assumption.)
The update appears to have come with unlimited or much higher page size. I don't think anyone has found a thread that is still split into multiple pages.
I was going to reply that this is pretty common for web apps, e.g. NodeJS or many Python applications also do not use multi-threading, instead just spawning separate processes that run in parallel. But apparently, HN ran as 1 process on 1 core on 1 machine (https://news.ycombinator.com/item?id=5229548) O_O
HN is not really that much of a workload. Links with text-only comments, each link gets a few hundred comments at most, and commenting on stories ends once they are old enough.
Probably everything that's current fits easily in RAM and the older stories are candidates for serving from a static cache.
I wouldn't say this is an astounding technical achievement so much as demonstrating that simplicity can fall out of good taste and resisting groupthink around "best practices".
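To make the RAM point concrete, a toy sketch (hypothetical names, assuming an in-memory map of live items plus a static-file fallback for archived ones):

```ts
// toy item store: live items stay in an in-memory map, old ones are
// rendered once to disk and served as static HTML from then on
// (hypothetical sketch; Item, ARCHIVE_DIR and renderItem are made up)
import { readFile, writeFile } from "node:fs/promises";

interface Item { id: number; title: string; comments: string[]; }

const ARCHIVE_DIR = "/var/hn-archive";
const live = new Map<number, Item>(); // a few hundred comments per story fits easily

function renderItem(item: Item): string {
  return `<h1>${item.title}</h1>` + item.comments.map(c => `<p>${c}</p>`).join("");
}

export async function getItemHtml(id: number): Promise<string> {
  const item = live.get(id);
  if (item) return renderItem(item);                    // hot path: pure in-memory work
  return readFile(`${ARCHIVE_DIR}/${id}.html`, "utf8"); // cold path: static file
}

export async function archiveItem(id: number) {
  const item = live.get(id);
  if (!item) return;
  await writeFile(`${ARCHIVE_DIR}/${id}.html`, renderItem(item));
  live.delete(id); // commenting has closed, so the render never changes again
}
```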
I think NodeJS apps typically rely on the JavaScript event loop instead of starting new processes all the time.
Spawning new processes for every user is possible but would probably be less scalable than even thread-switching.
> I think NodeJS apps typically rely on the JavaScript event loop instead of starting new processes all the time.
> Spawning new processes for every user is possible but would probably be less scalable than even thread-switching.
I’d just like to note/clarify that there is, in fact, multi-threading happening under the hood when running Node.js. libuv, the underlying library used for creating and managing the event loops, also creates and maintains thread pools that are used for some concurrent and parallelizable tasks. The fact that JavaScript (V8 in the case of Node.js) and the main event loop are single-threaded doesn’t mean that multi-threading isn’t involved. This is a common source of confusion.
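One small way to see that pool in action (assuming default settings; crypto.pbkdf2 is one of the calls Node offloads to libuv's thread pool, which defaults to 4 threads unless UV_THREADPOOL_SIZE is set before startup):

```ts
// fire 8 CPU-heavy pbkdf2 jobs; with the default 4-thread libuv pool they
// finish in two batches of roughly 4, even though the JS event loop itself
// never blocks while they run
import { pbkdf2 } from "node:crypto";

const start = Date.now();
for (let i = 0; i < 8; i++) {
  pbkdf2("password", "salt", 200_000, 64, "sha512", () => {
    console.log(`job ${i} done after ${Date.now() - start} ms`);
  });
}
// run with e.g. UV_THREADPOOL_SIZE=8 to watch all eight finish together
```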
NodeJS apps usually use multiple processes, since the JS event loop is limited to a single core. However, this means that you cannot share data and connection pools between them.
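For reference, a minimal sketch of that multi-process setup using Node's built-in cluster module (each worker is a full process with its own memory and its own connection pools, which is exactly the sharing limitation mentioned above):

```ts
// one HTTP worker per core; the primary process only forks and replaces workers
import cluster from "node:cluster";
import { createServer } from "node:http";
import { cpus } from "node:os";

if (cluster.isPrimary) {
  for (let i = 0; i < cpus().length; i++) cluster.fork();
  cluster.on("exit", () => cluster.fork()); // replace crashed workers
} else {
  // each worker has its own heap and would open its own DB connections --
  // nothing in here is shared with the other workers
  createServer((_req, res) => {
    res.end(`handled by worker pid ${process.pid}\n`);
  }).listen(8080); // cluster lets the workers share this listening socket
}
```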
Text-only processing is amazingly fast, as are static websites. JavaScript is heavy, man.