brundolf 7 days ago

> [Clarc] is much faster and also will easily let HN run on multiple cores

This was all running on a single core??

haiku2077 7 days ago

Modern CPUs are crazy fast. 4chan was serving 4 million users with a single server, a ten-year-old version of PHP, and like 10,000 lines of spaghetti code. If you do even basic code quality, profiling, and optimization, you can serve a huge number of users with a fraction of a CPU core.

I/O tends to be the bottleneck (disk IOPS and throughput; network connections, IOPS, and throughput). HN only serves text, so that's mostly an easy problem.

Tabular-Iceberg 6 days ago

I still can't wrap my head around how the industry's conventional wisdom for working around that problem is to add even more slow network I/O dependencies.

EasyMark 6 days ago

Yo dawg, I hear you want to cache your cache of shards.

bakugo 7 days ago

4chan is a special case, because all of its content pages are static HTML files served by nginx and rewritten on the server every time someone makes a post. There's nothing dynamic; everyone is served the exact same page, which makes it much easier to scale.
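
Roughly the shape of that rewrite-on-write pattern, as a hypothetical Node/TypeScript sketch (the real thing is PHP behind nginx; all names and paths here are made up):

    // Hypothetical sketch: writes go through the app, reads never do.
    // nginx serves the flat files; this code only regenerates them.
    import { writeFile, rename } from "node:fs/promises";

    interface Post { author: string; body: string; }
    const threads = new Map<string, Post[]>(); // in-memory source of truth

    async function addPost(threadId: string, post: Post): Promise<void> {
      const posts = threads.get(threadId) ?? [];
      posts.push(post);
      threads.set(threadId, posts);

      // Re-render the whole thread to a static file. Write to a temp
      // file and rename so nginx never sees a half-written page.
      const path = `/var/www/threads/${threadId}.html`;
      await writeFile(path + ".tmp", renderThread(threadId, posts));
      await rename(path + ".tmp", path);
    }

    // Deliberately crude rendering (HTML escaping omitted for brevity).
    function renderThread(id: string, posts: Post[]): string {
      return `<html><body><h1>Thread ${id}</h1>` +
        posts.map(p => `<p><b>${p.author}</b>: ${p.body}</p>`).join("") +
        `</body></html>`;
    }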

donnachangstein 6 days ago

It's not a special case at all. 20 years ago this was standard architecture (hell, HN still caches static versions of pages for logged-out users; the sketch below shows the basic pattern).

No, what changed is that the industry devolved into over-reliance on mountains of 'frameworks' and other garbage that no single person fully understands.

Things have gotten worse, not better.
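
A minimal sketch of that logged-out page cache, assuming a generic Node/TypeScript HTTP handler (this is not HN's actual Arc code; the cookie check and TTL are made up):

    import { createServer } from "node:http";

    // Logged-out visitors all get the same bytes, so cache the rendered
    // page once and serve it straight from memory until it expires.
    const pageCache = new Map<string, { html: string; expires: number }>();
    const TTL_MS = 30_000; // arbitrary; tune to taste

    createServer((req, res) => {
      const url = req.url ?? "/";
      const loggedIn = req.headers.cookie?.includes("user="); // placeholder

      if (!loggedIn) {
        const hit = pageCache.get(url);
        if (hit && hit.expires > Date.now()) {
          res.writeHead(200, { "content-type": "text/html" });
          return res.end(hit.html);
        }
      }

      const html = renderPage(url); // the full dynamic render
      if (!loggedIn) pageCache.set(url, { html, expires: Date.now() + TTL_MS });
      res.writeHead(200, { "content-type": "text/html" });
      res.end(html);
    }).listen(8080);

    function renderPage(url: string): string {
      return `<html><body>rendered ${url} at ${Date.now()}</body></html>`;
    }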

pmdr 6 days ago

The "this won't scale" dogma pushed by cloud providers via frameworks has scared people into believing they need far more resources than they actually do just to display information on the web.

It's really dumbfounding that most devs fell for it even as raw computing power has gotten drastically cheaper.

haiku2077 6 days ago

I was having a conversation with some younger devs about hosting websites for our photography hobbies. One was convinced hosting the photos on your own domain would bankrupt you in bandwidth costs. It's wild.

sgarland 6 days ago

I very much enjoyed the Vercel fanboys posting their enormous bills on Twitter, and then daring people to explain how they could possibly run it on, you know, a server for anything close to the price.

I took the bait once and analyzed a $5000 bill. IIRC, it worked out to about the compute provided by an RPi 4. “OK, but what about when your site explodes in popularity?” “I dunno, take the other $4900 and buy more RPis?”

nssnsjsjsjs 6 days ago

Or get a hundred Hetzner dedis

DrillShopper 6 days ago

Sounds like the real web scale was all of the AWS bills we paid along the way

actuallyalys 6 days ago

Static HTML and caching aren't special cases by any means, but a message board where literally nothing changes between users certainly seems like a special case, even twenty years ago. You don't need that in order to make a site run fast, of course, but that limitation certainly simplifies things.

haiku2077 6 days ago

I worked at a company near the top of https://en.wikipedia.org/wiki/List_of_the_largest_software_c... for a while. It was extremely common that web services only used about 1/20th of a CPU core's timeshare. These were dynamic web services/APIs. (We did have to allocate more CPU than that in practice to improve I/O latency, but that was to let the CPU sit idle so it could react quickly to incoming network traffic.)

This was many years ago on hardware several times slower than the current generation of servers.

agumonkey 6 days ago

There go all your software engineering classes. So bare it's hilarious.

bawolff 6 days ago

I wouldn't call that a special case, just using a good tool for the job.

mschuster91 7 days ago

... which, again, shows just how much power you can get out of a 10-year-old server if you're not being a sucker for the "latest and greatest" resume-driven-development crap.

Just look at New Reddit, it's an insane GraphQL abomination.

sgarland 6 days ago

Every time a dev discovers how tremendously bloated and slow modern software is, an angel gets its wings.

quotemstr 7 days ago

Modern CPUs are stupid fast when you use them the right way. You can take scale-up surprisingly far before being forced to scale out, even when that scale out is something as modest as running on multiple cores.

whalesalad 6 days ago

Most apps aren't suffering from computation. They suffer from I/O.

thatwasunusual 7 days ago

Based on context, you're insinuating that a discussion board like HN _can_ be hard on the CPU alone? If so, how? My guess would _also_ be that the CPU has little to do by itself, and that I/O takes the brunt.

grg0 7 days ago

Negotiating TLS handshakes is one way. But I'd imagine the rest is largely I/O-bound, like you said.

It still puts into perspective what a big pile of dogshit consumer software has become that stuff like this comes as a surprise. Also, the last time I checked, Let's Encrypt ran on a single system, as did the Diablo 2 server (I love reading about these anecdotes).

For every incremental gain in HW performance, there is an order-of-magnitude regression in SW performance.

sgarland 6 days ago

If nothing else, handling interrupts from the NIC to pull packets out of its receive buffer, though that should usually be isolated to a couple of cores.

Also, re: I/O, the CPU usually also has to handle interrupts there, as well as whatever the application might be doing with that I/O.

quotemstr 6 days ago

> If nothing else, handling interrupts from the NIC to pull packets out of its receive buffer,

Interrupts? Interrupts? We don't need no stinking interrupts! https://docs.kernel.org/networking/napi.html#poll

bawolff 6 days ago

Servers can also serve small text files out of memory incredibly fast.
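
For a sense of how little work that is, here's a toy Node/TypeScript server holding its entire "site" in a single Buffer, so each request is little more than a socket write (throughput will vary by hardware, but one core goes a very long way):

    import { createServer } from "node:http";

    // One pre-rendered page held in RAM; no disk, no database, no render.
    const page = Buffer.from("<html><body>hello</body></html>");

    createServer((req, res) => {
      res.writeHead(200, {
        "content-type": "text/html",
        "content-length": page.length,
      });
      res.end(page);
    }).listen(8080);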

kevincox 6 days ago

Yet GitHub can't show more than a dozen comments on the same page, needing you to click "view more" to bring them in 10 at a time.

HN is an island of sanity in a sad world.

simoncion 6 days ago

In fairness, HN wouldn't show more than what, twenty-ish thread roots at a time, requiring you to click "more" to bring in more... which could contain the same set of thread roots you'd been looking at, depending on upvote activity.

(I assume that this update has removed that HN restriction, but haven't bothered to go look to verify this assumption.)

kevincox 6 days ago

The update appears to have come with unlimited or much higher page size. I don't think anyone has found a thread that is still split into multiple pages.

JW_00000 7 days ago

I was going to reply that this is pretty common for web apps, e.g. NodeJS or many Python applications also do not use multi-threading, instead just spawning separate processes that run in parallel. But apparently, HN ran as 1 process on 1 core on 1 machine (https://news.ycombinator.com/item?id=5229548) O_O
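
For reference, the usual multi-process pattern with Node's built-in cluster module looks something like this: one worker per core, each running its own single-threaded event loop (the port and setup here are illustrative):

    import cluster from "node:cluster";
    import { createServer } from "node:http";
    import { cpus } from "node:os";

    if (cluster.isPrimary) {
      // Fork one worker per core; the primary process only supervises.
      for (let i = 0; i < cpus().length; i++) cluster.fork();
      cluster.on("exit", () => cluster.fork()); // replace crashed workers
    } else {
      // Each worker is an ordinary single-threaded server; the workers
      // share the listening port and the OS spreads connections across them.
      createServer((req, res) => {
        res.end(`handled by pid ${process.pid}\n`);
      }).listen(8080);
    }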

mbac32768 5 days ago

HN is not really that much of a workload. Links with text-only comments, each link gets a few hundred comments at most, and commenting closes once a story is old enough.

Probably everything that's current fits easily in RAM, and the older stories are candidates for serving from a static cache.

I wouldn't say this is an astounding technical achievement so much as a demonstration that simplicity can fall out of good taste and resistance to groupthink around "best practices".

galaxyLogic 6 days ago

I think NodeJS apps typically rely on the JavaScript event loop instead of starting new processes all the time.

Spawning new processes for every user is possible but would probably be less scalable than even thread-switching.

jay-barronville 6 days ago

> I think NodeJS apps typically rely on the JavaScript event loop instead of starting new processes all the time.

> Spawning new processes for every user is possible but would probably be less scalable than even thread-switching.

I’d just like to note/clarify that there is, in fact, multi-threading happening under the hood when running Node.js. libuv, the underlying library used for creating and managing the event loops, also creates and maintains thread pools that are used for some concurrent and parallelizable tasks. The fact that JavaScript (V8 in the case of Node.js) and the main event loop are single-threaded doesn’t mean that multi-threading isn’t involved. This is a common source of confusion.
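
You can see that hidden pool directly: the pbkdf2 calls below are kicked off from the one JS thread but run in parallel on libuv worker threads (the pool defaults to 4 threads and can be resized with the UV_THREADPOOL_SIZE environment variable). A small TypeScript sketch:

    import { pbkdf2 } from "node:crypto";

    // Four CPU-heavy hashes start "at once" from a single JS thread.
    // They actually run in parallel on libuv's thread pool, so this
    // finishes in roughly the time of one hash, not four.
    console.time("4 hashes");
    let pending = 4;
    for (let i = 0; i < 4; i++) {
      pbkdf2("secret", "salt", 1_000_000, 64, "sha512", () => {
        if (--pending === 0) console.timeEnd("4 hashes");
      });
    }
    // Meanwhile the event loop stays free for other callbacks.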

watermelon0 6 days ago

NodeJS apps usually use multiple processes, since the JS event loop is limited to a single core. However, this means that you cannot share data and connection pools between them.

xnx 6 days ago

It's amazing what's possible when you don't use microservices

EasyMark 6 days ago

Text-only processing is amazingly fast, as are static websites. Javascript is heavy, man.