You've got a long way to go. Writing a rasterizer from scratch is a huge undertaking.
What's the internal color space? I assume it is linear sRGB. It looks like you are going straight to RGBA FP32, which is good. Think about how you will deal with denormals, as the CPU handles those differently from the GPU. Rendering artifacts galore once you do real-world testing.
And of course IsInf and NaN need to be handled everywhere. Just checking for F::ZERO is not enough in many cases, you will need epsilon values. In C++ doing if(value==0.0f){} or if (value==1.0f){} is considered a code smell.
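To make the epsilon point concrete, here is a minimal Rust sketch of a tolerance-based comparison plus a channel sanitizer. The names `approx_eq` and `sanitize_channel`, the tolerance parameters, and the policy of treating NaN as 0 are all assumptions for illustration, not anything taken from the project's source.

```rust
/// Approximate equality using an absolute and a relative tolerance.
/// Hypothetical policy: any NaN or infinite input compares as "not equal".
fn approx_eq(a: f32, b: f32, abs_eps: f32, rel_eps: f32) -> bool {
    if !a.is_finite() || !b.is_finite() {
        return false;
    }
    let diff = (a - b).abs();
    // The absolute tolerance handles values near zero, the relative one larger magnitudes.
    diff <= abs_eps || diff <= rel_eps * a.abs().max(b.abs())
}

/// Force a color channel into [0, 1].
/// Hypothetical policy: NaN becomes 0, +Inf clamps to 1, -Inf clamps to 0.
fn sanitize_channel(v: f32) -> f32 {
    if v.is_nan() {
        0.0
    } else {
        v.clamp(0.0, 1.0)
    }
}
```

Something like `approx_eq(alpha, 0.0, 1e-6, 1e-6)` would then replace an exact `alpha == 0.0` test. The right tolerances depend on the pipeline, so treat the numbers as placeholders.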
Just browsing the source I see Porter Duff blend modes. Really, in 2025? Have fun dealing with alpha compositing issues on this one. Also most of the 'regular' blend modes are not alpha compositing safe, you need special handling of alpha values in many cases if you do not want to get artifacts. The W3C spec is completely underspecified in this regard. I spent many months dealing with this myself.
If I were to redo a rasterizer from scratch I would push boundaries a little more. For instance I would target full FP32 dynamic range support and a better internal color space, maybe something like OKLab to improve color blending and compositing quality. I would also look for innovative ways to use the gained dynamic range.
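For readers who have not met OKLab, here is a rough Rust sketch of converting linear sRGB to OKLab and back, and of mixing two colors there. The matrix constants are the ones from Björn Ottosson's published reference implementation. The function names and the `[f32; 3]` layout are assumptions for illustration, and nothing here says how a real rasterizer should integrate it.

```rust
/// Linear sRGB -> OKLab (constants from Ottosson's reference implementation).
fn linear_srgb_to_oklab(rgb: [f32; 3]) -> [f32; 3] {
    let [r, g, b] = rgb;
    // Linear sRGB to an LMS-like cone response.
    let l = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b;
    let m = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b;
    let s = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b;
    // Cube-root nonlinearity.
    let (l_, m_, s_) = (l.cbrt(), m.cbrt(), s.cbrt());
    [
        0.2104542553 * l_ + 0.7936177850 * m_ - 0.0040720468 * s_,
        1.9779984951 * l_ - 2.4285922050 * m_ + 0.4505937099 * s_,
        0.0259040371 * l_ + 0.7827717662 * m_ - 0.8086757660 * s_,
    ]
}

/// OKLab -> linear sRGB (inverse of the function above).
fn oklab_to_linear_srgb(lab: [f32; 3]) -> [f32; 3] {
    let [big_l, a, b] = lab;
    let l_ = big_l + 0.3963377774 * a + 0.2158037573 * b;
    let m_ = big_l - 0.1055613458 * a - 0.0638541728 * b;
    let s_ = big_l - 0.0894841775 * a - 1.2914855480 * b;
    // Undo the cube root.
    let (l, m, s) = (l_ * l_ * l_, m_ * m_ * m_, s_ * s_ * s_);
    [
        4.0767416621 * l - 3.3077115913 * m + 0.2309699292 * s,
        -1.2684380046 * l + 2.6097574011 * m - 0.3413193965 * s,
        -0.0041960863 * l - 0.7034186147 * m + 1.7076147010 * s,
    ]
}

/// Hypothetical helper: linearly interpolate two linear-sRGB colors in OKLab space.
fn mix_in_oklab(a: [f32; 3], b: [f32; 3], t: f32) -> [f32; 3] {
    let (la, lb) = (linear_srgb_to_oklab(a), linear_srgb_to_oklab(b));
    let mixed = [
        la[0] + (lb[0] - la[0]) * t,
        la[1] + (lb[1] - la[1]) * t,
        la[2] + (lb[2] - la[2]) * t,
    ];
    oklab_to_linear_srgb(mixed)
}
```

The result of `mix_in_oklab` can fall slightly outside [0, 1] for saturated inputs, so a real pipeline would still need gamut handling. The sketch leaves alpha out entirely.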
You didn't mention one of the biggest sources of 2D vector graphic artifacts: mapping polygon coverage to the alpha channel, which is what virtually all engines do, and is the main reason why we at Mazatech are writing a new version of our engine, AmanithVG, based on a simple idea: draw all the paths (polygons) at once. Well, the idea is simple, the implementation... not so much ;)
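A single-pixel Rust sketch of the coverage-to-alpha problem described above: two white shapes share an edge that cuts a pixel in half, so together they cover it completely. This is only an illustration under assumed names (`over`, `composite_separately`, `composite_together`). It is not how AmanithVG or any particular engine resolves coverage.

```rust
/// Source-over for one channel of an opaque color whose alpha is the pixel coverage.
fn over(dst: f32, src: f32, coverage: f32) -> f32 {
    src * coverage + dst * (1.0 - coverage)
}

/// Draw the two shapes one after the other, each with its coverage used as alpha.
fn composite_separately(bg: f32, color: f32, cov_a: f32, cov_b: f32) -> f32 {
    over(over(bg, color, cov_a), color, cov_b)
}

/// Resolve the combined coverage first, then composite once.
/// Summing coverage is only valid here because the shapes are assumed not to overlap.
fn composite_together(bg: f32, color: f32, cov_a: f32, cov_b: f32) -> f32 {
    over(bg, color, (cov_a + cov_b).min(1.0))
}
```

With a black background (0.0), white shapes (1.0) and 0.5 coverage each, `composite_separately` gives 0.75 while `composite_together` gives 1.0. The missing 25% is background leaking through the shared edge, which shows up as a faint seam.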
> Just browsing the source I see Porter Duff blend modes. Really, in 2025? Have fun dealing with alpha compositing issues on this one.
What should we be using in 2025? I thought pre-multiplied alpha is essentially what you go for if you want a chance of alpha compositing ending up correct, but my knowledge is probably outdated.
You absolutely want premult alpha when dealing with multiple transparent layers in graphics.
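For anyone following along, a small Rust sketch of why premultiplied alpha is convenient for layering: the Porter-Duff "over" operator becomes the same formula for color and alpha, and stacking layers is associative. The `Rgba` type and its methods are hypothetical, written only for this example.

```rust
/// An RGBA color in f32. Whether it is premultiplied is tracked by convention only.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rgba {
    r: f32,
    g: f32,
    b: f32,
    a: f32,
}

impl Rgba {
    /// Convert straight (non-premultiplied) alpha to premultiplied alpha.
    fn premultiply(self) -> Rgba {
        Rgba { r: self.r * self.a, g: self.g * self.a, b: self.b * self.a, a: self.a }
    }

    /// Porter-Duff source-over. Both `self` (source) and `dst` must be premultiplied.
    fn over(self, dst: Rgba) -> Rgba {
        let k = 1.0 - self.a;
        Rgba {
            r: self.r + dst.r * k,
            g: self.g + dst.g * k,
            b: self.b + dst.b * k,
            a: self.a + dst.a * k,
        }
    }
}
```

Because the operator is associative, `a.over(b).over(c)` and `a.over(b.over(c))` agree up to rounding, which is what lets a group of transparent layers be flattened ahead of time.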
Right - maybe I'm mistaken but doesn't Porter-Duff compositing encompass premultiplied alpha?
Yes, they introduced the concept in their paper. See this article for an in-depth guide to the topic: https://ciechanow.ski/alpha-compositing/
Great article! I'm going to have to take my time on that one.
But I guess I'm confused why @morio said "Just browsing the source I see Porter Duff blend modes. Really, in 2025?" because I'm not sure what they were expecting to be used in 2025.
It's device sRGB for the time being, but more color spaces are planned.
You are correct that conflation artifacts are a problem and that doing antialiasing in the right color space can improve quality. Long story short, that's future research. There are tradeoffs, one of which is that use of the system compositor is curtailed. Another is that font rendering tends to be weak and spindly compared with doing compositing in a device space.
Yeah, there is an entire science on how to do font rendering properly. Perceptually you should even take into account if you have white text on black background or the other way as this changes the perceived thickness of the text. Slightly hinted SDFs kind of solve that issue and look really good but of course making that work on CPUs is difficult.
What's difficult with font SDFs on the CPU? The bezier paths?
I made myself a CPU SDF library last weekend, primarily for fast shadow textures. It was fun, and I was surprised how well most basic SDFs run with SIMD. Except yeah, Beziers didn't fare well. Fonts seem much harder.
SIMD was easy, I just asked Claude to convert my scalar Nim code to a NEON SIMD version and then to an SSE2 version. Most SDFs and Gaussian shadowing got a 4x speedup on my MacBook M3. It's a bit surprising the author has so much trouble in Rust. Perhaps fp16 issues?
I haven't looked at this recently, but from what I remember, rendering from SDF textures instead of simple alpha textures was 3-4 times slower, including optimizations where fully outside and fully inside areas bypass the per-pixel square root. Of course SIMD is a must, or at least the use of _mm_rsqrt_ss.
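The inside/outside bypass can be sketched in scalar Rust like this. An analytic disc stands in for a glyph SDF, and the function name `disc_coverage` and the one-pixel antialiasing ramp are assumptions for illustration. A real SDF text renderer would sample a texture and use SIMD.

```rust
/// Coverage of a disc of radius `r` centered at (cx, cy), evaluated at the
/// pixel center (px, py), with a one-pixel-wide antialiasing ramp on the edge.
fn disc_coverage(px: f32, py: f32, cx: f32, cy: f32, r: f32) -> f32 {
    let dx = px - cx;
    let dy = py - cy;
    let d2 = dx * dx + dy * dy; // squared distance, no sqrt needed yet

    // Fully inside: compare squared distances and skip the square root.
    let inner = r - 0.5;
    if inner > 0.0 && d2 <= inner * inner {
        return 1.0;
    }
    // Fully outside: same trick.
    let outer = r + 0.5;
    if d2 >= outer * outer {
        return 0.0;
    }
    // Only pixels within half a pixel of the edge pay for the sqrt.
    let signed_dist = d2.sqrt() - r;
    (0.5 - signed_dist).clamp(0.0, 1.0)
}
```

For a large shape most pixels hit one of the two early returns, so the square root only runs on a thin band along the outline.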