ithkuil 8 days ago

Geometry in higher dimensions is not only hard to imagine, it's straight up weird.

Take a cube in N dimensions and pack N-dimensional spheres inside it, one tucked into each of the 2^N corners. Then fit one more sphere at the center of the cube so that it touches the other spheres but doesn't overlap with any of them.

In 2D and 3D it's easy to visualize: you can see that the sphere in the center is smaller than the other spheres, and of course it's smaller than the cube itself. After all, it's surrounded by the other spheres, which are by construction inside the cube.

In 10 dimensions and above, the inner hypersphere actually pokes outside the hypercube (its diameter is larger than the cube's side), despite being surrounded by hyperspheres that are contained inside the hypercube!

The math behind it is straightforward, but the implication is as counterintuitive as it gets.
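
Here's a minimal sketch of that arithmetic, assuming the usual version of the construction: a cube of side 4 centered at the origin, with unit spheres centered at (±1, ..., ±1). The name `inner_radius` and the constant `HALF_SIDE` are mine, just for illustration. Under those assumptions the inner radius is sqrt(N) - 1, which equals the corner spheres' radius at N = 4 and reaches the faces of the cube at N = 9.

```python
import math

HALF_SIDE = 2.0  # assumed cube is [-2, 2]^n, so each face is 2 away from the center


def inner_radius(n: int) -> float:
    """Radius of the central sphere in n dimensions.

    Assumed setup: unit spheres centered at (+-1, ..., +-1).
    Each of those centers is sqrt(n) away from the origin; subtracting
    the unit radius leaves the radius of the sphere that just touches them.
    """
    return math.sqrt(n) - 1.0


for n in (2, 3, 4, 9, 10, 100):
    r = inner_radius(n)
    if math.isclose(r, HALF_SIDE):
        status = "touches the faces of the cube"
    elif r > HALF_SIDE:
        status = "pokes outside the cube"
    else:
        status = "stays inside the cube"
    print(f"n={n:3d}  inner radius={r:6.3f}  {status}")
```

The only thing that changes with the dimension is the sqrt(n) term, which is the distance from the center of the cube to the centers of the corner spheres.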

godelski 8 days ago

Or how the volume of the unit n-ball goes to 0 as n grows[0,1]

Or how Gaussian balls are like soap bubbles[2]: in high dimensions almost all of the mass sits in a thin shell

The latter is highly relevant to vector embeddings. If your distribution isn't uniform, its mass isn't spread evenly through the space. MEANING that if you linearly interpolate between two points in the space, you are likely to get things that are not representative of your distribution. This happens because it is easy to confuse a straight line with a geodesic[3]. It's like trying to draw a straight line between Los Angeles and Paris: you're going to be going through the dirt most of the time. It looks nothing like cities or even habitable land.

I think the basic math is straightforward, but there's a lot of depth that is straight up ignored in most of our discussions about this stuff. There's a lot of deep math here, and we really need to talk a lot more about the algebraic structures and topologies, and get deep into metric theory and set theory, to push forward in answering these questions. I think this belief that "the math is easy" is holding us back. I like to say "you don't need to know math to train good models, but you do need math to know why your models are wrong." (Obvious reference to "all models are wrong, but some are useful.") Especially in CS we have this tendency to oversimplify things, and it really is just arrogance that doesn't help us.
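
A rough sketch of the interpolation problem, assuming the data is a standard Gaussian (real embeddings won't be exactly that). The helper names `gaussian_vector`, `norm`, and `mean_norms` are made up for this example. Samples end up at a distance of about sqrt(dim) from the origin, while the straight-line midpoint between two samples ends up at about sqrt(dim / 2), which in high dimensions is well inside the bubble where almost no samples live.

```python
import math
import random


def gaussian_vector(dim, rng):
    """One sample from a standard Gaussian in `dim` dimensions."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def mean_norms(dim, trials=200, seed=0):
    """Return (mean norm of samples, mean norm of midpoints between pairs)."""
    rng = random.Random(seed)
    sample_total = 0.0
    midpoint_total = 0.0
    for _ in range(trials):
        a = gaussian_vector(dim, rng)
        b = gaussian_vector(dim, rng)
        # Straight-line (linear) interpolation, halfway between a and b.
        mid = [(x + y) / 2.0 for x, y in zip(a, b)]
        sample_total += (norm(a) + norm(b)) / 2.0
        midpoint_total += norm(mid)
    return sample_total / trials, midpoint_total / trials


for dim in (2, 100, 1000):
    samples, midpoints = mean_norms(dim)
    print(f"dim={dim:5d}  sqrt(dim)={math.sqrt(dim):6.2f}  "
          f"sample norm={samples:6.2f}  midpoint norm={midpoints:6.2f}")
```

In 2D the two numbers are hard to tell apart from noise; the gap only becomes obvious once the shell gets thin relative to its radius.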

[0] https://davidegerosa.com/nsphere/

[1] https://en.wikipedia.org/wiki/Volume_of_an_n-ball

[2] https://www.inference.vc/high-dimensional-gaussian-distribut...

[3] https://en.wikipedia.org/wiki/Geodesic

5tk18 8 days ago

I’m having trouble understanding you. I would be curious to see a representation of this in 2d or 3d. Do you know of any good resources?

godelski 8 days ago

Check out some of my links in my sister post.

For the sphere inside the cube, you can draw it out in 2D and 3D and get some intuition about how the ratio of empty space changes.

One thing you can think about is how, if you pick two random vectors in a high dimension, they are almost certainly nearly orthogonal. In 2D, pick a random vector. There are only 2 directions orthogonal to it, right? Now do the same in 3D. There's a whole plane orthogonal to your vector! That's a hell of a lot more than 2! Move up into 4D and you have a 3D hyperplane that's orthogonal.
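
If you'd rather compute than draw, here's a small sketch of that ratio, assuming a single unit ball inscribed in the cube [-1, 1]^n and the standard volume formula from [1] in my other comment. The function names are just for illustration.

```python
import math


def unit_ball_volume(n: int) -> float:
    """Volume of the unit n-ball: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def inscribed_ball_fraction(n: int) -> float:
    """Fraction of the cube [-1, 1]^n taken up by the inscribed unit ball."""
    return unit_ball_volume(n) / 2.0 ** n


for n in (1, 2, 3, 5, 10, 20):
    print(f"n={n:2d}  ball volume={unit_ball_volume(n):8.4f}  "
          f"fraction of cube={inscribed_ball_fraction(n):.2e}")
```

The fraction starts at pi/4 in 2D and collapses quickly, so almost all of the cube's volume ends up in the corners.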
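
You can check this numerically. A quick sketch, assuming "random vector" means a direction drawn from an isotropic Gaussian (the names `cosine` and `mean_abs_cosine` are hypothetical):

```python
import math
import random


def cosine(u, v):
    """Cosine of the angle between u and v (0 means orthogonal)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_abs_cosine(dim, trials=500, seed=0):
    """Average |cos(angle)| over random pairs of Gaussian vectors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        total += abs(cosine(u, v))
    return total / trials


for dim in (2, 3, 10, 100, 1000):
    print(f"dim={dim:5d}  mean |cos| = {mean_abs_cosine(dim, trials=200):.3f}")
```

The average shrinks roughly like 1/sqrt(dim), so the pairs are never exactly orthogonal, just closer and closer to it.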

The spikes on the hypercube might have some visual intuition. In 2D you have 4 corners and you can imagine smoothly rounding them off into a circle. In 3D we now have 8 corners, and the sphere again looks like a rounded-off cube. You can see from here how the hypercube gets spiky, but whatever you think the hypersphere looks like, you're likely wrong. Even your intuition for the cube fails you with this thinking, so you need to be careful.

ithkuil 7 days ago

There is a good Numberphile video about that: https://www.youtube.com/watch?v=mceaM2_zQd8

It also contains a hilarious reference to the "Parker circle".