That's the size of the largest, most capable, open source models. Specifically Llama 3.1 has 405B parameters. Deepseek's largest model is 671B parameters.
Small corrections. Llama 3.1 is not an Open Source model, but a Llama 3.1 Licensed model. Neither is DeepSeek apparently https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/LIC... which I was of the false opinion that it is. Though I never considered using it, so haven't checked the license before.
You can just ignore the license since the existence of these models is based on piracy at a scale never before seen. Aaron Swartz couldn’t have even imagined violating copyright that hard.
If you live in a glass house, you won’t throw stones. No one in the LLM space wants to be litigious
It’s an open secret that DeepSeek used a ton of OpenAI continuations both in pre training and in the distillation. That totally violates openAI TOS. No one cares.