"Looks like each container gets its own lightweight Linux VM."
Not a container "as such" then.
How hard is it to emulate linux system calls?
> How hard is it to emulate linux system calls?
It’s doable but a lot more effort. Microsoft did it with WSL1 and abandoned it with WSL2.
Note that they didn't "do it" for WSL1. They started doing it, realized it is far too much work to cover everything, and abandoned the approach in favor of VMs. WSL1 was never a fully functioning Linux emulator on top of Windows; it was still very far from that, even though it could handle many common tasks.
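To make the "far too much work" point concrete, here is a toy sketch of what a syscall translation layer looks like in principle. The syscall numbers are the real Linux x86-64 ones, but the dispatcher, the handler names, and the idea of doing this in Python are all hypothetical and only for illustration; this is not how WSL1 was implemented.

```python
import errno
import os

# Real Linux x86-64 syscall numbers; everything else here is a made-up sketch.
SYS_WRITE = 1
SYS_GETPID = 39
SYS_IO_URING_SETUP = 425


def _sys_write(fd, data):
    # Forward to the host's own write primitive.
    return os.write(fd, data)


def _sys_getpid():
    return os.getpid()


# Only the calls someone has bothered to translate are present.
HANDLERS = {
    SYS_WRITE: _sys_write,
    SYS_GETPID: _sys_getpid,
}


def emulate_syscall(number, *args):
    """Dispatch a Linux syscall number to a host-side handler.

    Returns the handler's result, or -ENOSYS when no handler exists,
    following the kernel's negative-errno convention.
    """
    handler = HANDLERS.get(number)
    if handler is None:
        return -errno.ENOSYS
    return handler(*args)
```

Every entry missing from `HANDLERS` is a program that breaks, and each real handler also has to match Linux semantics (flags, error codes, edge cases), which is where most of the effort goes.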
I've always wondered why only Linux can do 'true' containers without VMs. Is there a good blog post or something I can read about the various technical hurdles?
> I've always wondered why only Linux can do 'true' containers without VMs.
Solaris/illumos has been able to do actual "containers" since 2004[0] and FreeBSD has had jails even before that[1].
[0] https://www.usenix.org/legacy/event/lisa04/tech/full_papers/... [1] https://papers.freebsd.org/2000/phk-jails.files/sane2000-jai...
Many OSes have their own (sometimes multiple) container technologies, but the ecosystem and zeitgeist revolve around OCI Linux containers.
So it's more cultural than technical. I believe you can run OCI Windows containers on Windows with no VM, although I haven't tried this myself.
Hasn't BSD been able to do BSD containers with jails for more than a decade now?
By their nature, containers have no kernel of their own, so a container has to be for the same OS as the host it runs on. Otherwise you need to go the VM route.
In this context (OCI containers) that seems very inaccurate. For instance, ocijail is a two-year-old project still considered experimental.
FreeBSD has beta podman (OCI) support right now, using FreeBSD base images, not Linux ones. It is missing some features but coming along.
Windows can do “true” containers, too. These containers won’t run Linux images, though.
Can it? As far as I understood, Windows containers required Hyper-V, and the images themselves seem to contain an NT kernel.
Not that it helps them run on any Windows version other than the one they were built on, it seems.
Source?
The following piece of documentation disagrees:
https://learn.microsoft.com/en-us/virtualization/windowscont...
> Containers build on top of the host operating system's kernel (...), and contain only apps and some lightweight operating system APIs and services that run in user mode
> You can increase the security by using Hyper-V isolation mode to isolate each container in a lightweight VM
Yes, it is based on the Windows Jobs API.
Additionally, you can decide whether or not the images contain the kernel.
Nothing about OS containers lays down a golden rule for how kernel sharing takes place.
Remember containers predate Linux.
I'm not sure about macOS, but otherwise all major OSes today can run containers natively. However, interest in non-Linux containers is generally very low. You can absolutely run Kubernetes as native Windows binaries [0] in native Windows containers, but why would you?
Note that containers, by definition, rely on the host OS kernel. So a Windows container can only run Windows binaries that interact with Windows syscalls. You can't run Linux binaries in a Windows container any more than you can run them on Windows directly. You can run Word in a Windows container, but not GCC.
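One way to see why a container image is tied to its host kernel: the executables inside it are in a format that only that kernel's loader understands. A minimal sketch that classifies a binary by its leading magic bytes is below. The function name `detect_format` is made up for this example, and fat/universal Mach-O files are deliberately left out because their magic collides with Java class files.

```python
# Leading magic bytes of common executable formats.
MAGICS = [
    (b"\x7fELF", "ELF"),              # Linux, the BSDs, illumos
    (b"MZ", "PE"),                    # Windows (DOS stub header)
    (b"\xfe\xed\xfa\xce", "Mach-O"),  # 32-bit, big-endian on disk
    (b"\xfe\xed\xfa\xcf", "Mach-O"),  # 64-bit, big-endian on disk
    (b"\xce\xfa\xed\xfe", "Mach-O"),  # 32-bit, little-endian on disk
    (b"\xcf\xfa\xed\xfe", "Mach-O"),  # 64-bit, little-endian on disk
]


def detect_format(header: bytes) -> str:
    """Return a coarse executable format name based on the first bytes."""
    for magic, name in MAGICS:
        if header.startswith(magic):
            return name
    return "unknown"


def detect_file(path: str) -> str:
    """Read the first few bytes of a file and classify it."""
    with open(path, "rb") as f:
        return detect_format(f.read(4))
```

Matching formats is necessary but not sufficient: FreeBSD and Linux both use ELF, yet a Linux ELF binary still expects Linux syscalls, which is why FreeBSD needs a separate compatibility layer to run it.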
[0] https://learn.microsoft.com/en-us/virtualization/windowscont...
I wouldn't think there are many use cases for Windows, but I imagine supporting legacy .NET Framework apps would be a major one.
Is there any limitation in running older .NET Framework versions on current Windows? Back when I was using it, you could have multiple versions installed at the same time, I think.
You can, but there are companies that also want to deploy different kinds of Windows software into Kubernetes clusters and so on.
Some examples would be Sitecore XP/XM, SharePoint, Dynamics deployments.
Containers are essentially just a wrapper tool for a Linux kernel feature called cgroups, with some added things such as a layered fs and the distribution method.
You can also just use cgroups with systemd.
Now, you could implement something fairly similar in each OS, but you wouldn't be able to use the vast majority of containerized software, because it's ultimately Linux software.
cgroups is for controlling resource allocation (CPU, RAM, etc). What you mean is probably namespaces.
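On a Linux box you can see the two mechanisms side by side under `/proc`. A small sketch is below; the helper names are made up for this example, and both functions return an empty result on systems without a Linux-style `/proc`.

```python
import os


def list_namespaces(pid="self"):
    """Map namespace type (mnt, pid, net, ...) to its identifier link.

    Reads the symlinks in /proc/<pid>/ns. Two processes share a namespace
    when the link targets match. Returns {} if the directory is unavailable.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result  # not Linux, or no such process
    for name in sorted(names):
        try:
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry vanished or is not readable
    return result


def read_cgroup_membership(pid="self"):
    """Return the lines of /proc/<pid>/cgroup, or [] if unavailable."""
    try:
        with open(f"/proc/{pid}/cgroup") as f:
            return [line.strip() for line in f if line.strip()]
    except OSError:
        return []


if __name__ == "__main__":
    print("namespaces (what the process can see):")
    for kind, ident in list_namespaces().items():
        print(f"  {kind}: {ident}")
    print("cgroups (what the process may consume):")
    for line in read_cgroup_membership():
        print(f"  {line}")
```

Run it on the host and again inside a container: the namespace identifiers should differ for the isolated namespace types, while the cgroup lines show which resource-control group the process was placed in.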
Every OS can theoretically do 'true' containers without VMs - for containers which match the host platform.
You can have Windows containers running on Windows, for instance.
Containers themselves are a packaging format, and do rather little to solve the problem of e.g. running Linux-compiled executables on macOS.
> How hard is it to emulate linux system calls?
FreeBSD has the Linuxulator, and illumos comes with lx zones that allow running some native Linux binaries inside a "container". No idea why Apple didn't go for a similar option.
FreeBSD's Linux emulation has been in development for 20 (maybe even 30) years. Apple could throw some $$$ at it and get it implemented in a couple of years, but using virtualisation requires much less development time (so it’s cheaper).
Apple's already got the Virtualization framework and hypervisor (https://developer.apple.com/documentation/virtualization), so adding the rest of the container ecosystem seems like a natural next step.
It puts them on par with Windows, which has container support with a free option. Plus, I imagine it's a good way to pressure-test Swift as a language, to make sure it really can be the systems programming language they are betting it can and will be.
OrbStack has a great UX and experience, so I imagine this will eat into Docker Desktop on Mac more than OrbStack.
Because that’s a huge investment for something they have no reason or desire to productize.
Syscalls are just a fraction of the surface area. There are many files in many different virtual filesystems you need to implement, plus things like SELinux, eBPF, io_uring, etc. It's also a constantly shifting target. The VM API is much simpler, relatively stable, and already implemented.
Emulating Linux only makes sense on devices with constrained resources.
> How hard is it to emulate linux system calls?
Just replace the XNU kernel with Linux already.