upmind 4 days ago

This is a great idea! This is the code, right? https://github.com/leela-zero/leela-zero

I have two beginner (and probably very dumb) questions. First, why do they use so much C++/CUDA rather than only PyTorch/TensorFlow? Are those too slow for training Leela? Second, why is there TensorFlow code?

henrikf 4 days ago

That's Leela Zero (plays Go instead of chess). It was good for its time (~2018) but it's quite outdated now. It also uses OpenCL instead of CUDA. I wrote a lot of that code, including the Winograd convolution routines.

Leela Chess Zero (https://github.com/LeelaChessZero/lc0) has a much more optimized CUDA backend targeting modern GPU architectures, and it's written by much more knowledgeable people than me. That would be a much better source to learn from.

throwaway81523 4 days ago

As I remember, the CUDA code was about 3x faster than the TensorFlow code. The TensorFlow stuff is there for non-Nvidia GPUs. This was in the era of the GTX 1080 or 2080. No idea about now.

upmind 4 days ago

Ah I see, thanks a lot!