Item 43998728

oac • 1 day ago

I might be mistaken, but I think this is partly because of the undirected structure of RBMs, so you can't build a computational graph in the same way as with feed-forward networks.

alimw • 1 day ago

By "undirected structure" I assume you refer to the presence of cycles in the graph? I was taught to call such networks "recurrent" but it seems that that term has evolved to mean something slightly different. Anyway yeah, because of the cycles Gibbs sampling is key to the network's operation. One still employs gradient descent during training, but the procedure to calculate the gradient itself involves Gibbs sampling.

Edit: Actually was talking about the General Boltzmann Machine. For the Restricted Boltzmann Machine an approximation has been assumed which obviates the need for full Gibbs sampling during training. Then (quoting the article, emphasis mine) "after training, it can sample new data from the learned distribution using Gibbs sampling."