I think it depends on your learning style. For me, learning from a concrete implementation, with code I can play around with, is a lot easier than trying to study the abstract general concepts first. Once you have some experience with the code, you start asking why things are done a certain way, and that naturally leads to the more general concepts.
It has nothing to do with "learning styles". Parallel computing requires knowledge of three things: a) certain crucial architectural aspects (logical and physical) of the hardware, b) decomposing a problem correctly so that it maps onto that hardware, and c) algorithms, written in a specific language/framework, that combine the first two. CUDA (and other similar frameworks) comes in only at the last step, so knowledge of the first two is a prerequisite.
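To make (b) a bit more concrete, here is a rough sketch of the index math behind a one-element-per-thread decomposition. It is written as plain C++ rather than CUDA so it runs anywhere. The names `blocks_needed` and `add_decomposed` are made up for illustration, and the nested loops only emulate a grid of blocks and threads on the CPU. In an actual CUDA kernel the loops disappear and the index is computed as `blockIdx.x * blockDim.x + threadIdx.x`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: how many blocks are needed to cover n elements.
std::size_t blocks_needed(std::size_t n, std::size_t block_size) {
    return (n + block_size - 1) / block_size;  // ceiling division
}

// CPU emulation of a 1-D grid: each (block, thread) pair owns one element.
// On a GPU the two loops would not exist; every pair would run in parallel.
std::vector<float> add_decomposed(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t block_size) {
    assert(a.size() == b.size());
    assert(block_size > 0);
    const std::size_t n = a.size();
    std::vector<float> out(n);
    const std::size_t grid_size = blocks_needed(n, block_size);
    for (std::size_t block = 0; block < grid_size; ++block) {
        for (std::size_t thread = 0; thread < block_size; ++thread) {
            // Global index, analogous to blockIdx.x * blockDim.x + threadIdx.x.
            const std::size_t i = block * block_size + thread;
            // The last block may be partially filled, so guard the access.
            if (i < n) {
                out[i] = a[i] + b[i];
            }
        }
    }
    return out;
}
```

The ceiling division and the bounds check are the parts that come from the decomposition itself rather than from any particular framework, which is roughly the point: they look the same whichever language ends up expressing them.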