Computer processors (CPUs) are designed to handle pretty much any task. However, that generality limits how quickly they can work through large volumes of mathematical calculations, and highly complicated combinations are off the table because they take far too long to process. Graphics cards (GPUs), on the other hand, have become so specialized that they surpass traditional processors when it comes to performing large numbers of calculations at once.
Examples of such workloads include pedestrian detection for autonomous driving, medical imaging, supercomputing and machine learning. This comes as no surprise, because on these workloads GPUs can offer roughly 10 to 100 times more computational power than traditional CPUs. That is one of the main reasons why graphics cards are currently used to power some of the most advanced neural networks in deep learning.
What exactly makes GPUs ideal for deep learning?
The most tech-savvy readers might think that it has to do with parallelism, but you would be wrong, my friend. The real reason is simpler than that and has to do with memory bandwidth. CPUs can fetch small packages of memory quickly, whereas GPUs have high latency, which makes them slower at this type of work. But GPUs are ideal when it comes to fetching very large amounts of memory: the best GPUs can fetch up to 750 GB/s, which is huge compared with the best CPUs, which can handle only up to 50 GB/s of memory bandwidth. But how do we overcome the latency issue?
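Before getting to latency, it helps to put numbers on the trade-off just described. The sketch below is a back-of-the-envelope model, not a benchmark: it assumes a single fetch costs a fixed latency plus the data size divided by bandwidth. The bandwidth figures are the ones quoted above, while the latency values and the names `fetch_time` and `compare` are made up purely for illustration.

```python
# Simple transfer-time model: time = latency + bytes / bandwidth.
# Bandwidth figures come from the text; the latency values are
# illustrative assumptions, not measurements of any real hardware.

CPU_BANDWIDTH = 50e9    # bytes per second (50 GB/s, from the text)
GPU_BANDWIDTH = 750e9   # bytes per second (750 GB/s, from the text)
CPU_LATENCY = 100e-9    # seconds (assumed for illustration)
GPU_LATENCY = 1e-6      # seconds (assumed for illustration)


def fetch_time(num_bytes, latency, bandwidth):
    """Estimated seconds to fetch num_bytes in a single request."""
    return latency + num_bytes / bandwidth


def compare(num_bytes):
    """Return (cpu_seconds, gpu_seconds) for one fetch of num_bytes."""
    cpu = fetch_time(num_bytes, CPU_LATENCY, CPU_BANDWIDTH)
    gpu = fetch_time(num_bytes, GPU_LATENCY, GPU_BANDWIDTH)
    return cpu, gpu


if __name__ == "__main__":
    for label, size in [("64 bytes", 64), ("1 GB", 10**9)]:
        cpu, gpu = compare(size)
        print(f"{label}: CPU {cpu:.3e} s, GPU {gpu:.3e} s")
```

Under these assumed latencies, the tiny fetch is dominated by latency, so the CPU comes out ahead. The 1 GB fetch is dominated by bandwidth, so the GPU finishes about 15 times sooner, which matches the 750 GB/s to 50 GB/s ratio.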
Simple: we use more than one processing unit. Unlike CPUs, GPUs are made up of thousands of cores, so when you solve a task involving large amounts of memory and large matrices, you only have to wait for the initial fetch to take place. Every subsequent fetch is significantly faster, because unloading the data takes so long that the GPU's processing units have to queue up to continue the unloading process. With so much processing power, the latency is effectively masked in order to allow the GPU to handle…