AI revolution shines the spotlight on GPU potential
The division of labor in the semiconductor world used to be simple. CPUs did the lion’s share of the work and got the lion’s share of the glory.
Aside from gamers and a few professionals, hardly anyone thought about specialized processors, including the graphics chips known as GPUs. But times have changed. The artificial intelligence revolution has made GPUs the talk of the tech industry.
Not long ago, the leading GPU manufacturer, NVIDIA, created a parallel-computing platform (CUDA) for running general compute tasks on its processors. Not long after, data scientists realized that GPUs’ parallel architecture and massive bandwidth were perfectly suited to the matrix calculations used to train neural networks for AI. By one estimate, a single GPU can run a training algorithm 40X faster than a CPU.
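To see why GPUs fit this workload, it helps to look at what the workload actually is. A minimal NumPy sketch of one dense-layer forward pass (sizes and names are illustrative, not from any particular framework):

```python
import numpy as np

# One dense layer's forward pass is a matrix multiply plus a bias add.
# Every element of the output can be computed independently, which is why
# this work maps so naturally onto thousands of parallel GPU cores.
rng = np.random.default_rng(0)
batch = rng.standard_normal((64, 784))     # 64 input samples, 784 features each
weights = rng.standard_normal((784, 128))  # layer weights
bias = np.zeros(128)

activations = np.maximum(batch @ weights + bias, 0)  # ReLU activation
print(activations.shape)  # (64, 128)
```

Training repeats this kind of multiply millions of times over large batches, so the hardware that parallelizes it best wins.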
The combination of GPUs, deep learning and huge data sets unleashed a wave of innovation in applications like financial modeling, cutting-edge scientific research and oil and gas exploration. NVIDIA has become the darling of the business world. It’s clear GPUs can do a lot more than push pixels. The question now is, how much more can they do, and how much more will this market grow?
Quite a bit, it would seem. Training is just the last mile in a long AI pipeline. Before that, you first need to select training data from an existing large dataset, potentially filtering, aggregating, and joining billions of records. Then you need to explore it to weed out anomalies and identify predictive correlations, before building your training model. It usually takes several iterations to find the best combination of inputs (or ‘features’ in data science lingo).
As the training progresses, you have to analyze the output—usually by visualizing it—to figure out how well the system is learning. All this data prep and analysis often takes longer than the training itself.
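The prep stages described above—filtering, joining, weeding out anomalies, aggregating into candidate features—can be sketched as a small pandas pipeline. The table, column names, and outlier cut here are invented purely for illustration:

```python
import pandas as pd

# Hypothetical raw records standing in for "billions of records".
events = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 3],
    "amount":  [10.0, -5.0, 20.0, 7.0, 9999.0, 3.0],
})
users = pd.DataFrame({"user_id": [1, 2, 3], "region": ["us", "eu", "us"]})

# 1. Join the records of interest against a reference table.
df = events.merge(users, on="user_id")

# 2. Weed out anomalies (here: a crude range cut drops the 9999.0 outlier).
df = df[df["amount"].between(-100, 1000)]

# 3. Aggregate into candidate features for the training model.
features = df.groupby("user_id").agg(
    total_spend=("amount", "sum"),
    n_events=("amount", "count"),
)
print(features)
```

In practice each of these steps runs over billions of rows and is repeated every time the feature set is revised, which is why the pipeline dominates total wall-clock time.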
All these tasks are essential to AI, and GPUs can perform all of them faster than CPUs. That’s largely because of the sheer memory bandwidth GPUs bring to bear. A server stocked with NVIDIA Pascal GPUs can scan roughly 6 terabytes of data per second across them. That’s more than 60X faster than a typical two-socket CPU server, which generally can’t scan data at more than 100 gigabytes per second.
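The 60X figure follows directly from the two scan rates quoted above:

```python
# Rough bandwidth comparison using the figures quoted in the text.
gpu_scan_rate = 6e12    # ~6 TB/s aggregate, multi-GPU Pascal server
cpu_scan_rate = 100e9   # ~100 GB/s, typical two-socket CPU server

speedup = gpu_scan_rate / cpu_scan_rate
print(speedup)  # 60.0
```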
Nevertheless, most AI setups still replicate the old division of labor, with the training on GPUs, and querying, cleaning and analysis running on CPU-based platforms.
Why? In a word, integration. The CPU world had years to develop tools for exchanging data between applications. AI data pipelines on CPUs may have been relatively slow, but they worked. An equivalent GPU data pipeline didn’t really exist. The only way to connect the different processes and algorithms running on GPUs was to move the data back to the CPU, and possibly even over the network.
Faced with a choice between speed and convenience, data scientists picked the latter.
But there are signs that the status quo is starting to change. An open-source end-to-end AI pipeline for GPUs is starting to take shape (full disclosure: MapD is a cofounder of the project, along with Continuum and H2O). Once it matures, the rationale for leaning so heavily on CPUs for machine learning will become less compelling.
Consider the implications. According to analyst firm IDC, the AI market will reach $12.5B this year, with roughly $2B going to dedicated servers and storage. It further predicts the total market will grow to $46B by 2020. If the share that goes into hardware remains the same, we might predict that the worldwide market for AI-dedicated hardware will reach more than $7B in less than three years, a figure that represents about 12% of Intel’s total revenue ($59.4B) in 2016—hardly chump change, and a figure that’s likely to grow.
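The projection above is a straightforward back-of-the-envelope calculation from IDC’s numbers:

```python
# Back-of-the-envelope projection using the IDC figures quoted in the text.
ai_market_2017 = 12.5e9   # total AI market this year
hardware_2017 = 2.0e9     # portion going to dedicated servers and storage
ai_market_2020 = 46e9     # IDC's projected total market for 2020

hardware_share = hardware_2017 / ai_market_2017      # 16% of the market
hardware_2020 = ai_market_2020 * hardware_share      # ~$7.4B if share holds
intel_revenue_2016 = 59.4e9
share_of_intel = hardware_2020 / intel_revenue_2016  # ~12% of Intel's revenue

print(round(hardware_2020 / 1e9, 2))       # 7.36
print(round(share_of_intel * 100, 1))      # 12.4
```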
Will all those dollars really flow into the GPU market? Probably not at first. Many organizations will want to leverage their existing investments in CPUs. It’s worth noting that some of that new CPU investment will be funneled into alternatives to x86 chips, such as IBM’s Power architecture, which can support increased bandwidth to GPUs via NVLink.
It’s also worth noting that GPUs are just part of the ferment in the semiconductor world. A new breed of even more specialized chips, like Google’s Tensor Processing Unit, could affect the market in unpredictable ways, along with cloud offerings that deliver the new chips’ computing cycles as a service.
But while the details may be hazy, one thing is clear: the unchallenged reign of King CPU is over, and computing is getting a lot more interesting as a result.