Published: September 15, 2016 | China
Nvidia has recently unveiled the latest additions to its Pascal™ architecture-based deep learning platform, with new NVIDIA® Tesla® P4 and P40 GPU accelerators and new software that deliver massive leaps in efficiency and speed to accelerate inferencing production workloads for artificial intelligence services.
Modern AI services such as voice-activated assistance, email spam filters, and movie and product recommendation engines are rapidly growing in complexity, requiring up to 10x more compute compared to neural networks from a year ago. Current CPU-based technology isn’t capable of delivering real-time responsiveness required for modern AI services, leading to a poor user experience.
The Tesla P4 and P40 are specifically designed for inferencing, which uses trained deep neural networks to recognize speech, images or text in response to queries from users and devices.
Based on the Pascal architecture, these GPUs feature specialized inference instructions based on 8-bit (INT8) operations, delivering 45x faster response than CPUs1 and a 4x improvement over GPU solutions launched less than a year ago.2 The Tesla P4 delivers the highest energy efficiency for data centers.
It fits in any server with its small form-factor and low-power design, which starts at 50 watts, helping make it 40x more energy efficient than CPUs for inferencing in production workloads.3 A single server with a single Tesla P4 replaces 13 CPU-only servers for video inferencing workloads,4 delivering over 8x savings in total cost of ownership, including server and power costs.
The Tesla P40 delivers maximum throughput for deep learning workloads. With 47 tera-operations per second (TOPS) of inference performance with INT8 instructions, a server with eight Tesla P40 accelerators can replace the performance of more than 140 CPU servers.5 At approximately $5,000 per CPU server, this results in savings of more than $650,000 in server acquisition cost.
Ian Buck, general manager of accelerated computing at NVIDIA, said:
With the Tesla P100 and now Tesla P4 and P40, NVIDIA offers the only end-to-end deep learning platform for the data center, unlocking the enormous power of AI for a broad range of industries. They slash training time from days to hours. They enable insight to be extracted instantly. And they produce real-time responses for consumers from AI-powered services.
Other than these developments, Nvidia has also launched DeepStream SDK, which taps into the power of a Pascal server to simultaneously decode and analyze up to 93 HD video streams in real time compared with seven streams with dual CPUs.6 This addresses one of the grand challenges of AI: understanding video content at-scale for applications such as self-driving cars, interactive robots, filtering and ad placement. Integrating deep learning into video applications allows companies to offer smart, innovative video services that were previously impossible to deliver.
The NVIDIA Tesla P4 and P40 are planned to be available in November and October, respectively, in qualified servers offered by ODM, OEM and channel partners.