Google’s scalable supercomputers for machine learning, Cloud TPU Pods, are now publicly available in beta

Delivering business value

Google Cloud is committed to providing a full spectrum of ML accelerators, including both Cloud GPUs and Cloud TPUs. Cloud TPUs offer highly competitive performance and cost, often training cutting-edge deep learning models faster while delivering significant savings. If your ML team is building complex models and training on large datasets, we recommend that you evaluate Cloud TPUs whenever you require:

  • Shorter time to insights: iterate faster while training large ML models
  • Higher accuracy: train more accurate models using larger datasets (millions of labeled examples; terabytes or petabytes of data)
  • Frequent model updates: retrain a model daily or weekly as new data comes in
  • Rapid prototyping: start quickly with our optimized, open-source reference models in image segmentation, object detection, language processing, and other major application domains

While some custom silicon chips can only perform a single function, TPUs are fully programmable, which means that Cloud TPU Pods can accelerate a wide range of state-of-the-art ML workloads, including many of the most popular deep learning models. For example, a Cloud TPU v3 Pod can train ResNet-50 (image classification) from scratch on the ImageNet dataset in just two minutes or BERT (NLP) in just 76 minutes.

Cloud TPU customers see significant speed-ups in workloads spanning visual product search, financial modeling, energy production, and other areas. In a recent case study, Recursion Pharmaceuticals, which iteratively tests the viability of synthesized molecules to treat rare illnesses, found that a training run that took over 24 hours on its on-prem cluster completed in only 15 minutes on a Cloud TPU Pod.
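To put those two figures in perspective, here is a minimal back-of-the-envelope sketch. It treats "over 24 hours" as exactly 24 hours, so the result is a lower bound on the speed-up rather than a measured number, and the variable names are ours, not Recursion's.

```python
from datetime import timedelta

# Figures quoted above; "over 24 hours" is rounded down to 24 hours,
# so the computed speed-up is a lower bound (an assumption of this sketch).
on_prem_time = timedelta(hours=24)
tpu_pod_time = timedelta(minutes=15)

# Dividing one timedelta by another yields a plain float ratio.
speedup = on_prem_time / tpu_pod_time

# How many back-to-back 15-minute training runs fit into one day.
runs_per_day = timedelta(days=1) / tpu_pod_time

print(f"Speed-up: at least {speedup:.0f}x")
print(f"Training runs per day at 15 minutes each: {runs_per_day:.0f}")
```

In other words, a job that used to allow roughly one iteration per day could, under these assumptions, be repeated dozens of times in the same period.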

What’s in a Cloud TPU Pod

A single Cloud TPU Pod can include more than 1,000 individual TPU chips, which are connected by an ultra-fast, two-dimensional toroidal mesh network, as illustrated below. The TPU software stack uses this mesh network to enable many racks of machines to be programmed as a single, giant ML supercomputer via a variety of flexible, high-level APIs.
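For readers unfamiliar with the term, a two-dimensional toroidal mesh is a grid whose edges wrap around, so every chip has four neighbors and no chip sits on a "border". The sketch below illustrates the idea in plain Python. The 32 x 32 layout and the function names are hypothetical, chosen only because 32 x 32 exceeds 1,000 chips; this is not a description of the actual Pod wiring or of any TPU API.

```python
def torus_neighbors(row, col, rows, cols):
    """Return the four neighbors of (row, col) on a rows x cols 2D torus."""
    return [
        ((row - 1) % rows, col),  # up (wraps from the top row to the bottom row)
        ((row + 1) % rows, col),  # down (wraps from the bottom row to the top row)
        (row, (col - 1) % cols),  # left (wraps from the first column to the last)
        (row, (col + 1) % cols),  # right (wraps from the last column to the first)
    ]


def torus_hops(a, b, rows, cols):
    """Minimum number of link hops between chips a and b on the torus."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # In each dimension, take the shorter way around the ring.
    return min(dr, rows - dr) + min(dc, cols - dc)


# Hypothetical layout: 32 x 32 = 1,024 chips (an assumption for illustration).
ROWS, COLS = 32, 32

corner_neighbors = torus_neighbors(0, 0, ROWS, COLS)
opposite_corner_hops = torus_hops((0, 0), (ROWS - 1, COLS - 1), ROWS, COLS)
max_hops = max(
    torus_hops((0, 0), (r, c), ROWS, COLS)
    for r in range(ROWS)
    for c in range(COLS)
)

print(corner_neighbors)
print(opposite_corner_hops)
print(max_hops)
```

Because of the wrap-around links, the chip at the opposite corner of the grid is only two hops away, and the farthest chip is the one at the center of the grid. Keeping worst-case distances short in this way is one reason torus topologies are popular for tightly coupled workloads.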
