From optimizing mobile games to detecting diseases to 3D modeling houses, businesses are constantly finding new, creative uses for machine learning. With such diverse applications, data scientists need a wide range of hardware and software to solve their unique business challenges using AI. Of course, data scientists also want a fast and simple system for getting the job done.
Today, we’re announcing numerous updates to AI Platform that make it faster and more flexible for running machine learning workloads.
Robust new backend to serve models, now with NVIDIA GPUs
AI Platform Prediction allows data scientists to serve models for online predictions in a serverless environment. That way, application developers can access AI without having to understand ML frameworks, and data scientists don’t have to manage the serving infrastructure.
Still, some ML models are so complex that they only run with acceptable latency on machines with many CPUs, or with accelerators like NVIDIA GPUs. This is especially true of models processing unstructured data like images, video, or text.
AI Platform Prediction now lets you choose from a set of Compute Engine machine types to run your model; previously, Online Prediction only offered machine types with one or four vCPUs. You can also add GPUs, like the inference-optimized, low-latency NVIDIA T4. Best of all, you still don’t need to manage the underlying infrastructure: AI Platform handles all of your model’s provisioning, scaling, and serving.
AI Platform Prediction’s new machine types are also easier to monitor and debug. You can now log your prediction requests and responses to BigQuery, where you can analyze them to detect skew and outliers, or decide if it’s time to retrain your model.
This feature uses a new backend built on Google Kubernetes Engine, which let us build a reliable and fast serving system with all the flexibility that machine learning demands.
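To make the idea of checking logged predictions for skew and outliers more concrete, here is a minimal sketch in Python. It assumes you have already exported one numeric feature from your BigQuery request logs into a plain list, alongside the same feature from your training data. The function name `summarize_skew`, the z-score approach, and the threshold of 3.0 are illustrative assumptions, not part of AI Platform.

```python
from statistics import mean, stdev


def summarize_skew(training_values, serving_values, z_threshold=3.0):
    """Compare serving-time values of one feature against a training baseline.

    Hypothetical helper: the inputs are assumed to be lists of numbers
    already pulled out of the logged prediction requests.
    """
    base_mean = mean(training_values)
    base_std = stdev(training_values)  # sample standard deviation
    if base_std == 0:
        raise ValueError("training values have zero variance; z-scores are undefined")

    # How far the serving mean has drifted, in training standard deviations.
    mean_shift = (mean(serving_values) - base_mean) / base_std

    # Individual requests that sit far outside the training distribution.
    outliers = [
        v for v in serving_values
        if abs(v - base_mean) / base_std > z_threshold
    ]
    return {"mean_shift": mean_shift, "outliers": outliers}


# Example with made-up numbers.
report = summarize_skew(
    training_values=[10, 12, 11, 13, 9, 10, 12, 11],
    serving_values=[11, 12, 10, 30],
)
print(report)
```

A large `mean_shift` or a growing list of outliers is one possible signal that the serving data no longer resembles the training data and that retraining may be worth considering. A production check would likely look at many features and use a more robust statistic.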
Wildlife Insights is a great real-world example of how AI Platform Prediction helps solve complex problems in an easier, more user-friendly way. Conservation International, a Washington, D.C.-based organization with the mission “to responsibly and sustainably care for nature, our global biodiversity, for the well-being of humanity,” is one of the partners in the Wildlife Insights collaboration.
“Wildlife Insights will turn millions of wildlife images into critical data points that help us better understand, protect and save wildlife populations around the world,” explains Eric H. Fegraus, Senior Director, Conservation Technology. “Google Cloud’s AI Platform helps us reliably serve machine learning models and easily integrate their predictions with our application. Fast predictions, in a responsive and scalable GPU hardware environment, are critical for our user experience.”
Using these new machine types is simple: set the “machineType” field in your model creation request (in the UI, the API, or gcloud) to a Compute Engine machine type. Here’s an example gcloud command to deploy to a machine with an NVIDIA T4: