From the data warehouse: Urs Hölzle explains how data analytics and ML can transform your business


Quentin: Urs, as you’ve expanded infrastructure and capacity to process information at a higher velocity, process data from multiple angles, and think of data as a much more dynamic asset, how do today’s larger quantities of data change the way people work?

Urs: The ecosystem really changed a lot, because previously, you had to do a lot of planning: you had to carefully pick which insight you wanted to go after. Now, a data analyst with a simple SQL query can at least prototype this insight at their own pace, maybe in half a day or a week. They don’t need a software team, they don’t need an analyst, and it’s not actually a software development project anymore. That means that the number of questions you can answer from your data just explodes.

Quentin: So you can have far more projects, you can think in novel ways, you can test at a deeper level.

Urs: Often, you’re going after the right thing, but your initial understanding is actually incorrect. As you go through it iteratively, your understanding of the problem improves. At that point, you’re asking better questions than you asked on day one. And if you can do that every day, and ask a better question every day, then just in a matter of two weeks, you might actually fundamentally change how you think about a particular customer segment, because you have a much deeper understanding of how it behaves.

Quentin: One could see AI and machine learning as a kind of a natural outgrowth of cloud computing, right? Because it’s a fundamentally better way to sort through the data, find patterns, and test things?

Urs: Yes, and in fact we’re starting to see [the worlds of machine learning and cloud infrastructure] merge. Traditionally, when you had data, you wrote the data processing, or maybe you had queries. That was the first step: “I’m just trying to find a data point again.” That was databases. Then came analytics: “Let me actually analyze the data, compute statistics on it.” But it was still relatively manual. Now, ML gives you a more powerful way to look at the data, one that also does well with unstructured data like images, sound, or other data types, where traditional analytics just doesn’t work at all.

[Modern data analytics tools] really make sense and make use of the data you already have. So on BigQuery today, our data warehouse, you actually have [built-in] ML functionality in your data analytics warehouse. It’s a very natural way to say, “Gee, I have this data here, can I actually make a prediction function for things where I don’t have the data?” And the answer is that yes you can, and it’s actually very easy. You can do it in a SQL statement that is roughly 10 lines long, so you don’t even need to understand how machine learning works.
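To make the "roughly 10 lines" of SQL concrete, here is a minimal sketch, not taken from the interview, of what such statements can look like. It uses Python only to assemble the SQL text. The helper functions, the `shop.customers` table, and the column names are all hypothetical. The `CREATE MODEL` and `ML.PREDICT` forms follow BigQuery ML's documented syntax, and linear regression is just one possible model type.

```python
# Sketch only: helper, table, and column names are hypothetical, chosen
# for illustration. The functions return SQL text and do not contact BigQuery.

def build_create_model_sql(model, source_table, label, features):
    """Return a BigQuery ML statement that trains a linear regression model."""
    columns = ",\n    ".join(list(features) + [label])
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        "OPTIONS (\n"
        "  model_type = 'linear_reg',\n"
        f"  input_label_cols = ['{label}']\n"
        ") AS\n"
        "SELECT\n"
        f"    {columns}\n"
        f"FROM `{source_table}`\n"
        # Train only on rows where the value to predict is already known.
        f"WHERE {label} IS NOT NULL"
    )


def build_predict_sql(model, source_table, label, features):
    """Return a query that predicts the label for rows where it is missing."""
    columns = ", ".join(features)
    return (
        "SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT {columns} FROM `{source_table}` WHERE {label} IS NULL))"
    )


if __name__ == "__main__":
    features = ["visits", "tenure_months"]
    print(build_create_model_sql(
        "shop.spend_model", "shop.customers", "annual_spend", features))
    print()
    print(build_predict_sql(
        "shop.spend_model", "shop.customers", "annual_spend", features))
```

The first statement trains on rows where the label is known. The second applies the trained model to rows where it is missing, which is the "prediction function for things where I don’t have the data" described above.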

Quentin: What are some of the most interesting ML problems that customers are bringing to you these days?

Urs: I think the biggest problems that companies have are in two main areas.

First, they believe that ML is the biggest opportunity, but they need to be able to translate that into actual outcomes. So it’s essential that we offer tools in our stack that make it much easier for you to use ML without being an expert. BigQuery can actually do predictions with ML, without you needing to know too much about the underlying techniques. For example, AutoML, our ML [training tools]: you can take your set of images in which you want to recognize objects, and we can automatically construct a machine learning system that recognizes them with very high accuracy. Only a year ago, you needed an expert to do that.

The second problem is really how to deal with the transition to the cloud. Every large user is going to run in a hybrid configuration for a while. Now you have two environments, and they have different rules, so you need to have two different teams and train them differently in order to figure out how these things work together.

Quentin: Doesn’t putting out a cloud management tool like Kubernetes help with coordination?

Urs: Yes, absolutely. That is one of the hardest problems, and our answer to that is Kubernetes and Google Kubernetes Engine (GKE). Now you can use Kubernetes to manage your workloads both on premises and in the cloud, with not just the same code but, equally important, the same configuration.

Integrated machine learning is core to Google’s products, helping businesses turn data into insights and make smarter decisions. Learn more about BigQuery or read about our broader suite of data analytics solutions. If you already use BigQuery and you’re interested in generating ML-based insights, you can read about BigQuery ML.
