Analyze BigQuery data with Kaggle Kernels notebooks


We’re happy to announce that Kaggle is now integrated with BigQuery, Google Cloud’s enterprise cloud data warehouse. This integration means that BigQuery users can execute super-fast SQL queries, train machine learning models in SQL, and analyze the results using Kernels, Kaggle’s free hosted Jupyter notebooks environment.

Using BigQuery and Kaggle Kernels together, you get an intuitive development environment for querying BigQuery data and doing machine learning without having to move or download the data. Once your Google Cloud account is linked to a Kernels notebook or script, you can compose queries directly in the notebook using the BigQuery API client library, run them against BigQuery, and do almost any kind of analysis on the results. For example, you can import the latest data science libraries, such as Matplotlib, scikit-learn, and XGBoost, to visualize results or train state-of-the-art machine learning models. Even better, you can take advantage of Kernels’ generous free compute, which includes GPUs, up to 16 GB of RAM, and nine hours of execution time. Check out Kaggle’s documentation to learn more about the functionality Kernels offers.
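As a minimal sketch of what a first query cell might look like, the snippet below uses the public `bigquery-public-data.usa_names` dataset purely for illustration; the project ID is a placeholder you would replace with your own linked Google Cloud project:

```python
from google.cloud import bigquery

# On Kaggle, linking your Google Cloud account supplies the credentials;
# the project ID below is a placeholder for your own GCP project.
client = bigquery.Client(project="your-gcp-project-id")

query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 10
"""

# Run the query against BigQuery and pull the results into a pandas DataFrame.
df = client.query(query).to_dataframe()

# From here the full Python ecosystem is available, e.g. a quick bar chart.
df.plot(kind="bar", x="name", y="total", legend=False, title="Most common US names")
```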

With more than 3 million users, Kaggle is where the world’s largest online community of data scientists comes together to explore, analyze, and share their data science work. You can quickly start coding by spinning up a Python or R Kernels notebook, or find inspiration by browsing more than 200,000 public Kernels written by others.

For BigQuery users, the most distinctive benefit is that there is now a widely used integrated development environment (IDE), Kaggle Kernels, that keeps your querying and data analysis in one place. This turns a data analyst’s previously fragmented workflow, in which you would query data in the query editor and then export it elsewhere to complete the analysis, into a single seamless process.

In addition, Kaggle is a sharing platform that makes it easy to publish your Kernels. You can disseminate your open-source work and discuss data science with top data scientists around the world.

Getting started with Kaggle and BigQuery
To get started with BigQuery for the first time, enable your account under the BigQuery sandbox, which provides up to 10 GB of free storage, 1 TB per month of query processing, and 10 GB of BigQuery ML model creation queries. (Find more details on tier pricing in BigQuery’s documentation.)

To start analyzing your BigQuery datasets in Kernels, sign up for a Kaggle account. Once you’re signed in, click on “Kernels” in the top bar, then “New kernel” to immediately spin up a new IDE session. Kaggle offers two types of Kernels: scripts and notebooks. This example uses a notebook.
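Once the notebook is running and your Google Cloud account is linked, you can also train a model in SQL with BigQuery ML directly from a cell. Below is a rough sketch using the public Google Analytics sample data; the project ID and the `my_dataset` dataset are placeholders for resources in your own project:

```python
from google.cloud import bigquery

client = bigquery.Client(project="your-gcp-project-id")  # placeholder project ID

# Train a simple logistic regression model with BigQuery ML.
# `my_dataset` is a placeholder for a dataset you have created in your project.
create_model_sql = """
    CREATE OR REPLACE MODEL `my_dataset.sample_model`
    OPTIONS (model_type = 'logistic_reg') AS
    SELECT
      IF(totals.transactions IS NULL, 0, 1) AS label,
      IFNULL(device.operatingSystem, '') AS os,
      device.isMobile AS is_mobile,
      IFNULL(totals.pageviews, 0) AS pageviews
    FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`
    WHERE _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'
"""
client.query(create_model_sql).result()  # block until training completes

# Evaluate the trained model, again entirely in SQL.
eval_df = client.query(
    "SELECT * FROM ML.EVALUATE(MODEL `my_dataset.sample_model`)"
).to_dataframe()
print(eval_df)
```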
