Bring 20/20 vision to your pipelines with enhanced monitoring

Easily upgrade Windows Server 2008 R2 while migrating to Google Cloud
February 11, 2020
Now updated: Our Data Engineering Learning Path
February 11, 2020

Stream analytics is bringing data to life in a way that was previously unimaginable, unlocking new use cases, from connected medical devices in healthcare to predictive maintenance on the factory floor. But with new uses come new challenges that, if left unaddressed, can lead to unintended behaviors for end-user applications.

Before the days of modern stream analytics, you could guarantee the reliability of your batch data processing by re-executing your data workflows. Plus, since batch processing latency was a lesser concern, ensuring that your data was delivered within your SLOs was a manageable task.

Stream processing is a different beast, however. Stream analytics shrinks the time horizon between a user event and an application action, which means it is more important than ever to respond quickly to performance degradations in your data pipelines. To that end, Dataflow, Google Cloud’s fully managed batch and stream data processing service, now includes new observability features that will allow you to identify, diagnose, and remediate issues in your pipelines faster than ever. With better observability, you can spend less time fixing problems and more time getting value out of your data.

Introducing Dataflow observability

With this launch, we are introducing new charts into the Dataflow monitoring UI and streamlined workflows with the Cloud Monitoring interface. You will find these charts in the new “Job metrics” tab located at the top of the screen when you navigate to the job details page within Dataflow.

In addition to the data freshness, system latency, and autoscaling graphs that have historically been a part of the Dataflow monitoring experience, you’ll now also see throughput and CPU utilization charts. Throughput charts show how many elements (or bytes) are flowing through your pipeline. The time-series graph contains a line for each step of your pipeline, which can quickly reveal which step or steps could be slowing down the overall processing of your job. Our new time selector tool allows you to drag your cursor over interesting points in the graph to zoom in for higher fidelity.
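
To make the idea of reading per-step throughput concrete, here is a minimal Python sketch of the comparison you might do by eye on the chart. It does not call Dataflow or Cloud Monitoring. The step names, the sample values, and the `largest_throughput_drop` helper are all hypothetical, and the heuristic (the largest relative drop in mean throughput between consecutive steps) is only one possible way to spot a suspect step. A step that legitimately filters or expands elements would also show a change in throughput, so treat the result as a starting point for investigation.

```python
from statistics import mean


def largest_throughput_drop(steps):
    """Find the step whose mean throughput falls most relative to the step before it.

    `steps` is a list of (step_name, samples) pairs in pipeline order, where
    `samples` holds elements-per-second readings for that step.
    Returns (step_name, ratio), where ratio = this step's mean / previous step's mean.
    Returns (None, 1.0) if no step is slower than its predecessor.
    """
    means = [(name, mean(samples)) for name, samples in steps]
    worst_name, worst_ratio = None, 1.0
    # Compare each step with the one immediately upstream of it.
    for (_, previous), (name, current) in zip(means, means[1:]):
        if previous <= 0:
            continue  # avoid dividing by zero when an upstream step is idle
        ratio = current / previous
        if ratio < worst_ratio:
            worst_name, worst_ratio = name, ratio
    return worst_name, worst_ratio


# Hypothetical elements-per-second samples, one list per pipeline step.
example_steps = [
    ("ReadFromPubSub", [1200, 1180, 1210]),
    ("ParseEvents", [1190, 1175, 1205]),
    ("EnrichWithLookup", [640, 610, 655]),
    ("WriteToBigQuery", [638, 612, 650]),
]

suspect, ratio = largest_throughput_drop(example_steps)
print(f"Largest drop at step {suspect!r}: {ratio:.0%} of upstream throughput")
```

In this made-up data, throughput roughly halves at the enrichment step and stays flat afterward, which is the kind of pattern the per-step lines in the throughput chart are meant to surface.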
