All you need to know about Datastream


Datastream use cases

Datastream captures change streams from Oracle, MySQL, and other sources for destinations such as Cloud Storage, Pub/Sub, BigQuery, and Spanner. Some use cases of Datastream:

  • For analytics, use Datastream with a pre-built Dataflow template to create up-to-date replicated tables in BigQuery in a fully managed way.
  • For database replication, use Datastream with pre-built Dataflow templates to continuously replicate and synchronize database data into Cloud SQL for PostgreSQL or Spanner, powering low-downtime database migrations or hybrid-cloud configurations.
  • For building event-driven architectures, use Datastream to ingest changes from multiple sources into object stores like Google Cloud Storage or, in the future, messaging services such as Pub/Sub or Kafka.
  • For real-time data pipelines, use Datastream to continually stream data from legacy relational data stores (like Oracle and MySQL) into MongoDB.
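
The replication use cases above all rest on the same idea: a stream of change events is applied, in order, to a copy of the source table. Here is a minimal sketch of that idea in plain Python. The event shape (`op`, `row`) and the function name `apply_changes` are hypothetical simplifications chosen for illustration; they are not Datastream's actual output format or API.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_changes(
    replica: Dict[Any, Row],
    events: Iterable[Dict[str, Any]],
    key: str = "id",
) -> Dict[Any, Row]:
    """Apply insert/update/delete events, in order, to an in-memory replica.

    The event format here is an assumption made for this sketch only.
    """
    for event in events:
        op = event["op"]
        row = event["row"]
        pk = row[key]
        if op in ("insert", "update"):
            # Upsert: merge the new values over any existing row.
            replica[pk] = {**replica.get(pk, {}), **row}
        elif op == "delete":
            # Deleting a row that is already absent is treated as a no-op.
            replica.pop(pk, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return replica


events = [
    {"op": "insert", "row": {"id": 1, "status": "new"}},
    {"op": "insert", "row": {"id": 2, "status": "new"}},
    {"op": "update", "row": {"id": 1, "status": "shipped"}},
    {"op": "delete", "row": {"id": 2}},
]
replica = apply_changes({}, events)
print(replica)  # {1: {'id': 1, 'status': 'shipped'}}
```

In a real deployment the pre-built Dataflow templates handle this merge step; the sketch only shows why ordered change events are enough to keep a replica current.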

How do you set up Datastream?

  1. Create a source connection profile.
  2. Create a destination connection profile.
  3. Create a stream using the source and destination connection profiles, and define the objects to pull from the source.
  4. Validate and start the stream.

Once started, a stream continuously transfers data from the source to the destination. You can pause the stream and resume it later.
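
The four setup steps and the pause/resume behavior can be pictured as a small state machine. The following is a sketch only: the class names (`ConnectionProfile`, `Stream`, `StreamState`), the validation rules, and the profile names are all hypothetical and do not correspond to the Datastream API or client library.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class StreamState(Enum):
    NOT_STARTED = "NOT_STARTED"
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"


@dataclass(frozen=True)
class ConnectionProfile:
    """Hypothetical stand-in for a source or destination connection profile."""
    name: str
    kind: str  # for example "mysql", "oracle", or "gcs"


@dataclass
class Stream:
    """Hypothetical model of a stream; not the real Datastream resource."""
    source: ConnectionProfile
    destination: ConnectionProfile
    include_objects: List[str] = field(default_factory=list)
    state: StreamState = StreamState.NOT_STARTED

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the stream may start."""
        problems = []
        if self.source.name == self.destination.name:
            problems.append("source and destination must be different profiles")
        if not self.include_objects:
            problems.append("no source objects selected")
        return problems

    def start(self) -> None:
        if self.state is not StreamState.NOT_STARTED:
            raise RuntimeError(f"cannot start a stream in state {self.state.name}")
        problems = self.validate()
        if problems:
            raise ValueError("; ".join(problems))
        self.state = StreamState.RUNNING

    def pause(self) -> None:
        if self.state is not StreamState.RUNNING:
            raise RuntimeError(f"cannot pause a stream in state {self.state.name}")
        self.state = StreamState.PAUSED

    def resume(self) -> None:
        if self.state is not StreamState.PAUSED:
            raise RuntimeError(f"cannot resume a stream in state {self.state.name}")
        self.state = StreamState.RUNNING


# Steps 1 and 2: create the source and destination connection profiles.
source = ConnectionProfile("mysql-source", "mysql")
destination = ConnectionProfile("gcs-destination", "gcs")

# Step 3: create the stream and define the objects to pull from the source.
stream = Stream(source, destination, include_objects=["shop.orders"])

# Step 4: validate and start the stream.
assert stream.validate() == []
stream.start()
```

Calling `stream.pause()` and then `stream.resume()` moves the sketch between the `PAUSED` and `RUNNING` states, mirroring the behavior described above.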

Connectivity options

To use Datastream to create a stream from the source database to the destination, you must establish connectivity to the source database. Datastream supports the IP allowlist, forward SSH tunnel, and VPC peering network connectivity methods.

Private connectivity configurations enable Datastream to communicate with a data source over a private network (internally within Google Cloud, or with external sources connected over VPN or Interconnect). This communication happens through a Virtual Private Cloud (VPC) peering connection.

For a more in-depth look at Datastream, check out the documentation.
