Businesses today have a growing demand for data analysis and insight-based action. More often than not, the valuable data driving these actions resides in mission-critical operational systems. SAP is the leading provider of ERP software, and Google Cloud is introducing an integration with SAP to help you unlock the value of SAP data quickly and easily.
Cloud Data Fusion, Google Cloud’s native data integration platform, can now extract data from SAP Business Suite, SAP ERP and S/4HANA. Cloud Data Fusion is a fully managed, cloud-native data integration and ingestion service that helps ETL developers, data engineers and business analysts efficiently build and manage ETL/ELT pipelines. These pipelines accelerate the building of data warehouses, data marts, and data lakes on BigQuery, or of operational reporting systems on Cloud SQL, Spanner or other systems.
To simplify the unlocking of SAP data, today we’re announcing the public launch of the SAP Table Batch Source. With this capability, you can use Cloud Data Fusion to integrate SAP application data and gain insights via Looker. You can also use the best-in-class machine learning products on Google Cloud to gain insight into your business by combining SAP data with other datasets. Examples include predictive maintenance (running machine learning on IoT data joined with ERP transactional data), application-to-application integration between SAP and Cloud SQL based applications, fraud detection, spend analytics, and demand forecasting.
Let’s take a closer look at the benefits of the SAP Table Batch Source in Cloud Data Fusion:
Because Cloud Data Fusion is a complete, visual environment, you can use the Pipeline Studio to quickly design pipelines that read from SAP ECC or S/4HANA. With Data Fusion’s prebuilt transformations, you can join data from SAP and non-SAP systems and perform complex transformations such as data cleansing, aggregations, data preparation, and lookups to rapidly get insights from the data.
Time to Value
In traditional approaches, users must define models on the data warehousing system up front. With Cloud Data Fusion, this happens automatically when you use BigQuery: after you design and execute a data pipeline that writes to BigQuery, Data Fusion automatically generates the schema in BigQuery for you. Because you don’t need to prebuild models, you get insight into your data faster, which improves productivity for your organization.
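To make the idea of automatic schema generation concrete, here is a minimal Python sketch that derives a BigQuery-style schema from one output record. It is only an illustration of the concept, not how Cloud Data Fusion implements it: the function name, the type mapping, and the sample record (including the made-up `open_orders` field) are all assumptions introduced for this example.

```python
import datetime

# Assumed mapping from Python value types to BigQuery column types.
# Illustrative only; not the mapping used by Cloud Data Fusion.
BQ_TYPE_BY_PYTHON_TYPE = {
    bool: "BOOL",
    int: "INT64",
    float: "FLOAT64",
    str: "STRING",
    datetime.date: "DATE",
}


def infer_bigquery_schema(record):
    """Build a list of column definitions from a single sample record."""
    return [
        {
            "name": name,
            # Fall back to STRING for any type the mapping does not cover.
            "type": BQ_TYPE_BY_PYTHON_TYPE.get(type(value), "STRING"),
            "mode": "NULLABLE",
        }
        for name, value in record.items()
    ]


# Hypothetical record; the values are made up for illustration.
sample_record = {
    "KUNNR": "0000001001",
    "NAME1": "Example GmbH",
    "ERDAT": datetime.date(2020, 1, 15),
    "open_orders": 3,
}

schema = infer_bigquery_schema(sample_record)
for column in schema:
    print(column["name"], column["type"], column["mode"])
```

The point of the sketch is that the target schema follows from the pipeline’s output, so nobody has to model the table by hand before loading it.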
Performance and Scalability
Cloud Data Fusion scales horizontally to execute pipelines. You can run pipelines on either ephemeral or dedicated clusters. The SAP Table Batch Source plugin automatically tunes data pipelines for optimal performance when it extracts data from your SAP systems, based on both SAP application server resources and Cloud Data Fusion runtime resources. If parallelism is misconfigured, a failsafe mechanism in the plugin protects your source system from issues.
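The plugin’s tuning logic is not described here, so the following Python sketch only illustrates the general idea of bounding extraction parallelism by the scarcer of two resources. The function and all three parameter names are hypothetical and are not plugin settings.

```python
def effective_splits(requested, sap_work_processes, executor_cores):
    """Pick a degree of parallelism that neither side is overloaded by.

    requested:          parallelism asked for by the user, or None for "automatic"
    sap_work_processes: work processes assumed to be available on the SAP side
    executor_cores:     cores assumed to be available on the pipeline runtime
    """
    # The scarcer resource sets the upper bound.
    ceiling = max(1, min(sap_work_processes, executor_cores))
    if requested is None:
        return ceiling
    # Clamp a misconfigured request into the range [1, ceiling].
    return max(1, min(requested, ceiling))


print(effective_splits(None, sap_work_processes=4, executor_cores=8))
print(effective_splits(16, sap_work_processes=4, executor_cores=8))
print(effective_splits(2, sap_work_processes=4, executor_cores=8))
```

Clamping the request rather than rejecting it is one possible design choice: the pipeline still runs, but an oversized setting cannot exceed what the source system is assumed to handle.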
Transfer full table data from SAP to BigQuery or other systems
In the Pipeline Studio, you can add multiple SAP source tables to a data pipeline and then join them with Joiner transformations. Because the join is executed in the Cloud Data Fusion processing layer, there is no additional impact on the SAP system. For example, to create a Customer Master data mart, you can join all relevant tables from SAP using the plugin, and then build complex pipelines for that data in Cloud Data Fusion’s Pipeline Studio.
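As a rough illustration of what such a join does once the rows have left SAP, the Python sketch below joins two customer master tables on the customer number. In Cloud Data Fusion you would configure this visually with a Joiner transformation rather than write code. The choice of KNA1 (general customer data) and KNB1 (company code data) is an assumption for this example, the rows are made up, and `inner_join` is a hypothetical helper.

```python
from collections import defaultdict


def inner_join(left_rows, right_rows, key):
    """Hash-based inner join of two lists of dicts on a shared key."""
    # Index the right-hand rows by key so each left row is matched quickly.
    right_by_key = defaultdict(list)
    for row in right_rows:
        right_by_key[row[key]].append(row)

    joined = []
    for left in left_rows:
        for right in right_by_key.get(left[key], []):
            # Merge the two rows; the shared key has the same value in both.
            joined.append({**left, **right})
    return joined


# Made-up sample rows standing in for data already extracted from SAP.
kna1 = [
    {"KUNNR": "0000001001", "NAME1": "Example GmbH", "LAND1": "DE"},
    {"KUNNR": "0000001002", "NAME1": "Sample Inc", "LAND1": "US"},
]
knb1 = [
    {"KUNNR": "0000001001", "BUKRS": "1000"},
    {"KUNNR": "0000001001", "BUKRS": "2000"},
]

customer_mart = inner_join(kna1, knb1, "KUNNR")
for row in customer_mart:
    print(row)
```

Because the matching happens on rows that were already extracted, the source system only has to serve the table reads, which is the property the paragraph above describes.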