Ingesting Data from SAP using DataStudio

ERP or “enterprise resource planning” software helps automate many core business areas, such as procurement, production, materials management, sales, marketing, finance, and human resources. These data sources play a pivotal role in enterprise digital transformation initiatives. SAP is the market leader in ERP software. This whitepaper explains how DeepIQ’s DataStudio enables cross-functional analytics by simplifying the integration of your SAP data with the rest of your data ecosystem.

SAP was one of the first companies to successfully develop and market standard software for business solutions, and it continues to offer industry-leading ERP solutions. Solving complex problems in manufacturing, energy, mining, and other industries requires ERP data.

Combining operational data (SCADA and other time series sources) with ERP data ensures machine learning models are complete and accurate. Traditionally, ingesting ERP data alongside time series, geospatial, unstructured, and structured data has been manual and error-prone.

DeepIQ’s DataStudio has built-in connectors to ingest SAP data into your cloud platform of choice and export data from your cloud platforms into SAP. This whitepaper explains how DeepIQ DataStudio provides multiple ways to integrate your SAP data with the rest of your cloud ecosystem.

DeepIQ is a self-service {Data + AI} Ops app built for the industrial world. DeepIQ simplifies industrial analytics by automating the following three tasks:

  1. Ingesting time series, structured, and geospatial data at scale into your cloud platform;
  2. Implementing sophisticated time series and geospatial data engineering workflows; and
  3. Building state-of-the-art ML models using these datasets.

In addition to reading directly from the underlying database, DeepIQ’s DataStudio supports three modes of integration with SAP: IDocs, BAPIs, and OData services.


IDoc (Intermediate Document) is a standard data structure used in SAP applications to transfer data or information from SAP to other systems and vice versa. Outbound IDocs generated in SAP can be saved as XML files in the cloud or a different storage location.

DataStudio can read these XML files, parse them, and create flattened structures required for persistence in your cloud databases. As IDocs are processed asynchronously, they can be used to read and write an enormous number of transactions without overloading the SAP systems.
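The parse-and-flatten step can be sketched in a few lines of Python. This is a minimal illustration only: the segment and field names below (a MATMAS-style layout with E1MARAM and E1MAKTM segments) are simplified assumptions, and real Material IDocs carry many more segments and control-record fields.

```python
import xml.etree.ElementTree as ET

# Simplified MATMAS-style Material IDoc payload (illustrative only;
# real IDocs have a fuller control record and many more segments).
IDOC_XML = """<MATMAS05>
  <IDOC>
    <EDI_DC40><DOCNUM>0000000000012345</DOCNUM></EDI_DC40>
    <E1MARAM>
      <MATNR>MAT-001</MATNR>
      <MTART>FERT</MTART>
      <E1MAKTM><SPRAS>E</SPRAS><MAKTX>Finished pump assembly</MAKTX></E1MAKTM>
    </E1MARAM>
  </IDOC>
</MATMAS05>"""

def flatten_material_idoc(xml_text):
    """Flatten nested IDoc segments into one record per material."""
    root = ET.fromstring(xml_text)
    records = []
    for idoc in root.findall("IDOC"):
        docnum = idoc.findtext("EDI_DC40/DOCNUM")
        for mat in idoc.findall("E1MARAM"):
            records.append({
                "docnum": docnum,
                "material": mat.findtext("MATNR"),
                "material_type": mat.findtext("MTART"),
                "description": mat.findtext("E1MAKTM/MAKTX"),
            })
    return records

rows = flatten_material_idoc(IDOC_XML)
```

The resulting list of flat records is the kind of tabular structure that can be persisted directly to a cloud database.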

In the example below, we parse Material IDocs and write to the data lake of your choice.

[Figure: parsing Material IDocs and writing to the data lake]

The following is an example of the Simple Object Access Protocol (SOAP) message that is received from SAP.

[Figure: SOAP message received from SAP]

DeepIQ’s DataStudio converts the above SOAP message into an analytics-ready dataset, as shown below.

[Figure: SOAP message converted into an analytics-ready dataset]

The IDoc approach has two advantages: the impact on the SAP server is minimal, and exporting data from SAP is relatively simple.


SAP’s BAPIs (Business Application Programming Interfaces) are methods of SAP business objects. These objects are stored in the Business Object Repository (BOR) of the SAP system and are used to carry out business tasks.

BAPIs have standard business interfaces that enable DeepIQ’s DataStudio to access SAP processes, functions, and data. DataStudio invokes BAPIs synchronously to retrieve data, and SAP exposes these BAPI methods over the standard SOAP protocol.

DataStudio can read SOAP web services exposed by SAP. Its ‘Read SOAP’ component makes the consumption of complex SAP web services straightforward. The following image shows DeepIQ’s DataStudio ingesting SAP data through the BAPI interface.

[Figure: DataStudio ingesting SAP data through the BAPI interface]

In this example, we read Equipment data from SAP and persist to the data lake.
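The core of consuming a BAPI over SOAP is stripping the envelope and flattening the repeated result segments into rows. The sketch below shows that step for an equipment-list response; the element names (EquipmentGetListResponse, EQUIPMENT_LIST, EQUIPMENT, DESCRIPT) are assumptions for illustration, not an exact SAP schema.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Illustrative SOAP response for an equipment-list BAPI. The response
# element and field names are placeholders, not a real SAP WSDL schema.
SAMPLE_RESPONSE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <EquipmentGetListResponse>
      <EQUIPMENT_LIST>
        <item><EQUIPMENT>10001234</EQUIPMENT><DESCRIPT>Feed pump</DESCRIPT></item>
        <item><EQUIPMENT>10001235</EQUIPMENT><DESCRIPT>Compressor</DESCRIPT></item>
      </EQUIPMENT_LIST>
    </EquipmentGetListResponse>
  </soap:Body>
</soap:Envelope>"""

def parse_equipment_response(xml_text):
    """Strip the SOAP envelope and flatten each <item> into a row."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_NS}}}Body")
    rows = []
    for item in body.iter("item"):
        rows.append({
            "equipment": item.findtext("EQUIPMENT"),
            "description": item.findtext("DESCRIPT"),
        })
    return rows

equipment_rows = parse_equipment_response(SAMPLE_RESPONSE)
```

In practice the ‘Read SOAP’ component performs this envelope handling and flattening for you; the sketch just makes the transformation explicit.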

The BAPI method is high-performing, guarantees consistency across SAP versions, and is business object-oriented. BAPI can be configured to retrieve data on a schedule to ensure that the data in DataStudio is current. This method may have performance implications on the SAP server.

OData is a standard web protocol for building and consuming RESTful APIs, and SAP uses it for querying and updating SAP data; in SAP, an OData service is created using a transaction code. DataStudio can read SAP OData web services using the ‘Read Rest’ component.

As SAP OData web services have a custom authentication mechanism, we use DataStudio’s data transformation capabilities to create the HTTP request payloads, parse the JSON responses, and generate the data frame.
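The JSON-parsing step can be sketched as follows. The payload shape below follows the common OData v2 convention of a "d"/"results" wrapper with per-record "__metadata"; the entity and field names (Material, MaterialType) are assumptions for illustration, and a real workflow would first obtain the authenticated response over HTTP.

```python
import json

# Illustrative OData v2-style JSON response for a material entity set.
# Field names are placeholders, not a specific SAP service's schema.
SAMPLE_ODATA_JSON = """{
  "d": {
    "results": [
      {"__metadata": {"type": "Material"}, "Material": "MAT-001", "MaterialType": "FERT"},
      {"__metadata": {"type": "Material"}, "Material": "MAT-002", "MaterialType": "ROH"}
    ]
  }
}"""

def odata_results_to_rows(payload):
    """Drop OData metadata keys and return plain records for a data frame."""
    results = json.loads(payload)["d"]["results"]
    return [
        {k: v for k, v in record.items() if not k.startswith("__")}
        for record in results
    ]

material_rows = odata_results_to_rows(SAMPLE_ODATA_JSON)
```

The flat records produced here are what DataStudio materializes as a data frame for downstream workflow steps.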

The following shows a workflow to ingest and process SAP Material data, including authentication. We can also create or update SAP data in a similar way.

[Figure: workflow ingesting and processing SAP Material data via OData, including authentication]

The OData method is high-performing, stateless, and lightweight. Workflows can be scheduled to ensure timely data in DataStudio while minimizing the impact on the SAP server.


Combining ERP data with your time series, geospatial, unstructured, and structured data will allow you to build robust analytic systems. DeepIQ’s DataStudio provides comprehensive capabilities to augment your digital transformation with SAP data.

For further information, please contact us at