How to replicate Heap data to a data warehouse using Stitch

More than a year ago we announced a partnership between Heap and Stitch. Today, let's talk about how you can use Stitch to replicate Heap data and data from the other SaaS tools and local databases into a data warehouse, from which you can create reports to surface insights and improve performance.

A data warehouse consolidates all of your data in a single location — generally a cloud-based platform nowadays. You can extract the data from your SaaS applications and on-premises databases and load it to your data warehouse using an ETL (extract, transform, load) tool such as Stitch. Once the data is available, analysts can use it to create reports.

In this post, we'll walk through the process of connecting Heap to Stitch, setting up replication to a data warehouse, and accessing the data with a business intelligence (BI) tool to create reports.

Three tiers of the data analytics architecture

A data analytics stack comprises three tiers: ETL software, data warehouse, and BI software.

Stitch provides a simple, powerful ETL service for businesses of all sizes. Signup is simple — you can be moving data from one or more sources to a data warehouse in five minutes.

The last few years have seen the emergence of cloud-native data warehouses. Because they run on cloud infrastructure that scales quickly and cost-effectively to meet performance demands, they can handle transformation using the same hardware on which the data warehouse runs.

Finally, to unlock the value of your data, you can connect a BI or data visualization tool to your data warehouse and create reports that analyze data from multiple sources, which you can share via browser-based dashboards.

Setting up a data warehouse

We'll set up our data analytics stack starting with the data warehouse. If you don't already have a data warehouse, choose one that meets your needs. If you choose Amazon Redshift, Google BigQuery, Snowflake, Azure Synapse Analytics, or one of the other destinations Stitch supports, you can follow the setup steps for your data warehouse in the Stitch documentation.

Setting up Stitch for ETL

The next step is setting up an ETL pipeline to move data from your sources to the data warehouse. Stitch makes extracting data from a source and loading it into a data warehouse easy. To get started, visit the signup page, enter your email address, then enter your name and a password.

Add an integration

Next, add Heap as an integration within Stitch. Click on the Heap icon to get started:

Enter a name for the integration. The name will display on the Stitch dashboard for the integration and will be used to create the name in your destination. You must also enter an S3 bucket and AWS account ID – follow the documentation.

The other two settings ask how much historical data you want to replicate to your data warehouse and how often you want to replicate new data.

When you click Save, Stitch displays another screen that lets you choose what tables you want to replicate.

Each time you check a table, Stitch will display all the fields in the table, so you can choose the ones you want.

When you're done with one table, click the name of the integration in the breadcrumb bar to return to the previous screen, where you can click additional tables. When you have all you want, click Finalize Your Selections, and voilà, your integration has been added. All new fields and records of the types you've selected will be replicated to your data warehouse — but first you have to connect the data warehouse you set up to Stitch as a destination.

Add a destination

Suppose you've chosen an Amazon Redshift data warehouse. Clicking on the Redshift icon brings you to a screen where you can enter your credentials:

When you visit your Stitch dashboard, the status for your new integration may show as pending while Stitch schedules the initial replication job. If you refresh the screen after a few minutes the status will change to active.

From the dashboard you can also do things like adding integrations from other data sources. The Stitch documentation walks through the process for each one.

Connecting BI software to your data warehouse

Now you can connect an analytics platform to your data warehouse. If you don't already use BI software, you have dozens of analysis tools to choose from, including popular options such as Tableau, Microsoft Power BI, Google Data Studio, Looker, Chartio, and Mode.

That's all there is to it. Using an ETL tool like Stitch to move data from Heap and other sources into a data warehouse lets you employ BI tools to correlate and report on data from all of your sources.

How to replicate Heap data to a data warehouse using Stitch

How to replicate Heap data to a data warehouse using Stitch

Three tiers of the data analytics architecture

Setting up a data warehouse

Setting up Stitch for ETL

Add an integration

Add a destination

Connecting BI software to your data warehouse

Recommend

The Best Way To Learn AWS Is To Start Using It

How To Find, Work With and Emerge As A New Great Mentor

3 Things You Need To Do When Setting Up Your First AWS Account

3 Valuable Secrets On Teamwork From A Night Out At The Chefs Counter

Adding Free SSL Certificates To Static Websites via AWS Certificate Manager

How to Build Wealth in Your Career over the Years

How to Deploy a Next.Js App to AWS ECS with HashiCorp Waypoint

Now You See Me: How NICE and PDQ plots Uncover Model Behaviors Hidden by Partial...

fre2 Fiber（时间切片+超时队列）

Best Posts on DanylkoWeb for 2020

About Joyk