Tableau and PostgreSQL: Two Ways of Working Together

Source: https://blog.panoply.io/tableau-and-postgresql-two-ways-of-working-together

By Moran Gilad Halevi  |  October 30, 2018

The PostgreSQL database serves two roles in the Tableau ecosystem:

  • As the Tableau Server Repository—the repository is the primary source of data for monitoring and optimizing Tableau Server performance and providing a better experience for users. Learn how to obtain data about user operations in Tableau for performance monitoring and detailed analysis of usage.
  • As a data source for analysis—PostgreSQL is one of the world’s most popular data sources, and it is extremely common to find large data sets stored in Postgres format. You can easily analyze these data sets in Tableau. Learn how to connect to PostgreSQL from Tableau, load data, analyze it and keep PostgreSQL data up to date.

PostgreSQL as the Tableau Server Repository

The Tableau Server Repository is the Tableau PostgreSQL DB which contains data about user interactions, jobs and data extracts. See full documentation of the data tables included in the repository.

If you’re only interested in performance monitoring for Tableau, you can directly access the preselected database tables in the repository, which shed light on the volume of Tableau usage across areas of functionality. You can also access the database via Tableau Desktop, using two built-in admin users, and create Custom Admin Views, which let you query and explore the data in the repository.
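As a rough illustration (not official Tableau tooling), the sketch below connects to the repository with Python and psycopg2 and lists its tables, so you can match them against the documentation before writing queries. The port (8060), database name (workgroup), and the readonly user are Tableau's documented defaults, but verify them for your installation; the hostname and password are placeholders.

```python
# A minimal sketch: connect to the Tableau Server repository with psycopg2 and
# list its tables. Assumes repository access has been enabled and uses the
# default database name ("workgroup"), port (8060), and the built-in
# "readonly" user; verify these against your server's configuration.
import psycopg2

conn = psycopg2.connect(
    host="your-tableau-server",   # placeholder hostname
    port=8060,                    # default repository port
    dbname="workgroup",           # default repository database name
    user="readonly",
    password="your-readonly-password",
)

with conn, conn.cursor() as cur:
    # List the repository's tables so you can match them against Tableau's
    # data dictionary before writing your own queries.
    cur.execute(
        """
        SELECT table_name
        FROM information_schema.tables
        WHERE table_schema = 'public'
        ORDER BY table_name
        """
    )
    for (table_name,) in cur.fetchall():
        print(table_name)

conn.close()
```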

Some Tableau administrators need more control than the views provided by the built-in users. There is a way to unlock one of these users and gain full access to all tables in the database, known as the “grant select hack”. The hack is detailed in an unofficial document referenced by the Tableau community; it is not supported by Tableau.

PostgreSQL as Data Source for Analysis

There are several stages involved in extracting data from PostgreSQL and analyzing it in Tableau:

Connecting to PostgreSQL Directly from Tableau

  1. In Tableau Desktop, select Connect > PostgreSQL. Type in the name of the server hosting the PostgreSQL database, the name of the database, and your authentication details (the sketch after this list shows one way to check these details outside Tableau).
  2. Click Sign In.
  3. Create a data source in Tableau with a descriptive name, and select a table, or a specific SQL query, to load into that data source.
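If the connection fails, it can help to sanity-check the same details you would type into Tableau's PostgreSQL connector. The sketch below does this with Python and psycopg2; every value is a placeholder for your environment.

```python
# A minimal sketch for verifying the host, database, and credentials you plan
# to enter in Tableau's PostgreSQL connector. All values are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="your-postgres-host",
    port=5432,                  # PostgreSQL's default port
    dbname="your_database",
    user="your_user",
    password="your_password",
)

with conn.cursor() as cur:
    # If this returns a version string, Tableau should be able to sign in
    # with the same details.
    cur.execute("SELECT version()")
    print(cur.fetchone()[0])

conn.close()
```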

Analyzing PostgreSQL Data in Tableau

Once you have connected to the database and created a data source, you can drag a table from the data source onto your canvas and start your analysis.

Before you begin, however, you must verify that the data is structured the way Tableau requires and clean it up for analysis. According to a Tableau webinar on the subject, “Tableau likes data to be structured with each different field/variable in your data set to be in its own individual column and each unique data point in each row, with all the column headers being in the first row.”
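As a toy illustration of that structure, the sketch below uses pandas to melt a hypothetical "wide" sales table, with one column per year, into the one-observation-per-row layout described above:

```python
# A minimal sketch of reshaping data into the structure the webinar describes:
# one variable per column, one data point per row, headers in the first row.
# The "wide" sales table below is hypothetical example data.
import pandas as pd

wide = pd.DataFrame({
    "region": ["North", "South"],
    "sales_2017": [120, 95],
    "sales_2018": [140, 110],
})

# Melt the per-year columns into a single "year"/"sales" pair of columns,
# the tidy layout Tableau works best with.
tidy = wide.melt(id_vars="region", var_name="year", value_name="sales")
tidy["year"] = tidy["year"].str.replace("sales_", "").astype(int)

print(tidy)
#   region  year  sales
# 0  North  2017    120
# 1  South  2017     95
# 2  North  2018    140
# 3  South  2018    110
```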

In addition, the data may contain a lot of “noise”: inconsistencies or data integrity issues that need to be cleaned up before analysis. If the database is a transactional (OLTP) database, organized around the needs of day-to-day transaction processing, you will need to significantly reorganize the data and strip out data that is irrelevant to your analysis.
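The following sketch shows the kind of cleanup this typically means, using a small hypothetical orders extract with pandas: dropping duplicate rows, discarding operational columns that are irrelevant to the analysis, and normalizing inconsistent labels.

```python
# A minimal sketch of typical pre-analysis cleanup on a hypothetical OLTP
# extract: drop exact duplicates, discard operational columns, and normalize
# inconsistent status labels.
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "status": ["Shipped", "Shipped", "shipped ", "CANCELLED"],
    "internal_batch_ref": ["A1", "A1", "B7", "C3"],   # operational noise
    "amount": [100.0, 100.0, 250.0, 80.0],
})

clean = (
    orders
    .drop_duplicates()                     # remove duplicate transaction rows
    .drop(columns=["internal_batch_ref"])  # drop columns irrelevant to analysis
    .assign(status=lambda df: df["status"].str.strip().str.lower())  # consistent labels
)

print(clean)
```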


Updating PostgreSQL Data in Tableau

Tableau works with data extracts: it loads all the relevant data from the PostgreSQL database, and all analysis is performed on the extract. When the underlying data is updated, you’ll need to refresh the extract, which can have high latency. The bigger the PostgreSQL database, the longer a refresh takes in Tableau; in extreme cases the operation can take minutes or even hours.
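If you prefer to trigger refreshes programmatically rather than waiting for a schedule, one option (not specific to this article) is Tableau's tableauserverclient Python library. The sketch below is a minimal example; the server URL, credentials, and data source name are hypothetical placeholders.

```python
# A minimal sketch of kicking off an extract refresh with tableauserverclient.
# Server URL, credentials, and the data source name are placeholders.
import tableauserverclient as TSC

auth = TSC.TableauAuth("admin_user", "admin_password", site_id="")
server = TSC.Server("https://your-tableau-server", use_server_version=True)

with server.auth.sign_in(auth):
    datasources, _ = server.datasources.get()
    target = next(ds for ds in datasources if ds.name == "PostgreSQL Orders")
    # Starts an asynchronous refresh job; large PostgreSQL extracts can keep
    # this job running for minutes or longer, as noted above.
    job = server.datasources.refresh(target)
    print("refresh job id:", job.id)
```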

In addition, any data preparation or cleaning you have done will be lost when the extract is refreshed. This is one of the reasons why larger organizations move data from production databases to a data warehouse, preparing and treating it along the way, and only then accessing it with BI tools like Tableau.

Loading PostgreSQL Data to an Automated Data Warehouse

A common solution for the above challenges is to load PostgreSQL data to a data warehouse, cleaning and preparing it using an Extract Transform Load (ETL) process. This was traditionally a complex and expensive process which was out of reach for many organizations. However, modern data warehouse technology makes it much easier to achieve.
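For a sense of what such a pipeline involves, here is a deliberately minimal ETL sketch in Python. It assumes pandas, SQLAlchemy, and a hypothetical orders table; the connection URLs and warehouse table name are placeholders, and a production pipeline would add incremental loading, scheduling, and error handling.

```python
# A minimal ETL sketch: extract rows from a hypothetical PostgreSQL "orders"
# table, apply light cleanup, and load the result into a warehouse table.
import pandas as pd
from sqlalchemy import create_engine

# Extract: pull raw rows from the transactional database (placeholder URL).
source = create_engine("postgresql://user:password@postgres-host:5432/appdb")
orders = pd.read_sql("SELECT order_id, status, amount, created_at FROM orders", source)

# Transform: the same kind of cleanup described earlier, done once here
# instead of repeatedly inside Tableau.
orders = orders.drop_duplicates(subset="order_id")
orders["status"] = orders["status"].str.strip().str.lower()

# Load: write the analysis-ready table to the warehouse (placeholder URL).
warehouse = create_engine("postgresql://user:password@warehouse-host:5432/analytics")
orders.to_sql("orders_clean", warehouse, if_exists="replace", index=False)
```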

Panoply is an automated data warehouse that uses machine learning and natural language processing to automatically prepare and clean data as it is loaded into the data warehouse. You connect your PostgreSQL data set to Panoply with one click, automatically prep the data for analysis, and leverage Panoply’s Tableau integration to start analyzing the data in Tableau.

When data is updated, Panoply can repeat the data preparation processes, allowing you to work on fresh data without repeated data prep work and without having to set up a complex ETL process.

Get a free trial of Panoply and see how to effortlessly analyze PostgreSQL data in Tableau.
