Looking for data preparation tools for Tableau? This article presents a list of data preparation tools that work particularly well with Tableau, enable self-service data preparation for users who don't have advanced SQL skills, run both on-premises and in the cloud, and connect to core databases, such as Oracle and SQL Server.
Why data preparation tools for Tableau?
Tableau's data source interface lets us do some basic data preparation tasks, including simple joins, unions, filters, and data type changes. However, if you have tried using Tableau on real data, you have almost certainly reached the point where you needed to do more. Here are some example scenarios to help you justify your requirements for data preparation tools for Tableau.
For example, let's say you are working on an analysis that involves 10 different database tables. All of these tables are huge, with over 20 columns and millions of rows of data. If you are a general Tableau user, without the advanced SQL skills to match the task, you are going to end up with a massive Tableau workbook with 200+ dimensions and metrics. This will load very slowly, and you really won’t get very far with it.
Another scenario: working with derived tables. It isn't possible to create sub-queries using the Tableau data source interface. In cases where you need to join the result of a query to another table, you have to write custom SQL. This limits user control over the structure and granularity of the dataset.
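To make the derived-table case concrete, here is a minimal sketch of the kind of query involved: the result of an aggregating sub-query is joined to another table. It runs against an in-memory SQLite database using Python's built-in sqlite3 module. The `customers` and `orders` tables and their columns are invented for illustration; the SQL you would write for your own database will differ.

```python
import sqlite3

# Hypothetical example data; table and column names are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, 'West');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
""")

# The sub-query (derived table) aggregates orders to one row per customer;
# its result is then joined to the customers table.
DERIVED_TABLE_SQL = """
    SELECT c.customer_id, c.region, t.total_amount, t.order_count
    FROM customers AS c
    JOIN (
        SELECT customer_id,
               SUM(amount) AS total_amount,
               COUNT(*)    AS order_count
        FROM orders
        GROUP BY customer_id
    ) AS t ON t.customer_id = c.customer_id
    ORDER BY c.customer_id
"""

rows = conn.execute(DERIVED_TABLE_SQL).fetchall()
for row in rows:
    print(row)
```

The join changes the granularity of the dataset from one row per order to one row per customer, which is exactly the kind of control that is hard to get from the drag-and-drop data source interface alone.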
There are other reasons to use data preparation tools for Tableau. For example, you may want to bring in a data type not recognized by Tableau, pivot data in a very specific way, work around missing data in your data source, etc. Data in the real world is complicated. It is frequently dirty and inconsistent. Repeating the same cleaning tasks, workbook after workbook, is time-consuming. If you are dealing with data preparation issues, you may need a data preparation tool for Tableau.
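As a rough illustration of the repeated cleaning and pivoting work described above, the sketch below reshapes a small "wide" table (one column per month) into a "long" one, tidies inconsistent labels, and skips missing values. The column names, the sample rows, and the `clean_region` and `unpivot` helpers are all hypothetical; a data preparation tool would typically let you build the same steps visually and reuse them across workbooks.

```python
def clean_region(value):
    """Trim whitespace and normalize case; map blank values to 'Unknown'."""
    text = (value or "").strip()
    return text.title() if text else "Unknown"


def unpivot(rows, id_column, value_columns):
    """Turn one row per region (a column per month) into one row per region and month."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            raw = (row.get(column) or "").strip()
            if raw == "":
                continue  # work around missing data by skipping empty cells
            long_rows.append({
                id_column: clean_region(row[id_column]),
                "month": column,
                "sales": float(raw),
            })
    return long_rows


# Hypothetical dirty input: inconsistent labels, blanks, and missing values.
wide = [
    {"region": " east ", "Jan": "100", "Feb": "120"},
    {"region": "WEST", "Jan": "", "Feb": "80"},
    {"region": "", "Jan": "5", "Feb": None},
]

long_rows = unpivot(wide, "region", ["Jan", "Feb"])
for record in long_rows:
    print(record)
```

Writing this once is easy; the cost comes from repeating it, with small variations, for every workbook that touches the same source.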
A full list of data preparation tools for Tableau
The choice of data preparation tools is huge. If you google "ETL tools", "data preparation tools", "BI tools", "data cleansing tools", and so on, you will find a lot of options. Which one should you choose?
To compile the list, I've used the following criteria:
Self-service data preparation: Don't have advanced SQL skills? No problem. These tools allow you to work with data using a drag-and-drop user interface.
A native integration with Tableau: You don't want to have to find workarounds to bring the data into Tableau once it's ready. The ideal solution is a data source on Tableau Server, Tableau Online or on your desktop.
Verified by Tableau: Tableau has a technical partners program. The program admits tools that complement Tableau and provide functionality that Tableau doesn't offer.
Works behind the firewall: You may be comfortable with loading data into the cloud, but in case you are not, these tools can run on-premises.
It's more than an Excel plug-in: You will be able to connect to core databases, such as Oracle and SQL Server, as well as other data sources.
Without further ado, here is the list of data preparation tools for Tableau, listed alphabetically.
Alteryx is a robust tool and one of the leading BI tools. It provides a user interface that makes even the most complex data preparation possible without a single line of code. In its 2017 reports, Gartner rated Alteryx a Niche Player in BI and a Leader in Data Science and Machine Learning.
Apart from traditional data preparation, it also offers geospatial data preparation and analysis as well as predictive data analysis. If you are looking for a comprehensive data preparation tool for Tableau, one that can also help with geospatial and predictive analytics, Alteryx would be your best bet.
Individual license: starts at $5,195 per user per year
Analytics Canvas allows you to perform data preparation using an intuitive, clear user interface. It provides data preparation for core databases, web data sources, social media, and files. Analytics Canvas can not only create data extracts, but also publish data sources (TDE and Hyper) and workbooks from templates (TWB) directly to Tableau Server and Tableau Online.
Analytics Canvas' strength is web analytics. It makes it easy to connect to all Google Analytics APIs, AdWords, Search Console, and BigQuery, and to create complex data preparation workflows. If you are looking for an inexpensive tool that allows you to combine data from Google Analytics with your internal data to build complex data ingestion workflows, this tool is for you.
ClearStory is an integrated, Apache Spark-based tool. It offers a visual, point-and-click interface with built-in data intelligence, giving you a fast and easy way to discover patterns in data. It provides automated data preparation using semantics.
The strength of ClearStory is the automated pattern discovery in the data preview. It connects to Tableau (Desktop, Server and Online) as a Web Data Connector, so you can dive right in.
CloverETL is a visual, interactive data preparation and automation tool. It is Java-based data integration software for data preparation and automation of data transformations; data cleansing and data quality; data migration; and distribution of data into applications, databases, cloud and data warehouses.
CloverETL comes with its own scripting language for defining complex data transformation rules. You can create your own extensions of CloverETL and custom components. If you are looking for a tool for visual data preparation, but want to keep the option of writing your own code to enable custom applications, CloverETL is a good option.
Monarch by Datawatch is a self-service data preparation tool designed for business users. It connects to all major databases and supports automation of repetitive data preparation tasks.
The strength of Datawatch Monarch is in its ability to extract data from multi-structured sources and web pages. For example, it can extract data tables from lengthy PDF documents, eliminating the need for manual data retyping and saving users valuable time.
Informatica provides self-service data preparation and analytics with an emphasis on data governance, data quality, and security. Informatica was recognized as an Industry Leader in Data Integration by the 2017 Gartner report.
Informatica's advantage is most obvious in the cloud: its cloud solution allows you to access data from any of the numerous supported data sources, combine and clean it, and push it to Tableau. If you are looking for a cloud data preparation tool with strong data governance and data management functionality, Informatica is a good solution.
Individual license: starts at $100 per user per month (limited)
Tableau Prep is Tableau's own new data preparation tool. It recently came out of beta and is designed specifically for Tableau users.
It allows basic data preparation workflows, including connecting to Tableau extracts as a data source, writing output to TDE and Hyper extracts, and opening the results in Tableau. It doesn't yet support all the data sources available in Tableau.
Individual license: $70 per user per month as part of Tableau Creator license
SnapLogic offers a quick, no-code, snap-and-assemble, drag-and-drop way of working. It provides integrations with over 450 data sources and integrated AI to help with repetitive, low-level development tasks. Gartner listed SnapLogic as a Leader in Enterprise Integration Platform as a Service (iPaaS) in 2017.
Data integration is the core of SnapLogic. If you are looking for a solution to provide as many connectors as possible to satisfy current and future data integration needs across the entire organization, this tool is a good choice.
Trifacta provides self-service data preparation with an emphasis on efficiency, by offering machine-intelligence suggestions on data preparation. The user interface of Trifacta lets the user get to work with the data without having to learn how the traditional nodes and connectors work.
Trifacta has a hybrid architecture in which your organization's data, application, and processing engine stay local, whereas the metadata, scripts and functions live in the cloud. This approach has its benefits, but it may not be for everyone.
If you are looking for a tool that allows analysts to quickly get started, while also satisfying the needs of more experienced data scientists and engineers, this tool is a good choice.