What Is the Modern Data Stack?
A radically new approach to data integration saves engineering time, allowing engineers and analysts to pursue higher-value activities.
The modern data stack (MDS) is a suite of tools used for data integration. These tools include, in order of how the data flows:
- a fully managed ELT data pipeline
- a cloud-based columnar warehouse or data lake as a destination
- a data transformation tool
- a business intelligence or data visualization platform.
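The flow through these four components can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the cloud warehouse, and `SOURCE_ORDERS`, `raw_orders` and `customer_revenue` are hypothetical names invented for the example, not part of any specific product.

```python
import sqlite3

# Hypothetical source records; a real pipeline would pull these from an API or database.
SOURCE_ORDERS = [
    {"id": 1, "customer": "acme", "amount_cents": 1250},
    {"id": 2, "customer": "acme", "amount_cents": 3000},
    {"id": 3, "customer": "globex", "amount_cents": 990},
]


def extract_and_load(conn, rows):
    """E and L: copy source rows into a raw table without reshaping them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO raw_orders VALUES (:id, :customer, :amount_cents)",
        rows,
    )


def transform(conn):
    """T: build an analytics-ready table inside the destination using SQL."""
    conn.execute("DROP TABLE IF EXISTS customer_revenue")
    conn.execute(
        "CREATE TABLE customer_revenue AS "
        "SELECT customer, SUM(amount_cents) / 100.0 AS revenue "
        "FROM raw_orders GROUP BY customer"
    )


def report(conn):
    """BI step: the kind of query a dashboard might run on the modeled table."""
    return conn.execute(
        "SELECT customer, revenue FROM customer_revenue ORDER BY revenue DESC"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for a cloud warehouse
extract_and_load(conn, SOURCE_ORDERS)
transform(conn)
revenue_by_customer = report(conn)
print(revenue_by_customer)
```

The point of the ordering is that data lands in the destination in its raw form first, and transformation happens afterward inside the destination, which is what distinguishes ELT from ETL.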
The goal of an MDS is to analyze your business’s data to proactively uncover new areas of opportunity and improve efficiency. As the MDS is a relatively new development, we will answer some common questions about its capabilities.
What separates a modern data stack from a legacy data stack?
The most important difference between a modern data stack and a legacy data stack is that the modern data stack is hosted in the cloud and requires little technical configuration by the user. These characteristics promote end-user accessibility as well as scalability to quickly meet your growing data needs without the costly, lengthy downtime associated with scaling local server instances.
Ultimately, the modern data stack lowers the technical barrier to entry for data integration. The components of the modern data stack are built with analysts and business users in mind, meaning that users of all backgrounds can not only easily use these tools, but also administer them without in-depth technical knowledge.
What are the benefits of a modern data stack?
The modern data stack saves time, money and effort. The low and declining costs of cloud computing and storage continue to increase the cost savings of a modern data stack compared with on-premise solutions. Off-the-shelf connectors save considerable engineering time otherwise spent designing, building and maintaining data connectors, leaving your analysts, data scientists and data engineers free to pursue higher-value analytics and data science projects.
Here’s what customers have said after building a modern data stack with Fivetran:
“Fivetran enabled us to sync our product, finance, customer service and marketing data into the warehouse in under a day — without engineering support. Now our users can focus on uncovering insights instead of data validation and troubleshooting.” – Brendon McKeon, Director of Data & Analytics, MessageYes
“The real benefit of Fivetran is that people can refocus their time and maximize the value of data. We can now build more comprehensive analysis more quickly. Our focus is on data modelling and data analysis versus ETL.” – Gustavo Rada, Head of BI, Exporo
“Fivetran has easily saved us 20 hours a month in human capital that we can instead use to develop strategic insights to help drive the business forward instead of working on data extraction and loading.” – Sean Rober, Head of Analytics at Zenefits
How hard is it to set up?
It’s easy! Because a modern data stack is hosted in the cloud and abstracts away the complications of configuring infrastructure, it can be set up in less than an hour.
What should I look for in each component of the modern data stack?
The modern data stack encompasses a data pipeline, a destination, a transformation layer and a business intelligence/data visualization platform. These are important features for each component:
Data pipeline
Look for a tool that has prebuilt connectors to all of your company’s data sources, can be set up quickly to allow for scaling data integration, and is fully managed to account for API changes or schema changes.
Data destination
Your data destination should easily be able to scale both compute and storage without long downtime to accommodate your data storage and analytics needs. Additional features should be considered on a case-by-case basis, such as how you would set up and provision future role-based access control or run your analytics models. There are several technical criteria for evaluating cloud data warehouses.
Data transformation
Your transformation tool should be compatible with your destination, and have features that make it easy to trace back your data lineage, such as version control and/or documentation that helps outline transformation impact on your tables.
Business intelligence/data visualization
In general, you should consider the technical implementation (such as defining variables for users), visualization flexibility, and user accessibility. Additional criteria, such as whether you want to have end users self-serve on the tool, are dependent on your internal data structure. We have previously written about how to find a good BI platform.
When should I upgrade to the modern data stack?
There are two common use cases for upgrading to the modern data stack. Larger or older companies often have on-premise infrastructure to migrate to the cloud. Startups and small businesses, as they grow, may build a modern, cloud-based data stack to replace small-scale, manual or within-app reporting.
The last thing you want is to fall too far behind the curve. The modern data stack is continually evolving, and will only continue to bring analytics to greater heights.
Data pipeline
Determine how long it takes your engineering resources to build new data connectors and to maintain or upgrade the ones that you have. How does that work affect downstream analytics timelines? How much time and money does it cost? Companies today are moving to real-time or near-real-time data models to monitor their businesses. This often requires a modern, fully managed ELT tool.
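One rough way to put a number on the time-and-money question is a back-of-the-envelope estimate like the one below. Every input here (connector count, hours, hourly rate) is a hypothetical placeholder, and the function name `annual_connector_cost` is invented for this sketch; substitute figures from your own team.

```python
def annual_connector_cost(connectors, build_hours, maintain_hours_per_month, hourly_rate):
    """Estimate first-year cost of building and maintaining connectors in-house.

    All arguments are assumptions supplied by the caller:
      connectors               -- number of data sources to integrate
      build_hours              -- engineering hours to build one connector
      maintain_hours_per_month -- upkeep hours per connector per month
      hourly_rate              -- fully loaded engineering cost per hour
    """
    build_cost = connectors * build_hours * hourly_rate
    maintenance_cost = connectors * maintain_hours_per_month * 12 * hourly_rate
    return build_cost + maintenance_cost


# Hypothetical example: 5 connectors, 80 hours each to build,
# 4 hours per month each to maintain, at 75 per engineering hour.
estimate = annual_connector_cost(
    connectors=5, build_hours=80, maintain_hours_per_month=4, hourly_rate=75
)
print(estimate)  # 48000
```

Comparing this estimate with the price of a managed pipeline, and with the analytics work the same hours could have funded, gives a concrete basis for the upgrade decision.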
Data destination
The need to upgrade a destination is typically tied to compute and storage usage. It’s common to hear of legacy databases frequently crashing from bloated storage or running queries that span a lunch break. When that happens, it’s time to upgrade. 73% of companies state that their cloud migration projects will take over a year, but there is simply no substitute for easy scalability and centralization. It’s better to upgrade sooner rather than later.
Data transformation
In some cases you can use the native features of your destination or data visualization tool to run transformations, such as repeatable SQL scripts. For the sake of scale and greater transparency, you should consider getting a transformation tool that is technically accessible to your team members and has features such as version control to track the history of your transformations.
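To make "repeatable SQL script" concrete, here is a minimal sketch, assuming an in-memory SQLite database as a stand-in destination. The table names (`raw_users`, `daily_signups`, `transformation_history`) and the `run_transformation` helper are hypothetical. The content hash is only a lightweight stand-in for real version control; in practice a team would more likely keep the SQL in a repository or use a dedicated transformation tool.

```python
import hashlib
import sqlite3

# A repeatable script: dropping and rebuilding the table makes reruns safe.
TRANSFORM_SQL = """
DROP TABLE IF EXISTS daily_signups;
CREATE TABLE daily_signups AS
SELECT signup_date, COUNT(*) AS signups
FROM raw_users
GROUP BY signup_date;
"""


def run_transformation(conn, sql):
    """Run the SQL script and record which version of it ran."""
    # Short content hash identifies the exact script text that was executed.
    version = hashlib.sha256(sql.encode("utf-8")).hexdigest()[:12]
    conn.executescript(sql)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transformation_history "
        "(version TEXT, ran_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    conn.execute(
        "INSERT INTO transformation_history (version) VALUES (?)", (version,)
    )
    conn.commit()
    return version


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_users (id INTEGER, signup_date TEXT)")
conn.executemany(
    "INSERT INTO raw_users VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-01"), (3, "2024-01-02")],
)

first = run_transformation(conn, TRANSFORM_SQL)
second = run_transformation(conn, TRANSFORM_SQL)  # rerun yields the same table
rows = conn.execute(
    "SELECT signup_date, signups FROM daily_signups ORDER BY signup_date"
).fetchall()
runs = conn.execute("SELECT COUNT(*) FROM transformation_history").fetchone()[0]
print(rows, runs)
```

Because the script rebuilds its output from the raw table each time, running it twice leaves the same result, and the history table shows which version of the script produced it.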
Business intelligence/data visualization
More and more questions are asked of successful data teams every single day, and business intelligence tools rapidly become the gateway to answers. Consider how stakeholders consume data (such as visualizations, dashboards and reports) and what you will need to make access to data as self-service as possible.