Lower cost and higher performance through data warehouse restructuring and script optimization.
Nuvocargo cut its data expense by 63% and sped up its dashboards by restructuring its Snowflake setup and reducing how often its scripts run.
Background
Nuvocargo is a modern logistics company focusing on trade between the United States and Mexico. The company was our client until January 2023 but re-engaged with us this October to address a specific challenge: optimizing Snowflake cost and speed.
Problem
In September, Nuvocargo's Snowflake usage surged to over 4,000 credits, four times its average monthly usage in previous months. Notably, after January, when we stopped managing the data pipeline, monthly consumption grew sixfold, and data users reported slow dashboard load times.
Solution
Our first step with issues like these is to explore whether resizing or splitting the warehouse could improve performance. The data team at Nuvocargo was using one medium-sized warehouse for multiple services, including dbt, Airtable, Airflow, and Looker Studio. With all of these services sharing a single warehouse, their queries competed for the same compute. The result was queuing, which slowed performance, and continuous warehouse activity, which increased costs.
To tackle this, we divided the medium warehouse into several extra-small warehouses, each dedicated to a specific service. This approach let us monitor resource consumption per service and make adjustments as needed. It also increased speed and reduced costs by minimizing queuing and warehouse run times. Additionally, we set up Resource Monitors to define and enforce a credit quota for each warehouse.
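As a rough illustration of the split described above, the Python sketch below generates the Snowflake SQL for one dedicated extra-small warehouse and one resource monitor per service. The warehouse and monitor names, the credit quotas, the 60-second auto-suspend, and the 80%/100% triggers are all assumptions made for this example, not Nuvocargo's actual configuration.

```python
from typing import List

# Hypothetical service names and monthly credit quotas, for illustration only.
SERVICE_QUOTAS = {"dbt": 100, "airtable": 50, "airflow": 150, "looker_studio": 100}


def statements_for_service(service: str, monthly_credit_quota: int,
                           auto_suspend_seconds: int = 60) -> List[str]:
    """Build the SQL for one dedicated extra-small warehouse and its monitor."""
    warehouse = f"{service.upper()}_WH"
    monitor = f"{service.upper()}_MONITOR"
    return [
        # A small warehouse that suspends quickly when idle and resumes on demand.
        (f"CREATE WAREHOUSE IF NOT EXISTS {warehouse} "
         "WITH WAREHOUSE_SIZE = 'XSMALL' "
         f"AUTO_SUSPEND = {auto_suspend_seconds} "
         "AUTO_RESUME = TRUE INITIALLY_SUSPENDED = TRUE;"),
        # A monthly credit quota: notify at 80%, suspend the warehouse at 100%.
        (f"CREATE OR REPLACE RESOURCE MONITOR {monitor} "
         f"WITH CREDIT_QUOTA = {monthly_credit_quota} "
         "FREQUENCY = MONTHLY START_TIMESTAMP = IMMEDIATELY "
         "TRIGGERS ON 80 PERCENT DO NOTIFY ON 100 PERCENT DO SUSPEND;"),
        # Attach the monitor to the warehouse it should police.
        f"ALTER WAREHOUSE {warehouse} SET RESOURCE_MONITOR = {monitor};",
    ]


if __name__ == "__main__":
    for service, quota in SERVICE_QUOTAS.items():
        for statement in statements_for_service(service, quota):
            print(statement)
```

Generating the statements from one table of services keeps the per-service settings in a single place, so a quota or auto-suspend value can be adjusted after observing each warehouse's real consumption.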
For accounts with Enterprise Edition or higher, Snowflake supports multi-cluster warehouses, which address queuing and performance issues related to concurrency.
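For readers on Enterprise Edition or higher, a minimal sketch of enabling multi-cluster scaling could look like the following. The warehouse name and the cluster counts are hypothetical, and the right limits depend on the observed concurrency.

```python
def multi_cluster_statement(warehouse: str, min_clusters: int = 1,
                            max_clusters: int = 3) -> str:
    """Build an ALTER WAREHOUSE statement that turns on multi-cluster scaling."""
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    # With min < max, Snowflake starts extra clusters when queries begin to
    # queue and shuts them down again as the load drops.
    return (f"ALTER WAREHOUSE {warehouse} SET "
            f"MIN_CLUSTER_COUNT = {min_clusters} "
            f"MAX_CLUSTER_COUNT = {max_clusters} "
            "SCALING_POLICY = 'STANDARD';")


if __name__ == "__main__":
    # LOOKER_STUDIO_WH is an assumed name for a dashboard-facing warehouse.
    print(multi_cluster_statement("LOOKER_STUDIO_WH"))
```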
We also reviewed how often the Airflow scripts ran, looking for schedules we could relax while still meeting stakeholder requirements. Of the 74 DAGs in total, many were running more frequently than necessary. After discussions with the data and business teams, we revised their frequencies and schedules to suit both. Running these scripts less often reduces warehouse active time and query concurrency, which lowers cost and improves speed.
A critical point to note is that Nuvocargo’s data team allows the business team to write Airflow scripts because of the data team’s limited capacity. This practice is not ideal: SQL scripts written by non-data personnel can compromise data pipeline performance in terms of both cost and speed.
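To see why schedule frequency matters, the sketch below estimates the daily credits one DAG consumes on a dedicated extra-small warehouse. It assumes the DAG is the only workload on that warehouse, that an extra-small warehouse bills 1 credit per hour with a 60-second minimum each time it resumes, and that the warehouse stays up for the auto-suspend period after each run. The runtimes and intervals are made-up figures, not measurements from Nuvocargo.

```python
def estimated_daily_credits(interval_minutes: int, runtime_seconds: int,
                            auto_suspend_seconds: int = 60,
                            credits_per_hour: float = 1.0) -> float:
    """Estimate daily credits for one scheduled job on its own warehouse."""
    runs_per_day = (24 * 60) // interval_minutes
    # Each run keeps the warehouse up for the query time plus the idle time
    # before auto-suspend, and is billed for at least 60 seconds.
    billed_seconds = max(60, runtime_seconds + auto_suspend_seconds)
    # If runs overlap the next start, the warehouse never suspends, so the
    # billed time per run cannot exceed the schedule interval.
    billed_seconds = min(billed_seconds, interval_minutes * 60)
    return runs_per_day * billed_seconds / 3600 * credits_per_hour


if __name__ == "__main__":
    # A hypothetical 90-second job, scheduled every 15 minutes vs. hourly.
    print(estimated_daily_credits(interval_minutes=15, runtime_seconds=90))  # 4.0
    print(estimated_daily_credits(interval_minutes=60, runtime_seconds=90))  # 1.0
```

Under these assumptions, moving the job from every 15 minutes to hourly cuts its warehouse time by three quarters. Real savings depend on how many jobs share a warehouse and how their runs overlap.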
Nuvocargo, founded in 2018 by Deepak Chhugani and Sam Blackman, is a pioneering logistics company specializing in U.S.-Mexico cross-border freight services. With its headquarters in New York, the company offers a comprehensive range of services including customs brokerage, freight forwarding, cargo insurance, and supply chain financing. Distinctive for its all-in-one digital platform, Nuvocargo integrates various logistics functions into a seamless, user-friendly solution, catering to the specific needs of businesses engaged in North American trade. This innovative approach, combined with their expertise in facilitating cargo movement, positions Nuvocargo as a significant player in the logistics and freight forwarding industry.
- Optimizing Snowflake in Logistics: NuvoCargo Case Study - November 22, 2023