Common pitfalls in ML projects and how to avoid them
The start of the story… Your organization has reached maturity in data analytics. A robust & scalable data pipeline was built. A series of dashboards was put to good use by all departments. Self-service analytics was implemented. The CTO decided the time had come to expand data use cases to machine learning and artificial intelligence. The […]
Data Observability with Elementary
1. What is Data Observability? Data observability is the ability to observe data and data-related jobs in your data system. This helps you monitor your data pipelines and understand whether the system is operational, where the data come from and flow to, which transformations are fast, which jobs are resource-consuming, where the data errors […]
Data Deduplication with ML
Problem Statement Briefly, we work with a company that allows its customers to sign up for an account. The company has many branches, and one customer can open one or more accounts at any branch. As a result, duplication happens, so here we are. When signing up, we need this kind of information from […]
RFM model and user segmentation built on Looker
As Data and Business analysts, we have all encountered situations where we need to segment customers based on their engagement with the business. An RFM model has a few benefits. It enables marketers to increase revenue by targeting specific groups of existing customers called “segments”. It also gives the business the ability to send targeted […]
Exploratory Data Analysis (EDA) using SQL and Datagrip
Exploratory Data Analysis (EDA) is something that we do pretty frequently. It is the first step at the beginning of any project, before we jump into more sophisticated work like refactoring or modeling. It’s like saying “Hi” to your lovely dataset so that we can gain confidence in every piece of extracted information […]