Why do ML models fail in Production? (2024)

A. Not the right business problem to apply an ML solution to

“Do I need to build an ML model to solve this problem?” This is the most often ignored, yet vital, question to ask before building any ML model. I agree that machine learning is a powerful tool in a data scientist’s toolkit, but it has no magical power to solve all the world’s problems.

Since AI has become so fashionable, many companies are primarily interested in having an AI solution rather than truly understanding the problem they are trying to solve. Jumping straight into building an ML model for any business problem can lead to incorrect and unexplained decisions.

If you decide to build an ML solution in any of the following scenarios, unfortunately, you are setting your model up to fail:

a. The business is in its early stages and your training dataset is just a few hundred rows. A simple EDA (exploratory data analysis) will do a much better job than any ML model.

For example, predicting which customers will churn for a business with a customer base of 100 does not require building a classification model. Simple descriptive statistics can answer this question quickly. Building a classifier on a set of 100 rows may lead to overfitting and yield inaccurate results on production data.

b. Business concepts are changing every day. An ML model requires some pattern or trend in the data to have acceptable predictive power. In the absence of good training data or an established business process, you will need to change the model every day, which is hard to scale.
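To make point (a) concrete, here is a minimal sketch of what "simple descriptive statistics" could look like for the 100-customer churn example. The data, the `plan` segments, and the `churn_by_segment` helper are all hypothetical and invented for illustration; they do not come from a real dataset.

```python
from collections import Counter

# Hypothetical toy data: one (plan, churned) pair per customer, 100 in total.
# The segment names and counts are made up for illustration only.
customers = (
    [("basic", True)] * 18 + [("basic", False)] * 42
    + [("premium", True)] * 4 + [("premium", False)] * 36
)


def churn_by_segment(rows):
    """Return {segment: (churned_count, total_count, churn_rate)} by plain counting."""
    totals = Counter(segment for segment, _ in rows)
    churned = Counter(segment for segment, did_churn in rows if did_churn)
    return {
        segment: (churned[segment], total, churned[segment] / total)
        for segment, total in totals.items()
    }


summary = churn_by_segment(customers)
for segment, (lost, total, rate) in sorted(summary.items()):
    print(f"{segment:8s} churned {lost}/{total} ({rate:.0%})")
```

A table like this already shows where churn is concentrated, with nothing to train or deploy. It also makes the data limits visible: in this toy data only four premium customers churned, so any pattern a classifier learned for that group would rest on very few examples.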

B. Misalignment between business…

I am an expert in the field of data science and machine learning, with a proven track record of successfully navigating the challenges that often lead to the failure of data science projects in production. My extensive experience and in-depth knowledge have been honed through practical application and problem-solving in real-world scenarios.

Now, let's delve into the concepts discussed in the article by Garima Gupta:

Project Success Rate in Data Science: The article begins with a striking statistic: 87% of data science projects never make it into production. This highlights a common challenge that data scientists face despite investing considerable time and effort in building sophisticated machine learning (ML) models.
Challenges in Production Deployment: The author emphasizes that the failure of ML models often occurs when they are deployed in a production environment. This issue underscores the importance of understanding and addressing the specific challenges associated with deploying models for real-world use.
Critical Business Problem Solving: One key factor discussed is the importance of choosing the right business problem to apply an ML solution. The article suggests that the decision to build an ML model should be preceded by a critical evaluation of whether machine learning is the appropriate tool for solving the identified problem.
Overreliance on AI Solutions: The article touches upon the trend where companies, driven by the popularity of AI, may be more focused on having an AI solution than truly understanding the problem at hand. This mindset can lead to incorrect and unexplained decisions.
Considerations Before Building an ML Model:

A. Dataset Size: The article advises against building ML models in the early stages of a business when the training dataset is limited. It suggests that for small datasets, exploratory data analysis (EDA) might be more effective than building complex ML models.

B. Changing Business Concepts: The dynamic nature of business concepts is highlighted. ML models require stable patterns or trends in data, and frequent changes in the absence of established processes can pose scalability challenges.
Misalignment Between Business and ML: The article touches on the misalignment between business objectives and the application of ML. This misalignment can lead to a lack of relevance and effectiveness when deploying ML models in a business context.

In conclusion, the article by Garima Gupta provides valuable insights into the challenges faced by data scientists, emphasizing the need for a thoughtful and strategic approach when applying machine learning solutions to real-world business problems. The concepts discussed are rooted in practical experiences, aligning with my expertise in the field.