‘We Use Machine Learning Algorithms To Save Millions Of Dollars’

Using Tableau/Spotfire/QlikView and D3 for visualization: Every industry vertical needs to analyze interdependencies and complex relationships between indicators, including KPIs, using clear and powerful visualizations such as network graphs, treemaps, surface and contour plots, and hexbin plots.

What are your prime responsibilities at General Mills as a (supply chain) data scientist?

Forecasting weekly product demand accurately is the most important job in Supply Chain Management. Understanding the linear or nonlinear relationships between different KPIs (key performance indicators) to get the best possible insights is the second most important job. Understanding customers and suppliers based on data, and then classifying or clustering them, is another important job. Here analysts use R and Python for predictive analytics and optimization.

How do these initiatives translate in terms of adding to the business?

We try to solve business problems more accurately with the help of machine learning. Having solved them and built the confidence of stakeholders, our team has started new projects on prediction, classification, clustering and optimization. All these initiatives have helped our company save millions of dollars through accurate demand prediction and a better understanding of the complex relationships between different KPIs. Please note that a one percent improvement in demand forecasting accuracy translates to huge savings for a company. Plus, minimizing production and transportation costs also brings substantial savings in supply chain management.

With data science still at a nascent stage in India, what challenges do you face today?

Data format and management are not consistent throughout the system. The main challenge lies in making the data consistent and clean before applying various modeling and optimization techniques. In addition, we need to understand the business more holistically and more deeply before we start working on any project.

Can you share some insights on the tools that you use?

Time series clustering helps us cluster products based on various parameters. Bayesian networks and Markov models help us get big insights out of the data pool. (A Bayesian network is a statistical model that represents a set of random variables and their dependencies via a directed acyclic graph. For example, a Bayesian network could represent the relationships between diseases and symptoms. A Markov model is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.) Understanding the delivery networks through graph theory also helps us reduce transportation cost significantly.
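To make the Markov property concrete, here is a minimal Python sketch that estimates first-order transition probabilities from a sequence of states. The function name `estimate_transitions`, the "low"/"high" demand labels and the sample data are hypothetical, chosen only for illustration; they do not describe the models actually used at General Mills.

```python
from collections import Counter, defaultdict


def estimate_transitions(states):
    """Estimate first-order Markov transition probabilities.

    Each probability is the share of times `nxt` followed `current`
    in the observed sequence.
    """
    counts = defaultdict(Counter)
    # Count every consecutive (current, next) pair in the sequence.
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    # Normalize each row of counts so it sums to 1.
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }


# Hypothetical weekly demand levels for one product (illustrative data only).
weekly_demand = ["low", "low", "high", "high", "low",
                 "high", "high", "high", "low", "low"]

transitions = estimate_transitions(weekly_demand)
for state, row in transitions.items():
    print(state, row)
```

Each row of the result gives the estimated chance of moving from one demand level to another in the following week, which depends only on the current level, exactly as in the definition above.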

What’s your advice for IT Managers on how to leverage analytics tools?

IT Managers should be aware of the strengths and limitations of all popular machine learning algorithms, and should think about where they can apply classification, clustering, prediction, association mining, optimization and mathematical modeling, so as to help business leaders make better decisions.
