Machine Learning Platforms: A Quick Introduction
Data fuels all software-driven transformations, from creating new experiences to framing new business models. If data is the fuel in this analogy, then Machine Learning and AI are the engines that power your business and drive it in the direction, and at the pace, you want.
At Persistent, we believe that ML is an ideal tool for delivering maximum impact to organizations and businesses, because it allows them to:
- Automate business processes
- Take adaptive decisions without human intervention
- Optimize operations and free up bandwidth for creative work
- Do things efficiently and quickly, leading to time and money savings
- Gain insights with data analysis
- Predict future behavior based on past patterns, thereby reducing risk
- Recognize trends from vast amounts of data and take informed decisions
- Enhance features, functions, and performance with a goal to improve customer satisfaction
- Engage with customers/employees
- Improve engagement with intuitive personalization
- Create new products
- Pursue new markets
Our collaborative work with customers on their ML programs and Data Science journeys has always been mutually beneficial. These experiences lead me to believe that a significant part of achieving maximum impact from ML programs lies in an organization’s ability to deploy Machine Learning model versions continuously, consume them as part of its business workflows and applications, and scale them as circumstances change.
Without replaying the typical ‘model building’ process, which almost all of us are familiar with, I will list some of the intrinsic complexities that accompany Machine Learning:
- Understanding the business domain
- Selecting the appropriate model
- Selecting the right features and weights
As companies adopt Machine Learning, they reach a point where they have moved from experimentation to actually running a few models in production. This stage comes with its own set of challenges, some of which are listed below:
- Integration with existing data warehouse/data sources to pull required data
- Scaling model training, i.e. training with huge datasets and a repeatable process
- Serving the model so that an inference interface is available
- Maintaining consistency between prototyping and production, training and inference, etc.
- Keeping track of multiple models, versions, and experiments
- Supporting iterations on ML models (A/B testing etc.)
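To make the tracking challenge a little more concrete, here is a minimal sketch of an in-memory model registry in Python. The names `ModelVersion`, `ModelRegistry`, `register` and `latest` are hypothetical and exist only for this illustration, and the metric values are placeholders; a real platform would persist this metadata along with the model artifacts.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: int
    artifact_sha256: str  # fingerprint of the serialized model bytes
    metrics: dict


@dataclass
class ModelRegistry:
    """Hypothetical in-memory registry, for illustration only."""
    _versions: dict = field(default_factory=dict)

    def register(self, name: str, artifact: bytes, metrics: dict) -> ModelVersion:
        # Each model name keeps an ordered history; versions start at 1.
        history = self._versions.setdefault(name, [])
        entry = ModelVersion(
            name=name,
            version=len(history) + 1,
            artifact_sha256=hashlib.sha256(artifact).hexdigest(),
            metrics=dict(metrics),
        )
        history.append(entry)
        return entry

    def latest(self, name: str) -> ModelVersion:
        # Raises KeyError if the model name was never registered.
        return self._versions[name][-1]


registry = ModelRegistry()
registry.register("churn", b"model-bytes-v1", {"auc": 0.81})  # placeholder values
registry.register("churn", b"model-bytes-v2", {"auc": 0.84})  # placeholder values
print(registry.latest("churn").version)
```

Storing a hash of the artifact next to its metrics is one simple way to tell exactly which trained model produced which result, even when several experiments share a name.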
Typically, ML models take anywhere between 8 and 12 weeks to build, and ML workflows tend to be slow, fragmented, and brittle. Robust ML workflows become more and more important as organizations mature in their ML programs. What organizations and developers need, therefore, are more powerful tools that make them more productive and let them build and deploy models faster.
The need of the hour is an answer to how Machine Learning model lifecycles should be managed. This means thinking about how to bring concepts like source control and CI/CD to machine learning.
I have tried to capture the aspects that we believe an ML platform should contain and address. An ML platform should support the full life cycle of the model and be multi-dimensional enough to cater to the needs of expert data scientists, citizen data scientists and application developers alike.
Choosing the one platform, amongst the many existing ML platforms, that perfectly suits your needs is a humongous task. Since the needs of customers vary widely, the selection criteria also tend to vary. ML and Data Science platforms are still evolving, and hence not all of them address the above questions and needs out of the box. Some customization is still needed to suit the requirements of each organization.
As an independent service provider, we are in a unique position: on one hand, we understand what our customers expect from an ML platform, and on the other, we get exposure to some of the best and latest enterprise ML platforms.
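One way to picture CI/CD applied to models is a quality gate that a pipeline runs before promoting a newly trained model. The sketch below is only an assumption about how such a gate might look: the function name `should_promote`, the `auc` metric and the `min_gain` threshold are all hypothetical, and the numbers are placeholders.

```python
def should_promote(candidate: dict, production: dict,
                   metric: str = "auc", min_gain: float = 0.01) -> bool:
    """Return True when the candidate beats production by at least min_gain."""
    return candidate[metric] - production[metric] >= min_gain


# Placeholder metrics; in a pipeline these would come from an evaluation step.
production_metrics = {"auc": 0.81}
candidate_metrics = {"auc": 0.84}

if should_promote(candidate_metrics, production_metrics):
    print("promote candidate")
else:
    print("keep current production model")
```

A pipeline could run a check like this on every retraining, in the same way a software build runs its test suite before a release.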
On that note, I am delighted to present this blog series, written by my team of Data Scientists at Persistent, covering our experiences with various ML platforms. We plan on starting with the three biggies in this space before we move to the niche players. The very first one is a favorite of many – Amazon SageMaker. Please stay tuned as Amogh Tarcar from our CTO team indulges you with this one!