Machine learning (ML) is a vital technology for companies seeking a competitive advantage. It can process large volumes of data quickly, helping businesses overcome challenges such as making more effective recommendations to customers, honing manufacturing processes or anticipating changes in a market. Machine Learning as a Service (MLaaS) is defined in a business context as companies designing and implementing ML models to provide a continuous and consistent service to customers.
Businesses today are dealing with huge amounts of data and the volume is growing faster than ever. At the same time, the competitive landscape is changing rapidly and it's critical for commercial organizations to make decisions fast. Business success comes from making quick, accurate decisions using the best possible information. This is critical in areas where customer needs and behaviours change rapidly. For example, in 2020, people had to change how they shop, work and socialize as a direct result of the COVID-19 pandemic, and businesses have had to shift how they service customers to meet their needs. This means that the technology they are using to gather and process data also needs to be flexible and adaptable to new data inputs, allowing businesses to move fast and make the best decisions.
Appier observes that one current challenge in taking ML models to MLaaS lies in how we build ML models today and how we teach future ML talent to do it. Most research and development of ML models focuses on building individual models that use a set of training data (with pre-assigned features and labels) to deliver the best performance in predicting the labels of another set of data (normally called testing data). However, for real-world businesses trying to meet the ever-evolving needs of real-life customers, the boundary between training and testing data becomes less clear. Today's testing or prediction data can be used as training data to create a better model in the future.
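The blurred boundary between testing and training data can be sketched in a few lines of Python. This is a minimal illustration only, not Appier's pipeline: the `ClickRateModel` class, the `run_daily_loop` function and the sample rows are all hypothetical names and data invented for the example. Each day, the model scores rows it has never seen, and once the outcomes are known those same rows are folded into the training history.

```python
class ClickRateModel:
    """Toy model (hypothetical): predicts each item's historical click rate."""

    def __init__(self, prior=0.5):
        self.prior = prior   # prediction for items never seen in training
        self.clicks = {}
        self.shows = {}

    def fit(self, rows):
        # rows: iterable of (item_id, clicked) pairs
        self.clicks, self.shows = {}, {}
        for item_id, clicked in rows:
            self.shows[item_id] = self.shows.get(item_id, 0) + 1
            self.clicks[item_id] = self.clicks.get(item_id, 0) + int(clicked)
        return self

    def predict(self, item_id):
        shows = self.shows.get(item_id, 0)
        if shows == 0:
            return self.prior
        return self.clicks[item_id] / shows


def run_daily_loop(days):
    """Score each day with the previous model, then retrain on all data so far."""
    history = []
    model = ClickRateModel().fit(history)
    daily_predictions = []
    for todays_rows in days:
        # 1. Today's rows play the role of testing data: the model has not seen them.
        daily_predictions.append([model.predict(item) for item, _ in todays_rows])
        # 2. Once outcomes are observed, the same rows become training data.
        history.extend(todays_rows)
        model = ClickRateModel().fit(history)
    return model, daily_predictions


days = [
    [("shoes", 1), ("hat", 0)],
    [("shoes", 1), ("hat", 0), ("bag", 1)],
]
model, preds = run_daily_loop(days)
print(preds)  # [[0.5, 0.5], [1.0, 0.0, 0.5]]
```

On the first day the model can only return its prior; on the second day it already reflects the first day's outcomes, which is the sense in which yesterday's prediction data becomes today's training data.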
Based on Appier's years of practical experience, the data used for training a model will no doubt be imperfect, for several reasons. Real-world data sources can be incomplete or unstructured (such as open-answer customer questionnaires), and they can also come from a biased collection process. For instance, the data used to train a recommendation model are normally collected from the feedback of another recommender system currently serving online. The data collected are therefore biased by the online serving model.
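To make the collection bias concrete, the sketch below compares a naive click rate computed from logged feedback with an inverse propensity scoring (IPS) estimate. IPS is a standard off-policy correction named here for illustration; the article does not say which technique Appier uses. The log format, the `propensity` field (the probability with which the online model showed each item) and the item names are assumptions made up for this example.

```python
def naive_click_rate(logs):
    """Average logged clicks, ignoring how the serving model chose what to show."""
    return sum(row["clicked"] for row in logs) / len(logs)


def ips_click_rate(logs, target_prob):
    """IPS estimate of the click rate a different policy would have obtained.

    Each logged click is reweighted by target probability / logging propensity,
    so items the online model rarely showed count for more.
    """
    total = 0.0
    for row in logs:
        weight = target_prob[row["item"]] / row["propensity"]
        total += weight * row["clicked"]
    return total / len(logs)


# Hypothetical logs: the online model shows item A 90% of the time, B 10%.
logs = (
    [{"item": "A", "propensity": 0.9, "clicked": 1}] * 3
    + [{"item": "A", "propensity": 0.9, "clicked": 0}] * 6
    + [{"item": "B", "propensity": 0.1, "clicked": 1}] * 1
)
uniform_policy = {"A": 0.5, "B": 0.5}  # policy we would like to evaluate

naive = naive_click_rate(logs)            # dominated by item A
ips = ips_click_rate(logs, uniform_policy)
print(round(naive, 3), round(ips, 3))     # 0.4 0.667
```

In this toy log, item A is clicked one time in three and item B every time it is shown, so a policy that showed both equally often would see a click rate of about 0.667. The naive average of 0.4 understates this because the serving model rarely showed item B.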
Additionally, sometimes the outcome we care about most is the hardest to evaluate. Let's take digital marketing for ecommerce as an example. The most straightforward customer journey would be 'click item, view item, add item to cart, purchase item'. However, the process is rarely this simple: people might look at an item several times on different devices, and they may remove it from the cart before putting it back in, or abandon the purchase altogether.
Usually, actions deeper in the funnel (e.g. purchases) are much harder to observe than those in the upper funnel. For example, if a consumer does not complete the purchase on your platform, you will never know whether they lost interest in the product or had another reason not to buy the item. If an MLaaS model relies only on the simplest metrics (e.g. clicks and views), its suggestions (e.g. when to send out marketing messages) will not align with the ultimate business goal.
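The sparsity of deep-funnel signals is easy to see once events are counted per step. The following sketch assumes a simplified event log of `(user_id, event_name)` pairs and the four-step journey described above; the step names, the `funnel_rates` function and the sample events are illustrative, not taken from any real system.

```python
FUNNEL = ["click", "view", "add_to_cart", "purchase"]


def funnel_rates(events):
    """Return the share of clicking users who reached each funnel step.

    events: iterable of (user_id, event_name) pairs. Repeated events by the
    same user (e.g. viewing an item on several devices) are counted once.
    """
    users_by_step = {step: set() for step in FUNNEL}
    for user_id, name in events:
        if name in users_by_step:
            users_by_step[name].add(user_id)
    total = len(users_by_step[FUNNEL[0]])
    return {
        step: (len(users_by_step[step]) / total if total else 0.0)
        for step in FUNNEL
    }


events = [
    ("u1", "click"), ("u2", "click"), ("u3", "click"), ("u4", "click"),
    ("u1", "view"), ("u2", "view"), ("u3", "view"), ("u1", "view"),
    ("u1", "add_to_cart"), ("u2", "add_to_cart"),
    ("u1", "purchase"),
]
rates = funnel_rates(events)
print(rates)
# {'click': 1.0, 'view': 0.75, 'add_to_cart': 0.5, 'purchase': 0.25}
```

Even in this tiny example only one user in four supplies a purchase label, so a model trained on clicks alone has four times as much data but is optimizing for a different outcome than the business goal.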
Finally, a B2B AI company that provides machine learning services normally needs to serve thousands of customers, or even more, from different domains. This means there will always be at least several thousand models serving online. Furthermore, for those models to keep performing against ongoing and constantly shifting business goals, they need to be retrained or updated every day to keep up with evolving real-world scenarios. To achieve this, one needs not only to design an automated training pipeline but also to guarantee that models have close to zero probability of converging to a bad local optimum.
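One common safeguard in an automated retraining pipeline is to train several candidates from different random seeds and promote the best one only if it does not regress against the model currently serving. The sketch below shows that idea under stated assumptions: `retrain_with_gate`, `train_fn`, `score_fn`, the seed list and the tolerance are all hypothetical, and the "models" are plain dictionaries carrying a validation score. It reduces the chance of shipping a badly converged model but does not by itself guarantee it.

```python
def retrain_with_gate(train_fn, score_fn, current_model,
                      seeds=(0, 1, 2), tolerance=0.01):
    """Train one candidate per seed and promote the best only if it holds up.

    train_fn(seed) -> model, score_fn(model) -> validation score (higher is
    better). Returns (model_to_serve, decision).
    """
    baseline = score_fn(current_model)
    candidates = [train_fn(seed) for seed in seeds]
    best = max(candidates, key=score_fn)
    if score_fn(best) >= baseline - tolerance:
        return best, "promoted"
    # Every restart landed somewhere worse: keep serving the previous model.
    return current_model, "kept_previous"


def score(model):
    return model["val_score"]


current = {"name": "yesterday", "val_score": 0.80}

# Toy trainer: different seeds converge to solutions of different quality.
seed_scores = {0: 0.60, 1: 0.82, 2: 0.40}


def train(seed):
    return {"name": f"seed-{seed}", "val_score": seed_scores[seed]}


def train_badly(seed):
    return {"name": f"seed-{seed}", "val_score": 0.50}


served, decision = retrain_with_gate(train, score, current)
print(served["name"], decision)        # seed-1 promoted

fallback, fallback_decision = retrain_with_gate(train_badly, score, current)
print(fallback["name"], fallback_decision)  # yesterday kept_previous
```

With thousands of models retrained daily, a gate like this has to run without human review, which is why the decision is returned as data that a monitoring system could log and alert on.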
In many cases when a machine learning model delivers an unexpected outcome, it is not the machine learning that has broken down but some other part of the chain. For example, a recommendation engine may have offered a product to a customer, but the connection between the sales system and the recommendation engine could be broken, and it takes time to find the bug. In this case, it would be hard to tell the model whether the recommendation was successful. Troubleshooting issues like this can be quite labor-intensive, and it is a capability that companies adopting MLaaS need to have in place.
Ensuring the overall stability and robustness of MLaaS models is critical. It requires significant ongoing investment, research and experimentation, but the rewards for businesses can be huge, allowing them to quickly adapt and pivot to changing business environments and allowing them to stay ahead of the game. For more artificial intelligence and machine learning information, please refer to the Appier blog.
About Appier
Appier helps businesses solve their most challenging problems with artificial intelligence. It is a partner to some of the world's leading brands, providing a suite of enterprise-grade products to support data-driven decisions and accelerate business growth. Established in 2012 by a passionate team of computer scientists and engineers, Appier now has more than 450 employees across 15 offices in APAC and Europe, and is recognized as a Top 50 AI company by Fortune Magazine. Appier has raised US$162 million in funding from investors including Sequoia, Softbank, and Line. Learn more at www.appier.com.