Things you may not have considered when planning your Machine Learning product
The whole world is moving at a fast pace into AI-based solutions. Even the fight with coronavirus has an AI element. Most recently, the Filipino government has asked its data scientists for help with battling COVID-19. And as so many sectors are turning towards AI, you might be asking yourself: how could it support your company too?
As Machine Learning is quite a buzzword nowadays, I thought I would share with you some aspects of ML-powered products that may not be immediately obvious.
First things first – define your problem
So, you have money and you want a powerful AI tool. That’s good. However, one very important question needs to be answered: “What problem do you want to solve?”
For example, would you like to discover the factors that make your clients buy more, or do you need a chatbot to support your customer service? Maybe you would like to have an auto-tagging solution for your text content, sentiment classification on your live teleconferences or a fraud detection system that can react faster than a human being?
It’s crucial to think this through carefully. Having this answered, you can ask yourself the next question: “If I have the solution, how much money will I earn or save?” Now you know how much the solution is worth.
Machine Learning doesn’t exist without data
As Wikipedia puts it:
Machine learning algorithms build a mathematical model based on sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to perform the task.
If your business is digitalized, then you probably already have data. But the important question to ask here is: do you have the right data? This is where Machine Learning Engineers or Data Scientists step in. They should support you in this process, figure out how to translate your business problem into a scientific problem and, in doing so, work out what the process of training models may look like.
On the other hand, it may turn out that you have no data or very, very little of it. Again, your specialists should consider whether it’s possible to use some open data, to take advantage of transfer learning, data augmentation or artificial data, or simply to start with the data gathering process first. They should present you with the available options.
The point is, you need good data to make a good enough model.
Creating a good enough model
The problem is known and the data exists. So now you’ve reached the core piece of your project: your ML model or, more precisely, the whole process that leads your team to choose a given model.
Why is it a process? Because there is a high probability that there is more than one way to solve the issue. Even better, the ways may be very different and all of them may seem valid for your business. Therefore, your team first explores the best currently known approaches to similar problems. Then, the few most promising directions are selected, implemented and experimented upon.
Let’s shed more light on implementation and experimenting. Here, two main strategies can be distinguished: a quick and dirty Proof of Concept (PoC), or a long term living set of tools, which is closer to proper software development. Neither of the two is better or worse by definition. Moreover, this is not an either-or choice but rather a combination.
The quick and dirty PoC is good if you need a very fast answer. Such speed may be needed when:
- The examined algorithm (e.g. a type of neural network) is of high risk and you have no idea whether it is suitable for your problem at all.
- There are so many potential leads to check that you need more data to make an educated guess of further moves.
- Researchers (but not developers), who are very good at working on such tasks, are available.
Advantages:
- Initial results are available very fast.
- There is a high chance of discovering potential blockers or risks that were unknown before.
- People are well suited to the tasks, or are assigned the tasks they perform best.
Disadvantages:
- A quick and dirty PoC is a kind of technical debt. When it turns out that the direction is worth continuing and far more experiments are needed, the debt will need to be paid (at least to some extent) and you will notice some slowdown of development.
- Most likely, such a piece of work is not repeatable out-of-the-box, as it was built to prove a hypothesis at a specific point in time, with some set of assumptions and constraints.
- It can become a mess in terms of file management.
- Deploying the results of a PoC is a rather daunting and time-consuming process.
Long term living software, on the other hand, could be a better choice when stability and repeatability are needed. This may be a better fit if:
- The product is of high complexity and you are prepared for months of development.
- Your data will change in the future (e.g. you have only a few samples at the start and expect to gather more, but this requires time).
- You start with artificial data and will collect real data once the alpha version of your product is in use.
- The number of possible factors that can improve or worsen the final performance is huge and you need to carefully track the history of experiments.
- The first, good enough, model should be in use as soon as possible. When a better one is discovered, it should replace the previous one.
Advantages:
- Repeatability of experiments, which is crucial when new data is available.
- A proper software development process (with code review, tests, etc.), in which the code base is as important as the results.
- The speed and number of experiments increase with time.
- Automation of exploitation.
- Computations can be moved to remote machines more easily.
- Deployment is much easier, as most parts of the chosen algorithm should be reusable.
Disadvantages:
- The overhead of building tools and an experimenting framework, or of researching existing online platforms that are tailored to your needs.
- The need for experienced developers and a team with a wider skillset.
- The danger of losing focus on the final goal and of starting to experiment later than you could.
- An inexperienced team of developers may not be ready for such a dynamic project.
- The model management process (this can actually be an advantage if it is solved at an early stage of the development process).
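To make the idea of tracking the history of experiments a bit more tangible, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the names Experiment and ExperimentLog, their fields and the example values are hypothetical and do not come from any particular tool your team may choose.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class Experiment:
    """One training run; every field here is a hypothetical example."""
    name: str
    data_version: str          # which snapshot of the data was used
    params: Dict[str, float]   # settings needed to repeat the run
    score: float               # evaluation metric, higher is better


class ExperimentLog:
    """Keeps the history of runs so they can be compared and repeated."""

    def __init__(self) -> None:
        self.runs: List[Experiment] = []

    def record(self, run: Experiment) -> None:
        self.runs.append(run)

    def best(self) -> Optional[Experiment]:
        # The current "good enough" model is the best run seen so far.
        return max(self.runs, key=lambda run: run.score, default=None)

    def to_json_lines(self) -> str:
        # One JSON object per line, easy to store next to the code.
        return "\n".join(json.dumps(asdict(run), sort_keys=True) for run in self.runs)


# Made-up runs: first on artificial data, later on real data.
log = ExperimentLog()
log.record(Experiment("baseline", "artificial-v1", {"learning_rate": 0.1}, 0.71))
log.record(Experiment("more-data", "real-v1", {"learning_rate": 0.1}, 0.78))
print(log.best().name)
```

Recording the data version next to the parameters is what makes a run repeatable when new data arrives, and picking the best run so far mirrors the idea of replacing the first good enough model when a better one is discovered.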
One more thing needs explanation: what does “good enough” mean? It means don’t wait for a perfect solution that solves all the business problems. Pick the one that starts to bring value to your company despite its pitfalls (e.g. 70% of edge cases are not handled, but it’s already better than not having any solution).
Deployment: let the world see your AI
Finally, the long awaited moment arrives. Your team says that the solution with the selected model is performing well. Now, it’s time to deploy it and reap the profits.
But where’s the catch? There is a reason why the term “solution” was used as opposed to just “a model”. Between real data (or raw data) and the model there is pre-processing that transforms the program’s input into a form which can be digested by the model. The same idea holds for the final output presented to an end user: the model returns some numbers that need to be translated into a form that is meaningful to humans. If your team followed the long term living software strategy, then there is a high probability that most of this job has already been done.
Still, at this point you only have the AI core, and the final product may still be far away. Examples:
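The difference between “a model” and “a solution” can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the Solution class and the toy keyword-counting “model” are hypothetical stand-ins, not a real sentiment classifier.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Solution:
    """Wraps a model with the steps that surround it in production."""
    preprocess: Callable[[str], List[float]]      # raw input -> model features
    model: Callable[[List[float]], List[float]]   # features -> raw numbers
    postprocess: Callable[[List[float]], str]     # raw numbers -> readable answer

    def predict(self, raw_input: str) -> str:
        features = self.preprocess(raw_input)
        scores = self.model(features)
        return self.postprocess(scores)


# Hypothetical stand-ins, for illustration only.
def count_keywords(text: str) -> List[float]:
    words = text.lower().split()
    return [float(words.count("great")), float(words.count("terrible"))]


def toy_model(features: List[float]) -> List[float]:
    positive, negative = features
    return [positive - negative]


def to_label(scores: List[float]) -> str:
    if scores[0] > 0:
        return "positive"
    if scores[0] < 0:
        return "negative"
    return "neutral"


sentiment = Solution(count_keywords, toy_model, to_label)
print(sentiment.predict("The call was great"))
```

Keeping the three steps behind one predict call means the web, mobile or desktop part of the product talks to the solution as a whole and never to the bare model.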
- If the application should work as a web service, then a web part must be delivered and adapted to work with the core.
- If the end platform is mobile, a proper application must be delivered. Communication with the web server that stores the model has to be handled.
- The end user is again on mobile, but the final product is an application that works offline and syncs when an Internet connection is available. In such a case, your AI core has to be migrated to mobile, which brings additional restrictions (your team should be aware of this at the research phase, otherwise the whole project is at risk of failure).
- You want to deliver an offline desktop application for Windows, Mac and Linux. Developers face the challenge of building these applications and plugging the AI core into them.
One more question that you should remember to ask yourself: when a new, better model is discovered, how will it be updated in the final product?
Demands to face, or how fast must it be?
The initial business problem you want to solve brings its own demands. The data that you have gathered plus the algorithm that was chosen have some assumptions. In addition, the platform you deliver on adds some restrictions.
Now, there are also some challenges that may not have been considered yet but could have an impact on success. For example, how quickly must the application respond to the end user? In less than a second? Or maybe the process is asynchronous due to heavy video processing, and a notification should be sent when results are available? You just need common sense to answer these.
It’s obvious that the application should run as quickly as possible, but “as quickly as possible” is not a helpful target for researchers. They may keep looking for an ever faster solution and, at the end of the day, have nothing to show, because they have no idea when to stop. However, if you know that you start losing customers at 3 seconds, whilst one second is optimal and anything faster brings no extra value (clients won’t notice the difference), then the scientists have a clear goal to work towards.
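Such thresholds are easy to turn into something the team can check automatically. The sketch below is an assumption of mine rather than a prescribed method: it uses the one-second and three-second limits from the example above, and slow_request is a hypothetical stand-in for a real request going through the whole solution.

```python
import time
from typing import Callable

OPTIMAL_SECONDS = 1.0   # from the example: anything faster brings no extra value
MAX_SECONDS = 3.0       # from the example: slower than this loses customers


def classify_latency(seconds: float) -> str:
    """Translate a measured response time into a decision for the team."""
    if seconds <= OPTIMAL_SECONDS:
        return "good enough, stop optimizing"
    if seconds <= MAX_SECONDS:
        return "acceptable, worth improving"
    return "too slow, customers are lost"


def measure(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the slowest wall-clock time over several runs, in seconds."""
    worst = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        worst = max(worst, time.perf_counter() - start)
    return worst


def slow_request() -> None:
    # Hypothetical stand-in for pre-processing, the model and post-processing.
    time.sleep(0.01)


print(classify_latency(measure(slow_request)))
```

Taking the slowest of several runs is a deliberately pessimistic choice here; a real setup would more likely look at percentiles over many requests.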
Another example is the number of requests to handle (in web-based systems) and therefore the number of machines you must set up, or their computing power. The heavier your processing is (including pre-processing and post-processing), the more expensive the hardware, especially if GPUs are needed. Assuming that you’re paying per hour for such computers, expenses can become very high if machines are up but idle (costs, but no profits).
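A quick back-of-the-envelope calculation shows how much idle time can cost. All the numbers below (hourly rate, number of machines, busy hours) are hypothetical and chosen only to make the arithmetic easy to follow; plug in your own provider’s prices.

```python
def machine_cost(hourly_rate: float, machines: int, hours: float) -> float:
    """Cost of keeping `machines` computers up for `hours` hours."""
    return hourly_rate * machines * hours


# Hypothetical numbers, for illustration only.
HOURLY_RATE = 2.5      # price per machine per hour, in any currency
MACHINES = 4
HOURS_UP = 720.0       # 30 days, always on
HOURS_BUSY = 180.0     # hours actually spent serving requests

total = machine_cost(HOURLY_RATE, MACHINES, HOURS_UP)
idle = machine_cost(HOURLY_RATE, MACHINES, HOURS_UP - HOURS_BUSY)
print(f"total: {total:.2f}, spent on idle machines: {idle:.2f} ({idle / total:.0%})")
```

With these made-up values, three quarters of the monthly bill pays for machines that are doing nothing, which is why it is worth asking your team whether machines can be switched off or scaled down when there is no traffic.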
In the end, both the infrastructure and the staff should be factored into your costs, as the latter are necessary to prepare such a setup and monitor it later on.
…and they lived happily ever after. Or the future of your models.
You have everything set up. The models are running on powerful machines. People are happy with the final product. Your company has more revenue than expected. It seems that the job is finished, and that may actually be true. However, there is one more aspect worth considering. The future can’t be predicted 100%, and everything that has been done was based on historical data. Therefore, keep track of your product’s performance so that you’re not caught by surprise. If you discover that you’re getting worse results over time, and it is still crucial to keep the original quality your team achieved during evaluation, then you need to ask: “What has happened to make the results change?” Your team should help you find the answer.
I hope I have managed to reveal a few aspects of developing software powered by an ML core. In particular, I hope it is now clearer what the next steps of delivering your product may bring and what flavors they come in.
As a final word, let me cite George E. P. Box, who said that “All models are wrong, but some are useful”, and that’s what we must always remember.
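Keeping track of performance can start very simply. The following sketch is one possible approach under my own assumptions, not a standard recipe: PerformanceMonitor is a hypothetical class that compares the share of correct answers in a recent window with the quality measured at evaluation time, and the baseline, tolerance and window size are made-up values.

```python
from collections import deque
from typing import Deque, Optional


class PerformanceMonitor:
    """Compares recent accuracy with the accuracy measured at evaluation."""

    def __init__(self, baseline: float, tolerance: float, window: int) -> None:
        self.baseline = baseline      # quality achieved on evaluation
        self.tolerance = tolerance    # how much of a drop is acceptable
        self.window = window          # how many recent results to look at
        self.outcomes: Deque[float] = deque(maxlen=window)

    def record(self, correct: bool) -> None:
        # Old results fall out of the window automatically.
        self.outcomes.append(1.0 if correct else 0.0)

    def current_accuracy(self) -> Optional[float]:
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self) -> bool:
        # Wait for a full window so a few early mistakes do not raise an alarm.
        if len(self.outcomes) < self.window:
            return False
        return self.current_accuracy() < self.baseline - self.tolerance


# Hypothetical values: 90% accuracy at evaluation, 5 points of tolerance.
monitor = PerformanceMonitor(baseline=0.9, tolerance=0.05, window=10)
for _ in range(10):
    monitor.record(True)
print(monitor.degraded())
```

This assumes you can find out, at least for a sample of cases, whether the answer given by the model was correct, which is itself something to plan for early.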