Why AI code optimisation will be a game-changer

For years, we’ve been aware that AI is set to be one of the world’s biggest – if not the biggest – technological and economic game-changers. With PwC estimating that AI will grow the global economy by nearly $16 trillion by 2030, we’ve become used to media claims that it will be a transformative technology. For those of us who actually work with AI, though, it’s clear that some of this optimism needs to be tempered. That’s because, right now, many of the processes used to develop, test, deploy, and monitor AI models are not as efficient as they could be.

In practice, most people who’ve worked with AI or ML in industry know that the technology requires a great deal of manual intervention to run smoothly in a production environment. To take one example, the data scientists who help develop and train models find most of their time consumed by manual, repetitive data-preparation tasks – around 45% of their working hours. By contrast, the real value-add part of the job – model training, scoring, and deployment – takes up only around 12% of a data scientist’s time.

But I think one of the most difficult bottlenecks for the commercialisation of AI is code optimisation. Speed is not a word regularly associated with machine learning teams. When we talk and write about accomplishments in machine learning, the focus tends to be on the problem, the algorithm’s approach, and the results – with no mention of the time it took to get there. Whether it’s removing redundant lines of code or reordering processes to make better use of compute or storage resources, scaling AI deployments requires software engineers to dedicate vast amounts of time to parsing through models and making hundreds or thousands of individually minute changes.

This work is absolutely vital: sense-checking code to remove segments that might introduce errors, cutting down the risk of memory leaks, or saving CPU cycles and countless kilowatt-hours of power. But it is also manual work that is repetitive and tedious, and rarely the most efficient use of human capital.
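To give a sense of the kind of micro-optimisation involved, here is a minimal, hypothetical Python sketch – the function names and data are invented – showing one of the simplest patterns an engineer might hunt for: a redundant computation repeated inside a loop that can be hoisted out of it.

```python
import time

def slow_scores(values, weights):
    # Redundant work: sum(weights) is recomputed for every element.
    return [v * w / sum(weights) for v, w in zip(values, weights)]

def fast_scores(values, weights):
    # Hoisted: the normalising constant is computed once, outside the loop.
    total = sum(weights)
    return [v * w / total for v, w in zip(values, weights)]

values = [float(i) for i in range(5_000)]
weights = [1.0] * 5_000

start = time.perf_counter()
slow = slow_scores(values, weights)
print(f"redundant version: {time.perf_counter() - start:.3f}s")

start = time.perf_counter()
fast = fast_scores(values, weights)
print(f"hoisted version:   {time.perf_counter() - start:.3f}s")

assert slow == fast  # identical output, a fraction of the work
```

Multiply that kind of change by hundreds or thousands of sites across a large model and its serving code, and the scale of the manual effort becomes clear.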

This raises the question: why is this still the case, and why haven’t we found ways to automate code optimisation? The answer is that code optimisation isn’t a process that can be captured by a simple, linear set of rules. Rather, it demands a degree of intelligent judgement, because it always involves navigating trade-offs between accuracy, efficiency, cost, memory, explainability, and ethics.

To illustrate this, let’s consider a few examples:

  1. Consider an AI model that identifies the items of clothing in a photo. Optimising that model to be as accurate as possible – to identify precisely which item of clothing it is shown – may mean building in a large library of clothing categories, which in turn means the model will require far more compute and storage. Optimising it to save on CPU and storage, on the other hand, may mean pruning categories and reducing accuracy (e.g., folding polo shirts and t-shirts into a single ‘shirts’ category) – a trade-off sketched in the code example after this list.
  2. Imagine an AI model that identifies fake news or hate speech. Such a model may be run against millions of messages generated on a social network every second. A highly accurate model is most probably a complex one – and a complex model is slow. A slow model in production may put such a heavy load on the system that it becomes impractical, so there is a need to balance accuracy and speed.
  3. Many high-frequency trading teams are adopting AI models in their decision making. Even though a model’s accuracy is very important, slow models are impractical for trades that happen on the order of milliseconds, as such a delay may mean missing a trading opportunity. Teams therefore have to trade some of a model’s accuracy for speed.

Many companies are thus investing heavily in large teams of software and hardware engineers who manually optimise code to improve the speed of models without compromising accuracy. At the same time, there is a big push to make AI explainable, so that its decisions or rules can easily be articulated to both subject experts and laypeople. Building in that capacity for explainability necessarily means choosing between making the model more resource-intensive, or making the model itself simpler so as to free up resources for the explanation process.
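To put rough numbers on the first example, here is a toy sketch – the embedding size, category counts, and random data are all invented – showing why a classification head that has to distinguish more clothing categories costs more compute per batch of images:

```python
import time
import numpy as np

rng = np.random.default_rng(0)
features = rng.standard_normal((1_000, 2_048))   # 1,000 image embeddings, 2,048 dims each

def head_seconds(num_categories):
    # Time one pass through a dense classification head with one weight column per category.
    weights = rng.standard_normal((2_048, num_categories))
    start = time.perf_counter()
    _scores = features @ weights                 # cost (and memory) grows with the category count
    return time.perf_counter() - start

print(f"1,000 fine-grained categories: {head_seconds(1_000):.4f}s")
print(f"   50 coarse categories:       {head_seconds(50):.4f}s")
```

The coarse head is far cheaper to run – but, as in the example above, it can no longer tell a polo shirt from a t-shirt.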

Because of these trade-offs, it has historically not been possible to automate AI code optimisation – it simply takes a human to judge what sort of optimisation is appropriate for a given context. Do you want to optimise for speed, ethics, accuracy, or something else? How do you rank those objectives?

However, it’s finally becoming possible to solve this problem – by using AI itself to model these different approaches, and in turn to evaluate and optimise code according to the objectives of the organisation that wants to deploy the model. By instructing an AI optimising agent about your priorities, you let it build a profile of what the code should look like before it parses through the code, feeding your values into the optimisation process itself.
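As a minimal sketch of what that could look like, the Python below assumes an agent that scores candidate optimised variants of a model against user-supplied priority weights and picks the best; the variant names, metric values, and weights are all invented for illustration.

```python
# Hypothetical variants of one model after different optimisation passes,
# each with measured accuracy, latency, and memory (all values invented).
candidates = {
    "baseline":        {"accuracy": 0.92, "latency_ms": 40.0, "memory_mb": 512},
    "pruned":          {"accuracy": 0.90, "latency_ms": 22.0, "memory_mb": 300},
    "quantised":       {"accuracy": 0.89, "latency_ms": 12.0, "memory_mb": 140},
    "distilled_small": {"accuracy": 0.86, "latency_ms": 6.0,  "memory_mb": 60},
}

# The organisation's priorities: a higher weight means it cares more about that objective.
priorities = {"accuracy": 0.6, "latency_ms": 0.3, "memory_mb": 0.1}

def normalise(metric, value):
    # Map each metric onto [0, 1], where 1 is best; latency and memory are lower-is-better.
    values = [c[metric] for c in candidates.values()]
    lo, hi = min(values), max(values)
    scaled = (value - lo) / (hi - lo)
    return scaled if metric == "accuracy" else 1.0 - scaled

def weighted_score(metrics):
    return sum(priorities[m] * normalise(m, v) for m, v in metrics.items())

for name, metrics in candidates.items():
    print(f"{name:15s} score={weighted_score(metrics):.3f}")

best = max(candidates, key=lambda name: weighted_score(candidates[name]))
print(f"chosen variant: {best}")
```

Changing the weights changes which variant wins – which is exactly the priority-setting an organisation would do before letting such an agent loose on its code.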

This sort of multi-objective code optimisation will enable teams to dramatically shorten the time-to-market of their ML models and accelerate the adoption of the technology throughout industry. In turn, that will lead to more efficient code and greener, cheaper, and fairer AI for all. And ultimately, it will make AI viable at scale by taking away the most mundane and repetitive part of the process – freeing up developers and teams to do far more interesting work.

Written by Rick Hao, deep tech partner at SpeedInvest