Microsoft unveils serverless fine-tuning for its Phi-3 small language model

Microsoft is a major backer and partner of OpenAI, but that doesn’t mean it wants to let the latter company run away with the generative AI ballgame.

As proof of that, today Microsoft announced a new way to fine-tune its Phi-3 small language model without developers having to manage their own servers, and for free (initially).

What is Phi-3?

The company unveiled Phi-3, a 3 billion parameter model, back in April as a low-cost, enterprise-grade option for third-party developers to build new applications and software on.

While significantly smaller than most other leading language models (Meta’s Llama 3.1, for instance, comes in a 405 billion parameter flavor, parameters being the “settings” that guide the neural network’s processing and responses), Phi-3 performed on the level of OpenAI’s GPT-3.5 model, according to comments provided at that time to VentureBeat by Sébastien Bubeck, Vice President of Microsoft generative AI.

Specifically, Phi-3 was designed to offer affordable performance on coding, common sense reasoning, and general knowledge.

It’s now a whole family consisting of six separate models with different numbers of parameters and context lengths (the number of tokens, or numerical representations of data, the user can provide in a single input). Context lengths range from 4,000 to 128,000 tokens, with costs ranging from $0.0003 USD per 1,000 input tokens to $0.0005 USD per 1,000 input tokens.

However, put into the more typical “per million” token pricing, it comes out to $0.30 per 1 million input tokens and $0.90 per 1 million output tokens to start, exactly double OpenAI’s new GPT-4o mini pricing for input and about 1.5 times as expensive for output.
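For readers who want to check that arithmetic, here is a minimal sketch in Python. The Phi-3 figures are the ones quoted above. The GPT-4o mini prices ($0.15 input and $0.60 output per 1 million tokens) are not stated in this article and are assumed here from OpenAI's list pricing at the time. The function and variable names are illustrative only.

```python
def per_million(price_per_thousand: float) -> float:
    """Convert a price per 1,000 tokens into a price per 1,000,000 tokens."""
    return price_per_thousand * 1000


# Phi-3 starting prices, as quoted in the article (USD).
phi3_input_per_1k = 0.0003
phi3_input_per_1m = per_million(phi3_input_per_1k)
phi3_output_per_1m = 0.90

# ASSUMPTION: GPT-4o mini list prices per 1M tokens (not given in the article).
gpt4o_mini_input_per_1m = 0.15
gpt4o_mini_output_per_1m = 0.60

# How many times more expensive Phi-3 is on each side.
input_ratio = phi3_input_per_1m / gpt4o_mini_input_per_1m
output_ratio = phi3_output_per_1m / gpt4o_mini_output_per_1m

print(f"Phi-3 input: ${phi3_input_per_1m:.2f} per 1M tokens")
print(f"Input ratio vs. GPT-4o mini: {input_ratio:.1f}x")
print(f"Output ratio vs. GPT-4o mini: {output_ratio:.1f}x")
```

Under those assumed prices, the ratios work out to roughly 2x for input and 1.5x for output, which matches the comparison above.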

Phi-3 was designed to be safe for enterprises to use, with guardrails to reduce bias and toxicity. Even back when it was first announced, Microsoft’s Bubeck promoted its capability to be fine-tuned for specific enterprise use cases.

“You can bring in your data and fine-tune this general model, and get amazing performance on narrow verticals,” he told us.

But at that point, there was no serverless option to fine-tune it: if you wanted to do it, you had to set up your own Microsoft Azure server or download the model and run it on your own local machine, which may not have enough space.