Microsoft’s new tiny language model can read images — here’s what you can use it for

Microsoft image for Phi-3 language model
(Image credit: Microsoft)

During Build 2024, Microsoft announced a new version of the company’s small language AI model, Phi-3, which is capable of analyzing images and telling users what’s in them.

The new version, Phi-3-vision, is a multimodal model. The term has become more familiar with OpenAI’s GPT-4o and Google’s updates to Gemini; it means the AI tool can read both text and images.

Phi-3-vision is a 4.2 billion-parameter model, small enough that it is meant for use on mobile devices. A model’s parameter count is shorthand for how complex the model is and how much of its training it can absorb. Microsoft has built each Phi model on the one before it: Phi-2 learned from Phi-1 and gained new capabilities, and Phi-3 was likewise trained on Phi-2 with capabilities added.

Phi-3-vision can perform general visual reasoning tasks, such as analyzing charts and images. Unlike better-known models such as OpenAI’s DALL-E, Phi-3-vision can only “read” an image; it cannot generate one.

Microsoft has released several of these small AI models. They’re designed to run locally and on a wider range of devices than larger models like Google’s Gemini or even ChatGPT. No internet connection is required. They also reduce the computing power needed to run certain tasks, like solving math problems, as Microsoft’s small Orca-Math model does.

The first iteration of Phi-3 was announced in April when Microsoft released the tiny Phi-3-mini. In benchmark tests, it performed quite well against larger models like Meta’s Llama 2. The mini model has just 3.8 billion parameters. There are two other models, Phi-3-small and Phi-3-medium, which feature 7 billion parameters and 14 billion parameters, respectively. 
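To put those parameter counts in perspective, here is a rough back-of-the-envelope sketch of how much storage the weights alone might need. The parameter counts come from the figures above; the storage formats (16-bit, 8-bit and 4-bit) and the helper names are assumptions for illustration, not anything Microsoft has published, and real memory use would be higher once activations and runtime overhead are included.

```python
# Parameter counts as reported in this article, in billions.
PHI3_PARAMS_BILLIONS = {
    "Phi-3-mini": 3.8,
    "Phi-3-vision": 4.2,
    "Phi-3-small": 7.0,
    "Phi-3-medium": 14.0,
}

# Assumed storage formats (not from the article): bytes used per parameter.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_size_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    total_bytes = params_billions * 1e9 * bytes_per_param
    return total_bytes / 1e9


def size_table() -> dict:
    """Map each model name to its estimated weight size per storage format."""
    return {
        name: {fmt: weight_size_gb(count, width)
               for fmt, width in BYTES_PER_PARAM.items()}
        for name, count in PHI3_PARAMS_BILLIONS.items()
    }


if __name__ == "__main__":
    for name, sizes in size_table().items():
        row = ", ".join(f"{fmt}: {gb:.1f} GB" for fmt, gb in sizes.items())
        print(f"{name}: {row}")
```

Under these assumptions, the arithmetic is simply parameters multiplied by bytes per parameter, which is why a model with a few billion parameters is a plausible fit for a phone or laptop while much larger models are not.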

Phi-3-vision is available in preview right now. The three other Phi-3 models, Phi-3-mini, Phi-3-small and Phi-3-medium, are accessible via the Azure Machine Learning model catalog and collections. To use them, you’ll need a paid Azure account and an Azure AI Studio hub.

Scott Younker
West Coast Reporter

Scott Younker is the West Coast Reporter at Tom’s Guide. He covers all the latest tech news. He’s been involved in tech since 2011 at various outlets and is on an ongoing hunt to build the easiest-to-use home media system. When he’s not writing about the latest devices, you are more than welcome to discuss board games or disc golf with him.
