Apple working to cram massive Gemini model into iPhone to power new Siri

May 28, 2026 • Technology

Summary

Apple is working to improve Siri using Google’s large AI model called Gemini, but the biggest parts of this AI will run on cloud servers rather than directly on iPhones. Apple aims to balance AI performance and user privacy by running smaller AI tasks on the device while relying on cloud computing for more complex work.

Key Facts

Apple delayed its AI-enhanced Siri several times but will merge it with Google’s Gemini AI later in 2026.
Gemini is a large AI model with trillions of parameters, far bigger than what current smartphones can fully run.
iPhones will run some AI tasks locally, but many complex AI processes will be handled in the cloud.
Running AI in the cloud often raises privacy concerns, though Apple plans to use encrypted computing on Nvidia’s platform to protect user data.
Apple’s own AI processing chip, the Neural Engine, is optimized for smaller AI tasks, not huge models like Gemini.
Apple struggled to run Google’s full Gemini models on its own private cloud servers and will rely more on Google and Nvidia’s infrastructure.
Google uses smaller AI models called Gemini Nano for mobile features, but conversational assistants like Siri need bigger models.
Apple will use a technique called distillation, in which a small model is trained to mimic the outputs of a larger one, to create smaller, faster models from the large Gemini model that can run on iPhones.
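
The distillation idea in the last point can be illustrated with a minimal sketch. It shows the generic textbook form of the technique, where a small "student" model is penalized for diverging from a large "teacher" model's softened output probabilities. It is not Apple's or Google's actual training code; the function names, the temperature value, and the example logits are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper: training would adjust the student to make this small.
    """
    p = softmax(teacher_logits, temperature)  # large model's "soft targets"
    q = softmax(student_logits, temperature)  # small model's predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up scores over three candidate next words.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(teacher, close_student))  # smaller value
print(distillation_loss(teacher, far_student))    # larger value
```

A student whose scores track the teacher's gets a loss near zero, while one that ranks the words differently gets a larger loss. Repeating this over many examples is what lets a much smaller model approximate the larger one's behavior.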

Source

This is a fact-based summary from The Actual News of a story originally published by Ars Technica.