Ahead of the announcement of iOS 18, which is expected to be packed with AI features, Apple researchers published a paper highlighting how they’re training a new large language model (LLM).
Called MM1, this model can process text and visual information together. The paper was submitted last week and offers an interesting look at the importance of various architectural components and data choices. The researchers say they were able to “demonstrate that for large-scale multimodal pre-training using a careful mix of image-caption, interleaved image-text, and text-only data is crucial for achieving state-of-the-art (SOTA) few-shot results across multiple benchmarks, compared to other published pre-training results.”
In addition, they showed that “the image encoder together with image resolution and the image token count has a substantial impact, while the vision-language connector design is of comparatively negligible importance.”
MM1 is a family of multimodal models with up to 30B parameters, consisting of both dense models and mixture-of-experts (MoE) variants. According to the paper, these models are SOTA in pre-training metrics and achieve competitive performance after supervised fine-tuning on a range of established multimodal benchmarks.
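To make the idea of a “careful mix” of pre-training data more concrete, here is a minimal Python sketch of weighted sampling across the three data types the researchers mention. This is not Apple’s code: the function names and the mixture weights are hypothetical placeholders chosen for illustration, not values taken from the paper.

```python
import random
from collections import Counter

# Hypothetical mixture weights, for illustration only.
# These are NOT the ratios reported in the MM1 paper.
MIX_WEIGHTS = {
    "image_caption": 0.4,
    "interleaved_image_text": 0.4,
    "text_only": 0.2,
}


def sample_sources(n, weights=MIX_WEIGHTS, seed=0):
    """Draw n data-source names in proportion to the given weights."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    names = list(weights)
    return rng.choices(names, weights=[weights[k] for k in names], k=n)


def mixture_counts(n, weights=MIX_WEIGHTS, seed=0):
    """Count how often each data source was picked across n draws."""
    return Counter(sample_sources(n, weights, seed))


if __name__ == "__main__":
    # Print how many of 10,000 simulated training examples came from each source.
    for source, count in mixture_counts(10_000).items():
        print(f"{source}: {count}")
```

In a real training pipeline each draw would decide which dataset the next example comes from, so changing the weights changes how much of each data type the model sees.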
Apple’s AI features could include Google’s or OpenAI’s functions
Apple has teased its AI applications for almost a year now. In the past two earnings calls, the company’s CEO has said Apple has many features to announce. More interestingly, while Apple has been publishing papers and teasing upcoming AI features, Bloomberg’s Mark Gurman reported that Apple is also in talks to use Google Gemini with iOS 18.
Apple is apparently in talks with Google to license Gemini after having previously considered OpenAI’s ChatGPT.
While there’s no telling if Apple will partner with Google, the move wouldn’t necessarily be surprising. Gemini already powers generative AI features on the Pixel 8 and the Galaxy S24. The latter certainly made an impression earlier this year, and one of its highlights comes from Google.
That said, there’s a lot to expect from Apple. BGR will make sure to let you know about all the company’s upcoming AI features.