Apple has released a detailed technical report on its new foundation AI models, offering a rare look into the architecture, training processes, and data strategies powering Apple Intelligence in iOS 18 and beyond. The report outlines significant advances in both on-device and cloud-based models, emphasizing efficiency, scalability, and privacy. Notably, Apple’s on-device model is split into two blocks to reduce memory usage and latency without sacrificing output quality. The cloud model introduces a Parallel-Track Mixture-of-Experts (PT-MoE) architecture, which improves speed and quality by activating only the specialized subnetworks a given input needs. Apple also reports a 275% increase in multilingual representation in its training data, expanding support for non-English languages through improved data sourcing and tokenizer enhancements. Training data was sourced from public web content, licensed material, synthetic data, and over 10 billion image-caption pairs, with a strong focus on privacy and content quality. This technical transparency signals Apple’s commitment to closing the gap with competing AI labs while maintaining its reputation for privacy and device-centric intelligence.
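
To make the PT-MoE idea concrete, here is a minimal sketch of a mixture-of-experts layer with sparse top-k token routing, run across parallel tracks. This is an illustrative simplification, not Apple's implementation: the track count, expert count, dimensions, and the averaging merge are all assumptions, and Apple's report describes full transformer tracks with interleaved MoE layers rather than the single feed-forward block per track shown here.

```python
# Toy PT-MoE-style layer: parallel tracks, each a sparse mixture-of-experts
# feed-forward block with top-k routing. All hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    def __init__(self, d_model: int, n_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model). Each token is routed to only its top-k
        # experts, so most expert parameters stay inactive per token.
        scores = self.router(x)                         # (B, S, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # sparse selection
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                 # tokens sent to expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

class ParallelTracks(nn.Module):
    """Run several MoE tracks side by side and merge their outputs."""
    def __init__(self, d_model: int, n_tracks: int = 2, n_experts: int = 4):
        super().__init__()
        self.tracks = nn.ModuleList(MoEFeedForward(d_model, n_experts)
                                    for _ in range(n_tracks))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.stack([t(x) for t in self.tracks]).mean(dim=0)

tokens = torch.randn(2, 16, 64)        # (batch, seq, d_model)
model = ParallelTracks(d_model=64)
print(model(tokens).shape)             # torch.Size([2, 16, 64])
```

The design point this illustrates is the efficiency claim: compute scales with the number of experts actually activated per token, not the total parameter count, while independent tracks can in principle execute in parallel.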

Key Points

- On-device model split into two blocks, reducing memory usage and latency by 37.5% while preserving output quality.
- Cloud-based model uses a Parallel-Track Mixture-of-Experts (PT-MoE) architecture for modular, efficient, and scalable AI processing.
- Multilingual data in training increased from 8% to 30%, with tokenizer size expanded by 50% to 150K tokens.
- Training data sources include public web content, licensed content, synthetic data, and over 10 billion image-caption pairs.
- Apple maintains a privacy-first approach, filtering low-quality data and respecting web crawler exclusion protocols.
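
As a quick sanity check on these numbers: the jump from 8% to 30% multilingual data is exactly the 275% relative increase cited above, and a tokenizer that grew by 50% to 150K tokens implies a prior vocabulary of roughly 100K. The 100K figure is inferred from the stated percentages, not quoted from the report:

```python
# Consistency check on the reported multilingual figures.
old_share, new_share = 0.08, 0.30   # multilingual share of training data
print(f"relative increase: {(new_share - old_share) / old_share:.0%}")  # 275%

new_vocab = 150_000                 # expanded tokenizer vocabulary
print(f"implied prior vocab: {new_vocab / 1.5:,.0f} tokens")  # 100,000 tokens
```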