Introducing Apple's On-Device and Server Foundation Models #
This document summarizes Apple's approach to developing on-device and server-based foundation models for its new personal intelligence system, Apple Intelligence.
Key Features #
- Specialized for everyday tasks: Apple Intelligence models are trained to perform specific tasks like writing, summarizing, and image generation.
- On-device and server-based models: A 3 billion parameter on-device model is designed for speed and efficiency, while a larger server-based model is used for more complex tasks.
- Dynamic adaptation: Adapters, small neural network modules, fine-tune the base models for specific tasks, allowing them to specialize on the fly for the user's current activity.
- Emphasis on privacy: Apple does not use private user data or user interactions for model training, and relies on on-device processing and Private Cloud Compute to protect privacy.
- Responsible AI principles: Apple follows a set of principles to ensure AI is used responsibly, including empowering users, preventing bias, designing for safety, and protecting privacy.
Model Development #
Pre-Training #
- Apple uses its open-source AXLearn framework for efficient model training.
- Training data is sourced from licensed data, publicly available data, and AppleBot web-crawling, with options for web publishers to opt out.
- Personal data is filtered out, and content is curated for quality.
Post-Training #
- Human-annotated and synthetic data are used in training.
- Novel algorithms, including rejection sampling fine-tuning and reinforcement learning from human feedback (RLHF), are used to improve instruction-following capabilities.
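Rejection sampling fine-tuning can be sketched simply: sample several candidate responses per prompt, keep only the highest-scoring one under a reward model, and fine-tune on that pair. The `generate` and `reward` callables below are hypothetical placeholders, not Apple's implementation.

```python
import random
from typing import Callable

def rejection_sampling_step(
    prompt: str,
    generate: Callable[[str], str],       # samples one candidate response
    reward: Callable[[str, str], float],  # scores a (prompt, response) pair
    k: int = 8,
) -> str:
    """Sample k candidates and return the best-scoring one.

    In rejection sampling fine-tuning, the selected (prompt, response)
    pair is then added to the supervised fine-tuning dataset.
    """
    candidates = [generate(prompt) for _ in range(k)]
    return max(candidates, key=lambda r: reward(prompt, r))

# Toy demo: a "generator" that emits random numbers as strings and a
# "reward model" that prefers larger numbers.
random.seed(0)
best = rejection_sampling_step(
    "pick a big number",
    generate=lambda p: str(random.randint(0, 100)),
    reward=lambda p, r: float(r),
)
```

RLHF instead optimizes the policy directly against the reward model; the two techniques are complementary stages of post-training.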
Optimization #
- Numerous techniques are employed for speed and efficiency, including grouped-query attention, shared input/output embedding tables, and low-bit palletization.
- On-device inference achieves a time-to-first-token latency of about 0.6 milliseconds per prompt token and a generation rate of 30 tokens per second.
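Palletization compresses weights by replacing each full-precision value with an index into a small shared lookup table (the "palette"). A minimal 2-bit sketch using plain k-means clustering, illustrative only; Apple's scheme (including its mixed low-bit configurations) is more involved:

```python
import numpy as np

def palletize(weights: np.ndarray, bits: int = 2, iters: int = 20):
    """Cluster weights into 2**bits shared values via simple k-means."""
    k = 2 ** bits
    # Initialize centroids spread evenly across the weight range.
    centroids = np.linspace(weights.min(), weights.max(), k)
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        idx = np.abs(weights[:, None] - centroids[None, :]).argmin(axis=1)
        # Recompute each centroid as the mean of its assigned weights.
        for j in range(k):
            if np.any(idx == j):
                centroids[j] = weights[idx == j].mean()
    return centroids, idx  # the palette, plus per-weight 2-bit indices

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)
palette, indices = palletize(w)
reconstructed = palette[indices]
# Storage drops from 32 bits per weight to 2 bits per weight
# plus a tiny palette of 4 shared values.
```

The compression ratio is what enables a 3-billion-parameter model to fit comfortably in device memory.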
Model Adaptation #
- Adapters are used to fine-tune models for specific tasks.
- Adapter parameters are trained efficiently with a rapid retraining infrastructure.
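Adapters of this kind are commonly implemented as low-rank (LoRA-style) updates: a frozen base weight matrix plus a small trainable low-rank correction that can be swapped per task. Apple's post mentions rank-16 adapters; everything else in this sketch is illustrative.

```python
import numpy as np

class LoRALinear:
    """A frozen base weight W plus a trainable low-rank update B @ A."""

    def __init__(self, w: np.ndarray, rank: int = 16, alpha: float = 16.0):
        out_dim, in_dim = w.shape
        self.w = w                                        # frozen base weights
        self.a = np.random.randn(rank, in_dim) * 0.01     # trainable
        self.b = np.zeros((out_dim, rank))                # trainable, zero-init
        self.scale = alpha / rank

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # Base projection plus the adapter's low-rank correction.
        # With b zero-initialized, the output initially matches the base model.
        return x @ self.w.T + self.scale * (x @ self.a.T @ self.b.T)

w = np.random.randn(8, 4)          # pretend frozen base layer
layer = LoRALinear(w, rank=2)
x = np.random.randn(3, 4)
y = layer(x)
```

Only `a` and `b` (rank × (in_dim + out_dim) parameters) are trained and stored per task, while the large base weight `w` stays shared across all adapters, which is what makes rapid per-feature retraining cheap.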
Performance and Evaluation #
- Human evaluation is used to assess model performance, focusing on user experience.
- Evaluation includes feature-specific adapters and general capabilities.
- Human graders preferred Apple's models over comparable open-source and commercial models for safety, helpfulness, and instruction-following.
- Adversarial prompts are used to assess model robustness and safety.
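Side-by-side human evaluations like these are typically summarized as win rates over pairwise comparisons. A minimal tally sketch (the grade labels are hypothetical, not Apple's grading rubric):

```python
from collections import Counter

def win_rates(grades: list[str]) -> dict[str, float]:
    """Summarize pairwise grades ('win', 'tie', 'loss') as fractions."""
    counts = Counter(grades)
    n = len(grades)
    return {k: counts[k] / n for k in ("win", "tie", "loss")}

grades = ["win", "win", "tie", "loss", "win"]
rates = win_rates(grades)  # → {'win': 0.6, 'tie': 0.2, 'loss': 0.2}
```

Reporting ties separately matters: a model can have a low outright win rate yet still be preferred-or-tied in the large majority of comparisons.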
Conclusion #
Apple's foundation models and adapters power Apple Intelligence, a new AI system integrated into iOS, iPadOS, and macOS. These models are designed to help users with everyday tasks while prioritizing privacy and responsible AI development.
"Our models have been created with the purpose of helping users do everyday activities across their Apple products, and developed responsibly at every stage and guided by Apple’s core values."
Important Quotes #
"Apple Intelligence is comprised of multiple highly-capable generative models that are specialized for our users’ everyday tasks, and can adapt on the fly for their current activity."
“We have applied an extensive set of optimizations for both first token and extended token inference performance.”
"We continue to adversarially probe to identify unknown harms and expand our evaluations to help guide further improvements."
Other information #
- Apple also offers Personal Voice, a voice-replication accessibility feature introduced in iOS 17.
- Apple hosted a Natural Language Understanding workshop in 2023.