Explore the new system architecture of Apple silicon Macs

Discover how Macs with Apple silicon will deliver modern advantages using Apple's System-on-Chip (SoC) architecture. Leveraging a unified memory architecture for CPU and GPU tasks, Mac apps will see amazing performance benefits from Apple silicon tuned frameworks such as Metal and Accelerate. Learn about new features and changes coming to boot and security, and how these may affect your applications.

Related Videos

WWDC 2020
- Optimize Metal Performance for Apple silicon Macs
- Port your Mac app to Apple silicon

Hello and welcome to WWDC.
Hi. I'm Gavin. I'm in the Core OS group, and my team have been working on bringing macOS to Apple Silicon.
So I'm delighted to get to introduce some of the changes coming in these systems.
We're going to talk about new features and how to take advantage of them in your macOS applications. We'll go over some security enhancements, and we'll touch on application compatibility. Then I'll hand over to my colleague, Anand, who'll be taking you through boot features and recovery.
Intel-based Macs contain a multi-core CPU, and many have a discrete GPU. Recent Macs also have a T2 chip, which enables features such as Apple Pay, Touch ID and Hey Siri.
Machines with a discrete GPU have separate memory for the CPU and GPU.
Now, the new Apple Silicon Macs combine all these components into a single system on a chip, or SoC.
Building everything into one chip gives the system a unified memory architecture.
This means that the GPU and CPU are working over the same memory. Graphics resources, such as textures, images and geometry data, can be shared between the CPU and GPU efficiently, with no overhead, as there's no need to copy data across a PCIe bus.
Using Apple Silicon in the Mac also allows us to bring unique technologies developed for the iPhone and iPad over to the Mac. Apple Silicon contains coprocessors, including powerful and efficient video encoders and decoders, the Neural Engine and matrix multiplication machine learning accelerators.
The Mac has had a multi-core CPU for years, but for Intel-based Macs, all cores have similar performance.
Apple Silicon Macs have a mix of performance cores for when your application needs the maximum performance, and more power-efficient cores for less CPU-intensive tasks.
We call this asymmetric multiprocessing, or AMP.
The cores support the same architectural features and can run all the same software. macOS will use all these cores simultaneously, and applications are scheduled onto the appropriate cores depending on their current performance requirements.
So, how should your applications take advantage of these new capabilities from macOS? You might be expecting us to announce new APIs for you to adopt in your applications. But we've been working for years to build a consistent set of APIs across all our platforms and to optimize those frameworks for Apple Silicon.
To run work on the GPU, you should be using the Metal API on both Intel-based and Apple Silicon Macs.
On Apple Silicon, you'll just see a significant speed boost when running tasks that benefit from the unified memory architecture.
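As a rough illustration of what sharing memory between the CPU and GPU looks like in Metal, here is a minimal sketch. It assumes a Mac with a Metal-capable GPU; the array contents and variable names are made up for the example and do not come from the session.

```swift
import Metal

// Assumes at least one Metal-capable GPU is present.
guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("No Metal device available")
}

let input: [Float] = [1, 2, 3, 4]
let byteCount = input.count * MemoryLayout<Float>.stride

// .storageModeShared asks for one allocation that both the CPU and the GPU can access.
guard let buffer = device.makeBuffer(bytes: input,
                                     length: byteCount,
                                     options: .storageModeShared) else {
    fatalError("Buffer allocation failed")
}

// The CPU writes through the same memory that a later GPU pass would read.
let values = buffer.contents().bindMemory(to: Float.self, capacity: input.count)
values[0] = 42

// Reports whether the CPU and GPU on this device share one memory pool.
print("Unified memory:", device.hasUnifiedMemory)
```

The same code runs on Intel-based Macs; the difference described above is in what the hardware has to do behind the scenes to keep both sides in sync.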
To take advantage of the hardware video encoders and decoders, you can use the same AVFoundation and VideoToolbox frameworks that are in macOS today. To get the very best performance, you'll want to use the pixel formats that the hardware is optimized for. Apple Silicon is particularly efficient at handling BiPlanar formats, such as this one.
I'm not going to attempt to read that, but just look out for the ones with BiPlanar in the name.
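The slide is not reproduced in this transcript, so the exact constant shown is unknown. As an assumption, the sketch below uses one of the BiPlanar constants that CoreVideo defines, `kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange`, and the 1920x1080 size is arbitrary.

```swift
import CoreVideo

// One of CoreVideo's BiPlanar 4:2:0 formats (chosen here as an example only).
let biPlanarFormat = kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange

// A settings dictionary of the kind AVFoundation and VideoToolbox accept
// when you specify the pixel format for decoded frames.
let outputSettings: [String: Any] = [
    kCVPixelBufferPixelFormatTypeKey as String: biPlanarFormat
]

// Allocating a buffer in that format directly: one luma plane plus one
// interleaved chroma plane.
var pixelBuffer: CVPixelBuffer?
let status = CVPixelBufferCreate(nil, 1920, 1080, biPlanarFormat, nil, &pixelBuffer)

if status == kCVReturnSuccess, let pixelBuffer = pixelBuffer {
    print("Plane count:", CVPixelBufferGetPlaneCount(pixelBuffer))
}
```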
Your same Core ML code can run on any Mac. The functionality is available on Intel-based Macs too, but on Apple Silicon, Core ML is much faster and more efficient, and it takes advantage of the Neural Engine and the machine learning accelerators.
Your Core ML code should just run on the Neural Engine without you needing to make any changes.
You might want to check that you're not explicitly configuring your model to run on cpuOnly, or cpuAndGPU. To be eligible to run on the Neural Engine, you want computeUnits set to "all," which is also the default.
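A minimal sketch of that check follows. The `loadModel(at:)` helper and its `compiledModelURL` parameter are hypothetical names for this example; the point is only the `computeUnits` setting.

```swift
import CoreML
import Foundation

// Hypothetical helper: loads a compiled model while leaving Core ML free to
// choose between the CPU, the GPU and the Neural Engine.
func loadModel(at compiledModelURL: URL) throws -> MLModel {
    let configuration = MLModelConfiguration()
    // .all is already the default; setting it explicitly guards against an
    // earlier .cpuOnly or .cpuAndGPU left over in the code.
    configuration.computeUnits = .all
    return try MLModel(contentsOf: compiledModelURL, configuration: configuration)
}
```

Whether a given model actually runs on the Neural Engine is still decided by Core ML; this setting only keeps it eligible.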
On Apple Silicon, you can also leverage the machine learning accelerators more directly using the Accelerate framework.
And, of course, the Accelerate, Compression and simd frameworks all have highly tuned implementations for both Intel-based and Apple Silicon Macs.
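As a small example of handing matrix work to Accelerate, the sketch below multiplies two 2x2 matrices with `vDSP_mmul`. The matrices are made-up values, and which hardware unit ends up executing the call is up to the framework, not something this code controls.

```swift
import Accelerate

// Two 2x2 matrices stored in row-major order.
let a: [Float] = [1, 2,
                  3, 4]
let b: [Float] = [5, 6,
                  7, 8]
var c = [Float](repeating: 0, count: 4)

// C = A * B, where A is M x P, B is P x N and C is M x N (here M = N = P = 2).
vDSP_mmul(a, 1, b, 1, &c, 1, 2, 2, 2)

print(c) // [19.0, 22.0, 43.0, 50.0]
```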
We have two key pieces of advice when it comes to AMP. First, make sure you're setting the quality of service, or QoS, on all of your work items. These QoS properties are an indication to macOS of how work should be prioritized: whether an action needs to be completed at the highest performance possible, or whether the OS should be prioritizing power efficiency.
Setting QoS correctly is important on all our platforms, but it's particularly important on platforms with AMP, as QoS is a factor in determining which core a task will be run on.
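Here is a minimal sketch of tagging work with a QoS class through Dispatch. The queue label `com.example.indexing` and the printed messages are placeholders for this example.

```swift
import Dispatch

let group = DispatchGroup()

// Work the user is actively waiting on: favor performance.
DispatchQueue.global(qos: .userInitiated).async(group: group) {
    print("Rendering the preview the user just asked for")
}

// Longer-running work the user is not blocked on: favor efficiency.
DispatchQueue.global(qos: .utility).async(group: group) {
    print("Downloading updated assets")
}

// A private queue can carry its own QoS as well.
let indexingQueue = DispatchQueue(label: "com.example.indexing", qos: .background)
indexingQueue.async(group: group) {
    print("Rebuilding the search index")
}

// Only needed so that this standalone script does not exit early.
group.wait()
```

The QoS class describes intent; which core each block lands on remains the scheduler's decision.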
My second piece of advice is to use Grand Central Dispatch.
Again, this is just good advice on all our platforms, but, again, it's particularly important on AMP systems.
Why? Well, dividing up work across multiple cores is particularly tricky when those cores have very different performance characteristics. For optimal performance, you need to distribute the right proportion of the task to each thread.
API in Grand Central Dispatch, like concurrentPerform, can help with the hard work of distributing tasks optimally to run in parallel across all cores.
When using API like this, make sure you're breaking your task over a large enough number of iterations. This will help the system to load balance effectively.
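A minimal sketch of that pattern: summing an array by splitting it into many more chunks than there are cores. The figure of 256 chunks is an arbitrary choice for this example, not a recommendation from the session.

```swift
import Dispatch

let input = (0..<100_000).map { Double($0) }

// Many small iterations let the system hand more of them to faster cores.
let chunkCount = 256
let chunkSize = (input.count + chunkCount - 1) / chunkCount
var partialSums = [Double](repeating: 0, count: chunkCount)

partialSums.withUnsafeMutableBufferPointer { sums in
    DispatchQueue.concurrentPerform(iterations: chunkCount) { chunk in
        let start = chunk * chunkSize
        let end = min(start + chunkSize, input.count)
        // Each iteration writes only its own slot, so no locking is needed.
        sums[chunk] = input[start..<end].reduce(0, +)
    }
}

let total = partialSums.reduce(0, +)
print(total)
```

Splitting the work into exactly one chunk per core would leave the performance cores idle while the efficiency cores finish their equal-sized share.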
These frameworks have been in macOS for years, so there's plenty more documentation if you'd like to learn more. A great starting point will be these WWDC sessions.
And the Metal team have a couple of new sessions this year that are all about Metal on the Apple Silicon Macs.
Okay, so that was macOS on Apple Silicon. Now let's move on to talking about security.
Building our own Silicon has enabled us to develop awesome security features for the iPhone, and we're excited to bring these protections to the Mac while making sure not to lose any of the capability that makes a Mac what it is.
These features include write XOR execute, kernel integrity protection, pointer authentication and device isolation.
Apple Silicon enforces a restriction called write XOR execute. That means that memory pages can be either writable or executable, but never both at the same time.
Pages that are both writable and executable can be a dangerous security vulnerability. However, many modern applications embed just-in-time compilers to support languages such as Java or JavaScript.
These JIT compilers frequently rely on memory being both writable and executable.
So, we're adding new API that allows memory to be quickly toggled between writable and executable permissions. What's really cool is that this works per-thread, so two threads can see different permissions for the same page. This makes it easy to adopt in multi-threaded JITs. And it's going to enable JIT compilers that are both fast and secure.
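The transcript does not name the API. The sketch below assumes it is `pthread_jit_write_protect_np`, used together with `mmap` and the `MAP_JIT` flag, as shipped in macOS 11. The `installJITCode` helper is a hypothetical name, and apps that enable the Hardened Runtime would additionally need the JIT entitlement, which is not shown here.

```swift
import Darwin

// Hypothetical helper: copies machine code into a MAP_JIT region and
// returns a pointer to it, or nil if the mapping fails.
@available(macOS 11.0, *)
func installJITCode(_ instructions: [UInt32]) -> UnsafeMutableRawPointer? {
    let size = instructions.count * MemoryLayout<UInt32>.stride
    let mapping = mmap(nil, size,
                       PROT_READ | PROT_WRITE | PROT_EXEC,
                       MAP_PRIVATE | MAP_ANON | MAP_JIT, -1, 0)
    guard let region = mapping, region != MAP_FAILED else { return nil }

    // For the current thread only: JIT memory becomes writable, not executable.
    pthread_jit_write_protect_np(0)
    instructions.withUnsafeBytes { bytes in
        region.copyMemory(from: bytes.baseAddress!, byteCount: size)
    }
    // Back to executable, not writable, again only for this thread.
    pthread_jit_write_protect_np(1)

    // Make sure the instruction cache sees the newly written code.
    sys_icache_invalidate(region, size)
    return region
}

#if arch(arm64)
if #available(macOS 11.0, *),
   let code = installJITCode([0x5280_0540,    // mov w0, #42
                              0xD65F_03C0]) { // ret
    typealias Generated = @convention(c) () -> Int32
    let function = unsafeBitCast(code, to: Generated.self)
    print("JIT-generated function returned", function())
}
#endif
```

Because the toggle is per-thread, a compiler thread can write new code while other threads keep executing code that is already in the region.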
Apple Silicon has hardware support in the memory controller to make the OS kernel code immutable. Once the kernel has been loaded into memory, kernel integrity protection prevents pages containing kernel code from being modified or additional pages from being made executable. This blocks attacks that would inject new code into the kernel while it's running.
Pointer authentication prevents misuse of pointers, and it can harden against attacks such as return-oriented programming. Unused bits in 64-bit pointers are used to store a pointer authentication code, which is then checked when the pointer is used.
Right now, we're enabling use of this in our kernel, system applications and system services. We're not yet ready for you to start distributing your applications with pointer authentication. But if you're interested to experiment, then there's a boot-arg that you can set so you can try this out for yourself.
PCIe devices access system memory through an IOMMU.
On Intel-based Macs, macOS gives all devices a shared view of system memory.
On Apple Silicon, all devices are given separate memory mappings. This restricts devices to only accessing memory that they were intended to. And it prevents devices from snooping on each other.
To set up a DMA transfer in a PCIe device driver, you should use the IOMapper and IODMACommand API.
Make sure you're getting the IOMapper from your device and then passing that when you're configuring an IODMACommand.
Some older drivers don't use this API and just use getPhysicalSegment on IOMemoryDescriptor directly. That's not going to work, and those drivers will need updating to the newer API before porting over to the new platform. Now, you'd only be using this old API in an IOKit driver written with a kernel extension.
Kernel extensions are still supported, but you're going to see increased inconvenience for both you, as a developer, and for your users. The last three security features I introduced all impact kernel extension development.
To be able to support kernel integrity protection, we had to change how macOS loads kernel extensions, which means this now requires a reboot.
And pointer authentication. If you develop a kernel extension, you are going to need to enable pointer authentication. And as we continue to improve the platform, you should expect to see more friction around kernel extensions.
We introduced DriverKit last year in Catalina to enable you to build drivers that run in user space, which improves system stability and security.
If you're not already looking into DriverKit for any drivers you develop, then now's the time to do so.
Here are some resources to help you learn more and get started with DriverKit.
Okay, that was security. Now, let's take a look at application support on this platform.
Rosetta is our translator to run existing x86_64 applications. It runs all kinds of apps: macOS apps, Catalyst apps, games and complicated apps like web browsers with embedded JIT compilers.
Apps using Metal will directly generate the right commands for the Apple GPU, and translated apps that use Core ML get to run on the Neural Engine.
The performance and compatibility of Rosetta were only possible through Apple Silicon and software teams working closely together.
Now, Rosetta sets to work right from the moment your application is being installed. Triggered by the App Store or the package installer, Rosetta will start translating all the executable code in your application. If your application doesn't use one of our installers, then you may see an extra bounce or two in the dock the first time it's launched, as we'll start translating it then.
And security is deeply integrated into this translation process.
Translations of your application are all code-signed, tied to a single machine, securely stored, and get refreshed over OS updates.
When your application is launched, we load our stored translation. Rosetta then fully emulates an x86_64 process, right down to the system call interface. Everything in the process is translated, including all system frameworks.
If Rosetta encounters code that wasn't translated at install time, then we'll compile it on the fly.
And Rosetta maintains the security you'd expect with hardened run-time protections, all fully enforced on processes running in Rosetta.
Now, hopefully, everything should just work, but if you do need to debug or profile your app, well, that's all fully supported. You can build and run translated apps directly from Xcode, and you can profile from Instruments. You could also use command-line tools like LLDB.
There are some differences between processes running on an Intel-based and Apple Silicon Mac. Page size, memory ordering rules, the frequency of mach_absolute_time and some details of floating point behavior, these all change.
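For native code, two of these differences can be handled by asking the system rather than hard-coding values. The sketch below is one way to do that; the helper name `nanoseconds(fromTicks:)` is made up for this example.

```swift
import Darwin

// Convert mach_absolute_time ticks to nanoseconds using the timebase the
// system reports, instead of assuming one tick equals one nanosecond.
func nanoseconds(fromTicks ticks: UInt64) -> UInt64 {
    var timebase = mach_timebase_info_data_t()
    mach_timebase_info(&timebase)
    return ticks * UInt64(timebase.numer) / UInt64(timebase.denom)
}

let start = mach_absolute_time()
usleep(1_000) // stand-in for the work being timed
let elapsed = nanoseconds(fromTicks: mach_absolute_time() - start)

// Query the page size at run time rather than assuming 4096.
let pageSize = Int(getpagesize())

print("Elapsed ns:", elapsed, "page size:", pageSize)
```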
For applications running in Rosetta, we've made sure that everything matches behavior on an Intel-based Mac. Now, Rosetta does not support the AVX vector extensions to x86.
Applications should already be checking whether the machine supports AVX before trying to use it. There's a sysctl you can use if you need to do so.
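The session does not name the sysctl. The sketch below assumes it is `hw.optional.avx1_0`, the key macOS uses to report AVX support; the `sysctlFlag` helper is a hypothetical name for this example.

```swift
import Darwin

// Hypothetical helper: reads an integer sysctl as a Boolean, returning nil
// if the key does not exist on this machine.
func sysctlFlag(_ name: String) -> Bool? {
    var value: Int32 = 0
    var size = MemoryLayout<Int32>.size
    guard sysctlbyname(name, &value, &size, nil, 0) == 0 else { return nil }
    return value != 0
}

// Treat a missing key the same as "not supported".
let hasAVX = sysctlFlag("hw.optional.avx1_0") ?? false

if hasAVX {
    print("Taking the AVX code path")
} else {
    print("Falling back to a non-AVX code path")
}
```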
Also, you will see some limitations running on the Developer Transition Kit, as there are some compatibility restrictions on that hardware. The DTK release notes have more information.
And finally, if your application does need to know when it's being run in Rosetta, then we have added a sysctl, sysctl.proc_translated, to check for this.
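A minimal sketch of that check, with `isTranslatedByRosetta` as a hypothetical function name:

```swift
import Darwin

// Returns true when running under Rosetta, false when running natively,
// and nil if the query fails for an unexpected reason.
func isTranslatedByRosetta() -> Bool? {
    var translated: Int32 = 0
    var size = MemoryLayout<Int32>.size
    if sysctlbyname("sysctl.proc_translated", &translated, &size, nil, 0) == -1 {
        // ENOENT means the key is unknown to this OS, so nothing is translated.
        return errno == ENOENT ? false : nil
    }
    return translated == 1
}

switch isTranslatedByRosetta() {
case .some(true):  print("Running under Rosetta")
case .some(false): print("Running natively")
case .none:        print("Could not determine translation status")
}
```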
Okay, that's Rosetta. Of course, what your customers really want is a native arm64 port of your application.
We have a ton of great information for you on porting and optimizing your applications on the developer documentation website.
And there's a whole session full of advice around porting your applications, so please go check that out, and please get started on a native port.
And for the first time, compatible iPad and iPhone apps will also be available on the Mac.
