Defending the Apple Neural Engine (ANE)

In a discussion about Apple Research’s open-source MLX machine learning framework on Hacker News yesterday, a comment proclaimed:

ANE is probably the biggest scam “feature” Apple has ever sold.

This is a recurring observation, with many inferring that because Apple’s very own MLX doesn’t use the ANE, the ANE must therefore be useless. An “if they don’t eat their own dogfood” sort of deal.

As some background, Apple added the ANE subsystem in 2017 with the iPhone X. It is silicon dedicated to running quantized, forward-pass neural networks for inference. Apple’s intention with the circuitry was to enable enhanced OS capabilities, and the immediate use of the chip was to power Face ID. That first silicon offered up some 0.6 TOPS (trillion operations per second), and was built with power efficiency as a driving requirement.

The next year Apple released a variant with 5 TOPS, roughly an 8x speed improvement. Then 6, 11, 16, 17, and then 35 TOPS (though that last jump is likely just a switch in the measure from FP16 to INT8). In every generation the ANE has been restricted to specific types of models that fit within its constraints. It was never intended to power NN training tasks, massive models, and so on.

It was NN-dedicated hardware meant to provide low-power but high-enough-performance assistance for OS features and functions. Other chipmakers soon started adding similar neural engines to their chips to address the same need: Qualcomm, Intel, AMD, Huawei, Samsung … everyone got in on NPUs for the same reasons. You aren’t going to run ChatGPT on it, but it still holds loads of utility for the platform.

And the system heavily uses the ANE now. Every bit of text and subject matter is extracted from images, both in your freshly taken photos and even while just browsing the web, courtesy of the ANE. Many don’t even realize this, and it’s a barely heralded feature: you can search your photo library for a random snippet of text, even heavily distorted text, and you can highlight and copy text off your photos, and even off images found on random websites in Safari when using Apple Silicon, all at virtually zero power cost. That’s the ANE. After you’ve triggered Siri with “Hey Siri”, voice processing and TTS are handled by the ANE. Some of the rather useless genAI stuff is powered by the ANE. Computational photography, and even things like subject detection for choosing what to focus on, is powered by the ANE hardware.

All of this happens with a marginal impact on battery life and without impeding the CPU or GPU cores in performing other tasks.

It’s pretty clear that Apple intended the ANE as hardware for the OS to use; third-party apps just weren’t a consideration or priority, nor did Apple make the ANE a part of its messaging. In 2018 they did enable CoreML to leverage the ANE for some limited cases, and even then the OS throttles the capacity you can use to ensure that the OS itself is never left waiting when it needs the hardware.
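For illustration, here is a minimal Swift sketch of that CoreML path today (the “MyClassifier” model name is hypothetical; the computeUnits value is a preference, and CoreML falls back to the CPU or GPU for anything the ANE can’t run):

```swift
import CoreML
import Foundation

// Hypothetical example: ask CoreML to keep inference on the CPU and ANE.
func loadModel() throws -> MLModel {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine  // .all also allows the GPU

    // "MyClassifier.mlmodelc" stands in for any compiled model bundled with the app.
    guard let url = Bundle.main.url(forResource: "MyClassifier", withExtension: "mlmodelc") else {
        throw CocoaError(.fileNoSuchFile)
    }
    return try MLModel(contentsOf: url, configuration: config)
}
```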

So why doesn’t MLX use the ANE at all? The authors have specifically stated why: the only public way of using the ANE subsystem is by creating and running models through CoreML, which is entirely orthogonal to the purpose and mandate of MLX. Obviously Apple Research could reach into the private innards and use it if they wanted, but MLX is an open-source project, so that simply isn’t viable.

Apple added tensor cores to the GPU in their most recent chips (the M5 and A19 Pro), calling them “neural accelerators”. These are fantastic for training and for complex models (including BF16), at the cost of orders of magnitude more power usage. They also give Apple a path to massively scale up their general-purpose AI bona fides, adding more NA units per GPU core and more GPU cores per device (copy/paste scaling), especially on the desktop, where power isn’t as much of a concern and active cooling is available.

Apple is unlikely to move the existing OS NNs to these new tensor cores. The two blocks have very different driving philosophies and serve different roles.

Nor is there any indication that Apple is abandoning CoreML (another claim that pops up in MLX-related discussions). Apple Research put out MLX to, rightfully, try to capture some of the attention of the PyTorch et al. community, and it has been wildly successful, but it doesn’t supplant CoreML.

If you have a consumer app for Apple devices and you run NNs for inference to enable features, odds are high that your best bet is CoreML (which will use the GPU, the GPU neural accelerators, the ANE, and the CPU as appropriate and available).
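As a sketch of what that looks like in practice (the “FeatureRanker” model, the “image” input name, and the tensor shape are all hypothetical placeholders), you load the model with .all and let CoreML schedule each part of the graph wherever it runs best:

```swift
import CoreML
import Foundation

// Hypothetical example: load a bundled model and let CoreML dispatch each
// layer to whichever of CPU, GPU, or ANE is appropriate and available.
func runInference() throws -> MLFeatureProvider {
    let config = MLModelConfiguration()
    config.computeUnits = .all

    guard let url = Bundle.main.url(forResource: "FeatureRanker", withExtension: "mlmodelc") else {
        throw CocoaError(.fileNoSuchFile)
    }
    let model = try MLModel(contentsOf: url, configuration: config)

    // Placeholder input; a real app would fill the array with actual data.
    let input = try MLMultiArray(shape: [1, 3, 224, 224], dataType: .float32)
    let features = try MLDictionaryFeatureProvider(dictionary: ["image": input])
    return try model.prediction(from: features)
}
```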

People seem prone to all-or-nothing stuff like this, thinking it’s all losers and winners and everything is binary. It’s reminiscent of Google unveiling Fuchsia, where every tech board like HN had the prognosticator of all prognosticators declaring that the day of Linux, ChromeOS, Android, and so on was over. It’s a Fuchsia world now, baby.

Years later and Fuchsia powers a Nest device, and largely seems to be a dead project. So…maybe not?