We talk about the Mac, but ultimately the Mac was a vehicle for GUIs – icons, mice, scrolling, toolboxes, painting. We talk about the iPhone, but the iPhone turned out to be about mobile photography and apps and touch. Get too hung up on the (very pricey) initial hardware, and you might miss some of the potential of what Apple is doing with what they call spatial computing. And you’ll also miss how some of these threads in interaction design started – as with those other examples – long before Apple.
This launch didn’t quite sell us on the Vision (Pro)
Okay, first – having watched decades of Apple launch events, I think it’s worth collectively admitting that this one didn’t really land its opening arguments. There was no clear sense of why you’d actually want the Vision Pro in that opening salvo. Disney streaming Disney+ into your eyes didn’t help; nor did seeing floating versions of apps or (eesh) Word and Excel documents. No one, post-pandemic, is excited about Zoom calls. (Precious time I could be playing that Death Stranding port you just promised!) And the main feature they chose to humanize VR – computer-generated eyes displayed on the front of the headset – seems to have come across to most folks as creepy and dystopian, not futuristic. (See more notes below.)
I guess there was an audience there, though they were also a bit unhappy about that price.
They did like the product, at least – which makes me wonder why Apple didn’t broadcast audience sound.
Don’t worry too much; Nintendo accidentally let the Wii’s awkward name become the story when it first launched, until people got used to it and the Wii became a revolutionary game console. Some of the best products are slow burns. But this explains why much of the initial everyday reaction was “this is an expensive thing that I have no use for” – because it is an expensive thing they have no use for. Many tech journalists seem to be failing to acknowledge that.
Now, it’s pretty clear that the price marks this as a device for early adopters only. The “Pro” moniker, the fact that it isn’t shipping until next year, and the choice to launch at a developer conference should make that abundantly clear.
But let’s get back to whether Vision Pro is relevant – including to those of us who probably will not be shelling out money any time soon for futuristic ski goggles.
A faster horse
Jobs was regularly quoted as saying that Henry Ford once quipped, “If I’d asked customers what they wanted, they would have told me, ‘A faster horse!’” It’s a perfect Jobsian quote, in that it’s convenient but … fabricated. Henry Ford almost certainly never said that and… sheesh, no one should be quoting Henry Ford anyway.
I’d flip this assumption around. It’s often developers and users who really define what technologies come to mean.
Back to that iPhone launch, it’s worth reflecting on how far the initial presentation was from what the device would become. Remember when the Safari browser was going to be “the SDK”? Or the discussion of patents for interaction? (Jobs stopped the keynote to mention patents, but that didn’t stop Google from copying the platform – much to Jobs’ legendary personal chagrin.) At the time, those details already made my ears prick up, just because I’d known people who had worked on multi-touch and touch gestures for years, and open development seemed a must (and both of those things would prove important in the end). Vision Pro seems a similar moment – with a similarly long history in interaction design.
It’s worth rewatching the iPhone launch, for anyone wanting to make historical comparisons. It’s apparent what was in that launch that was missing yesterday – Jobs personally, but also a lot of the way the keynote worked. Having an audience made a huge difference; I think Vision Pro suffered from having to be launched in a virtual void. Jobs empathized with his audience and used pacing, punchlines … humor. And of course there was the “three things” reveal – “a widescreen iPod with touch controls, a revolutionary mobile phone, and a breakthrough Internet communications device.”
But listen closely to that moment and the confusion over “breakthrough Internet communications device.” The audience going quiet helps the punchline land, but it also says two things: a) Apple got this when the audience didn’t, and b) Apple didn’t know what to call it. And yet that was of course what the thing really was – a breakthrough internet device. We didn’t have a name for it then. Now we’d just say… an iPhone.
Revolutionary as they were, when Apple introduced the Macintosh, the iMac, the iPod, the iPhone, the iPad, even the Apple Watch, people already knew (more or less) what the product category was and why they wanted it. They’d seen typewriters, phones, watches. No one says “I want an immersive pair of ski goggles to put giant TVs in my face and that projects an AI avatar of my eyes to terrify my family.”
I remain unconvinced as to whether this sort of headset will ever catch on, or whether it’ll stay a niche device constrained by physics. Apple’s made a better demo of the tech, but not enough to really resolve that question in either direction. But I do think the enabling tech underneath could prove useful in some form – and having a slick package like this helps make the argument for that.
(Oh, and as for that “faster horse” thing. Please. I’m from Louisville, Kentucky. People are absolutely looking for a faster horse. More than a Model T, as it happens.)
Vision Pro: spatial computing showcase
Like the iPhone, Vision Pro is the sum of a lot of parts – a lot of core enabling tech. They’re bundled into these big goggles, but that doesn’t mean they’ll always be in that form – that original Macintosh is now in a new form in your pocket and maybe on your wrist. Don’t expect magical miniaturization, either – displays and optics don’t just “inevitably” start shrinking. (That’s also why Apple took the battery off the headset.) But these interactions themselves might appear in other forms.
Think about what we saw yesterday – with some emphasis here:
- An immersive, high-resolution dual display made to be wearable
- Optics to fit those displays to your eyes (evidently including if you wear glasses – keenly interesting to me)
- Low-latency image performance to reduce motion sickness (again, a major point)
- Eye tracking interface
- Predictive, behavior-based UI elements
- Device-free, gesture-based interactions (like using your hands)
- Spatial/immersive audio (which we’ve seen before)
- Ray-traced 3D audio (which we’ve also seen before, but from GPU makers, not from a company like Apple that can implement this in hardware)
Some of this gets potentially creepy, meaning you do want to be careful about whose device you buy and how they’re using these interactions and their data. But there have been years of work on how those interactions would work:
And then you have the platform:
- visionOS, sharing APIs with iOS apps in particular and Apple platforms (watchOS, tvOS, macOS) more generally
- Common toolsets like Unity – jeez, I should have bought some stock before yesterday, huh? (Expect Unreal Engine to follow, too, whatever bad blood there’s been between Epic and Apple)
- App Store for distribution
- Apple Silicon (M2 architecture, plus a specialized chipset)
For anybody who says that Apple isn’t the same without Steve Jobs, the Vision Pro sure feels like it has some relatives in Jobs’ product history. It’s fitting that just last week, The Verge released a documentary on the Lisa. A ton of Lisa technology ultimately made its way to the Mac, along with most of the product team – and got popularized on a more affordable, less business-oriented, more consumer-oriented, better-looking platform. (Sound familiar?)
Or there’s the NeXT platform, which failed (despite a boost from H. Ross Perot) before … turning into the very foundation of the OS you see today on Vision Pro, not to mention the iPhone and Mac. (I can’t find the quote, but Bill Gates reportedly said “Develop for it? I’ll piss on it” of the NeXT cube… only to absolutely develop for it later, after the NeXT software became the Mac operating system.)
Or there’s the Pixar Image Computer, a high-end visualization machine that Jobs oversaw as a product after he acquired the company, which failed (see the pattern), only to lead to some of the foundational CGI tech that later drove Pixar and Disney and is still in use today. (Pixar’s Universal Scene Description, while not directly related to that box, had its provenance in development efforts from around the same time.)
Even the iPhone only became possible after the BlackBerry, the Palm Pilot, and yeah, the Newton and General Magic.
It’s accessibility, stupid
The most exciting application of all of these technologies right now is not an expensive gadget, but the continued development of natural interaction with machines. That’s an old, old, old, old idea – but it continues to improve.
You can interact with eyes, voice, or both. You don’t need all fingers and two hands.
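To make that concrete: one long-standing pattern from this accessibility research is dwell-time selection, where holding your gaze on a target for a moment “clicks” it – no hands required. Here’s a minimal sketch of that idea; the function name, threshold, and synthetic gaze stream are my own illustration, not Apple’s API.

```python
# Minimal sketch of dwell-time gaze selection (illustrative, not Apple's API):
# holding your gaze steadily on one target long enough triggers a "click".

DWELL_THRESHOLD = 0.5  # seconds of continuous gaze required to select

def dwell_select(gaze_samples, threshold=DWELL_THRESHOLD):
    """gaze_samples: list of (timestamp_seconds, target_id or None).
    Returns the first target gazed at continuously for `threshold` seconds,
    or None if no dwell completes."""
    current, start = None, None
    for t, target in gaze_samples:
        if target != current:
            # Gaze moved to a different target (or away): restart the timer.
            current, start = target, t
        elif current is not None and t - start >= threshold:
            return current
    return None

# Synthetic ~60 Hz gaze stream: the user glances at "cancel",
# then settles on "ok" long enough to trigger a selection.
samples = [(i / 60, "cancel") for i in range(10)] + \
          [(i / 60, "ok") for i in range(10, 50)]
print(dwell_select(samples))  # → ok
```

A real implementation would add smoothing for gaze jitter and visual feedback as the dwell timer fills – but even this tiny loop shows why gaze-only input can stand in for a mouse click.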
Part of the fundamental problem with ableism is just how entitled all of us can be when we haven’t encountered an obstacle. Plenty of products don’t even work when you’re left-handed. What if you mash your hand in a car door? What if you have unexpected vision impairment? We just experienced a global pandemic that brought a tidal wave of health changes.
Here’s a good illustration of how important this is, and how it’s elevating accessibility in the Apple developer conversation – not just for visionOS, but beyond:
And audio ray tracing also is making people perk up:
More on that from a couple years back:
CDM readers alone include both blind and deaf users. (Yes, there are absolutely deaf musicians – and if that doesn’t make sense to you, that’s probably because your conception of what deafness is comes from ableism.)
Myles de Bastion is a deaf musician and technologist, and sure enough, he also homes in on these features. Must-read:
He has a ton of great ideas – any one of them more interesting than what Apple showed yesterday, but then that’s why it’s so important to get these technologies to developers.
Consider, too, the potential benefits for individuals with fine motor disabilities who struggle with traditional computer inputs such as mice and keyboards. If the Vision Pro’s eye-tracking technology is harnessed in a way that allows completely hands-free control of the device, it could unlock a new realm of digital accessibility and independence.
And to Myles’ specific use case:
For me, as a Deaf technology user, the most groundbreaking aspect of the Vision Pro, however, lies in its capacity to display facial expressions on its front screen. Facial expressions are a critical component of ASL (American Sign Language), providing critical context, grammar and emotional tone. By capturing and showcasing these expressions on the front screen, the Vision Pro offers a richer, more authentic Sign Language conversation experience than ever before.
He also points out that the disability community has nearly “half a trillion dollars in disposable income.”
Those on Wall Street selling off Apple stock may be making a mistake.
Just don’t make isolation a selling point
Tech right now in general seems fixated on finding the next iPhone-like launch, and selling people on expensive gadgets. That can get ahead of whether anyone wants the things.
But after years of pandemic and social fragmentation, selling people in the middle of layoffs and economic downturns on an expensive gadget they use alone in their room seems a little misguided. I’m not the only one who feels that way:
Contrast Nintendo’s recent marketing for Switch, which emphasizes face-to-face interaction and sharing. The Switch is at face value really unimpressive tech – it’s an NVIDIA mobile platform with some 90s-style arcade controls slapped on it. But that also means it easily gets out of your way. And its killer feature is games – many of them games from the 90s that people still love.
People totally want to play Zelda: Tears of the Kingdom more than they want to play “Zoom and MS Word.”
But ironically, that’s exactly the narrative Apple missed – that spatial computing opens up new ways to bring people together. There’s potential to make computing more multiplayer, in the same room and across distances. It can help people who are isolated – relevant as folks struggle with, for example, immune deficiencies. It can make complex 3D concepts easier to see and 3D work easier to do. (That means more than just a 3D heart model, as shown yesterday, of course.) And spatial computing can make interactions possible across a range of physical and cognitive abilities.
All of this can happen across Apple’s platforms, which is part of why there’s no reason to worry about the company’s future. Maybe you don’t want to wear goggles, but you’ll wind up with audio raytracing in Apple headphones or gestural interactions on iOS or eye-tracking on the Mac as an accessibility feature.
And as with this work by artist/technologist Phoenix Perry from over a decade ago, there’s a long history of exploration with gestures and immersive virtual environments:
All the Apple hype in this case is only likely to add more fuel to those investigations.
Whether this particular hardware endures is another matter. The tech demo was impressive. What we’re still missing, though, is more of the actual humans involved. Spatial computing in the end will be about the humans using it, not those headsets.
I’ll be curious to hear back from developers looking deeper into this.