If Minority Report has become the benchmark by which gestural interaction is judged, that was always intentional. The film’s production team wanted to work with the people actually developing science fiction-like technology. And it is, genuinely, sci-fi-like technology.
So, let’s not talk about how cool the clip above looks – not that it doesn’t look cool. After all, most of what you actually see on the screen is stuff you can do with a desktop computer and some projectors. The real question is what benefit you get from really nailing gestural input. It’s the input that matters.
Even if you engage only your right brain on this, there’s quite a lot that’s impressive – the very properties proponents of this kind of interface have been advocating for many years:
- The interface is 3D. Not to overstate the obvious here, but the ability to intuitively navigate in 3D is no small matter. This sort of interface might not work for detailed 3D modeling, but for quicker, more comfortable 3D navigation, the mouse and mouse wheel have always been woefully inadequate. The mouse is fundamentally designed as a 2D pointing device, which is why it requires awkward conventions like WASD keyboard navigation in 3D games. Joysticks work for spatial navigation (ask your friendly fighter pilot, who relies on them in life-or-death situations). But actually moving stuff around in 3D requires something different.
- Gestures are intuitive. We hear a lot about gestures, but these are actual, human gestures – the kinds of motions you’d make to a person, the kinds you’d use when running a dog around an agility course. (And, believe me, if you can keep up with a border collie, you’ve got a good interface!)
- It’s collaborative. Here’s an experiment: share your mouse with a friend. How’d that work out for you?
- It could help navigate information. This is actually the least convincing part of the demo to me – but I think that’s an opportunity. We’ve had a chicken-and-egg problem: our interface is 2D, so our information is 2D. Sure, there’s the odd exception, like Google Earth – but how much time do you spend in Google Earth compared to Google Maps? Thought so. Some of the demos here remind me of Apple’s 1990s tag navigation interface for the Web. Others return to the odd, needlessly 3D photo-organizing app model that seems to permeate these demos. (And until you can shout “enhance” at your computer like on Star Trek to see some tiny area of an image, I wonder how useful that will be.) I think we have to re-learn how to organize information in three dimensions, having done it in two dimensions for so long.
- It blurs the lines between computing and performance. The reason we focus so much on live performance on this site is that, at its heart, it’s all about real-time communication. If you can make something work live onstage, or live in a club in front of drunken people, you’ve probably mastered it on some important level.
g-speak is really, truly brilliant work – not just as a video demo, but from what I can see, in the detailed work they’ve done with the gestural interface and the way screens are networked together. To say that it’s “the first major step” in interfacing since 1984 would require ignoring the extensive work done on this sort of interface. Look back to Myron Krueger’s work in the 1970s, which predated even today’s UI as we know it, and to work looking more like this in the years since. Then again, maybe that’s the point. This isn’t about novelty; on the contrary, it’s trying to work out how to design interfaces connected to metaphors and human physical wiring that pre-date the invention of computers.
Odds are, you can’t afford Oblong’s platform. But that leaves tons of other possibilities this sort of thing could inspire. Musicians already know that moving your hand around in space with no tactile feedback makes precision challenging – the Theremin requires years of practice to master, eludes many would-be players, and limits certain kinds of controls. (Oh yeah … the Theremin also came before 1984. Quite a few years before 1984 … think 1919.)
In other words, we’re now seeing the first realization of the level of sophistication we knew was coming. But it’s only one implementation, so look out for more gestural interface development in the future. And now that people are nailing the input/output method, the bigger challenge comes next: content.