Tuesday, September 4, 2012

Reading #3:  Proton: Multitouch Gestures as Regular Expressions

The world needs more multi touch interfaces...


Introduction

This paper discusses the interesting concept of expressing touch-based gestures as regular expressions so that they can be easily checked for conflicts.  It was written by researchers at the University of California, Berkeley in conjunction with those at Pixar Animation Studios.  They are:

Kenrick Kin is a PhD candidate at Berkeley who works primarily with multi touch interfaces.  His webpage is found at http://www.cs.berkeley.edu/~kenrick/
Björn Hartmann is a co-director at the Berkeley Institute of Design and the Swarm Lab.  His research involves multi-user computing and the systems research that is necessary to make it possible.  http://www.cs.berkeley.edu/~bjoern/
Tony DeRose is a senior scientist and lead of the Pixar Research Group.  Aside from this project, his recent research involves making math and science more inspiring for middle and high school students.  http://graphics.pixar.com/people/derose/index.html
Maneesh Agrawala is Kenrick's advisor for this project and an associate professor in electrical engineering and computer science at Berkeley.  His focus is on how cognitive design principles can be used to improve the effectiveness of visual displays.  http://vis.berkeley.edu/~maneesh/

Summary

Generally, when developers create multi touch applications, they design and implement custom gestures from scratch.  Like mouse-based frameworks, multi touch frameworks expose only low-level touch events, and developers register callbacks to handle them.  According to this paper, the resulting recognition code ends up spread across the source of these applications, and a new gesture can also conflict with previously defined gestures.  The Proton framework attempts to consolidate the gesture recognition code into one place and express the defined gestures as regular expressions, so that conflicts between similar gestures can be detected.

Gesture matching works much like a finite-state machine.  With each touch event (a RegEx symbol) that is performed, the number of gestures (expressions) that can still match diminishes.
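
To make the idea concrete, here is a minimal Python sketch of matching gestures one touch event at a time.  It is my own illustration, not the Proton implementation: the symbol names (D1s, M1s, U1s for touch 1 going down, moving, and lifting up on a shape), the three gestures, and every function name are assumptions made up for this example.  Each gesture is limited to a sequence of symbols where a symbol may optionally repeat, which is only a small subset of what regular expressions can express.

```python
# Hypothetical gesture table. Each step is (symbol, repeatable); a repeatable
# step behaves like "symbol*" in a regular expression.
GESTURES = {
    "tap": [("D1s", False), ("U1s", False)],
    "drag": [("D1s", False), ("M1s", True), ("U1s", False)],
    "two_finger": [("D1s", False), ("M1s", True), ("D2s", False),
                   ("M2s", True), ("U2s", False), ("U1s", False)],
}


def closure(steps, positions):
    """Add every position reachable by skipping repeatable (starred) steps."""
    result = set(positions)
    stack = list(positions)
    while stack:
        i = stack.pop()
        if i < len(steps) and steps[i][1] and i + 1 not in result:
            result.add(i + 1)
            stack.append(i + 1)
    return result


def advance(steps, positions, symbol):
    """Consume one touch event and return the new set of live positions."""
    nxt = set()
    for i in positions:
        if i < len(steps) and steps[i][0] == symbol:
            # A repeatable step stays in place; a plain step moves forward.
            nxt.add(i if steps[i][1] else i + 1)
    return closure(steps, nxt)


def match_stream(events):
    """Return the live candidates after each event and the full matches."""
    live = {name: closure(steps, {0}) for name, steps in GESTURES.items()}
    history = []
    for symbol in events:
        live = {name: advance(GESTURES[name], pos, symbol)
                for name, pos in live.items()}
        live = {name: pos for name, pos in live.items() if pos}
        history.append(sorted(live))
    matched = sorted(name for name, pos in live.items()
                     if len(GESTURES[name]) in pos)
    return history, matched


history, matched = match_stream(["D1s", "M1s", "M1s", "U1s"])
print(history)
print(matched)
```

In this sketch, all three gestures are still possible after the first touch-down, the tap is ruled out as soon as the finger moves, and only the drag survives the final lift.  Feeding it just ["D1s", "U1s"] reports both "drag" and "tap" as matches, which is the kind of conflict between gesture definitions that the paper wants to catch ahead of time.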

Related Work

I'm guessing that a large amount of research has been performed on this subject, given that the paper cites 44 references.  However, the researchers DID miss a few, so here they are:

GeForMTjs:  A JavaScript library that represents gestures in an abstract way: http://www.springerlink.com/content/c416331055107117/fulltext.pdf
An extension of the Proton Framework, the Proton++ Framework:  http://kneecap.cs.berkeley.edu/papers/protonPlusPlus/protonPlusPlus-UIST2012.pdf

These researchers essentially left no paper unreferenced.

Evaluation

At the moment, the gesture matcher can only handle users performing one gesture at a time.  Handling simultaneous gestures would require an FSM that grows exponentially with the number of gestures defined.
The paper did not report any user feedback, even though the authors wrote three example applications for testing.  The team plans to perform a more extensive study later.
The evaluation that they did perform was all objective.  The quantitative data reports that the algorithm required 22 ms to match against a set of 36 gestures on a 2.2 GHz dual-core Intel processor.  Qualitatively, the authors indicate that the program could potentially be sped up by optimizing it, and by computing the FSM before the gesture is made.

Discussion

This work definitely has practical use in the world, and as far as I know, it is a novel concept.  Even though I never thoroughly learned about using regular expressions, this paper was comprehensible to me.  I could foresee multi touch application developers making use of this framework, after it has been optimized.
