Accelerometer-based Gestures and Screen Orientation Capabilities of the Openmoko Neo FreeRunner, Alpha 2 Release

by Paul V. Borza, www.borza.ro

Introduction

Human-phone interaction is still limited to the keyboard and touch screen on most mobile devices currently on the market, although some phones are capable of sensing much more. This is exactly the case with the Openmoko Neo FreeRunner, which embeds two accelerometers that can measure accelerations in a range of ±2 g or ±8 g.

The alpha 2 release of the accelerometer-based gestures and screen orientation project was conceived by Paul V. Borza, mentored by Daniel Willmann, and supported by Google Summer of Code 2008 and Openmoko Inc. This release includes:

- a database of 12 predefined gestures (shake-shake, forward-backward, horizontal circle, left, right, up, down, left return, right return, up return, down return, and the z gesture);
- a gesture manager with a graphical user interface that lets any user train these predefined gestures;
- a gesture recognizer that notifies the user of the recognized gesture;
- support for all four possible screen orientations (portrait, inverted portrait, left landscape, and right landscape).

A live video demonstration of the alpha 2 release, running on the Openmoko 2008.8 distribution, is available at www.youtube.com/watch?v=K2S2rQUETwc.

Vision

Here's a simple scenario that you've probably performed today. Your phone is sitting on the table and someone calls you: your phone is ringing. You pick it up halfway to see who's calling. You push the green answer button and bring the phone to your ear to talk. Don't you feel that you're doing something extra that you shouldn't be doing? Why do you push the green answer button? The phone should know to answer the call automatically, because you're bringing it to your ear.
You basically do two things, and one of them is extra. When you push the answer button, you let the phone know that you intend to talk to the other person. Stop! By moving the phone to your ear, you're already making a gesture that says you intend to answer the call. So why do you still need to push the answer button? The accelerometer-based gestures project tries to remove such repetitive and unnecessary tasks that you perform daily on ordinary mobile phones.

How fast can you switch your mobile phone to silent? You will probably need to press several buttons, navigate the graphical user interface, and finally select the silent profile (not to mention repeating these steps to change back to your previous profile), so it will take a while. Here's how the Neo FreeRunner will change to silent mode as of the beta 1 release: you just put the Neo FreeRunner on the table with the screen facing down. Unbelievable... switching to silent mode on the Neo FreeRunner will be that easy, and so will other tasks, like going to the main menu when the phone is shaken, or automatically lighting up the screen when it is lifted from the table. For other use cases requested by the community, please read wiki.openmoko.org/wiki/Gestures.

How it Works

The top accelerometer is read continuously to detect automatically whether a dynamic motion signal is present or not; this is well suited to gestures. We're using a so-called motion end-point detector, which is based on an extremely efficient two-class trivariate Gaussian classifier. The classifier filters out obvious static acceleration, but the ultimate decision on the movement boundary is left to the gesture recognizer. The application can also be configured to use the bottom accelerometer, but the models then need to be re-created and re-estimated, because the top accelerometer is tilted by 45° on X and Y.

The core of the gesture recognizer uses hidden Markov models (i.e.
left-to-right continuous density hidden Markov models), as do most state-of-the-art speech recognition systems. To understand hidden Markov models, please read my BS Thesis, available on gestures.borza.ro, where I explain the whole recognition process using hidden Markov models. Specific hidden Markov model algorithms, namely Forward-Backward, Viterbi, and Baum-Welch, were used respectively to calculate the likelihood of a model (gesture), to decode it into states, and to train it. There's no need to go into further detail here on how the real-time end-point detection flow and the real-time recognition flow run.

For Developers

The accelerometer-based gestures and screen orientation project is structured in three sub-applications:

- 'gesd', the gesture recognizer daemon;
- 'gesl', the gesture listener daemon;
- 'gesm', the gesture manager, supporting both advanced command line options and a basic graphical user interface.

The basic theoretical components are the signal processing module, the decoder module, and the adaptor module.

Developers who are only interested in listening for gestures and screen orientation changes should attach to the D-Bus system bus and listen for the 'org.openmoko.accelges.Recognizer.Recognized' signal. The gesture recognizer announces gestures over D-Bus so that other developers can take advantage of the gesture and screen orientation capabilities of the Neo FreeRunner in their own applications as easily as possible. Developers can further customize the gestures by creating new ones, visualizing them, or training them, using the command line options of the gesture manager; more on wiki.openmoko.org/wiki/Gestures.

The installation package can be downloaded from accelges.googlecode.com/files/accelges_0.1.0-svnr204-r2_armv4t.ipk and installed using 'opkg'.
The commands that typically need to be run on the Openmoko Neo FreeRunner in order to use the gestures are '/etc/init.d/gesd-neo2 start' and '/etc/init.d/gesl start'. To listen for signals, you should use the FSO distribution and run 'mdbus -s -l'. That's everything you need to know for now to get started. Enjoy!

Improvements

The alpha 2 release constantly uses 20% of the CPU because of the floating point operations being computed. The beta 1 release will use fixed point operations instead, which is feasible because the computations are already done on a logarithmic scale to avoid underflow.

Recognition accuracy can also be improved by introducing a dictionary and moving from hidden Markov models of full-length gestures to models of smaller gestures (similar to using phonemes instead of full words), together with language grammars (i.e. bigrams) and the time-synchronous Viterbi beam search algorithm. Take the z gesture, for example: one can model it as a full-fledged gesture, or as a sequence of three gestures, namely a right gesture, a down-diagonal gesture, and another right gesture, mapped accordingly in the dictionary.

Conclusion

Currently comprising 6195 lines of C99 code, and available as open source under the LGPL, this project promises to change the way we interact with mobile devices by making the Neo FreeRunner adapt to your needs and by enhancing the mobile computing experience.