Yesterday we demonstrated our gaze-navigated robot at the Microsoft Robotics event here at ITU Copenhagen. The "robot" transmits a video stream, which is displayed on a client computer. By using an eye tracker we can direct the robot towards where the user is looking. The concept allows for human-machine interaction with a direct mapping of the user's intention. The Danish National TV (DR) came by today and recorded a demonstration. It will be shown tonight on the nine o'clock news. Below is a video that John Paulin Hansen recorded yesterday which demonstrates the system. Please note that the frame rate of the video stream was well below average at the time of recording. It worked better today. In the coming week we'll look into alternative solutions (suggestions appreciated). The project has been carried out in collaboration with Alexandre Alapetite from DTU. His low-cost, LEGO-based rapid mobile robot prototype offers interesting possibilities for testing human-computer and human-robot interaction.
The virgin tour around the ITU office corridor (on YouTube)
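For readers curious how a gaze point on the video image could be turned into a driving command, here is a minimal sketch. It is not the code running on our robot: the function name, the differential-drive model, the `turn_gain` and `dead_zone` parameters and the idea that looking higher in the frame means "drive faster" are all assumptions made for illustration.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def gaze_to_drive(gaze_x, gaze_y, width, height, turn_gain=0.5, dead_zone=0.1):
    """Map a gaze point on the video frame to (left, right) wheel speeds.

    Hypothetical mapping: horizontal offset from the frame centre steers,
    vertical position sets forward speed (top of frame = full speed).
    Speeds are normalised to the range [-1, 1].
    """
    # Horizontal offset from the centre, -1 (far left) to +1 (far right)
    turn = clamp((gaze_x - width / 2) / (width / 2), -1.0, 1.0)
    # Vertical position, 1 at the top of the frame and 0 at the bottom
    forward = clamp(1.0 - gaze_y / height, 0.0, 1.0)
    # Ignore small offsets so tracker jitter does not make the robot wobble
    if abs(turn) < dead_zone:
        turn = 0.0
    # Differential drive: speed up one wheel and slow the other to turn
    left = clamp(forward + turn_gain * turn, -1.0, 1.0)
    right = clamp(forward - turn_gain * turn, -1.0, 1.0)
    return left, right


if __name__ == "__main__":
    # Looking at the right edge, halfway up a 640x480 frame
    print(gaze_to_drive(640, 240, 640, 480))
```

The dead zone around the centre is a design choice worth keeping in any variant: gaze estimates are noisy, and without it the robot would constantly make small steering corrections.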
Abstract Experimental gaze interaction techniques are typically prototyped from scratch using proprietary libraries provided by the manufacturers of eye tracking equipment. These libraries provide gaze data interfaces, but not any of the additional infrastructure that is common to the implementation of such techniques. This results in an unnecessary duplication of effort. In this paper, a framework for implementing gaze selection techniques is presented. It consists of two components: a gaze library to interface with the tracker and a set of classes which can be extended to implement different gaze selection techniques. The framework is tracker and operating system independent, ensuring compatibility with a wide range of systems. Support for user testing is also built into the system, enabling researchers to automate the presentation of test targets to users and record relevant test data. These features greatly simplify the process of implementing and evaluating new interaction techniques. The practicality and flexibility of the framework are demonstrated by the successful implementation of a number of gaze selection techniques.
van Tonder, M., Cilliers, C., and Greyling, J. 2008. A framework for gaze selection techniques. In Proceedings of the 2008 Annual Research Conference of the South African institute of Computer Scientists and information Technologists on IT Research in Developing Countries: Riding the Wave of Technology (Wilderness, South Africa, October 06 - 08, 2008). SAICSIT '08, vol. 338. ACM, New York, NY, 267-275. DOI= http://doi.acm.org/10.1145/1456659.1456690
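To make the two-component idea in the abstract more concrete, here is a rough sketch of what such a split could look like: a tracker-independent gaze source on one side and an extensible selection-technique base class on the other, plus a tiny trial runner that logs selections. This is not the authors' framework. Every class and function name below (`GazeSource`, `SelectionTechnique`, `run_trial` and so on) is hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class GazeSample:
    t: float  # timestamp in seconds
    x: float  # screen coordinates in pixels
    y: float


@dataclass(frozen=True)
class Target:
    name: str
    x: float
    y: float
    w: float
    h: float

    def contains(self, s: GazeSample) -> bool:
        return self.x <= s.x < self.x + self.w and self.y <= s.y < self.y + self.h


class GazeSource(ABC):
    """Hypothetical tracker-independent interface (the 'gaze library' part)."""

    @abstractmethod
    def samples(self) -> Iterator[GazeSample]:
        ...


class RecordedGazeSource(GazeSource):
    """Replays stored samples; a real subclass would wrap a vendor SDK."""

    def __init__(self, samples: Iterable[GazeSample]):
        self._samples = list(samples)

    def samples(self) -> Iterator[GazeSample]:
        return iter(self._samples)


class SelectionTechnique(ABC):
    """Base class to extend when implementing a new selection technique."""

    def __init__(self, targets: Iterable[Target]):
        self.targets = list(targets)

    def hit(self, sample: GazeSample) -> Optional[Target]:
        for target in self.targets:
            if target.contains(sample):
                return target
        return None

    @abstractmethod
    def update(self, sample: GazeSample) -> Optional[Target]:
        """Consume one sample; return a target when a selection occurs."""


class FirstHitSelection(SelectionTechnique):
    """Deliberately trivial technique: select as soon as gaze enters a target."""

    def update(self, sample: GazeSample) -> Optional[Target]:
        return self.hit(sample)


def run_trial(source: GazeSource, technique: SelectionTechnique) -> List[Tuple[float, str]]:
    """Feed all samples to the technique and log (timestamp, target name)."""
    log = []
    for sample in source.samples():
        selected = technique.update(sample)
        if selected is not None:
            log.append((sample.t, selected.name))
    return log
```

Swapping `RecordedGazeSource` for a class that reads from a live tracker, or `FirstHitSelection` for a dwell-based technique, would leave `run_trial` unchanged, which is the kind of reuse the abstract argues for.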
Hyung Min Park, Seok Han Lee and Jong Soo Choi from the Graduate School of Advanced Imaging Science, Multimedia & Film at Chung-Ang University, Korea, presented a paper on their Wearable Augmented Reality System (WARS) at the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality. They use a half-blink mode (called "aging") for selection, which is detected by their custom eye tracking algorithms. See the end of the video.
Undisturbed interaction is essential to provide immersive AR environments. There have been a lot of approaches to interact with VEs (virtual environments) so far, especially in hand metaphor. When the user's hands are being used for hand-based work such as maintenance and repair, necessity of alternative interaction technique has arisen. In recent research, hands-free gaze information is adopted to AR to perform original actions in concurrence with interaction [3, 4]. There has been little progress on that research, still at a pilot study in a laboratory setting. In this paper, we introduce such a simple WARS (wearable augmented reality system) equipped with an HMD, scene camera, eye tracker. We propose 'Aging' technique improving traditional dwell-time selection, demonstrate AR gallery – dynamic exhibition space with wearable system.
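As background for the "traditional dwell-time selection" that the abstract says Aging improves on, here is a minimal sketch of a plain dwell selector. It does not implement the Aging technique, whose details are not given here. The class name, the 0.8 second default and the fire-once behaviour are assumptions for illustration.

```python
class DwellSelector:
    """Plain dwell-time selection: select a target after continuous gaze.

    Baseline sketch only; this is NOT the 'Aging' technique from the paper.
    """

    def __init__(self, dwell_time=0.8):
        self.dwell_time = dwell_time  # seconds of continuous gaze required
        self._current = None          # target currently under the gaze
        self._since = None            # time the gaze entered that target
        self._fired = False           # whether this dwell already selected

    def update(self, t, target):
        """Feed one sample: time t (seconds) and the target id under the
        gaze (or None). Returns the selected target id, or None."""
        if target != self._current:
            # Gaze moved to another target (or off all targets): restart
            self._current = target
            self._since = t
            self._fired = False
            return None
        if target is None or self._fired:
            return None
        if t - self._since >= self.dwell_time:
            self._fired = True  # fire once per continuous dwell
            return target
        return None


if __name__ == "__main__":
    selector = DwellSelector(dwell_time=0.5)
    for t, target in [(0.0, "painting"), (0.3, "painting"), (0.5, "painting")]:
        print(t, selector.update(t, target))
```

The weakness of this baseline is that every long look counts as a command, which is why techniques that add an explicit confirmation step are attractive.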
"We did a simple questionnaire-based analysis. The results of the analysis show some promises for implementing gaze-augmented problem-solving interfaces. Users of gaze-augmented interaction felt more immersed than the users of other two modes - dwell-time based and computer mouse. Immersion, engagement, and user-experience in general are important aspects in educational interfaces; learners engage in completing the tasks and, for example, when facing a difficult task they do not give up that easily. We also did analysis of the strategies, and we will report on those soon. We could not attend the conference, but didn’t want to disappoint eventual audience. We thus decided to send a video instead of us. " (from Romans blog)
"The possibilities of eye-tracking technologies in educational gaming are seemingly endless. The question we need to ask is what the effects of gaze-based interaction on user experience, strategy during learning and problem solving are. In this paper we evaluate the effects of two gaze based input techniques and mouse based interaction on user experience and immersion. In a between-subject study we found that although mouse interaction is the easiest and most natural way to interact during problemsolving, gaze-based interaction brings more subjective immersion. The findings provide a support for gaze interaction methods into computer-based educational environments." Download paper as PDF.
Some of this research has also been presented within the COGAIN association, see:
Gowases Tersia (2007) Gaze vs. Mouse: An evaluation of user experience and planning in problem solving games. Master’s thesis May 2, 2007. Department of Computer Science, University of Joensuu, Finland. Download as PDF
"This article proposes an alternative interaction method, the conductor interaction method (CIM), which aims to provide a more natural and easier-to-learn interaction technique. This novel interaction method extends existing HCI methods by drawing upon techniques found in human-human interaction. It is argued that the use of a two-phased multimodal interaction mechanism, using gaze for selection and gesture for manipulation, incorporated within a metaphor-based environment, can provide a viable alternative for interacting with a computer (especially for novice users). Both the model and an implementation of the CIM within a system are presented in this article. This system formed the basis of a number of user studies that have been performed to assess the effectiveness of the CIM, the findings of which are discussed in this work. More specifically the CIM aims to provide the following.
—A More Natural Interface. The CIM will have an interface that utilizes gaze and gestures, but is nevertheless capable of supporting sophisticated activities. The CIM provides an interaction technique that is as natural as possible and is close to the human-human interaction methods with which users are already familiar. The combination of gaze and gestures allows the user to perform not only simple interactions with a computer, but also more complex interactions such as the selecting, editing, and placing of media objects.
—A Metaphor Supported Interface. In order to help the user understand and exploit the gaze and gesture interface, two metaphors have been developed. An orchestra metaphor is used to provide the environment in which the user interacts. A conductor metaphor is used for interacting within this environment. These two metaphors are discussed next.
—A Two-Phased Interaction Method. The CIM uses an interaction process where each modality is specific and has a particular function. The interaction between user and interface can be seen as a dialog that is comprised of two phases. In the first phase, the user selects the on-screen object by gazing at it. In the second phase, with the gesture interface the user is able to manipulate the selected object. These distinct functions of gaze and gesture aim to increase system usability, as they are based on human-human interaction techniques, and also help to overcome issues such as the Midas Touch problem that is often experienced by look-and-dwell systems. As the dialog combines two modalities in sequence, the gaze interface can be disabled after the first phase. This minimizes the possibility of accidentally selecting objects through the gaze interface. The Midas Touch problem can also be further addressed by ensuring that there is ample dead space between media objects.
—Significantly Reduced Learning Overhead. The CIM aims to reduce the overhead of learning to use the system by encouraging the use of gestures that users can easily associate with activities they perform in their everyday life. This transfer of experience can lead to a smaller learning overhead [Borchers 1997], allowing users to make the most of the system’s features in a shorter time."
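The two-phased dialog described above (gaze selects, then gesture manipulates, with gaze ignored during the second phase) can be pictured as a small state machine. The sketch below is only an illustration of that idea, not the CIM implementation. The class name, the string gesture labels and the "release" gesture that returns to the selection phase are all assumptions.

```python
from enum import Enum


class Phase(Enum):
    SELECT = "select"          # phase 1: gaze picks an object
    MANIPULATE = "manipulate"  # phase 2: gestures act on that object


class TwoPhaseInteraction:
    """Hypothetical gaze-then-gesture dialog; names are illustrative only."""

    def __init__(self):
        self.phase = Phase.SELECT
        self.selected = None
        self.actions = []  # log of (object, gesture) pairs

    def on_gaze(self, obj):
        """Gaze input only has an effect during the selection phase.

        `obj` is the object the upstream gaze logic decided was looked at,
        or None when the gaze rests on dead space between objects.
        """
        if self.phase is not Phase.SELECT or obj is None:
            return False
        self.selected = obj
        self.phase = Phase.MANIPULATE  # gaze is ignored from here on
        return True

    def on_gesture(self, gesture):
        """Gestures only have an effect once an object is selected."""
        if self.phase is not Phase.MANIPULATE:
            return False
        if gesture == "release":  # assumed gesture that ends manipulation
            self.selected = None
            self.phase = Phase.SELECT
            return True
        self.actions.append((self.selected, gesture))
        return True


if __name__ == "__main__":
    dialog = TwoPhaseInteraction()
    dialog.on_gaze("violin")
    dialog.on_gaze("piano")   # ignored: gaze is disabled while manipulating
    dialog.on_gesture("move")
    print(dialog.selected, dialog.actions)
```

Ignoring gaze in the second phase is what keeps a stray glance from changing the selection mid-manipulation, which is the Midas Touch safeguard the quoted text describes.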
"We present a 3rd person adventure puzzle game using a novel combination of non intrusive eyetracking technology and voice recognition for game communication. Figure 1 shows the game, and its first person sub games that make use of eye tracker functionality in contrasting ways: a catapult challenge (a) and a staring competition(b)."
"There are two different modes of control in the main game. The user can select objects by looking at them and perform ’look’, ’pickup’, ’walk’, ’speak’, ’use’ and other commands by vocalizing there respective words. Alternatively, they can perform each command by blinking and winking at objects. To play the catapult game for example, the user must look at the target and blink, wink or drag to fire a projectile towards the object under the crosshair. "
Their work was presented at ACM SIGGRAPH 2008 with the associated poster: