FACING VIRTUAL HABITAT:

RANGE-IMAGE FACE RECONSTRUCTION AND REAL-FACED AVATARS

Stanislav Stanek, sstanek@fmph.uniba.sk
Marek Zimányi, zimanyi@fmph.uniba.sk

Abstract

A virtual habitat represents a new way of creating virtual environments inhabited by a virtual population. We present a project aimed at the creation of real-faced empathic avatars. We use the facial feature points (FPs) defined in the MPEG-4 standard for facial animation. From the 3D positions of the FPs we extract head motions and facial expressions for the avatar. We obtain a 3D model of the user's head by range-data sampling and reconstruction; this 3D model is then used to define the 3D model of the avatar.

1. Introduction

We will create an agent with empathic facilities that can communicate with a user and is navigated either by another user or by a computer. Suitable applications include, for example, virtual cities or communication between two or more users over the Internet. The MPEG-4 content-based audiovisual coding standard has finally brought some degree of standardization to facial animation: it defines feature points (FPs) that control facial expressions. This project aims to create a facial animation system that produces realistic empathic facial expressions and head motions of an agent or avatar whose appearance corresponds to the user. We capture the positions and movements of the FPs over time with cameras, and the captured motions are mapped onto a 3D model. Our system can be used in two ways. First, we can pre-capture realistic movements and then map them onto the 3D model of an autonomous agent; such an agent has no user behind it, and computer software with some AI drives its behavior. Second, for communication between users, we capture realistic movements from the user's head in real time and apply them to the avatar, also in real time. (The difference between an autonomous agent and an avatar is discussed in [Qvor01].) We describe the construction of both the appearance and the functionality of the avatar's head.
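To make the two modes concrete, the following Python sketch shows the same model-driving code fed either by pre-recorded FP trajectories or by a live capture loop. All names here (FPFrame, apply_fp_displacements, next_fp_frame) are illustrative assumptions, not an interface defined by MPEG-4 or by our system:

    from dataclasses import dataclass
    from typing import Dict, Tuple

    # One captured sample: 3D positions of the MPEG-4 feature points at time t.
    # Keys are MPEG-4 FP identifiers such as "8.1" (lip) or "3.5" (eye).
    @dataclass
    class FPFrame:
        t: float                                    # timestamp in seconds
        fps: Dict[str, Tuple[float, float, float]]  # FP id -> (x, y, z)

    def animate_agent(recorded_frames, model):
        # Mode 1: autonomous agent driven by pre-captured motion.
        for frame in recorded_frames:
            model.apply_fp_displacements(frame)     # hypothetical model interface

    def animate_avatar(capture, model):
        # Mode 2: avatar driven by the user's head in real time.
        while capture.is_open():                    # hypothetical live FP tracker
            model.apply_fp_displacements(capture.next_fp_frame())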

2. Capturing System

Facial and head motions are captured using optical motion-capture technology. The setup consists of two cameras positioned not directly in front of the user but slightly to the left and to the right; convenient positions are at the top-left and top-right corners of the monitor. An example of a commercial capturing system that uses cameras for facial capture is [Eyem02].
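Before FP positions from the two views can be combined (Section 3), the camera pair has to be calibrated. The paper does not prescribe a toolkit; a minimal sketch using OpenCV's standard checkerboard calibration, assuming synchronized grayscale image lists left_imgs and right_imgs in which the board is detected in every frame pair, might look like this:

    import cv2
    import numpy as np

    PATTERN = (9, 6)   # inner corner count of the printed checkerboard
    SQUARE = 0.025     # checkerboard square size in meters

    # 3D corner coordinates in the board's own coordinate frame.
    board = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    board[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

    def corners(images):
        # Detect the checkerboard corners in each grayscale image
        # (assumes detection succeeds in every image).
        return [cv2.findChessboardCorners(img, PATTERN)[1] for img in images]

    pts_l, pts_r = corners(left_imgs), corners(right_imgs)
    obj = [board] * len(pts_l)
    size = left_imgs[0].shape[::-1]   # (width, height)

    # Per-camera intrinsics, then the rotation R and translation T relating
    # the two monitor-top cameras; these feed the triangulation step.
    _, K1, d1, _, _ = cv2.calibrateCamera(obj, pts_l, size, None, None)
    _, K2, d2, _, _ = cv2.calibrateCamera(obj, pts_r, size, None, None)
    _, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
        obj, pts_l, pts_r, K1, d1, K2, d2, size,
        flags=cv2.CALIB_FIX_INTRINSIC)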

3. Analysis of the Obtained Image Data

First we have to find the 2D projections of the FPs in the images by image processing, aided by the general structure of the human face. Initially the whole picture has to be analyzed, but to find the position of an FP in the next frame it is more efficient to search only in a small area around its previous position. Then we can reconstruct the 3D positions of the FPs, provided they are captured by two cameras or we have some additional information, such as the distances between FPs [Stan00].
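A minimal sketch of both steps, assuming OpenCV, calibrated projection matrices P1 and P2 for the two cameras, and a small template image cut out around each FP (the template-matching search is one possible choice of matcher, not the only one):

    import cv2
    import numpy as np

    WIN = 20  # half-size of the search window around the previous FP position

    def track_fp(frame, template, prev_xy):
        # Search for the FP only in a small window around its previous
        # position; prev_xy is the template's top-left corner in the last frame.
        x, y = prev_xy
        h, w = template.shape
        x0, y0 = max(0, x - WIN), max(0, y - WIN)
        roi = frame[y0:y + h + WIN, x0:x + w + WIN]
        res = cv2.matchTemplate(roi, template, cv2.TM_CCOEFF_NORMED)
        best = cv2.minMaxLoc(res)[3]          # location of the best match
        return (x0 + best[0], y0 + best[1])

    def reconstruct_fp(P1, P2, xy1, xy2):
        # Triangulate one FP from its 2D projections in the two cameras.
        pts1 = np.array([[xy1[0]], [xy1[1]]], np.float64)
        pts2 = np.array([[xy2[0]], [xy2[1]]], np.float64)
        X = cv2.triangulatePoints(P1, P2, pts1, pts2)  # homogeneous 4-vector
        return (X[:3] / X[3]).ravel()                  # Euclidean (x, y, z)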

4. Face Reconstruction using Range Images

A number of techniques have been developed for the reconstruction of surfaces and objects from range images, e.g. [Leon00][Levoy02]. We propose a structured-light hardware setup for head surface reconstruction: a lattice pattern is projected onto the object by a data projector or an overhead projector, and the surface is recovered from the deformation of the pattern in the camera image. Commercial software for this approach exists, e.g. Eyetronic (www.eyetronic.com).
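Once a range image is available, converting it to head geometry is a standard back-projection through the camera model. A minimal sketch, assuming a depth map registered to a pinhole camera with known intrinsics fx, fy, cx, cy (these, like the function name, are assumptions; the paper does not fix the geometry pipeline):

    import numpy as np

    def range_image_to_points(depth, fx, fy, cx, cy):
        # Back-project a range image (depth per pixel, 0 = no measurement)
        # into a cloud of 3D points using the pinhole model u = fx*x/z + cx.
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        pts = np.dstack((x, y, depth)).reshape(-1, 3)
        return pts[depth.ravel() > 0]   # keep only pixels with valid range

A surface mesh then follows from the image grid itself: the points of each 2x2 pixel neighborhood with valid range form a quad that can be split into two triangles.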

5. Motion Mapping and 3D Model Deformations

Our 3D model corresponds to the head of the user from whom the FP positions are captured, so the captured displacements of the user's FPs correspond directly to displacements of the FPs on the 3D model. The texture for the 3D model is obtained from the images captured by the cameras; to map it onto the model we use the detected FP positions, which delimit the important facial areas such as the eyes, mouth, eyebrows and nose. For the deformations we use the Facial Animation Tables [Gach01] defined in MPEG-4.
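The principle of a Facial Animation Table is that each animation parameter lists the model vertices it influences and their displacement per unit of the parameter's amplitude. A minimal sketch of applying such a table follows; the table layout is a simplification of the MPEG-4 definition (which allows piecewise-linear displacement functions), and the vertex indices and displacements are made-up example values:

    import numpy as np

    # fat[fap_id] = list of (vertex_index, dx, dy, dz): displacement of one
    # vertex per unit amplitude of that FAP. Example values only.
    fat = {
        3: [(120, 0.0, -0.8, 0.1), (121, 0.0, -1.0, 0.0)],  # FAP 3: open_jaw
    }

    def apply_faps(vertices, fap_values):
        # vertices: (N, 3) array of the neutral mesh; fap_values maps
        # FAP ids to amplitudes. Returns a deformed copy of the mesh.
        out = vertices.copy()
        for fap_id, amplitude in fap_values.items():
            for idx, dx, dy, dz in fat.get(fap_id, []):
                out[idx] += amplitude * np.array([dx, dy, dz])
        return out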

6. Conclusion and Future Work

We will improve the semi-automatic or automatic selection of FPs. In the future we would also like to improve the acquisition of range images from two cameras.

References

[Eyem02] Eyematic Interfaces. 2002. Professional 3D Facial Animation Toolset at SIGGRAPH 2002. http://www.eyematic.com

[Gach01] GACHERY, S. - MAGNENAT-THALMANN, N. 2001. Designing MPEG-4 Facial Animation Tables for Web Applications. MIRALab, University of Geneva. www.miralab.unige.ch

[Leon00] LEONARDIS, A. et al. (eds.). 2000. Proceedings of the NATO Advanced Research Workshop on Confluence of Computer Vision and Computer Graphics. Kluwer Academic Publishers. ISBN 0-7923-6611-5

[Levoy02] RUSINKIEWICZ, S. - HALL-HOLT, O. - LEVOY, M. 2002. Real-Time 3D Model Acquisition. SIGGRAPH 2002. http://www-graphics.stanford.edu/papers/rt_model/

[Qvor01] QVORTRUP, L. (ed.). 2001. Virtual Interaction: Interaction in Virtual Inhabited 3D Worlds. Springer-Verlag, London Berlin Heidelberg. ISBN 1-85233-331-6

[Stan00] STANEK, S. 2000. New Ways of Facial Motion Capture. Spring Conference on Computer Graphics and Its Applications 2000, Comenius University, Bratislava, pp. 34-35. ISSN 1335-5694