Modeling Reality for Camera Registration in Augmented Reality Applications

One of the central problems of Augmented Reality (AR) is to make reality and virtual objects coincide. Any technique that aims to solve this problem requires an internal representation of the real objects of interest, i.e. the objects that the system expects to see. We name these representations Reality Models, and in this work we take a closer look at the various ways of representing reality in the context of AR. We propose a classification of AR applications based on the requirements they place on Reality Models. AR applications can first be classified into applications where the full 6DOF camera pose is recovered and applications where a 2D object localization in image space is sufficient. We further detail the classification by providing examples of planar fiducials and textured objects in the first case, and of object detection and local pose estimation in the second case. In all provided examples we extend the state of the art by providing new Reality Models or better ways to construct and use existing ones.

Augmented Reality is a powerful and intuitive technology that can be used in various situations, ranging from small smartphone games to large-scale outdoor augmentations for marketing and tourism. However, these diverse possibilities share one common feature: in order to augment reality, at least some parts of the real environment need to be known in advance and modeled in an appropriate way. This prior knowledge is necessary to allow an AR system to build the "link" between the real world and its virtual counterpart.
In this context, the challenge is that the mathematical models used for real objects have to meet the requirements of an AR application: the representation should be sparse to ensure a low memory footprint, robust to the different kinds of alteration that arise with optical sensors (illumination changes, partial occlusions, shape deformation), and at the same time complete enough to allow 3D information (object position or complete camera pose) to be recovered in real time. A look at the state of the art shows that this aspect of Augmented Reality has never been analyzed thoroughly. In this work, our goal is to investigate how real objects and scenes can be efficiently modeled for AR applications, and to develop solutions for reality modeling and pose estimation for different types of AR scenarios. We therefore propose a classification of Reality Models based on the type of real objects used, the nature of the virtual augmentation and the type of registration they permit. In this context, we present novel representations and provide a detailed analysis of their use in AR.

Depending on the type of information available on the real objects present in the scene, and on the type of AR application being built, two different approaches for registration can be developed: a full 3D registration of the camera or a 2D object labeling approach (see Fig. 1). A full 3D registration means that all the parameters of the camera are known. As a consequence, virtual objects can be inserted in the real scene at an exact position, and they appear as if they were completely integrated in the environment. Reality Models for this approach can be further divided into marker-based models and models based on textured objects. 2D object labeling is used when the focus of the application is not the integration of virtual objects in a real scene, but rather the automatic identification of objects or scene parts in order to provide contextual information to the user.
For this approach, the Reality Models can use object detection or extended object detection (Fig. 1).

A. Pagani, DFKI GmbH, Trippstadter Straße 122, 67655 Kaiserslautern, Germany; e-mail: alain.pagani@dfki.de
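
To make the distinction between the two registration approaches concrete, the following minimal sketch contrasts them. It is an illustration under stated assumptions and not the method developed in this work: it assumes a standard pinhole camera with intrinsic matrix K and pose (R, t), and a detector that returns an axis-aligned bounding box. The function names project_point and label_anchor are hypothetical. With a full 3D registration, a virtual 3D point can be projected to its exact pixel position; with 2D object labeling, only an image-space anchor for contextual information is available.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]  # 3x3 matrix stored as nested sequences
Vector3 = Sequence[float]            # 3-vector


def project_point(K: Matrix3, R: Matrix3, t: Vector3, X: Vector3) -> Tuple[float, float]:
    """Full 3D registration (assumed pinhole model): project world point X to pixels.

    Computes x ~ K (R X + t), where (R, t) maps world to camera coordinates.
    """
    # Transform the world point into the camera coordinate frame.
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        raise ValueError("point is behind the camera or on its principal plane")
    # Apply the intrinsics, then divide by the homogeneous coordinate.
    h = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    return (h[0] / h[2], h[1] / h[2])


def label_anchor(box: Tuple[float, float, float, float]) -> Tuple[float, float]:
    """2D object labeling: anchor a label at the center of a detected box.

    The box is (x_min, y_min, x_max, y_max) in pixels; no camera pose is needed.
    """
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)
```

In this sketch, project_point needs the complete set of camera parameters and yields a geometrically consistent position for a virtual object, whereas label_anchor only needs a detection in the image and therefore supports annotation but not an exact 3D insertion.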