Title: | Use of deep learning techniques to improve occlusion handling for augmented reality |
Authors: | Roumaissa Bekiri, Author; Mohamed Chaouki Babahenini, Thesis supervisor |
Document type: | Doctoral thesis |
Publisher: | Biskra [Algeria]: Faculté des Sciences Exactes et des Sciences de la Nature et de la Vie, Université Mohamed Khider, 2024 |
Format: | 1 vol. (134 p.) / ill., cover ill. in color / 30 cm |
Languages: | English |
Keywords: | Augmented Reality, Occlusion, hand pose estimation, deep learning, human-computer interaction, 2D pose, 3D pose |
Abstract: |
Augmented Reality (AR) represents a groundbreaking technological frontier that seamlessly merges the digital and physical worlds. At the core of this technology lies the need for precise and intuitive interaction, and hand pose estimation has emerged as a crucial component in achieving it. Hand pose estimation is also considered more challenging than the estimation of other human body parts because of the hand's small size, greater articulation complexity, and frequent self-occlusions. In this context, we investigate the occlusion problem throughout the interaction. This dissertation first proposes a classical method for resolving occlusion in a dynamic augmented reality system by employing a close-range photogrammetry algorithm. In addition, we create realistic datasets composed of physical scenes captured from cameras at different viewpoints. We further exploit depth map data, which proves to be a valuable strategy for handling occlusion in augmented reality scenarios: the depth map provides essential information about the spatial relationships and distances between objects in the scene, so the system can accurately determine which objects should appear in front of or behind others. This approach has proven instrumental in addressing the persistent challenge of occlusion, allowing seamless, contextually consistent, and more immersive AR experiences. We then extend our study to the online setting. We address the problem of hand pose estimation and present a new regression method operating on monocular RGB images, which aims to tackle occlusion during real-time hand-object interaction. With the advent of deep learning, there has been a shift towards using deep neural networks to learn to grasp and manipulate objects accurately. The proposed framework, called the "ResUnet network", provides effective capabilities for detecting and predicting both 2D and 3D hand poses. This is achieved through three primary modules: feature extraction, which uses a transfer learning technique to extract feature maps; 2D pose regression; and 3D hand pose estimation. Our regression method consistently outperforms current state-of-the-art hand pose estimation approaches, as demonstrated by the quantitative and qualitative results obtained on three datasets. |
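The depth-test strategy described in the abstract can be illustrated with a minimal sketch; the function and array names below are illustrative assumptions, not the dissertation's implementation. A pixel of the virtual object is drawn only where the virtual surface lies closer to the camera than the real scene at that pixel.

```python
import numpy as np

def composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth):
    """Per-pixel depth test: show a virtual pixel only where the virtual
    surface is closer to the camera than the real scene at that pixel.

    real_rgb   : (H, W, 3) camera image
    real_depth : (H, W)    depth of the real scene (e.g. from a depth sensor)
    virt_rgb   : (H, W, 3) rendered virtual object
    virt_depth : (H, W)    rendered depth of the virtual object
                 (np.inf where the object does not cover the pixel)
    """
    visible = virt_depth < real_depth   # virtual object is in front here
    out = real_rgb.copy()
    out[visible] = virt_rgb[visible]    # elsewhere the real scene occludes it
    return out
```

Because the test is per pixel, a real hand passing in front of a virtual object is handled automatically wherever its sensed depth is smaller than the object's rendered depth.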
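Similarly, the three-module pipeline named in the abstract (feature extraction by transfer learning, 2D pose regression, 3D hand pose estimation) could be wired together roughly as in the following PyTorch sketch. The backbone choice, layer sizes, and head designs here are assumptions for illustration only, not the actual "ResUnet" configuration evaluated in the thesis.

```python
import torch
import torch.nn as nn
from torchvision import models

class HandPoseRegressor(nn.Module):
    """Illustrative three-module hand pose regressor (not the thesis model):
    pretrained feature extractor -> 2D keypoint head -> 3D lifting head."""

    def __init__(self, num_joints=21):
        super().__init__()
        self.num_joints = num_joints
        # Module 1: feature extraction via transfer learning (pretrained ResNet-18)
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # (B, 512, h, w)
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Module 2: regress 2D keypoints (u, v) for each joint
        self.head_2d = nn.Linear(512, num_joints * 2)
        # Module 3: lift image features + 2D pose to 3D joint positions (x, y, z)
        self.head_3d = nn.Sequential(
            nn.Linear(512 + num_joints * 2, 256), nn.ReLU(),
            nn.Linear(256, num_joints * 3),
        )

    def forward(self, rgb):
        f = self.pool(self.features(rgb)).flatten(1)             # (B, 512)
        pose_2d = self.head_2d(f)                                # (B, J*2)
        pose_3d = self.head_3d(torch.cat([f, pose_2d], dim=1))   # (B, J*3)
        return (pose_2d.view(-1, self.num_joints, 2),
                pose_3d.view(-1, self.num_joints, 3))
```

A forward pass on a batch of RGB crops, e.g. `HandPoseRegressor()(torch.randn(1, 3, 224, 224))`, returns (B, 21, 2) 2D keypoints and (B, 21, 3) 3D joint positions, matching the 2D/3D outputs described in the abstract.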
Table of contents: |
List of Figures
List of Tables
1 General introduction
  1.1 Overview
  1.2 Challenges
  1.3 Problem statement
  1.4 Thesis Contributions
  1.5 Thesis outline
2 Augmented Reality: Definition, Applications, Interaction
  2.1 Introduction
  2.2 Augmented Reality: Definition and Concepts
  2.3 Definition of Virtual Reality
  2.4 Difference between Augmented Reality and Virtual Reality
  2.5 Augmented Reality Applications
    2.5.1 Education & Training
    2.5.2 Entertainment & Commerce
    2.5.3 Navigation & Tourism
    2.5.4 Medical & Construction
  2.6 Augmented Reality Devices
    2.6.1 Displays
    2.6.2 Computers
    2.6.3 Tracking
    2.6.4 Input Devices
  2.7 Hardware and software platforms for AR
    2.7.1 Hardware Platforms
    2.7.2 Software Platforms
  2.8 Conclusion
3 Handling occlusion in augmented reality: Literature review
  3.1 Introduction
  3.2 Classical Methods for Handling Occlusion in Augmented Reality
    3.2.1 Contour-based methods
    3.2.2 Depth-based methods
    3.2.3 3D Reconstruction-Based Approaches
  3.3 Research Questions
  3.4 Hand Recognition techniques
    3.4.1 Sensor-Based Approaches
    3.4.2 Vision-based approaches
  3.5 Deep learning hand pose estimation methods
    3.5.1 Depth-based methods
    3.5.2 Image-based methods
    3.5.3 RGBD-based methods
  3.6 Datasets and Evaluation Measurement
    3.6.1 Benchmark Datasets
    3.6.2 Evaluation Metrics
  3.7 Bibliometric analysis of related works using VOSviewer analysis
    3.7.1 VOSviewer Overview
    3.7.2 The Most Cited Documents ranking
    3.7.3 The Most Cited Authors bibliographic analysis
    3.7.4 Keywords Occurrence network for the "Hand Pose estimation" topic
  3.8 Discussion
  3.9 Conclusion
4 Real-time occlusion handling in augmented reality based on photogrammetry
  4.1 Introduction
    4.1.1 Challenges
  4.2 System Overview
    4.2.1 Photogrammetric 3D modeling
    4.2.2 AR application development with occlusion handling
  4.3 Results and Discussion
    4.3.1 Experimental results
    4.3.2 Discussion
  4.4 Conclusion
5 Hand pose estimation based on a regression method from monocular RGB cameras for handling occlusion
  5.1 Introduction
  5.2 Methodology
  5.3 Implementation Details
    5.3.1 Data pre-processing
    5.3.2 Dataset
    5.3.3 Evaluation Metrics
  5.4 Experiments and Results
    5.4.1 Quantitative Evaluation
    5.4.2 Qualitative Evaluation
  5.5 Conclusion
6 General conclusion
Bibliography |
Online: | http://thesis.univ-biskra.dz/id/eprint/6406 |
Availability (1)
Call number | Medium | Location | Status |
---|---|---|---|
TINF/193 | Doctoral theses | Exact Sciences library | Available for consultation |