General Information
In this half-day tutorial, we will provide a general overview of the state of the art in human mesh modeling and recovery. We will begin with an introduction to parametric human body mesh models, covering their motivation, design choices, and trade-offs. We will then give an overview of 3D Human Mesh Recovery (HMR) from RGB data, before turning to HMR from depth data, with a particular focus on challenging in-the-wild conditions and LiDAR data.
- Date: to be confirmed (June 3-4, 2026)
- Format: half-day tutorial. Recordings will be made available online.
- Target audience: academic and industrial researchers of all levels. We will introduce foundational concepts and build step by step toward more advanced topics throughout the presentations.
- Subject areas: Humans; Face, body, pose, gesture, movement (primary), 3D from single images, 3D from multi-view and sensors, Robot perception, Autonomous driving.
Motivation
The understanding of human pose and shape is the cornerstone of multiple AI applications ranging from monitoring, AR/VR, sport and posture analysis, human-robot interaction all the way to autonomous driving. Accurate human perception enables digital systems to interact appropriately with people in both indoor and outdoor environments.
Recent advances have pushed the field forward: modern methods now begin to achieve strong in-the-wild Human Mesh Recovery (HMR) performance, making them more reliable and useful for a wide variety of downstream tasks. With this growing interest, the community has seen the emergence of datasets and shape-recovery models, as well as an expanding range of input modalities, including RGB, depth, and LiDAR. At the same time, multiple human body models are being developed, each offering different levels of detail, interpretability, and expressivity.
While these developments open up exciting new opportunities, they also introduce new challenges. Designing and deploying human mesh recovery systems remains difficult due to dependency on the chosen body model, peculiarities of single-person and multi-person settings, challenges of occlusions and interactions with the 3D scene, and the reliance on data-hungry training pipelines.
This tutorial is therefore motivated by the need for a clear, structured, and accessible overview of the current HMR landscape. The increasing use of foundation models and large-scale pretrained systems makes it particularly timely to disseminate a clear picture of the underlying principles of human body modeling and HMR, so that these methods can be more easily adopted, extended, and applied to adjacent fields beyond core human pose estimation. Our goal is to lower the entry barrier for newcomers, provide a unifying perspective for practitioners, and foster collaboration between communities working on human modeling, 3D vision, graphics, and embodied AI. By providing access to these concepts, we aim to maximize the impact of recent advances and encourage their use in downstream applications.
Tentative Schedule
| 15 min | Opening remarks and motivation |
| 60 min | Talk 1: A hands-on tour of human parametric body mesh models (Romain Brégier) |
| 60 min | Talk 2: 3D human mesh recovery from images (István Sárándi) |
| 15 min | Coffee break with interactive demos |
| 60 min | Talk 3: 3D human pose and shape estimation from LiDAR point clouds (Salma Galaaoui) |
Program
A Hands-on Tour of Human Parametric Body Mesh Models
Parametric body models make it possible to encode the morphology and pose of humans in a compact, structured, and controllable manner. They have a long history in computer vision and graphics, and are central to computer animation, motion capture, human avatars, and Human Mesh Recovery. Recently, body model parameters have also proven useful as a body representation for Large Language Models (e.g., ChatPose). Multiple body models have been proposed over the years -- such as the seminal SMPL -- and the release of models with permissive licenses, such as MHR and Anny, is likely to increase their adoption beyond academic research. Yet, parametric body models are often perceived as complex black boxes by CV and ML practitioners unfamiliar with 3D geometry and computer graphics.
In this tutorial, we propose to change that. We will provide an overview of the field and help the audience navigate the zoo of available parametric body models (including SMPL, SMPL-H, SMPL-X, GHUM, STAR, SKEL, ATLAS, MHR, Anny, etc.) and their targeted use cases. Most importantly, the tutorial will cover the underlying principles of parametric body mesh modeling, such as rigid transformations, skeletal animation, forward and inverse kinematics, mesh representation, skinning, and shape modeling. We will provide multiple interactive examples with code to help the audience become familiar with these topics. We will also share practical insights regarding the use of parametric body models for tasks such as model fitting and human mesh recovery.
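Two of the principles listed above, shape blendshapes and skinning, can be illustrated in a few lines of NumPy. The sketch below is a toy, hedged illustration of SMPL-style modeling: identity shape as a linear combination of blendshapes, then linear blend skinning (LBS) to pose the resulting mesh. All vertices, blendshapes, and skinning weights here are made-up toy values, not real model data.

```python
import numpy as np

# Toy sketch of two core ideas behind SMPL-style parametric body models:
# (1) identity shape as a linear combination of shape blendshapes, and
# (2) linear blend skinning (LBS) to pose the shaped mesh.
# All numbers below are illustrative, not taken from any real model.

def shaped_template(template, blendshapes, betas):
    """Displace template vertices by shape coefficients (betas)."""
    # template: (V, 3), blendshapes: (B, V, 3), betas: (B,)
    return template + np.einsum("b,bvd->vd", betas, blendshapes)

def linear_blend_skinning(vertices, joint_transforms, skin_weights):
    """Pose vertices with per-joint rigid transforms blended by weights."""
    # vertices: (V, 3), joint_transforms: (J, 4, 4), skin_weights: (V, J)
    homo = np.concatenate([vertices, np.ones((len(vertices), 1))], axis=1)
    # Blend the 4x4 transforms per vertex, then apply them.
    per_vertex = np.einsum("vj,jab->vab", skin_weights, joint_transforms)
    posed = np.einsum("vab,vb->va", per_vertex, homo)
    return posed[:, :3]

# Toy mesh: 2 vertices, 2 joints, 1 shape blendshape.
template = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
blendshapes = np.array([[[0.0, 0.1, 0.0], [0.0, 0.1, 0.0]]])  # raises vertices
betas = np.array([2.0])
verts = shaped_template(template, blendshapes, betas)

identity = np.eye(4)
translate_x = np.eye(4)
translate_x[0, 3] = 0.5                       # second joint shifts +0.5 along x
transforms = np.stack([identity, translate_x])
weights = np.array([[1.0, 0.0], [0.0, 1.0]])  # each vertex follows one joint
posed = linear_blend_skinning(verts, transforms, weights)
print(posed)  # vertex 0 stays at (0, 0.2, 0); vertex 1 moves to (1.5, 0.2, 0)
```

Real body models apply the same machinery at scale: thousands of vertices, tens of joints, soft skinning weights spread over several joints, and pose-dependent corrective blendshapes on top.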
3D Human Mesh Recovery from Images: A Practical Overview
The field of 3D human mesh recovery from RGB images has witnessed remarkable progress in recent years. In the nearly five years since the "SMPL Made Simple" tutorial at CVPR 2021, the landscape has transformed: methods have matured, and real-time end-to-end models are ready for off-the-shelf use in novel applications. Despite this progress, adoption outside the core research community remains limited, owing to unfamiliarity with and the perceived complexity of handling 3D geometry. Many researchers therefore still rely on legacy 2D pose estimation tools, unaware that modern HMR methods now offer full 3D body geometry with shape, contact surfaces, and metric scale at comparable computational cost. Our goal is to make these methods accessible to a broader audience.
This section provides a comprehensive overview of methods for recovering 3D human body meshes from RGB images. We set the stage by clarifying the task landscape: distinguishing HMR from related human-centric tasks such as 2D/3D pose estimation, DensePose, and avatar reconstruction, and explaining the benefits of structured mesh recovery (template correspondence, priors, contact modeling) and its usefulness for adjacent research fields. We trace the evolution of methods from early, time-consuming optimization-based model fitting (e.g., SMPLify) to modern real-time end-to-end approaches (HMR2.0, SMPLer-X, NLF) and non-real-time high-accuracy models (SAM 3D Body), discussing both parametric and point-based output representations. Beyond single-image methods, we briefly touch on extensions to multi-person scenarios (Multi-HMR, AiOS), temporal tracking (4DHumans), world-space trajectory estimation (WHAM, SLAHMR, TRAM), multi-view settings, and integration with recent geometric foundation models (Human3R, HAMSt3R). Through a practical survey of the state of the art, we offer recommendations for practitioners on choosing methods and avoiding common pitfalls. We conclude with open challenges and a live demonstration of human mesh recovery with the real-time state-of-the-art NLF model.
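The core loop of optimization-based model fitting mentioned above can be conveyed with a deliberately simplified sketch: minimize a 2D reprojection loss between projected model joints and detected keypoints. In the hedged toy below we fit only a weak-perspective camera (scale and translation) to a fixed 3D skeleton by gradient descent; methods like SMPLify additionally optimize pose and shape parameters and add priors, and the function names and numbers here are our own illustrative choices.

```python
import numpy as np

# Toy sketch of SMPLify-style fitting: gradient descent on a 2D
# reprojection loss. Only the weak-perspective camera (scale s,
# translation t) is optimized here; real fitters also optimize
# body pose/shape and regularize with priors.

def project(joints3d, s, t):
    """Weak-perspective projection: drop z, then scale and translate."""
    return s * joints3d[:, :2] + t

def fit_camera(joints3d, keypoints2d, lr=0.1, steps=500):
    s, t = 1.0, np.zeros(2)
    for _ in range(steps):
        residual = project(joints3d, s, t) - keypoints2d       # (J, 2)
        # Analytic gradients of the mean squared reprojection error.
        grad_s = 2.0 * np.sum(residual * joints3d[:, :2]) / len(residual)
        grad_t = 2.0 * residual.mean(axis=0)
        s -= lr * grad_s
        t -= lr * grad_t
    return s, t

# Synthetic check: generate keypoints from a known camera and recover it.
rng = np.random.default_rng(0)
joints3d = rng.normal(size=(17, 3))          # 17 body joints, COCO-like count
true_s, true_t = 2.0, np.array([0.3, -0.1])
keypoints2d = project(joints3d, true_s, true_t)
s, t = fit_camera(joints3d, keypoints2d)
print(round(s, 3), np.round(t, 3))  # recovers roughly s=2.0, t=(0.3, -0.1)
```

Modern end-to-end regressors replace this per-image optimization with a single network forward pass, which is where the real-time performance discussed above comes from.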
3D Human Pose and Shape Estimation from LiDAR Point Clouds: An Overview
While RGB images offer high-resolution texture and color details that enhance object discrimination in perception systems, they remain highly sensitive to low-light conditions, glare, and overexposure, often yielding unusable data in challenging outdoor scenarios. Additionally, they suffer from inherent depth and scale ambiguity due to 3D-to-2D projection. To enhance HMR accuracy, researchers have integrated 3D data at the acquisition stage, first adopting Motion Capture (MoCap) systems and depth cameras (such as stereo-vision, structured-light, and time-of-flight sensors) in multi-sensor-fusion approaches.
Indeed, HMR from depth (e.g., Kinect) is now considered a fairly mature research field, as this additional modality provides pixel-wise distances that enhance the overall accuracy of the system. Despite these advances, depth cameras remain sensitive to lighting conditions and unreliable in outdoor settings.
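The reason pixel-wise distances are so valuable is that, given camera intrinsics, a depth image back-projects directly into a metric 3D point cloud via the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy. The short sketch below illustrates this with made-up intrinsics and a synthetic flat depth image; the values are ours, not those of any real sensor.

```python
import numpy as np

# Back-projecting a depth image to a point cloud with the pinhole model.
# Intrinsics (fx, fy, cx, cy) below are illustrative, not from a real camera.

def backproject(depth, fx, fy, cx, cy):
    """Depth image (H, W) in meters -> (H*W, 3) points in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

depth = np.full((4, 4), 2.0)  # synthetic flat surface 2 m from the camera
points = backproject(depth, fx=100.0, fy=100.0, cx=2.0, cy=2.0)
print(points.shape)  # (16, 3): every pixel becomes a metric 3D point
```

This metric-scale 3D signal is exactly what monocular RGB lacks, which is why fusing depth resolves the scale ambiguity mentioned earlier.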
Compared to RGB and depth cameras, LiDAR sensors offer a wider distance range, perform reliably under adverse weather conditions, and do not depend on ambient light, which is crucial for safety-critical applications such as autonomous driving and surveillance systems. However, LiDAR point clouds present several challenges, notably the lack of color and texture information, self- and mutual occlusions, motion artifacts, and noisy measurements. Furthermore, due to the wide field of view of LiDAR scans, the human subject often occupies only a very small portion of the scene and may be covered by just a few points, depending on their distance from the sensor. Another notable issue is the high variability of scans across LiDAR sensors, which hinders the generalization capability of models. For these reasons, successful RGB-based methods and architectures rarely transfer seamlessly to LiDAR inputs and require adaptations for sensor-specific traits.
In this portion of the tutorial, we aim to provide the audience with an overview of HMR methods that integrate LiDAR information into their pipelines, based on our recent survey of the field. We will introduce the different LiDAR technologies used particularly in outdoor settings, along with a brief taxonomy. We then dive into current 3D HMR approaches from LiDAR, categorize them, introduce the additional metrics suited to this modality and the common datasets and benchmarks in the field, and finish with potential research directions and open questions. We will illustrate our presentation with practical examples of synthetic LiDAR point cloud generation, and we intend to demonstrate particularities of LiDAR point clouds -- which the general CVPR audience may not be familiar with -- through live demonstrations.
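To give a flavor of the synthetic LiDAR generation examples mentioned above, the hedged sketch below casts the ray pattern of a spinning LiDAR (a grid of azimuth and elevation angles) against a vertical cylinder used as a crude stand-in for a person. Beam counts, angular resolutions, and the target geometry are our own illustrative choices, not the specification of any real sensor. It also demonstrates a particularity discussed above: the same subject yields far fewer points as its distance to the sensor grows.

```python
import numpy as np

# Toy synthetic LiDAR scan: intersect a spinning sensor's rays with a
# vertical cylinder (a crude person proxy). Beam counts and geometry are
# illustrative, not a real sensor specification.

def scan_cylinder(center_xy, radius, height, n_azimuth=360, n_beams=32):
    azimuths = np.linspace(-np.pi, np.pi, n_azimuth, endpoint=False)
    elevations = np.radians(np.linspace(-15.0, 15.0, n_beams))  # beam fan
    c = np.asarray(center_xy)
    points = []
    for phi in elevations:
        for theta in azimuths:
            u = np.array([np.cos(theta), np.sin(theta)])  # horizontal ray dir
            b = u @ c
            disc = b * b - (c @ c - radius ** 2)
            if disc < 0:
                continue                          # ray misses the cylinder
            s = b - np.sqrt(disc)                 # nearest intersection (m)
            if s <= 0:
                continue
            t = s / np.cos(phi)                   # range along the tilted beam
            z = t * np.sin(phi)
            if 0.0 <= z <= height:                # clip to the cylinder height
                points.append([s * u[0], s * u[1], z])
    return np.array(points)

near = scan_cylinder(center_xy=(3.0, 0.0), radius=0.3, height=1.8)
far = scan_cylinder(center_xy=(30.0, 0.0), radius=0.3, height=1.8)
print(len(near), len(far))  # the same "person" gets far fewer points at range
```

Even this toy model reproduces the sparsity issue that makes distant pedestrians so hard for LiDAR-based HMR: at 30 m, the cylinder is hit by only a handful of beams, versus dozens of returns at 3 m.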