Hugo Bertiche

[Resume]

I obtained my Bachelor's degree in Aeronautical Engineering at the Universitat Politècnica de Catalunya (UPC) in 2015. Later, I received my MSc in Artificial Intelligence from the Universitat de Barcelona (UB), Universitat Politècnica de Catalunya (UPC) and Universitat Rovira i Virgili (URV) in 2017. Finally, I obtained my Ph.D. in Mathematics and Computer Science at the Universitat de Barcelona (UB), as a member of the Human Pose and Behaviour Analysis (HuPBA) group. My Ph.D. began with a focus on Computer Vision and 3D human pose and shape recovery through deep learning. I quickly transitioned to 3D clothing in human-centric domains, which is more strongly aligned with my interests: deep learning, computer graphics and physics. I believe deep learning has barely scratched the surface of its potential within the 3D garment domain, and I am eager to see what we will all be able to build in the future.

Publications

Neural Cloth Simulation
Hugo Bertiche, Meysam Madadi and Sergio Escalera
ACM Trans. Graph. 41, 6, Article 220 (December 2022), 14 pages. DOI: https://doi.org/10.1145/3550454.3555491
We present a general framework for the garment animation problem through unsupervised deep learning inspired by physically based simulation. Existing trends in the literature already explore this possibility. Nonetheless, these approaches do not handle cloth dynamics. Here, we propose the first methodology able to learn realistic cloth dynamics without supervision, and hence a general formulation for neural cloth simulation. The key to achieving this is to adapt an existing optimization scheme for motion from simulation-based methodologies to deep learning. Then, analyzing the nature of the problem, we devise an architecture able to automatically disentangle static and dynamic cloth subspaces by design. We show how this improves model performance. Additionally, it opens the possibility of a novel motion augmentation technique that greatly improves generalization. Finally, we show it also allows controlling the level of motion in the predictions, a useful and previously unavailable tool for artists. We provide a detailed analysis of the problem to establish the bases of neural cloth simulation and to guide future research into the specifics of this domain.
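The optimization scheme the abstract mentions can be illustrated with the classic variational view of implicit time integration: the next cloth state minimizes an inertia term plus the potential energy, so the same objective can double as an unsupervised training loss. Below is a minimal NumPy sketch of that idea; the function names, the toy spring potential and the gradient-free setup are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def inertia_plus_energy_loss(x, x_prev, x_prev2, mass, dt, energy_fn):
    """Backward-Euler-style objective from optimization-based simulation:
    minimizing it over x yields the next state. Training a network to
    minimize it across many poses is one way to learn dynamics without
    ground-truth simulations (a sketch, not the paper's exact loss)."""
    # Inertial prediction: where each vertex would go with no forces acting.
    x_hat = 2.0 * x_prev - x_prev2
    inertia = (mass / (2.0 * dt ** 2)) * np.sum((x - x_hat) ** 2)
    return inertia + energy_fn(x)

# Toy potential (stand-in for cloth energy): a spring pulling vertices to the origin.
def spring_energy(x, k=10.0):
    return 0.5 * k * np.sum(x ** 2)
```

In a learning setting, `x` would be the network's predicted vertex positions and the gradient of this loss would flow back into the network weights, which is what makes the scheme unsupervised.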
[Bibtex]
@article{10.1145/3550454.3555491,
 author = {Bertiche, Hugo and Madadi, Meysam and Escalera, Sergio},
 title = {Neural Cloth Simulation},
 year = {2022},
 issue_date = {December 2022},
 publisher = {Association for Computing Machinery},
 address = {New York, NY, USA},
 volume = {41},
 number = {6},
 issn = {0730-0301},
 url = {https://doi.org/10.1145/3550454.3555491},
 doi = {10.1145/3550454.3555491},
 journal = {ACM Trans. Graph.},
 month = {dec},
 articleno = {220},
 numpages = {14},
 keywords = {deep learning, disentangle, simulation, unsupervised, dynamics, neural network, cloth}
}
PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation
Hugo Bertiche, Meysam Madadi and Sergio Escalera
ACM Trans. Graph. 40, 6, Article 198 (December 2021), 14 pages. DOI: https://doi.org/10.1145/3478513.3480479
We present a methodology to automatically obtain Pose Space Deformation (PSD) bases for rigged garments through deep learning. Classical approaches rely on Physically Based Simulation (PBS) to animate clothes. These are general solutions that, given a sufficiently fine-grained discretization of space and time, can achieve highly realistic results. However, they are computationally expensive, and any scene modification requires re-simulation. Linear Blend Skinning (LBS) with PSD offers a lightweight alternative to PBS, though it needs huge volumes of data to learn proper PSD. We propose using deep learning, formulated as an implicit PBS, to learn realistic cloth Pose Space Deformations without supervision in a constrained scenario: dressed humans. Furthermore, we show it is possible to train these models in an amount of time comparable to a PBS of a few sequences. To the best of our knowledge, we are the first to propose a neural simulator for cloth. While deep-learning-based approaches in the domain are becoming a trend, they are data-hungry models. Moreover, authors often propose complex formulations to better learn wrinkles from PBS data. Supervised learning leads to physically inconsistent predictions that require collision solving before use. Also, dependency on PBS data limits the scalability of these solutions, while their formulation hinders their applicability and compatibility. By proposing an unsupervised methodology to learn PSD for LBS models (the 3D animation standard), we overcome both of these drawbacks. The results obtained show cloth consistency in the animated garments and meaningful pose-dependent folds and wrinkles. Our solution is extremely efficient, handles multiple layers of cloth, allows unsupervised outfit resizing and can easily be applied to any custom 3D avatar.
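The LBS + PSD pipeline the abstract builds on can be written in a few lines: add pose-dependent corrective offsets to the rest-pose garment, then blend bone transforms per vertex. The sketch below shows that generic standard pipeline with NumPy; the function name and array layout are assumptions for illustration, not the trained PBNS model itself.

```python
import numpy as np

def lbs_with_psd(rest_verts, corrections, weights, bone_transforms):
    """Linear Blend Skinning with Pose Space Deformation.
    rest_verts: (V,3) garment in rest pose.
    corrections: (V,3) pose-dependent PSD offsets (a network's output in PBNS).
    weights: (V,J) skinning weights, rows summing to 1.
    bone_transforms: (J,4,4) homogeneous bone transforms."""
    V = rest_verts.shape[0]
    # Apply PSD corrections in rest pose, then make vertices homogeneous.
    v = np.concatenate([rest_verts + corrections, np.ones((V, 1))], axis=1)
    # Blend the bone transforms per vertex: (V,J) x (J,4,4) -> (V,4,4).
    blended = np.einsum('vj,jab->vab', weights, bone_transforms)
    # Transform each vertex by its blended matrix.
    posed = np.einsum('vab,vb->va', blended, v)
    return posed[:, :3]
```

The appeal of this format, as the abstract notes, is that `weights` and the PSD corrections are exactly what standard graphics engines consume, so a model that predicts them plugs into any existing pipeline.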
[Bibtex]
@article{10.1145/3478513.3480479,
 author = {Bertiche, Hugo and Madadi, Meysam and Escalera, Sergio},
 title = {PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation},
 year = {2021},
 issue_date = {December 2021},
 publisher = {Association for Computing Machinery},
 address = {New York, NY, USA},
 volume = {40},
 number = {6},
 issn = {0730-0301},
 url = {https://doi.org/10.1145/3478513.3480479},
 doi = {10.1145/3478513.3480479},
 journal = {ACM Trans. Graph.},
 month = {dec},
 articleno = {198},
 numpages = {14},
 keywords = {garment, deep learning, animation, simulation, pose space deformation, physics, neural network}
}
DeePSD: Automatic deep skinning and pose space deformation for 3D garment animation
Hugo Bertiche, Meysam Madadi, Emilio Tylson and Sergio Escalera
In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 5471-5480).
We present a novel solution to the garment animation problem through deep learning. Our contribution allows animating any template outfit with arbitrary topology and geometric complexity. Recent works develop models for garment editing, resizing and animation at the same time by leveraging the supporting body model (encoding garments as body homotopies). This leads to complex engineering solutions that suffer in scalability, applicability and compatibility. By limiting our scope to garment animation only, we are able to propose a simple model that can animate any outfit, independently of its topology, vertex order or connectivity. Our proposed architecture maps outfits into the standard format for 3D animation (blend weights and blend shapes matrices), automatically providing compatibility with any graphics engine. We also propose a methodology to complement supervised learning with unsupervised physically based learning that implicitly solves collisions and enhances cloth quality.
[Bibtex]
@InProceedings{Bertiche_2021_ICCV,
 author = {Bertiche, Hugo and Madadi, Meysam and Tylson, Emilio and Escalera, Sergio},
 title = {DeePSD: Automatic Deep Skinning and Pose Space Deformation for 3D Garment Animation},
 booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
 month = {October},
 year = {2021},
 pages = {5471-5480}
}
Neural Implicit Surfaces for Efficient and Accurate Collisions in Physically Based Simulations
Hugo Bertiche, Meysam Madadi and Sergio Escalera
arXiv:2110.01614
Current trends in the computer graphics community propose leveraging the massive parallel computational power of GPUs to accelerate physically based simulations. Collision detection and solving is a fundamental part of this process. It is also the most significant bottleneck of physically based simulations, and it easily becomes intractable as the number of vertices in the scene increases. Brute-force approaches carry a quadratic growth in both computational time and memory footprint. While their parallelization is trivial on GPUs, their complexity discourages their use. Acceleration structures, such as BVHs, are often applied to increase performance, achieving logarithmic computational times for individual point queries. Nonetheless, their memory footprint also grows rapidly, and their parallelization on a GPU is problematic due to their branching nature. We propose using implicit surface representations learnt through deep learning for collision handling in physically based simulations. Our proposed architecture has a complexity of O(n), or O(1) for a single point query, and has no parallelization issues. We show how this permits accurate and efficient collision handling in physically based simulations, more specifically for cloth. In our experiments, we query up to 1M points in about 300 milliseconds.
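The core mechanism here, querying a signed distance function (SDF) and pushing penetrating points out along its gradient, can be sketched compactly. In the paper the SDF is a learned network; in this minimal sketch an analytic unit sphere stands in for it, and the function names are illustrative assumptions.

```python
import numpy as np

def resolve_collisions(points, sdf, sdf_grad, margin=0.0):
    """Project penetrating points back onto an implicit surface.
    `sdf` returns signed distance (negative inside the object);
    `sdf_grad` returns its unit gradient, i.e. the outward normal.
    Each query is independent, which is why this maps well to GPUs."""
    d = sdf(points)                       # (N,) signed distances
    n = sdf_grad(points)                  # (N,3) outward normals
    pen = np.minimum(d - margin, 0.0)     # negative only where penetrating
    # Move penetrating points along the normal until they clear the margin.
    return points - pen[:, None] * n

# Analytic stand-in for a learned SDF: unit sphere at the origin.
sphere_sdf = lambda p: np.linalg.norm(p, axis=1) - 1.0
sphere_grad = lambda p: p / np.linalg.norm(p, axis=1, keepdims=True)
```

Because the whole query is a batched function evaluation with no tree traversal, the cost per point is constant, matching the O(1) single-point query the abstract claims for the learned network.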
[Bibtex]
@article{DBLP:journals/corr/abs-2110-01614,
 author = {Hugo Bertiche and Meysam Madadi and Sergio Escalera},
 title = {Neural Implicit Surfaces for Efficient and Accurate Collisions in Physically Based Simulations},
 journal = {CoRR},
 volume = {abs/2110.01614},
 year = {2021},
 url = {https://arxiv.org/abs/2110.01614},
 eprinttype = {arXiv},
 eprint = {2110.01614},
 timestamp = {Fri, 08 Oct 2021 15:47:55 +0200},
 biburl = {https://dblp.org/rec/journals/corr/abs-2110-01614.bib},
 bibsource = {dblp computer science bibliography, https://dblp.org}
}
Deep Parametric Surfaces for 3D Outfit Reconstruction from Single View Image
Hugo Bertiche, Meysam Madadi and Sergio Escalera
2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), 2021, pp. 1-8, doi: 10.1109/FG52635.2021.9667017.
We present a methodology to retrieve analytical surfaces parametrized as a neural network. Previous works on 3D reconstruction yield point clouds, voxelized objects or meshes. Instead, our approach yields 2-manifolds in Euclidean space through deep learning. To this end, we implement a novel formulation of fully connected layers as parametrized manifolds that allows continuous predictions with differential geometry. Based on this property, we propose a novel smoothness loss. Results on the CLOTH3D++ dataset show the possibility to infer different topologies and the benefits of the smoothness term based on differential geometry.
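A parametric surface in this sense is just a map from the unit square to R^3 that can be sampled at any resolution. The sketch below evaluates such a map on a UV grid and adds a finite-difference stand-in for a differential-geometry smoothness term; the paper's surfaces are neural networks and its smoothness loss is analytic, so both the discrete Laplacian penalty and the function names here are illustrative assumptions.

```python
import numpy as np

def eval_surface(f, resolution=32):
    """Sample a parametric surface f: [0,1]^2 -> R^3 on a UV grid.
    The paper's f is a neural network; any callable works for the sketch."""
    u, v = np.meshgrid(np.linspace(0, 1, resolution),
                       np.linspace(0, 1, resolution), indexing='ij')
    uv = np.stack([u.ravel(), v.ravel()], axis=1)   # (res*res, 2)
    return np.array([f(p) for p in uv])             # (res*res, 3)

def smoothness_penalty(points, resolution):
    """Discrete stand-in for a smoothness loss: penalize the squared
    Laplacian of the sampled surface over interior grid points."""
    grid = points.reshape(resolution, resolution, 3)
    lap = (grid[2:, 1:-1] + grid[:-2, 1:-1] + grid[1:-1, 2:]
           + grid[1:-1, :-2] - 4.0 * grid[1:-1, 1:-1])
    return np.mean(np.sum(lap ** 2, axis=-1))

# Example surface: a flat plane, which incurs zero smoothness penalty.
plane = lambda uv: np.array([uv[0], uv[1], 0.0])
```

One advantage of this representation over meshes, as the abstract notes, is that the resolution is a sampling choice rather than a property of the output.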
[Bibtex]
@INPROCEEDINGS{9667017,
 author={Bertiche, Hugo and Madadi, Meysam and Escalera, Sergio},
 booktitle={2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)},
 title={Deep Parametric Surfaces for 3D Outfit Reconstruction from Single View Image},
 year={2021},
 volume={},
 number={},
 pages={1-8},
 doi={10.1109/FG52635.2021.9667017}
}
Learning Cloth Dynamics: 3D+Texture Garment Reconstruction Benchmark
Meysam Madadi, Hugo Bertiche, Wafa Bouzouita, Isabelle Guyon and Sergio Escalera
Proceedings of the NeurIPS 2020 Competition and Demonstration Track, PMLR 133:57-76, 2021.
Human avatars are important targets in many computer applications. Accurately tracking, capturing, reconstructing and animating the human body, face and garments in 3D is critical for human-computer interaction, gaming, special effects and virtual reality. In the past, this has required extensive manual animation. Regardless of the advances in human body and face reconstruction, modeling, learning and analyzing human dynamics still need further attention. In this paper we push the research in this direction, e.g. understanding human dynamics in 2D and 3D, with special attention to garments. We provide a large-scale dataset (more than 2M frames) of animated garments with variable topology and type, called CLOTH3D++. The dataset contains RGBA video sequences paired with their corresponding 3D data. We pay special care to garment dynamics and realistic rendering of the RGB data, including lighting, fabric type and texture. With this dataset, we held a competition at NeurIPS 2020. We designed three tracks so participants could compete to develop the best method to perform 3D garment reconstruction in a sequence from (1) 3D-to-3D garments, (2) RGB-to-3D garments, and (3) RGB-to-3D garments plus texture. We also provide a baseline method, based on graph convolutional networks, for each track. Baseline results show that there is a lot of room for improvement. However, due to the challenging nature of the problem, no participant could outperform the baselines.
[Bibtex]
@InProceedings{pmlr-v133-madadi21a,
 title = {Learning Cloth Dynamics: 3D+Texture Garment Reconstruction Benchmark},
 author = {Madadi, Meysam and Bertiche, Hugo and Bouzouita, Wafa and Guyon, Isabelle and Escalera, Sergio},
 booktitle = {Proceedings of the NeurIPS 2020 Competition and Demonstration Track},
 pages = {57--76},
 year = {2021},
 editor = {Escalante, Hugo Jair and Hofmann, Katja},
 volume = {133},
 series = {Proceedings of Machine Learning Research},
 month = {06--12 Dec},
 publisher = {PMLR},
 pdf = {http://proceedings.mlr.press/v133/madadi21a/madadi21a.pdf},
 url = {https://proceedings.mlr.press/v133/madadi21a.html},
}
Deep unsupervised 3D human body reconstruction from a sparse set of landmarks
Meysam Madadi, Hugo Bertiche and Sergio Escalera
Int J Comput Vis 129, 2499–2512 (2021). https://doi.org/10.1007/s11263-021-01488-2
In this paper we propose the first deep unsupervised approach in human body reconstruction to estimate the body surface from a sparse set of landmarks, called DeepMurf. We apply a denoising autoencoder to estimate missing landmarks. Then we apply an attention model to estimate body joints from the landmarks. Finally, a cascading network is applied to regress the parameters of a statistical generative model that reconstructs the body. Our set of proposed loss functions allows us to train the network in an unsupervised way. Results on four public datasets show that our approach accurately reconstructs the human body from real-world mocap data.
[Bibtex]
@article{madadi2021deep,
 title={Deep unsupervised 3D human body reconstruction from a sparse set of landmarks},
 author={Madadi, Meysam and Bertiche, Hugo and Escalera, Sergio},
 journal={International Journal of Computer Vision},
 volume={129},
 number={8},
 pages={2499--2512},
 year={2021},
 publisher={Springer}
}
CLOTH3D: Clothed 3D Humans
Hugo Bertiche, Meysam Madadi and Sergio Escalera
In European Conference on Computer Vision (pp. 344-359). Springer, Cham.
We present CLOTH3D, the first large-scale synthetic dataset of 3D clothed human sequences. CLOTH3D contains large variability in garment type, topology, shape, size, tightness and fabric. Clothes are simulated on top of thousands of different pose sequences and body shapes, generating realistic cloth dynamics. We provide the dataset together with a generative model for cloth generation. We propose a Conditional Variational Auto-Encoder (CVAE) based on graph convolutions (GCVAE) to learn garment latent spaces. This allows realistic generation of 3D garments on top of the SMPL model for any pose and shape.
[Bibtex]
@inproceedings{bertiche2020cloth3d,
 title={CLOTH3D: clothed 3d humans},
 author={Bertiche, Hugo and Madadi, Meysam and Escalera, Sergio},
 booktitle={European Conference on Computer Vision},
 pages={344--359},
 year={2020},
 organization={Springer}
}
SMPLR: Deep learning based SMPL reverse for 3D human pose and shape recovery
Meysam Madadi, Hugo Bertiche and Sergio Escalera
Pattern Recognition, 106, 107472.
Current state of the art in 3D human pose and shape recovery relies on deep neural networks and statistical morphable body models, such as the Skinned Multi-Person Linear model (SMPL). However, regardless of the advantages of having both body pose and shape, SMPL-based solutions have shown difficulties in predicting 3D bodies accurately. This is mainly due to the unconstrained nature of SMPL, which may generate unrealistic body meshes. Because of this, regression of SMPL parameters is a difficult task, often addressed with complex regularization terms. In this paper we propose to embed SMPL within a deep model to accurately estimate 3D pose and shape from a still RGB image. We use CNN-based 3D joint predictions as an intermediate representation to regress SMPL pose and shape parameters. Later, 3D joints are reconstructed again in the SMPL output. This module can be seen as an autoencoder where the encoder is a deep neural network and the decoder is the SMPL model. We refer to this as SMPL reverse (SMPLR). By implementing SMPLR as an encoder-decoder we avoid the need for complex constraints on pose and shape. Furthermore, given that in-the-wild datasets usually lack accurate 3D annotations, it is desirable to lift 2D joints to 3D without pairing 3D annotations with RGB images. Therefore, we also propose a denoising autoencoder (DAE) module between the CNN and SMPLR, able to lift 2D joints to 3D and partially recover from structured error. We evaluate our method on the SURREAL and Human3.6M datasets, showing improvement over SMPL-based state-of-the-art alternatives by about 4 and 25 millimeters, respectively.
[Bibtex]
@article{madadi2020smplr,
 title={SMPLR: Deep learning based SMPL reverse for 3D human pose and shape recovery},
 author={Madadi, Meysam and Bertiche, Hugo and Escalera, Sergio},
 journal={Pattern Recognition},
 volume={106},
 pages={107472},
 year={2020},
 publisher={Elsevier}
}