MACS: Mass Conditioned 3D Hand and Object Motion Synthesis

Soshi Shimada¹,², Franziska Mueller³, Jan Bednarik³, Bardia Doosti³, Bernd Bickel³, Danhang Tang³, Vladislav Golyanik¹, Jonathan Taylor³, Christian Theobalt¹,², Thabo Beeler³
¹ Max Planck Institute for Informatics, SIC; ² VIA Research Center; ³ Google
3D Vision 2024

Abstract

The physical properties of an object, such as its mass, significantly affect how we manipulate it with our hands. Surprisingly, this aspect has so far been neglected in prior work on 3D motion synthesis. To improve the naturalness of synthesized 3D hand-object motions, this work proposes MACS, the first MAss Conditioned 3D hand and object motion Synthesis approach. Our approach is based on cascaded diffusion models and generates interactions that adapt plausibly to the object’s mass and the interaction type. MACS also accepts a manually drawn 3D object trajectory as input and synthesizes natural 3D hand motions conditioned on the object’s mass. This flexibility enables MACS to be used for various downstream applications, such as generating synthetic training data for ML tasks, fast animation of hands in graphics workflows, and generating character interactions for computer games. We show experimentally that a small-scale dataset is sufficient for MACS to generalize reasonably across interpolated and extrapolated object masses unseen during training. Furthermore, MACS shows moderate generalization to unseen objects, thanks to the mass-conditioned contact labels generated by our surface contact synthesis model ConNet. Our comprehensive user study confirms that the synthesized 3D hand-object interactions are highly plausible and realistic.

Citation

@inproceedings{MACS2024,
    author = {Shimada, Soshi and Mueller, Franziska and Bednarik, Jan and Doosti, Bardia
              and Bickel, Bernd and Tang, Danhang and Golyanik, Vladislav
              and Taylor, Jonathan and Theobalt, Christian and Beeler, Thabo},
    title = {MACS: Mass Conditioned 3D Hand and Object Motion Synthesis},
    booktitle = {International Conference on 3D Vision (3DV)},
    year = {2024}
}