Overview
MeshFlow is a method for generating artist-like 3D meshes. It targets the limitations of current mesh generation techniques, particularly those based on auto-regressive (AR) next-token prediction, by combining a Variational Autoencoder (VAE) with a flow-based diffusion transformer.
Research Context
Existing mesh generators frequently rely on auto-regressive (AR) next-token prediction, a natural choice given the discrete nature of mesh topology. However, the inference cost of AR generation grows quadratically with mesh size. These methods also typically discretize vertex coordinates, which introduces quantization error. The research identifies both issues as areas for improvement in 3D mesh generation.
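Both limitations can be illustrated with a small sketch. It is not taken from MeshFlow or from any particular AR generator: the coordinate range [-1, 1], the bin counts, and the helper names `quantize`, `max_quantization_error`, and `ar_attention_cost` are assumptions chosen for illustration. The cost function counts only causal-attention token pairs and ignores every other part of a real model.

```python
import random


def quantize(coord, bins, lo=-1.0, hi=1.0):
    """Snap a coordinate in [lo, hi] to the centre of one of `bins` uniform cells."""
    step = (hi - lo) / bins
    index = min(int((coord - lo) / step), bins - 1)  # clamp coord == hi into the last cell
    return lo + (index + 0.5) * step


def max_quantization_error(coords, bins):
    """Largest distance between a coordinate and its quantized value."""
    return max(abs(c - quantize(c, bins)) for c in coords)


def ar_attention_cost(num_tokens):
    """Token pairs visited by causal attention when tokens are emitted one by one.

    Step i attends to i tokens, so the total is 1 + 2 + ... + n = n(n + 1) / 2.
    """
    return num_tokens * (num_tokens + 1) // 2


random.seed(0)
coords = [random.uniform(-1.0, 1.0) for _ in range(10_000)]

# Quantization error shrinks with more bins but never reaches zero.
for bins in (128, 256, 1024):
    err = max_quantization_error(coords, bins)
    print(f"{bins:5d} bins: max error {err:.6f} (half a cell = {1.0 / bins:.6f})")

# Doubling the sequence length roughly quadruples the attention work.
for n in (1_000, 2_000, 4_000):
    print(f"{n:5d} tokens: {ar_attention_cost(n):,} attended pairs")
```

Finer bins reduce the error but lengthen the vocabulary, while longer token sequences push up the quadratic term, so the two costs pull against each other in a token-based representation.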
Approach
MeshFlow uses a VAE, supervised with a contrastive loss, to encode both continuous vertex positions and discrete connectivity in a single continuous latent space. This latent space is described as more compact than prior token-based mesh representations.
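One way a contrastive loss can tie discrete connectivity to continuous vectors is sketched below. This is an assumption for illustration, not MeshFlow's actual objective: it uses the classic margin-based pairwise contrastive loss, treats each vertex as having its own embedding, and recovers edges by thresholding embedding distance. The names `contrastive_loss` and `decode_edges`, the margin, and the threshold are all hypothetical.

```python
import math
from itertools import combinations


def contrastive_loss(embeddings, edges, margin=1.0):
    """Margin-based pairwise contrastive loss over all vertex pairs (illustrative).

    Connected pairs are pulled together (penalty d^2); unconnected pairs are
    pushed at least `margin` apart (penalty max(0, margin - d)^2).
    """
    edge_set = {frozenset(e) for e in edges}
    total, pairs = 0.0, 0
    for i, j in combinations(range(len(embeddings)), 2):
        d = math.dist(embeddings[i], embeddings[j])
        if frozenset((i, j)) in edge_set:
            total += d ** 2
        else:
            total += max(0.0, margin - d) ** 2
        pairs += 1
    return total / pairs


def decode_edges(embeddings, threshold=0.5):
    """Recover discrete edges from continuous embeddings by distance thresholding."""
    return sorted(
        (i, j)
        for i, j in combinations(range(len(embeddings)), 2)
        if math.dist(embeddings[i], embeddings[j]) < threshold
    )


# Toy graph with four vertices and two edges (hypothetical data).
edges = [(0, 1), (2, 3)]
separated = [[0.0, 0.0], [0.1, 0.0], [2.0, 0.0], [2.1, 0.0]]
collapsed = [[0.0, 0.0]] * 4  # every vertex mapped to the same point

print("loss (separated):", contrastive_loss(separated, edges))
print("loss (collapsed):", contrastive_loss(collapsed, edges))
print("decoded edges:", decode_edges(separated))
```

In this toy setup, embeddings that respect the connectivity score a lower loss than collapsed ones, and the discrete edge list can be read back from purely continuous values.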
On top of the VAE latent space, a 3D generator is built using a Rectified Flow transformer. This transformer generates all mesh vertices and edges simultaneously, in parallel rather than sequentially.
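The parallel generation step can be sketched as Euler integration of a rectified-flow ODE, dx/dt = v(x, t), from noise at t = 0 to data at t = 1. The sketch below is a toy under stated assumptions, not the MeshFlow sampler: `toy_velocity` stands in for the learned transformer and is the exact rectified-flow velocity only for the degenerate case where the data distribution is a single known point; `sample`, the step count, and the triangle coordinates are hypothetical.

```python
import random


def toy_velocity(x, t, target):
    """Stand-in for a learned velocity network (illustrative only).

    For straight paths x_t = (1 - t) * noise + t * target with a single
    target point, the velocity at (x, t) is (target - x) / (1 - t).
    """
    return [(y - xi) / (1.0 - t) for xi, y in zip(x, target)]


def sample(target, steps=8, seed=0):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # Gaussian noise at t = 0
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = toy_velocity(x, t, target)
        # All coordinates are updated together in each step, so the number
        # of velocity evaluations depends on `steps`, not on vector length.
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Flattened xyz coordinates of one triangle (hypothetical target).
triangle = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
result = sample(triangle, steps=8)
print([round(c, 6) for c in result])
```

The contrast with AR decoding is that the loop length is a fixed step budget, whereas next-token prediction needs one model call per emitted token.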
Findings
- MeshFlow's VAE, supervised with a contrastive loss, effectively represents continuous vertex positions and discrete connectivity in a continuous latent space.
- The resulting continuous latent space is significantly more compact than previous token-based mesh representations.
- The 3D generator, built on a Rectified Flow transformer, facilitates the parallel generation of all mesh vertices and edges.
- MeshFlow generates meshes 18 times faster than the fastest auto-regressive (AR) generator.
- The method is reported to remain accurate across standard mesh-generation metrics.
Why This Matters
The described method addresses efficiency and quality concerns in 3D mesh generation. By enabling significantly faster mesh generation while maintaining accuracy and avoiding quantization errors, it offers an alternative to methods limited by high computational costs and representational inaccuracies.