MIRA - Multimodal Image Reconstruction with Attention
MIRA is a multimodal encoder-decoder transformer architecture for text-to-3D and image-to-3D reconstruction. It focuses on generating a 3D representation of an object from a single 2D image within seconds.
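The description above does not spell out MIRA's layers, so the following is only a minimal sketch of the generic encoder-decoder attention pattern, not MIRA's actual implementation. It assumes the decoder holds a set of query tokens that attend to encoded image tokens via scaled dot-product cross-attention. The names `cross_attention`, `image_tokens`, and `decoder_queries`, along with the toy data, are hypothetical and chosen purely for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention.

    Each query produces a weighted average of the values, where the
    weights come from the query's similarity to each key.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


# Hypothetical toy data: three encoded image-patch tokens (encoder output)
# and two decoder query tokens, all of dimension 4.
image_tokens = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
decoder_queries = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# Each decoder query gathers information from the image tokens.
decoded = cross_attention(decoder_queries, image_tokens, image_tokens)
```

In this toy setup each decoder query ends up weighted most heavily toward the image token it is most similar to. A real model would add learned projections, multiple heads, and a final stage that maps the decoded tokens to a 3D representation.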