Research on the Modification of the Internal Structure of 3D-VAE-GAN to Produce a 3D Polygon Mesh
Introduction
Interest in how computers interact with the physical world, and in their ability to model 3D objects with the help of machine learning algorithms, has increased in recent times. Recent research in machine learning and computer vision has encouraged the use of computers as modeling tools for generating 3D models. The modeling task depends critically on the ability of the machine learning algorithm to represent and reason about shape. Shape is one of the most basic properties used to describe the 3D surface structure of objects. The ability of machine learning algorithms to generate 3D objects enables practical applications in the game, medicine, augmented reality, entertainment and robotics industries, including animation of scenes and characters, motion capture and analysis, and scene understanding (Anguelov, 2006).
3D objects can be represented as 3D models based on voxels, point clouds, or polygon meshes. Given a 2D image of a 3D object of a specific class (e.g., chairs, tables, rabbits), this research aims to generate a 3D polygon mesh of that object. A polygon mesh is a collection of vertices, edges, and faces that defines the shape of a 3D object. For surface representation, a polygon mesh is more memory efficient than voxel-based and point cloud-based 3D models. Recent work in 3D model generation has established an approach, 3D-VAE-GAN, that produces a voxel-based 3D model from a 2D image. This approach integrates two machine learning algorithms: the Variational AutoEncoder (VAE) and the Generative Adversarial Network (GAN).
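To make the mesh definition and the memory comparison above concrete, the following is a minimal sketch rather than part of the proposed framework. It stores a unit cube as a vertex list and a triangular face list, derives the edges from the faces, and counts the stored values against a dense voxel grid. The function names and the 64 x 64 x 64 grid resolution are assumptions chosen purely for illustration.

```python
# Illustrative only: a unit cube stored as a triangle mesh.
# Each vertex is an (x, y, z) coordinate; each face holds three vertex indices.
vertices = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0),  # z = 0
    (0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (0.0, 1.0, 1.0),  # z = 1
]
faces = [
    (0, 2, 1), (0, 3, 2),  # bottom
    (4, 5, 6), (4, 6, 7),  # top
    (0, 1, 5), (0, 5, 4),  # side y = 0
    (1, 2, 6), (1, 6, 5),  # side x = 1
    (2, 3, 7), (2, 7, 6),  # side y = 1
    (3, 0, 4), (3, 4, 7),  # side x = 0
]


def edges_of(face_list):
    """Derive the set of undirected edges implied by the face list."""
    edges = set()
    for a, b, c in face_list:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))
    return edges


def mesh_value_count(vertex_list, face_list):
    """Numbers stored by the mesh: 3 coordinates per vertex, 3 indices per face."""
    return 3 * len(vertex_list) + 3 * len(face_list)


def voxel_value_count(resolution):
    """Numbers stored by a dense occupancy grid of resolution^3 cells."""
    return resolution ** 3


edges = edges_of(faces)
print(len(vertices), len(edges), len(faces))
print(mesh_value_count(vertices, faces), voxel_value_count(64))
```

The gap is large here because a cube needs very few triangles. A detailed object needs many more vertices and faces, so the actual saving depends on the shape and on the grid resolution being compared.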
In 3D-VAE-GAN, the VAE takes a 2D image as input and produces a latent vector; the 3D-GAN receives this vector as input and produces a voxel-based 3D model. However, this approach requires post-processing to produce a 3D polygon mesh. In this research, the internal structure of 3D-VAE-GAN will be modified to produce a 3D polygon mesh directly. To achieve this, a renderer will be incorporated so that the network can accept a 3D polygon mesh as input. The marching cubes algorithm will also be implemented within the network to produce a 3D polygon mesh as output. The 3D-VAE-GAN will be trained using 2D images of 3D objects together with their corresponding 3D polygon meshes.
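As a rough illustration of the marching cubes algorithm mentioned above, the sketch below shows two of its per-cell steps: computing the 8-bit case index of one grid cell from its corner values, and linearly interpolating a vertex position on an edge that crosses the surface. It is not a complete implementation, because the triangle lookup table that maps each case index to faces is omitted. The function names, the 0.5 isolevel, and the convention that a corner is "inside" when its value is at or above the isolevel are assumptions made for this example.

```python
def cube_index(corner_values, isolevel=0.5):
    """Return the 8-bit case index of one grid cell.

    Bit i is set when corner i is inside the surface. Here "inside" is
    assumed to mean value >= isolevel; other conventions exist.
    """
    index = 0
    for i, value in enumerate(corner_values):
        if value >= isolevel:
            index |= 1 << i
    return index


def interpolate_vertex(p1, p2, v1, v2, isolevel=0.5):
    """Place a mesh vertex on edge p1-p2 where the field crosses isolevel."""
    if v1 == v2:
        # Degenerate edge: fall back to the midpoint to avoid division by zero.
        return tuple((a + b) / 2.0 for a, b in zip(p1, p2))
    t = (isolevel - v1) / (v2 - v1)
    return tuple(a + t * (b - a) for a, b in zip(p1, p2))


# One cell in which only corner 0 is occupied.
corners = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
case = cube_index(corners)

# Vertex on the edge between corner 0 (value 1.0) and corner 1 (value 0.0).
vertex = interpolate_vertex((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 1.0, 0.0, isolevel=0.25)
print(case, vertex)
```

The case index is a hard threshold, so as written it passes no gradient back to the corner values, whereas the interpolation step is a smooth function of those values whenever they differ.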
Research Aim, Scope and Expected Output
The objectives of the research are as follows:
- To develop a framework that generates a 3D polygon mesh for a specific class of 3D objects;
- To modify the internal architecture of 3D-VAE-GAN by implementing the marching cubes algorithm (MCA) at the last layer of the GAN architecture;
- To formulate an alternative, differentiable MCA that is trainable with 3D-VAE-GAN;
- To modify 3D-VAE-GAN’s loss function to accommodate the differentiable MCA;
- To develop a renderer that converts a 3D polygon mesh to a voxel-based 3D model;
- To evaluate the performance of the developed architecture on 3D polygon mesh generation.
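The objective of converting a 3D polygon mesh to voxels can be pictured with the sketch below. The proposal does not specify how the renderer works, so this is only one simple stand-in: it samples points on each triangle using barycentric weights and marks the voxel that contains each sample. The function name `voxelize_surface`, the `samples_per_edge` parameter, and the assumption that all vertex coordinates lie in the unit cube are hypothetical choices for this example.

```python
def voxelize_surface(vertices, faces, resolution, samples_per_edge=16):
    """Mark every voxel that contains a point sampled on a triangle.

    Assumes all vertex coordinates lie in the unit cube [0, 1]^3.
    Returns the set of occupied (i, j, k) voxel indices.
    """
    occupied = set()
    n = samples_per_edge
    for ia, ib, ic in faces:
        a, b, c = vertices[ia], vertices[ib], vertices[ic]
        for i in range(n + 1):
            for j in range(n + 1 - i):
                # Barycentric weights (u, v, w) are non-negative and sum to 1.
                u, v, w = i / n, j / n, (n - i - j) / n
                point = [u * a[d] + v * b[d] + w * c[d] for d in range(3)]
                # Clamp so that a coordinate of exactly 1.0 lands in the last voxel.
                cell = tuple(min(int(x * resolution), resolution - 1) for x in point)
                occupied.add(cell)
    return occupied


# A single triangle lying in the z = 0 plane, voxelized on a 4 x 4 x 4 grid.
tri_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tri_faces = [(0, 1, 2)]
grid = voxelize_surface(tri_vertices, tri_faces, resolution=4)
print(sorted(grid))
```

This marks only the surface shell and leaves the interior empty, and a sparse sampling can miss voxels that a large triangle passes through. A renderer used for training would need to handle both cases.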
The scope of the study is to generate 3D polygon meshes using the 3D-VAE-GAN setup.
Expected Output
The expected output of our research work is a 3D polygon mesh of a 3D object.
Research Motivation
3D model generation has become an active research area in computer vision and machine learning in recent times. 3D objects can be represented as 3D models based on voxels, point clouds or polygon meshes. Recent works produced voxel-based or point cloud-based 3D models, which are not well suited to representing the surface of a 3D object. Some works attempt to produce a 3D polygon mesh from a voxel-based or point cloud-based 3D model using a post-processing method. However, the outputs produced are not realistic. The need to generate a 3D polygon mesh from scratch using machine learning algorithms motivates the research title “3D Polygon Mesh Generation Using Variational AutoEncoder (VAE) and Generative Adversarial Network (GAN)”.
Research Significance and Importance
Novelty
We believe that a good generative model should be able to generate 3D objects that are both highly varied and realistic; this is what makes a 3D generative model of objects interesting. The main objective of this research work is to generate a 3D polygon mesh from a 2D image using a Variational AutoEncoder (VAE) and a Generative Adversarial Network (GAN). Although many researchers have worked, and are still working, in the area of 3D generation, very few works on 3D polygon mesh generation have been reported. Most works that share the setup of our proposed framework use a post-processing method to produce the 3D polygon mesh, while others use approaches different from ours. Wu et al. (2016) used the 3D-VAE-GAN architecture to generate a voxel-based 3D model, and Ping et al. (2018) adapted the GAN architecture and applied the Marching Cubes Algorithm (MCA) as a post-processing step to generate a polygon mesh-based 3D model. Our proposed framework is similar to that of Wu et al. (2016) but has a different output: Wu et al. (2016) produced a voxel-based 3D model, whereas our framework is intended to produce a 3D polygon mesh. Unlike Ping et al. (2018), our proposed framework will not perform post-processing. Instead, it will generate the 3D polygon mesh by incorporating a marching cubes algorithm in the last layer of the generator network of the GAN. Our framework will also use a renderer to convert our dataset into an input format accepted by 3D-GAN.
Research Gap
3D object generation has been a long-standing problem in the computer graphics and computer vision communities. To address it, researchers have proposed solutions using both geometrical and machine learning approaches. Notable progress has been made in generating voxel-based and point cloud-based 3D models. However, attempts to produce a 3D polygon mesh suitable for 3D object surface representation using post-processing methods have yielded limited accuracy. In this study, we intend to investigate how to generate a 3D polygon mesh using a machine learning approach in order to further improve 3D model generation. We will also extend our proposed architecture to perform 3D reconstruction from a single 2D image.
Research Problems
The research problems for the proposed research work are as follows:
- How can a 3D polygon mesh be generated for a specific class of 3D objects?
- How can 3D-VAE-GAN be adapted to generate a 3D polygon mesh?
- How can MCA be made trainable with 3D-VAE-GAN to enhance polygon mesh generation?
- How can the MCA be integrated into 3D-VAE-GAN’s loss function?
- How can we feed both the vertex list and the polygon-face list into the 3D-GAN architecture as inputs?
- How can the proposed framework be used to reconstruct a 3D polygon mesh from a 2D image?
- How well does the polygon mesh-based 3D generation framework perform?
Assessment of Commercial Potential of the Work
Expected Findings, Outputs or possible Applications created based on the work
The output of the research is a 3D polygon mesh. A polygon mesh generator, a 3D polygon mesh classifier and a deep marching cubes algorithm will be developed.
The key Stakeholders for our Research Output
The following are the direct stakeholders who will make use of our research output:
- Game Developers
- Augmented Reality (AR) Developers
- Robotics Engineers
- Autonomous Car Developers
- Medical Imaging Technicians
Game developers are software developers who specialize in video game development, that is, the process and related disciplines of creating video games. They can use our 3D models to develop their games.
Augmented reality (AR) developers are software developers who integrate digital information with the user's surroundings in real time, augmenting the real-world environment with computer-generated sensory input such as video, sound, graphics or GPS data.
Robotics engineers use computer-aided design and drafting (CADD) and computer-aided manufacturing (CAM) systems to perform their tasks. Robotics research engineers design robotic systems and research methods to manufacture them economically.
Autonomous car developers are software developers who develop driverless cars with the help of machine learning algorithms.
Medical imaging technicians are responsible for gathering images through X-rays, ultrasounds and other equipment. These images are then used by doctors and other health care professionals to diagnose or more closely examine medical issues, concerns or conditions. Medical imaging technicians play a huge role in giving physicians the up-close look needed to determine what type of care a patient needs.
Commercial Potential of our Research
My research has commercial potential. As the Internet of Things (IoT) becomes widespread, there is a need to integrate elements of artificial intelligence into IoT systems so that they work well. Machine learning developers are responsible for designing the machine learning models incorporated into IoT. My research output can be used by the above stakeholders as input for their applications. For example, AR developers can use my research output as input for their various applications, such as the Airport passenger app, the IKEA Place app, Dulux Visualizer, the Measured app, Envisioned by The Mine, Sephora and AccuVein. 3D Polygon Mesh Generation using Variational AutoEncoder and Generative Adversarial Network falls under direct commercialization. I intend to pursue the commercial potential of my research by seeking the support of Multimedia University (MMU) through its Entrepreneur Development Centre (EDC) to help protect the intellectual property (IP) arising from my research and to generate revenue and profit by licensing this IP to commercial organizations and individuals. MMU, through the EDC, should coordinate the commercialization process of the research output.
Reasons/Justifications
Multimedia University will be in charge of the commercialization process of my research work since I am under MMU GRA. The profit from the research work should be shared as stipulated in the MMU guidelines for the research commercialization process.
Value Proposition Canvas (VPC) and Business Model Canvas (BMC)
- Value Proposition Canvas (VPC) is a tool to help entrepreneurs to visualize, design and test how their business creates value for the customers.
- Business Model Canvas (BMC) is a visual template for developing a new or documenting an existing business model.
Concluding Remarks
Our work on 3D polygon mesh generation uses a novel 3D-VAE-GAN network. The latent representations used as input to 3D-GAN are obtained by feeding a 2D image into the VAE, rather than by sampling random noise as specified by Goodfellow et al. (2014). Our network is able to recover the 3D object corresponding to that 2D image, and the results obtained show that the 3D-VAE-GAN model can generate a 3D polygon mesh. We believe that our business model will help us to pursue the commercial potential of our research.