GENERATING NEW 3D MODELS FROM TEXT: MAGIC3D
Nvidia, a GPU manufacturer, has announced Magic3D, a generative AI that can produce 3D models from a text prompt.
Generative AI for 3D modeling aids the conceptualization of components with complex, organic shapes, while 3D printing is an ideal technology for bringing these shapes to life because it can produce complex structures cost-effectively.
The results of the annual 3D Printing Industry Executive Survey show that the automatic generation of 3D models using AI is a hot topic.
Magic3D creates a 3D mesh model with colored texture within 40 minutes of being given a text prompt such as “A blue poison-dart frog sitting on a water lily.” With some refinement, the result can be used in CGI art scenes or video games. Nvidia presents Magic3D in its academic paper as a response to DreamFusion, a text-to-3D model released by Google researchers in September 2022. Separately, Physna Inc. built a generative AI prototype for 3D models and scenes in two weeks using 8,000 models.
The researchers in the paper explained how this technology will allow anyone to create 3D models without the need for special training. “Once refined, the resulting technology could speed up video game (and VR) development and perhaps eventually find applications in special effects for film and TV. We hope with Magic3D, we can democratize 3D synthesis and open up everyone’s creativity in 3D content creation.”
Nvidia is well placed to advance AI. The company’s GPUs create lifelike graphics using shaders, which determine how each pixel in an image appears under a given light. Because a shader must be evaluated for every pixel, rendering amounts to the same simple calculation repeated across millions of pixels. Unlike Intel microprocessors and other general-purpose CPUs, Nvidia GPUs are designed to carry out many simple calculations, such as shading pixels, at once, so they can render images quickly. Nvidia sees AI applications as a critical growth driver; Bloomberg has attributed a $4.6 billion increase in the wealth of Nvidia founder Jensen Huang to the popularity of ChatGPT, an AI chatbot.
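To make the point above concrete, here is a toy sketch of why per-pixel shading suits parallel hardware: each pixel’s result depends only on its own inputs, so the same small calculation can run for every pixel at once. This is a minimal illustration using a simple Lambertian diffuse term, not Nvidia’s actual shader pipeline.

```python
def shade_pixel(normal, light_dir):
    """Lambertian diffuse term for one pixel: max(0, N . L)."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, dot)

def shade_image(normals, light_dir):
    # Each pixel is independent -- on a GPU, thousands of these
    # evaluations run simultaneously; here we simply map over them.
    return [shade_pixel(n, light_dir) for n in normals]

light = (0.0, 0.0, 1.0)              # light pointing along +z
normals = [(0.0, 0.0, 1.0),          # surface facing the light
           (1.0, 0.0, 0.0),          # surface at 90 degrees
           (0.0, 0.0, -1.0)]         # surface facing away (clamped to 0)
print(shade_image(normals, light))   # [1.0, 0.0, 0.0]
```

Because no pixel reads another pixel’s result, the loop in `shade_image` is exactly the kind of work a GPU spreads across its many cores.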
What tasks can Magic3D perform?
Magic3D employs a two-stage method: a rough model is created at low resolution and then optimized to a higher resolution. This is similar to how DreamFusion uses a text-to-image model to produce a 2D image that is then optimized into volumetric NeRF (neural radiance field) data. According to the paper’s authors, the resulting Magic3D technique can produce 3D objects twice as fast as DreamFusion.
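The coarse-to-fine structure described above can be sketched schematically. This is only an outline of the two-stage pattern from the paper, not the actual implementation; every function name and value here is a hypothetical placeholder.

```python
def coarse_stage(prompt, steps=3):
    """Stage 1: optimize a cheap, low-resolution scene representation."""
    model = {"prompt": prompt, "resolution": 64, "loss": 1.0}
    for _ in range(steps):
        model["loss"] *= 0.5        # stand-in for gradient updates
    return model

def fine_stage(coarse_model, steps=3):
    """Stage 2: upsample the coarse result and refine it at high resolution."""
    mesh = dict(coarse_model, resolution=512)
    for _ in range(steps):
        mesh["loss"] *= 0.5         # further refinement from a warm start
    return mesh

mesh = fine_stage(coarse_stage("a blue poison-dart frog on a water lily"))
print(mesh["resolution"], mesh["loss"])   # 512 0.015625
```

The speedup claim rests on this structure: most optimization steps happen at the cheap low resolution, and the expensive high-resolution stage starts from a good initialization instead of from scratch.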
Magic3D can also perform prompt-based 3D mesh editing. Given a base prompt and a low-resolution 3D model, the text can be modified to change the resulting model. The authors also demonstrated preserving the same subject across multiple generations (a property known as coherence) and applying the style of a 2D image (such as a cubist painting) to a 3D model.
Generative AI and 3D printing: a future with huge potential
Paul Powers, Founder and CEO of Physna Inc., shared his thoughts on creating equitable 3D generative AI. Powers says that generative AI conquered 2022, which prompted the firm to explore combining 3D printing and generative AI. Although Physna is a 3D search and analysis company focused on engineering and design applications in AR/VR and manufacturing, it built a very basic generative AI prototype for 3D models and scenes in two weeks, using only 8,000 models and just three engineers.
Powers further explained the reason behind the experiment. He claims that generative AI has taken many industries by storm but is lagging behind in 3D printing. The main reasons for this delay are the complexity of 3D models and a lack of labeled 3D data. 3D models have conventionally been difficult to create, come in a variety of incompatible formats, and have received scant attention compared with the analysis of 2D media (text, images, video, etc.). Few enterprises are equipped to concentrate on 3D because it has historically been a difficult problem at the analytical level.
Furthermore, Google’s DreamFusion team summarized the second issue in its article last year: there is far less 3D data available than 2D data. The team utilized NeRFs (neural radiance fields) in the same way that Nvidia’s Magic3D team did. The resulting models are also empty “shells” in the sense that they lack geometry and internal components, says Powers.
This means that not only do users possess less information about the object at hand, but it is also difficult to make assumptions about the technology. While training on NeRFs may be more helpful than training on 2D models, as Google’s DreamFusion team pointed out, “NeRFs isn’t a good substitute for true, labeled 3D models.” This means that, absent a solution, generative AI will not perform nearly as well in 3D as it does in other areas in the near future. The company carried out further experiments to check the compatibility of generative AI with 3D printing.
How does GPU computing improve 3D printing?
GPU computing is the use of a GPU (graphics processing unit) as a co-processor alongside the CPU to accelerate technical and scientific computing. The GPU speeds up CPU-based applications by taking over some of the time-consuming, compute-intensive code.
The remainder of the application continues to run on the CPU. From the user’s perspective, the application simply runs faster, because it exploits the GPU’s parallel processing power. This type of computing is known as “hybrid” or “heterogeneous” computing. A CPU typically has four to eight cores, whereas a GPU typically has hundreds of smaller cores; the GPU’s high computing performance comes from this massively parallel structure.
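The offload pattern described above can be sketched in a few lines. This is a minimal illustration of the structure, with an ordinary function standing in for the accelerator; on real hardware, only `offload_square` would run on the GPU while the rest stays on the CPU.

```python
def offload_square(data):
    """Stand-in for a GPU kernel: an elementwise, data-parallel operation."""
    return [x * x for x in data]    # each element is independent -> parallelizable

def application(data):
    # Host-side logic stays on the CPU...
    filtered = [x for x in data if x >= 0]
    # ...while the compute-intensive hot loop is handed to the co-processor.
    squared = offload_square(filtered)
    # The host resumes with the accelerator's result.
    return sum(squared)

print(application([-2, 1, 2, 3]))   # 1 + 4 + 9 = 14
```

The key design point is that only the data-parallel hot loop is offloaded; the branching, filtering, and aggregation logic remains ordinary CPU code.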
Application developers can take advantage of the parallel GPU architecture’s performance by employing NVIDIA’s “CUDA” parallel programming model. The NVIDIA CUDA parallel-programming model is supported by all NVIDIA GPUs, including GeForce, Quadro, and Tesla. Previously, Nvidia introduced a way of converting 2D images into 3D models.
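CUDA organizes work into a grid of blocks, each containing many threads, and every thread derives a global index from its block and thread coordinates. The sketch below emulates that indexing scheme in plain Python for a SAXPY-style operation (`a*x + y`); it is an illustration of the model, not real CUDA code, and the nested loops stand in for what the GPU runs in parallel.

```python
def saxpy_kernel(thread_idx, block_idx, block_dim, a, x, y, out):
    i = block_idx * block_dim + thread_idx   # global index, as in CUDA
    if i < len(x):                           # bounds guard, as in a real kernel
        out[i] = a * x[i] + y[i]

def launch(grid_dim, block_dim, a, x, y):
    out = [0.0] * len(x)
    for block in range(grid_dim):            # on a GPU, blocks run in parallel
        for thread in range(block_dim):      # as do the threads within a block
            saxpy_kernel(thread, block, block_dim, a, x, y, out)
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 10.0, 10.0, 10.0, 10.0]
print(launch(grid_dim=2, block_dim=4, a=2.0, x=x, y=y))
# [12.0, 14.0, 16.0, 18.0, 20.0]
```

The bounds guard matters because the grid (2 blocks of 4 threads = 8 threads) is larger than the 5-element array, just as a real CUDA launch often over-provisions threads.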
The framework demonstrates how it is possible to infer shape, texture, and light from a single image, in a manner similar to how the naked eye works. NVIDIA PR specialist Lauren Finkle wrote on the company blog, “Close your left eye as you look at this screen. Now close your right eye and open your left, you’ll notice that your field of vision shifts depending on which eye you’re using. That’s because while we see in two dimensions, the images captured by your retinas are combined to provide depth and produce a sense of three-dimensionality.”
The NVIDIA rendering framework, known as a differentiable interpolation-based renderer, or DIB-R, has the potential to assist and expedite different areas of 3D design and robotics, rendering 3D models in seconds. According to Finkle, the 3D world we exist in is actually viewed through a 2D lens, which is known as stereoscopic vision.
Depth is created in the brain by merging images seen through each eye, giving the impression of a three-dimensional image. DIB-R, which works on a similar principle, can predict the shape, color, texture, and lighting of an image by transforming input from a 2D image into a map. This map is then utilized to create a polygon sphere, resulting in a 3D model that represents the component in the original 2D image.
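The stereoscopic principle the passage describes can be reduced to a single formula: the apparent shift (disparity) of a point between two horizontally offset views encodes its depth as depth = focal_length × baseline / disparity. The sketch below illustrates that simplified relationship only; DIB-R itself infers shape from a single image with a learned, differentiable renderer.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth (meters) from the shift of a point between two offset views."""
    if disparity_px <= 0:
        raise ValueError("point must shift between the two views")
    return focal_px * baseline_m / disparity_px

# A nearby object shifts a lot between the views; a distant one barely moves.
near = depth_from_disparity(focal_px=800, baseline_m=0.06, disparity_px=48)
far = depth_from_disparity(focal_px=800, baseline_m=0.06, disparity_px=6)
print(near, far)   # 1.0 8.0
```

The 0.06 m baseline here roughly matches human interpupillary distance, which is why the brain’s two slightly different retinal images suffice to produce a sense of depth.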
Elsewhere, Daghan Cam, previously a teaching fellow at University College London’s Bartlett School of Architecture, used GPU computing to create a flawless piece of 3D printed architecture. Cam used his expertise with the CUDA parallel programming model and NVIDIA GPUs to teach his robotic fabrication system to complete his abstractly designed structures algorithmically before producing a 3D printed prototype.
Cam turned to Boston Limited and Materialise to 3D print his modernistic prototype design after finishing the 3D model with a Quadro K6000 graphics card and a Tesla K40 GPU accelerator. Materialise’s high-resolution Mammoth stereolithography printer, capable of producing large, complex prints in a single piece, was used to 3D print the prototype. The completed prototype was intricate, abstract, and extremely pleasing to the eye, as if perfectly suited for display at MoMA or the Louvre.