Note: Stable Fast 3D weights are released under a Stability AI Non-Commercial Research Community License and may not be used for commercial purposes. Please read the license to verify whether your use case is permitted.
Model Report
stabilityai / Stable Fast 3D
Stable Fast 3D is a transformer-based generative AI model developed by Stability AI that reconstructs textured 3D mesh assets from single input images in approximately 0.5 seconds. The model predicts comprehensive material properties including albedo, roughness, and metallicity, producing UV-unwrapped meshes suitable for integration into rendering pipelines and interactive applications across gaming, virtual reality, and design workflows.
Stable Fast 3D (SF3D) is a generative artificial intelligence model developed by Stability AI for rapid 3D reconstruction from a single input image. SF3D produces textured, UV-unwrapped 3D mesh assets, facilitating downstream applications in graphics, virtual reality, and design. The model combines fast inference with explicit material parameter prediction, yielding assets that integrate readily into modern rendering pipelines and interactive platforms. First introduced in August 2024, SF3D builds on advances from previous Stability AI models such as TripoSR and SV3D, focusing on both the fidelity and usability of its outputs, as described in Stability AI News and its Hugging Face Model Card.
SF3D reconstructing a 3D mesh from a single image of a toy duck. The GIF cycles through the input image, the computed wireframe, and the final rotating rendered mesh, illustrating the model's end-to-end output.
Model Architecture
At its core, SF3D employs a Transformer-based architecture for the image-to-3D reconstruction task, as detailed in its Hugging Face Model Card. Unlike previous methods such as TripoSR, SF3D was trained from the ground up with a redesigned network that supports explicit mesh extraction and optimized UV-unwrapping strategies. It predicts comprehensive surface properties, most notably albedo, roughness, and metallicity, which are essential for realistic rendering. The illumination model is disentangled to minimize baked-in lighting, allowing the same asset to adapt seamlessly to different environments within engines or visualization pipelines, a feature highlighted in Stability AI News.
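The albedo, roughness, and metallic maps SF3D predicts correspond to the widely used metallic-roughness shading model. As a rough illustration (standard glTF 2.0 conventions, not SF3D code), the sketch below shows how these per-surface values are conventionally combined into shading terms by a renderer:

```python
# Illustrative only: how material maps of the kind SF3D predicts
# (albedo, roughness, metallic) feed a standard metallic-roughness
# shading model, following glTF 2.0 conventions. Not SF3D code.

def metallic_roughness_terms(albedo, roughness, metallic):
    """Derive the diffuse color, specular reflectance at normal
    incidence (F0), and microfacet alpha from PBR material values."""
    # Metals have no diffuse reflection; dielectrics keep their albedo.
    diffuse = tuple(c * (1.0 - metallic) for c in albedo)
    # F0 blends from ~4% (typical dielectric) toward the albedo for metals.
    f0 = tuple(0.04 * (1.0 - metallic) + c * metallic for c in albedo)
    # Perceptual roughness is squared to obtain the microfacet alpha.
    alpha = roughness ** 2
    return diffuse, f0, alpha

# Example: a moderately rough, fully metallic, gold-like surface.
diffuse, f0, alpha = metallic_roughness_terms((1.0, 0.77, 0.34), 0.4, 1.0)
print(diffuse)  # (0.0, 0.0, 0.0) -- metals carry no diffuse term
print(alpha)    # ~0.16
```

Because lighting is disentangled rather than baked into the albedo, these terms can be re-evaluated under any environment a target engine supplies.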
The model accepts input images of 512x512 pixels, with output texture resolution specified at inference time. SF3D integrates options for mesh remeshing, supporting triangle and quadrilateral topologies, and enables fine-grained control over the vertex count, facilitating compatibility with downstream 3D workflows. The remeshing implementations are based on established algorithms such as Botsch and Kobbelt's multiresolution modeling and Instant Field-Aligned Meshes, as further detailed in the SF3D GitHub repository, and introduce only minor computational overhead.
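To see what the topology choice implies, Euler's formula relates a target vertex count to the expected face count of a closed mesh. This is a back-of-the-envelope estimate, not logic from the SF3D codebase:

```python
# Back-of-the-envelope sketch (not from the SF3D codebase) of how a
# target vertex count maps to face counts for triangle and quad
# topologies, using Euler's formula V - E + F = 2 for a closed
# genus-0 mesh.

def estimated_faces(target_vertices: int, topology: str) -> int:
    if topology == "triangle":
        # Each face has 3 edges, each edge shared by 2 faces: E = 3F/2,
        # so V - 3F/2 + F = 2 gives F = 2V - 4.
        return 2 * target_vertices - 4
    if topology == "quad":
        # Each face has 4 edges, each shared by 2 faces: E = 2F,
        # so V - 2F + F = 2 gives F = V - 2.
        return target_vertices - 2
    raise ValueError(f"unknown topology: {topology!r}")

print(estimated_faces(2000, "triangle"))  # 3996
print(estimated_faces(2000, "quad"))      # 1998
```

For the same vertex budget, a quad remesh therefore yields roughly half as many faces as a triangle remesh, which is one reason quad topology is often preferred for assets that will be edited or subdivided downstream.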
Diagram illustrating the SF3D reconstruction pipeline, including modules for enhanced image encoding, mesh extraction, material estimation, illumination modeling, and texture export. The process highlights both intermediate and final model outputs.
Training Data and Methodology
SF3D was trained primarily on the Objaverse dataset, which offers a diverse set of 3D models suitable for robust generalization. To better match the image distribution encountered in practical use cases, Stability AI enhanced its rendering methodology and implemented a rigorous data curation process, selectively including Objaverse objects based on licensing and suitability, as described in its Hugging Face Model Card. This approach allowed SF3D to learn from a spectrum of object geometries and textures, improving its ability to generalize to novel data.
The supervised training regime focused on reconstructing both geometry and physically-informed material properties. The architectural innovations and dataset augmentations collectively enabled the disentanglement of geometry and lighting, a critical factor for creating flexible, reusable 3D assets, as discussed in the SF3D Technical Report.
Performance and Evaluation
SF3D distinguishes itself through inference speed and mesh quality. The model reconstructs a 3D asset in approximately 0.5 seconds on a standard consumer-grade GPU, reducing turnaround compared to preceding models, such as SV3D, which required upwards of ten minutes for similar tasks. SF3D achieves this with a parameter count of 1.01 billion, balancing efficiency and capacity as documented in Stability AI News and its Hugging Face Model Card.
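For a rough sense of what that parameter count means for hardware, the weight memory at common numeric precisions can be estimated with simple arithmetic (an illustration, not a published figure, and it ignores activations and framework overhead):

```python
# Rough arithmetic (illustrative, not a published figure): memory
# needed just to hold SF3D's 1.01 billion parameters at common
# numeric precisions, excluding activations and framework overhead.

PARAMS = 1.01e9  # parameter count reported for SF3D

for name, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{name}: {gib:.2f} GiB for weights alone")
# fp32: 3.76 GiB for weights alone
# fp16/bf16: 1.88 GiB for weights alone
```

At half precision the weights fit comfortably within the VRAM of consumer-grade GPUs, consistent with the model's positioning as a fast, locally runnable reconstruction tool.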
Quantitative comparison of SF3D (green star) against other state-of-the-art 3D reconstruction models, showing normalized F-score versus inference time. SF3D leads in speed while maintaining competitive reconstruction quality.
Empirical comparisons show that SF3D produces high-fidelity meshes, with more uniform UV maps and fewer baked-in illumination artifacts than TripoSR when evaluated against ground-truth references. The model's material parameter prediction further improves the realism and adaptability of its outputs across rendering contexts. Visual evaluations consistently demonstrate sharper geometry and more accurate textures.
Qualitative comparison chart illustrating SF3D's improvements in light bake-in, vertex coloring, geometry extraction, and material parameter prediction compared to prior models and ground truth.
Applications
SF3D enables a wide spectrum of use cases in both creative and technical domains. Its fast asset generation is particularly beneficial for rapid prototyping in game development and virtual reality content creation, where numerous static objects, such as props and furniture, must be generated efficiently, as noted by Stability AI. Designers and artists leverage the model's outputs for concept creation, visualization, and digital art workflows, while educational platforms utilize the reconstruction capability to generate interactive learning content.
The model's explicit material predictions and UV-mapped outputs facilitate direct integration into photorealistic rendering engines, supporting applications in architecture, retail visualization, and augmented reality experiences. SF3D is also suited for research on 3D representation learning and generative modeling, offering a resource for studying reconstruction limitations and dataset biases, as outlined in its Hugging Face Model Card. The outputs are not designed to provide factual representations of real individuals or historical events, reflecting guidelines set forth in the model's use policy.
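As a hypothetical sketch of such an integration (the names and factor values below are placeholders, not SF3D output), a glTF 2.0 material block carrying metallic-roughness parameters of the kind SF3D predicts could be assembled like this; real exports would reference the predicted texture maps rather than scalar factors:

```python
import json

# Hypothetical sketch: a glTF 2.0 material block of the kind a
# downstream pipeline might attach to an SF3D mesh. Factor values
# are placeholders; real exports reference baseColorTexture and
# metallicRoughnessTexture images instead of scalars.

material = {
    "name": "sf3d_asset_material",  # placeholder name
    "pbrMetallicRoughness": {
        "baseColorFactor": [1.0, 1.0, 1.0, 1.0],
        "metallicFactor": 0.0,
        "roughnessFactor": 0.8,
    },
}

print(json.dumps({"materials": [material]}, indent=2))
```

Because glTF's metallic-roughness model matches the properties SF3D predicts, assets can move into engines that consume glTF without re-authoring materials.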
Rotating 3D mesh output generated by SF3D, presenting a smooth, stylized head. The demonstration highlights the uniformity and continuity of the generated mesh.
Limitations and Licensing
The scope of SF3D is purposefully constrained: it is not optimized for realistic representation of people or events, nor for applications requiring precise factual correspondence, as stated in its Hugging Face Model Card. Support for certain backends such as Apple Silicon (MPS) and Windows remains experimental, and performance or memory usage on these platforms may differ from primary CUDA-enabled environments, as described in the SF3D GitHub Repository. The model's speed and mesh quality may vary with system configuration and input complexity.
Stable Fast 3D is distributed under Stability AI's Community License, permitting free academic, research, and non-commercial use, as well as commercial use for entities with annual revenue up to $1,000,000. Larger-scale commercial applications require an explicit enterprise license from Stability AI.
Related Models and Lineage
SF3D is a direct successor to TripoSR, another transformer-based 3D reconstruction model developed by Stability AI. The advances introduced in SF3D—including faster inference, mesh extraction, illumination disentanglement, and material parameter prediction—set it apart from both TripoSR and SV3D. Benchmarking evidence supports SF3D's position as a rapid and versatile 3D asset generator suitable for a variety of applied domains, as outlined in Stability AI News.