The simplest way to self-host ControlNet SD 1.5 Depth. Launch a dedicated cloud GPU server running Lab Station OS to download and serve the model using any compatible app or framework.
Download model weights for local inference. Must be used with a compatible app, notebook, or codebase. May run slowly, or not work at all, depending on your system resources, particularly GPU(s) and available VRAM.
ControlNet SD 1.5 Depth enables depth-aware image generation using depth maps to control spatial composition. It supports multiple depth estimation methods (MiDaS, Leres, Zoe) and can be combined with other ControlNet models. Notable for its Control-LoRA variants that maintain functionality at reduced model sizes.
ControlNet SD 1.5 Depth is a specialized model in the ControlNet family designed to provide conditional control over image generation using depth information. Built on Stable Diffusion 1.5, it enables users to manipulate and generate images guided by depth maps, offering precise control over spatial relationships and composition in the generated output.
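For use from a notebook or codebase rather than a GUI, a minimal depth-guided generation run might look like the sketch below. It assumes the diffusers library and the commonly published ControlNet v1.1 depth and Stable Diffusion 1.5 repositories (lllyasviel/control_v11f1p_sd15_depth and runwayml/stable-diffusion-v1-5); substitute local checkpoints and file paths as needed.

```python
# Minimal sketch of depth-guided generation with diffusers.
# Repo IDs and file names are illustrative; adjust to your own setup.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1p_sd15_depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")  # assumes a CUDA-capable GPU

# A grayscale depth map: brighter pixels are closer to the camera.
depth_map = load_image("depth_map.png")

image = pipe(
    "a cozy reading nook, soft window light",
    image=depth_map,                    # the depth map steers spatial composition
    num_inference_steps=30,
    controlnet_conditioning_scale=1.0,  # strength of the depth guidance
).images[0]
image.save("output.png")
```

Because the control input is an ordinary grayscale image, any source of depth, whether an estimation model or a rendering engine, can drive the pipeline.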
The model was trained on a comprehensive dataset combining depth maps from multiple sources, including MiDaS, Leres, and Zoe, at various resolutions (256, 384, and 512 pixels). The training process incorporated data augmentation techniques, including random left-right flipping, to improve model robustness. A significant update in version 1.1 refined the training dataset to address issues with duplicated images and incorrect prompts, resulting in a more unbiased model that's less prone to generating grayscale human images, as detailed in the ControlNet v1.1 repository.
The model is available in multiple variants, including Control-LoRA versions that significantly reduce the model size. While the original ControlNet models are approximately 4.7GB, the Control-LoRA variants come in Rank 256 (738MB) and Rank 128 (377MB) sizes, making them more suitable for consumer-grade GPUs. These compressed versions were achieved through low-rank parameter-efficient fine-tuning and were trained on a diverse dataset encompassing various image concepts and aspect ratios.
The model supports multiple depth map preprocessors, including depth_midas, depth_zoe, depth_leres++, and depth_leres, each offering slightly different depth estimations and resulting image outputs. This flexibility allows users to choose the most appropriate preprocessor for their specific use case. The model demonstrates robust performance with real depth maps from rendering engines and can effectively process depth information across different resolutions.
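To illustrate how these preprocessors differ in practice, the sketch below produces a depth map with each of them using the controlnet_aux package (an assumption; any annotator implementation works). Treating boost=True as an approximation of the depth_leres++ variant is also an assumption.

```python
# Sketch: comparing depth preprocessors via controlnet_aux (pip install controlnet_aux).
# Each detector yields a slightly different depth estimate from the same photo.
from controlnet_aux import MidasDetector, ZoeDetector, LeresDetector
from PIL import Image

source = Image.open("photo.jpg")  # hypothetical input image

midas = MidasDetector.from_pretrained("lllyasviel/Annotators")
zoe = ZoeDetector.from_pretrained("lllyasviel/Annotators")
leres = LeresDetector.from_pretrained("lllyasviel/Annotators")

depth_midas = midas(source)                  # corresponds to depth_midas
depth_zoe = zoe(source)                      # corresponds to depth_zoe
depth_leres = leres(source)                  # corresponds to depth_leres
depth_leres_pp = leres(source, boost=True)   # assumed stand-in for depth_leres++

for name, img in [("midas", depth_midas), ("zoe", depth_zoe),
                  ("leres", depth_leres), ("leres_pp", depth_leres_pp)]:
    img.save(f"depth_{name}.png")
```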
A key feature of the ControlNet implementation is its smart resampling algorithm, which ensures pixel-perfect control images regardless of resolution. The model can also be used in conjunction with other ControlNet models through the "Multi-ControlNet" feature of the Automatic1111 WebUI's ControlNet extension.
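Outside the WebUI, a comparable multi-ControlNet setup can be sketched with diffusers by passing a list of ControlNets together with matching control images. The repository IDs, file names, and conditioning scales below are illustrative assumptions, not the WebUI's own mechanism.

```python
# Hedged sketch: combining the depth ControlNet with a second ControlNet
# (Canny edges) in a diffusers script, analogous to Multi-ControlNet.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

depth_cn = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1p_sd15_depth", torch_dtype=torch.float16)
canny_cn = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=[depth_cn, canny_cn],   # a list activates multi-ControlNet behavior
    torch_dtype=torch.float16,
).to("cuda")

result = pipe(
    "an industrial loft interior",
    image=[load_image("depth_map.png"), load_image("canny_edges.png")],
    controlnet_conditioning_scale=[1.0, 0.6],  # per-model control strength
).images[0]
result.save("multi_controlnet.png")
```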
The model can be implemented through various interfaces, including ComfyUI, StableSwarmUI, and the popular Automatic1111 WebUI. For optimal performance on systems with limited VRAM, users are recommended to launch the WebUI with specific command-line flags: --medvram-sdxl for systems with 8GB to 16GB of VRAM, and --lowvram for systems with less than 8GB.
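Those flags apply only to the WebUI launcher. For script-based use on memory-constrained systems, a rough equivalent is to enable diffusers' built-in memory savers, as in the sketch below (the pipeline construction mirrors the earlier examples and is an assumption, not a WebUI setting).

```python
# Hedged sketch: reducing peak VRAM when running the pipeline from a script.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1p_sd15_depth", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16)

pipe.enable_model_cpu_offload()   # keeps only the active submodule on the GPU (needs accelerate)
pipe.enable_attention_slicing()   # trades some speed for lower peak VRAM
```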
When using the Automatic1111 interface, the model files can be placed in either stable-diffusion-webui\extensions\sd-webui-controlnet\models or stable-diffusion-webui\models\ControlNet. The implementation supports various preprocessors and can be combined with other ControlNet models for more complex image generation tasks.
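The weights can also be fetched programmatically and dropped into one of those folders. The sketch below assumes the huggingface_hub package and the commonly published lllyasviel/ControlNet-v1-1 repository; adjust the target path to the actual WebUI install.

```python
# Hedged sketch: downloading the depth ControlNet weights into the
# Automatic1111 ControlNet folder described above.
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="lllyasviel/ControlNet-v1-1",
    filename="control_v11f1p_sd15_depth.pth",
    local_dir=r"stable-diffusion-webui\models\ControlNet",
)
```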