Model Report
lllyasviel / ControlNet SD 1.5 Canny
ControlNet SD 1.5 Canny is a conditioning model for Stable Diffusion 1.5 that uses Canny edge detection to control image generation structure. Part of the ControlNet 1.1 family, it processes edge maps to guide composition while preserving text-prompt flexibility. The model features improved training data quality and enhanced robustness, and it supports multi-ControlNet workflows for complex conditioning scenarios.
ControlNet SD 1.5 Canny is a model within the ControlNet 1.1 family, designed to condition image generation in Stable Diffusion 1.5 using Canny edge maps as control signals. The model leverages Canny edge detection—a classic computer vision algorithm that highlights image boundaries—to influence the composition and structure of generated outputs, facilitating precise and repeatable control over the creative process. ControlNet SD 1.5 Canny builds upon its predecessor, offering improvements in robustness and visual quality by refining its data pipeline and training protocol.
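To illustrate the kind of control signal the model consumes, the following sketch extracts a Canny edge map from a photograph with OpenCV and packages it as an RGB control image. The file names and threshold values are placeholders chosen for illustration.

```python
# Sketch: preparing a Canny edge map as a ControlNet control image.
# File names and hysteresis thresholds are illustrative placeholders.
import cv2
import numpy as np
from PIL import Image

image = cv2.imread("input.jpg")                  # source photograph (BGR, uint8)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)   # Canny operates on a single channel
edges = cv2.Canny(gray, 100, 200)                # lower/upper hysteresis thresholds

# Stack the edge map to three channels, matching the RGB input most pipelines expect.
control_image = Image.fromarray(np.stack([edges] * 3, axis=-1))
control_image.save("canny_control.png")
```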
Standard ControlNet Naming Rules (SCNNRs) breakdown, shown with the model file 'control_v11p_sd15_canny.pth' as an example.
The underlying architecture of ControlNet 1.1, including the Canny variant, retains the core structure of earlier releases. The system integrates ControlNet's auxiliary encoder with the Stable Diffusion 1.5 U-Net backbone: the encoder processes the Canny edge map and injects its outputs into the U-Net as residual signals, so generation is explicitly guided by the extracted structural information. The official ControlNet-v1-1-nightly repository also documents variant-specific fusion details, such as the global average pooling applied between the ControlNet Encoder outputs and the U-Net layers for the Shuffle model.
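One common way to exercise this conditioning path, though not the only one, is the diffusers ControlNet pipeline. The sketch below assumes the publicly hosted lllyasviel/control_v11p_sd15_canny weights, the runwayml/stable-diffusion-v1-5 base checkpoint, and a CUDA device; adjust identifiers for local files or other base models.

```python
# Sketch: loading the Canny ControlNet alongside Stable Diffusion 1.5 with diffusers.
# Model identifiers refer to the public Hugging Face repositories.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from PIL import Image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

control_image = Image.open("canny_control.png")  # edge map prepared earlier
result = pipe(
    "a dog in a room, photorealistic",
    image=control_image,
    num_inference_steps=30,
).images[0]
result.save("canny_guided_output.png")
```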
A critical operational detail is that ControlNet is integrated on the conditional branch of the classifier-free guidance (CFG) mechanism in Stable Diffusion, ensuring that the edge-derived conditions robustly influence the generation process without affecting unconditional guidance. This architecture allows the user to exert granular control over image composition while maintaining the flexibility of text-conditioned generation.
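A minimal sketch of that guidance step is shown below, with `unet` and `controlnet` as hypothetical stand-ins for the pipeline components: the ControlNet residuals are added only to the text-conditioned prediction, and the two branches are then combined with the usual CFG formula.

```python
# Sketch: classifier-free guidance with ControlNet residuals on the conditional
# branch only, as described above. `unet` and `controlnet` are hypothetical
# stand-ins for the actual pipeline modules.
def guided_noise_prediction(unet, controlnet, latents, t, text_emb, uncond_emb,
                            control_image, guidance_scale=7.5):
    # Unconditional branch: plain U-Net prediction, no ControlNet residuals.
    eps_uncond = unet(latents, t, encoder_hidden_states=uncond_emb).sample

    # Conditional branch: the ControlNet encodes the edge map and returns residuals
    # that are injected into the U-Net's blocks.
    down_res, mid_res = controlnet(latents, t, encoder_hidden_states=text_emb,
                                   controlnet_cond=control_image, return_dict=False)
    eps_cond = unet(latents, t, encoder_hidden_states=text_emb,
                    down_block_additional_residuals=down_res,
                    mid_block_additional_residual=mid_res).sample

    # Standard CFG combination: the edge condition influences only eps_cond.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```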
The ControlNet 1.1 framework includes support for "Multi-ControlNet," where multiple ControlNet models can be used in tandem, enabling combinations of edge, pose, segmentation, and other modalities. This functionality is accessible in popular Stable Diffusion user interfaces and facilitates complex conditioning scenarios, as discussed in the main ControlNet documentation.
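In diffusers, for example, Multi-ControlNet is exposed by passing a list of ControlNet models together with one control image per model. The sketch below pairs the Canny model with the OpenPose variant purely as an illustration; the second model and the conditioning scales are arbitrary choices.

```python
# Sketch: Multi-ControlNet in diffusers, combining Canny edges with an OpenPose
# condition. The second model and the conditioning scales are illustrative.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

canny_cn = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16
)
pose_cn = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_openpose", torch_dtype=torch.float16
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=[canny_cn, pose_cn], torch_dtype=torch.float16
).to("cuda")

images = pipe(
    "a person standing in a sunlit room",
    image=[Image.open("canny_control.png"), Image.open("pose_control.png")],
    controlnet_conditioning_scale=[1.0, 0.8],  # weight edges slightly above pose
    num_inference_steps=30,
).images
images[0].save("multi_controlnet_output.png")
```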
Training Data and Methodology
The training dataset for ControlNet SD 1.5 Canny was substantially improved over the 1.0 predecessor. Data curation removed duplicated grayscale human images, which had previously biased results toward grayscale generations, and filtered out low-quality, blurry, or JPEG-artifacted samples. Errors in paired prompts caused by processing bugs were also resolved, yielding higher-quality image-text associations during training. Details of these changes are described in the project documentation on HuggingFace.
Training employed a substantial compute budget: eight NVIDIA A100 80GB GPUs and a batch size of 256 over three days, enabling robust convergence and improved generalization, as reflected in benchmark results and output quality. The run resumed from the earlier Canny 1.0 weights and incorporated random Canny edge thresholding to diversify inputs, further enhancing downstream robustness.
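The exact augmentation code is not reproduced in the project documentation, but the randomized thresholding idea can be sketched roughly as follows; the threshold ranges are assumptions for illustration, not the values used in the official run.

```python
# Sketch: randomized Canny thresholding as a training-time augmentation.
# The threshold ranges are illustrative assumptions.
import random
import cv2

def random_canny(image_bgr):
    low = random.randint(50, 150)                # random lower hysteresis threshold
    high = random.randint(low + 50, low + 200)   # upper threshold kept above the lower
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.Canny(gray, low, high)
```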
Model Performance and Output Characteristics
Qualitative and batch testing of ControlNet SD 1.5 Canny demonstrates marked improvements in both consistency and fidelity compared to previous versions. The model is capable of generating images that closely align with the structure established by input Canny edge maps while allowing rich variations in style and detail via text prompts. Results from systematic tests, including the frequently referenced "dog in a room" scenario, illustrate the model's proficiency at preserving spatial layout and prominent edges while synthesizing visually coherent outputs.
Batch outputs of the 'dog in a room' prompt with a Canny edge map for conditioning. The model generates variations that are structurally consistent with the detected edges.
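Batch variations like these can be reproduced by reusing one edge map across several samples, for instance via num_images_per_prompt in the diffusers pipeline; the sketch below assumes the pipe object loaded earlier and an arbitrary seed.

```python
# Sketch: generating several structurally consistent variations from one edge map,
# reusing the `pipe` object loaded earlier. The seed and prompt are arbitrary.
import torch
from PIL import Image

control_image = Image.open("canny_control.png")
generator = torch.Generator(device="cuda").manual_seed(12345)

batch = pipe(
    "a dog in a room",
    image=control_image,
    num_images_per_prompt=4,        # four outputs sharing the same edge structure
    generator=generator,
    num_inference_steps=30,
).images

for i, img in enumerate(batch):
    img.save(f"dog_in_room_{i}.png")
```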
The architecture also supports the combination of multiple ControlNet modules, enhancing possibilities for creative compositing. As a result, users are able to generate complex scenes or character variations with fine-grained structural and stylistic control, which has been demonstrated in various multi-ControlNet scenarios shared in the open-source ControlNet repository.
Generated outputs using both Canny (edge) and Shuffle (content) conditions, illustrating diverse results with consistent global structure.
ControlNet SD 1.5 Canny is primarily utilized for tasks where preserving or interpreting visual structure is essential. By conditioning image generation on edge maps, it has become a tool of choice for applications such as sketch-to-image synthesis, controlled style transfer, and content-aware modification. Artists and developers use the model to transform Canny edge-detected sketches or extracted edge maps from photographs into fully rendered outputs matching textual prompts, while reliably maintaining underlying composition.
The model further supports integration into multi-module workflows, enabling comprehensive scene control when combined with other conditioning sources such as pose estimation, segmentation, or depth maps. Support for batch inference and randomization makes it straightforward to generate diverse outputs rapidly from a single edge map.
In professional and experimental contexts, this model enables workflows such as controlled data augmentation, rapid prototyping of concept art, and forensic applications where structural preservation from original images is required.
Limitations and Related Models
While ControlNet SD 1.5 Canny offers notable improvements in visual fidelity and robustness, certain limitations remain. For instance, when handling scenes with ambiguous or low-quality edge detection, the model's outputs may deviate from intended semantics. Additionally, some features—such as tiled upscaling or specific anime-trained variants—require auxiliary files or updated user interface extensions, as noted in the repository's guidance.
The broader ControlNet 1.1 family introduces several other conditioning modalities built on the same architecture, including models for depth map extraction, normal maps, semantic segmentation, scribble input, and more. Each variant is designed for specific structural signal integration, following the Standard ControlNet Naming Rules to ensure clarity in configuration and interoperability.
Licensing and Access
ControlNet SD 1.5 Canny, like its family counterparts, is released in a public repository, facilitating open research and reproducibility. While the specific license is not explicitly stated in the repository, public distribution is maintained on GitHub and HuggingFace, providing access to model weights, configuration files, and related resources. Users are advised to consult the repository for definitive licensing information before redistribution or derivative works.