Model Report
lllyasviel / ControlNet SD 1.5 MLSD
ControlNet SD 1.5 MLSD is a controllable image generation model that uses Mobile-optimized Line Segment Detection to guide Stable Diffusion 1.5 outputs through straight line structures extracted from input images. This model maintains the same architecture as ControlNet 1.0 while incorporating improved training data quality, additional augmentation techniques, and enhanced robustness for applications requiring geometric consistency and linear feature preservation.
ControlNet SD 1.5 MLSD is a model within the ControlNet 1.1 suite, designed to bring precise structural control to image generation by leveraging Mobile-optimized Line Segment Detection (M-LSD). It lets users guide Stable Diffusion 1.5 outputs with straight lines detected in input images, making it especially useful for applications where geometric consistency and linear features are paramount. ControlNet 1.1 maintains architectural continuity with ControlNet 1.0 while offering improvements in data quality, robustness, and training procedures.
Diagram illustrating the Standard ControlNet Naming Rules (SCNNRs) used in the ControlNet 1.1 model suite.
ControlNet SD 1.5 MLSD adopts the same neural network architecture as its predecessor, ControlNet 1.0. This architectural consistency facilitates compatibility across versions and ensures a stable foundation for ongoing research and deployment. The key input to the model is a set of straight lines extracted by the MLSD preprocessor, which serves as a structural constraint during image synthesis.
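For illustration, the short sketch below extracts such a line map with the M-LSD annotator from the community controlnet_aux package; the package, the annotator checkpoint ID, and the threshold values are assumptions of this example rather than requirements of the model release.

```python
# Preprocessing sketch, assuming the community controlnet_aux package.
# Thresholds are illustrative defaults, not values prescribed by the release.
from PIL import Image
from controlnet_aux import MLSDdetector

mlsd = MLSDdetector.from_pretrained("lllyasviel/Annotators")

source = Image.open("building_facade.png")  # hypothetical input image
# The detector renders detected straight line segments onto a black canvas;
# this line map is the conditioning image passed to the ControlNet.
line_map = mlsd(source, thr_v=0.1, thr_d=0.1)
line_map.save("building_lines.png")
```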
The model is distributed as the files control_v11p_sd15_mlsd.pth and control_v11p_sd15_mlsd.yaml. The ControlNet 1.1 architecture additionally supports inserting global average pooling between the ControlNet encoder outputs and the Stable Diffusion U-Net layers; when enabled, the ControlNet acts only on the conditional side of the Classifier-Free Guidance (CFG) scale. This behavior is toggled through the global_average_pooling item in the YAML configuration file.
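A minimal loading sketch in the style of the original ControlNet 1.1 codebase is shown below; the cldm module ships with that repository rather than with pip, and the file paths are placeholders.

```python
# Loading sketch following the original ControlNet 1.1 codebase layout
# (the cldm package comes with that repository, not from pip).
from cldm.model import create_model, load_state_dict

# The YAML file defines the network topology; the .pth file holds the weights.
model = create_model("./models/control_v11p_sd15_mlsd.yaml").cpu()
# Stable Diffusion 1.5 base weights and the MLSD ControlNet weights are merged
# into the same graph; strict=False lets the two partial state dicts coexist.
model.load_state_dict(load_state_dict("./models/v1-5-pruned.ckpt", location="cuda"), strict=False)
model.load_state_dict(load_state_dict("./models/control_v11p_sd15_mlsd.pth", location="cuda"), strict=False)
model = model.cuda()
```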
Data, Training, and Improvements
Considerable enhancements were introduced in ControlNet 1.1 MLSD to resolve data-related limitations observed in previous releases. The training dataset was refined by eliminating duplicated grayscale images, removing low-quality samples, and correcting prompt-image pair mismatches, resulting in increased robustness and reliability.
The dataset was further expanded with 300,000 additional images, each selected because MLSD analysis detected more than sixteen straight lines in it. Data augmentation strategies, including random left-right flipping, were applied to improve the model's generalization. Training resumed from the MLSD 1.0 checkpoint and involved an additional 200 GPU hours of computation on A100 80G GPUs.
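As a rough illustration of that selection criterion, the sketch below counts straight line segments with OpenCV's probabilistic Hough transform standing in for M-LSD, so its thresholds and counts only approximate the actual filtering procedure.

```python
# Hypothetical re-creation of the "more than sixteen straight lines" filter.
# OpenCV's probabilistic Hough transform stands in for M-LSD here, so counts
# and thresholds only approximate the actual selection procedure.
import cv2
import numpy as np

MIN_LINES = 16

def count_straight_lines(path: str) -> int:
    """Roughly count straight line segments in an image."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=40, maxLineGap=10)
    return 0 if lines is None else len(lines)

def keep_for_training(path: str) -> bool:
    """Mimic the dataset filter: keep images with more than 16 detected lines."""
    return count_straight_lines(path) > MIN_LINES
```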
The central training signal for this model derives from M-LSD Lines, ensuring that structural accuracy in generated outputs is prioritized.
Batch test output from the ControlNet 1.1 MLSD model using prompt 'room' and random seed 12345, demonstrating M-LSD line structure control.
Applications and Use Cases
ControlNet SD 1.5 MLSD is particularly well suited to applications that require adherence to geometric structure, such as interior design visualizations, architectural sketches, and scenes where linearity and spatial coherence are vital. With M-LSD as a preprocessor, users can infuse a high degree of shape guidance into image generation without sacrificing the creative flexibility afforded by Stable Diffusion 1.5, producing variations of visual content constrained by real or imagined sets of straight lines. This facilitates tasks ranging from synthetic data generation for computer vision to design prototyping.
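The end-to-end sketch below, assuming the Hugging Face diffusers library and the controlnet_aux preprocessors, conditions Stable Diffusion 1.5 on an M-LSD line map and renders several seed variations of the same layout; the Hub repository IDs and file names are common conventions rather than part of this release.

```python
# End-to-end sketch with Hugging Face diffusers and controlnet_aux.
# Hub repository IDs and file names are assumptions, not part of this release.
import torch
from controlnet_aux import MLSDdetector
from diffusers import (ControlNetModel, StableDiffusionControlNetPipeline,
                       UniPCMultistepScheduler)
from diffusers.utils import load_image

# 1. Extract the straight-line map from a source photo with M-LSD.
mlsd = MLSDdetector.from_pretrained("lllyasviel/Annotators")
source = load_image("interior_photo.png")  # hypothetical input image
control_image = mlsd(source)

# 2. Attach the MLSD ControlNet to a Stable Diffusion 1.5 pipeline.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_mlsd", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

# 3. Generate a few variations constrained by the same line structure,
#    echoing the batch test above (prompt "room", fixed seeds).
for seed in (12345, 12346, 12347):
    generator = torch.Generator(device="cpu").manual_seed(seed)
    image = pipe("room", image=control_image, num_inference_steps=20,
                 generator=generator).images[0]
    image.save(f"room_{seed}.png")
```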
Comparative Models and Alternative Approaches
ControlNet 1.1 encompasses a broader suite of models, each optimized for different control signals. Other notable variants include models for Canny edge detection, depth maps, semantic segmentation, normal maps, and pose estimation, among others. Some models in the family, such as those for content shuffle or tile-based control, are marked as experimental and may require additional validation before use in production contexts.
Alongside ControlNet, the Control-LoRA approach provides an alternative control mechanism, achieving similar goals through low-rank, parameter-efficient fine-tuning. Control-LoRAs reduce model size from 4.7 GB for standard ControlNet models to approximately 738 MB for Rank 256 variants and around 377 MB for Rank 128 variants, making them attractive for environments with limited resources.
Limitations and Operational Notes
While ControlNet SD 1.5 MLSD demonstrates robust performance in line-guided generation, certain limitations and caveats are noted. Experimental models within ControlNet 1.1, such as Shuffle, Instruct Pix2Pix, and Tile, may produce inconsistent results requiring selective curation.
For users seeking integration with the Automatic1111 (A1111) toolkit, it is recommended to use the dedicated sd-webui-controlnet extension, as the primary ControlNet 1.1 repository is not structured as an A1111 extension, and multi-ControlNet support is currently A1111-exclusive.
Some specialized models, such as Anime Lineart, impose further operational constraints, including the lack of support for Guess Mode and the requirement for external checkpoint files not bundled within the core release.
Release History and Documentation
ControlNet 1.1 was released as a nightly build, with eleven production-ready models and three experimental models announced in the suite. The official documentation, hosted on the project’s GitHub, is continuously updated as the project evolves.