The simplest way to self-host ControlNet SD 1.5 Scribble. Launch a dedicated cloud GPU server running Lab Station OS to download and serve the model using any compatible app or framework.
Download model weights for local inference. Must be used with a compatible app, notebook, or codebase. May run slowly, or not work at all, depending on your system resources, particularly GPU(s) and available VRAM.
ControlNet SD 1.5 Scribble enables sketch-based control of Stable Diffusion 1.5 image generation. It processes hand-drawn or computer-generated line art with strokes up to 24 pixels wide, offering an adjustable balance between sketch adherence and text-prompt guidance. Notable for its Control-LoRA variants, which reduce model size while maintaining effectiveness.
ControlNet SD 1.5 Scribble is a specialized model within the ControlNet family designed to add conditional control to text-to-image diffusion models, specifically Stable Diffusion 1.5. The model leverages a neural network architecture to guide image generation by incorporating information from sketches or line drawings, enabling users to create detailed images based on simple drawings while maintaining precise control over composition and subject placement.
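For local experimentation, the model can be driven from the Hugging Face diffusers library. The sketch below is a minimal example, assuming the lllyasviel/control_v11p_sd15_scribble checkpoint and the runwayml/stable-diffusion-v1-5 base model from the Hub; the scribble.png path and the prompt are placeholders.

```python
# Minimal scribble-to-image sketch with Hugging Face diffusers.
# Checkpoint names assume the standard ControlNet 1.1 Hub releases.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# A scribble/line-art control image; the path is illustrative.
scribble = load_image("scribble.png")

image = pipe(
    "a cozy cabin in a snowy forest, detailed, warm light",
    image=scribble,
    num_inference_steps=20,
).images[0]
image.save("output.png")
```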
The model is part of the ControlNet 1.1 release, which maintains the same architecture as ControlNet 1.0, with plans to keep this architecture stable until at least version 1.5. The Scribble variant was trained specifically on synthesized scribbles and can accept both synthesized scribbles (from preprocessors like Scribble_HED and Scribble_PIDI) and hand-drawn scribbles as input. The training process involved aggressive random morphological transforms to handle thicker scribbles up to 24 pixels wide in a 512-pixel canvas, making it more robust than previous versions.
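To synthesize scribbles in code, the controlnet_aux package wraps the HED and PiDiNet annotators mentioned above. A brief sketch assuming that package's annotator API (flag behavior may vary slightly by version); the input and output paths are illustrative.

```python
# Synthesizing scribble control images from a photo with controlnet_aux.
from controlnet_aux import HEDdetector, PidiNetDetector
from diffusers.utils import load_image

source = load_image("photo.png")  # illustrative path

# scribble=True post-processes the edge map into thin scribble-like lines;
# safe=True thresholds away weak edges. Verify flags against your version.
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")
scribble_hed = hed(source, scribble=True)

pidi = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
scribble_pidi = pidi(source, safe=True, scribble=True)

scribble_hed.save("scribble_hed.png")
scribble_pidi.save("scribble_pidi.png")
```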
Training was conducted using 200 GPU-hours on A100 80GB GPUs, building upon the foundation of Scribble 1.0. The training data improvements addressed issues present in the 1.0 version, such as duplicated images and incorrect prompts, leading to more reasonable and robust results. More details about the underlying technology can be found in the original ControlNet research paper.
The model excels at translating both synthesized and hand-drawn sketches into fully realized images, offering unique creative workflows not available in standard Stable Diffusion implementations. Multiple variants exist, including scribble_hed, scribble_pidinet, and t2ia_sketch_pidi, each potentially producing slightly different results due to variations in training parameters.
When compared to other ControlNet models in the family, such as Canny (focused on edge detection) and OpenPose (specialized in pose estimation), the Scribble model stands out for its ability to work directly with sketches. The model performs particularly well with relatively thick scribbles, making it more forgiving for hand-drawn input compared to other edge-detection models like SoftEdge_PIDI and SoftEdge_HED.
The model can be implemented through various interfaces, with the most popular being the Automatic1111 ControlNet extension. The extension includes a "smart resampling algorithm" that maintains pixel-perfect control images regardless of resolution changes between input and output.
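The extension's exact resampling algorithm isn't reproduced here, but the underlying concern is easy to illustrate: naive bilinear rescaling anti-aliases a binary scribble into grey edges, diluting the control signal. Below is a hedged sketch of one way to avoid that, not the extension's actual implementation.

```python
# Illustrative only: rescaling a binary scribble map without introducing
# grey, anti-aliased edges that would weaken the control signal.
from PIL import Image

def resize_control_image(path: str, size: tuple[int, int]) -> Image.Image:
    control = Image.open(path).convert("L")
    # Nearest-neighbor keeps strokes hard-edged; re-threshold to stay binary.
    resized = control.resize(size, resample=Image.NEAREST)
    return resized.point(lambda v: 255 if v > 127 else 0)

control = resize_control_image("scribble.png", (768, 768))
```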
For optimal performance, users can adjust different control modes:

- Balanced: the text prompt and the control image are weighted equally.
- My prompt is more important: ControlNet's influence is reduced so the text prompt dominates.
- ControlNet is more important: the scribble takes priority over the text prompt.

These modes allow fine-tuning of the balance between text prompts and control images, as sketched below.
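Outside the Automatic1111 UI, the closest equivalents in diffusers are the controlnet_conditioning_scale and guess_mode pipeline arguments; the mapping to the extension's modes is approximate, not exact. Continuing with the pipe and scribble objects from the earlier sketch:

```python
# Shifting the prompt/control balance with the diffusers pipeline.
# These knobs only approximate the extension's control modes.

# Let the text prompt dominate: weaken the ControlNet signal.
prompt_heavy = pipe(
    "a cozy cabin in a snowy forest",
    image=scribble,
    controlnet_conditioning_scale=0.5,
).images[0]

# Let the scribble dominate: guess mode biases generation toward the
# control image even with a weak or empty prompt.
control_heavy = pipe(
    "",
    image=scribble,
    guess_mode=True,
    controlnet_conditioning_scale=1.0,
).images[0]
```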
A lighter alternative exists in the form of Control-LoRAs, which achieve similar capabilities through low-rank parameter-efficient fine-tuning. The LoRA versions come in Rank 256 (~738MB) and Rank 128 (~377MB) variants, significantly smaller than the original 4.7GB ControlNet models, making them more accessible for users with consumer-grade GPUs.
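The near-halving in size from rank 256 to rank 128 follows from the low-rank factorization itself: a rank-r adapter stores two thin factors rather than a full weight update, so parameter count grows roughly linearly with rank. A small illustration; the 1280×1280 layer shape is hypothetical, not an actual Control-LoRA dimension.

```python
# Why LoRA size scales with rank: a rank-r adapter replaces a d_out x d_in
# weight update with two factors B (d_out x r) and A (r x d_in).
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    return rank * (d_in + d_out)

# Hypothetical layer shape, shown only for the scaling intuition.
for rank in (128, 256):
    n = lora_params(1280, 1280, rank)
    print(f"rank {rank}: {n:,} params for one 1280x1280 layer")
# Doubling the rank doubles the adapter parameters, matching the roughly
# 2x gap between the ~377MB and ~738MB checkpoints.
```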