Model Report
lllyasviel / ControlNet SD 1.5 Open Pose
ControlNet SD 1.5 OpenPose is a conditioned generative model that enhances Stable Diffusion 1.5 by allowing precise control over human pose in generated images using OpenPose-extracted skeletal information. The model processes body, hand, and face keypoints to guide image synthesis, enabling pose transfer and controlled character generation. It features improved preprocessing for enhanced hand pose accuracy and refined training data to reduce artifacts and improve alignment between control signals and outputs.
ControlNet SD 1.5 OpenPose is a conditioned generative model within the ControlNet 1.1 architecture, specifically designed to enhance the controllability of Stable Diffusion 1.5 by leveraging pose information extracted from images using OpenPose. The model enables manipulation of human pose and structure in generated outputs, supporting complex body, face, and hand configurations. Released as part of the nightly updates of the ControlNet repository, ControlNet SD 1.5 OpenPose extends the family's broader aim of user-directed, semantically meaningful image synthesis.
Demonstration of ControlNet SD 1.5 OpenPose generating images that match the provided pose skeleton for the prompt 'man in suit'. Output images closely follow the arm position and posture indicated by the OpenPose skeleton, illustrating control over pose in image generation.
Architecture and Input Modes
The ControlNet 1.1 architecture maintains the core structural properties established in ControlNet 1.0, with updates to enhance compatibility and robustness across conditioned image-synthesis workflows. The OpenPose variant accepts various permutations of OpenPose outputs: body, hand, or face keypoints, or their combinations. Two principal input modes are typically used: "Openpose" (body only) and "Openpose Full" (body, hands, and face).
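The two input modes can be understood as selecting different keypoint groups from the OpenPose detection before they reach the conditioning pathway. The sketch below illustrates that distinction only; the dictionary layout and mode names are hypothetical stand-ins, not the model's actual API.

```python
# Illustrative sketch of how the two OpenPose input modes differ in the
# keypoint groups they forward to the conditioning pathway. The dict
# layout and mode names are hypothetical, chosen to mirror the
# "Openpose" vs. "Openpose Full" distinction described above.

def select_keypoint_groups(pose_result: dict, mode: str = "openpose") -> dict:
    """Return only the keypoint groups used by the given input mode."""
    if mode == "openpose":          # body-only conditioning
        groups = ("body",)
    elif mode == "openpose_full":   # body + hands + face conditioning
        groups = ("body", "hands", "face")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return {g: pose_result[g] for g in groups if g in pose_result}


# Example: a detection with all three keypoint groups present.
pose = {"body": [(0.5, 0.2)], "hands": [(0.4, 0.6)], "face": [(0.5, 0.15)]}
print(sorted(select_keypoint_groups(pose, "openpose")))       # ['body']
print(sorted(select_keypoint_groups(pose, "openpose_full")))  # ['body', 'face', 'hands']
```

In the real model, the selected keypoints are rendered into a skeleton image that serves as the conditioning input, rather than being passed as raw coordinates.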
During generation, the model uses a conditioning pathway in which OpenPose-derived pose information is processed and aligned with the Stable Diffusion U-Net backbone. A global average pooling layer is introduced between ControlNet's encoder outputs and the U-Net to facilitate effective feature fusion. As a core architectural principle, ControlNet manipulations are applied exclusively to the conditional branch of classifier-free guidance (CFG), allowing controlled yet diverse output generation.
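The CFG behavior described above can be sketched numerically: the control signal perturbs only the conditional noise prediction, while the unconditional prediction is left untouched, and the two are then combined with the standard guidance formula. This is a toy NumPy sketch; the additive control offset and random predictions are simplifications standing in for the actual U-Net computation.

```python
import numpy as np


def cfg_combine(eps_uncond, eps_cond, cfg_scale):
    """Standard classifier-free guidance combination of noise predictions."""
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)


# Toy stand-ins for the U-Net's noise predictions. In ControlNet, control
# features are injected only when computing the *conditional* prediction;
# here that injection is simplified to an additive offset.
rng = np.random.default_rng(0)
eps_uncond = rng.standard_normal(4)                  # unconditional branch: no control
control_offset = np.array([0.1, -0.2, 0.05, 0.0])
eps_cond = rng.standard_normal(4) + control_offset   # conditional branch + control

guided = cfg_combine(eps_uncond, eps_cond, cfg_scale=7.5)
assert guided.shape == eps_uncond.shape
```

Because the unconditional branch never sees the control features, raising the CFG scale strengthens the influence of the pose conditioning along with the text prompt.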
Training Data and Improved Preprocessing
The improvements in ControlNet SD 1.5 OpenPose result from an updated implementation of the OpenPose processor that offers greater accuracy, especially in capturing hand poses. This preprocessing contributes to more reliable pose-conditioned outputs. The training set, compared to previous iterations, underwent several refinements, such as removing duplicate grayscale human images, filtering out low-fidelity or artifact-laden samples, and correcting errors in prompt-to-image pairs. These dataset enhancements mitigate prior issues in which grayscale or poorly matched images would occur, yielding results more closely aligned with the intended control signals.
Output of ControlNet SD 1.5 OpenPose for the prompt 'handsome boys in the party', with an OpenPose skeleton derived from a source image. Illustrates translation of the input pose (extracted from an image featuring women) into images of boys, highlighting the model's ability to achieve pose transfer while adhering to the semantic prompt.
Applications
The primary utility of ControlNet SD 1.5 OpenPose lies in scenarios demanding fine-grained control of character pose and composition, particularly in artistic illustration, animation, and synthetic media generation. By accepting pose skeletons as explicit guides, the model lets users dictate the body structure and dynamics of generated images. This enables pose transfer, interactive design, and creative workflows where reproducible human poses and scene layouts are required, as well as applications in visual storytelling, character design for games and virtual environments, and rapid prototyping of human-centric visual content.
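In practice, such workflows are often driven through libraries like Hugging Face diffusers. The sketch below shows that common pattern under assumptions not confirmed by this document: it presumes diffusers and PyTorch are installed, and uses the Hub repository IDs commonly associated with these models. Imports are deferred inside the function so the sketch can be read (and defined) without the large optional dependencies.

```python
def build_openpose_pipeline(device: str = "cuda"):
    """Assemble a pose-conditioned SD 1.5 pipeline (illustrative sketch)."""
    # Deferred imports: diffusers and torch are assumed installed when
    # the function is actually called.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/control_v11p_sd15_openpose",  # assumed Hub repo ID
        torch_dtype=torch.float16,
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",         # assumed SD 1.5 base repo
        controlnet=controlnet,
        torch_dtype=torch.float16,
    )
    return pipe.to(device)


def generate(pipe, prompt: str, pose_image):
    """Run pose-conditioned generation from a precomputed skeleton image."""
    return pipe(prompt, image=pose_image, num_inference_steps=30).images[0]
```

A pose image would typically be produced first with an OpenPose annotator (for example, the OpenposeDetector in the controlnet_aux package) and then passed to the pipeline alongside the text prompt.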
Comparison with Other ControlNet 1.1 Models
ControlNet 1.1 encompasses a broader set of conditionally guided models, each specialized for different types of structural cues. Alongside ControlNet SD 1.5 OpenPose, models in the suite include ControlNet SD 1.5 Depth, which utilizes depth estimation; ControlNet SD 1.5 NormalBae for normal map guidance; ControlNet SD 1.5 Canny for edge-based control; and ControlNet SD 1.5 Scribble, ControlNet SD 1.5 Lineart, and ControlNet SD 1.5 Seg variants for translating line-based or segmented image inputs into photorealistic outputs. Each model in the family is trained or fine-tuned on the Stable Diffusion 1.5 backbone, ensuring consistent compatibility across diverse conditional modalities.
Experimental models, such as ControlNet SD 1.5 Shuffle for image recomposition and ControlNet SD 1.5 Instruct Pix2Pix for instruction-driven image alteration, expand the functional repertoire. Additionally, memory-efficient variants called Control-LoRAs adopt parameter-efficient fine-tuning, reducing model size while retaining flexibility.
Limitations
Despite the improvements introduced in the 1.1 release, certain limitations persist. Like other ControlNet 1.1 models, ControlNet SD 1.5 OpenPose depends on the quality of its conditioning signal: performance can degrade if the extracted pose information is noisy or ambiguous. Some models in the repository, such as ControlNet SD 1.5 Instruct Pix2Pix and ControlNet SD 1.5 Tile, remain experimental and may not yield consistent results in all scenarios. Users are advised to integrate only through supported plugins and extensions to ensure optimal performance and compatibility, as documented in the official discussion threads.
Availability and Further Resources
ControlNet SD 1.5 OpenPose, alongside related models and annotator components, is actively maintained and released through nightly updates. The architecture supports research, development, and creative production built upon Stable Diffusion, with community support available through actively maintained repositories.