Model Report
lllyasviel / ControlNet SD 1.5 Tile
ControlNet SD 1.5 Tile is a specialized image generation model developed by lllyasviel that operates within the Stable Diffusion 1.5 ecosystem by segmenting images into discrete tiles for localized detail enhancement and reconstruction. The model excels at super-resolution, detail restoration, and region-specific reinterpretation while maintaining global image structure and semantic appropriateness within each tile, making it particularly effective for upscaling degraded or low-resolution images.
ControlNet SD 1.5 Tile is a generative artificial intelligence model developed by lllyasviel as part of the ControlNet 1.1 series. It enables fine-grained control over image synthesis and manipulation within the Stable Diffusion 1.5 ecosystem. Unlike conventional diffusion models, ControlNet SD 1.5 Tile specializes in handling and regenerating local image details by dividing input images into discrete tiles, supporting tasks such as super-resolution, detail enhancement, and region-specific reinterpretation. It can selectively ignore or regenerate details based on local semantic context, which facilitates advanced image editing and upscaling while maintaining fidelity to both global structure and localized content.
Diagram explaining the standard naming conventions for ControlNet 1.1 models.
ControlNet SD 1.5 Tile operates by segmenting images into tiles and performing diffusion-based generation or correction within these localized contexts. This approach suits tasks where the overall image structure must be preserved while specific regions are improved or modified. Its primary capabilities are the ability to ignore the global prompt in favor of local tile semantics, which prevents generative intent from propagating uniformly across disjoint regions, and the capacity to regenerate image details that are blurred, corrupted, or under-resolved.
When provided with low-resolution or artifact-laden images, such as poorly upscaled images or those with limited contextual information, the model reconstructs new, high-fidelity details in each tile. For example, when given a 64×64 image of a dog, ControlNet SD 1.5 Tile generates multiple high-resolution reinterpretations of the input, preserving the basic structure while inventing refined local content, as demonstrated with the prompt "dog on grassland" and a denoising strength of 1.0.
Demonstration of ControlNet Tile performing 8× super-resolution on a low-resolution dog image, prompted with 'dog on grassland'.
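This workflow can be reproduced with the Hugging Face diffusers library. The sketch below is a minimal illustration, not the authors' reference implementation: the checkpoint IDs are the published ones, but the input file name, working resolution, and step count are assumptions chosen to match the 'dog on grassland' example above.

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

# Published ControlNet 1.1 Tile checkpoint plus a Stable Diffusion 1.5 base model.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1e_sd15_tile", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Enlarge the 64x64 source to the working resolution; the Tile model then
# regenerates plausible high-frequency detail on top of the blurry enlargement.
low_res = Image.open("dog_64.png").convert("RGB")  # hypothetical input file
condition = low_res.resize((512, 512), Image.BICUBIC)

result = pipe(
    prompt="dog on grassland",
    image=condition,          # img2img initialization
    control_image=condition,  # tile conditioning
    strength=1.0,             # denoising strength 1.0, as in the example above
    num_inference_steps=30,   # illustrative step count
).images[0]
result.save("dog_upscaled.png")
```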
This ability extends to correcting corrupted images, such as those degraded by previous generative passes or image enhancement tools. The model can reconstruct plausible and photorealistic outputs even when the input lacks sufficient context for conventional super-resolution methods.
Batch test output showing how details in a corrupted dog image are fixed using ControlNet Tile with the prompt 'dog on grassland', denoising strength 1.0, and a random seed.
An additional feature is the model's localized prompt sensitivity, which ensures that the content generated within each tile is semantically appropriate. For instance, when a prompt refers to a "handsome man" but a tile contains the texture of palm leaves, the model refrains from placing face details in those regions, instead reproducing plausible leaf structures.
Generated outputs show that despite the global prompt 'a handsome man', the model preserves local semantics within tiles, generating palm fronds where expected.
The underlying architecture of ControlNet SD 1.5 Tile is based on the established structure of the original ControlNet 1.0 models, maintaining design continuity for consistent inference behavior across versions. Architectural updates in ControlNet 1.1 primarily address robustness and output quality, while preserving compatibility with the Stable Diffusion U-Net backbone.
Special attention is given to classifier-free guidance and local conditioning. Configuration details, such as the placement of global average pooling layers (e.g., for the Shuffle variant), are controlled through YAML parameters that determine how encoder outputs interact with the U-Net. For the Tile model, these settings govern how diffusion influences each independently processed tile, ensuring that only the conditional branch, not the unconditional one, receives ControlNet input.
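To make that guidance behavior concrete, the following diffusers-style sketch shows a single classifier-free guidance step in which only the conditional branch receives the ControlNet residuals. The function name and argument layout are hypothetical; the residual-injection keywords follow diffusers conventions rather than the model's own codebase.

```python
import torch

def cfg_step(unet, controlnet, x_t, t, cond_emb, uncond_emb,
             control_image, guidance_scale=7.5):
    # Conditional branch: ControlNet encoder residuals are injected
    # into the U-Net at the down and mid blocks.
    down_res, mid_res = controlnet(
        x_t, t, encoder_hidden_states=cond_emb,
        controlnet_cond=control_image, return_dict=False,
    )
    eps_cond = unet(
        x_t, t, encoder_hidden_states=cond_emb,
        down_block_additional_residuals=down_res,
        mid_block_additional_residual=mid_res,
    ).sample

    # Unconditional branch: plain U-Net forward pass, no ControlNet input.
    eps_uncond = unet(x_t, t, encoder_hidden_states=uncond_emb).sample

    # Standard classifier-free guidance combination.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```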
Example showcasing detail refinement and replacement. Input and five generated variants reflect the prompt 'Silver Armor' with denoising strength 1.0.
Although the datasets and augmentation methods used for ControlNet SD 1.5 Tile are not exhaustively documented publicly, the ControlNet 1.1 models incorporate training-strategy improvements over previous iterations. According to the official release notes, systemic issues present in ControlNet 1.0, such as duplicated or low-quality samples, grayscale artifacts, and prompt-image mismatches, were mitigated in the 1.1 update. The datasets cover semantically and photographically diverse samples, with training augmented through techniques such as random flips, contributing to improved generalization, especially for tasks involving region-specific synthesis and correction.
Applications and Output Quality
ControlNet SD 1.5 Tile is suitable for a variety of image processing tasks. Its primary application is in detail restoration and enhancement, such as upscaling small or degraded images where global super-resolution models like Real-ESRGAN may falter. The tile-based approach supports both broad scenic reconstructions and fine local corrections, with examples ranging from recovering photorealistic portraits from low-quality thumbnails to interpreting intricate environments.
The model can produce high-fidelity, high-resolution outputs at scale, maintaining both semantic integrity and local detail quality across complex subjects.
Output of tiled image upscaling: a high-resolution, photorealistic portrait of an elderly woman in a garden.
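Large upscales of this kind are typically driven by external tiling extensions (for example, the community Ultimate SD Upscale script) rather than by the model alone. The sketch below only illustrates the split-process-paste idea, reusing a pipeline like the one shown earlier; tile size, overlap, and denoising strength are assumptions, and production implementations additionally feather the overlapping seams.

```python
from PIL import Image

TILE, OVERLAP = 512, 64  # illustrative tile size and seam overlap

def upscale_tiled(pipe, image, prompt, scale=2):
    # Enlarge first, then refine each tile in place; naive paste, no feathering.
    big = image.resize((image.width * scale, image.height * scale), Image.BICUBIC)
    out = big.copy()
    for top in range(0, big.height, TILE - OVERLAP):
        for left in range(0, big.width, TILE - OVERLAP):
            box = (left, top,
                   min(left + TILE, big.width), min(top + TILE, big.height))
            tile = big.crop(box).resize((TILE, TILE), Image.BICUBIC)
            refined = pipe(prompt=prompt, image=tile, control_image=tile,
                           strength=0.4, num_inference_steps=20).images[0]
            # Resize back in case the tile was clipped at the image border.
            out.paste(refined.resize((box[2] - box[0], box[3] - box[1])), box)
    return out
```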
Beyond human-centric outputs, the model demonstrates proficiency in interpreting and generating complex architectural or environmental scenes and synthesizing plausible reconstructions in scenarios with ambiguous or damaged input.
High-resolution tile-based reconstruction of a destroyed airplane cabin, demonstrating the model's capacity for realistic environmental details.
The finalized version of the model, named control_v11f1e_sd15_tile, was publicly released on April 25, 2023. The naming reflects internal release staging: "f1" indicates a first bug fix and "e" denotes the model's experimental status. Earlier, incomplete variants have been discontinued. While the model is robust across a wide variety of image manipulation tasks, certain limitations are noted:
The model is not expressly a super-resolution system, but rather one focused on regenerating and refining details in context.
Some features, such as tiled upscaling, may not be directly supported in all demonstration interfaces and may require integration with specific software extensions.
As an "experimental" release, some edge cases may remain suboptimal.
Comparisons and Related Architectures
ControlNet SD 1.5 Tile can be contrasted with several related technologies. While Stable Diffusion 1.5's image-to-image (I2I) mode supports high-level creative reinterpretation, the Tile model emphasizes structure preservation across tiles even at maximal denoising strength. Compared with Real-ESRGAN, which specializes in super-resolution, ControlNet Tile's tile-wise generative approach allows it to produce plausible reconstructions even where source context is minimal.
Another development, Control-LoRA, integrates Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning technique, to reduce the computational footprint of ControlNet models. However, this technique is distinct from, and not incorporated into, the ControlNet 1.1 Tile model lineage.