Stable Diffusion WebUI has a lot of features and can seem daunting to learn at first. I want to share my process of learning how to use it. My sources are the AUTOMATIC1111 GitHub repo wiki, the stable-diffusion-art blog by Andrew (so detailed for beginners, m(_ _)m), and other random websites.
First Step: run AUTOMATIC1111
```bash
cd ~/stable-diffusion-webui; ./webui.sh
```
Text-to-image tab:

txt2img: turn text prompt into an image
Stable Diffusion checkpoint: select the model you want to use
Prompt: text description of what you want to see in the image
Negative Prompt: write what you don’t want to see in the image (you can use a universal negative prompt like the one below):
- ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, bad anatomy, watermark, signature, cut off, low contrast, underexposed, overexposed, bad art, beginner, amateur, distorted face
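If you'd rather script this tab than click through it, the WebUI also exposes a REST API when you launch it with ./webui.sh --api. Here's a minimal sketch that sends a prompt and negative prompt to a local install (the address, prompt text, and payload values are my assumptions for a default setup):

```python
import base64
import requests

# Assumes the WebUI was started with `./webui.sh --api` and is
# listening on the default local address.
url = "http://127.0.0.1:7860/sdapi/v1/txt2img"
payload = {
    "prompt": "a cozy cabin in a snowy forest, golden hour",
    "negative_prompt": "ugly, tiling, poorly drawn hands, watermark",
    "steps": 20,
    "width": 512,
    "height": 512,
}
r = requests.post(url, json=payload, timeout=300)
r.raise_for_status()
# The API returns each generated image as a base64-encoded string.
with open("result.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```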
Quick side note: how do we import new models?
Import New Models
- Download a compatible model file with a .ckpt or .safetensors file extension (if both are available, it's safer to go with the .safetensors file, since it can't hide arbitrary code the way pickled .ckpt files can).
- Put the file in the stable-diffusion-webui\models\Stable-diffusion directory.
- Refresh the models list by pressing the refresh button at the top left corner.
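If you like keeping this step scriptable, here's a tiny sketch (the download location and model filename are made up for the example):

```python
import shutil
from pathlib import Path

# Hypothetical paths: adjust to your download folder and WebUI install.
downloaded = Path.home() / "Downloads" / "myModel.safetensors"
models_dir = Path.home() / "stable-diffusion-webui" / "models" / "Stable-diffusion"
shutil.move(str(downloaded), str(models_dir / downloaded.name))
```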
Where to Find New Models
The usual places are Civitai and Hugging Face, which host thousands of community checkpoints.
Quick side note: learn more about LoRA models to invoke your favorite styles
VAE files
You can usually find .vae files on the same download page as the model.
VAE (variational autoencoder) files are used for post-processing, after image generation.
VAE: a system that compresses images into smaller, more manageable pieces and reconstructs them back into their original form. It encodes the optional input image before the diffusion process begins and decodes the generated image afterwards.
- Encoding: takes the input image and compresses it into a small representation called the latent space (similar to turning a detailed picture into a rough sketch).
- Latent space: a simplified version of the original image that captures its essential features; smaller and easier to work with.
- Decoding: the VAE takes this rough sketch and turns it back into a detailed picture, similar to the original image.
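To make the encode/decode round trip concrete, here's a small sketch using the Hugging Face diffusers library (my choice for the demo; the WebUI does the equivalent internally, and the input filename is hypothetical):

```python
import numpy as np
import torch
from diffusers import AutoencoderKL
from diffusers.utils import load_image

# A standalone SD-compatible VAE published by Stability AI.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

image = load_image("photo.png").resize((512, 512))  # hypothetical input file
x = torch.from_numpy(np.array(image)).float() / 127.5 - 1.0  # pixels -> [-1, 1]
x = x.permute(2, 0, 1).unsqueeze(0)  # HWC -> NCHW

with torch.no_grad():
    # Encoding: 3x512x512 pixels squeeze down to a 4x64x64 latent "rough sketch".
    latents = vae.encode(x).latent_dist.sample()
    # Decoding: the latent sketch is expanded back into a detailed picture.
    recon = vae.decode(latents).sample
print(tuple(x.shape), "->", tuple(latents.shape), "->", tuple(recon.shape))
```

Running it prints (1, 3, 512, 512) -> (1, 4, 64, 64) -> (1, 3, 512, 512): a 48x compression into the latent space and back.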
Latent diffusion: the technique used in image generation to create new images from information in the latent space.
- Start: random noise in the latent space (think of it as a very blurry version of an image)
- Refine image: gradually refines the latents step by step, removing noise and adding detail until they describe a clear image
- Guiding: throughout this process, the text prompt guides the refinement toward the desired qualities; at the end, the VAE decodes the result so the final image looks realistic
VAEs and latent diffusion work together to create detailed images! (work besties)
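Here's what that working relationship looks like in code, again sketched with diffusers rather than the WebUI's own internals (the model choice, prompt, and step count are all assumptions, and classifier-free guidance is omitted to keep it short):

```python
import torch
from diffusers import StableDiffusionPipeline

# Borrow the parts (UNet, VAE, scheduler, text encoder) from a full pipeline.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
unet, vae, scheduler = pipe.unet, pipe.vae, pipe.scheduler

# Turn the prompt into the embeddings that condition each denoising step.
tokens = pipe.tokenizer("a watercolor fox", padding="max_length",
                        max_length=pipe.tokenizer.model_max_length,
                        return_tensors="pt")
text_emb = pipe.text_encoder(tokens.input_ids)[0]

scheduler.set_timesteps(25)
# Start: pure noise in the 4x64x64 latent space.
latents = torch.randn(1, 4, 64, 64) * scheduler.init_noise_sigma

with torch.no_grad():
    # Refine: each step predicts some of the remaining noise and removes it.
    for t in scheduler.timesteps:
        inp = scheduler.scale_model_input(latents, t)
        noise_pred = unet(inp, t, encoder_hidden_states=text_emb).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    # Decode: the VAE expands the finished latents into a full-size image.
    image = vae.decode(latents / vae.config.scaling_factor).sample
```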
If you don't use a dedicated VAE file for the model, it will fall back to the default SD VAE.
Default SD VAE downside: images might look discolored, washed out, or highly desaturated.
Solution: place the .vae file designated for the model you’re using inside your models folder, right beside the model file. The model file and the .vae file need to share the same base name (see the sketch after this list).
- You can also use different .vae files dedicated to different models; this only affects the post-processing.
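A quick way to check that pairing (the directory layout is the standard one; the .vae.pt extension and filenames here are illustrative):

```python
from pathlib import Path

models = Path.home() / "stable-diffusion-webui" / "models" / "Stable-diffusion"
# For each checkpoint, check whether a same-named .vae file sits beside it,
# e.g. myModel.safetensors paired with myModel.vae.pt.
for ckpt in models.glob("*.safetensors"):
    vae = ckpt.parent / (ckpt.stem + ".vae.pt")
    print(ckpt.name, "->", vae.name if vae.exists() else "(falls back to default VAE)")
```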
VAE files may also be already merged into a model (no need to do anything!)