Latent Diffusion Backbone
Unlike pixel-space diffusion, Midjourney utilizes Latent Diffusion Models (LDM). By compressing images into a lower-dimensional latent space via a VAE (Variational Autoencoder), the model performs diffusion on the compressed representation. This significantly reduces compute requirements while allowing for complex semantic manipulation.