FluxPipeline supports three image generation pipelines: text-to-image (txt2img), image-to-image (img2img), and inpainting (inpaint).

Important Note

Add a volume mount to the deployment. All generated images are written to the /data/outputs directory inside the container, so without a mounted volume they are lost when the container is removed.


Model Optimization Configuration

  • The default base configuration shown in the UI works well for most use cases; no additional tuning is required.
  • ControlNet models will not be loaded by default.

Optimization Settings

To enable attention caching, add the following under the optimization config:

"optimisations": {
    "attention_caching": {
      "type": "auto",
      "enabled": true,
      "extra_params": {
        "threshold": 0.1
      }
    }
}

  • Higher threshold values give larger speed gains but may reduce image quality.
  • We recommend a threshold of 0.1, which can provide up to a 40% speed improvement during inference while maintaining reasonable quality.
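
When comparing several thresholds for speed and quality, it can help to generate the fragment above programmatically. The following is a minimal sketch: the helper name `attention_caching_config` is hypothetical, and because the accepted range of `threshold` is not stated here, the check only rejects negative values as an assumption.

```python
import json


def attention_caching_config(threshold: float = 0.1, enabled: bool = True) -> dict:
    """Build the "optimisations" fragment shown above (hypothetical helper)."""
    # Assumption: a negative threshold is meaningless. The real valid range
    # is not documented in this section.
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    return {
        "optimisations": {
            "attention_caching": {
                "type": "auto",
                "enabled": enabled,
                "extra_params": {"threshold": threshold},
            }
        }
    }


# Print the recommended configuration as JSON.
print(json.dumps(attention_caching_config(0.1), indent=2))
```

The printed JSON can be pasted into the optimization config, without the outer braces if the config already contains other keys.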

Supported Pipelines

  1. txt2img - Generates an image from text input.
  2. img2img - Generates an image based on an input image and a given prompt.
  3. inpaint - Modifies specific regions of an image based on a mask and a given prompt.
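
A request selects its pipeline through the `model_type` field. As a rough sketch, a client could check a payload before sending it. The required-field table below is inferred from the example requests in this document and is an assumption, not a documented schema; `missing_fields` is a hypothetical helper.

```python
# Fields each pipeline appears to need, inferred from the example requests
# in this document (an assumption, not an official schema).
REQUIRED_FIELDS = {
    "txt2img": {"prompt"},
    "img2img": {"prompt", "image"},
    "inpaint": {"prompt", "image", "mask_image"},
}


def missing_fields(payload: dict) -> set:
    """Return the required fields that are absent from the payload."""
    model_type = payload.get("model_type")
    if model_type not in REQUIRED_FIELDS:
        raise ValueError(f"unsupported model_type: {model_type!r}")
    return REQUIRED_FIELDS[model_type] - set(payload)


# An inpaint request without a mask is reported as incomplete.
print(missing_fields({"model_type": "inpaint", "prompt": "a cat", "image": "https://example.com/cat.png"}))
```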

Example Requests

txt2img

{
    "prompt": "A girl in city, 25 years old, cool, futuristic <lora:multimodalart/plstps-local-feature:0.3> <lora:XLabs-AI/flux-RealismLora:0.3>",
    "negative_prompt": "canvas frame, (high contrast:1.2), (over saturated:1.2), (glossy:1.1), cartoon, 3d, ((disfigured)), ((bad art)), ((b&w)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, 3d render",
    "height": 1024,
    "width": 1024,
    "num_images_per_prompt": 4,
    "num_inference_steps": 20,
    "seed": 2064977189,
    "guidance_scale": 4.5,
    "strength": 0.8,
    "scheduler": "EULER-A",
    "model_type": "txt2img"
}
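
The prompt above embeds tags that appear to follow the shape `<lora:NAME:WEIGHT>`. How the service parses them is not described here, so the sketch below only illustrates that assumed shape on the client side, for example to log which LoRAs a prompt references. The function name `split_lora_tags` is hypothetical.

```python
import re

# Assumed tag shape: <lora:NAME:WEIGHT>, where NAME contains no ':' or '>'
# and WEIGHT is a plain decimal number.
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def split_lora_tags(prompt: str):
    """Return (prompt text without tags, [(name, weight), ...])."""
    loras = [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]
    text = LORA_TAG.sub("", prompt)
    text = " ".join(text.split())  # collapse whitespace left behind by removed tags
    return text, loras


text, loras = split_lora_tags(
    "A girl in city, 25 years old, cool, futuristic "
    "<lora:multimodalart/plstps-local-feature:0.3> <lora:XLabs-AI/flux-RealismLora:0.3>"
)
print(text)
print(loras)
```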

img2img

{
    "prompt": "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k <lora:multimodalart/plstps-local-feature:0.3> <lora:XLabs-AI/flux-RealismLora:0.3>",
    "negative_prompt": "canvas frame, (high contrast:1.2), (over saturated:1.2), (glossy:1.1), cartoon, 3d, ((disfigured)), ((bad art)), ((b&w)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, 3d render",
    "height": 1024,
    "width": 1024,
    "num_images_per_prompt": 4,
    "num_inference_steps": 20,
    "image": "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg",
    "seed": 89395930,
    "guidance_scale": 7.0,
    "strength": 0.5,
    "scheduler": "EULER-A",
    "model_type": "img2img"
}

inpaint

{
    "prompt": "Face of a yellow cat, high resolution, sitting on a park bench <lora:multimodalart/plstps-local-feature:0.3> <lora:XLabs-AI/flux-RealismLora:0.3>",
    "height": 1024,
    "width": 1024,
    "num_images_per_prompt": 4,
    "num_inference_steps": 20,
    "image": "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png",
    "mask_image": "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png",
    "seed": 89395930,
    "guidance_scale": 7.0,
    "strength": 0.5,
    "scheduler": "EULER-A",
    "clip_skip": 0,
    "use_foocus": true,
    "model_type": "inpaint"
}
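
Any of the payloads above can be submitted from a script. The sketch below assumes the deployment accepts the payload as a JSON body in an HTTP POST, which this document does not state; the endpoint URL depends on the deployment and must be supplied by the caller. `build_request` and `submit` are hypothetical helper names, and only the Python standard library is used.

```python
import json
import urllib.request


def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Wrap a payload in a JSON POST request (assumed transport)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(endpoint: str, payload: dict, timeout: float = 300.0) -> dict:
    """Send the request and decode the JSON reply."""
    request = build_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call would look like `submit(endpoint, payload)`, where `endpoint` is the URL of your own deployment and `payload` is one of the example dictionaries. The timeout is set generously because generating several images can take a while.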

Example Response

{
    "response_id": "afbc439946a44d98bb8062c8b36ec16d",
    "inference_time_taken": 6.336474418640137,
    "lora_time": 1.8092551231384277,
    "total_time_taken": 7.176232099533081,
    "request_id": "6b060ab415d84117b7b6403d622414f5",
    "error": null
}
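
The response carries identifiers and timings rather than image data; the images themselves are written to /data/outputs. A minimal sketch of handling it on the client follows. It assumes the timing fields are in seconds and that a non-null `error` means the request failed, neither of which is confirmed here; `summarize_response` is a hypothetical helper.

```python
def summarize_response(resp: dict) -> dict:
    """Raise on failure, otherwise return rounded timings for logging."""
    # Assumption: a non-null "error" field signals a failed request.
    if resp.get("error") is not None:
        raise RuntimeError(f"request {resp.get('request_id')} failed: {resp['error']}")
    total = resp["total_time_taken"]
    inference = resp["inference_time_taken"]
    return {
        "response_id": resp["response_id"],
        "total_s": round(total, 3),
        "inference_s": round(inference, 3),
        # Time spent outside inference; what it covers is not documented here.
        "other_s": round(total - inference, 3),
    }


example = {
    "response_id": "afbc439946a44d98bb8062c8b36ec16d",
    "inference_time_taken": 6.336474418640137,
    "lora_time": 1.8092551231384277,
    "total_time_taken": 7.176232099533081,
    "request_id": "6b060ab415d84117b7b6403d622414f5",
    "error": None,
}
print(summarize_response(example))
```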

Key Notes

  • Mount a volume in the deployment; generated images are written to /data/outputs inside the container.
  • ControlNet models are not loaded by default.
  • The default configuration in the UI should work optimally for most use cases.
  • Three pipelines are supported: txt2img, img2img, and inpaint.