Faceswap: IMG2Filter

Image-to-Filter allows you to transform any image — whether it’s created using tools like ChatGPT, our AI image generators, or even photos provided by your clients — into a fully usable AI filter for your next event. It is ideal for business clients!

In this guide, you’ll learn:

Group 1410082156.png

1. Transformation Logic: Img2Filter (via Inpainting, using Faceswap Models)

Img2Filter transforms selected regions of an image while preserving the unmasked areas. It uses:

  • an input image,
  • a mask to define the area of change,
  • and the user’s face image to personalize the transformation.

Mask Input

Masks define the area of change in the input image.

There are three main types of mask annotation (see the overview below for full examples):

  • Full-body: masks drawn over the full person (face, full body, any visible skin area, clothes, and even shoes)
  • Head & skin only: masks drawn only over the head and visible skin areas (face, arms, etc.)
  • Head only: masks drawn only over visible head areas (Head

Tips for masking

  • Masks should be rough, general shapes
  • Avoid tracing the subject exactly
  • Keep it loose and soft-edged so filters can adapt to different body types and poses
  • The goal is to define a general region, not a precise cutout
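
To make "loose and soft-edged" concrete, here is a minimal sketch in plain Python. It is an illustration only: the function name `soft_ellipse_mask`, the `feather` parameter, and the grid-of-floats representation are assumptions made for this example, not the format the platform uses for masks.

```python
import math

def soft_ellipse_mask(width, height, cx, cy, rx, ry, feather=0.25):
    """Return a height x width grid of floats in [0, 1].

    1.0 = fully inside the masked region, 0.0 = untouched,
    values in between form the soft edge. (Hypothetical helper.)
    """
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Normalised elliptical distance: 1.0 lies exactly on the outline.
            d = math.hypot((x - cx) / rx, (y - cy) / ry)
            # Fade linearly from 1 to 0 across a band around the outline.
            value = (1.0 + feather - d) / (2 * feather)
            row.append(min(1.0, max(0.0, value)))
        mask.append(row)
    return mask

# A rough head-and-shoulders region in a 64 x 64 image.
mask = soft_ellipse_mask(64, 64, cx=32, cy=32, rx=20, ry=28)
```

The shape is a generous ellipse rather than a traced outline, and the feathered band means the region fades out instead of ending at a hard cut.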

Category
  • Example 1: Full-body mask
  • Example 2: Skin & head mask only
  • Example 3: Head mask only

Masking reasoning
  • Example 1: Full-body mask, since no details on the body need to stay identical (e.g., no jersey or fixed outfit elements).
  • Example 2: The body fits well in the scenery and appears unisex, so no full-body mask is required (= head + visible skin areas).
  • Example 3: The armor must stay identical, therefore only visible skin areas are masked.
If this filter should support females/children, additional base images are required.

Prompt V5
  • Example 1: photo of confident person in action movie style, wearing a dirty grey tank top and jungle attire, angry expression, looking at the camera, high quality
  • Example 2: photo of a confident person, confident expression, looking at the camera, high quality
  • Example 3: photo of a confident person wearing a astronaut helmet, focused expression, looking at the camera, high quality, 16k,

Prompt V6
  • Example 1: replace the exact same person, adapt bodytype to person
  • Example 2: replace the exact same person
  • Example 3: replace the face

Benefit
  • Example 1: All users have different clothes and slightly different poses.
  • Example 2: The armor stays exactly the same
  • Example 3: Due to the perspective no other

Input image
  • Example 1: base (33231231)-20260218-103137.png
  • Example 2: base (31)-20260218-103011.png
  • Example 3: base (312)-20260218-103115.png

Example for mask
  • Example 1: image - 2025-11-06T151959.394-20260218-103137.png
  • Example 2: image - 2025-11-06T151959.3943123-20260218-103147.png
  • Example 3: image - 2025-11-06T151959.39423-20260218-103115.png

Result V5
  • Example 1: 123123-20260218-103137.png
  • Example 2: image - 2025-11-06T151959.394-1-20260218-103011.png
  • Example 3: 123123321-20260218-103115.png

Result V6
  • Example 1: result (82)-20260225-123444.png
  • Example 2: result (79)-20260225-123136.png
  • Example 3: image-20260225-124520.png

When to Use V5 vs V6

Both models use the same masking and workflow — but they perform best in different scenarios.

Faceswap V6 (Flagship Model)

V6 is our highest-quality model and delivers the most realistic and consistent results when the conditions are right.

Use V6 when:

  • The person is clearly visible
  • The face occupies a good portion of the image
  • The pose is clean and easy to read
  • You want the highest realism and detail
  • You work with cinematic, fantasy, or stylized concepts
  • The composition focuses on one main person

V6 performs best when the subject is medium-to-close in frame and visually clear.

Current limitations
V6 may not perform as well when:

  • The person is very small in the image
  • Faces have low pixel detail
  • There are dynamic sports or action poses
  • Multiple people appear in one image
  • Body or skin areas are very small or unclear

We are continuously improving these scenarios.

Faceswap V5 (Reliable for Complex Scenes)

V5 is slightly less detailed than V6 but more robust in challenging situations.

Use V5 when:

  • The person is small in the image
  • You work with sports or action shots
  • Group or team images are used
  • Poses are dynamic or complex
  • Face visibility is limited
  • The scene is crowded or busy

V5 handles these edge cases more consistently.

Quick rule of thumb

Clear, visible subject → Use V6
Small, dynamic, or group scenes → Use V5
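
The rule of thumb can be written down as a small decision helper. This is only a sketch: the `Scene` fields and the `choose_model` function are hypothetical names invented for this example, and the checks simply mirror the "Use V5 when" list above.

```python
from dataclasses import dataclass

@dataclass
class Scene:
    """Hypothetical description of a base image (all fields are assumptions)."""
    subject_is_small: bool = False
    dynamic_pose: bool = False       # sports or action shots
    multiple_people: bool = False    # group or team images
    face_visibility_limited: bool = False
    crowded: bool = False

def choose_model(scene: Scene) -> str:
    """Return "V5" for challenging scenes, otherwise "V6"."""
    challenging = any([
        scene.subject_is_small,
        scene.dynamic_pose,
        scene.multiple_people,
        scene.face_visibility_limited,
        scene.crowded,
    ])
    return "V5" if challenging else "V6"

portrait = Scene()                      # clear, visible single subject
team_photo = Scene(multiple_people=True)
print(choose_model(portrait), choose_model(team_photo))
```

Treat the result as a starting point; for borderline images it can still be worth comparing both models.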

Prompting for V5

V5 works best with flexible, descriptive prompts that match the base image.

Do

  • Be specific but brief
  • Focus on visual details
  • Match the prompt to the base image
  • Use keywords (lighting, style, mood)
  • Use negative prompts to block unwanted elements

Example
photo of confident person in action movie style wearing jungle outfit, angry expression, looking at the camera, cinematic lighting, high quality

Negative prompt example
helmet, hat, sunglasses, weapon

Avoid

  • Indirect phrasing
      ❌ astronaut without helmet
      ✅ astronaut + negative: helmet
  • Too many ideas in one prompt
  • Changing camera angle drastically from base image

Always align your prompt with what is already visible in the base image.
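
The "indirect phrasing" rule can be illustrated with a small text helper that moves "without X" phrases out of the prompt and into the negative prompt. This is a sketch under assumptions: `split_negation` is a hypothetical function written for this guide, it only handles the simple "without ..." pattern, and it is not part of the product.

```python
import re

# Matches "without X", optionally preceded by whitespace and followed by an article.
_WITHOUT = re.compile(r"\s*\bwithout\s+(?:an?\s+|the\s+)?([^,]+)", re.IGNORECASE)

def split_negation(prompt: str) -> tuple[str, str]:
    """Return (positive_prompt, negative_prompt) for a V5 prompt.

    Every "without X" phrase is removed from the prompt and X is
    collected into the negative prompt instead.
    """
    negatives = [item.strip() for item in _WITHOUT.findall(prompt)]
    positive = _WITHOUT.sub("", prompt).strip(" ,")
    return positive, ", ".join(negatives)

print(split_negation("astronaut without helmet"))
print(split_negation("astronaut without a helmet, cinematic lighting"))
```

The first call yields the prompt `astronaut` with the negative prompt `helmet`, matching the ✅ example above.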

Prompting for Img2Filter with V6

V6 prompting works differently from V5.
The base image already defines the scene, pose, lighting, and composition — your prompt should only describe what gets replaced.

Keep prompts short and direct.

Core principle

Tell the model who is replaced and, optionally, what they should wear.
The pose and perspective always come from the base image.

Use simple replacement instructions:

  • replace the exact same person
  • replace the exact same person, wearing [outfit]
  • replace the face
  • replace the exact same person, adapt bodytype to person

Outfit descriptions are allowed, but keep them concise.

Prompting Do’s

Keep prompts short and functional

  • Focus only on the replacement
  • Add outfit only if needed
  • Let the base image define everything else

Match prompt to mask type

Full body mask

  • replace the exact same person, adapt bodytype to person
  • Optional: add outfit if clothing should change

Head + skin mask

  • replace the exact same person
  • Optional: add small outfit adjustments

Head only

  • replace the face
  • Keeps outfit/armor identical
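
The mapping from mask type to V6 prompt is simple enough to sketch as a lookup. The names `V6_BASE_PROMPTS` and `build_v6_prompt` and the mask-type keys are assumptions made for this example; only the prompt strings themselves come from this guide.

```python
# Base replacement instruction per mask type (strings taken from this guide).
V6_BASE_PROMPTS = {
    "full_body": "replace the exact same person, adapt bodytype to person",
    "head_and_skin": "replace the exact same person",
    "head_only": "replace the face",
}

def build_v6_prompt(mask_type: str, outfit: str = "") -> str:
    """Build a short V6 prompt for the given mask type (hypothetical helper)."""
    base = V6_BASE_PROMPTS[mask_type]
    # A head-only mask keeps the outfit identical, so an outfit is ignored there.
    if outfit and mask_type != "head_only":
        return f"{base}, wearing {outfit}"
    return base

print(build_v6_prompt("full_body", "football jersey"))
print(build_v6_prompt("head_only"))
```

Keeping the outfit as the only free-text part makes it harder to drift into describing the pose or the environment.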

Good concise examples

  • replace the exact same person wearing a luxury suit
  • replace the exact same person, adapt bodytype to person, wearing football jersey
  • replace the face

Prompting Don’ts

  • Don’t describe the pose
  • Don’t describe the environment
  • Don’t rewrite the whole scene
  • Don’t use long cinematic prompts
  • Don’t change camera angle or perspective

The base image already controls these elements.

How to choose good Base Images

The subject’s face must be fully visible. Avoid images where the face is covered or partially blocked by hands, masks, visors, or any other objects.

Group 1410082277-20260218-104931.png

Provide Sharp Images With Simple, Readable Poses

Choose images that are in focus and easy to interpret. Blurry photos or complex poses can significantly reduce output quality.

Group 1410082279-20260218-104931.png

Prefer Front-Facing Photos

The face should point forward, with minimal head rotation and minimal shadows. Side profiles and heavily angled shots are not recommended because they tend to produce distorted likenesses of users.

Group 1410082282-20260218-104931.png

Avoid Interaction With Brand Logos

Images where the subject touches or overlaps with brand logos may introduce unwanted distortions during processing. Ensure there is no physical interaction with any branded elements.

Group 1410082280-20260218-104931.png

Use Subjects With Neutral, Non-Flowing Hair

For best compatibility across different face swaps, avoid subjects with brightly colored, dynamic, or flowing hair that may complicate the inpainting process.

Group 1410082278-20260218-104931.png

Limit Close Contact With Other People

Images where faces, arms, or bodies are very close together — especially around the areas to be swapped — can lead to unintended facial distortions. Use photos with clear separation between individuals.

Group 1410082281-20260218-104931.png

Note

When working with low-quality or unsuitable base images, small adjustments, such as removing distracting or unwanted elements, may be necessary to achieve acceptable results.
In some cases, however, it is significantly more efficient to recreate the entire image in a more suitable, face-swap-friendly way.

How to achieve the highest-quality, most consistent results

Besides a clear prompt and the input photo, the output scene in particular has a big impact on output quality.

The most important factor for high-quality deepfake results is how many pixels the face occupies in the image.

  • Larger face area → More facial detail captured
  • More detail → Better identity reconstruction
  • Better reconstruction → Higher realism

Close-up portraits consistently produce the strongest results because the model can focus its precision on facial features instead of distributing resources across a large, complex scene.

Scenario | Image Type | Face Pixel Density | Expected Quality | Recommendation
✅ Best Case | Close-up / Portrait | High (large visible face area) | Highest quality results | Strongly recommended
⚠️ Medium | Half body | Moderate | Good quality results | Acceptable
❌ Worst Case | Full body (small head area) | Low (small visible face area) | Reduced quality / less detail | Avoid if possible
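
As a rough way to check a base image before using it, you can estimate how much of the frame the face covers. The sketch below is illustrative only: the function names are hypothetical, and the 10% and 2% thresholds are placeholder assumptions chosen for the example, not values documented for V5 or V6.

```python
def face_area_fraction(face_box, image_size):
    """Share of the image covered by the face bounding box.

    face_box is (left, top, right, bottom) in pixels,
    image_size is (width, height) in pixels.
    """
    left, top, right, bottom = face_box
    width, height = image_size
    face_pixels = (right - left) * (bottom - top)
    return face_pixels / (width * height)

def expected_quality(fraction, high=0.10, low=0.02):
    """Map a face-area share to a tier; thresholds are assumed placeholders."""
    if fraction >= high:
        return "best"
    if fraction >= low:
        return "medium"
    return "worst"

# Close-up portrait: face box of 400 x 500 px in a 1000 x 1000 px image.
portrait = face_area_fraction((100, 100, 500, 600), (1000, 1000))
# Full-body shot: face box of 100 x 120 px in a 1000 x 1500 px image.
full_body = face_area_fraction((450, 100, 550, 220), (1000, 1500))
print(expected_quality(portrait), expected_quality(full_body))
```

Adjust the thresholds to your own experience; the point is only that a larger face area gives the model more detail to work with.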

Scene complexity directly influences recognizability and rendering accuracy. There is always a balance between identity precision and cinematic storytelling.

Group 1410082264.png