How to Use the SadTalker AI Tool (Stable Diffusion WebUI Automatic1111)

Today, I’ll provide you with a detailed guide on using SadTalker, both as an extension of the Stable Diffusion WebUI (Automatic1111) and as a standalone application, to give your character a natural voice.

Accessing SadTalker Extension:

Access the WebUI Automatic1111 Interface:

Head to the Extensions tab.
Under Available, click the Load from button.
In the Order option, select ‘stars.’
Locate and install the SadTalker extension, which appears second in the sorted list with about 6600 stars.

Installation Process:

After installation, navigate back to the Installed submenu.
Click the SadTalker URL link to visit the SadTalker GitHub homepage.
Back in the Installed submenu, check for updates, apply them, and restart the UI.
Close the WebUI and run ‘webui-user.bat’ again to restart it.

Setting Up Pre-Trained Models:

On the SadTalker homepage, find ‘Pre-Trained Models.’
Download one or both models (256 or 512 safetensors models).
Move the downloaded models into the ‘checkpoints’ folder within the SadTalker folder.
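
To confirm the models landed in the right place, a small script can list what is in the folder. This is a minimal sketch, not part of SadTalker itself: the function name find_model_files and the path SadTalker/checkpoints are assumptions for illustration, so adjust the path to wherever your SadTalker folder lives.

```python
from pathlib import Path


def find_model_files(checkpoints_dir):
    """Return the sorted names of .safetensors files found in checkpoints_dir.

    Returns an empty list if the folder does not exist.
    """
    folder = Path(checkpoints_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.safetensors"))


if __name__ == "__main__":
    # Hypothetical location; replace with the path to your own SadTalker folder.
    found = find_model_files("SadTalker/checkpoints")
    print(found if found else "No .safetensors models found yet.")
```

If the list comes back empty, the models are probably still in your downloads folder or in a different subfolder.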

Running SadTalker:

Run the ‘webui.bat’ file in the SadTalker folder. The first run installs the necessary packages and creates a virtual environment folder.
Once the installation completes and you see ‘Running on local URL,’ open that URL in your favorite browser.
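
If the browser page does not load, it can help to check whether anything is listening on the local address. The sketch below assumes the default Gradio address of 127.0.0.1 on port 7860; your console output may show a different port, in which case pass that one instead. The function name is_listening is made up for this example.

```python
import socket


def is_listening(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 7860 is an assumption; use the port printed in your own console.
    print("Server reachable:", is_listening())
```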

How to use SadTalker AI:

Launching the Application:

Load the source image you want to animate.
Choose an audio file that suits your project.

Settings and Options:

Utilize the full resolution for the face image.
Opt for still mode to enhance image quality.
Click ‘generate’ to initiate the process.
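
The same settings can also be passed on the command line instead of through the browser. The sketch below only builds the argument list. It assumes the inference.py script and the --source_image, --driven_audio, --preprocess, and --still options as I understand them from the SadTalker README, and it assumes that the ‘full’ setting above corresponds to --preprocess full. Confirm the options with python inference.py --help before relying on them. The helper name and the file names are hypothetical.

```python
def build_sadtalker_command(image_path, audio_path, still=True, preprocess="full"):
    """Build an argument list for SadTalker's inference script.

    The flag names are assumptions taken from the project README and
    should be checked against your installed version.
    """
    cmd = [
        "python", "inference.py",
        "--source_image", image_path,
        "--driven_audio", audio_path,
        "--preprocess", preprocess,
    ]
    if still:
        cmd.append("--still")  # still mode: less head motion
    return cmd


if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    print(" ".join(build_sadtalker_command("face.png", "voice.wav")))
```

Building the command as a list rather than a single string makes it safe to hand to subprocess.run later, even when paths contain spaces.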

Troubleshooting Errors:

If you encounter an error about a missing mapping model, revisit the SadTalker homepage.
Download the required mapping files and place them in the ‘checkpoints’ folder.
Rerun the application.
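
Before rerunning, you can check which files are still missing. This is a rough sketch: the function name missing_files is invented here, and the file names in the usage example are only examples of what an error message might mention. Use the exact names from your own error message.

```python
from pathlib import Path


def missing_files(checkpoints_dir, required_names):
    """Return the names from required_names that are not files in checkpoints_dir."""
    folder = Path(checkpoints_dir)
    return [name for name in required_names if not (folder / name).is_file()]


if __name__ == "__main__":
    # Example names only; copy the real ones from the error message you see.
    required = ["mapping_00109-model.pth.tar", "mapping_00229-model.pth.tar"]
    # Hypothetical path; adjust to your SadTalker folder.
    print("Still missing:", missing_files("SadTalker/checkpoints", required))
```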

Reviewing Results:

Once generated, preview and download your creation.

Additional Features and Enhancements

WebUI A1111 Integration:
- Restart the WebUI A1111 interface and access the SadTalker submenu.
- Repeat the process with different images and voices; the results should be similarly good.
Access to Quality Voices:
- Explore and download voices from reliable sources available online.
- Check out recommended AI projects for voice-over tasks.
Customizing Voice Speed:
- Use Audacity, an open-source audio editor, to adjust voice speed.
- Follow the steps: load the file, select ‘Change Speed’ in the Effect submenu, adjust the speed, and save the modified file.
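
If you prefer a script to the Audacity steps above, a similar effect can be sketched with Python's built-in wave module. This is a simplified stand-in, not what Audacity does internally: it keeps the samples unchanged and rewrites the sample rate in the file header, so the audio plays slower or faster and the pitch shifts with it. It assumes an uncompressed PCM WAV file, and the function name change_speed is made up for this example.

```python
import wave


def change_speed(src_path, dst_path, factor):
    """Write a copy of a PCM WAV file that plays `factor` times as fast.

    factor < 1 slows the voice down, factor > 1 speeds it up.
    Pitch changes along with speed.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)
    new_rate = max(1, round(params.framerate * factor))
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(new_rate)  # only the playback rate changes
        dst.writeframes(frames)
```

For example, change_speed("voice.wav", "voice_slow.wav", 0.9) would produce a file that plays at 90% of the original speed. Some players handle unusual sample rates poorly, so test the result before loading it into SadTalker.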
Optimizing Playback Speed:
- After generating the video, use video editing software to adjust the playback speed for smoother results.

Tips and Tricks

Experiment with different face models for varied outcomes.
Consider slowing down the voice speed for better synchronization with the generated video.

Conclusion

SadTalker is a versatile tool that operates both independently and as an extension of Stable Diffusion A1111, enabling users to bring photos to life with realistic voices. By following these steps and exploring additional enhancements, you can create engaging and dynamic content.

Demi Franco

Demi Franco, a BTech in AI from CQUniversity, is a passionate writer focused on AI. She crafts insightful articles and blog posts that make complex AI topics accessible and engaging.