SadTalker GitHub: Creating Realistic Talking Head Videos

SadTalker is an exciting project that combines single portrait images with audio to generate realistic talking head videos. Developed by researchers from Xi’an Jiaotong University, Tencent AI Lab, and Ant Group, SadTalker has gained attention for its ability to create animated faces that appear to speak based on audio input.

What Is the SadTalker Project?

SadTalker is a novel approach to audio-driven single-image talking face animation. It takes a static portrait image and animates it to simulate speech using audio. The result is a dynamic talking head video that can be used for various applications, including virtual avatars, video games, and entertainment.

Key Features

Image-to-Video Conversion: SadTalker transforms a single image into a talking head video by synchronizing it with audio.
Apache 2.0 License: The project’s license has been updated to Apache 2.0, allowing broader use and integration.
Integration with Discord: SadTalker is now officially integrated into Discord, enabling users to generate videos by sending files.
High-Quality Video Generation: Through the Discord integration, users can also generate high-quality videos from text prompts.
WebUI Extension: A stable-diffusion-webui extension has been published, enhancing usability.

Getting Started

To use SadTalker, follow these steps:

1. Installation

Install Anaconda, Python, and Git.
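Clone the repository, then create a conda environment and install the dependencies from inside the repository folder:

git clone https://github.com/OpenTalker/SadTalker.git
cd SadTalker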
conda create -n sadtalker python=3.8
conda activate sadtalker
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
conda install ffmpeg
pip install -r requirements.txt
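
Before running the commands above, it can help to confirm that the prerequisite tools are actually on your PATH. The following is a minimal sketch, not part of SadTalker: the function name missing_tools is hypothetical, and the tool list (git, conda, ffmpeg) is taken from the installation steps in this article.

```shell
#!/usr/bin/env bash
# Sketch: report which required command-line tools are missing from PATH.
# missing_tools is a hypothetical helper, not something SadTalker ships.

missing_tools() {
  local tool
  for tool in "$@"; do
    # command -v prints nothing and returns non-zero when a tool is absent
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools git conda ffmpeg)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line
  echo "Missing tools:" $missing
else
  echo "All prerequisites found."
fi
```

Note that ffmpeg is installed by one of the commands above, so it may be reported as missing until that step has run inside the activated environment.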

Windows:

On Windows, you can follow these instructions instead:

Install Python 3.8 and check “Add Python to PATH”.
Install Git manually or with Scoop: scoop install git.
Install ffmpeg with the following command: scoop install ffmpeg.
Download the SadTalker repository by running git clone https://github.com/OpenTalker/SadTalker.git.
Download the checkpoints and GFPGAN models listed in the downloads section of the project README.
Run start.bat from Windows Explorer as a normal (non-administrator) user, and a Gradio-powered WebUI demo will start.

2. Download Models

You can run the following script on Linux/macOS to automatically download all the models:

bash scripts/download_models.sh

The project also provides an offline patch for GFPGAN, so no models need to be downloaded at generation time.
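
After the download finishes, a quick check can confirm that the model folders are populated. This is a rough sketch under one assumption: that the download script places weights in ./checkpoints and ./gfpgan/weights relative to the repository root. Verify those paths against scripts/download_models.sh in your checkout. The helper names dir_has_files and check_models are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the model directories exist and are not empty.
# ASSUMPTION: weights live in ./checkpoints and ./gfpgan/weights.

dir_has_files() {
  # True when $1 is a directory containing at least one entry
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

check_models() {
  local root="${1:-.}" status=0 dir
  for dir in checkpoints gfpgan/weights; do
    if dir_has_files "$root/$dir"; then
      echo "ok: $dir"
    else
      echo "missing or empty: $dir"
      status=1
    fi
  done
  return "$status"
}

# Run from the repository root; suggest the download script on failure
check_models . || echo "Run: bash scripts/download_models.sh"
```

The check only looks for non-empty folders. It does not validate individual files, so a partially downloaded model would still pass.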

3. Demo Video

Watch the demo video to see SadTalker in action. Full image mode is now available, allowing even more expressive animations.
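
To try full image mode from the command line, the sketch below prints (but does not run) an inference command. The flags shown (--driven_audio, --source_image, --result_dir, --still, --preprocess full, --enhancer gfpgan) follow the project README as best I recall it, so confirm them with python inference.py --help before relying on them. The function name build_inference_cmd, the file names portrait.png and speech.wav, and the default output folder are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: assemble a SadTalker inference command for full image mode.
# It only prints the command, so you can review it before running it.
# Paths containing spaces would need quoting when you run the result.

build_inference_cmd() {
  local image="$1" audio="$2" out="${3:-./results}"
  printf 'python inference.py --driven_audio %s --source_image %s --result_dir %s --still --preprocess full --enhancer gfpgan\n' \
    "$audio" "$image" "$out"
}

# Hypothetical input files; replace with your own portrait and audio
build_inference_cmd portrait.png speech.wav
```

Run the printed command from the repository root with the sadtalker environment activated and the models already downloaded.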

Community Contributions

SadTalker has gained popularity, and community members have created tutorials and demos. Check out community demos on platforms like Bilibili and YouTube.

Changelog

[2023.06.12]: Added new features in the WebUI extension.
[2023.06.05]: Released a new 512x512px (beta) face model and improved performance.
[2023.04.15]: Added a WebUI Colab notebook.
[2023.04.12]: Added a detailed WebUI installation document and fixed reinstallation issues.
[2023.04.08]: Added a logo watermark to prevent abuse (later removed).
[2023.04.08]: Introduced full image animation and added checkpoint downloads from Baidu.

Demi Franco

Demi Franco, a BTech in AI from CQUniversity, is a passionate writer focused on AI. She crafts insightful articles and blog posts that make complex AI topics accessible and engaging.