This step-by-step guide walks you through using the SadTalker Colab notebook to turn a still image and an audio clip into a talking-head video.
SadTalker Google Colab Notebook: Step by Step
Step 1: Go to the SadTalker Colab
Open the SadTalker Google Colab notebook here, then click the Connect button.
Step 2: Setup (Approx. 5 minutes)
- Update Python Version: First, register the Python interpreters the notebook needs. The commands below make Python 3.8 the default python3 (the higher priority number, 2, wins), which matches the python3.8 calls used later in this guide:
!update-alternatives --install /usr/local/bin/python3 python3 /usr/bin/python3.8 2
!update-alternatives --install /usr/local/bin/python3 python3 /usr/bin/python3.9 1
- Check Python Version: Confirm that the switch took effect with:
!python3 --version
- Update Packages: Update package repositories:
!apt-get update
- Install Dependencies: Install necessary packages and libraries:
!apt install software-properties-common
!sudo dpkg --remove --force-remove-reinstreq python3-pip python3-setuptools python3-wheel
!apt-get install python3-pip
- Clone Project and Install Requirements: Clone the SadTalker project from GitHub and install its requirements:
print('Git clone project and install requirements...')
!git clone https://github.com/cedro3/SadTalker.git &> /dev/null
%cd SadTalker
%env PYTHONPATH=/content/SadTalker
(Note: a plain !export only affects its own subshell in Colab, so the %env magic is used here to make the variable persist across cells.)
!python3.8 -m pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
!apt update
!apt install ffmpeg &> /dev/null
!python3.8 -m pip install -r requirements.txt
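Before moving on, it can help to confirm that the key libraries installed correctly. The sketch below is my own addition (check_env is not part of SadTalker, and the module names are just the ones installed above); it reports which modules are importable in the current runtime:

```python
import importlib.util

def check_env(modules=("torch", "torchvision", "torchaudio")):
    """Map each module name to whether it can be imported in this runtime."""
    return {m: importlib.util.find_spec(m) is not None for m in modules}

for name, ok in check_env().items():
    print(f"{name}: {'installed' if ok else 'MISSING'}")
```

If anything prints MISSING, re-run the corresponding pip cell before continuing.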
Step 3: Download Model (Approx. 1 minute)
- Download Pre-trained Models: Fetch pre-trained models for SadTalker:
print('Download pre-trained models...')
!rm -rf checkpoints
!bash scripts/download_models.sh
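If the download script finishes silently, you can sanity-check that the checkpoints actually arrived. This small helper is my own (the commented file name is a placeholder, not a definitive list of SadTalker's checkpoint files):

```python
from pathlib import Path

def missing_files(directory, names):
    """Return the subset of names that are not present in directory."""
    d = Path(directory)
    return [n for n in names if not (d / n).exists()]

# Example usage (the file name here is illustrative):
# print(missing_files("checkpoints", ["some_checkpoint.pth"]))
```

An empty list means every expected file was found.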
Step 4: Inference for Face
- Perform Inference for Face: Choose the image and audio files for the animation. Replace 'full3.png' and 'eluosi.wav' with your preferred files:
image = 'full3.png' #@param {type:"string"}
audio = 'eluosi.wav' #@param {type:"string"}
source_image = 'examples/source_image/' + image
driven_audio = 'examples/driven_audio/' + audio
!python3.8 inference.py --driven_audio $driven_audio \
--source_image $source_image \
--result_dir ./results --enhancer gfpgan
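The two path variables above are built by plain string concatenation. A slightly more robust way to assemble the same command is with os.path.join and a list of arguments; build_inference_cmd is a helper of my own, not part of SadTalker:

```python
import os
import shlex

def build_inference_cmd(image, audio, result_dir="./results",
                        image_dir="examples/source_image",
                        audio_dir="examples/driven_audio"):
    """Assemble the inference.py command line used in the cell above."""
    cmd = ["python3.8", "inference.py",
           "--driven_audio", os.path.join(audio_dir, audio),
           "--source_image", os.path.join(image_dir, image),
           "--result_dir", result_dir,
           "--enhancer", "gfpgan"]
    # shlex.quote guards against file names containing spaces
    return " ".join(shlex.quote(part) for part in cmd)

print(build_inference_cmd("full3.png", "eluosi.wav"))
```

You could then run the returned string from a Colab cell, or simply keep the original !python3.8 cell above.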
Step 5: Play Movie
- Display Animation: After generating the animation, run the following code to display it:
import glob
import sys
from base64 import b64encode
from IPython.display import HTML, display
# Pick the latest generated movie file (last .mp4 in ./results by sorted name)
mp4_name = sorted(glob.glob('./results/*.mp4'))[-1]
with open(mp4_name, 'rb') as f:
    mp4 = f.read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
print(f'Display animation: {mp4_name}', file=sys.stderr)
display(HTML(f"""
<video width=256 controls>
  <source src="{data_url}" type="video/mp4">
</video>
"""))
Step 6: Inference for Portrait
- Perform Inference for Portrait: Similarly, run inference in portrait mode with the code below. The --still flag reduces head movement, and --preprocess full animates the whole source image rather than a cropped face:
image = 'full3.png' #@param {type:"string"}
audio = 'eluosi.wav' #@param {type:"string"}
source_image = 'examples/source_image/' + image
driven_audio = 'examples/driven_audio/' + audio
!python3.8 inference.py --driven_audio $driven_audio \
--source_image $source_image \
--result_dir ./results --still --preprocess full --enhancer gfpgan
Step 7: Play Movie
- Display Animation: Finally, display the generated portrait animation by re-running the playback code from Step 5.
With these steps, you can use the SadTalker Colab notebook to create talking-head animations from a single image and an audio clip.
Demi Franco, a BTech in AI from CQUniversity, is a passionate writer focused on AI. She crafts insightful articles and blog posts that make complex AI topics accessible and engaging.