This step-by-step guide walks you through using the SadTalker Colab notebook to turn a still image and an audio clip into a talking-head video.
SadTalker Google Colab Notebook: Step by Step
Step 1: Go to the SadTalker Colab
Open the SadTalker Google Colab notebook here, then click the Connect button.
Step 2: Setup (Approx. 5 minutes)
- Update Python Version: First, register the Python interpreters the notebook needs. The commands below make Python 3.8 the default python3 (the higher priority number, 2, wins), which matches the python3.8 calls used later in this guide:
!update-alternatives --install /usr/local/bin/python3 python3 /usr/bin/python3.8 2
!update-alternatives --install /usr/local/bin/python3 python3 /usr/bin/python3.9 1
- Check Python Version: Confirm that the switch took effect with:
!python3 --version
- Update Packages: Update package repositories:
!apt-get update
- Install Dependencies: Install necessary packages and libraries:
!apt install software-properties-common
!sudo dpkg --remove --force-remove-reinstreq python3-pip python3-setuptools python3-wheel
!apt-get install python3-pip
- Clone Project and Install Requirements: Clone the SadTalker project from GitHub and install its requirements:
print('Git clone project and install requirements...')
!git clone https://github.com/cedro3/SadTalker.git &> /dev/null
%cd SadTalker
%env PYTHONPATH=/content/SadTalker
(Note: a plain !export only affects its own subshell in Colab, so the %env magic is used here to make the variable persist across cells.)
!python3.8 -m pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
!apt update
!apt install ffmpeg &> /dev/null
!python3.8 -m pip install -r requirements.txt
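Before moving on, it can help to confirm that the key libraries installed correctly. The sketch below is my own addition (check_env is not part of SadTalker, and the module names are just the ones installed above); it reports which modules are importable in the current runtime:

```python
import importlib.util

def check_env(modules=("torch", "torchvision", "torchaudio")):
    """Map each module name to whether it can be imported in this runtime."""
    return {m: importlib.util.find_spec(m) is not None for m in modules}

for name, ok in check_env().items():
    print(f"{name}: {'installed' if ok else 'MISSING'}")
```

If anything prints MISSING, re-run the corresponding pip cell before continuing.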
Step 3: Download Model (Approx. 1 minute)
- Download Pre-trained Models: Fetch pre-trained models for SadTalker:
print('Download pre-trained models...')
!rm -rf checkpoints
!bash scripts/download_models.sh
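If the download script finishes silently, you can sanity-check that the checkpoints actually arrived. This small helper is my own (the commented file name is a placeholder, not a definitive list of SadTalker's checkpoint files):

```python
from pathlib import Path

def missing_files(directory, names):
    """Return the subset of names that are not present in directory."""
    d = Path(directory)
    return [n for n in names if not (d / n).exists()]

# Example usage (the file name here is illustrative):
# print(missing_files("checkpoints", ["some_checkpoint.pth"]))
```

An empty list means every expected file was found.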
Step 4: Inference for Face
- Perform Inference for Face: Choose the image and audio files for the animation. Replace 'full3.png' and 'eluosi.wav' with your preferred files:
image = 'full3.png' #@param {type:"string"}
audio = 'eluosi.wav' #@param {type:"string"}
source_image = 'examples/source_image/' + image
driven_audio = 'examples/driven_audio/' + audio
!python3.8 inference.py --driven_audio $driven_audio \
--source_image $source_image \
--result_dir ./results --enhancer gfpgan
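The two path variables above are built by plain string concatenation. A slightly more robust way to assemble the same command is with os.path.join and a list of arguments; build_inference_cmd is a helper of my own, not part of SadTalker:

```python
import os
import shlex

def build_inference_cmd(image, audio, result_dir="./results",
                        image_dir="examples/source_image",
                        audio_dir="examples/driven_audio"):
    """Assemble the inference.py command line used in the cell above."""
    cmd = ["python3.8", "inference.py",
           "--driven_audio", os.path.join(audio_dir, audio),
           "--source_image", os.path.join(image_dir, image),
           "--result_dir", result_dir,
           "--enhancer", "gfpgan"]
    # shlex.quote guards against file names containing spaces
    return " ".join(shlex.quote(part) for part in cmd)

print(build_inference_cmd("full3.png", "eluosi.wav"))
```

You could then run the returned string from a Colab cell, or simply keep the original !python3.8 cell above.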
Step 5: Play Movie
- Display Animation: After generating the animation, run the following code to display it:
import glob
import sys
from base64 import b64encode
from IPython.display import HTML, display
# Pick the latest generated movie file (last .mp4 in ./results by sorted name)
mp4_name = sorted(glob.glob('./results/*.mp4'))[-1]
with open(mp4_name, 'rb') as f:
    mp4 = f.read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
print(f'Display animation: {mp4_name}', file=sys.stderr)
display(HTML(f"""
<video width=256 controls>
  <source src="{data_url}" type="video/mp4">
</video>
"""))
Step 6: Inference for Portrait
- Perform Inference for Portrait: Similarly, run inference in portrait mode with the code below. The --still flag reduces head movement, and --preprocess full animates the whole source image rather than a cropped face:
image = 'full3.png' #@param {type:"string"}
audio = 'eluosi.wav' #@param {type:"string"}
source_image = 'examples/source_image/' + image
driven_audio = 'examples/driven_audio/' + audio
!python3.8 inference.py --driven_audio $driven_audio \
--source_image $source_image \
--result_dir ./results --still --preprocess full --enhancer gfpgan
Step 7: Play Movie
- Display Animation: Finally, display the generated portrait animation by re-running the playback code from Step 5.
With these steps, you can use the SadTalker Colab notebook to create talking-head animations from a single image and an audio clip.
Demi Franco, a BTech in AI from CQUniversity, is a passionate writer focused on AI. She crafts insightful articles and blog posts that make complex AI topics accessible and engaging.