SadTalker AI Google Colab Notebook (Quick Guide)

This step-by-step guide walks you through using the SadTalker Colab notebook to create talking-head videos from a still image and an audio clip.

SadTalker Google Colab Notebook: Step by Step

Step 1: Go to Sadtalker Colab

Visit the SadTalker Google Colab notebook here. Make sure a GPU runtime is selected (Runtime > Change runtime type > GPU), then click the Connect button.

Step 2: Setup (Approx. 5 minutes)

  1. Set the Python Version: First, make Python 3.8 the default python3; the rest of the notebook invokes python3.8 directly. With update-alternatives, the higher priority number wins, so 3.8 at priority 2 outranks 3.9 at priority 1:
!update-alternatives --install /usr/local/bin/python3 python3 /usr/bin/python3.8 2  
!update-alternatives --install /usr/local/bin/python3 python3 /usr/bin/python3.9 1  
  2. Check Python Version: Confirm that python3 now reports 3.8:
!python3 --version
  3. Update Packages: Update package repositories:
!apt-get update
  4. Install Dependencies: Install necessary packages and libraries:
!apt install software-properties-common
!sudo dpkg --remove --force-remove-reinstreq python3-pip python3-setuptools python3-wheel
!apt-get install python3-pip
  5. Clone Project and Install Requirements: Clone the SadTalker project from GitHub and install its requirements; an optional sanity check follows the cell:
print('Git clone project and install requirements...')
!git clone https://github.com/cedro3/SadTalker.git &> /dev/null
%cd SadTalker
# "!export ..." would not persist beyond its own line (each ! runs in its own shell),
# so set PYTHONPATH in the notebook process; later ! commands inherit it
import os
os.environ['PYTHONPATH'] = '/content/SadTalker:' + os.environ.get('PYTHONPATH', '')
!python3.8 -m pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
!apt update
!apt install ffmpeg &> /dev/null  
!python3.8 -m pip install -r requirements.txt
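
Optionally, sanity-check the setup before moving on. This cell is not part of the original notebook; it just confirms that the pinned PyTorch build imports under python3.8 and can see the Colab GPU:

# Optional check: expect "1.12.1+cu113 True" on a GPU runtime
!python3.8 -c "import torch; print(torch.__version__, torch.cuda.is_available())"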

Step 3: Download Model (Approx. 1 minute)

  1. Download Pre-trained Models: Fetch the pre-trained models for SadTalker; an optional check follows the cell:
print('Download pre-trained models...')
!rm -rf checkpoints
!bash scripts/download_models.sh
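
To confirm the download worked, list the directory the script populates (the rm -rf above suggests everything lands in ./checkpoints):

# Optional: the pre-trained weights should now appear here
!ls -lh checkpoints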

Step 4: Inference for Face

  1. Perform Inference for Face: Choose the image and audio that will drive the animation. Replace ‘full3.png’ and ‘eluosi.wav’ with your preferred files (see the optional upload snippet after this cell):
image = 'full3.png' #@param {type:"string"}
audio = 'eluosi.wav' #@param {type:"string"}
source_image = 'examples/source_image/' + image
driven_audio = 'examples/driven_audio/' + audio

!python3.8 inference.py --driven_audio $driven_audio \
           --source_image $source_image \
           --result_dir ./results --enhancer gfpgan
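
To use your own files, upload them into the folders the form fields read from. Here is a minimal sketch using Colab's upload widget; my_face.png and my_voice.wav are placeholder names for whatever you pick:

# Upload files from your machine, then move them where the cell above expects them
from google.colab import files
uploaded = files.upload()  # choose e.g. my_face.png and my_voice.wav
!mv my_face.png examples/source_image/
!mv my_voice.wav examples/driven_audio/

Afterwards, set the image and audio form fields to the new filenames and re-run the inference cell.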

Step 5: Play Movie

  1. Display Animation: After generating the animation, run the following code to display it; an optional download cell follows:
import glob
import sys
from base64 import b64encode
from IPython.display import HTML

# Get the latest generated movie file (timestamped names sort chronologically)
mp4_name = sorted(glob.glob('./results/*.mp4'))[-1]

mp4 = open(mp4_name, 'rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()

print('Display animation: {}'.format(mp4_name), file=sys.stderr)
display(HTML("""
  <video width=256 controls>
        <source src="%s" type="video/mp4">
  </video>
  """ % data_url))

Step 6: Inference for Portrait

  1. Perform Inference for Portrait: Run inference again with the portrait flags: --still keeps the head pose close to the source image, and --preprocess full animates the full picture rather than a cropped face:
image = 'full3.png' #@param {type:"string"}
audio = 'eluosi.wav' #@param {type:"string"}
source_image = 'examples/source_image/' + image
driven_audio = 'examples/driven_audio/' + audio

!python3.8 inference.py --driven_audio $driven_audio \
           --source_image $source_image \
           --result_dir ./results --still --preprocess full --enhancer gfpgan

Step 7: Play Movie

  1. Display Animation: Lastly, display the generated portrait animation by re-running the Step 5 cell, or use the compact variant below.
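
If you would rather not scroll back, here is a compact equivalent of the Step 5 playback cell; it assumes the portrait render finished and wrote its .mp4 under ./results:

import glob
from base64 import b64encode
from IPython.display import HTML

# With timestamped names, the lexicographically last file is the newest
mp4_name = sorted(glob.glob('./results/*.mp4'))[-1]
data_url = 'data:video/mp4;base64,' + b64encode(open(mp4_name, 'rb').read()).decode()
display(HTML(f'<video width=256 controls><source src="{data_url}" type="video/mp4"></video>'))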

With these simple steps, you can use the SadTalker Colab notebook to turn a single image and an audio clip into a talking-head animation.