Getting Started

This guide will help you set up Emotion-LLaMA on your system and get started with multimodal emotion recognition.


Table of Contents

  1. System Requirements
    1. Hardware Requirements
    2. Software Requirements
  2. Quick Installation
    1. Clone the Repository
    2. Create Conda Environment
    3. Install Additional Dependencies
  3. Preparing Pre-trained Models
    1. Llama-2-7b-chat-hf
    2. Configure Model Path
    3. MiniGPT-v2 Checkpoint
    4. HuBERT-large Model (For Demo)
  4. Project Structure
    1. Key Directories
  5. Verification
  6. Next Steps
  7. Troubleshooting
    1. Common Issues

System Requirements

Hardware Requirements

  • GPU: NVIDIA GPU with at least 24GB VRAM (for training)
  • RAM: 32GB or more recommended
  • Storage: At least 50GB free space for models and datasets

Software Requirements

  • Operating System: Linux (Ubuntu 18.04+), Windows 10/11, or macOS
  • Python: 3.8 or later
  • CUDA: 11.0 or later (for GPU acceleration)
  • Conda: Anaconda or Miniconda
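
You can confirm these requirements in one step once PyTorch is installed (the conda environment created below provides a CUDA build). A minimal sketch, assuming PyTorch with CUDA support:

import sys
import torch

# Confirm the requirements above: Python 3.8+, CUDA 11.0+, >= 24GB VRAM for training
print("Python:", sys.version.split()[0])
print("CUDA available:", torch.cuda.is_available())
print("CUDA version (torch build):", torch.version.cuda)
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")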

Quick Installation

1. Clone the Repository

git clone https://github.com/ZebangCheng/Emotion-LLaMA.git
cd Emotion-LLaMA

2. Create Conda Environment

conda env create -f environment.yaml
conda activate llama

The environment setup may take several minutes depending on your internet connection.

3. Install Additional Dependencies

pip install moviepy==1.0.3
pip install soundfile==0.12.1
pip install opencv-python==4.7.0.72
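
To confirm these packages installed correctly, a quick sanity check:

# Verify the extra dependencies import cleanly and match the pinned versions
import moviepy
import soundfile
import cv2

print("moviepy:", moviepy.__version__)      # expect 1.0.3
print("soundfile:", soundfile.__version__)  # expect 0.12.1
print("opencv:", cv2.__version__)           # expect 4.7.0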

Preparing Pre-trained Models

Llama-2-7b-chat-hf

Download the Llama-2-7b-chat-hf model from Hugging Face:

https://huggingface.co/meta-llama/Llama-2-7b-chat-hf

Save the model to Emotion-LLaMA/checkpoints/Llama-2-7b-chat-hf/
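
Note that Llama-2 is a gated model: you must accept Meta's license on the model page before downloading. One way to fetch the weights is with the huggingface_hub library; this is a sketch, and the token value is a placeholder for your own access token:

from huggingface_hub import snapshot_download

# Download the gated Llama-2 weights into the expected checkpoint directory.
# Requires an approved access request and a Hugging Face access token.
snapshot_download(
    repo_id="meta-llama/Llama-2-7b-chat-hf",
    local_dir="checkpoints/Llama-2-7b-chat-hf",
    token="hf_...",  # placeholder: substitute your own token
)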

Configure Model Path

Specify the path to Llama-2 in the model config file (minigpt4/configs/models/minigpt_v2.yaml):

# Set Llama-2-7b-chat-hf path
llama_model: "/path/to/Emotion-LLaMA/checkpoints/Llama-2-7b-chat-hf"

MiniGPT-v2 Checkpoint

Specify the path to MiniGPT-v2 in the training config file (train_configs/Emotion-LLaMA_finetune.yaml):

# Set MiniGPT-v2 path
ckpt: "/path/to/Emotion-LLaMA/checkpoints/minigptv2_checkpoint.pth"

Replace /path/to/ with your actual installation path.

HuBERT-large Model (For Demo)

Download the HuBERT-large model from Hugging Face:

https://huggingface.co/TencentGameMate/chinese-hubert-large

Save to Emotion-LLaMA/checkpoints/transformer/chinese-hubert-large/
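
The same snapshot_download pattern shown earlier works here as well (a sketch; this repository is public, so typically no token is needed):

from huggingface_hub import snapshot_download

# Fetch the HuBERT-large audio encoder into the expected directory
snapshot_download(
    repo_id="TencentGameMate/chinese-hubert-large",
    local_dir="checkpoints/transformer/chinese-hubert-large",
)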

Specify the path in minigpt4/conversation/conversation.py:

# Set HuBERT-large model path
model_file = "checkpoints/transformer/chinese-hubert-large"
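
As a smoke test, you can load the audio encoder from the local directory with the transformers library (a sketch, assuming transformers is installed in the environment):

from transformers import HubertModel

# Load the audio encoder from the local checkpoint directory
model = HubertModel.from_pretrained("checkpoints/transformer/chinese-hubert-large")
print(model.config.hidden_size)  # 1024 for the -large variant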

Project Structure

The complete project structure including datasets and checkpoints:

πŸ“¦ Dataset (External)
 └── πŸ“¦ Emotion
     └── πŸ“‚ MER2023
         β”œβ”€β”€ πŸ“‚ video                      # Raw video files
         β”œβ”€β”€ πŸ“‚ HL-UTT                     # HuBERT features
         β”œβ”€β”€ πŸ“‚ mae_340_UTT                # MAE features
         β”œβ”€β”€ πŸ“‚ maeV_399_UTT               # VideoMAE features
         β”œβ”€β”€ πŸ“„ transcription_en_all.csv   # Video transcriptions
         β”œβ”€β”€ πŸ“„ MERR_coarse_grained.txt    # 28,618 coarse-grained annotations
         β”œβ”€β”€ πŸ“„ MERR_coarse_grained.json
         β”œβ”€β”€ πŸ“„ MERR_fine_grained.txt      # 4,487 fine-grained annotations
         └── πŸ“„ MERR_fine_grained.json

πŸ“¦ Emotion-LLaMA (Project Root)
 β”œβ”€β”€ πŸ“‚ checkpoints/                      # Pre-trained models
 β”‚   β”œβ”€β”€ πŸ“‚ Llama-2-7b-chat-hf/          # Base LLaMA model
 β”‚   β”œβ”€β”€ πŸ“‚ save_checkpoint/             # Trained checkpoints
 β”‚   β”‚   β”œβ”€β”€ πŸ“‚ stage2/
 β”‚   β”‚   β”‚   β”œβ”€β”€ checkpoint_best.pth     # Best model from stage 2
 β”‚   β”‚   β”‚   └── log.txt                 # Training logs
 β”‚   β”‚   └── Emoation_LLaMA.pth          # Demo model
 β”‚   β”œβ”€β”€ πŸ“‚ transformer/
 β”‚   β”‚   └── πŸ“‚ chinese-hubert-large/    # Audio encoder
 β”‚   └── minigptv2_checkpoint.pth        # MiniGPT-v2 base
 β”œβ”€β”€ πŸ“‚ minigpt4/                        # Core model implementation
 β”‚   β”œβ”€β”€ πŸ“‚ configs/                     # Model and dataset configs
 β”‚   β”œβ”€β”€ πŸ“‚ datasets/                    # Dataset loaders
 β”‚   β”œβ”€β”€ πŸ“‚ models/                      # Model architectures
 β”‚   β”œβ”€β”€ πŸ“‚ processors/                  # Data processors
 β”‚   └── πŸ“‚ conversation/                # Conversation templates
 β”œβ”€β”€ πŸ“‚ train_configs/                   # Training configurations
 β”‚   β”œβ”€β”€ Emotion-LLaMA_finetune.yaml     # Stage 1 config
 β”‚   └── minigptv2_tuning_stage_2.yaml   # Stage 2 config
 β”œβ”€β”€ πŸ“‚ eval_configs/                    # Evaluation configurations
 β”‚   β”œβ”€β”€ demo.yaml                       # Demo config
 β”‚   β”œβ”€β”€ eval_emotion.yaml               # MER2023 eval config
 β”‚   └── eval_emotion_EMER.yaml          # EMER eval config
 β”œβ”€β”€ πŸ“‚ examples/                        # Example video clips
 β”œβ”€β”€ πŸ“‚ images/                          # Documentation images
 β”œβ”€β”€ πŸ“‚ docs/                            # Documentation site (this site!)
 β”œβ”€β”€ πŸ“‘ train.py                         # Training script
 β”œβ”€β”€ πŸ“‘ eval_emotion.py                  # Evaluation script (MER2023)
 β”œβ”€β”€ πŸ“‘ eval_emotion_EMER.py             # Evaluation script (EMER)
 β”œβ”€β”€ πŸ“‘ app.py                           # Gradio demo
 β”œβ”€β”€ πŸ“‘ app_EmotionLlamaClient.py        # API client
 β”œβ”€β”€ πŸ“œ environment.yaml                 # Conda environment
 └── πŸ“œ requirements.txt                 # Python dependencies
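
Before training or running the demo, it can help to confirm your checkpoint layout matches the tree above. A minimal sketch, run from the project root:

from pathlib import Path

# Check that the expected model files/directories from the tree above exist
required = [
    "checkpoints/Llama-2-7b-chat-hf",
    "checkpoints/minigptv2_checkpoint.pth",
    "checkpoints/transformer/chinese-hubert-large",
]
for path in required:
    status = "OK" if Path(path).exists() else "MISSING"
    print(f"{status:8s} {path}")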

Key Directories

Checkpoints: Stores all pre-trained models and trained checkpoints

  • Download Llama-2-7b-chat-hf from Hugging Face
  • Save trained models from training runs

Dataset: External dataset directory (apply for access)

  • Must be organized as shown above
  • Features should be pre-extracted for faster training

Documentation: This documentation site

  • Built with Jekyll and Just the Docs theme
  • Deployed to GitHub Pages

Verification

To verify your installation, you can run a quick test:

python -c "import torch; print('PyTorch version:', torch.__version__); print('CUDA available:', torch.cuda.is_available())"

Expected output:

PyTorch version: 2.x.x
CUDA available: True

Next Steps

Once your installation is verified:

  • Prepare the MER2023 dataset and pre-extracted features as shown in Project Structure
  • Launch the Gradio demo with app.py
  • Train the model with train.py and the configs in train_configs/
  • Evaluate with eval_emotion.py (MER2023) or eval_emotion_EMER.py (EMER)

Troubleshooting

Common Issues

Issue: CUDA out of memory error

  • Solution: Reduce the batch size in the training configs under train_configs/ or use gradient accumulation

Issue: Missing dependencies

  • Solution: Reinstall the conda environment: conda env remove -n llama && conda env create -f environment.yaml

Issue: Model download fails

  • Solution: Check your internet connection and Hugging Face access permissions

For more help, please open an issue on GitHub.

