API Documentation

Programmatic access to Emotion-LLaMA for integration into your applications.


Table of Contents

  1. Overview
  2. Quick Start
    1. Start the API Server
    2. Make a Request
  3. API Methods
    1. Gradio API
    2. FastAPI
  4. Authentication
  5. Rate Limiting
  6. Error Handling
    1. Common Status Codes
    2. Error Response Format
    3. Handling Errors in Python
  7. Performance Optimization
    1. Batch Processing
    2. Caching
  8. Deployment
    1. Production Deployment
    2. Docker Deployment
    3. Kubernetes Deployment
  9. SDK and Libraries
    1. Python SDK (Coming Soon)
    2. JavaScript SDK (Coming Soon)
  10. API Versions

Overview

Emotion-LLaMA provides multiple API interfaces for programmatic access:

  • Gradio API - Simple HTTP API automatically generated by Gradio
  • FastAPI - High-performance asynchronous API
  • Python Client - Direct Python integration (see the sketch below)
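
For the Python Client option, Gradio ships an official client library. A minimal sketch, assuming the server runs at the default address; the api_name below is an assumption, so inspect client.view_api() to confirm the endpoint actually exposed by app_EmotionLlamaClient.py:

from gradio_client import Client

# Connect to the running Emotion-LLaMA server
client = Client("http://localhost:7889")

# api_name is an assumption; check client.view_api() for the real name
result = client.predict(
    "/path/to/video.mp4",
    "[emotion] What emotion is expressed in this video?",
    api_name="/predict",
)
print(result)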

Quick Start

1. Start the API Server

Launch the Emotion-LLaMA client API:

python app_EmotionLlamaClient.py

The API will be available at:

  • Gradio API: http://localhost:7889
  • Host, port, and custom endpoints can be configured when launching the server

2. Make a Request

Python Example:

import requests

url = "http://localhost:7889/api/predict/"
headers = {"Content-Type": "application/json"}

data = {
    "data": [
        "/path/to/video.mp4",
        "[emotion] What emotion is expressed in this video?"
    ]
}

# json= serializes the payload and sets the Content-Type header
response = requests.post(url, headers=headers, json=data)
print(response.json())

cURL Example:

curl -X POST "http://localhost:7889/api/predict/" \
     -H "Content-Type: application/json" \
     -d '{"data": ["/path/to/video.mp4", "[emotion] What emotion is shown?"]}'

API Methods

Gradio API

The Gradio API provides a simple interface for emotion recognition:

Endpoint: /api/predict/

Method: POST

Request Body:

{
  "data": [
    "video_path",
    "prompt_text"
  ]
}

Response:

{
  "data": ["Generated response text"],
  "duration": 1.23
}

FastAPI

For production use, we recommend FastAPI:

Endpoint: /process_video

Method: POST

Request Body:

{
  "video_path": "/path/to/video.mp4",
  "question": "What emotion is expressed?"
}

Response:

{
  "response": "The emotion expressed is happiness."
}
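
A minimal server sketch of this endpoint is shown below. It assumes the model call is wrapped in a helper; run_inference is a hypothetical stand-in, not the project's actual function:

import os

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class VideoRequest(BaseModel):
    video_path: str
    question: str

def run_inference(video_path: str, question: str) -> str:
    # Hypothetical placeholder: invoke the Emotion-LLaMA model here
    return "The emotion expressed is happiness."

@app.post("/process_video")
def process_video(req: VideoRequest):
    if not os.path.exists(req.video_path):
        # Mirrors the 404 case in the status-code table below
        raise HTTPException(status_code=404, detail="Video file not found")
    return {"response": run_inference(req.video_path, req.question)}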

Authentication

Currently, the API does not require authentication. For production deployment, consider adding:

  • API Keys: Simple token-based authentication
  • OAuth 2.0: For more secure applications
  • Rate Limiting: Prevent abuse

Example with API key:

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_API_KEY"
}
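
On the server side, a FastAPI dependency is a simple way to enforce the key. A minimal sketch, assuming the expected key is supplied via an EMOTION_LLAMA_API_KEY environment variable (that variable name is an assumption, not part of the shipped server):

import os

from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()

def verify_api_key(authorization: str = Header(default="")):
    # EMOTION_LLAMA_API_KEY is a hypothetical variable name
    expected = os.environ.get("EMOTION_LLAMA_API_KEY", "")
    if not expected or authorization != f"Bearer {expected}":
        raise HTTPException(status_code=401, detail="Invalid API key")

@app.post("/process_video", dependencies=[Depends(verify_api_key)])
def process_video():
    ...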

Rate Limiting

To prevent server overload, consider implementing rate limiting:

from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/process_video")
@limiter.limit("10/minute")
def process_video(request: Request, video_req: VideoRequest):
    # VideoRequest is the Pydantic model from the FastAPI sketch above;
    # slowapi requires the request parameter to be named "request"
    ...
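
Once a client exceeds 10 requests per minute, slowapi's registered handler responds with HTTP 429 (Too Many Requests).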

Error Handling

Common Status Codes

Code  Status               Description
200   Success              Request processed successfully
400   Bad Request          Invalid input parameters
404   Not Found            Video file not found
500   Server Error         Internal processing error
503   Service Unavailable  Server overloaded

Error Response Format

{
  "error": "Video file not found",
  "code": 404,
  "details": "/path/to/video.mp4 does not exist"
}

Handling Errors in Python

try:
    response = requests.post(url, headers=headers, json=data)
    response.raise_for_status()
    result = response.json()
except requests.exceptions.HTTPError as e:
    # Surface the structured error body shown above, when present
    try:
        err = response.json()
        print(f"HTTP {err['code']}: {err['error']} ({err['details']})")
    except (ValueError, KeyError):
        print(f"HTTP Error: {e}")
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

Performance Optimization

Batch Processing

Process multiple videos efficiently:

import asyncio

import aiohttp
import requests

videos = ["video1.mp4", "video2.mp4", "video3.mp4"]
prompt = "[emotion] What emotion is expressed?"

# Sequential processing
results = []
for video in videos:
    data = {"data": [video, prompt]}
    response = requests.post(url, headers=headers, json=data)
    results.append(response.json())

# Concurrent processing with an async HTTP client
async def process_video_async(session, video):
    data = {"data": [video, prompt]}
    async with session.post(url, json=data) as response:
        return await response.json()

async def process_all():
    async with aiohttp.ClientSession() as session:
        tasks = [process_video_async(session, v) for v in videos]
        return await asyncio.gather(*tasks)

results = asyncio.run(process_all())

Caching

Implement caching for frequently requested videos:

from functools import lru_cache
import hashlib

@lru_cache(maxsize=100)
def get_video_hash(video_path):
    # Hash the file contents so identical videos share a cache entry
    with open(video_path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

# Cache results keyed by (video hash, prompt)
cache = {}

def analyze_cached(video_path, prompt):
    key = (get_video_hash(video_path), prompt)
    if key not in cache:
        # process_video() is your API call from the examples above
        cache[key] = process_video(video_path, prompt)
    return cache[key]
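
Note that both caches above live in process memory; if you run several API workers, a shared store such as Redis keeps cache hits consistent across them.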

Deployment

Production Deployment

For production use:

  1. Use HTTPS: Encrypt API traffic
  2. Add Authentication: Secure your endpoints
  3. Implement Rate Limiting: Prevent abuse
  4. Monitor Performance: Track API usage (see the middleware sketch below)
  5. Scale Horizontally: Multiple API servers behind load balancer
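
For point 4, a latency-logging middleware is a lightweight starting point. A sketch, assuming you serve through a FastAPI app as in the earlier sections (the logger name is arbitrary):

import logging
import time

from fastapi import FastAPI, Request

logger = logging.getLogger("emotion_llama.api")
app = FastAPI()

@app.middleware("http")
async def log_latency(request: Request, call_next):
    # Record how long each request took and what status it returned
    start = time.perf_counter()
    response = await call_next(request)
    elapsed = time.perf_counter() - start
    logger.info("%s %s -> %d in %.3fs", request.method,
                request.url.path, response.status_code, elapsed)
    return response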

Docker Deployment

FROM pytorch/pytorch:2.0.0-cuda11.8-cudnn8-runtime

WORKDIR /app
COPY . /app

RUN pip install -r requirements.txt

EXPOSE 7889

CMD ["python", "app_EmotionLlamaClient.py"]

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: emotion-llama-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: emotion-llama
  template:
    metadata:
      labels:
        app: emotion-llama
    spec:
      containers:
      - name: api
        image: emotion-llama:latest
        ports:
        - containerPort: 7889
        resources:
          limits:
            nvidia.com/gpu: 1

SDK and Libraries

Python SDK (Coming Soon)

from emotion_llama import EmotionLLaMA

# Initialize client
client = EmotionLLaMA(api_url="http://localhost:7889")

# Analyze video
result = client.analyze(
    video_path="/path/to/video.mp4",
    task="emotion"  # or "reason"
)

print(result.emotion)
print(result.confidence)
print(result.explanation)

JavaScript SDK (Coming Soon)

import { EmotionLLaMA } from 'emotion-llama-sdk';

const client = new EmotionLLaMA({
  apiUrl: 'http://localhost:7889'
});

const result = await client.analyze({
  videoPath: '/path/to/video.mp4',
  prompt: '[emotion] What emotion is shown?'
});

console.log(result);

API Versions

Current API version: v1.0

Future versions will maintain backward compatibility.

