About Emotion-LLaMA

Learn more about the project, team, and how to contribute.


Project Overview

Emotion-LLaMA is a state-of-the-art multimodal emotion recognition and reasoning model developed through collaboration between multiple research institutions. The project aims to advance the field of affective computing by combining the power of large language models with multimodal emotion understanding.

Key Achievements

  • πŸ† NeurIPS 2024 - Accepted at one of the top AI conferences
  • πŸ₯‡ MER2024 Champion - 1st place in MER-NOISE track (F1: 0.8530)
  • πŸ₯‰ MER2024 3rd Place - Top individual model in MER-OV track
  • πŸ“Š State-of-the-art - Best performance on MER2023 (F1: 0.9036) and EMER datasets

Research Team

Principal Investigators

  • Zebang Cheng - Shenzhen Technology University & Carnegie Mellon University
  • Zhi-Qi Cheng - Carnegie Mellon University
  • Alexander Hauptmann - Carnegie Mellon University
  • Xiaojiang Peng - Shenzhen University

Contributors

  • Jun-Yan He
  • Kai Wang
  • Yuxiang Lin
  • Zheng Lian
  • Shuyuan Tu
  • Dawei Huang
  • Minghan Li

Institutional Affiliations

  • Carnegie Mellon University (CMU) - Language Technologies Institute
  • Shenzhen Technology University (SZTU) - College of Big Data and Internet
  • Shenzhen University - School of Computer Science and Software Engineering
  • Harbin Institute of Technology - School of Computer Science

Publications

Main Paper (NeurIPS 2024)

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

Zebang Cheng, Zhi-Qi Cheng, Jun-Yan He, Kai Wang, Yuxiang Lin, Zheng Lian, Xiaojiang Peng, Alexander Hauptmann

Published in: Advances in Neural Information Processing Systems (NeurIPS), 2024

Challenge Paper (ACM MRAC 2024)

SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition

Zebang Cheng, Shuyuan Tu, Dawei Huang, Minghan Li, Xiaojiang Peng, Zhi-Qi Cheng, Alexander G. Hauptmann

Published in: Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing (MRAC), 2024

📄 Read the Paper


MER-Factory

A unified pipeline for constructing multimodal emotion recognition (MER) datasets.

MER Challenges

  • MER2023: Multimodal Emotion Recognition Challenge 2023
  • MER2024: Multimodal Emotion Recognition Challenge 2024
  • Website: merchallenge.cn

Resources

Code and Models

  • Repository: github.com/ZebangCheng/Emotion-LLaMA (code and pre-trained models)
  • Demo: available on Hugging Face

Datasets

  • MERR - the Multimodal Emotion Recognition and Reasoning dataset introduced with Emotion-LLaMA
  • MER2023 / MER2024 challenge data - see merchallenge.cn

Acknowledgements

Emotion-LLaMA builds upon excellent prior work:

Foundation Models

  • MiniGPT-v2 - Vision-language multi-task learning (Paper)
  • LLaMA-2 - Large language model by Meta AI
  • AffectGPT - Explainable emotion recognition (Paper)
  • LLaVA - Visual instruction tuning (Website)

Feature Extractors

  • HuBERT - Audio representation learning
  • EVA - Visual representation learning
  • MAE - Masked autoencoders
  • VideoMAE - Video understanding

Tools and Frameworks

  • PyTorch - Deep learning framework
  • Hugging Face - Model hub and tools
  • Gradio - Demo interface
  • OpenFace - Facial analysis

Contributing

We welcome contributions from the community!

Ways to Contribute

  • πŸ› Report Bugs: Open an issue
  • πŸ’‘ Suggest Features: Share your ideas
  • πŸ“– Improve Documentation: Submit pull requests
  • πŸ§ͺ Share Results: Contribute benchmark results
  • πŸ’¬ Help Others: Answer questions in discussions

Development

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request
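
Taken together, a typical contribution flow looks like the sketch below (a minimal example; <your-username> and the feature/amazing-feature branch name are placeholders to replace with your own):

  # clone your fork and create a feature branch
  git clone https://github.com/<your-username>/Emotion-LLaMA.git
  cd Emotion-LLaMA
  git checkout -b feature/amazing-feature

  # stage and commit your changes, then push the branch and open a Pull Request on GitHub
  git add .
  git commit -m 'Add amazing feature'
  git push origin feature/amazing-feature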

Community

GitHub

  • Repository: github.com/ZebangCheng/Emotion-LLaMA
  • Issues: Report bugs and request features
  • Discussions: Ask questions and share ideas
  • Pull Requests: Contribute code and documentation

Star History

Support the project by giving it a star! ⭐

Star History Chart


Contact

For Research Collaboration

  • Email: Contact the corresponding authors
  • Institution: Carnegie Mellon University, Shenzhen Technology University

For Technical Support

  • GitHub Issues: Report issues
  • Documentation: Review the guides on this site

For Media Inquiries


License and Citation

License

Emotion-LLaMA is released under multiple licenses:

  • Code: BSD 3-Clause License
  • Dataset: EULA (research only)
  • Documentation: CC BY-NC 4.0

Learn more about licensing

Citation

If you use Emotion-LLaMA in your research, please cite:

@inproceedings{NEURIPS2024_c7f43ada,
  author = {Cheng, Zebang and Cheng, Zhi-Qi and He, Jun-Yan and Wang, Kai and Lin, Yuxiang and Lian, Zheng and Peng, Xiaojiang and Hauptmann, Alexander},
  title = {Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning},
  booktitle = {Advances in Neural Information Processing Systems},
  year = {2024}
}

View all citation formats


Roadmap

Current Version (v1.0)

  • ✅ NeurIPS 2024 publication
  • ✅ MERR dataset release
  • ✅ Pre-trained models available
  • ✅ Demo and API

Future Plans

  • 🔄 Support for more languages
  • 🔄 Real-time emotion recognition
  • 🔄 Mobile and edge deployment
  • 🔄 Additional emotion categories
  • 🔄 Improved model efficiency
  • 🔄 Extended documentation

FAQ

Is Emotion-LLaMA free to use?

Yes, for research and non-commercial purposes. See license for details.

Can I use it for commercial applications?

Commercial use requires permission. Contact the authors and review all component licenses.

How can I cite this work?

See the citation guide for BibTeX and other formats.

Where can I get help?

Open a GitHub issue for bugs, ask questions in the GitHub Discussions, or review the guides on this site.

How can I contribute?

See the Contributing section above.


Updates and News

Stay updated with the latest developments:

  • GitHub: Watch the repository for updates
  • Paper: Check for new publications
  • Demo: Try the latest features on Hugging Face

Thank You

Thank you for your interest in Emotion-LLaMA! We hope this project advances research in affective computing and multimodal understanding.

If you find our work helpful, please:

  • ⭐ Star the repository
  • 📄 Cite our papers
  • 🔗 Share with others
  • 💬 Provide feedback

Get Started | View on GitHub | Read the Paper

