About me
I’m a Ph.D. student at CSE, HKUST, supervised by Prof. Binhang Yuan. Before that, I received a B.S. in Statistics and a B.Eng. in Artificial Intelligence from Beijing Institute of Technology (BIT). I also worked as a research assistant at the Peking University DAIR lab, supervised by Prof. Bin Cui.
My research focuses on data-efficient Large Language Models and Multimodal Large Language Models. Previously, I worked on Automated Machine Learning (AutoML), especially hyperparameter optimization.
Selected Work
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining
Tianyi Bai, Ling Yang, Zhen Hao Wong, Jiahui Peng, Xinlin Zhuang, Chi Zhang, Lijun Wu, Jiantao Qiu, Wentao Zhang, Binhang Yuan, Conghui He
In submission
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai, Hao Liang, Binwang Wan, Ling Yang, Bozhou Li, Yifan Wang, Bin Cui, Conghui He, Binhang Yuan, Wentao Zhang
ACM Computing Surveys, in submission
Full list on Google Scholar.
Education
Hong Kong University of Science and Technology
PhD in Computer Science and Engineering
September 2023 to Present
Beijing Institute of Technology
BS in Statistics, School of Mathematics and Statistics
September 2019 to June 2023
BEng in Artificial Intelligence, School of Computer Science
February 2020 to June 2023
Intern & Work Experience
- Shanghai Artificial Intelligence Laboratory, Shanghai, China
May 2024 to Present
Position: Research Intern
Projects:
  - Data-efficient LLM pretraining (supervised by Dr. Conghui He and Dr. Jiantao Qiu) –> Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining
  - Synthetic Data Detection (supervised by Dr. Conghui He and Prof. Weijia Li) –> LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
- Peking University, Beijing, China
January 2024 to Present
Position: Research Intern supervised by Prof. Wentao Zhang
Project:
  - Data-centric Generative AI –> A Survey of Multimodal Large Language Model from A Data-centric Perspective
- Peking University, Beijing, China
July 2021 to July 2023
Position: Research Intern in Prof. Bin Cui’s Group
Projects:
  - Transfer Learning for Bayesian Optimization –> First-author preprint: Transfer Learning for Bayesian Optimization: A Survey
  - Transfer Learning based Hyperparameter Optimization –> KDD 2022: Transfer Learning based Search Space Design for Hyperparameter Tuning