Tianyi Bai

HKUST CSE Ph.D. / Agentic AI / Data-Centric AI

I work on agentic AI, multimodal visual reasoning, and data-centric AI. I am a Ph.D. student at HKUST, supervised by Prof. Binhang Yuan. I am also fortunate to intern at Qwen, where I am supervised by Binyuan Hui and Junyang Lin, and I collaborate with Prof. Wentao Zhang at PKU DCAI.

Research Interests: Computer Use Agent, Coding Agent, Multimodal Reasoning, Data-Centric AI

Representative Work

Computer Use Agent

Qwen3.5 / Qwen3.6

Leading Computer Use Agent capability work across RL infrastructure, annotation quality, data pipelines, training, evaluation, and failure analysis.

Coding Agent

Qwen3-Coder

Contributed to the Browser Use Agent module, including browser interaction data construction, capability improvement, training pipeline support, and evaluation.

Data Pipeline

DataFlow

Responsible for the coding pipeline, including code data processing, quality filtering, pipeline orchestration, and preparation of training-ready code data.

Selected Work

Experience

  • May 2025 - Present / Alibaba Qwen Team / Research Intern
    I work on agent capabilities for Qwen models, including Browser Use Agent for Qwen3-Coder and Computer Use Agent for Qwen3.5/Qwen3.6. My work spans data construction, RL infrastructure, training, evaluation, and failure analysis.
  • Dec 2023 - Present / Peking University DCAI Group / Research Assistant
    I contribute to DataFlow, with a focus on code-data workflows, coding pipeline construction, data processing, and quality filtering for training-ready code data.
  • Mar 2024 - May 2025 / Shanghai Artificial Intelligence Laboratory, OpenDataLab / Research Intern
    I worked on data preparation and selection for LLM pretraining, including data management strategies for InternLM3-8B and Ray-based labeling pipelines for data selection.
  • Jul 2021 - Jul 2023 / Peking University DAIR Group / Research Assistant
    I studied transfer learning for Bayesian optimization and hyperparameter tuning. This work led to a KDD 2022 paper on transfer-learning-based search space design.