Tianyi Bai

HKUST CSE Ph.D. / Agentic AI / Data-Centric AI

I am a Ph.D. student in Computer Science at HKUST, advised by Prof. Binhang Yuan. My research focuses on building capable computer-use and coding agents, with an emphasis on data-centric methods for improving their training, evaluation, and multimodal reasoning abilities. I am fortunate to intern at Qwen, where I work with Binyuan Hui and Junyang Lin. I also collaborate with Prof. Wentao Zhang at PKU DCAI.

Research Interests: Computer Use Agent, Coding Agent, Multimodal Reasoning, Data-Centric AI

Representative Work

Computer Use Agent
[Figure: Computer Use Agent benchmark results]

Qwen3.5 / Qwen3.6

Leading Computer Use Agent capability work across RL infrastructure, annotation quality, data pipelines, training, evaluation, and bad-case analysis.

Coding Agent
[Figure: Qwen3-Coder-Next coding benchmark Pareto frontier]

Qwen3-Coder

Contributed to the Browser Use Agent module, including browser interaction data construction, capability improvement, training pipeline support, and evaluation.

Data Pipeline
[Figure: DataFlow workflow for generating and cleaning high-quality LLM data]

DataFlow

Responsible for the code data pipeline, including code data processing, quality filtering, pipeline orchestration, and preparation of training-ready code data.

Experience

  • May 2025 - Present / Alibaba Qwen Team / Research Intern
    I work on agent capabilities for Qwen models, including Browser Use Agent for Qwen3-Coder and Computer Use Agent for Qwen3.5/Qwen3.6. My work spans data construction, RL infrastructure, training, evaluation, and failure analysis.
  • Dec 2023 - Dec 2025 / Peking University DCAI Group / Research Assistant
    I contribute to DataFlow, focusing on code-data workflows: pipeline construction, data processing, and quality filtering to produce training-ready code data.
  • Mar 2024 - May 2025 / Shanghai Artificial Intelligence Laboratory, OpenDataLab / Research Intern
    I worked on data preparation and selection for LLM pretraining, including data management strategies for InternLM3-8B and Ray-based labeling pipelines for data selection.
  • Jul 2021 - Jul 2023 / Peking University DAIR Group / Research Assistant
    I studied transfer learning for Bayesian optimization and hyperparameter tuning. This work led to a KDD 2022 paper on transfer-learning-based search space design.