Zhihui Xie (谢知晖)
Ph.D. Student, The University of Hong Kong
zhxieml@gmail.com

I am a first-year Ph.D. student at HKU, advised by Lingpeng Kong and Qi Liu. My research focuses on AI alignment, particularly scalable methods that incentivize models to provide feedback (i.e., rewards) and to learn from that feedback effectively.

Previously, I obtained my Master’s degree from Shanghai Jiao Tong University under the supervision of Shuai Li. I received my Bachelor’s degree from the IEEE Honor Class at Shanghai Jiao Tong University, where I was fortunate to work with Junchi Yan. I was a research intern at Reka AI (Oct 2023 - Jan 2024), Tencent AI Lab (Apr 2021 - Sep 2021), and XYZ Robotics (Jul 2020 - Oct 2020).

Interests

  • AI Alignment
  • Multi-modal Models
  • Reinforcement Learning

Academia

The University of Hong Kong
2024 - present
Ph.D. @HKUNLP

Shanghai Jiao Tong University
2021 - 2024
M.Sc. Computer Science

Shanghai Jiao Tong University
2017 - 2021
B.Sc. Computer Science (IEEE Honor Class)

Selected Works

Please find more on my Google Scholar profile.
Calibrating Reasoning in Language Models with Internal Consistency, 2024, NeurIPS
Zhihui Xie, Jizhou Guo, Tong Yu, Shuai Li
Learning Versatile Skills with Curriculum Masking, 2024, NeurIPS
Yao Tang*, Zhihui Xie*, Zichuan Lin, Deheng Ye, Shuai Li
VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment, 2024, EMNLP
Lei Li*, Zhihui Xie*, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong
Future-conditioned Unsupervised Pretraining for Decision Transformer, 2023, ICML
Zhihui Xie, Zichuan Lin, Deheng Ye, Qiang Fu, Wei Yang, Shuai Li
Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations, 2022, EMNLP
Zhihui Xie, Handong Zhao, Tong Yu, Shuai Li
Doubly-Adaptive Reinforcement Learning for Cross-Domain Interactive Recommendation, 2022, SIGIR
Junda Wu*, Zhihui Xie*, Tong Yu, Handong Zhao, Ruiyi Zhang, Shuai Li
Comparison-based Conversational Recommender System with Relative Bandit Feedback, 2021, SIGIR
Zhihui Xie, Tong Yu, Canzhe Zhao, Shuai Li