About Me
I am currently a Founding Member of Technical Staff at the startup Player2, where I focus on developing real-time action agents that can interact with commercial game environments and play alongside humans. Please check out the gameplay videos from our model here. We released the research details in the paper and also released the entire dataset. Prior to Player2, I worked on the Amazon AGI team as an Applied Scientist, focusing on the pretraining and SFT of LLMs. Before joining Amazon, I worked as a Machine Learning Researcher on the Cortex Applied Research team at Twitter, with a focus on recommender systems and reinforcement learning. I obtained my PhD from the Department of Statistics and Data Science at the University of Texas at Austin, supervised by Dr. Mingyuan Zhou. My research interests are vision-language-action models, reinforcement learning, and multimodal LLMs.
Education
- Ph.D., University of Texas at Austin, 2021
- M.S., University of California, Los Angeles, 2017
- B.S., Fudan University, 2015
Publications
Clinical Implications of the T790M Mutation in Disease Characteristics and Treatment Response in Patients With Epidermal Growth Factor Receptor (EGFR)-Mutated Non–Small-Cell Lung Cancer (NSCLC)
D. Gaut, M. Sim, Y. Yue, B. Wolf, P. Abarca, J. Carroll, J. Goldman, E. Garon. "Clinical Implications of the T790M Mutation in Disease Characteristics and Treatment Response in Patients With Epidermal Growth Factor Receptor (EGFR)-Mutated Non–Small-Cell Lung Cancer (NSCLC)" Clinical Lung Cancer(2018).
T-optimal designs for multi-factor polynomial regression models via a semidefinite relaxation method
Y. Yue, L. Vandenberghe, W.K. Wong. "T-optimal designs for multi-factor polynomial regression models via a semidefinite relaxation method" Statistics and Computing(2018).
ARSM: Augment-REINFORCE-swap-merge estimator for gradient backpropagation through categorical variables
M. Yin*, Y. Yue*, M. Zhou. "ARSM: Augment-REINFORCE-swap-merge estimator for gradient backpropagation through categorical variables" ICML(2019).
Semi-supervised Learning using Adversarial Training with Good and Bad Samples
W. Li, Z. Wang, Y. Yue, J. Li, W. Speier, M. Zhou, C. Arnold. "Semi-supervised Learning using Adversarial Training with Good and Bad Samples" Machine Vision and Applications(2020).
Discrete action on-policy learning with action-value critic
Y. Yue, Y. Tang, M. Yin, and M. Zhou. "Discrete action on-policy learning with action-value critic" AISTATS(2020).
A Unified Framework for Tuning Hyperparameters in Clustering Problems
X. Fan, Y. Yue, P. Sarkar, R. Wang. "A Unified Framework for Tuning Hyperparameters in Clustering Problems" ICML(2020).
Implicit Distributional Reinforcement Learning
Y. Yue*, Z. Wang*, and M. Zhou. "Implicit Distributional Reinforcement Learning" NeurIPS(2020).
Learning to Rank For Push Notifications Using Pairwise Expected Regret
Y. Yue*, Y. Xie, H. Wu, H. Jia, S. Zhai, W. Shi, J. Hunt*. "Learning to Rank For Push Notifications Using Pairwise Expected Regret" arXiv(2022).
MAML-en-LLM: Model agnostic meta-training of LLMs for improved in-context learning
S. Sinha, Y. Yue, V. Soto, M. Kulkarni, J. Lu, A. Zhang. "MAML-en-LLM: Model agnostic meta-training of LLMs for improved in-context learning" SIGKDD(2024).
Pixels to play: A foundation model for 3D gameplay
Y. Yue, C. Green, S. Hunt, I. Salia, W. Shi, J. Hunt. "Pixels to play: A foundation model for 3D gameplay" CoG(2025).
Learning to play: A Multimodal Agent for 3D Game-Play
Y. Yue, I. Salia, S. Hunt, C. Green, W. Shi, J. Hunt. "Learning to play: A Multimodal Agent for 3D Game-Play" ICCV Workshop(2025).
Work experience
- Player2: Founding Member of Technical Staff (May 2024 - Present)
- VLA model for game playing (website)
- Amazon: Applied Scientist (Sept 2022 - May 2024)
- SFT and pretraining of LLMs
- Twitter: Machine Learning Researcher (June 2021 - Sept 2022)
- Build a recommender for the new Explore page
- Improve the ads ranking model with multitask learning techniques
- Build simulation systems with JAX
- Nuro: Machine Learning Researcher Intern (Spring 2021)
- Improve the agent using imitation learning techniques
- Twitter: Machine Learning Research Engineer Intern (Summer, Fall 2020)
- Improve the push notification recommendation system
- ByteDance: Research Intern (Winter 2019)
- Build a Bayesian testing platform
- Tune hyperparameters with Bayesian optimization
