About
I am a Research Scientist at ByteDance Seed working on post-training and large-scale RL for LLMs. I received my Ph.D. in EECS from UC Berkeley (Dec 2024), where I had the privilege of being advised by Prof. Anca Dragan in the InterACT Lab. Before Berkeley, I was at Stanford (class of 2018), where I was fortunate to work with Amir Zamir, Silvio Savarese, and Dorsa Sadigh at SVL and ILIAD.
I'm interested in making digital and physical agents more powerful, reliable, and safely aligned. Currently I'm focused on agentic systems, reasoning models, efficient large language models, and RL with verifiable rewards (RLVR). At ByteDance Seed I've contributed to Seed-OSS and the Seed-Thinking series (v1.5, v1.6, v2.0, and beyond 👀).
During my Ph.D., I worked on the safety of human–robot systems from three angles: (1) causal confusion — how learned reward models latch onto spurious correlates of human preference, and how to diagnose it; (2) adversarial perturbations — quantifying robustness of assistive policies along a natural–adversarial frontier; and (3) steerability — making model behavior controllable and personalizable at inference time without retraining.
Publications
Full list on Google Scholar · * indicates equal contribution
- 2026 · Seed 2.0: Next-Generation Foundation Model · Tech Report [overview]
- 2025 · Seed-OSS: Open-Source Foundation Models · Tech Report [code] [models]
- 2025 · Seed1.6: Adaptive Thinking and Multimodal Reasoning · Tech Report [overview] [tech blog]
- 2025 · Context Steering: Controllable Personalization at Inference Time · ICLR 2025
- 2025 · Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning · Tech Report [arXiv] [code]
- 2023 · Quantifying Assistive Robustness Via the Natural-Adversarial Frontier · CoRL 2023
- 2023 · Causal Confusion and Reward Misidentification in Preference-Based Reward Learning · ICLR 2023
- 2022 · Learning Representations that Enable Generalization in Assistive Tasks · CoRL 2022
- 2021 · Assisted Robust Reward Design · CoRL 2021
- 2019 · 3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera · ICCV 2019 [website] [code] [paper]
- 2018 · Gibson Env: Real-World Perception for Embodied Agents · CVPR 2018 (Spotlight) [website] [code] [paper]
Selected Projects
- Gibson Environment: Co-led the development of a simulation platform for real-world active perception. Recipient of the 2018 NVIDIA Pioneering Research Award.
- ThingPedia: Open-source platform for a personalized Internet of Things, with Prof. Monica Lam.