I’m Miles Wang. I’m a researcher at OpenAI trying to build beneficial and safe AGI.

I’m on the RL team, but my interests span alignment, evaluations, reasoning, and science. I’ve worked on a number of research directions, including:

  • Scalable oversight of increasingly capable models, such as monitoring chains-of-thought for reward hacking.
  • Frontier evaluations for high-compute RL runs.
  • AI for science (especially biology) with agents that learn online.
  • Frontier risk evaluations for models, including maximal capability elicitation.
  • Alignment of model behavior, including understanding when misalignment generalizes.
  • Adversarial robustness to jailbreaks.
  • Machines that learn over long horizons (currently top of mind).

I studied Computer Science at Harvard before leaving to join OpenAI in March 2024. Feel free to contact me at milesw [at] openai [dot] com.

Selected Papers