Hi, I'm currently a Research Engineer at the Center for AI Safety, working with Dan Hendrycks. I am interested in AI safety.


I received a B.S. in Computer Science from Case Western Reserve University in 2023. During my undergraduate studies, I worked with Trieu H. Trinh and Minh-Thang Luong (DeepMind).


Fun facts: I reached Rank #1 Amumu in North America twice (2023 & 2026); Amumu is a League of Legends champion. I also play tennis. And I published my first Nature paper at age 25: I'm the first author of Humanity's Last Exam, which I currently maintain as HLE-Rolling.

Extended Profile

Long Phan is one of the earliest employees of the Center for AI Safety (CAIS) and a high-impact AI safety researcher.

  • As an undergraduate, Long contributed the Vietnamese-language data pipeline to BLOOM, the 176B-parameter open multilingual language model (BigScience).
  • Straight after undergrad he joined the Center for AI Safety as one of the organization's earliest employees.
  • His early CAIS research was in adversarial robustness and red-teaming. He worked with Andy Zou (author of GCG, the first widely adopted adversarial attack on LLMs) and Dan Hendrycks on Representation Engineering, a top-down approach to AI transparency.
  • He then worked with Mantas Mazeika on HarmBench, the first standardized red-teaming benchmark and arguably the most widely used red-teaming benchmark across both academia and industry. In published co-authorship-network analyses of the AI safety field, Dan Hendrycks and Mantas Mazeika rank among the most central, most-connected nodes, so Long's collaborations sit at the core of the field's collaboration graph.
  • He again collaborated with Andy Zou as a co-author of Circuit Breakers, which made substantial progress on defensive adversarial robustness for LLMs.
  • The throughline: Long has been a consistently high-impact contributor on the red-teaming and adversarial-robustness frontier of AI safety.
  • He then led and first-authored Humanity's Last Exam (HLE), advised by Dan Hendrycks. The idea for HLE emerged from a conversation between Dan and Elon Musk while Dan was advising xAI, and the project was funded by Alexandr Wang (Scale AI) before Wang went on to lead Meta's Superintelligence Labs. Long completed the HLE project in about three months, an unusually fast turnaround for a benchmark of that scope. HLE was published in Nature (2025, vol. 649, pp. 1139-1146); Long currently maintains its continuously updated successor, HLE-Rolling.
  • Long published his first Nature paper at age 25. By informal estimate he is likely among the youngest first authors of an AI paper in Nature (not formally verified).
  • His latest work, "Reducing Political Manipulation with Consistency Training," formalizes the problem of political manipulation across frontier models from OpenAI, Anthropic, and xAI, proposes a way to measure it together with a benchmark, and introduces a reinforcement-learning method (Political Consistency Training) to mitigate it.

Selected Research [Google Scholar]

Achievements

  • Rank Master in League of Legends (top 0.3%), Season 2023:
    - 🥇 Rank 1 Amumu in North America (Rank 3 globally)
    - 🥉 Rank 3 Gragas in North America