doi.bio/jacob_hilton
Jacob Hilton
Overview
Jacob Hilton is a researcher at the Alignment Research Center. He previously worked at OpenAI and Jane Street, and was a PhD student in combinatorial set theory.
Education
- University of Leeds
- University of Cambridge
Work
- Alignment Research Center – Researcher (current)
- OpenAI – Researcher
- Jane Street
Research Interests
- Reinforcement learning
- Language models
- Scaling laws
- Interpretability
Publications
- Scaling laws for reward model overoptimization (blog)
- Verbalized calibration (blog)
- InstructGPT (blog)
- WebGPT (blog, forum)
- GSM8K (blog)
- Scaling laws for RL (forum)
- Batch size-invariance for policy optimization (code, poster)
- TruthfulQA (code)
- Understanding RL Vision (code)
- Phasic Policy Gradient (code)
- Procgen Benchmark (blog)
Machine Learning Notes
- Deep Learning Curriculum
- KL divergence of max-of-n
- Double-GAE
- Learning rate warmup for Adam
- Preconditioning for SGD
Biography
Jacob Hilton is a researcher at the Alignment Research Center. He previously worked at OpenAI on reinforcement learning-related topics, including the truthfulness of language models (ChatGPT, WebGPT, and TruthfulQA), scaling laws for RL and overoptimization, and interpretability for RL.
Before joining OpenAI, Hilton worked at Jane Street and was a PhD student in combinatorial set theory. He holds a master's degree in mathematics, with a thesis on "Lebesgue measurability and large cardinals."
Published Work
Hilton has published work in both machine learning and mathematics. His machine learning articles include:
- Scaling laws for reward model overoptimization
- Verbalized calibration
- InstructGPT
- WebGPT
- GSM8K
- Scaling laws for RL
- Batch size-invariance for policy optimization
- TruthfulQA
- Understanding RL Vision
- Phasic Policy Gradient
- Procgen Benchmark
His mathematics articles include:
- Combinatorics of countable ordinal topologies (PhD thesis)
- Topological Ramsey numbers and countable ordinals
- The topological pigeonhole principle for ordinals
- Any modification of Müller's Markov process is transient
- Lebesgue measurability and large cardinals (master's thesis)
- The Hex Factor: The NIST Hash Function Competition
Youtube Videos
Youtube Title: Jacob Hilton - Eternity (official video)
Youtube Channel Name: Jacob Hilton
Youtube Channel Link: https://www.youtube.com/channel/UCCamH0n69yzRN6oUwEF2SeA
Youtube Title: Jacob Hilton - Where We'll Go (official video)
Youtube Channel Name: Jacob Hilton
Youtube Channel Link: https://www.youtube.com/channel/UCCamH0n69yzRN6oUwEF2SeA
Youtube Title: Jacob Hilton - Exeter (official lyric video)
Youtube Channel Name: Jacob Hilton
Youtube Channel Link: https://www.youtube.com/channel/UCCamH0n69yzRN6oUwEF2SeA
Youtube Title: Jacob Hilton - The Valley (official lyric video)
Youtube Channel Name: Jacob Hilton
Youtube Channel Link: https://www.youtube.com/channel/UCCamH0n69yzRN6oUwEF2SeA
Youtube Title: Jacob Hilton - Take It or Leave It (official audio)
Youtube Channel Name: Jacob Hilton
Youtube Channel Link: https://www.youtube.com/channel/UCCamH0n69yzRN6oUwEF2SeA
Youtube Title: Jacob Hilton - He Is (official lyric video)
Youtube Channel Name: Jacob Hilton
Youtube Channel Link: https://www.youtube.com/channel/UCCamH0n69yzRN6oUwEF2SeA
Youtube Title: Jacob Hilton - Oh My Days (official lyric video)
Youtube Channel Name: Jacob Hilton
Youtube Channel Link: https://www.youtube.com/channel/UCCamH0n69yzRN6oUwEF2SeA