Jan Leike (born 1986 or 1987) is an AI alignment researcher with a background in machine learning and reinforcement learning theory. He obtained his undergraduate degree from the University of Freiburg in Germany, followed by a master's degree in computer science. He then pursued a PhD in machine learning at the Australian National University under the supervision of Marcus Hutter, completing his thesis, "Nonparametric General Reinforcement Learning," in 2016.
After a six-month postdoctoral fellowship at the Future of Humanity Institute, Leike joined DeepMind to focus on empirical AI safety research, where he collaborated with Shane Legg and helped prototype reinforcement learning from human feedback.
In 2021, Leike joined OpenAI, where he was involved in the development of InstructGPT and ChatGPT and in the alignment of GPT-4. In 2023, he became co-leader of the Superalignment Team alongside Ilya Sutskever, with the goal of working out how to align future artificial superintelligences to ensure their safety, in part by automating alignment research using advanced AI systems; he also co-authored the team's research roadmap. In May 2024, Leike announced his resignation from OpenAI, citing concerns over the company's safety culture and leadership. He subsequently joined Anthropic, an AI company founded by former OpenAI employees, where he continues his research on superalignment.
Leike's research aims to solve the "hard problem of alignment": how to train AI systems to follow human intent on tasks that are difficult for humans to evaluate directly. In 2023, he was listed as one of TIME's 100 most influential people in AI. He also writes a Substack newsletter, "Musings on the Alignment Problem", where he explores various topics related to the alignment problem in AI.
Youtube Title: OpenAI’s huge push to make superintelligence safe | Jan Leike
Youtube Link: link
Youtube Channel Name: 80,000 Hours
Youtube Channel Link: https://www.youtube.com/@eightythousandhours
Youtube Title: Jan Leike | Super Intelligent Alignment @ Intelligent Cooperation Workshop
Youtube Link: link
Youtube Channel Name: Foresight Institute
Youtube Channel Link: https://www.youtube.com/@ForesightInstitute
Youtube Title: Eine deutliche Warnung ("A Clear Warning")
Youtube Link: link
Youtube Channel Name: Karl Olsberg
Youtube Channel Link: https://www.youtube.com/@KarlOlsbergAutor
Youtube Title: Ilya Sutskever and Jan Leike RESIGN from OpenAI - My in-depth analysis - end of an era!
Youtube Link: link
Youtube Channel Name: David Shapiro
Youtube Channel Link: https://www.youtube.com/@DaveShap
Youtube Title: Sam Altman WRECKS OpenAI - Jan Leike joins Anthropic - Brain Drain from OpenAI
Youtube Link: link
Youtube Channel Name: David Shapiro
Youtube Channel Link: https://www.youtube.com/@DaveShap
Youtube Title: Jan Leike, OpenAI (MIT AI Event)
Youtube Link: link
Youtube Channel Name: AttentionX
Youtube Channel Link: https://www.youtube.com/@attentionx
Youtube Title: Googler Reacts To David Shapiro's: Sam Altman WRECKS OpenAI - Jan Leike joins Anthropic
Youtube Link: link
Youtube Channel Name: SVIC Podcast
Youtube Channel Link: https://www.youtube.com/@svicpodcast
Youtube Title: OpenAI Faces Turmoil as Jan Leike Resigns: What Does this Mean for the Future of AI?
Youtube Link: link
Youtube Channel Name: AI Insight News
Youtube Channel Link: https://www.youtube.com/@AIInsightNews
Youtube Title: Jan Leike - AI alignment at OpenAI
Youtube Link: link
Youtube Channel Name: Towards Data Science
Youtube Channel Link: https://www.youtube.com/@TowardsDataScience
Youtube Title: is GPT dangerous? The Truth behind Jan Leike's Move from OpenAI to Anthropic
Youtube Link: link
Youtube Channel Name: AI Untold
Youtube Channel Link: https://www.youtube.com/@aiuntold_
Youtube Title: Stanford CS25: V2 I Language and Human Alignment
Youtube Link: link
Youtube Channel Name: Stanford Online
Youtube Channel Link: https://www.youtube.com/@stanfordonline
Youtube Title: Jan Leike - Scaling Reinforcement Learning from Human Feedback
Youtube Link: link
Youtube Channel Name: FAR AI
Youtube Channel Link: https://www.youtube.com/@FARAIResearch
Youtube Title: Jan Leike – General Reinforcement Learning – CSRBAI 2016
Youtube Link: link
Youtube Channel Name: Machine Intelligence Research Institute
Youtube Channel Link: https://www.youtube.com/@MIRIBerkeley
Youtube Title: 80,000 Hours - Jan Leike on OpenAI’s massive push to make superintelligence safe in 4 years or less
Youtube Link: link
Youtube Channel Name: IMM
Youtube Channel Link: https://www.youtube.com/@imm_radio
Youtube Title: Is OpenAI Neglecting Safety Concerns? (Jan Leike Adresses Issues)
Youtube Link: link
Youtube Channel Name: WieseTechnology
Youtube Channel Link: https://www.youtube.com/@WieseTechnology
Youtube Title: AGI-15 Jan Leike - Using Localization and Factorization to Reduce the Complexity of RL
Youtube Link: link
Youtube Channel Name: AGI Society
Youtube Channel Link: https://www.youtube.com/@AGISocietyOfficial
Youtube Title: 🚩OpenAI Safety Team "LOSES TRUST" in Sam Altman and gets disbanded. The "Treacherous Turn".
Youtube Link: link
Youtube Channel Name: Wes Roth
Youtube Channel Link: https://www.youtube.com/@WesRoth
Youtube Title: Microsoft Promises a 'Whale' for GPT-5, Anthropic Delves Inside a Model’s Mind and Altman Stumbles
Youtube Link: link
Youtube Channel Name: AI Explained
Youtube Channel Link: https://www.youtube.com/@aiexplained-official
Youtube Title: The Exciting, Perilous Journey Toward AGI | Ilya Sutskever | TED
Youtube Link: link
Youtube Channel Name: TED
Youtube Channel Link: https://www.youtube.com/@TED
Youtube Title: Working in AI | Jan Leike, Helen Toner, Malo Bourgon, and Miles Brundage
Youtube Link: link
Youtube Channel Name: Centre for Effective Altruism
Youtube Channel Link: https://www.youtube.com/@EffectiveAltruismVideos