Niklas Muennighoff
Niklas Muennighoff is a large language model (LLM) researcher at Contextual AI and will begin his PhD at Stanford in September. He completed his bachelor's degree at Peking University.
Muennighoff's research focuses on making LLMs more useful, with projects spanning Chinese, English, Japanese, French, and German. His work includes:
- Scaling up LLMs
- Improving instruction-following in models
- Creating MTEB, the largest text embedding benchmark
- Building multimodal models
Publications
Muennighoff has published several papers on Natural Language Processing (NLP) and LLMs. His notable publications include:
- MTEB: Massive Text Embedding Benchmark: This paper introduces the Massive Text Embedding Benchmark (MTEB) to address the lack of systematic evaluation of text embeddings. Spanning 8 embedding tasks, 58 datasets, and 112 languages, MTEB is the most comprehensive text embedding benchmark to date, and the paper finds that no single embedding method dominates across all tasks (a usage sketch follows this list).
- BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting: The paper focuses on adapting the BLOOM language model to new languages not seen during its pretraining. It finds that language adaptation is effective in improving performance in new languages and that adapter-based finetuning is more effective than continued pretraining for large models.
- Crosslingual Generalization through Multitask Finetuning: This work applies multitask prompted finetuning (MTF) to multilingual language models, finding that finetuning on English tasks and prompts allows for generalization to non-English languages. The paper also introduces xP3, a composite of supervised datasets in 46 languages.
- FinGPT: Large Generative Models for a Small Language: The paper studies the challenges of creating LLMs for Finnish, a low-resource language. It introduces the FinGPT and BLUUMI models and evaluates them using FIN-bench, a Finnish version of BIG-bench.
- What Language Model to Train if You Have One Million GPU Hours?: This work identifies the best architecture and training setup for the BLOOM language model within a fixed computational budget. It presents ablation studies of architectural choices and investigates the impact of different pretraining corpora and of multilinguality.
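To give a concrete sense of how MTEB is used, here is a minimal sketch of evaluating an embedding model on a single benchmark task with the open-source mteb package. The checkpoint name and task choice are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: score one embedding model on one MTEB task.
# Assumes `pip install mteb sentence-transformers`.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Any object exposing encode(list_of_texts) -> embeddings can be evaluated;
# this particular checkpoint is just a small, convenient example.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Restrict the run to a single classification task; omitting `tasks`
# would evaluate on far more of the benchmark's datasets.
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="results")
print(results)
```

The run call writes per-task scores to the output folder, which can then be compared against the public MTEB leaderboard.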
Other Projects
In addition to his research and publications, Muennighoff has also worked on several other projects, including:
- OctoPack: Instruction tuning of code large language models, using Git commits that pair code changes with human instructions as training data.
- SGPT: GPT sentence embeddings for semantic search (see the pooling sketch after this list).
- GRITLM: Generative Representational Instruction Tuning.
- Bloomprint: A collection of 140 simple rules for living a long life, based on research papers.
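Since SGPT's core technique, position-weighted mean pooling over a decoder-only model's hidden states, is easy to misread from the name alone, the sketch below illustrates it. The gpt2 checkpoint is a stand-in, and the code is an approximation of the idea rather than a copy of the released SGPT implementation.

```python
# Sketch of SGPT-style sentence embeddings via position-weighted mean
# pooling of a decoder-only model's last hidden states.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 defines no pad token
model = AutoModel.from_pretrained("gpt2")

def sgpt_embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state         # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (batch, seq, 1)
    # Later tokens have seen more context, so they get linearly larger weights.
    weights = torch.arange(1, hidden.size(1) + 1, dtype=torch.float32)
    weights = weights.view(1, -1, 1) * mask
    return (hidden * weights).sum(dim=1) / weights.sum(dim=1)

print(sgpt_embed(["semantic search with GPT embeddings"]).shape)  # (1, 768)
```

Cosine similarity between such embeddings then ranks documents against a query, as in the paper's bi-encoder setting.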
YouTube Videos
- Speech of Niklas Muennighoff, Global BBA emblematic student | ESSEC Commencement Day 2022 (ESSEC Business School, https://www.youtube.com/@essec)
- Cohere For AI - Community Talks: Niklas Muennighoff (Cohere, https://www.youtube.com/@CohereAI)
- C4AI @ NeurIPS, 2023 - Niklas Muennighoff on Scaling Data-Constrained Language Models (Cohere, https://www.youtube.com/@CohereAI)
- Scaling Data-Constrained Language Models | Talk at IST-Unbabel Seminar (Niklas Muennighoff, https://www.youtube.com/@Muennighoff)
- Niklas Muennighoff - From GPU poor to poor GPU rich (Sasha Rush 🤗, https://www.youtube.com/@srush_nlp)
- MTEB: Massive Text Embedding Benchmark | Video Summary (Niklas Muennighoff, https://www.youtube.com/@Muennighoff)
- Crosslingual Generalization through Multitask Finetuning | Video Summary (Niklas Muennighoff, https://www.youtube.com/@Muennighoff)
- Graduation Speech, Peking University (Niklas Muennighoff, https://www.youtube.com/@Muennighoff)
- Generative Representational Instruction Tuning and Agents for Video Creation | Multimodal Weekly 42 (Twelve Labs, https://www.youtube.com/@Twelve_Labs)
- Crosslingual Generalization through Multitask Finetuning (BLOOMZ & mT0) (Samuel Albanie, https://www.youtube.com/@SamuelAlbanie1)
- Generative Representational Instruction Tuning Explained (Unify, https://www.youtube.com/@unifyai)
- DeepPavlov Community Call #11 (DeepPavlov, https://www.youtube.com/@DeepPavlov)
- Forget models, build a data flywheel | Machine Learning Monthly January 2021 (Daniel Bourke, https://www.youtube.com/@mrdbourke)