doi.bio/daniele_calandriello
Daniele Calandriello
Daniele Calandriello is a research scientist at DeepMind, where he works on machine learning and artificial intelligence. He is or has been affiliated with the following institutions:
- Università degli Studi di Genova
- Istituto Italiano di Tecnologia
- Politecnico di Milano
- Facebook AI Research
- Massachusetts Institute of Technology
- DeepMind, Paris
- HSE University
- Duisburg-Essen University
- École Polytechnique
- Mohamed Bin Zayed University of AI
- University of California, Berkeley
- Meta (FAIR)
- École Polytechnique Fédérale de Lausanne
- University of Southern California
- Technische Universität Darmstadt
Publications
Daniele Calandriello has authored or co-authored numerous papers, including:
- "Nash Learning from Human Feedback" (2023)
- "Model-free Posterior Sampling via Learning Rate Randomization" (2023)
- "Demonstration-Regularized RL" (2023)
- "A General Theoretical Paradigm to Understand Learning from Human Preferences" (2023)
- "Unlocking the Power of Representations in Long-term Novelty-based Exploration" (2023)
- "Fast Rates for Maximum Entropy Exploration" (2023)
- "Understanding Self-Predictive Learning for Reinforcement Learning" (2022)
- "Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees" (2022)
- "BYOL-Explore: Exploration by Bootstrapped Prediction" (2022)
- "Information-theoretic Online Memory Selection for Continual Learning" (2022)
- "Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times" (2022)
- "One Pass ImageNet" (2021)
- "ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions" (2021)
- "On the Emergence of Whole-Body Strategies from Humanoid Robot Push-Recovery Learning" (2020)
- "Sampling from a k-DPP without Looking at All Items" (2020)
- "Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification" (2020)
- "Learning to Sequence Multiple Tasks with Competing Constraints" (2019)
- "Statistical and Computational Trade-Offs in Kernel K-Means" (2019)
- "Exact sampling of determinantal point processes with sublinear time preprocessing" (2019)
- "Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret" (2018)
- "On Fast Leverage Score Sampling and Optimal Learning" (2018)
- "Distributed Adaptive Sampling for Kernel Matrix Approximation" (2017)
- "Second-Order Kernel Online Convex Optimization with Adaptive Sketching" (2016)
- "Analysis of Kelner and Levin graph sparsification algorithm for a streaming setting" (2016)
- "Incremental Spectral Sparsification for Large-Scale Graph-Based Semi-Supervised Learning" (2013)
- "Semi-Supervised Information-Maximization Clustering" (2013)
Selected publications
Calandriello has published extensively in machine learning and reinforcement learning, with recent work focusing on large language models (LLMs) and their alignment with human preferences. Notable publications include:
- Nash Learning from Human Feedback (2023): This paper introduces an alternative pipeline for fine-tuning LLMs from pairwise human feedback, framing alignment as computing the Nash equilibrium of a learned preference model in order to address the limitations of reward models in representing the richness of human preferences.
- Model-free Posterior Sampling via Learning Rate Randomization (2023): The paper proposes Randomized Q-learning (RandQL), a model-free algorithm for regret minimization in episodic Markov Decision Processes (MDPs).
- Demonstration-Regularized RL (2023): This work theoretically quantifies the impact of incorporating expert demonstrations on the sample efficiency of reinforcement learning (RL).
- A General Theoretical Paradigm to Understand Learning from Human Preferences (2023): The paper derives a new general objective, $\Psi$PO, for learning from human preferences, bypassing the approximations made in existing practical algorithms (a sketch of the objective is given after this list).
- Unlocking the Power of Representations in Long-term Novelty-based Exploration (2023): Calandriello et al. introduce RECODE, a non-parametric method for novelty-based exploration that estimates visitation counts for clusters of states, achieving state-of-the-art performance in challenging 3D exploration tasks and hard-exploration Atari games (a simplified illustration of the cluster-count idea follows this list).
- Fast Rates for Maximum Entropy Exploration (2023): The paper studies the reinforcement learning setting with sparse or absent reward signals, where exploration is crucial. It proposes an algorithm with improved sample complexity for visitation entropy maximization.
- Understanding Self-Predictive Learning for Reinforcement Learning (2022): This work studies the learning dynamics of self-predictive learning algorithms and identifies critical design choices to prevent representation collapse.
- Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees (2022): Calandriello et al. propose an optimistic posterior sampling algorithm with improved regret bounds for reinforcement learning in episodic, finite Markov decision processes.
- BYOL-Explore: Exploration by Bootstrapped Prediction (2022): BYOL-Explore is a curiosity-driven exploration approach for visually complex environments that learns a world representation, world dynamics, and an exploration policy through a single prediction loss (a schematic sketch of this prediction-error signal also follows the list).
- Information-theoretic Online Memory Selection for Continual Learning (2022): This work investigates the online memory selection problem in task-free continual learning from an information-theoretic perspective, proposing criteria to select informative data points.
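For reference, the $\Psi$PO objective described in "A General Theoretical Paradigm to Understand Learning from Human Preferences" can be sketched as follows (the notation here is a paraphrase and may differ in minor ways from the paper's):

$$
\max_{\pi}\; \mathbb{E}_{x \sim \rho,\; y \sim \pi(\cdot \mid x),\; y' \sim \mu(\cdot \mid x)}\big[\Psi\big(p^{*}(y \succ y' \mid x)\big)\big] \;-\; \tau\, D_{\mathrm{KL}}\big(\pi \,\|\, \pi_{\mathrm{ref}}\big)
$$

Here $p^{*}(y \succ y' \mid x)$ is the probability that completion $y$ is preferred to $y'$ for prompt $x$, $\mu$ is a fixed sampling policy, $\pi_{\mathrm{ref}}$ is the reference policy, $\tau$ sets the strength of the KL regularisation, and $\Psi$ is a non-decreasing function. Taking $\Psi$ to be the logit function recovers the standard RLHF/DPO objective under a Bradley-Terry preference model, while the identity choice gives the IPO variant analysed in the paper.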
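The cluster-count idea behind novelty bonuses of the kind used in RECODE can be illustrated with a deliberately simplified sketch. This is only an illustration of the general principle, not the RECODE algorithm itself: the embedding source, the cluster update rule, the distance threshold, and the reward scaling below are all hypothetical simplifications.

```python
import numpy as np

class ClusterCountBonus:
    """Toy novelty bonus: count visits to clusters of state embeddings."""

    def __init__(self, radius=1.0):
        self.radius = radius   # distance threshold for assigning to a cluster
        self.centers = []      # cluster centres (state embeddings)
        self.counts = []       # visit count per cluster

    def update(self, embedding):
        """Assign the embedding to the nearest cluster (or open a new one)
        and return an intrinsic reward that decays with the visit count."""
        embedding = np.asarray(embedding, dtype=np.float64)
        if self.centers:
            dists = np.linalg.norm(np.stack(self.centers) - embedding, axis=1)
            nearest = int(np.argmin(dists))
            if dists[nearest] <= self.radius:
                self.counts[nearest] += 1
                return 1.0 / np.sqrt(self.counts[nearest])
        # unseen region of the embedding space: start a new cluster
        self.centers.append(embedding)
        self.counts.append(1)
        return 1.0

# usage: feed the agent's state embeddings as they are visited
bonus = ClusterCountBonus(radius=0.5)
rng = np.random.default_rng(0)
for _ in range(5):
    r_int = bonus.update(rng.normal(size=8))  # 8-dim embeddings, purely illustrative
    print(round(r_int, 3))
```

The design point the sketch captures is that counting over clusters, rather than over raw states, lets a count-based bonus survive in continuous or high-dimensional observation spaces where exact state counts are meaningless.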
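A schematic of the prediction-error signal used by curiosity methods in the BYOL-Explore family is sketched below with small stand-in networks. The architectures, dimensions, and training details here are hypothetical and far simpler than in the paper; the sketch only shows how a latent world-model prediction error can serve as an intrinsic reward.

```python
import torch
import torch.nn.functional as F

obs_dim, act_dim, latent_dim = 16, 4, 32

# online encoder, target encoder (would be an EMA copy in practice), and a latent forward model
online_enc = torch.nn.Linear(obs_dim, latent_dim)
target_enc = torch.nn.Linear(obs_dim, latent_dim)
target_enc.load_state_dict(online_enc.state_dict())   # initialise target = online
forward_model = torch.nn.Linear(latent_dim + act_dim, latent_dim)

def intrinsic_reward(obs, action, next_obs):
    """Prediction error of the latent world model, used as a curiosity bonus."""
    z = online_enc(obs)
    pred_next = F.normalize(forward_model(torch.cat([z, action], dim=-1)), dim=-1)
    with torch.no_grad():                              # target network is not trained by this loss
        target_next = F.normalize(target_enc(next_obs), dim=-1)
    return ((pred_next - target_next) ** 2).sum(dim=-1)  # high error = novel transition

# usage on a dummy batch of transitions
obs = torch.randn(8, obs_dim)
action = torch.randn(8, act_dim)
next_obs = torch.randn(8, obs_dim)
print(intrinsic_reward(obs, action, next_obs))
```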
Co-authors
Calandriello has frequently collaborated with the following researchers:
- Michal Valko
- Mohammad Gheshlaghi Azar
- Yunhao Tang
- Pierre Harvey Richemond
- Mark Rowland
- Andrea Michi
- Bilal Piot
- Daniil Tiapkin
- Denis Belomestny
- Eric Moulines
- Rémi Munos
- Alexey Naumov
- Pierre Perrault
YouTube Videos
YouTube Title: Talk by Daniele Calandriello (DeepMind, Paris) hosted by Approximate Bayesian Inference Team
YouTube Link: link
YouTube Channel Name: RIKEN AIP
YouTube Channel Link: https://www.youtube.com/@aipriken8732
YouTube Title: On the Emergence of Whole-body Strategies from Humanoid Robot Push-recovery Learning
YouTube Link: link
YouTube Channel Name: iCub HumanoidRobot
YouTube Channel Link: https://www.youtube.com/@robotcub
YouTube Title: Bootstrap Your Own Latent A new approach to self supervised learning
YouTube Link: link
YouTube Channel Name: DataFest Yerevan
YouTube Channel Link: https://www.youtube.com/@DataFestYerevan
YouTube Title: The quest for provably efficient ML algorithms
YouTube Link: link
YouTube Channel Name: MITCBMM
YouTube Channel Link: https://www.youtube.com/@MITCBMM
YouTube Title: Approximate Bayesian Inference Team Seminar 20211109
YouTube Link: link
YouTube Channel Name: RIKEN AIP
YouTube Channel Link: https://www.youtube.com/@aipriken8732
YouTube Title: Ioannis Koutis -- Pragmatic Ridge Spectral Sparsification for Large-Scale Graph Learning
YouTube Link: link
YouTube Channel Name: DIMACS CCICADA
YouTube Channel Link: https://www.youtube.com/@DIMACS_CCICADA
YouTube Title: Lorenzo Rosasco - Efficient learning with Nyström projections
YouTube Link: link
YouTube Channel Name: FAU Applied Mathematics
YouTube Channel Link: https://www.youtube.com/@FAUAppliedMathematics
YouTube Title: Isotropy and Log-Concave Polynomials
YouTube Link: link
YouTube Channel Name: IEEE FOCS: Foundations of Computer Science
YouTube Channel Link: https://www.youtube.com/@IEEE-FOCS