Zeming Lin is a researcher and engineer with a background in machine learning, artificial intelligence, and computational physics. Lin has worked on a range of projects, from SMS-based dubstep file conversion to advanced machine learning models for protein structure prediction. Lin is currently a Senior Software Engineer at Facebook and holds a Mathematica Student Certification from Wolfram Research, Inc.
Education
Lin's educational background includes coursework in:
Advanced Math Techniques
Advanced Linear Models
Advanced Time Series
Artificial Intelligence
Computational Physics
Large Scale Machine Learning
Machine Learning and Data Mining in Practice
Programming Languages
Theory of Computation
Publications
Zeming Lin has authored or co-authored the following publications:
2022: Learning inverse folding from millions of predicted structures (with Chloe Hsu, Robert Verkuil, Jason Liu, Brian Hie, Tom Sercu, Adam Lerer, and Alexander Rives)
2021: Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences (with Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus)
2020: Growing Action Spaces (with Gregory Farquhar, Laura Gustafson, Shimon Whiteson, Nicolas Usunier, and Gabriel Synnaeve)
2019: Code Obfuscation Effectiveness Assessment Model Based on Nonlinear Fuzzy Matrices (基于非线性模糊矩阵的代码混淆有效性评估模型) (with Qing Su, Zhiyi Lin, and Jianfeng Huang)
2019: Value Propagation Networks (with Nantas Nardelli, Gabriel Synnaeve, Pushmeet Kohli, Philip H. S. Torr, and Nicolas Usunier)
2019: PyTorch: An Imperative Style, High-Performance Deep Learning Library (with Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Z. Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, and Benoit Steiner)
2018: Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play (with Sainbayar Sukhbaatar, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, and Rob Fergus)
2018: Forward Modeling for Partial Observation Strategy Games - A StarCraft Defogger (with Gabriel Synnaeve, Jonas Gehring, Daniel Gant, Vegard Mella, Vasil Khalidov, Nicolas Carion, and Nicolas Usunier)
2017: An Analysis of Model-Based Heuristic Search Techniques for StarCraft Combat Scenarios (with David Churchill and Gabriel Synnaeve)
2017: STARDATA: A StarCraft AI Research Dataset (with Jonas Gehring, Vasil Khalidov, and Gabriel Synnaeve)
2017: DeepCloak: Masking Deep Neural Network Models for Robustness Against Adversarial Samples (with Ji Gao, Beilun Wang, Weilin Xu, and Yanjun Qi)
2017: Episodic Exploration for Deep Deterministic Policies for StarCraft Micromanagement (with Nicolas Usunier, Gabriel Synnaeve, and Soumith Chintala)
2016: MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-Based Protein Structure Prediction (with Jack Lanchantin and Yanjun Qi)
2016: Deep Motif: Visualizing Genomic Sequence Classifications (with Jack Lanchantin, Ritambhara Singh, and Yanjun Qi)
2016: Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks (with Nicolas Usunier, Gabriel Synnaeve, and Soumith Chintala)
2016: TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games (with Gabriel Synnaeve, Nantas Nardelli, Alex Auvolat, Soumith Chintala, Timothée Lacroix, Florian Richoux, and Nicolas Usunier)
Biography
Zeming Lin is a researcher who has been affiliated with New York University and other institutions. Lin's research interests include machine learning, protein design, and protein structure prediction. They have published several articles in the field of AI and machine learning, particularly focusing on protein structure and function.
Notable Works
Evolutionary-scale prediction of atomic-level protein structure with a language model: In this work, Lin and their co-authors demonstrate how large language models can be used to directly infer full atomic-level protein structure from primary sequences. They scale language models of protein sequences up to 15 billion parameters and find that, as the models grow, information about atomic-resolution structure emerges in the learned representations, enabling fast end-to-end structure prediction.
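For readers who want to try this, the open-source esm package released alongside this line of work exposes the folding model. A minimal sketch, assuming the package with its optional ESMFold dependencies is installed and a GPU is available (the sequence below is just a placeholder):

```python
import torch
import esm

# Load the pretrained ESMFold model (requires the optional ESMFold
# dependencies of the esm package) and move it to the GPU for inference.
model = esm.pretrained.esmfold_v1()
model = model.eval().cuda()

# Placeholder amino-acid sequence; substitute your protein of interest.
sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"

# Infer full atomic-level structure directly from the primary sequence.
with torch.no_grad():
    pdb_string = model.infer_pdb(sequence)

with open("prediction.pdb", "w") as f:
    f.write(pdb_string)
```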
A high-level programming language for generative protein design: This article addresses the challenges that biological complexity poses for top-down protein design. Lin and their colleagues propose a modular, programmable approach in which basic building blocks are composed into more intricate forms, a principle they argue is universal in design.
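The paper's own system is not reproduced here, but a toy sketch can convey the compositional idea: constraints act as building blocks that compile into a single objective. Every name below is invented for illustration and is not the paper's actual API.

```python
# Toy illustration of composing design constraints into one objective.
# Constraint, Symmetry, and MaxLength are invented stand-ins.
class Constraint:
    def energy(self, seq: str) -> float:
        raise NotImplementedError

class Symmetry(Constraint):
    """Penalize mismatches between repeated blocks of the sequence."""
    def __init__(self, repeats: int):
        self.repeats = repeats
    def energy(self, seq: str) -> float:
        n = len(seq) // self.repeats
        blocks = [seq[i * n:(i + 1) * n] for i in range(self.repeats)]
        return float(sum(a != b for blk in blocks[1:] for a, b in zip(blocks[0], blk)))

class MaxLength(Constraint):
    """Penalize sequences longer than a target length."""
    def __init__(self, limit: int):
        self.limit = limit
    def energy(self, seq: str) -> float:
        return float(max(0, len(seq) - self.limit))

def program_energy(constraints, seq):
    # A "program" is a set of constraints compiled into one energy function
    # that a generative model or sampler would then minimize.
    return sum(c.energy(seq) for c in constraints)

print(program_energy([Symmetry(2), MaxLength(8)], "MKTVMKTV"))  # 0.0: both satisfied
```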
Language models of protein sequences at the scale of evolution enable accurate structure prediction: The paper investigates how far protein language models (PLMs) can go beyond pattern matching over sequences. It shows that models trained on protein sequences spanning evolutionary diversity learn enough structural information to enable accurate structure prediction.
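As a concrete illustration, the esm package lets one extract the learned representations from a pretrained protein language model. A minimal sketch, assuming the 650M-parameter ESM-2 checkpoint and a placeholder sequence:

```python
import torch
import esm

# Load a pretrained ESM-2 protein language model and its alphabet.
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
batch_converter = alphabet.get_batch_converter()
model.eval()

# Placeholder sequence; substitute real data.
data = [("protein1", "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG")]
batch_labels, batch_strs, batch_tokens = batch_converter(data)

# Extract per-residue representations from the final (33rd) layer.
with torch.no_grad():
    results = model(batch_tokens, repr_layers=[33])
token_representations = results["representations"][33]

# Mean-pool over residues (skipping BOS/EOS) for a per-sequence embedding.
sequence_embedding = token_representations[0, 1 : len(data[0][1]) + 1].mean(0)
```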
Learning inverse folding from millions of predicted structures: Lin and their co-authors address the problem of predicting a protein sequence from its backbone atom coordinates, a task where machine learning has been limited by the scarcity of experimentally determined structures. They augment the training data with millions of predicted structures, nearly three orders of magnitude more data, and show that this substantially improves sequence design.
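The trained inverse-folding model from this work is also distributed through the esm package. A minimal sketch, assuming a local PDB file and chain ID (both placeholders here):

```python
import esm
import esm.inverse_folding

# Load the pretrained ESM-IF1 inverse folding model.
model, alphabet = esm.pretrained.esm_if1_gvp4_t16_142M_UR50()
model = model.eval()

# Read backbone coordinates from a structure file (path is a placeholder).
structure = esm.inverse_folding.util.load_structure("my_structure.pdb", "A")
coords, native_seq = esm.inverse_folding.util.extract_coords_from_structure(structure)

# Sample a candidate sequence conditioned on the backbone coordinates.
sampled_seq = model.sample(coords, temperature=1.0)
print(sampled_seq)
```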
Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences: This work proposes scaling a deep contextual language model with unsupervised learning across evolutionarily diverse sequences. The findings show that information about structure and function emerges in the learned representations, enabling the prediction of biological properties without prior knowledge.
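One common way to test such a claim is a simple probe: freeze the language model, take its per-sequence embeddings, and fit a linear classifier on a property of interest. The sketch below uses random stand-in arrays where real embeddings and labels would go.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-ins: in practice, rows would be frozen per-sequence LM embeddings
# (e.g. mean-pooled ESM representations) and labels a biological property.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((200, 1280))
labels = rng.integers(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(embeddings, labels, random_state=0)

# A linear probe: if the property is decodable from the representations,
# even this simple model should beat chance on held-out data.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", probe.score(X_test, y_test))
```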
Affiliations
New York University
University of Warsaw
Facebook AI Research
Conference Proceedings
Proceedings of the 33rd International Conference on Neural Information Processing Systems (NIPS'19)
Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS'18)
YouTube Videos
Zeming Lin - Automatic Differentiation and PyTorch | DLUB2019 Invited Talks
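The talk's subject, reverse-mode automatic differentiation as exposed by PyTorch's autograd, can be shown in a few lines:

```python
import torch

# PyTorch records operations on tensors with requires_grad=True and
# replays them in reverse to compute exact gradients.
x = torch.tensor(2.0, requires_grad=True)
y = x ** 3 + 2 * x   # y = x^3 + 2x
y.backward()          # reverse-mode AD through the recorded graph
print(x.grad)         # dy/dx = 3x^2 + 2 -> tensor(14.) at x = 2
```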