doi.bio/vijil_chenthamarakshan
Vijil Chenthamarakshan
Biography
Vijil Chenthamarakshan is a research scientist at IBM Research's Thomas J. Watson Research Center in the United States, where he is a member of the AI Foundations Lab. His research interests span machine learning and natural language processing.
Research Focus
Chenthamarakshan's research has focused on various aspects of Machine Learning, including machine translation, information extraction, transfer learning, and paraphrase detection. He has applied these techniques to solve problems in industries such as finance, oil and gas, insurance, and construction. Additionally, he has worked on projects for government agencies in the areas of national security and healthcare.
Education & Career History
Chenthamarakshan holds a Research Software Engineer role at International Business Machines (IBM). He has authored several peer-reviewed publications in international journals and conferences and holds 13 granted patents. More than 15 patent applications are also pending with the US Patent and Trademark Office.
Publications
Chenthamarakshan has numerous publications, including:
- 2024: Structure-Informed Protein Language Model; ProtIR: Iterative Refinement between Retrievers and Predictors for Protein Function Annotation; Larimar: Large Language Models with Episodic Memory Control; GP-MoLFormer: A Foundation Model For Molecular Generation; Protein Representation Learning by Geometric Structure Pretraining; Physics-Inspired Protein Encoder Pre-Training via Siamese Sequence-Structure Diffusion Trajectory Prediction; Enhancing Protein Language Models with Structure-based Encoder and Pre-training; Equivariant Few-Shot Learning from Pretrained Models
- 2023: Equi-Tuning: Group Equivariant Fine-Tuning of Pretrained Models; Protein Representation Learning by Geometric Structure Pretraining; Pre-Training Protein Encoder via Siamese Sequence-Structure Diffusion Trajectory Prediction; Efficient Equivariant Transfer Learning from Pretrained Models; Optimizing molecules using efficient queries from property evaluations; Large-scale chemical language representations capture molecular structure and properties; Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations; Cloud-Based Real-Time Molecular Screening Platform with MolFormer; AlphaFold Distillation for Improved Inverse Protein Folding; Reprogramming Pretrained Language Models for Antibody Sequence Infilling; Learning Geometrically Disentangled Representations of Protein Folding Simulations; GT4SD: Generative Toolkit for Scientific Discovery
- 2022: Accelerating Inhibitor Discovery for Multiple SARS-CoV-2 Targets with a Single, Sequence-Guided Deep Generative Framework; CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models; Sample-Efficient Generation of Novel Photo-acid Generator Molecules using a Deep Generative Model; Learning Implicit Text Generation via Feature Matching; Cloud-Based Real-Time Molecular Screening Platform with MolFormer; Reprogramming Large Pretrained Language Models for Antibody Sequence Infilling; Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations; Do Large Scale Molecular Language Representations Capture Important Structural Information?
- 2021: Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design; Benchmarking deep generative models for diverse antibody sequence design
- 2020: Fairness GAN: Generating datasets with fairness properties using a generative adversarial network; Interactive Visual Exploration of Latent Space (IVELS) for peptide auto-encoder model selection; Learning Implicit Text Generation via Feature Matching
- 2019: A Sequential Set Generation Method for Predicting Set-Valued Outputs
- 2018: Query Focused Variable Centroid Vectors for Passage Re-ranking in Semantic Search; Fairness GAN; PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences
- 2017: A cognitive assistant for risk identification and modeling
- 2015: WYSIWYE: An Algebra for Expressing Spatial and Textual Rules for Information Extraction
- 2014: Predicting employee expertise for talent management in the enterprise
- 2013: Amplifying the voice of youth in Africa via text analytics
- 2012: WYSIWYE: An Algebra for Expressing Spatial and Textual Rules for Information Extraction
- 2011: Transfer Latent Semantic Learning: Microblog Mining with Less Supervision; ALPOS: A Machine Learning Approach for Analyzing Microblogging Data; Concept Labeling: Building Text Classifiers with Minimal Supervision
- 2010: Effective decision support systems for workforce deployment; Measuring Compliance and Deviations in a Template-Based Service Contract Development Process; PROSPECT: a system for screening candidates for recruitment; Syntax Based Reordering with Automatically Derived Rules for Improved Statistical Machine Translation; Urdu and Hindi: Translation and sharing of linguistic resources; Leveraging social networks for corporate staffing and expert recommendation
- 2009: Dependency Analysis Framework for Software Service Delivery; Rule based synonyms for entity extraction from noisy text