Inkit Padhi
Inkit Padhi is a research engineer at IBM Research, with a focus on natural language processing and machine learning. He has contributed to a range of research projects and publications, particularly on large language models (LLMs) and their applications.
Education and Career
Padhi attended the University of Southern California. He is currently a research engineer at IBM Research, where he has been associated with the Thomas J. Watson Research Center, IBM Research-Almaden, and IBM Research-Australia.
Publications and Research
Padhi has co-authored several papers, including:
- Contextual Moral Value Alignment Through Context-Based Aggregation: Proposes a system for aligning LLMs with multiple moral values, allowing them to adapt their responses to user inputs.
- Detectors for Safe and Reliable LLMs: Presents an alternative approach to imposing safety constraints on LLMs by creating a library of "detectors" that provide labels for various harms.
- Alignment Studio: An approach that empowers application developers to fine-tune a model's behaviour to align with specific values, social norms, and regulations.
- The Impact of Positional Encoding on Length Generalization in Transformers: A study investigating the impact of positional encoding on length generalisation in decoder-only Transformers, finding that explicit position embeddings are not essential for good generalisation performance.
- Auditing and Generating Synthetic Data with Controllable Trust Trade-offs: Introduces an auditing framework to assess synthetic datasets and AI models, focusing on bias and discrimination prevention, fidelity, utility, robustness, and privacy preservation.
- Fair Infinitesimal Jackknife: Proposes an algorithm to improve the fairness of a pre-trained classifier, with respect to demographic parity and equality of opportunity, by mitigating the influence of biased training data points without refitting the model.
- Cloud-Based Real-Time Molecular Screening Platform with MolFormer: Describes a cloud-based platform that allows users to virtually screen molecules, leveraging chemical language processing models to automate chemical tasks.
- GT4SD: Generative Toolkit for Scientific Discovery: An open-source library enabling the use of generative models for hypothesis generation in scientific discovery, with applications in material science and drug discovery.
- ReGen: Reinforcement Learning for Text and Knowledge Base Generation: Presents ReGen, a bidirectional generation approach that leverages Reinforcement Learning to improve performance in constructing Knowledge Bases and generating text.
- Do Large Scale Molecular Language Representations Capture Important Structural Information?: Investigates the use of molecular embeddings obtained by training a transformer encoder model to predict chemical properties from molecular structures, with applications in drug discovery and material design.
Notable Co-Authors
- Pierre Dognin
- Payel Das
- Prasanna Sattigeri
- Kush R. Varshney
- Brian Belgodere
- Youssef Mroueh
- Karthikeyan Natesan Ramamurthy
- Jerret Ross
Publications
- Learning Implicit Generative Models by Matching Perceptual Features
- Tabular Transformers for Modeling Multivariate Time Series
- DualTKB: A Dual Learning Bridge between Text and Knowledge Base
- Learning Implicit Text Generation via Feature Matching
- Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models
- Sobolev Independence Criterion
- PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences
- Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer
- Improved Neural Text Attribute Transfer with Non-parallel Data
- Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge
- Alleviating Noisy Data in Image Captioning with Cooperative Distillation
- Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text
- Do Large Scale Molecular Language Representations Capture Important Structural Information?
- ReGen: Reinforcement Learning for Text and Knowledge Base Generation using Pretrained Language Models
- GT4SD: Generative Toolkit for Scientific Discovery
- Cloud-Based Real-Time Molecular Screening Platform with MolFormer
- Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting
- Auditing and Generating Synthetic Data with Controllable Trust Trade-offs
- Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
- Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations
- Contextual Moral Value Alignment Through Context-Based Aggregation
- The Impact of Positional Encoding on Length Generalization in Transformers