doi.bio/inkit_padhi

Inkit Padhi

Inkit Padhi is a research engineer at IBM Research, with a focus on natural language processing and machine learning. He has contributed to a range of research projects and publications, particularly in the field of Large Language Models (LLMs) and their applications.

Education and Career

Padhi attended the University of Southern California. He is currently a research engineer at IBM Research, where he has been associated with the Thomas J. Watson Research Center, IBM Research-Almaden, and IBM Research-Australia.

Publications and Research

Padhi has co-authored several papers, including:

Contextual Moral Value Alignment Through Context-Based Aggregation: This paper proposes a system for aligning LLMs with multiple moral values, allowing them to adapt their responses to user inputs.
Detectors for Safe and Reliable LLMs: Presents an alternative approach to imposing safety constraints on LLMs by creating a library of "detectors" that provide labels for various harms.
Alignment Studio: An approach that empowers application developers to fine-tune a model's behaviour to align with specific values, social norms, and regulations.
The Impact of Positional Encoding on Length Generalization in Transformers: A study investigating the impact of positional encoding on length generalisation in decoder-only Transformers, finding that explicit position embeddings are not essential for good generalisation performance.
Auditing and Generating Synthetic Data with Controllable Trust Trade-offs: Introduces an auditing framework to assess synthetic datasets and AI models, focusing on bias and discrimination prevention, fidelity, utility, robustness, and privacy preservation.
Fair Infinitesimal Jackknife: Proposes an algorithm to improve the fairness of a pre-trained classifier, with respect to demographic parity and equality of opportunity, by mitigating the influence of biased training data points without refitting the model.
Cloud-Based Real-Time Molecular Screening Platform with MolFormer: Describes a cloud-based platform that allows users to virtually screen molecules, leveraging chemical language processing models to automate chemical tasks.
GT4SD: Generative Toolkit for Scientific Discovery: An open-source library enabling the use of generative models for hypothesis generation in scientific discovery, with applications in material science and drug discovery.
ReGen: Reinforcement Learning for Text and Knowledge Base Generation: Presents ReGen, a bidirectional generation approach that leverages Reinforcement Learning to improve performance in constructing Knowledge Bases and generating text.
Do Large Scale Molecular Language Representations Capture Important Structural Information?: Investigates molecular embeddings, obtained by training a transformer encoder model, for predicting chemical properties from molecular structures, with applications in drug discovery and material design.

Notable Co-Authors

Pierre Dognin
Payel Das
Prasanna Sattigeri
Kush R. Varshney
Brian Belgodere
Youssef Mroueh
Karthikeyan Natesan Ramamurthy
Jerret Ross

Publications

Learning Implicit Generative Models by Matching Perceptual Features
Tabular Transformers for Modeling Multivariate Time Series
DualTKB: A Dual Learning Bridge between Text and Knowledge Base
Learning Implicit Text Generation via Feature Matching
Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models
Sobolev Independence Criterion
PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences
Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer
Improved Neural Text Attribute Transfer with Non-parallel Data
Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge
Alleviating Noisy Data in Image Captioning with Cooperative Distillation
Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text
Do Large Scale Molecular Language Representations Capture Important Structural Information?
ReGen: Reinforcement Learning for Text and Knowledge Base Generation using Pretrained Language Models
GT4SD: Generative Toolkit for Scientific Discovery
Cloud-Based Real-Time Molecular Screening Platform with MolFormer
Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting
Auditing and Generating Synthetic Data with Controllable Trust Trade-offs
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations
Contextual Moral Value Alignment Through Context-Based Aggregation
The Impact of Positional Encoding on Length Generalization in Transformers