publications
Publications by category in reverse chronological order.
2022
- arXiv: A Case for Rejection in Low Resource ML Deployment. Jerome White, Pulkit Madaan, Nikhil Shenoy, Apoorv Agnihotri, Makkunda Sharma, and Jigar Doshi. 2022
Building reliable AI decision support systems requires a robust set of data on which to train models, with respect to both quantity and diversity. Obtaining such datasets can be difficult in resource-limited settings, or for applications in early stages of deployment. Sample rejection is one way to work around this challenge; however, much of the existing work in this area is ill-suited for such scenarios. This paper substantiates that position and proposes a simple solution as a proof-of-concept baseline.
@misc{https://doi.org/10.48550/arxiv.2208.06359,
  doi = {10.48550/ARXIV.2208.06359},
  url = {https://arxiv.org/abs/2208.06359},
  author = {White, Jerome and Madaan, Pulkit and Shenoy, Nikhil and Agnihotri, Apoorv and Sharma, Makkunda and Doshi, Jigar},
  keywords = {Machine Learning (cs.LG), Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title = {A Case for Rejection in Low Resource ML Deployment},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license},
}
2020
- LREC: Multilingual Neural Machine Translation involving Indian Languages. Pulkit Madaan and Fatiha Sadat. In Proceedings of the WILDRE5 – 5th Workshop on Indian Language Data: Resources and Evaluation, May 2020
Neural Machine Translation (NMT) models are typically capable of translating a single language pair and require a new model for each new pair. Multilingual NMT models are capable of translating multiple language pairs, even pairs they have not seen during training. The limited availability of parallel sentences is a known problem in machine translation. A multilingual NMT model leverages information from all the languages to improve its performance. We propose a data augmentation technique that improves this model substantially further. The technique achieves a gain of more than 15 BLEU points over the multilingual NMT model. A BLEU score of 36.2 was achieved for Sindhi–English translation, which is higher than any score on the leaderboard of the LoResMT Shared Task at MT Summit 2019, which provided the data for the experiments.
@inproceedings{madaan-sadat-2020-multilingual,
  title = {Multilingual Neural Machine Translation involving Indian Languages},
  author = {Madaan, Pulkit and Sadat, Fatiha},
  booktitle = {Proceedings of the WILDRE5{--} 5th Workshop on Indian Language Data: Resources and Evaluation},
  month = may,
  year = {2020},
  address = {Marseille, France},
  publisher = {European Language Resources Association (ELRA)},
  url = {https://www.aclweb.org/anthology/2020.wildre-1.6},
  pages = {29--32},
  language = {English},
  isbn = {979-10-95546-67-2},
}
2019
- Thesis: Deep mean shift clustering. Pulkit Madaan, Abhishek Maiti, Saket Anand, and Sushil Mittal. May 2019
We use mean shift clustering in the latent space of an auto-encoder to obtain a better representation of the data and a more structured latent space. Instead of using only the modes of the distribution computed from kernel density estimates, we use the trajectories of data points leading to the modes to better model the basin of attraction of each mode. This helps structure the latent space and results in a more inferential model. Since mean shift can be modelled as an RNN block, our method is end-to-end trainable. Tuning the bandwidth of mean shift gives us the flexibility to cluster the latent space at different hierarchical levels. We modify the original trajectory-based LSTM model by incorporating a discounting mechanism. We also modify the mean shift implementation by using a fixed kernel for the mean shift iterations. In addition, we apply a new loss (Support Set Loss) to penalize the clusters formed in the latent space. This loss uses the trajectories of the points, separated into groups that ended up in the same mode and those that did not. We have used this loss function in both semi-supervised and unsupervised settings. Finally, we propose a model that uses a Contrastive Predictive Coding loss in the latent space as well as a regularizer for the encoder network.
@article{madaan2019deep,
  title = {Deep mean shift clustering},
  author = {Madaan, Pulkit and Maiti, Abhishek and Anand, Saket and Mittal, Sushil},
  year = {2019},
  publisher = {IIIT-Delhi},
  url = {https://repository.iiitd.edu.in/jspui/handle/123456789/915},
  language = {English},
}