Improved the robustness of automatic speech recognition models on the ML-SUPERB benchmark by applying Group-DRO. Through experiments on the Bantu language family and on English audio drawn from several ML-SUPERB datasets, showed that Group-DRO can improve worst-case performance under challenging group shifts while maintaining high average performance across the test set.
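A minimal sketch of the Group-DRO objective (Sagawa et al., 2020) assumed here to sit in a standard PyTorch training loop that already produces per-sample losses and integer group ids (e.g. language or source-dataset labels); the step size `eta` and the group assignment are illustrative, not the values used in the project.

```python
import torch


class GroupDROLoss(torch.nn.Module):
    """Group-DRO re-weighting: keep an exponential-weights distribution q over
    groups and up-weight groups with higher loss, so SGD approximately
    minimizes worst-group risk."""

    def __init__(self, n_groups, eta=0.01):
        super().__init__()
        self.eta = eta  # step size for the group-weight update (assumed value)
        self.register_buffer("q", torch.ones(n_groups) / n_groups)

    def forward(self, per_sample_loss, group_ids):
        # Average loss of each group present in the batch (zero if absent).
        group_losses = torch.stack([
            per_sample_loss[group_ids == g].mean()
            if (group_ids == g).any() else per_sample_loss.new_zeros(())
            for g in range(self.q.numel())
        ])
        # Exponentiated-gradient update: worse-performing groups gain weight.
        q = self.q * torch.exp(self.eta * group_losses.detach())
        self.q = q / q.sum()
        # Weighted loss; gradients flow only through the group losses.
        return (self.q * group_losses).sum()
```

In a training loop this would replace the mean cross-entropy, e.g. `loss = dro(F.cross_entropy(logits, targets, reduction="none"), group_ids)`, where `group_ids` marks the language or dataset of each utterance.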
Investigated the effect of self-supervised pretraining objectives (contrastive learning vs. masked image modelling) on the robustness of Vision Transformer representations to distribution shift. Found that linear probing over contrastively pretrained ViTs shows the strongest OOD robustness for image classification, owing to stronger out-of-the-box classification performance and because it avoids the overfitting to in-distribution training data that full fine-tuning can introduce.
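A sketch of the linear-probing setup behind that comparison: the pretrained ViT backbone is frozen and only a linear head is trained on its pooled features. How the backbone checkpoint is obtained (a contrastive model such as DINO/MoCo-v3 vs. an MAE-style masked-image-modelling one) is assumed to happen elsewhere.

```python
import torch
import torch.nn as nn


class LinearProbe(nn.Module):
    """Linear probe over a frozen, self-supervised-pretrained ViT.

    `backbone` is assumed to map images to a pooled feature vector
    (e.g. the CLS token); `feat_dim` is that vector's dimensionality.
    """

    def __init__(self, backbone, feat_dim, n_classes):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # representations stay fixed
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        with torch.no_grad():  # no gradients through the frozen backbone
            feats = self.backbone(x)
        return self.head(feats)
```

Only `probe.head.parameters()` go into the optimizer, which is what keeps the probe from fitting the in-distribution training data the way full fine-tuning can.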
Explored applying instance-specific data augmentations to the meta-learning setting. Found that although instance-specific augmentations still beat a random baseline, they tend to overfit to the training task distribution and do not perform as well as augmentations with stronger regularizing effects, such as CutMix. Final project for CS 330 - Deep Multi-Task and Meta Learning.
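For reference, a minimal version of the CutMix baseline mentioned above (Yun et al., 2019), written for PyTorch image batches; the Beta parameter `alpha=1.0` is an illustrative default, not necessarily what the project used.

```python
import numpy as np
import torch


def cutmix(images, labels, alpha=1.0):
    """Paste a random patch from a shuffled copy of the batch into each image
    and mix the labels in proportion to the pasted area."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))
    h, w = images.shape[-2:]
    # Choose a box whose area is roughly (1 - lam) of the image.
    cut_h, cut_w = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    # Recompute lam from the exact box that was pasted.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    return mixed, labels, labels[perm], lam
```

The training loss then mixes the two label sets: `lam * criterion(out, y_a) + (1 - lam) * criterion(out, y_b)`.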
Projects for CS 194-26 - Computational Photography and Computer Vision. Explored image alignment, blending, filters, morphing, warping, and applications of deep learning in computer vision.
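As one small illustration of the alignment piece, a single-scale sketch of exhaustive channel alignment by normalized cross-correlation; larger images typically need an image-pyramid version of the same search, and the `max_shift` radius here is just an assumed example.

```python
import numpy as np


def align_by_shift(moving, fixed, max_shift=15):
    """Find the integer (dy, dx) shift of `moving` that best matches `fixed`
    under normalized cross-correlation, by brute-force search."""

    def ncc(a, b):
        a, b = a - a.mean(), b - b.mean()
        return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = ncc(np.roll(moving, (dy, dx), axis=(0, 1)), fixed)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```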
Implemented work stealing for a batched GPU implementation of the Smith-Waterman DNA sequence alignment algorithm, using CUDA and OpenMP in C++, targeting heterogeneous computing environments. Final project for CS 267 - Parallel Computing.
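For context, the dynamic program being batched on the GPU is the standard Smith-Waterman local-alignment recurrence; below is a plain single-pair NumPy reference with linear gap penalties (the scoring values are illustrative), not the CUDA/OpenMP work-stealing implementation itself.

```python
import numpy as np


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local-alignment score between sequences a and b.

    H[i, j] = max(0, diagonal move + substitution score,
                  gap from above, gap from the left);
    the maximum entry of H is the local-alignment score.
    """
    H = np.zeros((len(a) + 1, len(b) + 1), dtype=np.int32)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i, j] = max(0, H[i - 1, j - 1] + sub,
                          H[i - 1, j] + gap, H[i, j - 1] + gap)
    return int(H.max())


print(smith_waterman_score("GATTACA", "GCATGCU"))  # score one example pair
```

In the project itself, many such pairs are scored in parallel on the GPU, with work stealing used to keep workers in a heterogeneous environment busy across batches.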
Built a custom Vision Transformer-based architecture for classification on Tiny ImageNet using PyTorch. Achieved 85% top-1 accuracy and explored the use of different attention mechanisms, data augmentation methods, and pretrained base models for improving out-of-distribution robustness. Computer vision final project for CS 182 - Deep Learning.
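A sketch of the "pretrained base model" starting point, using torchvision's ViT-B/16 weights and swapping the classification head for Tiny ImageNet's 200 classes; the custom architecture, attention variants, and augmentations from the project are not reproduced here, and inputs would need to be resized to 224x224.

```python
import torch.nn as nn
from torchvision import models


def tiny_imagenet_vit(n_classes=200):
    """ImageNet-pretrained ViT-B/16 with a fresh head for Tiny ImageNet."""
    model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
    in_features = model.heads.head.in_features
    model.heads.head = nn.Linear(in_features, n_classes)  # 200-way classifier
    return model
```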