Past Projects
- Developed adaptive freezing LoRA (AFLoRA), a parameter-efficient fine-tuning (PEFT) method aimed at efficient training and inference of large models on NLP and computer vision tasks; on DeBERTaV3 it uses 9.5× fewer trainable parameters than the original LoRA while achieving SOTA performance on the GLUE benchmark (a minimal sketch of the freezing idea follows this list).
- Developed energy-efficient neural networks for hardware implementations targeting tasks such as speech recognition, image classification, and anomaly detection. Pioneered a spiking recurrent model that is efficient for streaming edge computing and performs comparably to transformer models: on the Google Speech Commands dataset, the model achieved a 50× reduction in parameter count and a 65× reduction in FLOPs compared with SOTA transformer-based models (see the LMU sketch below).
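For reference, here is a minimal sketch of the adaptive-freezing idea behind AFLoRA, assuming a plain PyTorch LoRA layer. The `freeze_score` heuristic and the threshold are illustrative placeholders, not the published freezing criterion.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update W + B @ A."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

def freeze_score(param: nn.Parameter) -> float:
    """Illustrative 'stability' score: a large weight-to-gradient magnitude ratio
    suggests the projection has stopped changing and can be frozen."""
    if param.grad is None:
        return 0.0
    return (param.data.abs().mean() / (param.grad.abs().mean() + 1e-8)).item()

def adaptive_freeze(model: nn.Module, threshold: float = 1e3) -> None:
    """Freeze LoRA projection matrices whose score exceeds the threshold,
    shrinking the trainable-parameter count as training proceeds."""
    for module in model.modules():
        if isinstance(module, LoRALinear):
            for proj in (module.A, module.B):
                if proj.requires_grad and freeze_score(proj) > threshold:
                    proj.requires_grad = False
```

Calling `adaptive_freeze` at the end of each epoch progressively shrinks the trainable-parameter budget while the pretrained weights and already-frozen projections remain untouched.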
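The spiking streaming model builds on the Legendre Memory Unit recurrence (Voelker et al., 2019), which is public; below is a minimal PyTorch sketch of just the fixed LMU memory update with a simple forward-Euler discretization. The full LMUFormer adds further components (e.g., spiking activations) that are omitted here.

```python
import numpy as np
import torch
import torch.nn as nn

class LMUMemory(nn.Module):
    """Fixed Legendre Memory Unit recurrence: the state m_t compresses the last
    `theta` steps of a scalar signal onto the first `order` Legendre polynomial
    coefficients, with no trainable recurrent weights."""
    def __init__(self, order: int = 64, theta: float = 128.0):
        super().__init__()
        i = np.arange(order)[:, None]
        j = np.arange(order)[None, :]
        A = (2 * i + 1) * np.where(i < j, -1.0, (-1.0) ** (i - j + 1))
        B = (2 * i + 1) * (-1.0) ** i            # column vector, shape (order, 1)
        self.register_buffer("A", torch.tensor(A / theta, dtype=torch.float32))
        self.register_buffer("B", torch.tensor(B / theta, dtype=torch.float32))

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        """u: (batch, time) scalar stream -> (batch, time, order) memory states."""
        m = torch.zeros(u.shape[0], self.A.shape[0], device=u.device)
        states = []
        for t in range(u.shape[1]):              # one update per incoming sample (streaming)
            m = m + m @ self.A.T + u[:, t:t + 1] @ self.B.T   # forward-Euler step
            states.append(m)
        return torch.stack(states, dim=1)
```

Because the recurrence processes one sample at a time with constant memory, it suits streaming edge deployments; a trainable readout on top of m_t turns the memory into a full sequence model.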
Works
- LMUFormer: Low Latency Yet Powerful Spiking Legendre Memory Units
- AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models
- Spiking Neural Networks with Dynamic Time Steps for Vision Transformers
Research Focus
- Parameter-efficient fine-tuning for large language models
- Energy-efficient neural networks for edge computing
- Spiking neural networks for streaming applications
- Long-context compression for LLMs
- Linear transformers