Anisha Jain

I am currently pursuing a Master's in Computer Vision (MSCV) at the Robotics Institute at Carnegie Mellon University, advised by Prof. Lazlo A. Jeni and Michael Zollhoefer (Meta Reality Labs). I am working on dynamic 3D reconstruction, focusing on non-rigidly deforming scenes captured by RGB-D sensors. My primary interests lie at the intersection of Computer Vision and Machine Learning.

Prior to joining CMU, I worked as a Software Engineer at Microsoft (R&D) Pvt. Ltd., where my work focused on agile deployment and experimentation of ML models for spam and phishing detection during mail flow for Microsoft Defender for Office (MDO). I had the privilege of working with a team of talented engineers and researchers, and was mentored by Jay Goyal, Saurabh Shrivastav, and Ganesh Pande.

In 2018, I worked under the supervision of Dr. Amarjot Singh and Dr. S. N. Omkar in the Computational Intelligence lab at the Indian Institute of Science, Bangalore, on suspicious activity detection in surveillance using a gait-based gender recognition system for subjects in loosely fitted clothing.

I completed my undergraduate education with a major in Computer Science and Engineering at the National Institute of Technology, Warangal in 2021. I had the pleasure of working under Dr. R Padmavathy for my undergraduate thesis.

In the late summer of 2020, I was part of the Software Product Sprint at Google APAC, where I worked under the mentorship of Yash Jain. I also interned with the Microsoft Teams Mobile team and with The Garage at Microsoft (R&D) Pvt. Ltd. in the summers of 2020 and 2019, respectively.

Email  |  CV  |  LinkedIn


Zoox
Computer Vision Intern
May 2024 - Aug 2024

Carnegie Mellon University
MS in Computer Vision
Aug 2023 - Dec 2024

Microsoft (R&D) Pvt. Ltd.
Software Engineer
June 2021 - Aug 2023
SWE Intern
May 2020 - July 2020
SWE Intern
May 2019 - July 2019

National Institute of Technology, Warangal
B.Tech. Computer Science
Aug 2017 - May 2021

Google APAC
Software Product Sprint
July 2021 - Sept 2021

Indian Institute of Science, Bengaluru
Research Intern
May 2018 - June 2019

Projects
Dynamic reconstruction of non-rigid scenes from monocular RGB-D videos
[project page]

Instead of relying on complex camera systems, the goal is to simplify avatar creation by using only smartphone data from RGB-D sensors, which can greatly improve the scalability and adoption of avatars. We propose a novel method for dynamic 3D reconstruction focusing on non-rigidly deforming scenes captured by RGB-D sensors.

GaussCraft: Language Driven Segmentation and Editing in 3D Using Gaussian Splatting
[poster]

We streamline 3D asset creation by editing existing scenes rather than building new ones, reducing complexity and time. While NeRF offers detailed rendering, it is hard to manipulate. 3D Gaussian Splatting, by contrast, enables real-time rendering and is easier to edit. Our architecture integrates vision-language understanding, allowing language-driven editing of visual properties. This bridges the gap between language understanding and 3D scene editing.

GIF-Tune: One-Shot Tuning for Continuous Text-to-GIF Synthesis
[project page]

Our methodology can work with any text-to-image (T2I) diffusion architecture pre-trained on a large image dataset. We introduce spatiotemporal attention and background regularization techniques, along with a one-shot tuning strategy, in our "GIF-Tune" model to produce temporally coherent and depth-consistent GIFs.

Curiosity & Entropy Driven Unsupervised RL in Multiple Environments
[arXiv] [slides]

The Alpha-MEPOL method for unsupervised RL in multiple environments achieves better performance with a dynamic alpha and a higher KL-divergence threshold. Curiosity-driven exploration proves effective in high-dimensional environments, fostering diverse experiences and improving learning outcomes. Our experiments confirm the effectiveness of these modifications.

Research

Bayesian Gait-Based Gender Identification (BGGI) Network on Individuals Wearing Loosely Fitted Clothing
[pdf]

To address the challenge of identifying individuals who wear loosely fitted clothes of the opposite gender to evade detection, we propose the Bayesian Gait-based Gender Identification (BGGI) network, designed for real-world scenarios. Trained on the LFCI dataset, our technique demonstrates superior performance in both pose estimation and gender recognition on dense real-world videos.

News Updates
[Aug 2023] Started my M.S. in Computer Vision (MSCV) degree at the Robotics Institute, Carnegie Mellon University (CMU).
[Jun 2021] Started a new position as a Software Engineer, Security, Compliance and Management Org, Microsoft, Hyderabad, India.
[May 2021] Received my Bachelor's degree from National Institute of Technology Warangal, with a major in Computer Science and Engineering.
[May 2020] Started a summer internship with the Microsoft Teams Mobile team at Microsoft, Hyderabad, India.
[May 2019] Started a summer internship at The Garage at Microsoft, Hyderabad, India.
[Feb 2019] Started a new position as Programmer at G-Bit Studios.
[May 2018] Started a summer internship at the Indian Institute of Science, Bangalore.

Source code from Jon Barron