Data Science Grad Student

ANSH GUPTA

Dartmouth College
Boston University
ABOUT
Hey! I am an MS Student at Dartmouth College studying Health Data Science. I am very passionate about working on meaningful projects that help humanity in the long term. I would love to work in Data Science roles that include Machine Learning/Data Engineering/Data Analysis. I am proficient in Python, R, SQL, shell scripting and C++ and love participating in hackathons and doing competitive coding. I also have a very strong background in Statistics, Calculus and Probability.
EXPERIENCE
Data Science Intern at Amazon
Data Science Intern at Ruby Inc.
Computational Biology Research Assistant at The Johns Hopkins University
Deputy Director of Academic Affairs at Boston University Student Government
Computer Vision Intern at SportsVisio
Machine Learning Intern at Ovenue Inc.
Bioinformatics Research Intern at Neuberg Supratech Reference Laboratories
Data Science Intern at Amazon
June 2023 - Sep 2023
  • Authored and presented an Amazon insights paper to senior Amazon leadership on identifying regions of hallucination using deep learning models like autoencoders.
  • Built random forest and autoencoder models using TensorFlow, Spark, and AWS to classify 52M+ Alexa utterances.
  • Engineered and deployed the supporting machine learning pipelines with Python, Spark, and AWS; identified hallucinations in Large Language Models (LLMs) as the primary source of error in Alexa's responses.
  • Optimized SQL queries on 30B+ rows, boosting efficiency by 50% by joining on sort and distribution keys.
  • Collaborated and communicated with 20+ stakeholders with varying technical knowledge daily.
PROJECTS
Recession Predictor USA (Python)
A predictor that uses 6 factors (employment rate, inflation, etc.) to predict a recession in the USA. Scraped, aggregated, and cleaned 50 years of US recession data from multiple sources into a dataframe. Created a random forest that predicts recessions in the USA with 97% accuracy.
Prediction of Parkinson’s Disease Using Biomarkers (R)
Established biomarkers for early detection of Parkinson’s disease to slow or halt disease progression. Implemented linear regression, logistic regression, and principal component analysis.
MIT Article/Patent Equity Data Analysis (Collaboration with Boston University School of Law)
Purpose: To show whether a correlation exists between the articles published and patents issued by MIT professors and the professors' race, gender, tenure status, and department. Work: Web-scraped the data using Python scripts. Used gender and race classification algorithms for professors without labeled data. Performed preliminary analysis on the data using factors such as the disparity index ratio. Verified the results for statistical significance using tests such as t-tests and ANOVA.
Study Buddy App
Created a mobile app that allows students to find their classmates and create study groups with peers based on similar classes, majors, colleges, etc. Students can also purchase tutoring sessions and study guides from other students using the app. Implemented Google Firebase (for secure login, data management, and chat functionality between users), Stripe API (for payment services for the study guides and tutoring sessions), and Recombee API (to recommend the most suitable study partners for the user).
ExcelLearn — built at BostonHacks 2019, Boston, MA
Created a specialized search engine for educational resources. It ranks courses and content based on factors such as user reviews, ratings, and number of ratings, and it provides filters so users can tailor results to their particular needs. Used Google Cloud's Natural Language Processing API to implement sentiment analysis on user reviews for ranking. Created a website and logo for the project. Integrated the backend and frontend using UiPath.
HiMedScan — built at Hashathon 6.0 Hackathon at HashedIn Technologies, Bengaluru, India
Created an application called HiMedScan that can diagnose pneumonia from a patient’s X-ray with 99% accuracy. Used a Convolutional Neural Network as the model. Created a logo and a website for it with Flask. Came up with a subscription-based business model and a scalability model for the application.
CONTACT
mail me at
ansh.gupta.gr@dartmouth.edu
or drop a DM on
and follow my work on
Copyright 2021. Ansh Gupta. All Rights Reserved