Fall 2020 Projects

Constellation Brands: A Comprehensive Analysis of Alcohol & Marijuana Consumption Trends

Team Members: Nageswara Rao Gurram, Surya Iyer, Purvanshi Mehta, Rohan Sharma

For beer, wine, and spirits producer Constellation Brands, students identified alcohol and marijuana consumption trends. The team analyzed publicly available datasets using statistical tools, modeling techniques, and autoregression methods to identify national, regional, and state alcohol consumption trends and demographics, as well as the factors that have contributed to increases and decreases in marijuana usage across different states.

Paychex

Team 1: Predicting Potential Clients for Paychex HR Services

Members: Raunak Mahalik, Jayant Rohra, Srishti Singh, Ajinkya Deshmukh

For human resources and payroll company Paychex, students on team one were tasked with creating a solution to identify ideal clients for 401(k) upsell opportunities. The team created an XGBoost predictive model to isolate existing Paychex clients who might be more likely to purchase 401(k) services. They also added model explanation capabilities to spell out why a client might be a good candidate for upsell.

Team 2: A Model to Predict Paychex 401(k) Services’ Potential Clients and Explainers for Analysis

Members: Haoyu Chen, Yiwen Cai, Xinyu Hu, Yuchao Zha

Paychex team two was also tasked with identifying 401(k) upsell opportunities. Using undersampling and oversampling methods to improve precision and recall, the team employed a Random Forest classifier to predict whether a client will add 401(k) services. They also built local explainers (LIME and SHAP) to interpret why a specific client might be a good or a poor candidate for upsell. Finally, team members leveraged SHAP to generate global interpretations of how each feature affects the probability of a client adding 401(k) services.
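The oversampling step mentioned above can be illustrated with a minimal sketch. This is not the team's code: the function name `random_oversample`, the toy rows, and the choice of simple random duplication (rather than any particular resampling library or method the team may have used) are all assumptions made for illustration.

```python
import random

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra copies with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

# Made-up data: four clients without 401(k) services (0), one with (1).
rows = [[1], [2], [3], [4], [5]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

Resampling of this kind is normally applied to the training split only, so that duplicated rows do not leak into the data used to measure precision and recall.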

Regional Transit Service (RTS): Preventable Accidents Identification

Team Members: Xiaoran Li, Weiran Lin, Meiying Chen, Weinan Hu

For Regional Transit Service (RTS), the Rochester-area public bus system, students investigated the causes of preventable bus accidents. The team analyzed the data using logistic regression and deep learning, and determined that road type, road curvature and grade, and how often operators are absent are all correlated with preventable bus accidents.

Shoptaki: Developing an Automated Cryptocurrency Trading Bot

Team Members: Justin Hughes, Jake Senhert, Tom Hogrefe

For smart chain startup Shoptaki, students created a trading bot that identifies optimal buy and sell price points across various cryptocurrency pairs and executes trades accordingly. The team focused on data labeling and preparation, and used a Long Short-Term Memory (LSTM) model to build a bot that aims to maximize profit and executes trades based on evaluations of historical data.

United Way: RMAPI Residents Survey Analysis and Prediction during the Covid-19 Pandemic

Team Members: Haosong Zheng, Haomin Hu, Zhichao Peng

To help Monroe County residents impacted by COVID-19, United Way sent out a survey to determine their needs. For this project, students used frequency mining to extract residents’ needs from their responses to open-ended questions, applying NLP techniques and the NLTK library to explore keywords. They then employed TF-IDF to extract features from the responses, and a Bayesian model for predictions. Using RMAPI’s survey data, the model can predict the region’s needs so that United Way can provide supplies to those in need; its accuracy should improve as more data becomes available.
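As a rough illustration of the TF-IDF step, the sketch below scores each word in a response by how often it appears in that response, discounted by how many responses contain it. It is not the team's implementation: the function name `tf_idf`, the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the sample responses are all assumptions made for this example.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: score} dict per document, using
    term frequency * log(number of documents / document frequency)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Count how many documents each term appears in.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Made-up survey responses, for illustration only.
responses = [
    "need food and rent help",
    "need childcare and food",
    "need internet for school",
]
scores = tf_idf(responses)
```

A word such as "need", which appears in every response, receives a score of zero, while a word unique to one response, such as "rent", scores highest there. Scores like these could then serve as input features for a classifier such as the Bayesian model mentioned above.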

University of Rochester – Facilities: Effect of Atmospheric and Seasonal effects on Solar Panel Power Output

Team Members: Ian Costley, Meghana Murthy, Marina Kupina, Ronald Michaels

For University of Rochester Facilities, team members worked to understand why U of R’s solar panel system doesn’t produce at its rated capacity. The team used multiple linear regression models and Gaussian processes to determine that atmospheric and seasonal effects such as cloud cover, temperature, solar radiation, and humidity all contribute to the effective power output of U of R’s solar panels.

University of Rochester – River Campus Libraries: Scientific Journal Subscription Recommendations for River Campus Libraries

Team Members: Moshiul Azam, Agabek Kabdullin, Trang Nguyen, Yizhi Lan

For University of Rochester River Campus Libraries, students ran a time series model and utilized regression analysis to determine which journals present the most value to U of R researchers.

Wegmans: Decoding Price Elasticity using Sales Transaction Data

Team Members: Bruce Lyu, Daniel Ji, Yingzi Wang, Jay Zhang

For regional supermarket chain Wegmans, team members built a data-driven model that predicts an item’s sales based on price changes at Wegmans and at its competitors. Students used linear regression models to estimate price elasticity, that is, how strongly sales respond to changes in price. The regression models also helped to predict sales fluctuations for a variety of items and departments.
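One common way to estimate price elasticity with linear regression is to regress the logarithm of quantity sold on the logarithm of price; the slope is then the elasticity. The sketch below shows that idea under stated assumptions: the log-log form, the helper names `log_log_elasticity` and `predict_quantity`, and the synthetic data are illustrative and are not taken from the team's model, which also accounted for competitor prices.

```python
import math

def log_log_elasticity(prices, quantities):
    """Fit log(quantity) = intercept + slope * log(price) by ordinary
    least squares; the slope is the estimated price elasticity."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_quantity(price, slope, intercept):
    """Predict quantity sold at a given price from the fitted line."""
    return math.exp(intercept + slope * math.log(price))

# Synthetic data generated from quantity = 100 * price ** -2,
# so the true elasticity is -2 by construction.
prices = [1.0, 1.5, 2.0, 2.5, 3.0]
quantities = [100.0 * p ** -2.0 for p in prices]
elasticity, intercept = log_log_elasticity(prices, quantities)
```

An elasticity of -2 means that a 1% price increase is associated with roughly a 2% drop in units sold. Real transaction data would be noisy, so the fitted slope would only approximate the underlying relationship.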