Fall 2019 Projects

Panasonic-R

Team Members: Haolan Zhang, Hongru Ye, Greta Smith, Hao Jiang

The goal of replenishment is to keep inventory flowing through supply chains, maintaining order and efficient line item fill rates. The goal of this project is to improve Panasonic’s supply chain efficiency by minimizing replenishment errors and identifying important Key Performance Indicators (KPIs). First, the team used frequent pattern mining with the FP-growth algorithm to identify parts that Panasonic replenishes in tandem. Then, students performed statistical tests (such as ANOVA) to pinpoint replenishment differences between shifts and weekdays. Finally, team members applied outlier analysis to detect aberrations in job completion time.
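As a rough illustration of what "parts replenished in tandem" means, the sketch below counts how often pairs of parts appear in the same order and keeps the pairs whose support clears a threshold. It is a brute-force pair count, not FP-growth, and the order data, part IDs, and the `frequent_pairs` helper are hypothetical rather than taken from the project.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(orders, min_support):
    """Return {pair: support} for part pairs found in at least
    min_support (a fraction) of all orders."""
    counts = Counter()
    for order in orders:
        # Count each unordered pair at most once per order.
        for pair in combinations(sorted(set(order)), 2):
            counts[pair] += 1
    n = len(orders)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical replenishment orders; the part IDs are made up.
orders = [
    ["P1", "P2", "P3"],
    ["P1", "P2"],
    ["P2", "P3"],
    ["P1", "P2", "P4"],
]
pairs = frequent_pairs(orders, min_support=0.5)
print(pairs)
```

FP-growth reaches the same kind of result for itemsets of any size without enumerating every candidate, which matters once orders contain many parts.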

Panasonic-K

Team Members: Kefei Wu, Jong Hwi Park, Miao Zhou, Yingping Lu

Kitting is the process of gathering the components needed to manufacture a particular automobile part on an assembly line. The goal of this project is to determine valid Key Performance Indicators (KPIs) by extracting insights from Panasonic’s ProVIEW database to optimize efficiency and minimize error. First, the team performed exploratory analysis to gather information about kitting states and jobs completed. Then, team members examined various KPIs and created a predictive model to classify the completion or failure of pick jobs. The predictive analysis used random forest, logistic regression, XGBoost, and decision tree models.

Paychex

Team 1

Members: Tianjie Cheng, Sheng Zhang, Gekun Feng, Zhuang Tian

Recommendation engines suggest personalized products to clients based on demographic information and purchase history, improving product sales over time. For this project, the team employed classical collaborative filtering methods (such as item-based, K-nearest-neighbors, and Singular Value Decomposition) to predict the scores clients might give to new products. Students leveraged clustering methods and imputation techniques to boost the accuracy of Paychex’s recommendation engine, and mitigated cold-start problems by considering feature distributions.
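To make the item-based idea concrete, here is a minimal sketch, assuming scores are stored per product and per client. It predicts a missing score as a similarity-weighted average of the scores the same client gave to other products, using cosine similarity over co-rating clients. The product names, scores, and the `cosine` and `predict` helpers are hypothetical and are not the team's implementation.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two products' score dicts,
    computed over the clients who scored both."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[c] * v[c] for c in common)
    norm_u = sqrt(sum(u[c] ** 2 for c in common))
    norm_v = sqrt(sum(v[c] ** 2 for c in common))
    return dot / (norm_u * norm_v)


def predict(ratings, client, item):
    """Similarity-weighted average of the client's scores on other items.
    Returns None when no other item offers any evidence."""
    num = den = 0.0
    for other, by_client in ratings.items():
        if other == item or client not in by_client:
            continue
        sim = cosine(ratings[item], by_client)
        num += sim * by_client[client]
        den += abs(sim)
    return num / den if den else None


# Hypothetical scores: ratings[product][client] = score on a 1-5 scale.
ratings = {
    "payroll": {"a": 5, "b": 4, "c": 1},
    "hr": {"a": 4, "b": 5},
    "retirement": {"a": 1, "b": 2, "c": 5},
}
pred = predict(ratings, "c", "hr")
print(round(pred, 2))
```

A brand-new client with no scores makes `predict` return None, which is the cold-start case the paragraph above refers to.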

Team 2

Members: Andrea Clark-Sevilla, Sahar Hajiseyednasir, Hyungkyu Lim, Zonyang Yang

This team took a different approach to the same data set. For this project, the team used a K-modes clustering technique with Hamming distance to place clients into groups. Then, students employed a matrix factorization technique with alternating least squares to perform the inference. Finally, team members generated a ranked purchasing list for each client and evaluated the rankings using mean average precision.
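The following is a minimal sketch of K-modes clustering with Hamming distance, assuming each client is a tuple of categorical attributes and that initial modes are supplied by the caller. The client attributes and the `hamming` and `k_modes` helpers are hypothetical and only illustrate the general technique.

```python
from collections import Counter


def hamming(a, b):
    """Number of attribute positions where two records differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, modes, max_iter=10):
    """Alternate assignment and mode updates until the modes stop changing."""
    modes = [tuple(m) for m in modes]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest mode.
        labels = [
            min(range(len(modes)), key=lambda j: hamming(r, modes[j]))
            for r in records
        ]
        # Update step: the new mode takes the most common value per attribute.
        new_modes = []
        for j, old in enumerate(modes):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if not members:
                new_modes.append(old)  # keep the old mode for an empty cluster
                continue
            new_modes.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes


# Hypothetical clients described by (size, industry, state).
clients = [
    ("small", "retail", "NY"),
    ("small", "retail", "NJ"),
    ("large", "tech", "CA"),
    ("large", "tech", "NY"),
    ("small", "retail", "NY"),
]
labels, modes = k_modes(clients, modes=[clients[0], clients[2]])
print(labels, modes)
```

Unlike K-means, the cluster centers here are modes rather than means, so the method works directly on categorical data without numeric encoding.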

RTS

Team 1

Members: Faner Lin, Shuaidong Pan, Yi Yao, Qianyi Li

RTS is a regional transportation authority established by New York State. The goal of this project is to find explanations for preventable accidents caused by bus operators. First, descriptive and exploratory analyses were performed on all data, driver-related variables, and environment-related variables. Then, the team applied frequent pattern mining to the accident histories of operators identified as “high risk” and calculated conditional probabilities from them, pinpointing accident patterns. Finally, team members used hierarchical clustering to identify groups of operators with high preventable-accident rates and to point out locations where preventable accidents occurred frequently.

Team 2

Members: Mingwei Jiang, Yuqing Zhao, Vatsal Mehta, Weitan Tao

This team took a different approach to the same data set and problem statement from RTS. After descriptive and exploratory analyses were applied to all data, driver-related variables, and environmental variables, the team performed map analysis and route analysis to determine the locations of the accidents. Finally, team members analyzed RTS’s absence table to establish a relationship between the number of days drivers took off and the preventable accident rate.

URMC

Team Members: Yumeng Xi, Ziyu Song, Haizhu Yang, Tianyou Xiao

The goal of this project is to provide insights into URMC’s un-meeting format, employed for a recent scientific conference, defining interactions and identifying patterns in session attendance and topics discussed. First, the team cleaned and pre-processed the meeting data, identifying signal strength distribution during different sessions. Then, students created a dynamic member network graph to categorize group and general interactions. They also plotted a member-to-member “heatmap” to display the distribution of member interactions. Finally, team members aggregated meeting attendees with similar backgrounds to analyze interaction patterns.

VestEdu

Team Members: Sakshi Mehta, Shree Vandana Kachroo, Shrikant Adhikarla, Scott Kirschner

The goal of this project is to identify the correlation between education parameters (such as university rank, major, and level of education) and the creditworthiness of loan recipients, exploring the consequential interest rates applied by lending clubs. First, the team performed exploratory analysis, examining the relationship between credit value and education parameters. Then, students used spline regression and random forest models to determine the effects education parameters had on credit rates. Finally, team members used an XGBoost model to predict the interest rates of lending clubs.

Wegmans

Team 1

Members: Kefu Zhu, Seda Ozturk, Ella Wan, Zhou Xu

Wegmans grocery stores often experience changes in consumer demand due to inclement weather, which can result in item shortages. The goal of this project is to generate a list of items that could sell out during inclement weather events, allowing Wegmans to prepare stores ahead of time. The team correlated weather warnings from NWS weather data with anomalous behavior in the net unit sales of various items over time. The same analysis was then applied to multiple store locations; the results show region-specific and seasonal consumer patterns.
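One simple way to picture this kind of analysis is to flag days whose sales deviate strongly from the mean and check which of those days carried a weather warning. The sketch below uses a plain z-score rule, which is an assumption and not necessarily the method the team used. The sales figures, warning days, and the `anomalous_days` helper are made up for illustration.

```python
from statistics import mean, stdev


def anomalous_days(sales, threshold=2.0):
    """Indices of days whose sales lie more than `threshold`
    sample standard deviations from the mean."""
    mu, sigma = mean(sales), stdev(sales)
    if sigma == 0:
        return []
    return [i for i, s in enumerate(sales) if abs(s - mu) / sigma > threshold]


# Hypothetical daily net unit sales for one item at one store.
sales = [100, 98, 103, 101, 99, 180, 102, 100, 97, 101]
# Hypothetical day indices on which a weather warning was in effect.
warning_days = {5, 8}

flagged = anomalous_days(sales)
overlap = sorted(set(flagged) & warning_days)
print(flagged, overlap)
```

Repeating the check per item and per store would give a candidate list of weather-sensitive items, although a seasonal baseline would be more robust than a single overall mean.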

Team 2

Members: James Zhan, Stephen Savchik, Xiaoning Guo, Luke Gerstner

This team took a different approach to the same data set. In this project, the team fitted Facebook Prophet time series models to each Wegmans product, correlating significant deviations from expected net unit sales with weather warning dates from NWS weather information. They then provided confidence intervals for each product’s percent change in consumer demand for a variety of stores, regions and seasons.