- www.mindyng.com
- www.linkedin.com/in/mindyng85
mindyng
I am passionate about combining
descriptive analytics with results-oriented
data problem solving and bridging the
knowledge gap across multiple disciplines
and presenting results/insights to different
audiences and teams.
MINDY NG
Projects
Time Series Forecasting on Uber Eats' Vendors
Dec. 2018 to Dec. 2018
Utilized 7,911 samples of date-stamped data and predicted which vendors were worth continuing business with based on ROI.
Trended each vendor's data with Facebook's Prophet over a span of 15 months, with data further broken down into weekly and daily
trends. Resulting model performance, based on a 30-day horizon, was 0.01 - 0.03 RMSE.
Postmates New Market Analysis with Geospatial Heatmaps
Mar. 2019 to Mar. 2019
Analyzed 3-sided market to explore contributors to conversion and churn, used heatmaps to visualize supply and demand, determined health of market
and addressed data integrity issues.
TaskRabbit 2-Sided Market Analysis - Supply and Demand Optimization
May 2019 to May 2019
Used Decision Tree and Random Forest models to predict whether or not a Tasker would be hired. Resulting Random Forest model performance, based on 30 days of data, was 0.943 accuracy.
Utilized 30,000 samples of date-stamped recommendations to Clients to predict what sort of Tasker is usually chosen.
Utilized 30,000 samples of market data to build a model that suggests hourly rates.
Trended each Task category with Facebook's Prophet. Trends were based on 30 historical days and broken down into yearly, weekly and daily predictions. Resulting model, based on a 6-month horizon, produced 12.7-13.7 RMSE.
Sentiment Classification on Amazon Book Reviews
Feb. 2017 to Apr. 2017
Gathered 243,269 Amazon book reviews through UCI's Machine Learning Repository in order to label customer reviews with three different sentiment scores to allow efficient product assessment.
Built three different classification models: MN Naive Bayes, Decision Tree and Random Forest.
Out of the three, Random Forest was the best predictor, with the best model performance at 0.72 test set accuracy. Reclassifying Amazon product reviews prevents shopping paralysis, leading to quick purchase conversions.
U.S. Health Insurance Market Analysis
June 2019 to June 2019
Wrangled and visualized over 12,000,000 health insurance data points to examine trends in benefits over time and across states. Also explored different rates between patients of varying health and rates across states.
Medicare Prescription Drugs Analysis
July 2019 to July 2019
Analyzed 25,209,130 samples of Medicare Part D prescription use to determine how geography correlates with provider density, provider specialties and drug costs.
Used Plotly and Seaborn to visualize the number of providers across states, to geocode provider specialties and to examine differing degrees of drug cost variance across the U.S.
Cohort Analysis on Drugs for Cancer Patients
Jan. 2019 to Jan. 2019
Examined 1,096 samples of de-identified cancer patient treatment data to predict the best drug regimen for a cancer clinic's cohort.
Utilized a paired t-test to determine whether there was a difference in efficacy between two different breast cancer drugs.
NURX E-commerce Telemed Conversion Funnel Analysis
Mar. 2019 to Mar. 2019
Looked into user page-view data to analyze drop-off points and conversion points for website optimization.
Fitbit Calories Burned Measurement Prediction
May 2017 to Aug. 2017
Gathered 91 quantified-self data points through Fitbit's API and, with 6 meaningful calorie measurements, determined which activity was the best to invest in to achieve the highest calorie burn.
Built three different regression models: Linear Regression, Decision Tree and Random Forest.
Out of the three, Linear Regression was the best predictor, with the lowest RMSE (0.7 on the test set). Completing analysis on self-quantifying data provides a new dashboard metric for health-conscious Fitbit users.
Touch of Modern E-commerce Consumer Behavior Analysis
Apr. 2019 to Apr. 2019
Examined users and orders data to determine consumer trends for business intelligence insights.
Skills
LANGUAGES
SQL
Python
R
DATA WRANGLING
Data Cleaning
Data Exploration
STATISTICS
Probability Statistics
Inferential Statistics
Statistical Analysis and Core Statistical Functions
Descriptive Analytics
MODELS / MACHINE LEARNING
Natural Language Processing
Logistic Regression
Linear Regression
Decision Trees
Random Forests
Naive Bayes Classification
Predictive Analytics
K-Means Clustering
DIMENSIONALITY REDUCTION
Principal Component Analysis
OPTIMIZATION
Feature Selection
BUSINESS ANALYTICS
A/B Testing
Customer Segmentation
Cohort Analysis
Time Series Analysis
VISUALIZATION
Matplotlib
Bokeh
Plotly
Folium
Employment
Immuno Concepts
Quality Control Analyst
Sacramento, CA
July 2010 to Apr. 2019
-Performed statistical analysis on half of the company's 22 products per week.
-Tracked trends and outliers to make manufacturing recommendations to management to create efficiencies and increase profit margins.
-Created product performance reports to drive key business investments for following quarter.
University of California, Davis
Research Associate
Davis, CA
Jan. 2005 to Dec. 2008
-Through repeated experimentation explored sigma70 subunit architecture to characterize macromolecular complexes involved in transcription of
growth-related genes.
-Narrowed down which protein chain substitution in antibody-derived proteins fit best with research aims in pre-targeting radioimmunotherapy for Non-Hodgkin's Lymphoma.
Education
Springboard, Data Science Career Track
Jan. 2017 to Dec. 2017
University of California, Davis
Bachelor of Science, Genetics
Sept. 2003 to Dec. 2007