A recommendation system to 'prescribe' comic books to read based on your
existing comic book preferences.
Based on 500K actual comic book purchases obtained in partnership with a local
comic book shop
Applied collaborative filtering to infer relationships between customers and their purchase choices,
via alternating least squares matrix factorization (Spark ML, Python API)
Deployed a Flask Web App for users to get recommendations based on their 3 favorite
comic books, at www.comics-rx.com
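To illustrate the alternating least squares idea behind the bullets above, here is a minimal pure-Python sketch. It is not the Spark ML code used in the project: it assumes a rank-1 factorization, explicit values (for example purchase counts), and made-up toy data. The names `als_rank1` and `regularized_loss` are hypothetical.

```python
import random


def als_rank1(ratings, n_users, n_items, reg=0.1, n_iters=20, seed=0):
    """Approximate each observed (user, item, value) triple as u[user] * v[item]."""
    rng = random.Random(seed)
    u = [rng.uniform(0.5, 1.5) for _ in range(n_users)]
    v = [rng.uniform(0.5, 1.5) for _ in range(n_items)]

    # Index the observations by user and by item for the two half-steps.
    by_user = {i: [] for i in range(n_users)}
    by_item = {j: [] for j in range(n_items)}
    for i, j, r in ratings:
        by_user[i].append((j, r))
        by_item[j].append((i, r))

    for _ in range(n_iters):
        # Hold item factors fixed: each user factor has a closed-form ridge solution.
        for i, obs in by_user.items():
            u[i] = sum(r * v[j] for j, r in obs) / (sum(v[j] ** 2 for j, _ in obs) + reg)
        # Hold user factors fixed: solve for each item factor the same way.
        for j, obs in by_item.items():
            v[j] = sum(r * u[i] for i, r in obs) / (sum(u[i] ** 2 for i, _ in obs) + reg)
    return u, v


def regularized_loss(ratings, u, v, reg=0.1):
    """Squared error on observed entries plus an L2 penalty on both factor vectors."""
    err = sum((r - u[i] * v[j]) ** 2 for i, j, r in ratings)
    return err + reg * (sum(x * x for x in u) + sum(x * x for x in v))


# Made-up toy data: (customer index, comic index, purchase count).
toy_purchases = [(0, 0, 3.0), (0, 1, 1.0), (1, 0, 2.0),
                 (1, 2, 4.0), (2, 1, 1.0), (2, 2, 5.0)]
user_factors, item_factors = als_rank1(toy_purchases, n_users=3, n_items=3)
# Predicted affinity of customer 0 for comic 2, which that customer has not bought.
predicted = user_factors[0] * item_factors[2]
```

Each half-step exactly minimizes the regularized loss over one set of factors, so the loss never increases from one iteration to the next. A production recommender would use more latent factors and a library implementation.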
Doppel
A face recognition app for customer engagement.
Used an open-source Google Images scraper to collect 5K images for training data
Implemented transfer learning using TensorFlow and MobileNetV2 to extract image
features for classification
Applied a Random Forest as the final classification model
Deployed as a Flask Web App for a fictional case study promoting Hobbs and Shaw, a
2019 action film, at www.doppeldoya.com
A Song of Vice and Higher
A characterization of 2020 presidential nominees through Game of Thrones.
Accessed the Reddit API to archive thousands of comments about Game of Thrones and
politics
Applied a CountVectorizer to attribute comments to nominees or characters
Engineered features using Gensim and clustered using unsupervised learning to map
characters to nominees
Scored character-to-nominee matches using cosine similarity, which made for a more
compelling model
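The count-vector and cosine-similarity steps above can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: it uses raw word counts instead of the Gensim-engineered features, and the function names and placeholder texts are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document is treated as similar to nothing.
    return dot / (norm_a * norm_b)


def best_match(character_text, nominee_texts):
    """Return the nominee whose text is most similar, plus all scores."""
    vec = bag_of_words(character_text)
    scores = {name: cosine_similarity(vec, bag_of_words(text))
              for name, text in nominee_texts.items()}
    return max(scores, key=scores.get), scores


# Placeholder strings standing in for aggregated Reddit comments.
nominees = {
    "nominee_a": "taxes healthcare reform healthcare",
    "nominee_b": "trade border security trade",
}
match, scores = best_match("reform of healthcare and taxes", nominees)
```

Cosine similarity compares the direction of two vectors and ignores their length, so a character with many comments and a nominee with few can still score as a close match.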
4-Year AV of Drafted NFL Player
Predict the first 4 years of accumulated Approximate Value (AV) for a drafted NFL player.
National Football League (American football) teams hold an annual draft of college
players
How can teams maximize the value they get out of their draft picks?
Used multiple linear regression to predict a player's Approximate Value
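As a rough illustration of multiple linear regression, the sketch below fits ordinary least squares by solving the normal equations in pure Python. The two feature columns and all numbers are synthetic placeholders (generated from y = 2 + 3*x1 - x2), not the project's draft data, and the function names are hypothetical.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    rows = [[1.0] + [float(x) for x in r] for r in X]  # Prepend an intercept column.
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; X^T X is singular")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [a / pivot_value for a in A[col]]
        for k in range(p):
            if k != col:
                factor = A[k][col]
                A[k] = [a - factor * b for a, b in zip(A[k], A[col])]
    return [A[i][p] for i in range(p)]  # [intercept, coef_1, ..., coef_n]


def predict(coefs, features):
    """Apply the fitted intercept and coefficients to one feature row."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], features))


# Synthetic placeholder data: two generic features per player, target = 2 + 3*x1 - x2.
X_toy = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y_toy = [3.0, 7.0, 7.0, 11.0, 12.0]
coefs = fit_linear_regression(X_toy, y_toy)
estimate = predict(coefs, (6, 2))
```

Because the toy target is an exact linear function of the features, the fit recovers the generating coefficients up to floating-point error. Real draft data would be noisy and would call for a held-out evaluation.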
Predicting MLB Pitches
Use data available via the MLB Stats API to develop models to identify baseball pitches.
Accessed the MLB Stats API to obtain individual pitch data (e.g. spin rates, location)
Explored various classification algorithms to create models to identify a pitch based on its characteristics
Explored various classification algorithms to create models to predict the next pitch
Based on Seattle Mariners 2018 Regular Season data (single team only, for expediency and because of limited resources)
Visualizing COVID Testing
Visualizing my family's COVID testing results since fall 2021.
Based on a self-maintained dataset containing pertinent dimensions and measures (Google Sheets)
Deployed as Tableau dashboards
Balls & Strikes
(Work in Progress)
Use MLB pitch data to develop model(s) to predict whether a pitch will be called a ball or strike.
Explored various classification algorithms to create a model to identify whether a pitch would be called a ball or strike
Used neural networks to create a model to identify whether a pitch would be called a ball or strike
Deploy a Tableau dashboard to visualize pitches and compare actual and predicted balls and strikes.
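One simple way to compare actual and predicted calls before charting them is a confusion tally. The sketch below is a hypothetical helper with made-up labels, not part of the project code.

```python
from collections import Counter


def compare_calls(actual, predicted):
    """Count (actual, predicted) pairs and return them with the agreement rate."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = Counter(zip(actual, predicted))
    total = len(actual)
    agree = sum(n for (a, p), n in counts.items() if a == p)
    return counts, (agree / total if total else 0.0)


# Made-up example labels for four pitches.
actual_calls = ["ball", "strike", "strike", "ball"]
predicted_calls = ["ball", "strike", "ball", "ball"]
counts, agreement = compare_calls(actual_calls, predicted_calls)
```

The off-diagonal pairs, such as `("strike", "ball")`, are the pitches where the model and the umpire disagree, which are the ones a dashboard would most usefully highlight.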