About Me
Know Me More
I'm Joe Waugh, a Data Scientist and ML Researcher
I am passionate about data analytics and developing solutions, and ultimately being able to pair the two to create value for organizations. I am constantly working to improve both as a learner and as an employee.
I am skilled in providing data-driven recommendations for analytical use cases across multiple industries, with recent work including KPI optimization forecasting, retail sales forecasting, automated image classification, API development, theft and fraud detection, and self-training multi-class classification modeling. My contributions to these projects have collectively produced $300M+ in revenue savings, manual labor reduction, and operational savings.
- Name: Joe Waugh
- Email: joseph.waugh100@gmail.com
- Current Roles:
  - Data Scientist @ FedEx Ground
  - Data Scientist @ OmniThink.AI
  - Data Science Specialist @ Scale AI
  - Data Science/ML Mentor @ MentorCruise
- From: Phoenix, Arizona
5+ Years Experience
M+ Cost Savings to Date
700M+ Approved CapEx Recommendation ($M)
+ Years Mentorship
Services
What I Do
Machine Learning
I have built and deployed a variety of machine learning models, ranging from time-series forecasting to multi-class image classification. This work includes MLOps packaging of solutions for proper code development.
Cloud Computing
I work in Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure on data engineering, machine learning, and monitoring of deployed ML solutions.
Python Package Development
I have created Python packages to assist in time-series forecast sensitivity analysis at FedEx Ground, as well as packages for other use cases in the supply chain sector.
Data Engineering
I use Apache Airflow (via GCP Cloud Composer) and Apache Spark to ingest on-premises data into Google Cloud Platform and to run scaled data transformations that support other departments and predictive model data pipelines. I also build and maintain data ingestion scripts with Sqoop and cron in GCP and other cloud providers.
Business Analysis
I have years of experience preparing data-driven business recommendations related to cost justification, vendor product quality, and hypothesis analysis. This work has resulted in core investments ranging from $300M to $700M to support future business growth.
API Development
I have experience deploying parameter-tuned models in GCP via Compute Engine. These deployments integrate with streaming data pipelines and auto-retrained ML workflows to support key business areas.
Summary
Experience
Education
2020 - 2022
Master of Science - Computer Science
Georgia Institute of Technology
Specialization - Machine Learning
Impact of Network Compression in Deep Convolutional Neural Networks
Markov Decision Processes in Different Environments
Supervised Learning in Randomized Optimization Algorithms
Unsupervised Learning and Dimensionality Reduction Practices
2017 - 2019
Bachelor of Science - Business Data Analytics
Arizona State University
Graduated from a 4-year program in 2 years
Fielding diverse lineups, ownership pays off in Premier League, Major League Baseball
Framing Shifts of the Ukraine Conflict in pro-Russian News Media
Professional Experience
2022 - Present
Data Scientist II
FedEx Ground
- Leading development of an internal package fraud alerting system built on multiple GCP microservices, resulting in $300M+ in revenue savings from customer churn and 100+ internal employee terminations due to theft
- Developed a TensorFlow image classification model for package surcharges, yielding an estimated $3M in future revenue
- Deployed production changes to improve the FedEx.com estimated delivery date prediction model accuracy by 5%
- Built a TensorFlow image classification model to audit picture proof-of-delivery images, generating yearly savings of $2M
- Developed a computer vision learning curriculum and facilitated an internal workshop for 15+ data scientists
- Built 10+ data ingestion jobs with Cron and Cloud Composer for ingestion of IBM AS/400 and DB2 data into BigQuery
- Built 10 Cloud Logging jobs in Google Cloud Platform to monitor the performance of data ingestion scripts
- Built 2 Power BI dashboards to visualize service failure trends for leadership to use in contract negotiations
- Technology Used: Python, SQL, Spark, TensorFlow, BigQuery, Cloud Logging, Cron, Cloud Composer, Power BI
2023 - Present
Data Science Specialist
Scale AI
- Supporting the training of LLMs into production models
2023 - Present
Machine Learning Mentor
MentorCruise
- Worked with 8+ mentees on developing technical skillsets for job placement into data science roles at various levels
- Helped a mentee fine-tune a Stable Diffusion generative AI model for an MVP of an image generation tool
2021 - 2022
Operations Research, Planning & Analysis Engineer
FedEx Ground
- Conducted capex spend analysis for $300-700M facility upgrades and reviewed with executive stakeholders, resulting in a 100% approval rate for capex spend proposals
- Developed a Python library to handle company forecast data for a $1.2B+ capex budget optimization research initiative
- Built volume forecasts for sensitivity analyses to provide guidance for future market share growth and load planning
- Performed qualitative and quantitative analysis on large datasets to understand customer trends and gain predictive insights to improve linehaul and facility planning strategies
- Completed statistical tests on executive hypotheses regarding different strategic initiatives that impact key service metrics
2019 - 2020
Process Engineer
State Farm
- Led the API data evaluation strategy for a $3M+ telematics-related tool in support of the State Farm Drive Safe & Save initiative, resulting in a 5% lift for automated detection of crash events via A/B testing methodologies
- Led deployment of 7 robotic process automation (RPA) use cases, resulting in cost savings of $400k and 30 FTEs
- Collaborated with internal claims leadership to develop claim volume forecasts, resulting in improved operational efficiencies and business growth by accurately estimating FTE needs for the upcoming fiscal year
- Created and maintained staffing dashboards to assist leadership and stakeholders during the COVID-19 pandemic
Skills
Tech Stack
Data Querying
Tools Frequently Used:
- Google BigQuery
- Oracle SQL
- Azure SQL
Distributed Computing
Tools Frequently Used:
- PySpark (Python)
- Apache Spark (Scala)
- SparkTrials API
- Hadoop
Production Application Development
Tools Frequently Used:
- Docker Containerization
- Google App Engine (Deploy APIs)
- Heroku Deployment
- Google Vertex AI (ML Workflow)
Python - Data Manipulations
Packages Frequently Used:
- Pandas
- PySpark
- NumPy
Python - Data Visualization
Packages Frequently Used:
- Matplotlib
- Seaborn
- Plotly
Python - Machine Learning
Packages Frequently Used:
- TensorFlow
- PyTorch
- SciPy
- NumPy
- scikit-learn
Portfolio
My Work
Contact
Get in Touch
Contact Information
(480) 823-4052
joseph.waugh100@gmail.com