Farai Mupfuti | Data Scientist

Azure Data Scientist Certified • Machine Learning Expert • Analytics Professional

Azure Data Scientist Associate

About Me

Passionate Data Scientist with Azure Data Scientist Associate certification, specializing in machine learning, predictive analytics, and cloud-based data solutions.

Professional Journey

I specialize in building scalable ML solutions on the Microsoft Azure cloud platform. My expertise spans the full workflow, from data preprocessing through model deployment and monitoring.

I hold the Azure Data Scientist Associate certification and have successfully delivered more than 20 ML and data analytics projects.

20+ projects successfully delivered

1TB+ of data processed and analyzed

Technical Skills

Expertise across the full data science and machine learning stack

Python: 95%
Machine Learning: 90%
Azure ML: 85%
SQL: 88%
Power BI: 82%
Statistics: 92%
Machine Learning
Scikit-learn
TensorFlow
PyTorch
XGBoost
Random Forest
Azure Cloud
Azure ML
Synapse Analytics
Data Factory
Cognitive Services
Power BI
Databricks
Data Analytics
Python
SQL
Pandas
NumPy
Matplotlib
Tableau

Featured Projects

Real-world applications of data science and machine learning

Machine Learning - Skin Cancer Classification
Advanced computer vision model for automated skin cancer detection and classification using deep learning techniques, deployed on Hugging Face Spaces for real-time medical image analysis and early diagnosis support
Machine Learning
Computer Vision
Deep Learning
Convolutional Neural Networks
TensorFlow
Keras
Image Classification
Medical AI
Hugging Face
Gradio
Python
Impact: Achieved 94% accuracy in skin lesion classification, potentially enabling early detection and saving lives through accessible AI-powered screening
Skin Cancer Prediction
Advanced machine learning model for skin cancer risk prediction and early detection using clinical data and patient history, deployed on Hugging Face Spaces for accessible healthcare screening and preventive care
Machine Learning
Predictive Modeling
Healthcare AI
Risk Assessment
Clinical Data Analysis
Python
Scikit-learn
XGBoost
Hugging Face
Gradio
Medical Analytics
Impact: Achieved 91% accuracy in skin cancer risk prediction, enabling early intervention and preventive care for high-risk patients
AI-Powered Drug Discovery
Advanced pharmaceutical research platform using artificial intelligence for drug discovery and development, integrating BioGPT for literature analysis, molecular representation, and therapeutic compound identification
Artificial Intelligence
Drug Discovery
BioGPT
Molecular Modeling
SMILES Representation
Pharmaceutical AI
Cheminformatics
Natural Language Processing
Hugging Face
Gradio
Python
Bioinformatics
Impact: Accelerated drug discovery process by 60% through AI-powered molecular analysis and literature mining, potentially reducing time-to-market for new therapeutics
Machine Learning - Real-Time Speech Recognition
Advanced real-time speech recognition system using state-of-the-art machine learning models, deployed on Hugging Face Spaces for instant audio-to-text conversion with high accuracy and low latency
Machine Learning
Speech Recognition
Natural Language Processing
Transformers
Whisper
PyTorch
Hugging Face
Audio Processing
Real-time Systems
Gradio
Python
Impact: Achieved 96% accuracy in real-time speech transcription with sub-second latency, enabling accessible communication solutions
Machine Learning - Diabetes Chat Model
Interactive AI-powered chatbot for diabetes management and health consultation, built with advanced machine learning models and deployed on Hugging Face Spaces for real-time patient support
Machine Learning
Natural Language Processing
Hugging Face
Python
Transformers
Healthcare AI
Chatbot Development
Medical Data Analysis
Gradio
Impact: Provided 24/7 diabetes support to 500+ users with 92% satisfaction rate and improved health outcomes
Facial Emotion Detection
Advanced computer vision system for real-time facial emotion recognition using deep learning models, deployed on Hugging Face Spaces for instant emotion analysis from images and video streams
Computer Vision
Deep Learning
Emotion Recognition
Convolutional Neural Networks
OpenCV
TensorFlow
Keras
Face Detection
Image Processing
Hugging Face
Gradio
Python
Impact: Achieved 93% accuracy in emotion classification across 7 emotion categories, enabling applications in mental health, customer experience, and human-computer interaction
2016 TV Ad Performance using Google Looker Studio
Comprehensive television advertising performance analysis using Google Looker Studio, tracking campaign effectiveness, audience reach, and ROI metrics across multiple TV networks and time slots
Google Looker Studio
TV Analytics
Media Planning
Performance Marketing
Data Visualization
Campaign Analysis
Audience Measurement
ROI Analysis
Impact: Optimized TV advertising spend resulting in 45% improvement in campaign ROI and 30% increase in audience engagement
Databricks: Tesla Stock Prices Over Time
Comprehensive financial analytics dashboard analyzing Tesla stock price movements, trading patterns, and market trends using Databricks platform with real-time data processing and predictive modeling
Databricks
Apache Spark
Python
SQL
Financial Analytics
Time Series Analysis
Machine Learning
Delta Lake
Real-time Processing
Impact: Enabled data-driven investment decisions with 85% accuracy in trend prediction
Tableau - US Aid Funding (Africa)
Interactive Tableau dashboard analyzing US foreign aid distribution across African countries, featuring comprehensive visualizations of funding patterns, aid categories, and temporal trends
Tableau
Data Visualization
Geospatial Analysis
Dashboard Design
Public Policy Analytics
Impact: Enhanced transparency in foreign aid allocation and distribution patterns
Tableau - Meningitis and Neonatal Sepsis Tracker
Comprehensive public health surveillance dashboard tracking meningitis and neonatal sepsis cases and deaths, providing critical insights for healthcare policy and intervention strategies
Tableau
Public Health Analytics
Epidemiological Analysis
Healthcare Data Visualization
Disease Surveillance
Impact: Enabled data-driven public health decision making and resource allocation
Tableau - Political Conflict in Africa
Comprehensive geopolitical analysis dashboard tracking political conflicts, civil unrest, and security incidents across African nations with interactive visualizations and trend analysis
Tableau
Geopolitical Analysis
Conflict Mapping
Security Analytics
Political Data Visualization
Crisis Monitoring
Impact: Enhanced understanding of conflict patterns and security trends for policy makers
Tableau - Internet Usage in Africa
Comprehensive digital connectivity analysis dashboard tracking internet penetration, usage patterns, and digital infrastructure development across African countries
Tableau
Digital Analytics
Telecommunications Data
Infrastructure Mapping
Connectivity Analysis
Digital Divide Research
Impact: Informed digital policy decisions and infrastructure investment strategies
Customer Segmentation Analysis
Advanced machine learning project implementing RFM analysis and clustering algorithms to identify distinct customer segments for targeted marketing strategies
Python
Scikit-learn
K-means Clustering
RFM Analysis
Pandas
Matplotlib
Seaborn
Machine Learning
Impact: Increased marketing campaign effectiveness by 35% through targeted customer segments
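For readers unfamiliar with RFM analysis, here is a minimal sketch of the feature step: for each customer, compute recency (days since the last purchase), frequency (number of purchases), and monetary value (total spend). The transactions, the `rfm_table` helper, and the snapshot date are hypothetical illustrations, not the project's actual data or code, and the sketch uses only the Python standard library rather than the Pandas pipeline listed above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
TRANSACTIONS = [
    ("c1", date(2024, 1, 5), 120.0),
    ("c1", date(2024, 3, 1), 80.0),
    ("c2", date(2023, 11, 20), 40.0),
    ("c3", date(2024, 2, 14), 300.0),
    ("c3", date(2024, 2, 28), 150.0),
    ("c3", date(2024, 3, 10), 50.0),
]


def rfm_table(transactions, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        # Track the most recent purchase date seen for each customer.
        last_purchase[customer] = max(last_purchase.get(customer, day), day)
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((snapshot - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


# The snapshot date is an assumed "today" used to measure recency.
rfm = rfm_table(TRANSACTIONS, snapshot=date(2024, 3, 15))
for customer, (recency, freq, total) in sorted(rfm.items()):
    print(customer, recency, freq, total)
```

In a clustering workflow, these three values per customer would typically be scaled and then passed to an algorithm such as K-means to form the segments.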
Google Analytics for a Marketing Website
Comprehensive web analytics dashboard built with Google Looker Studio, tracking marketing performance, user behavior, and conversion metrics for data-driven marketing optimization
Google Analytics
Looker Studio
Web Analytics
Marketing Analytics
Conversion Tracking
Data Visualization
Impact: Improved marketing ROI by 40% through data-driven campaign optimization
Human Development Index by Country using Databricks
Comprehensive analysis of global Human Development Index (HDI) trends using Databricks platform, featuring interactive dashboards and predictive modeling for development indicators
Databricks
Apache Spark
Python
SQL
Data Visualization
Statistical Analysis
Machine Learning
Delta Lake
Impact: Enabled data-driven policy recommendations for international development organizations
Matplotlib: Spotify & YouTube Data
Comprehensive data analysis and visualization of music streaming patterns comparing Spotify and YouTube platforms using advanced statistical methods and interactive visualizations
Python
Matplotlib
Pandas
Streamlit
Data Analysis
Music Analytics
Impact: Revealed key insights into music consumption patterns
Movie Dataset: Pandas & Matplotlib
Comprehensive analysis of TMDB movie metadata exploring box office performance, genre trends, and industry insights using advanced data manipulation and visualization techniques
Python
Matplotlib
Pandas
Data Analysis
Movie Analytics
Streamlit
Impact: Identified key factors driving movie success and revenue patterns
Pandas: A Minimum Working Age
Interactive data analysis application demonstrating minimum working age regulations across different regions using Pandas for data manipulation and Streamlit for visualization
Python
Pandas
Streamlit
Data Analysis
Visualization
Impact: Streamlined age regulation compliance analysis
Pandas for EDA
Comprehensive exploratory data analysis toolkit built with Pandas and Streamlit, providing interactive visualizations and statistical insights for rapid data understanding
Python
Pandas
Streamlit
Data Visualization
Statistical Analysis
Impact: Reduced data exploration time by 60%
Matplotlib - A Minimum Working Example
Interactive demonstration of essential Matplotlib plotting capabilities, showcasing fundamental chart types and customization options for effective data visualization
Python
Matplotlib
Streamlit
Data Visualization
Plotting
Impact: Simplified visualization workflow for teams

Experience & Education

Professional journey and academic background

Professional Experience

2022 - 2025

System Administrator

Innovent South Africa

Managed and maintained critical IT infrastructure including Windows/Linux servers, network systems, and cloud environments. Implemented security protocols, performed system monitoring and troubleshooting, and ensured 99.5% uptime across production systems. Automated routine tasks through scripting, managed user accounts and permissions, and provided technical support to 200+ end users. Led system upgrades and disaster recovery planning while maintaining compliance with organizational security policies.

2017 - 2022

System Administrator

Qrent Zimbabwe

Administered enterprise IT infrastructure encompassing Windows and Linux server environments, network architecture, and cloud platforms. Established robust security frameworks, conducted proactive system monitoring, and resolved technical issues to maintain optimal system performance with minimal downtime. Streamlined operations through automation scripts, oversaw user provisioning and access controls, and delivered comprehensive technical support to organizational staff. Coordinated infrastructure upgrades and developed business continuity strategies while ensuring adherence to cybersecurity standards and regulatory requirements.

2014 - 2017

ICT Officer

Jairos Jiri Association

Managed and maintained organizational IT infrastructure, including network systems, hardware, and software applications. Provided technical support to end-users, troubleshot system issues, and implemented cybersecurity protocols. Coordinated software installations, system updates, and data backup procedures while ensuring compliance with IT policies and standards. Collaborated with cross-functional teams to optimize technology solutions and improve operational efficiency.

Education & Certifications

Azure Data Scientist Associate

Microsoft • 2025

Certified in designing and implementing data science solutions on Azure

Business Management & Information Technology

Catholic University of Zimbabwe • 2010-2014

Specialized in information systems and business management

AWS Solutions Architect Associate

Amazon • 2021

Focus on cloud design and architecture

Get In Touch

Let's collaborate on your next data science project

Let's Work Together

I'm always interested in new opportunities and challenging projects. Whether you need help with machine learning, data analytics, or Azure cloud solutions, I'd love to hear from you.

faraimupfuti@gmail.com
linkedin.com/in/faraimupfuti
github.com/faraimupfuti
Send a Message
I'll get back to you within 24 hours