Sihle Kalolo
Data Analyst & Scientist | Turning Disruption into Insight
Curious. Resilient. Mission-Driven. I help purpose-led teams find clarity in complex data.
My Path: From Disruption to Discovery
BSc Computer Science & Applied Math
University of Fort Hare
WeThinkCode_
Software Engineering
ExploreAI Academy
Data Science
Sand Technology
Data Scientist
My journey into data hasn’t been linear. In September 2021, during the final year of my BSc in Computer Science and Applied Mathematics at the University of Fort Hare, I was in a serious car accident that temporarily slowed my academic progress.
That period became a turning point. It pushed me to reassess my path and focus on building practical, industry-ready skills. I redirected my energy into intensive, hands-on training through WeThinkCode_ and ExploreAI Academy, where I developed a strong foundation in data science and real-world problem solving.
There, I discovered my true passion: using data not just for theory, but for practical impact. I learned to wrangle unruly data, build predictive models, and visualize insights that help people make better decisions.
That car accident didn't end my story; it gave it a new, more focused direction. Now, I'm on a mission to help others find signal in the noise.
Selected Work: Driving Impact with Data
Predicting River Health for Thames Water
1.5M+
Monthly Records
50+
Monitoring Stations
85%
Faster Reporting (5d→3h)
25%
↓ Incident Risk
Complex SQL Engineering: Wrote SQL queries to analyze data from 50+ stations and 12 field scientists, uncovering patterns in water quality and sewage discharge risks.
Dashboard Automation: Built interactive Power BI dashboards for 30+ field teams, slashing manual reporting time from 5 days to 3 hours.
Data Hygiene: Validated datasets and corrected 8,500+ inconsistencies so stakeholders could trust the insights.
Predictive Intelligence: Identified trends that made it possible to predict sewage discharge events up to 48 hours in advance, enabling proactive interventions.
Technical Mentor
Chosen among a select group of mentors to guide and shape the career paths of future tech professionals.
- Selected to serve as a Technical Mentor for the 2023 Cohort, contributing expertise in Python programming.
- Providing guidance, support, and insights to students, fostering their technical and professional growth.
- Collaborating with fellow mentors to create impactful learning experiences and workshops.
- Facilitating induction and training sessions to equip students with essential skills for success in the tech industry.
Data Playground: Real Problems, Real Code
South Africa Public Procurement Intelligence System 🇿🇦
Overview: An end‑to‑end machine learning system that transforms raw e‑tender data (OCDS format) into actionable intelligence for suppliers, procurement officials, and policy analysts.
What It Does: Features an interactive Streamlit Dashboard with real‑time ML predictions, a REST API with 10 endpoints for programmatic access, and an Automated Excel Reporter. The machine learning pipeline utilizes Random Forest, XGBoost, LightGBM, and an ensemble for contract value forecasting, alongside 7 governance red‑flag anomaly detectors.
"No Power BI required – the built‑in interactive dashboard and automated Excel reports fully meet all analytical needs."
Key Outcomes
- 🚀 Identified top opportunities with confidence scores
- 🚩 Flagged 37% of contracts with governance risks
- 💰 Delivered median‑anchored contract value predictions
- ⚙️ Automated scheduler for weekly retraining & updates
Impact
- Empowers SMEs to make data‑driven bidding decisions
- Helps detect procurement irregularities
- Provides analysts with transparent, reproducible intelligence
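A minimal sketch of how one value-based red-flag check could work. This is not the project's code, and the seven detectors are not described here. The sketch assumes a check that flags contract values far from the median, using the modified z-score (median and median absolute deviation). The function name, the 3.5 threshold, and the sample values are illustrative assumptions.

```python
from statistics import median


def flag_value_outliers(values, threshold=3.5):
    """Return a True/False flag per value using the modified z-score.

    The score is 0.6745 * |x - median| / MAD, where MAD is the median
    absolute deviation. Values scoring above `threshold` are flagged.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread to measure against, so nothing is flagged.
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


# Made-up contract values (hypothetical, not real tender data).
values = [100_000, 120_000, 95_000, 110_000, 105_000, 2_500_000]
flags = flag_value_outliers(values)

# Share of contracts flagged by this single check.
flagged_share = sum(flags) / len(flags)
print(flags, round(flagged_share, 2))
```

Median-based statistics are used in this sketch because a few very large contracts would distort a mean and standard deviation, which makes extreme values harder to spot.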
Predictive Crime Hotspot Modeling
Overview: Built a machine learning solution to forecast future crime counts at police station level, enabling proactive resource allocation and early identification of high-risk crime hotspots.
The models are trained on historical SAPS station-level data (2008–2023). I designed an end-to-end data pipeline, starting with SQL-based extraction and aggregation and followed by feature engineering. I then combined three models (Random Forest, XGBoost, and LightGBM) into an ensemble to capture temporal and spatial patterns.
"Achieved high predictive accuracy with MAE ~200 crimes per station-year, enabling actionable forecasting for real-world policing strategies."
Key Features
- Forecasted station-level crime with ML ensemble
- Engineered lag, rolling-average & clustering features
- Identified high-risk stations (top 20%)
- Generated 2024 forecasts for planning
Business Impact
- Enables proactive rather than reactive policing
- Efficient resource & patrol allocation
- Early identification of emerging hotspots
- Data-driven public safety strategy
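The lag and rolling-average features used in this project, and the MAE metric quoted for it, can be illustrated with a small standard-library sketch. It is not the project's pipeline. The function names, the one-year lag, the three-year window, and the sample counts are assumptions for illustration.

```python
from statistics import mean


def lag_and_rolling_features(counts, lag=1, window=3):
    """Build one feature row per year from a station's yearly crime counts.

    Each row uses only years strictly before the target year, so the
    features never include the value being predicted.
    """
    rows = []
    start = max(lag, window)
    for i in range(start, len(counts)):
        rows.append({
            "lag": counts[i - lag],                      # count `lag` years earlier
            "rolling_mean": mean(counts[i - window:i]),  # mean of the previous `window` years
            "target": counts[i],                         # value a model would learn to predict
        })
    return rows


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted counts."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


# Made-up yearly counts for one hypothetical station (not SAPS data).
counts = [100, 120, 110, 130, 150]
rows = lag_and_rolling_features(counts)

# Naive baseline: predict each year's count as the previous year's count.
baseline_mae = mean_absolute_error(
    [r["target"] for r in rows],
    [r["lag"] for r in rows],
)
print(rows, baseline_mae)
```

A naive last-year baseline like this gives a reference point: an ensemble's MAE is only informative when compared against what such a simple rule achieves on the same stations.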
IMS StepUp SA — Marketing Analytics Dashboard
End-to-end marketing analytics project for a South African footwear retailer. Built a 4-page executive dashboard in Power BI covering campaign ROI, store & product sales, customer segmentation and CLV, and website performance (GA4) — powered by Python data cleaning and DAX measures across 24 months of real business data.
Vegetable-Prices
Vegetable Price Analysis and Forecasting – analyzes historical vegetable prices across various regions to identify key trends and regional disparities using machine learning techniques.
Pizza-Sales
Interactive Pizza Sales Analysis dashboard built with Tableau.
Supermarket-sales
Analyzes 3-month sales data from 3 supermarket branches. Uses MySQL for data cleaning, Power BI for visualization, and predictive analytics to identify trends and forecast sales.
House-Prices
Predicts house sale prices using feature engineering, random forests, and gradient boosting.
Tools & Technologies
Data Querying
Data Science
Visualization
Cloud & Core
Verified Credentials
Continuous Learning
University of Fort Hare
Bachelor of Science (Computer Science & Applied Mathematics)
Coursework completed through September 2021. Achieved multiple distinctions, including 96% for Geometry and 81% for Computer Literacy.
WeThinkCode_
Software Engineering Programme | NQF Level 5
Intensive, peer-led learning focused on practical application. Final Score: 4.18.
ExploreAI Academy
Data Science Programme
Focused on applied data science, machine learning, and advanced analytics.
QCTO Statement of Results (118708)
Completed multiple Data Science modules (PM-02 to PM-10, WM-01 to WM-04).
Ready to find clarity in the chaos?
Let's turn your raw data into actionable insights. I am currently open to opportunities in South Africa (Hybrid/Remote/On-site).