Production-style, decision-driven projects focused on business impact, ROI-aware modeling, and real-world constraints.
AI / NLP
FEATURED
CyberSec Severity Intelligence Engine
NVD CVE → TF-IDF + SecBERT → Severity Prediction · Live Streamlit App
End-to-end NLP pipeline that predicts CVE severity (LOW/MEDIUM/HIGH/CRITICAL) using TF-IDF bigrams + XGBoost and a fine-tuned SecBERT model. Achieves 0.81 F1 on CRITICAL — the highest-stakes class — with SHAP token-level explainability and an LLM cost-benchmark comparison.
Python · SecBERT · XGBoost · SHAP · Streamlit · SMOTE
View on GitHub
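A per-class F1 such as the CRITICAL score quoted above is the harmonic mean of that class's precision and recall. The sketch below is a minimal illustration on made-up labels; the helper name `class_f1` and the data are assumptions, not code or results from the repository.

```python
def class_f1(y_true, y_pred, label):
    """F1 for a single class, treating it as the positive label (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision/recall undefined or zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (illustrative only)
y_true = ["CRITICAL", "HIGH", "CRITICAL", "LOW", "CRITICAL", "MEDIUM"]
y_pred = ["CRITICAL", "CRITICAL", "CRITICAL", "LOW", "HIGH", "MEDIUM"]

critical_f1 = class_f1(y_true, y_pred, "CRITICAL")
low_f1 = class_f1(y_true, y_pred, "LOW")
```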
Machine Learning
FEATURED
Fraud Triage Engine
Temporal splitting → calibrated ensemble → capacity-constrained triage policy
Production-aware fraud detection system that reframes detection as a capacity-constrained ranking problem. Uses a calibrated LightGBM + XGBoost ensemble with temporal splits, cost-sensitive thresholding, SHAP explainability, and concept drift monitoring — achieving 81.3% fraud recall at 500 daily reviews with 70% cost reduction.
Python · LightGBM · XGBoost · SHAP · Isotonic Calibration · Drift Detection
View on GitHub
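Capacity-constrained triage can be sketched as ranking cases by a priority score and reviewing only as many as the team can handle. The example below assumes the score is expected loss (calibrated fraud probability times transaction amount); that scoring rule, the field names, and the toy data are illustrative assumptions, not the project's actual policy.

```python
def triage(cases, capacity):
    """Return the `capacity` cases with the highest expected loss."""
    ranked = sorted(cases, key=lambda c: c["p_fraud"] * c["amount"], reverse=True)
    return ranked[:capacity]


def recall_at_capacity(cases, capacity):
    """Share of all fraud cases that land inside the review queue."""
    total_fraud = sum(c["is_fraud"] for c in cases)
    caught = sum(c["is_fraud"] for c in triage(cases, capacity))
    return caught / total_fraud if total_fraud else 0.0


# Toy cases (illustrative only)
cases = [
    {"id": "a", "p_fraud": 0.90, "amount": 50.0, "is_fraud": 1},
    {"id": "b", "p_fraud": 0.20, "amount": 1000.0, "is_fraud": 1},
    {"id": "c", "p_fraud": 0.05, "amount": 100.0, "is_fraud": 0},
    {"id": "d", "p_fraud": 0.60, "amount": 20.0, "is_fraud": 0},
    {"id": "e", "p_fraud": 0.30, "amount": 10.0, "is_fraud": 1},
]

queue_ids = [c["id"] for c in triage(cases, capacity=2)]
recall = recall_at_capacity(cases, capacity=2)
```

Ranking by expected loss rather than raw probability puts the large, moderately suspicious transaction ahead of the small, highly suspicious one, which is the kind of trade-off a fixed review budget forces.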
BI / Decision
FEATURED
Telecom Churn Decision Engine
Churn risk + uplift + CLV → MILP budget optimisation → Monte Carlo simulation
Profit-optimised retention system that goes beyond churn scoring by combining survival-adjusted CLV, T-learner uplift modeling, and 0/1 knapsack MILP allocation. Achieves 8.15× ROI at a £10k budget with correlated Monte Carlo risk simulation across 2,000 scenarios, outperforming highest-churn-first targeting by 37.7% in mean profit.
Python · LightGBM · PuLP MILP · Uplift Modeling · Monte Carlo · CLV
View on GitHub
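The budget allocation step is a 0/1 knapsack: each customer is either targeted or not, and total spend must stay within budget. The project solves this as a MILP with PuLP; the sketch below swaps in a plain dynamic-programming solver so it runs without dependencies. The function name, the integer costs, and the `expected_gain` values are hypothetical.

```python
def allocate_budget(customers, budget):
    """0/1 knapsack: maximise total expected gain subject to a spend limit.

    customers: list of (customer_id, cost, expected_gain) with integer costs.
    Returns (best_total_gain, sorted list of chosen customer ids).
    """
    # best[b] = (gain, chosen ids) achievable with total cost <= b
    best = [(0.0, ())] * (budget + 1)
    for cid, cost, gain in customers:
        if gain <= 0:
            continue  # never pay to target a customer with no expected gain
        # Walk budgets downward so each customer is selected at most once
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + gain
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + (cid,))
    total, chosen = best[budget]
    return total, sorted(chosen)


# Toy inputs (illustrative only)
customers = [
    ("c1", 40, 120.0),
    ("c2", 30, 100.0),
    ("c3", 30, 90.0),
    ("c4", 60, 150.0),
]
total_gain, chosen = allocate_budget(customers, budget=100)
```

In this toy case a greedy "largest gain first" pick would take c4 and c1 for a gain of 270, while the knapsack solution takes c1, c2, and c3 for 310.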
BI / Decision
FEATURED
Campaign ROI Optimisation Engine
Bank marketing → profit-ranked triage → Streamlit decision dashboard
Answers the real operational question for a Portuguese bank's outbound campaigns: given N calls per day, which customers maximise expected profit? Uses an Optuna-tuned Random Forest (ROC-AUC 0.81) with SHAP-driven feature analysis, achieving 77% conversion precision in the top 1,000 contacts (6.8× better than random), with an interactive profit simulation dashboard.
Python · Random Forest · Optuna · SHAP · Streamlit · Profit Simulation
View on GitHub
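Precision in the top N contacts and its lift over random calling can be computed as below. This is a minimal sketch that assumes "better than random" means precision in the top N divided by the overall conversion rate; the function names and toy data are illustrative, not taken from the repository.

```python
def precision_at_k(scores, outcomes, k):
    """Conversion rate among the k highest-scored customers."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(outcomes[i] for i in order[:k]) / k


def lift_over_random(scores, outcomes, k):
    """Precision@k relative to the base conversion rate of the whole list."""
    base_rate = sum(outcomes) / len(outcomes)
    return precision_at_k(scores, outcomes, k) / base_rate


# Toy model scores and observed conversions (illustrative only)
scores = [0.90, 0.80, 0.70, 0.40, 0.30, 0.20, 0.10, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0]

p_at_2 = precision_at_k(scores, outcomes, k=2)
lift_at_2 = lift_over_random(scores, outcomes, k=2)
```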
Forecasting
FEATURED
SaaS Revenue Intelligence & Forecasting Engine
Retention + usage analytics → revenue intelligence
Subscription analytics pipeline that turns churn/usage signals into revenue intelligence for forecasting and retention-focused decisions.
Python · Cohorts · Forecasting · Analytics
View on GitHub
Forecasting
Rossmann Sales Forecasting
Time-series forecasting with strong baselines + validation
Forecasts total daily sales with proper time-based validation, trend/seasonality diagnostics, and promo impact analysis.
Python · Time Series · SARIMA · Forecasting
View on GitHub
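Time-based validation means every test block sits strictly after the data used to fit the model. The sketch below shows one common scheme, expanding-window splits; the function name, fold count, and horizon are assumptions for illustration and may differ from the project's setup.

```python
def expanding_window_splits(n_obs, n_folds, horizon):
    """Yield (train_idx, test_idx) ranges; each test block follows its training data."""
    first_test_start = n_obs - n_folds * horizon
    if first_test_start <= 0:
        raise ValueError("not enough observations for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * horizon
        # Training window grows with each fold; test window has fixed length
        yield range(0, test_start), range(test_start, test_start + horizon)


# 10 daily observations, 3 folds, 2-day forecast horizon
splits = [(list(tr), list(te)) for tr, te in expanding_window_splits(10, 3, 2)]
```

Unlike a shuffled K-fold, no fold ever trains on observations that come after its test period, so the validation error is not inflated by leakage from the future.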
Data Engineering
Job Market Intelligence Pipeline
Trend detection on job postings (rising vs falling skills)
End-to-end pipeline that ingests job postings, normalizes skills, and tracks momentum over time for labor market intelligence.
Python · DuckDB · NLP · Trend Analytics
View on GitHub
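One simple way to express skill momentum is the change in a skill's share of postings between an earlier and a more recent window. The sketch below assumes that definition; the function name, the period encoding, and the toy postings are hypothetical and not drawn from the pipeline itself.

```python
from collections import Counter


def skill_momentum(postings, split_period):
    """Change in each skill's posting share: recent window minus earlier window.

    postings: list of (period, set_of_normalized_skills).
    Periods below `split_period` form the earlier window.
    """
    earlier = [skills for period, skills in postings if period < split_period]
    recent = [skills for period, skills in postings if period >= split_period]
    if not earlier or not recent:
        raise ValueError("both windows need at least one posting")

    def share(window):
        counts = Counter(skill for skills in window for skill in skills)
        return {skill: n / len(window) for skill, n in counts.items()}

    before, after = share(earlier), share(recent)
    return {s: after.get(s, 0.0) - before.get(s, 0.0) for s in set(before) | set(after)}


# Toy postings (illustrative only)
postings = [
    (1, {"sql", "excel"}),
    (1, {"sql"}),
    (2, {"sql", "python"}),
    (2, {"python"}),
]
momentum = skill_momentum(postings, split_period=2)
```

Positive values mark rising skills and negative values mark falling ones.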
Machine Learning
Digital Finance Platform for Logistics
Risk scoring + pricing + decisioning for invoice payouts
Mini decisioning + pricing engine that predicts late-payment risk, approves instant pay, and generates a risk-based fee quote.
Python · Risk Modeling · Profit Simulation · Decisioning
View on GitHub
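The approve-then-price flow can be sketched as a threshold on predicted late-payment risk followed by a fee that grows with that risk. Every number and name below (`base_fee_rate`, `risk_multiplier`, `max_p_late`) is a hypothetical parameter chosen for illustration, not the engine's actual pricing rule.

```python
def quote_instant_pay(invoice_amount, p_late,
                      base_fee_rate=0.01, risk_multiplier=0.05, max_p_late=0.30):
    """Decide on instant pay and, if approved, quote a risk-based fee."""
    if not 0.0 <= p_late <= 1.0:
        raise ValueError("p_late must be a probability")
    if p_late > max_p_late:
        # Too risky to advance the payout at any fee in this toy policy
        return {"approved": False, "fee": None}
    fee_rate = base_fee_rate + risk_multiplier * p_late
    return {"approved": True, "fee": round(invoice_amount * fee_rate, 2)}


low_risk = quote_instant_pay(1000.0, p_late=0.10)
high_risk = quote_instant_pay(1000.0, p_late=0.50)
```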
More
More Projects & Case Studies
Full repository list + ongoing builds
I'm continuously shipping decision-focused systems and analytics tools. Explore all repositories and recent updates on GitHub.
In Progress · Case Studies · Docs
View All Repos