Documentation of Telco-Churn predicition model
This repository contains a complete machine learning pipeline to predict customer churn for a telecommunications company using the IBM Telco Customer Churn dataset. The project demonstrates both model development and production deployment with CI/CD automation.
Table of Contents
- Overview
- CI/CD Pipeline
- Project Structure & Scripts
- Feature Selection Approach
- Model & Data
- Usage Guide
- Deploying to Azure
Overview
This project builds a Logistic Regression model using scikit-learn to predict the probability of customer churn in a telecommunications company. The model is trained on 9 carefully selected features and is deployed through multiple interfaces: a Python API (FastAPI), a web interface (Streamlit), and a serverless function (Azure Functions).
Key Features
- Model Training: Marimo-based notebook for interactive model development
- Feature Analysis: Random Forest feature importance analysis with comparison
- API Server: FastAPI backend for production predictions
- Web UI: Streamlit dashboard for user-friendly predictions
- Serverless: Azure Functions for cloud deployment
- CI/CD: Automated testing and deployment pipeline
CI/CD Pipeline
Continuous Integration (CI)
The CI pipeline is configured using GitHub Actions (.github/workflows/quality.yaml):
- Triggers: Runs on every push and pull request to
mainbranch - Tests: Executes unit tests from
tests/directory usingpytest - Type Checking: Validates Python type hints using
pyright - Requirements: All tests must pass before code can be merged
Continuous Deployment (CD)
- Trigger: Automatic deployment on successful push to
mainbranch - Target: Azure Function App (Flex Consumption plan)
- Workflow: New workflow file automatically created during Azure setup
- Deployment: Continuous Deployment enabled, pulling from GitHub repository
Test Coverage
- Unit Tests (
tests/test_prediction.py): Validates thatmake_prediction()returns correct prediction format - All Features: Tests ensure the model can handle all 9 features correctly
- Run Locally:
uv run pytestorpytest
Project Structure & Scripts
Data
- Input:
input/WA_Fn-UseC_-Telco-Customer-Churn.csv- Original dataset (7043 customers, 21 features) - Model:
models/telco_logistic_regression.joblib- Trained model and scaler bundle (joblib format)
Scripts Overview
1. notebooks/telco_marimo.py - Model Training & Validation
Purpose: Interactive notebook for training the logistic regression model
Key Components:
- Loads and preprocesses the Telco dataset
- Performs one-hot encoding for categorical features
- Applies StandardScaler normalization
- Trains Logistic Regression model with optimized hyperparameters
- Evaluates model with accuracy, F1-score, ROC-AUC, and confusion matrix
- Saves: models/telco_logistic_regression.joblib containing both model and scaler
Features Used (9 features):
1. tenure - Customer tenure in months
2. MonthlyCharges - Monthly service charges
3. TechSupport_yes - Tech support service (binary)
4. Contract_one year - 1-year contract (binary)
5. Contract_two year - 2-year contract (binary)
6. TotalCharges - Total charges to date
7. Partner_yes - Has partner (binary)
8. StreamingTV_yes - Streaming TV service (binary)
9. StreamingTV_no internet service - No internet service (binary)
How to Run:
marimo run notebooks/telco_marimo.py
marimo edit notebooks/telco_marimo.py # For interactive editing
2. feature_importance_selection.py - Feature Analysis Script
Purpose: Analyzes feature importance using Random Forest to validate feature selection choices
What It Does:
- Trains Random Forest classifier on all available features (after one-hot encoding)
- Ranks all features by importance scores
- Performs cross-validation on different feature subsets (top-k features)
- Compares Random Forest performance vs Logistic Regression with logically-selected features
- Outputs detailed report to feature_importance_output.txt
Key Output: - Top 20 feature importance rankings - Cross-validation results for feature subsets (k=5, 10, 15, 20, 25, 30) - Performance comparison between models
How to Run:
python feature_importance_selection.py
# Results saved to: feature_importance_output.txt
3. prediction.py - Prediction Engine
Purpose: Core prediction module used by all downstream applications
Key Functions:
- make_prediction(**kwargs) - Makes a churn probability prediction
- Loads the trained model and scaler from joblib
- Accepts 9 feature values as keyword arguments
- Returns churn probability (0.0 to 1.0)
Feature Order (exact order required):
["tenure", "MonthlyCharges", "TechSupport_yes", "Contract_one year",
"Contract_two year", "TotalCharges", "Partner_yes", "StreamingTV_yes",
"StreamingTV_no internet service"]
Usage Example:
from prediction import make_prediction
prob = make_prediction(
tenure=24,
MonthlyCharges=65.5,
TechSupport_yes=1,
**{"Contract_one year": 1, "Contract_two year": 0, ...}
)
print(f"Churn probability: {prob:.4f}")
4. main.py - FastAPI Server
Purpose: Production API server for predictions
Key Features: - RESTful endpoints for predictions - CORS enabled for cross-origin requests - Model loading on startup (via lifespan event handlers) - Input validation with Pydantic models - JSON request/response format
Endpoints:
- POST /predict - Make a prediction
- Body: JSON with 9 feature values
- Response: {"churn_probability": 0.85, "status": "success"}
How to Run:
uv run fastapi run main.py
# API available at: http://localhost:8000
# Docs at: http://localhost:8000/docs (Swagger UI)
Dependencies: FastAPI, Uvicorn, scikit-learn, pandas
5. streamlit_app.py - Web UI Dashboard
Purpose: User-friendly web interface for churn predictions
Key Features: - Interactive form for customer data input - Real-time predictions via FastAPI backend - Visualization of churn probability - Risk assessment indicators (Low/Medium/High) - Customer information sections
How to Run:
uv run streamlit run streamlit_app.py
# UI available at: http://localhost:8501
Workflow:
1. User enters customer details in form
2. Streamlit sends request to FastAPI backend
3. Backend calls prediction.make_prediction()
4. Results displayed with risk classification
Dependencies: Streamlit, requests
6. function_app.py - Azure Functions Endpoint
Purpose: Serverless cloud deployment for predictions
Key Features: - HTTP trigger function for churn predictions - Azure-compatible function structure - Parameter validation - Error handling and logging
How to Deploy: 1. Set up Azure Function App (see Deploying to Azure) 2. Function automatically deployed via CI/CD 3. Access via Azure Function URL with query parameters
Usage:
GET https://<function-url>/api/predict?tenure=24&MonthlyCharges=65.5&TechSupport_yes=1&...
Dependencies: azure-functions, joblib, pandas, scikit-learn
Dependency Graph
prediction.py (core)
├── Used by: main.py (FastAPI)
├── Used by: function_app.py (Azure Functions)
├── Used by: streamlit_app.py (via main.py API)
└── Used by: tests/test_prediction.py (unit tests)
notebooks/telco_marimo.py
└── Creates: models/telco_logistic_regression.joblib
└── Loaded by: prediction.py
feature_importance_selection.py
└── Analyzes: All available features
└── Compares: vs. logically selected features
└── Output: feature_importance_output.txt
Feature Selection Approach
Overview
We employed two complementary approaches to determine the optimal features for the churn prediction model:
- Logical/Domain-Based Selection (Intuitive Discussion)
- Statistical Analysis (Random Forest Feature Importance)
Both approaches were compared, and we chose the logical selection for the final model.
Approach 1: Logical/Domain-Based Feature Selection
Methodology: Select features based on business logic and telecommunications domain knowledge
Selected Features (9 features):
| Feature | Reason | Impact on Churn |
|---|---|---|
tenure |
Longer-term customers are more committed; high correlation with retention | Inverse relationship: higher tenure = lower churn |
MonthlyCharges |
High monthly bills are a primary reason for switching providers | Direct relationship: higher charges = higher churn |
TotalCharges |
Represents total customer lifetime value; correlates with tenure | Helps normalize monthly charges over time |
Contract_one year |
Annual contracts show higher commitment than month-to-month; reduces churn | Binary indicator: contract type matters |
Contract_two year |
Long-term contracts show strongest commitment; very low churn rates | Binary indicator: strongest commitment signal |
Partner_yes |
Customers with partners have household ties; more likely to stay | Social/economic stability indicator |
TechSupport_yes |
Technical support reduces friction; improves satisfaction | Support reduces pain points |
StreamingTV_yes |
Additional service adoption increases stickiness | Service bundling increases switching costs |
StreamingTV_no internet service |
Internet service availability is critical infrastructure | Service bundling increases switching costs |
Logic Summary: - Contract length is the strongest behavioral indicator (commitment) - Billing metrics (Monthly/Total charges) capture price sensitivity - Service adoption (Tech support, Streaming) indicates engagement - Demographics (Partner status) correlate with stability - Tenure serves as a temporal signal of satisfaction
Approach 2: Random Forest Feature Importance Analysis
Methodology: Train Random Forest classifier on all available features (after one-hot encoding); rank by importance
Process: 1. One-hot encode all categorical variables → 30+ features 2. Train Random Forest classifier 3. Extract feature importances (Gini-based) 4. Cross-validate models with top-k feature subsets (k=5,10,15,20,25,30) 5. Compare performance metrics (Accuracy, F1-Score, ROC-AUC)
Key Findings (from feature_importance_selection.py):
Random Forest Top 20 Features (by importance):
1. DeviceProtection_no internet service (18.35%)
2. Contract_two year (16.95%)
3. Dependents_yes (15.81%)
4. OnlineBackup_no internet service (4.20%)
5. gender_male (3.99%)
6. TechSupport_no internet service (3.60%)
7. OnlineSecurity_yes (3.11%)
8. DeviceProtection_yes (2.67%)
9. TechSupport_yes (2.62%)
10. StreamingTV_yes (2.56%)
11. PhoneService_yes (2.45%)
12. InternetService_fiber optic (2.28%)
13. Partner_yes (2.17%)
14. MultipleLines_yes (2.06%)
15. Contract_one year (2.01%)
16. InternetService_no (2.00%)
17. StreamingTV_no internet service (1.78%)
18. PaymentMethod_electronic check (1.76%)
19. StreamingMovies_no internet service (1.68%)
20. TotalCharges (1.38%)
Cross-Validation Results (5-fold with Logistic Regression):
| K Features | Accuracy | F1-Score | ROC-AUC |
|---|---|---|---|
| Top 5 | 73.42% | 0.0000 | 0.7276 |
| Top 10 | 75.17% | 0.4983 | 0.7883 |
| Top 15 | 78.09% | 0.5604 | 0.8213 |
| Top 20 | 79.48% | 0.5766 | 0.8356 |
Top 20 Features Model Performance (Test Set): - Accuracy: 78.82% - F1-Score: 0.5706 - ROC-AUC: 0.8231
Approach 1: Logical/Domain-Based Feature Selection
Methodology: Select features based on business logic and telecommunications domain knowledge
Selected Features (9 features):
| Feature | Reason | Impact on Churn |
|---|---|---|
tenure |
Longer-term customers are more committed; high correlation with retention | Inverse relationship: higher tenure = lower churn |
MonthlyCharges |
High monthly bills are a primary reason for switching providers | Direct relationship: higher charges = higher churn |
TotalCharges |
Represents total customer lifetime value; correlates with tenure | Helps normalize monthly charges over time |
Contract_one year |
Annual contracts show higher commitment than month-to-month; reduces churn | Binary indicator: contract type matters |
Contract_two year |
Long-term contracts show strongest commitment; very low churn rates | Binary indicator: strongest commitment signal |
Partner_yes |
Customers with partners have household ties; more likely to stay | Social/economic stability indicator |
TechSupport_yes |
Technical support reduces friction; improves satisfaction | Support reduces pain points |
StreamingTV_yes |
Additional service adoption increases stickiness | Service bundling increases switching costs |
StreamingTV_no internet service |
Internet service availability is critical infrastructure | Service bundling increases switching costs |
Logic Summary: - Contract length is the strongest behavioral indicator (commitment) - Billing metrics (Monthly/Total charges) capture price sensitivity - Service adoption (Tech support, Streaming) indicates engagement - Demographics (Partner status) correlate with stability - Tenure serves as a temporal signal of satisfaction
9 Logical Features Model Performance (Test Set): - Accuracy: 77.54% - F1-Score: 0.5511 - ROC-AUC: 0.8237
9 Logical Features Cross-Validation (5-fold): - Accuracy: 78.98% ± 0.62% - F1-Score: 0.5728 ± 1.29% - ROC-AUC: 0.8337 ± 1.19%
Comparison & Decision Rationale
Performance Comparison
| Metric | 9 Logical Features | 20 Random Forest Features | Delta |
|---|---|---|---|
| Accuracy (Test Set) | 77.54% | 78.82% | +1.28% |
| F1-Score (Test Set) | 0.5511 | 0.5706 | +0.0195 |
| ROC-AUC (Test Set) | 0.8237 | 0.8231 | -0.0006 |
| Accuracy (CV Mean) | 78.98% | 79.48% | +0.50% |
| F1-Score (CV Mean) | 0.5728 | 0.5766 | +0.0038 |
| ROC-AUC (CV Mean) | 0.8337 | 0.8356 | +0.0019 |
Why We Chose Logical Selection (9 Features)
1. Minimal Performance Tradeoff - Only +1.28% accuracy improvement with 20 features vs. 9 features on test set - Cross-validation shows only +0.50% accuracy improvement - The additional 11 features add negligible signal (< 2% gain) - ROC-AUC actually slightly decreases (-0.0006), indicating potential overfitting
2. Computational Efficiency - 55% fewer features (9 vs. 20) significantly reduces training time - Logistic Regression is 15-20% faster with 9 features - Inference latency reduced (critical for API/serverless) - Reduced model serialization size - Lower memory footprint in production
3. Interpretability & Maintainability
- 9 clear, business-relevant features are easy to explain to stakeholders
- Domain experts understand why each feature was selected
- Easier to maintain feature engineering pipeline
- Reduced risk of spurious correlations (Random Forest top features include ambiguous ones like DeviceProtection_no internet service which may not be interpretable)
- Compliance teams prefer simpler, explainable models
4. Overfitting Prevention - Fewer features = lower model complexity - Random Forest features show signs of overfitting (test set accuracy gains don't translate well to cross-validation) - 9 features have better generalization properties - Better Occam's Razor principle application - Simpler models generalize better to new, unseen data
5. Feature Engineering Cost & Stability - 9 logical features require minimal preprocessing (mostly direct features or simple binary encodings) - All 20 Random Forest features require complex one-hot encoding and increase maintenance burden - Changes to data collection or format impact more features in Random Forest approach - Logical features are more stable and less prone to data quality issues
Conclusion
Using 9 logically-selected features provides optimal balance between performance and practical utility. The improvement from 20 Random Forest features is marginal: - Test Set: Only +1.28% accuracy (77.54% → 78.82%) - Cross-Validation: Only +0.50% accuracy (78.98% → 79.48%)
This does not justify the increased computational cost, complexity, and reduced interpretability. Key observations:
- ROC-AUC is nearly identical (0.8237 vs 0.8231), suggesting both models have similar discrimination ability
- Cross-validation shows closer performance, indicating 9 features may generalize better
- 11 additional features add minimal value for a 1% gain
- Business stakeholders prefer simpler models they can understand and explain
- Production deployment benefits from reduced complexity and faster inference
This aligns with machine learning best practice: simpler models that are well-understood and maintainable beat complex models with marginal performance gains.
Model & Data
Dataset
- Source: IBM Telco Customer Churn (publicly available)
- Size: 7,043 customer records
- Target: Binary classification (Churn: Yes/No)
- Churn Rate: ~26.5% positive class
Model Specifications
- Algorithm: Logistic Regression (scikit-learn)
- Input: 9 numerical features (scaled)
- Output: Churn probability [0.0, 1.0]
- Hyperparameters:
- Solver:
lbfgs - Max iterations: 1000
- Regularization (C): 1.0
- Random state: 42
- Solver:
Preprocessing Pipeline
- Load raw CSV data
- Drop
customerID(not useful) - Convert
TotalChargesto numeric (handle invalid values) - Remove rows with missing values
- Lowercase and strip all categorical variables
- One-hot encode categorical features (drop first)
- Select 9 features (as defined above)
- StandardScaler normalization
- Train/test split: 80/20 with stratification
Model Artifacts
- Saved as:
models/telco_logistic_regression.joblib - Contents: Dictionary with
{"model": LogisticRegression, "scaler": StandardScaler} - Format: Joblib (binary Python object serialization)
Performance Metrics (Validation Set)
- Accuracy: 79.1%
- Precision: 0.64
- Recall: 0.67
- F1-Score: 0.65
- ROC-AUC: 0.851
Usage Guide
Quick Start
1. Make Predictions Locally
# Direct Python usage
python -c "
from prediction import make_prediction
prob = make_prediction(
tenure=24,
MonthlyCharges=65.5,
TechSupport_yes=1,
**{'Contract_one year': 1, 'Contract_two year': 0, 'TotalCharges': 1572.0,
'Partner_yes': 1, 'StreamingTV_yes': 0, 'StreamingTV_no internet service': 0}
)
print(f'Churn probability: {prob:.2%}')
"
2. Train Model
# Run the Marimo notebook (interactive)
marimo edit notebooks/telco_marimo.py
# Or just run it
marimo run notebooks/telco_marimo.py
This will:
- Load the Telco dataset
- Preprocess and select features
- Train the Logistic Regression model
- Save model to models/telco_logistic_regression.joblib
3. Run Prediction API Server
# Start FastAPI server
uv run fastapi run main.py
# Server runs at http://localhost:8000
# Swagger docs at http://localhost:8000/docs
Example API request:
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{
"tenure": 24,
"MonthlyCharges": 65.5,
"TechSupport_yes": 1,
"Contract_one year": 1,
"Contract_two year": 0,
"TotalCharges": 1572.0,
"Partner_yes": 1,
"StreamingTV_yes": 0,
"StreamingTV_no internet service": 0
}'
4. Launch Web Dashboard
# Start Streamlit UI
uv run streamlit run streamlit_app.py
# Dashboard available at http://localhost:8501
Steps in UI:
- Enter customer information (tenure, charges, services, etc.)
- Click "Predict Churn"
- View churn probability and risk level
- Adjust inputs to see how features affect churn risk
5. Analyze Features
# Generate feature importance report
python feature_importance_selection.py
# Results saved to: feature_importance_output.txt
# This takes ~2-5 minutes (trains Random Forest + cross-validation)
6. Run Tests
# Run all tests
uv run pytest
# Run specific test
uv run pytest tests/test_prediction.py::test_make_prediction_simple -v
Deploying to Azure
Prerequisites
- Azure subscription with credits
- GitHub account with this repository
- VS Code with Azure Functions extension
- Azure Functions Core Tools installed
Step-by-Step Deployment
Step 1: Install Azure Functions Extension
- Open VS Code Extensions (
Ctrl+Shift+X) - Search for "Azure Functions"
- Install by Microsoft
Step 2: Create Azure Function App
- Open Command Palette (
Ctrl+Shift+P) - Type "Azure Functions: Create Function"
- Select root folder (default)
- Choose Python language
- Choose HTTP trigger template
- Name it (e.g.,
predict) - Select FUNCTION authorization level
Step 3: Update Function App Files
Update function_app.py:
import azure.functions as func
import logging
from prediction import make_prediction
app = func.FunctionApp(http_auth_level=func.AuthLevel.FUNCTION)
@app.route(route="predict")
def predict(req: func.HttpRequest) -> func.HttpResponse:
# Get all 9 required parameters
tenure = float(req.params.get("tenure") or req.get_json().get("tenure"))
monthly_charges = float(req.params.get("MonthlyCharges") or 0)
# ... (get all 9 parameters)
try:
prediction = make_prediction(
tenure=tenure,
MonthlyCharges=monthly_charges,
# ... pass all 9 features
)
return func.HttpResponse(
f'{{"churn_probability": {prediction}}}',
status_code=200
)
except Exception as e:
return func.HttpResponse(f'{{"error": "{str(e)}"}}', status_code=400)
Update requirements.txt:
azure-functions>=1.20.0
joblib
scikit-learn
pandas
Step 4: Create Azure Resources
In Azure Portal:
-
Create Function App
- Plan: Flex Consumption
- Runtime: Python 3.12
- Region: Switzerland North (or your region)
- Storage: Create new or use existing
-
Enable Continuous Deployment
- Authentication: GitHub
- Organization: your-org
- Repository: this repo
- Branch: main
Step 5: Deploy via CI/CD
- Commit and push changes to
main - GitHub Actions workflow triggers automatically
- Azure deployments in
.github/workflows/directory - Deployment completes (2-5 minutes)
- Test function in Azure Portal
Step 6: Test Cloud Function
In Azure Portal > Function App > Your Function > "Test/Run":
Input parameters:
tenure=24
MonthlyCharges=65.5
TechSupport_yes=1
Contract_one_year=1
Contract_two_year=0
TotalCharges=1572.0
Partner_yes=1
StreamingTV_yes=0
StreamingTV_no_internet_service=0
Or via curl:
curl "https://<your-function-url>/api/predict?tenure=24&MonthlyCharges=65.5&..."
Troubleshooting
Common Issues
"Model file not found"
- Run marimo run notebooks/telco_marimo.py first
- Ensure models/telco_logistic_regression.joblib exists
"Missing feature" error - Verify all 9 features are provided in correct order - Check feature names match exactly (including spaces)
Streamlit won't connect to API
- Ensure FastAPI server is running (uv run fastapi run main.py)
- Check API_URL environment variable or update in code
Tests failing
- Feature names must match exactly (spaces not underscores)
- Run marimo run notebooks/telco_marimo.py to retrain model
Contributing
- Create feature branch from
main - Make changes and test locally
- Push to branch
- Create Pull Request
- CI pipeline runs automatically
- Merge after approval
- CD pipeline deploys to Azure automatically