The Harsh Truth About Data Analytics Portfolios
Last month, I reviewed 500 data analytics portfolios for my hiring team.
Result: 487 were immediately rejected.
Not because the candidates lacked skills, but because their portfolios failed to showcase them properly.
This post shows what the 13 portfolios that made it through had in common, and exactly how to build one that stands out.
What Hiring Managers Actually Look For (In Order)
1. Can They Solve Real Problems? (40%)
Show business impact, not just technical skills.
2. Can They Communicate? (30%)
Your README and presentation matter more than your code.
3. Are They Technically Competent? (20%)
Clean code, proper tools, best practices.
4. Do They Show Initiative? (10%)
Unique projects, continuous learning, community involvement.
The 5 Projects EVERY Portfolio Must Have
Project 1: Business Dashboard
Why It Matters:
- 90% of data analyst roles involve dashboards
- Shows you understand business KPIs
- Demonstrates visualization skills
What to Include:
Business Problem: E-commerce company needs to monitor daily sales performance
Dataset: Kaggle E-Commerce Sales Data
Tools: Tableau Public / Power BI
KPIs Tracked:
- Total Revenue
- Order Count
- Average Order Value
- Revenue by Category
- Top 10 Products
- Sales Trend (daily)
- Customer Segmentation
Insights Found:
1. 30% of revenue comes from just 3 products
2. Weekend sales are 40% lower than weekdays
3. Mobile traffic has poor conversion (18% vs 42% desktop)
Recommendations:
1. Increase marketing spend on top 3 products
2. Weekend promotion campaigns needed
3. Optimize mobile checkout process
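Before you build the dashboard, it helps to compute the headline KPIs in code so you can sanity-check what Tableau or Power BI shows. Here is a minimal sketch using only the Python standard library. The order lines and the `compute_kpis` helper are made up for illustration; in a real project you would load the actual dataset, most likely with pandas.

```python
from collections import defaultdict

# Hypothetical order lines, invented for illustration only:
# (order_id, product, category, line_revenue)
order_lines = [
    (1, "Laptop", "Electronics", 1200.0),
    (1, "Mouse", "Electronics", 25.0),
    (2, "Desk", "Furniture", 300.0),
    (3, "Laptop", "Electronics", 1200.0),
    (4, "Chair", "Furniture", 150.0),
    (4, "Mouse", "Electronics", 25.0),
]


def compute_kpis(lines):
    """Return headline KPIs for a non-empty list of order lines."""
    total_revenue = sum(revenue for _, _, _, revenue in lines)
    # An order can span several lines, so count distinct order ids.
    order_count = len({order_id for order_id, _, _, _ in lines})

    revenue_by_category = defaultdict(float)
    revenue_by_product = defaultdict(float)
    for _, product, category, revenue in lines:
        revenue_by_category[category] += revenue
        revenue_by_product[product] += revenue

    # Products sorted by revenue, highest first, capped at ten.
    top_products = sorted(
        revenue_by_product.items(), key=lambda item: item[1], reverse=True
    )[:10]

    return {
        "total_revenue": total_revenue,
        "order_count": order_count,
        "average_order_value": total_revenue / order_count,
        "revenue_by_category": dict(revenue_by_category),
        "top_products": top_products,
    }


kpis = compute_kpis(order_lines)
print(kpis["total_revenue"], kpis["order_count"], kpis["average_order_value"])
```

If the numbers in your dashboard don't match a check like this, you have found a join or filter problem before a hiring manager does.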
Example Datasets:
- Superstore Dataset
- Kaggle E-Commerce Data
- Adventure Works
Template README:
# E-Commerce Sales Dashboard
## Problem Statement
[Company] needed a real-time dashboard to monitor...
## Data Source
- Kaggle E-Commerce Dataset (500K orders, 2019-2023)
- Cleaned in Python (removed 5% null values)
## Tools Used
- Python (pandas, numpy) for data cleaning
- Tableau Public for visualization
- SQL for initial exploration
## Key Insights
1. [Insight with business impact]
2. [Insight with business impact]
3. [Insight with business impact]
## Interactive Dashboard
[Link to Tableau Public/Power BI]
## Screenshots
[Include 3-5 screenshots with captions]
## Skills Demonstrated
- Data cleaning & transformation
- KPI selection
- Dashboard design
- Business storytelling
Project 2: Exploratory Data Analysis (EDA)
Why It Matters:
- Shows statistical thinking
- Proves you can find patterns
- Demonstrates hypothesis testing
Structure:
Research Question: What factors influence employee attrition?
Dataset: IBM HR Analytics Dataset
Methodology:
1. Data Cleaning (handling missing values, outliers)
2. Univariate Analysis (distributions)
3. Bivariate Analysis (correlations)
4. Multivariate Analysis (complex relationships)
5. Statistical Testing (t-tests, chi-square)
Key Findings:
1. Employees with <2 years tenure have 47% attrition rate
2. Overtime workers are 3.2x more likely to leave
3. Job satisfaction score <2 predicts 68% of attrition
Statistical Evidence:
- Chi-square test: p < 0.001 (overtime vs attrition)
- T-test: Significant difference in satisfaction scores (p=0.003)
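If you quote a chi-square result like the one above, be ready to explain what it computes. The sketch below works through a Pearson chi-square test by hand for a 2x2 overtime vs attrition table. The counts are invented for illustration (they are not taken from the IBM dataset), and `chi_square_2x2` is a hypothetical helper, not a library function.

```python
import math

# Hypothetical 2x2 contingency table (counts invented for illustration):
#             left  stayed
observed = [
    [90, 210],   # works overtime
    [60, 640],   # no overtime
]


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    No continuity correction is applied. Returns (statistic, p_value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Count expected in this cell if overtime and attrition were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, where the upper-tail
    # probability has the closed form erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


statistic, p_value = chi_square_2x2(observed)
overtime_rate = observed[0][0] / sum(observed[0])
no_overtime_rate = observed[1][0] / sum(observed[1])
print(f"chi2 = {statistic:.1f}, p = {p_value:.2g}")
print(f"attrition: {overtime_rate:.0%} with overtime vs {no_overtime_rate:.0%} without")
```

In a notebook you would normally call `scipy.stats.chi2_contingency` instead. Note that it applies a continuity correction to 2x2 tables by default, so its statistic comes out slightly lower than this uncorrected version.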
Example Datasets:
- IBM HR Analytics
- Titanic Dataset
- World Happiness Report
Code Structure:
# 1. Imports and Setup
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# 2. Data Loading
df = pd.read_csv('data.csv')
# 3. Data Cleaning
# Document every decision
# 4. EDA
# Visualizations with interpretations
# 5. Statistical Tests
# With clear conclusions
# 6. Summary
# Key takeaways and limitations
Project 3: Predictive Model
Why It Matters:
- Growing requirement (even for analysts)
- Shows you understand ML concepts
- Demonstrates end-to-end skills
Project Structure:
Problem: Predict customer churn for subscription service
Dataset: Telco Customer Churn
Approach:
1. EDA and Feature Engineering
2. Train-Test Split
3. Model Selection (Logistic Regression, Random Forest, XGBoost)
4. Model Evaluation (accuracy, precision, recall, F1, ROC-AUC)
5. Feature Importance Analysis
6. Business Recommendations
Results:
- Best Model: XGBoost (AUC = 0.89)
- Top 3 Predictors: Contract type, tenure, monthly charges
- Model identified 85% of churners correctly (recall = 0.85)
Business Impact:
- Targeting high-risk customers could save $1.2M annually
- Focus retention efforts on month-to-month contract holders
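Interviewers often ask what the evaluation metrics in step 4 actually mean, so it is worth being able to derive them from a confusion matrix yourself. Here is a small sketch in plain Python. The labels are invented for illustration and `classification_metrics` is a hypothetical helper; scikit-learn's `sklearn.metrics` functions are what you would typically use in the project itself.

```python
def classification_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly left alone

    accuracy = (tp + tn) / len(pairs)
    # Precision: of the customers flagged as churners, how many really churned?
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of the customers who really churned, how many did we flag?
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "tp": tp, "fn": fn, "fp": fp, "tn": tn,
        "accuracy": accuracy, "precision": precision,
        "recall": recall, "f1": f1,
    }


# Hypothetical ground truth and predictions for ten customers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

A statement like "the model identified 85% of churners" is a recall figure. For churn, recall usually matters more than accuracy because most customers don't churn, so a model that predicts "stays" for everyone can still look accurate.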
Example Datasets:
- Telco Customer Churn
- Credit Card Default
- House Prices
Code Structure:
# 1. Problem Definition
# 2. Data Loading and Exploration
# 3. Data Preprocessing
# - Handle missing values
# - Encode categorical variables
# - Scale numerical features
# 4. Feature Engineering
# 5. Model Training
# - Multiple algorithms
# - Cross-validation
# 6. Model Evaluation
# - Confusion matrix
# - ROC curve
# - Feature importance
# 7. Insights and Recommendations
Project 4: SQL Analysis Project
Why It Matters:
- SQL is non-negotiable (78% of jobs)
- Shows database thinking
- Demonstrates query optimization
Project Example:
Business Problem: Analyze customer purchase patterns to optimize inventory
Dataset: E-Commerce Database (4 tables)
- Customers (100K rows)
- Orders (500K rows)
- Order_Items (1.5M rows)
- Products (5K rows)
SQL Skills Demonstrated:
1. Complex JOINs (3+ tables)
2. Window Functions (ROW_NUMBER, RANK)
3. CTEs (Common Table Expressions)
4. Subqueries
5. Aggregations with GROUP BY
6. Date Functions
Sample Queries:
-- Find top 10 customers by lifetime value
WITH customer_totals AS (
    SELECT customer_id, SUM(amount) AS ltv
    FROM orders
    GROUP BY customer_id
)
SELECT c.customer_name, ct.ltv
FROM customers c
JOIN customer_totals ct ON c.customer_id = ct.customer_id
ORDER BY ct.ltv DESC
LIMIT 10;
-- Calculate 30-day retention rate: the share of customers who order again
-- within 30 days of their first purchase (PostgreSQL interval syntax)
WITH first_purchase AS (
    SELECT customer_id, MIN(order_date) AS first_date
    FROM orders
    GROUP BY customer_id
),
repeat_purchase AS (
    SELECT f.customer_id
    FROM first_purchase f
    JOIN orders o ON f.customer_id = o.customer_id
    WHERE o.order_date > f.first_date
      AND o.order_date <= f.first_date + INTERVAL '30 days'
)
SELECT
    COUNT(DISTINCT r.customer_id) * 100.0 / COUNT(DISTINCT f.customer_id) AS retention_rate
FROM first_purchase f
LEFT JOIN repeat_purchase r ON f.customer_id = r.customer_id;
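A quick way to prove a query like the retention one is correct is to run it against a tiny table where you already know the answer. The sketch below does that with Python's built-in `sqlite3` module. The rows are invented for illustration, and because SQLite has no `INTERVAL` syntax, the date arithmetic is swapped for SQLite's `date(..., '+30 days')`; the rest of the logic follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL)"
)
# Hypothetical orders with a known outcome: 2 of 4 customers are retained.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2023-01-01", 50.0),
        (1, "2023-01-15", 30.0),  # repeat within 30 days: retained
        (2, "2023-01-05", 20.0),
        (2, "2023-03-01", 40.0),  # repeat after 30 days: not retained
        (3, "2023-02-01", 75.0),  # single purchase: not retained
        (4, "2023-02-10", 10.0),
        (4, "2023-02-10", 15.0),  # same-day order is not a repeat
        (4, "2023-03-05", 60.0),  # 23 days later: retained
    ],
)

RETENTION_SQL = """
WITH first_purchase AS (
    SELECT customer_id, MIN(order_date) AS first_date
    FROM orders
    GROUP BY customer_id
),
repeat_purchase AS (
    SELECT f.customer_id
    FROM first_purchase f
    JOIN orders o ON f.customer_id = o.customer_id
    WHERE o.order_date > f.first_date
      AND o.order_date <= date(f.first_date, '+30 days')
)
SELECT COUNT(DISTINCT r.customer_id) * 100.0 / COUNT(DISTINCT f.customer_id)
FROM first_purchase f
LEFT JOIN repeat_purchase r ON f.customer_id = r.customer_id;
"""

retention_rate = conn.execute(RETENTION_SQL).fetchone()[0]
print(f"30-day retention rate: {retention_rate:.1f}%")
```

Including a small test like this in your repo shows reviewers that you validate your SQL instead of trusting the first number it returns.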
Insights:
1. 30-day retention rate: 32%
2. Top 10% customers generate 65% of revenue
3. Average time between purchases: 23 days
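An insight like "average time between purchases" is a natural place to show off a window function. Here is a minimal sketch using `LAG` through Python's built-in `sqlite3` module. The rows are invented for illustration, `julianday` is SQLite-specific date arithmetic, and window functions need SQLite 3.25 or newer (which recent Python builds bundle).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT)")
# Hypothetical orders: gaps of 10, 20, and 30 days across two customers.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "2023-01-01"),
        (1, "2023-01-11"),  # 10 days after the previous order
        (1, "2023-01-31"),  # 20 days after the previous order
        (2, "2023-02-01"),
        (2, "2023-03-03"),  # 30 days after the previous order
        (3, "2023-02-15"),  # single order, so no gap to measure
    ],
)

GAP_SQL = """
WITH gaps AS (
    SELECT
        customer_id,
        julianday(order_date)
            - julianday(LAG(order_date) OVER (
                  PARTITION BY customer_id
                  ORDER BY order_date
              )) AS days_since_previous
    FROM orders
)
-- AVG skips the NULL produced for each customer's first order.
SELECT AVG(days_since_previous) FROM gaps;
"""

avg_gap_days = conn.execute(GAP_SQL).fetchone()[0]
print(f"Average time between purchases: {avg_gap_days:.1f} days")
```

`LAG` looks at the previous row within each customer's partition, so every order is compared with that same customer's prior order and never with someone else's.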
How to Present:
- Create GitHub repo with .sql files
- Include schema diagram
- Write detailed README with business context
- Show query results as tables or visualizations
Project 5: Unique/Passion Project
Why It Matters:
- Shows personality and creativity
- Demonstrates self-driven learning
- Memorable (makes you stand out)
Examples That Worked:
1. Sports Analytics: “Which NBA position has evolved most since 1990?” (Python + viz)
2. Social Media Analysis: “Analyzing 10K Reddit posts about data careers” (NLP + Python)
3. Personal Finance: “I tracked every dollar I spent for 2 years” (Dashboard + insights)
4. Gaming: “Optimizing Pokémon team selection with data” (Python + ML)
5. Music: “What makes a Spotify hit in 2024?” (Python + Spotify API)
Keys to Success:
- Choose something YOU care about
- Make it data-driven (not just opinions)
- Show complete workflow (data → insights → viz)
- Make it interactive if possible
Portfolio Hosting: Where and How
Option 1: GitHub Pages (FREE - RECOMMENDED)
Pros:
- Free hosting
- Version control
- Shows coding skills
- Professional URL
How to Set Up:
1. Create a GitHub account
2. Create a repository named username.github.io
3. Add index.html or use Jekyll/Hugo/Quarto
4. Push your projects
5. Your site is live at https://username.github.io
What to Include:
/
├── index.html (landing page)
├── projects/
│ ├── sales-dashboard/
│ ├── customer-churn/
│ └── sql-analysis/
├── about.html
└── resume.pdf
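If you want to start from that layout instead of an empty folder, a few lines of Python can scaffold it. This is only a sketch: `scaffold_portfolio` is a hypothetical helper, the placeholder HTML is made up, and the demo writes into a temporary directory so it doesn't touch your real repo. It doesn't create resume.pdf; drop your own file in.

```python
import tempfile
from pathlib import Path

# Project folder names taken from the layout above.
PROJECTS = ["sales-dashboard", "customer-churn", "sql-analysis"]


def scaffold_portfolio(root):
    """Create the folder layout with placeholder pages; existing files are kept."""
    root = Path(root)
    for name in PROJECTS:
        (root / "projects" / name).mkdir(parents=True, exist_ok=True)
    for page in ("index.html", "about.html"):
        path = root / page
        if not path.exists():
            path.write_text(
                f"<!doctype html>\n<title>{page}</title>\n<p>TODO</p>\n",
                encoding="utf-8",
            )
    # Return every created path relative to the root, for a quick check.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_portfolio(Path(tmp) / "username.github.io")

for entry in created:
    print(entry)
```

To use it for real, call `scaffold_portfolio` with the path to your cloned username.github.io repository, then commit and push.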
FREE Templates:
- Quarto - What I use for this site!
- Jekyll
- Hugo
Option 2: Notion (QUICKEST)
Pros:
- No coding required
- Beautiful templates
- Easy to update
- Mobile-friendly
Structure:
Home Page
├── About Me
├── Skills & Tools
├── Projects
│ ├── Project 1 (with embedded visuals)
│ ├── Project 2
│ └── Project 3
└── Contact
Template:
- Notion Portfolio Template
Option 3: Personal Website
For Non-Coders:
- Wix (Free tier)
- WordPress.com (Free tier)
- Carrd (Simple, free)
For Coders:
- Quarto - Static site generator
- Streamlit - Python dashboards
- Dash - Python dashboards
Portfolio Structure That Works
Landing Page (Must-Haves):
Hero Section:
- Professional photo
- Name + Title ("Data Analyst")
- One-sentence value proposition
- Links: LinkedIn, GitHub, Email
About Section:
- 3-4 sentences about background
- Key skills (Python, SQL, Tableau, etc.)
- What you're looking for
Projects Section:
- 5-6 featured projects
- Thumbnail + title + 2-sentence description
- "View Project" button
Contact:
- Email
- LinkedIn
- GitHub
- Optional: Calendar link for coffee chats
Individual Project Page Structure:
# Project Title
## Problem Statement
What business problem are you solving?
## Data Source
Where did you get the data? How much? Any limitations?
## Tools & Technologies
- Python (pandas, scikit-learn)
- SQL (PostgreSQL)
- Tableau Public
## Methodology
Step-by-step what you did (high level)
## Key Insights
1-3 data-driven findings with business impact
## Visualizations
Include 3-5 key charts/dashboards
## Code
Link to GitHub repo
## Challenges & Learnings
What was hard? What did you learn?
## Skills Demonstrated
- Data cleaning
- Statistical analysis
- Machine learning
- Data storytelling
Portfolio Red Flags (Auto-Rejection)
❌ Only Tutorial Projects
✅ Add your unique spin or business context
❌ No README or Documentation
✅ Every project needs clear documentation
❌ Sloppy Code (no comments, poor naming)
✅ Clean, readable, commented code
❌ No Visuals (walls of code only)
✅ Include charts, dashboards, screenshots
❌ Broken Links
✅ Test every link before sharing
❌ Generic Insights (“Sales increased over time”)
✅ Specific, actionable insights
❌ No Business Context
✅ Every project needs a “why does this matter?” section
FREE Resources to Build Your Portfolio
Datasets:
- Kaggle Datasets - Thousands of datasets
- Data.gov - US government data
- UCI ML Repository - Classic datasets
- FiveThirtyEight Data - News-worthy data
- Our World in Data - Global data
- TidyTuesday - Weekly datasets
Inspiration:
- Kaggle Notebooks - See what others build
- Tableau Public Gallery
- GitHub Data Science Projects
- Awesome Data Science
The 8-Week Portfolio Build Plan
Week 1-2: Foundation
- Set up GitHub account
- Learn Git basics
- Choose portfolio platform
- Create landing page
Week 3-4: Projects 1-2
- Dashboard project (Tableau/Power BI)
- EDA project (Python/R)
- Document everything
Week 5-6: Projects 3-4
- Predictive modeling project
- SQL analysis project
- Write READMEs
Week 7: Project 5
- Passion project
- Make it unique and memorable
Week 8: Polish
- Review all documentation
- Test all links
- Get feedback from 3 people
- Make final improvements
- LAUNCH! 🚀
Portfolio Examples That Got People Hired
Example 1: Career Switcher
- 3 projects (dashboard, EDA, SQL)
- Clear documentation
- Resume shows “self-taught” story
- Result: Hired at Series B startup, $75K
Example 2: Recent Graduate
- 5 projects (mix of school + personal)
- Active GitHub (weekly commits)
- Blog posts explaining projects
- Result: Data analyst at Fortune 500, $80K
Example 3: Professional Pivot
- 4 industry-specific projects (healthcare)
- Showed domain expertise + new technical skills
- Video explanations of each project
- Result: Senior analyst at health tech, $95K
Interview Preparation
Be ready to:
1. Walk through each project (5-minute presentation)
2. Explain your decisions (why this method vs. that)
3. Discuss challenges (what went wrong, how you fixed it)
4. Show business impact (not just technical details)
5. Demonstrate passion (why this project specifically?)
Practice Script:
"For this project, I analyzed e-commerce data to help a company understand their customer churn.
The dataset had 100K customers over 3 years. I started by cleaning the data - there were 5% missing values which I handled by [method].
I then explored the data and found that customers who didn't make a purchase in 90 days had a 70% chance of never returning. This was the key insight.
I built a simple predictive model using Python and scikit-learn, achieving 85% accuracy in predicting churn.
The business recommendation was to implement a 60-day re-engagement campaign, which could potentially save $500K in lost revenue annually.
The hardest part was feature engineering - I had to create recency, frequency, and monetary value features which required careful date calculations.
You can see the full analysis on my GitHub, and I'd be happy to walk through any part in more detail."
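If you mention recency, frequency, and monetary (RFM) features in an interview, expect a follow-up on how you built them. Here is a minimal standard-library sketch. The orders, the `as_of` date, and the `rfm_features` helper are all invented for illustration; the 90-day cutoff mirrors the inactivity window in the script above.

```python
from datetime import date

# Hypothetical orders: (customer_id, order_date, amount)
orders = [
    ("A", date(2023, 1, 5), 40.0),
    ("A", date(2023, 3, 1), 60.0),
    ("B", date(2022, 11, 20), 25.0),
    ("C", date(2023, 3, 20), 10.0),
    ("C", date(2023, 3, 25), 15.0),
    ("C", date(2023, 3, 30), 20.0),
]


def rfm_features(orders, as_of):
    """Build recency (days), frequency (order count), and monetary (total spend)."""
    features = {}
    for customer_id, order_date, amount in orders:
        row = features.setdefault(
            customer_id,
            {"last_order": order_date, "frequency": 0, "monetary": 0.0},
        )
        row["last_order"] = max(row["last_order"], order_date)
        row["frequency"] += 1
        row["monetary"] += amount

    # Recency is measured from each customer's most recent order to the snapshot date.
    for row in features.values():
        row["recency_days"] = (as_of - row.pop("last_order")).days
    return features


rfm = rfm_features(orders, as_of=date(2023, 3, 31))

# Customers with no purchase in the last 90 days.
lapsed = [cid for cid, row in rfm.items() if row["recency_days"] > 90]

for customer_id, row in rfm.items():
    print(customer_id, row)
print("No purchase in 90 days:", lapsed)
```

The "careful date calculations" part is mostly about picking a fixed snapshot date. Compute recency against `as_of`, not against today's date, so your features don't change every time you rerun the notebook.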
Take Action Today
Your homework (90 minutes):
- Set up GitHub account (15 min)
- Choose a dataset (15 min)
- Outline your first project (30 min)
- Create a simple landing page (30 min)
Share your progress! Tweet or post on LinkedIn with #DataAnalytics
Final Thoughts
Your portfolio is your best interview preparation. Every project teaches you something new and gives you stories to tell.
Don’t wait for perfection. Ship your first project this week.
Remember: 13 out of 500 portfolios got interviews. Be one of the 13.