Data mining
Undergraduate · Statistics
Syllabus focus
Standard syllabus · STEM / applied
Topics typically covered
Standard syllabus
Data preparation
- Train/validation/test splits
- Feature engineering and encoding
- Handling missing values and outliers
- Dimensionality reduction with principal component analysis (PCA)
- Class imbalance strategies
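As a rough illustration of the first topic in this list, a train/validation/test split can be sketched in plain Python. The function name `train_val_test_split`, the 60/20/20 proportions, and the fixed seed are assumptions chosen for the example and are not prescribed by the syllabus.

```python
import random


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows once, then cut them into train, validation, and test lists.

    Hypothetical helper for illustration; the fractions and seed are assumptions.
    """
    if not 0 <= val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be in [0, 1)")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_test = round(n * test_frac)
    n_val = round(n * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

The test set is held back until the end; the validation set is the one used for comparing models and tuning settings.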
Supervised learning
- Classification and regression trees
- Ensemble methods: bagging and random forests
- Boosting (AdaBoost, gradient boosting intro)
- k-nearest neighbors and naive Bayes
- Model evaluation: ROC, AUC, and confusion matrices
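The evaluation topic above can be made concrete with a small sketch that builds a confusion matrix at one threshold and computes AUC as the fraction of positive/negative pairs ranked correctly. The helper names, the toy labels and scores, and the 0.5 threshold are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            counts["tp"] += 1
        elif t == 0 and p == 1:
            counts["fp"] += 1
        elif t == 1 and p == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of rows, so it is only meant for small teaching examples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for the example).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold of 0.5 is an assumption

print(confusion_counts(y_true, y_pred))  # {'tp': 1, 'fp': 0, 'fn': 1, 'tn': 2}
print(roc_auc(y_true, scores))           # 0.75
```

The confusion matrix depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds.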
Unsupervised learning
- Cluster analysis for market segmentation
- Association rules and market basket analysis
- Anomaly detection (introduction)
- Text mining basics (optional)
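For the association rules topic, the three standard rule measures (support, confidence, lift) can be computed directly from a list of baskets. The transactions and function names below are made up for the sketch.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's baseline support; 1.0 means independence."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)


# Hypothetical market baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

rule = ({"diapers"}, {"beer"})
print(round(support(baskets, rule[0] | rule[1]), 3))  # 0.4
print(round(confidence(baskets, *rule), 3))           # 0.667
print(round(lift(baskets, *rule), 3))                 # 1.111
```

A lift above 1 suggests the two item sets appear together more often than independence would predict, though with five baskets the estimate is only a toy.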
STEM / applied
Applied mining projects
- End-to-end Kaggle-style projects
- scikit-learn and tidy modeling pipelines
- Hyperparameter tuning and cross-validation
- Deploying models as simple APIs (intro)
- Ethics: fairness, bias, and explainability
- Big data tools overview: Spark (optional)
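The hyperparameter tuning and cross-validation topic can be sketched without any library: split the data into k folds, score each candidate setting on the held-out folds, and keep the best. The 1-D nearest-neighbor classifier, the toy data, and all function names here are assumptions for illustration; in coursework a library such as scikit-learn would normally handle these steps.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n)."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size


def knn_predict(train_x, train_y, x, n_neighbors):
    """Majority vote among the n_neighbors training points closest to x (1-D)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:n_neighbors]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cv_accuracy(xs, ys, n_neighbors, k=5):
    """Mean accuracy over k folds for a given n_neighbors."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        tx = [xs[i] for i in train_idx]
        ty = [ys[i] for i in train_idx]
        correct += sum(knn_predict(tx, ty, xs[i], n_neighbors) == ys[i] for i in test_idx)
    return correct / len(xs)


# Toy, well-separated data (assumed); classes alternate so every fold sees both.
xs = [0.0, 10.0, 1.0, 11.0, 2.0, 12.0, 3.0, 13.0, 4.0, 14.0]
ys = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# Grid search: evaluate each candidate and keep the one with the best CV accuracy.
candidates = [1, 3, 5]
best = max(candidates, key=lambda m: cv_accuracy(xs, ys, m))
print(best, cv_accuracy(xs, ys, best))
```

Contiguous folds are used here for simplicity; real data is usually shuffled (and often stratified by class) before folding.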
Additional applied practice
- Reviewing assumptions with domain experts
- Documenting analysis choices for reproducibility
- Sensitivity analyses for key modeling decisions
- Connecting results to the original research or business question
Notes
Undergraduate data mining courses bridge statistics and computer science. They emphasize predictive accuracy and model interpretation rather than deep learning theory.