Data Mining

This course covers core data mining topics. Students learn the underlying theory and the algorithms built on it, and they also complete a data mining project. All assignments and the project are submitted via kulino.

Course Contents:

  1. Introduction to Data Mining: KDD Process, Attributes vs Label, Types of attributes, Data normalization, Univariate vs multivariate, Similarity/Dissimilarity, Supervised vs Unsupervised
  2. Preprocessing: Outlier removal and handling missing values
  3. Discretization Techniques: Statistical approach, ChiMerge
  4. Classification: K-Nearest Neighbor, Naive Bayes, Decision Tree
  5. Prediction: Correlation, Linear regression
  6. Feature Reduction: Filter, Wrapper, Transformation, Correlation-based, Chi-square method, and Singular Value Decomposition
  7. Clustering: K-means, Hierarchical Agglomerative Clustering
  8. Ensemble Methods: Bagging and boosting
  9. Imbalanced Dataset: RUS, ROS, SMOTE
  10. Association Rules: Apriori Algorithm
  11. Recommendation Systems: Borda, Copeland
  12. Statistical Performances: Friedman test, Nemenyi, Bonferroni-Dunn

Textbooks:

  1. Oded Maimon, Lior Rokach, Data Mining and Knowledge Discovery Handbook, 2010
  2. Jiawei Han, Micheline Kamber, Jian Pei, Data Mining: Concepts and Techniques, 3rd Edition, 2011
  3. Margaret H. Dunham, Data Mining – Introductory and Advanced Topics, Prentice Hall, 2003
  4. Mohammed J. Zaki, Wagner Meira Jr., Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge, 2014
