Madhavan Mukund



Data Mining and Machine Learning

Aug-Nov 2017


Administrative details

  • Teaching assistants: Akash Kumar, Siddarth R, Nisarg Patel

  • Evaluation:

    • Assignments 40%, midsemester exam 20%, final exam 40%

    • Copying is fatal

  • Textbook and reading material

    • Main textbooks:

      • Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data by Bing Liu.

      • Introduction to Information Retrieval by Christopher Manning, Prabhakar Raghavan and Hinrich Schütze.

    • Supplementary material:

      • Data Mining: Concepts and Techniques by Jiawei Han and Micheline Kamber.

      • Machine Learning by Tom Mitchell.

      • C4.5: Programs for Machine Learning by Ross Quinlan.

      • Artificial Intelligence: A Modern Approach by Stuart J Russell and Peter Norvig.

      • Foundations of Data Science by Avrim Blum, John Hopcroft and Ravindran Kannan.


Course plan

Here is a tentative list of topics.

  • Pattern mining: Frequent itemsets, association rules.

  • Supervised learning: Decision trees, naive Bayes, SVM, classifier evaluation, expectation maximization, ensemble classifiers.

  • Unsupervised learning: Clustering, outlier detection.

  • Text mining: Basic ideas from information retrieval, the tf-idf model, PageRank, HITS.

  • Other topics: Probabilistic graphical models, Bayesian networks, Markov models, neural networks, ranking and social choice, …



Lecture summary

  • Lecture 1, 08 Aug 2017:

    Frequent itemsets, the Apriori algorithm

    • Liu, Chapter 2.1 and 2.2.1 (part)
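
    A minimal sketch of the level-wise Apriori idea on a toy transaction
    list (the transactions and minsup value are illustrative, not from the
    lectures):

      from itertools import combinations

      # Toy transactions; each is a set of items.
      transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"},
                      {"b", "c"}, {"a", "b", "c"}]
      minsup = 3  # absolute minimum support count

      def support(itemset):
          return sum(1 for t in transactions if itemset <= t)

      # Level 1: frequent individual items.
      items = {i for t in transactions for i in t}
      frequent = [{frozenset([i]) for i in items
                   if support(frozenset([i])) >= minsup}]

      # Level k: join frequent (k-1)-itemsets, prune candidates with an
      # infrequent subset (the Apriori property), then count support.
      k = 2
      while frequent[-1]:
          prev = frequent[-1]
          candidates = {a | b for a in prev for b in prev if len(a | b) == k}
          candidates = {c for c in candidates
                        if all(frozenset(s) in prev
                               for s in combinations(c, k - 1))}
          frequent.append({c for c in candidates if support(c) >= minsup})
          k += 1

      for level in frequent:
          for s in level:
              print(sorted(s), support(s))
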
  • Lecture 2, 10 Aug 2017:

    Apriori algorithm, association rule generation, tabular data, multiple minimum supports, class association rules

    • Liu, Chapter 2.2–2.5
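
    Rule generation splits each frequent itemset into antecedent and
    consequent and keeps the splits that meet a confidence threshold; a
    small self-contained sketch on the same toy transactions as above
    (minconf is illustrative):

      from itertools import combinations

      transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"},
                      {"b", "c"}, {"a", "b", "c"}]
      minconf = 0.7  # illustrative minimum confidence

      def support(itemset):
          return sum(1 for t in transactions if itemset <= t)

      def rules_from(itemset):
          """Emit X -> Y for every nonempty proper subset X of itemset."""
          for r in range(1, len(itemset)):
              for lhs in combinations(itemset, r):
                  lhs = frozenset(lhs)
                  conf = support(itemset) / support(lhs)
                  if conf >= minconf:
                      print(sorted(lhs), "->", sorted(itemset - lhs),
                            f"conf={conf:.2f}")

      rules_from(frozenset({"a", "b"}))
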
  • Lecture 3, 17 Aug 2017:

    Decision trees

    • Liu, Chapter 3.2
    • Mitchell, Chapter 3
    • Quinlan, Chapters 1,2
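
    The split criterion behind ID3/C4.5-style trees is information gain;
    a minimal sketch on a toy table (the data is illustrative):

      from collections import Counter
      from math import log2

      def entropy(labels):
          """H(S) = -sum of p_i * log2(p_i) over class proportions."""
          n = len(labels)
          return -sum((c / n) * log2(c / n)
                      for c in Counter(labels).values())

      def information_gain(rows, labels, attr):
          """Gain(S, A) = H(S) - weighted entropy of the partition by A."""
          n = len(labels)
          split = {}
          for row, y in zip(rows, labels):
              split.setdefault(row[attr], []).append(y)
          rem = sum(len(part) / n * entropy(part) for part in split.values())
          return entropy(labels) - rem

      rows = [{"outlook": "sunny"}, {"outlook": "sunny"},
              {"outlook": "rain"}, {"outlook": "rain"}]
      labels = ["no", "no", "yes", "yes"]
      print(information_gain(rows, labels, "outlook"))  # 1.0: a perfect split
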
  • Lecture 4, 22 Aug 2017:

    Discretizing continuous attributes

    • Liu, Chapter 3.2.3
    • Mitchell, Chapter 3.7.2
    • Quinlan, Chapter 2.4

    Overfitting and tree pruning

    • Liu, Chapter 3.2.4
    • Mitchell, Chapter 3.7.1
    • Quinlan, Chapter 4

    Classifier evaluation

    • Liu, Chapter 3.3
    • Manning, Raghavan and Schütze, Chapter 8.3
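
    The evaluation measures reduce to counts of true and false positives
    and negatives; a minimal sketch of per-class precision, recall and F1
    (the predictions are illustrative):

      def prf(y_true, y_pred, positive):
          """Precision, recall and F1 for one class, from first principles."""
          pairs = list(zip(y_true, y_pred))
          tp = sum(t == positive and p == positive for t, p in pairs)
          fp = sum(t != positive and p == positive for t, p in pairs)
          fn = sum(t == positive and p != positive for t, p in pairs)
          precision = tp / (tp + fp) if tp + fp else 0.0
          recall = tp / (tp + fn) if tp + fn else 0.0
          f1 = (2 * precision * recall / (precision + recall)
                if precision + recall else 0.0)
          return precision, recall, f1

      print(prf(["+", "+", "-", "-", "+"],
                ["+", "-", "-", "+", "+"], positive="+"))
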
  • Lecture 5, 24 Aug 2017:

    Naive Bayes classifiers

    • Liu, Chapter 3.6

    Generative probabilistic models and parameter estimation, naive Bayes text classification
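
    A minimal sketch of multinomial naive Bayes for text, with add-one
    (Laplace) smoothing; the corpus and labels are illustrative:

      from collections import Counter, defaultdict
      from math import log

      docs = [("spam", "buy cheap pills"), ("spam", "cheap cheap deal"),
              ("ham", "meeting at noon"), ("ham", "lunch meeting today")]

      # Parameter estimation: class priors and per-class word counts.
      class_count = Counter(c for c, _ in docs)
      word_counts = defaultdict(Counter)
      for c, text in docs:
          word_counts[c].update(text.split())
      vocab = {w for c in word_counts for w in word_counts[c]}

      def classify(text):
          """argmax over classes of log P(c) + sum of log P(w|c)."""
          scores = {}
          for c in class_count:
              total = sum(word_counts[c].values())
              score = log(class_count[c] / len(docs))
              for w in text.split():
                  score += log((word_counts[c][w] + 1) / (total + len(vocab)))
              scores[c] = score
          return max(scores, key=scores.get)

      print(classify("cheap pills deal"))   # spam
      print(classify("meeting for lunch"))  # ham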

  • Lecture 6, 29 Aug 2017:

    Support vector machines (SVMs), the linearly separable case
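
    In the linearly separable case, a separator (w, b) is feasible when
    every example satisfies y_i (w . x_i + b) >= 1, and the margin it
    achieves is 2/||w||; a small sketch checking this for a hand-picked
    separator (the data and separator are illustrative):

      from math import sqrt

      w, b = (1.0, -1.0), 0.0
      data = [((2.0, 0.5), 1), ((3.0, 1.0), 1),
              ((0.5, 2.0), -1), ((1.0, 3.0), -1)]

      ok = all(y * (w[0] * x[0] + w[1] * x[1] + b) >= 1 for x, y in data)
      print("constraints satisfied:", ok)
      print("margin width:", 2 / sqrt(w[0] ** 2 + w[1] ** 2))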

  • Lecture 7, 31 Aug 2017:

    SVMs with soft margins, kernel functions
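
    A kernel function evaluates an inner product in a feature space
    without constructing the feature vectors; a minimal sketch of the
    Gaussian (RBF) kernel and its Gram matrix (the points and gamma are
    illustrative):

      from math import exp

      def rbf_kernel(x, z, gamma=0.5):
          """K(x, z) = exp(-gamma * ||x - z||^2)."""
          return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

      points = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
      gram = [[rbf_kernel(x, z) for z in points] for x in points]
      for row in gram:
          print([f"{v:.3f}" for v in row])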

  • Lecture 8, 5 Sep 2017:

    A formal setting for machine learning, online learning, Perceptron algorithm, VC-dimension

    • Blum, Hopcroft and Kannan, Chapter 5.1, 5.5, 5.6, 5.9
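
    A minimal sketch of the online Perceptron update (the toy data is
    illustrative; a constant 1 in each x encodes the bias term):

      def perceptron(examples, epochs=20):
          """On each mistake (y * w.x <= 0), update w to w + y*x."""
          w = [0.0] * len(examples[0][0])
          for _ in range(epochs):
              mistakes = 0
              for x, y in examples:
                  if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                      w = [wi + y * xi for wi, xi in zip(w, x)]
                      mistakes += 1
              if mistakes == 0:  # separates the data: stop
                  break
          return w

      data = [((1.0, 2.0, 1.0), 1), ((1.0, 0.5, 2.0), -1),
              ((1.0, 3.0, 1.0), 1), ((1.0, 1.0, 3.0), -1)]
      print(perceptron(data))
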
  • Lecture 9, 7 Sep 2017:

    True error and sample error, sample size vs overfitting, VC-dimension, ensemble classifiers: bagging and boosting

    • Liu, Chapter 3.10
    • Blum, Hopcroft and Kannan, Chapter 5.1, 5.2, 5.5, 5.9, 5.10
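
    A minimal sketch of bagging with decision stumps as the base learner
    (1-D toy data with one noisy label; all values illustrative):

      import random

      def train_stump(data):
          """Pick the threshold and orientation with best training accuracy."""
          best = None
          for thr, _ in data:
              for sign in (1, -1):
                  acc = sum((sign if x > thr else -sign) == y
                            for x, y in data)
                  if best is None or acc > best[0]:
                      best = (acc, thr, sign)
          _, thr, sign = best
          return lambda x: sign if x > thr else -sign

      def bagged(data, rounds=25, seed=0):
          """Train each stump on a bootstrap sample; majority vote."""
          rng = random.Random(seed)
          stumps = [train_stump([rng.choice(data) for _ in data])
                    for _ in range(rounds)]
          return lambda x: 1 if sum(s(x) for s in stumps) > 0 else -1

      data = [(0.1, -1), (0.2, -1), (0.3, 1), (0.4, -1),
              (0.6, 1), (0.8, 1), (0.9, 1)]
      clf = bagged(data)
      print([clf(x) for x, _ in data])
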
  • Lecture 10, 12 Sep 2017:

    Clustering: k-means, hierarchical

    • Liu, Chapter 4.1–4.4
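
    A minimal sketch of Lloyd's algorithm for k-means (the points and
    seed are illustrative):

      import random

      def kmeans(points, k, iters=100, seed=0):
          """Alternate: assign points to nearest centre, recompute centres."""
          rng = random.Random(seed)
          centres = rng.sample(points, k)
          for _ in range(iters):
              clusters = [[] for _ in range(k)]
              for p in points:
                  j = min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centres[i])))
                  clusters[j].append(p)
              new = [tuple(sum(c) / len(c) for c in zip(*cl)) if cl
                     else centres[i] for i, cl in enumerate(clusters)]
              if new == centres:  # assignments stable: converged
                  break
              centres = new
          return centres

      pts = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
             (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
      print(kmeans(pts, 2))
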
  • Lecture 11, 19 Sep 2017:

    Density-based clustering

    Density-based local outlier detection

    Semi-supervised learning: Expectation-Maximization

  • Lecture 12, 21 Sep 2017:

    Convergence of the EM algorithm

    EM for text classification
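
    A deliberately stripped-down EM sketch, on a two-component 1-D
    Gaussian mixture with fixed unit variance rather than text (the data
    and initialisation are illustrative):

      from math import exp, pi, sqrt

      def em_gmm(xs, iters=50):
          mu = [min(xs), max(xs)]  # crude initialisation
          w = [0.5, 0.5]
          gauss = lambda x, m: exp(-0.5 * (x - m) ** 2) / sqrt(2 * pi)
          for _ in range(iters):
              # E-step: posterior responsibility of each component.
              resp = []
              for x in xs:
                  p = [w[k] * gauss(x, mu[k]) for k in range(2)]
                  s = p[0] + p[1]
                  resp.append([p[0] / s, p[1] / s])
              # M-step: re-estimate weights and means from soft counts.
              for k in range(2):
                  n_k = sum(r[k] for r in resp)
                  w[k] = n_k / len(xs)
                  mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / n_k
          return w, mu

      xs = [0.1, -0.2, 0.3, 4.9, 5.1, 5.3]  # two clear groups
      print(em_gmm(xs))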

  • Reading material for 03–12 Oct 2017:

    Regression
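
    For simple linear regression, ordinary least squares has a closed
    form: a = cov(x, y) / var(x) and b = mean(y) - a * mean(x); a small
    sketch on made-up data:

      xs = [1.0, 2.0, 3.0, 4.0]
      ys = [2.1, 3.9, 6.2, 7.8]

      mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
      a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
           / sum((x - mx) ** 2 for x in xs))
      b = my - a * mx
      print(f"y = {a:.2f}x + {b:.2f}")  # close to y = 2x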

  • Lecture 13, 17 Oct 2017:

    Boolean information retrieval: documents, terms, postings

    Tokenization, stop words, stemming and lemmatization, skip lists, positional postings and phrase queries

    Parametric and zone indexes, weighted zone scoring, tf-idf, scoring in the vector space model.

    • Manning, Raghavan and Schütze, Chapter 1, 2.2–2.4, 6.1–6.3
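
    A minimal sketch of tf-idf weighting in the (1 + log tf) * log(N/df)
    variant, on a three-document toy corpus (the documents are
    illustrative):

      from math import log10

      docs = ["new home sales rise", "home sales fall", "new tax law"]
      tokenized = [d.split() for d in docs]
      N = len(tokenized)

      def tf_idf(term, doc):
          tf = doc.count(term)
          df = sum(term in d for d in tokenized)
          return (1 + log10(tf)) * log10(N / df) if tf else 0.0

      for term in ("home", "tax"):
          print(term, [round(tf_idf(term, d), 3) for d in tokenized])
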
  • Lecture 14, 19 Oct 2017:

    Tf-idf and variants, PageRank

    • Liu, 7.3
    • Manning, Raghavan and Schütze, Chapter 6.3, 6.4.1, 6.4.2, 21.1, 21.2
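
    A minimal power-iteration sketch of PageRank (toy link graph; every
    page here has at least one out-link, so dangling nodes are ignored):

      def pagerank(links, d=0.85, iters=50):
          """PR(p) = (1-d)/N + d * sum over q linking to p of PR(q)/out(q)."""
          pages = list(links)
          n = len(pages)
          pr = {p: 1 / n for p in pages}
          for _ in range(iters):
              pr = {p: (1 - d) / n
                       + d * sum(pr[q] / len(links[q])
                                 for q in pages if p in links[q])
                    for p in pages}
          return pr

      links = {"A": {"B", "C"}, "B": {"C"}, "C": {"A"}}
      print({p: round(v, 3) for p, v in pagerank(links).items()})
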
  • Lecture 15, 24 Oct 2017:

    PageRank, HITS, Latent Semantic Indexing

    • Liu, 7.3, 7.4, 6.7
    • Manning, Raghavan and Schütze, Chapter 21.2, 21.3, 18.1–18.4
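
    A companion sketch of HITS on the same style of toy graph: hub and
    authority scores reinforce each other and are normalised each round
    (the graph is illustrative):

      def hits(links, iters=50):
          pages = list(links)
          hub = {p: 1.0 for p in pages}
          auth = {p: 1.0 for p in pages}
          for _ in range(iters):
              auth = {p: sum(hub[q] for q in pages if p in links[q])
                      for p in pages}
              hub = {p: sum(auth[q] for q in links[p]) for p in pages}
              for d in (auth, hub):
                  s = sum(d.values())
                  for p in d:
                      d[p] /= s
          return hub, auth

      links = {"A": {"B", "C"}, "B": {"C"}, "C": {"A"}}
      hub, auth = hits(links)
      print({p: round(auth[p], 3) for p in auth})
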
  • Lecture 16, 31 Oct 2017:

    Bayesian networks: basic definitions, semantics, exact inference

  • Lecture 17, 02 Nov 2017:

    Bayesian networks: Conditional independence, D-separation

  • Lecture 18, 07 Nov 2017:

    Bayesian networks: exact inference, approximate inference, sampling

    • Russell and Norvig, Chapter 14.4, 14.5
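
    A minimal sketch of exact inference by enumeration on the standard
    sprinkler network (the CPT numbers are illustrative):

      from itertools import product

      # Cloudy -> Sprinkler, Cloudy -> Rain, (Sprinkler, Rain) -> WetGrass.
      P_c = 0.5
      P_s = {True: 0.1, False: 0.5}   # P(Sprinkler=t | Cloudy)
      P_r = {True: 0.8, False: 0.2}   # P(Rain=t | Cloudy)
      P_w = {(True, True): 0.99, (True, False): 0.9,
             (False, True): 0.9, (False, False): 0.0}  # P(Wet=t | S, R)

      def joint(c, s, r, w):
          """Chain-rule factorisation along the network structure."""
          pc = P_c if c else 1 - P_c
          ps = P_s[c] if s else 1 - P_s[c]
          pr = P_r[c] if r else 1 - P_r[c]
          pw = P_w[(s, r)] if w else 1 - P_w[(s, r)]
          return pc * ps * pr * pw

      # P(Rain = true | WetGrass = true) by summing out Cloudy, Sprinkler.
      tf = (True, False)
      num = sum(joint(c, s, True, True) for c, s in product(tf, tf))
      den = sum(joint(c, s, r, True) for c, s, r in product(tf, tf, tf))
      print(num / den)
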
  • Lecture 19, 09 Nov 2017:

    Temporal models: inference (most likely explanation, Viterbi algorithm), Hidden Markov Models (HMMs)

    • Russell and Norvig, Chapter 15.1, 15.2, 15.3 (overview)
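
    A minimal Viterbi sketch on an umbrella-world style HMM (all
    probabilities illustrative):

      def viterbi(obs, states, start, trans, emit):
          """V_t(s) = emit[s][o_t] * max over r of V_{t-1}(r) * trans[r][s]."""
          V = [{s: start[s] * emit[s][obs[0]] for s in states}]
          back = []
          for o in obs[1:]:
              prev, col, ptr = V[-1], {}, {}
              for s in states:
                  r = max(states, key=lambda q: prev[q] * trans[q][s])
                  col[s] = prev[r] * trans[r][s] * emit[s][o]
                  ptr[s] = r
              V.append(col)
              back.append(ptr)
          path = [max(states, key=lambda s: V[-1][s])]
          for ptr in reversed(back):  # trace back from the best final state
              path.append(ptr[path[-1]])
          return list(reversed(path))

      states = ("rain", "dry")
      start = {"rain": 0.5, "dry": 0.5}
      trans = {"rain": {"rain": 0.7, "dry": 0.3},
               "dry": {"rain": 0.3, "dry": 0.7}}
      emit = {"rain": {"umbrella": 0.9, "none": 0.1},
              "dry": {"umbrella": 0.2, "none": 0.8}}
      print(viterbi(("umbrella", "umbrella", "none"),
                    states, start, trans, emit))
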
  • Lecture 20, 21 Nov 2017:

    Neural networks: Multilayer perceptrons, sigmoid neurons, network architecture, learning weights, the cross-entropy cost function, overfitting and regularization

  • Lecture 21, 23 Nov 2017:

    Neural networks: Backpropagation, the unstable gradient problem, convolutional networks, deep learning
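
    A minimal backpropagation sketch: a 2-3-1 sigmoid network learning
    XOR with squared-error loss (architecture, seed, learning rate and
    epoch count are all illustrative):

      import math, random

      rng = random.Random(0)
      H = 3  # hidden units
      W1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
      b1 = [0.0] * H
      W2 = [rng.uniform(-1, 1) for _ in range(H)]
      b2 = 0.0
      sig = lambda z: 1 / (1 + math.exp(-z))

      def forward(x):
          h = [sig(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j])
               for j in range(H)]
          return h, sig(sum(w * hj for w, hj in zip(W2, h)) + b2)

      data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
      lr = 0.5
      for _ in range(20000):
          for x, t in data:
              h, y = forward(x)
              dy = (y - t) * y * (1 - y)  # output delta (sigmoid')
              for j in range(H):
                  dh = dy * W2[j] * h[j] * (1 - h[j])  # hidden delta
                  W2[j] -= lr * dy * h[j]
                  W1[j][0] -= lr * dh * x[0]
                  W1[j][1] -= lr * dh * x[1]
                  b1[j] -= lr * dh
              b2 -= lr * dy

      for x, t in data:
          print(x, t, round(forward(x)[1], 2))  # typically close to the targets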