Teaching assistants: Alok Dhar Dubey, Ankan Kar, Rohit Roy
Evaluation:
Assignments 30-40%, quizzes and midsemester exam 20-30%, final exam 40%
Copying is fatal
Course outline (tentative)
Supervised learning: Association rules, regression, decision trees, naive Bayes, SVM, classifier evaluation, expectation maximization, ensemble classifiers.
Unsupervised learning: Clustering, outlier detection, dimensionality reduction.
Text mining: Basic ideas from information retrieval, TF/IDF model, PageRank, HITS
Other topics (if time permits): Probabilistic graphical models, Bayesian networks, Markov models, neural networks, ranking and social choice, …
Text and reference books:
Web Data Mining by Bing Liu, 2nd edition, Springer (2011).
Foundations of Data Science by Avrim Blum, John Hopcroft and Ravi Kannan
Machine Learning by Tom Mitchell.
C4.5: Programs for Machine Learning by Ross Quinlan.
Artificial Intelligence: A Modern Approach by Stuart J Russell and Peter Norvig, 3rd edition, Pearson (2016).
Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow by Aurélien Géron, 3rd edition, O'Reilly (2022)
Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto, MIT Press, 2nd ed (2018)
Lecture 1: 7 Jan 2025
(Class Notes (pdf))
Introduction to supervised and unsupervised learning
Lecture 2: 16 Jan 2025
(Class Notes (pdf))
Market-basket analysis, frequent itemsets, Apriori algorithm
Lecture 3: 21 Jan 2025
(Class Notes (pdf),
Lecture Slides (pdf))
Association rules, class association rules
Supervised learning, decision trees, impurity
Lecture 4: 23 Jan 2025
(Class Notes (pdf),
Lecture Slides (pdf))
Decision trees: impurity measures (entropy, Gini index), information gain ratio, handling numeric attributes
Evaluating classifiers: training/test sets, confusion matrix
Lecture 5: 28 Jan 2025
(Class Notes (pdf))
Decision Trees in Python
Lecture 6: 30 Jan 2025
(Class Notes (pdf),
Lecture Slides (pdf))
Linear Regression: loss functions, normal equation, gradient descent
Lecture 7: 4 Feb 2025
(Class Notes (pdf),
Slides (pdf))
Linear Regression: gradient descent, probabilistic justification for SSE loss
Polynomial regression, regularization, the non-polynomial case
Lecture 8: 6 Feb 2025
(Class Notes (pdf),
Slides (pdf))
Regression using decision trees
Handling overfitting in decision trees
Regression for Classification
Lecture 9: 11 Feb 2025
(Class Notes (pdf))
Scaling regression inputs
Linear, polynomial and logistic regression in Python
Regression Trees in Python
Lecture 10: 13 Feb 2025
(Class Notes (pdf),
Slides (pdf))
Naïve Bayesian classifiers
Naïve Bayes text classification
20 Feb 2025
Lecture 11: 25 Feb 2025
(Class Notes (pdf),
Slides (pdf))
Ensemble Classifiers: Bagging
Lecture 12: 27 Feb 2025
(Class Notes (pdf),
Slides (pdf))
Ensemble Classifiers: Boosting
Lecture 13: 11 Mar 2025
(Class Notes (pdf),
Slides (pdf))
Unsupervised learning: Clustering — K-Means, Hierarchical, Density-based
Unsupervised learning: Local density-based outlier detection
Lecture 14: 13 Mar 2025
(Class Notes (pdf),
Slides (pdf))
Applications of unsupervised learning: semi-supervised learning, preprocessing, image segmentation
Lecture 15: 18 Mar 2025
(Class Notes (pdf),
Slides (pdf))
Dimensionality reduction: PCA, manifold learning, locally linear embeddings
Lecture 16: 20 Mar 2025
(Class Notes (pdf),
Slides (pdf))
Expectation maximization and applications
Lecture 17: 25 Mar 2025
(Class Notes (pdf),
Slides (pdf))
Linear Separators — Perceptrons
Lecture 18: 27 Mar 2025
(Class Notes (pdf),
Slides (pdf))
Linear Separators — SVMs
Kernel methods
Lecture 19: 3 Apr 2025
(Class Notes (pdf),
Slides (pdf))
Kernel methods
Neural networks: Multilayer perceptrons, sigmoid neurons, network architecture, universality
Lecture 20: 8 Apr 2025
(Class Notes (pdf),
Slides (pdf))
Neural networks: Backpropagation
Lecture 21: 10 Apr 2025
(Class Notes (pdf),
Slides (pdf))
Neural networks in Python
Bayesian networks: basic definitions, semantics
Lecture 22: 15 Apr 2025
(Class Notes (pdf),
Slides (pdf))
Bayesian networks: Conditional independence, D-separation
Lecture 23: 17 Apr 2025
(Class Notes (pdf),
Slides (pdf))
Bayesian networks: approximate inference, sampling
Introduction to Markov chains
Lecture 24: 22 Apr 2025
(Class Notes (pdf),
Slides (pdf))
Markov Chain Monte Carlo: Gibbs Sampling
Lecture 25: 24 Apr 2025
(Class Notes (pdf),
Slides (pdf))
Causal graphical models, confounding, intervention
Additional reading: