Instructors: Pranabendu Misra, Madhavan Mukund
Teaching assistants: Shourjya Basu, Sampad Kumar Kar, Shankar Ram V
Evaluation:
Assignments 30-40%, quizzes and midsemester exam 20-30%, final exam 40%
Copying is fatal
Text and reference books:
Web Data Mining by Bing Liu.
Foundations of Data Science by Avrim Blum, John Hopcroft and Ravi Kannan.
Machine Learning by Tom Mitchell.
C4.5: Programs for Machine Learning by Ross Quinlan.
Artificial Intelligence: A Modern Approach by Stuart J Russell and Peter Norvig.
Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow by Aurélien Géron, O'Reilly, 3rd edition (2022).
Here is a tentative list of topics.
Supervised learning: Association rules, regression, decision trees, naive Bayes, SVM, classifier evaluation, expectation maximization, ensemble classifiers.
Unsupervised learning: Clustering, outlier detection, dimensionality reduction.
Text mining: Basic ideas from information retrieval, TF/IDF model, PageRank, HITS.
Other topics (if time permits): Probabilistic graphical models, Bayesian networks, Markov models, neural networks, ranking and social choice, …
Assignment 1, 29 January 2023, due 12 February 2023.
Assignment 2, 28 February 2023, due 14 March 2023.
Assignment 3, 28 March 2023, due 18 April 2023.
Lecture 1: 5 Jan 2023
(Class Notes (pdf), Slides (pdf))
Introduction, market-basket analysis, frequent itemsets
Lecture 2: 10 Jan 2023
(Class Notes (pdf), Slides (pdf))
Apriori algorithm, association rules, class association rules
Lecture 3: 12 Jan 2023
(Slides)
Supervised learning, decision trees, impurity measures (entropy, Gini index)
Lecture 4: 17 Jan 2023
(Slides)
Decision trees: information gain ratio, handling numeric attributes
Lecture 5: 19 Jan 2023
Decision Trees in Python (pdf, ipynb)
Evaluating classifiers: training/test sets, confusion matrix (see earlier slides)
Lecture 6: 24 Jan 2023
(Slides)
Naïve Bayesian classifiers
Naïve Bayes text classification
Lecture 7: 31 Jan 2023
(Class Notes (pdf), Slides (pdf))
Linear Regression: loss functions, normal equation, gradient descent, probabilistic justification for SSE loss
Lecture 8: 2 Feb 2023
(Class Notes (pdf), Slides (pdf))
Polynomial regression, regularization, the non-polynomial case
Regression for Classification
Lecture 9: 7 Feb 2023
(Class Notes (pdf), Slides (pdf))
Regression using decision trees
Handling overfitting in decision trees
Python notebooks from Aurélien Géron's code repository
Lecture 10: 9 Feb 2023
(Slides (pdf), ipynb)
Ensemble Classifiers: Bagging
Lecture 11: 14 Feb 2023
(Slides (pdf))
Ensemble Classifiers: Boosting
Lecture 12: 16 Feb 2023
(Slides (pdf), ipynb)
Ensemble Classifiers: Gradient Boosting
Lecture 13: 28 Feb 2023
(Slides)
Unsupervised learning: Clustering — K-Means, Hierarchical, Density-based
Unsupervised learning: Local density-based outlier detection
Lecture 14: 2 Mar 2023
(Slides, Jupyter Notebook)
Applications of unsupervised learning: semi-supervised learning, image segmentation
Lecture 15: 7 Mar 2023
(Slides)
Dimensionality reduction: PCA, manifold learning, locally linear embeddings
Lecture 16: 9 Mar 2023
(Slides, Jupyter Notebook)
Expectation maximization and applications
Lecture 17: 14 Mar 2023
(Class Notes (pdf), Slides (pdf))
Linear Separators — Perceptrons, SVMs
Lecture 18: 21 Mar 2023
(Class Notes (pdf), Slides (pdf), Jupyter Notebook)
Kernel methods
Lecture 19: 23 Mar 2023
(Class Notes (pdf), Slides (pdf))
Neural networks: Multilayer perceptrons, sigmoid neurons, network architecture, universality
Lecture 20: 28 Mar 2023
(Class Notes (pdf), Slides (pdf))
Neural networks: Backpropagation, cost functions
Lecture 21: 30 Mar 2023
(Class Notes (pdf), Slides (pdf))
Bayesian networks: basic definitions, semantics, exact inference
Lecture 22: 4 Apr 2023
(Class Notes (pdf), Slides (pdf))
Bayesian networks: Conditional independence, D-separation
Lecture 23: 11 Apr 2023
(Class Notes (pdf), Slides (pdf))
Bayesian networks: approximate inference, sampling
Introduction to Markov chains
Lecture 24: 18 Apr 2023
(Class Notes (pdf), Slides (pdf))
Markov Chain Monte Carlo: Gibbs Sampling