-
Mathematical Methods – Analysis
- Single-variable analysis: Limits, continuity, sequences
and series of real numbers, differentiation, maxima and
minima, Riemann integration, improper integrals (such as the
normal, exponential and gamma integrals).
- Multi-variable analysis: Differentiation, convexity,
gradient and Hessian of a multivariate function, Taylor's
expansion, necessary and sufficient conditions for the
existence of an extremal point, Newton’s method, Lagrange
multipliers, gradient and conjugate gradient methods.
- Sequences and series of functions.
Textbook
- Tom W. Apostol: Calculus: One-Variable Calculus with
an Introduction to Linear Algebra, Vol 1, Wiley, Second edition (2007).
- Tom W. Apostol: Calculus: Multi-Variable Calculus and Linear Algebra with Applications to Differential Equations and Probability, Vol 2, Wiley, Second edition (2007).
Reference
- P.G. Ciarlet: Introduction To Numerical Linear Algebra
And Optimisation, Cambridge University Press.
-
Probability and Statistics with R
- Combinatorial probability, Independence of events,
Conditional probabilities
- Random variables, densities, Expectation, Variance and
moments, Standard univariate distributions, Independence of
random variables, Moment Generating Functions
- Tchebychev's inequality and weak law of large numbers,
Central Limit Theorem.
- Marginal Distribution, Conditional Distribution,
Conditional expectation, Regression, Correlation, Bivariate
normal distribution, Multivariate normal distribution
- Introduction to Statistics with examples of its use, Drawing
random samples, Descriptive statistics, Graphical statistics:
Histogram, scatter diagram, Pie diagram, estimating sample
moments, sample mean, sample standard deviation
- Sampling distributions based on normal populations - t,
chi-square and F distributions
- Sufficient statistics. Point and Interval Estimation,
Consistency, Minimum Variance Unbiased Estimator (statement only),
method of moments estimators, maximum likelihood estimator,
consistency and asymptotic normality of MLEs (statement only)
- Testing of Hypothesis: one sample and two sample tests based on
t, chi-square and F distributions.
- Error probabilities, statistical power of test, p-values,
log-likelihood ratio test
Textbook
- R. Ash: Basic Probability Theory, John Wiley & Sons
(1970).
- W. Feller: Introduction to Probability Theory and its
Applications, Volume 1, Third Edition, John Wiley & Sons (1972).
- P.G. Hoel, S.C. Port & C.J. Stone: Introduction to
Probability Theory, Houghton Mifflin (1971).
- G.K. Bhattacharyya & R.A. Johnson: Statistics: Principles
and Methods, Second Edition, John Wiley & Sons (1992).
- P. G. Hoel, S. C. Port, and C. J. Stone: Introduction to
Statistical Theory, Houghton Mifflin (1971).
-
Programming and Data Structures with Python
Introduction to basic programming principles using Python, including
object-oriented design, big-oh notation, sorting and search
algorithms, elementary data structures (lists, heaps, binary trees).
Textbook
- Mark Pilgrim: Dive into Python,
available online.
- T.H. Cormen, C.E. Leiserson, and R.L. Rivest: Introduction
to algorithms, Prentice-Hall (1998).
-
Visualization (2 credits)
- Descriptive statistics
- Exploratory data analysis
- Grammar of graphics
- Advanced visualization
- Analytics dashboard
- Introduction to Tableau
Textbook
- The Grammar of Graphics, by Leland Wilkinson, Springer, Second Edition
- ggplot2: Elegant Graphics for Data Analysis, by Hadley Wickham, Springer
Reference
- The Visual Display of Quantitative Information, by Edward R Tufte
- Introductory Statistics with R, by Peter Dalgaard, Springer, Second Edition
- Tabular Data Analysis with R and Tidyverse: Environmental Health, by Jean-Yves Sgro and Kristen Malecki
- Online Resources for Tableau
-
RDBMS and SQL (2 credits)
Database Concepts: Relations, Query Languages, Relational
Algebra, Relational Calculus. Database Design: Schema,
Entity Relationship Model, Functional Dependencies, Normal
Forms, Joins. Storage: File Organization, Optimization,
B+-Tree Indexing. Transaction Management: Transactions, ACID
Properties, Serializability, Concurrency. SQL: Tables and
Views, CRUD Operations, Primary Key, Foreign Keys,
Constraints, Types of Joins, Grouping, Summary Functions,
Indexes.
Textbook
- A. Silberschatz, H.F. Korth and S. Sudarshan: Database System Concepts,
McGraw-Hill Publications, Seventh Edition (2019).
-
Linear Algebra and its Applications
- Jordan canonical form, other reductions to
triangular and diagonal forms, projection matrices;
- Matrix norms, Rayleigh quotient, conditioning of a problem, floating
point arithmetic, backward and forward stability of an algorithm;
- Direct and iterative methods for solving a linear system of equations:
Gaussian elimination, LU factorization, Cholesky method, QR factorization,
Householder's matrices, Jacobi's method, Gauss-Seidel method, successive
over-relaxation methods (SOR);
- Eigenvalue-eigenvector methods: methods based on reduction to Hessenberg
or tridiagonal forms (Arnoldi, Gram-Schmidt), power iteration, inverse
iteration, QR iteration, Rayleigh quotient iteration, Jacobi's method,
bisection method, divide-and-conquer, Krylov subspace methods, method of
conjugate gradients;
- Singular value problems: Computing the SVD, elements of PCA;
- Least squares problems: normal equations, QR, SVD, solving rank-deficient
least squares problems using SVD and QR.
Textbook
- Lloyd N. Trefethen and David Bau, III: Numerical linear algebra, SIAM (1997)
Reference
- James W. Demmel: Applied numerical linear algebra, SIAM (1997)
- P.G. Ciarlet: Introduction to Numerical Linear Algebra and Optimisation, Cambridge Univ Press (1989)
- Gene H. Golub and Charles F. Van Loan: Matrix Computations, Johns Hopkins, 4th ed (2012)
-
Data Mining and Machine Learning
Association rules, frequent itemsets;
Finding high correlation with low support;
Classifiers -- Bayesian, Nearest Neighbour;
Decision Trees;
Clustering techniques;
Vector space (TF-IDF) model;
Stop words and stemming;
Supervised learning: Bayesian Networks, Support Vector Machines;
Semi-supervised learning: Expectation maximization;
Web search: HITS and PageRank;
Textbook
- Jiawei Han, Micheline Kamber: Data mining: concepts and
techniques (2nd ed), Morgan Kaufmann (2006).
- Bing Liu: Web Data Mining: Exploring Hyperlinks,
Contents and Usage Data, Springer (2006).
- Soumen Chakrabarti: Mining the Web: Discovering knowledge
from hypertext data, Elsevier (2003).
- Christopher D Manning, Prabhakar Raghavan and
Hinrich Schütze: An Introduction to
Information Retrieval, Cambridge University Press
(2009).
-
Algorithms
A quick revision of sorting, searching, selection and Big Oh;
Divide and Conquer;
Dynamic Programming;
Graphs, BFS, DFS, connectivity;
Algorithms on Matrices;
Combinatorial Optimization --- Linear Programming,
Simplex, Duality,
Primal Dual Algorithms
(shortest paths, max flow, matching).
Textbook
- T.H. Cormen, C.E. Leiserson, and R.L. Rivest: Introduction
to algorithms, Prentice-Hall (1998).
- J. Kleinberg and E. Tardos: Algorithm design,
Pearson/Addison-Wesley (2006).
- C. Papadimitriou and K. Steiglitz: Combinatorial
Optimization
-
Distributed Computing and Big Data
- Evolution of data storage systems — files, RDBMS, ETL, OLTP, data
warehousing, data lakes, cloud storage, Storage as a Service (STaaS).
- Evolution of computing infrastructure — microprocessors, Moore's
law, multi-core processors, GPUs, supercomputers and Infrastructure as
a Service (IaaS).
- File systems, distributed file systems and HDFS.
- A distributed processing model, space-time diagram, consistent cuts,
scalar, vector and matrix clocks.
- Distributed computing in practice — Public, Private and Hybrid
Cloud, Map Reduce, Pig, NoSQL DB, Web applications and RESTful services
over cloud.
Reference books
- Ajay D. Kshemkalyani and Mukesh Singhal: Distributed Computing: Principles, Algorithms, and Systems, Cambridge University Press (2010)
- Arshdeep Bahga and Vijay Madisetti: Big Data Analytics: A Hands-On Approach, VPT (2018)
- Tom White: Hadoop - The Definitive Guide, 4th ed, O'Reilly (2015)
-
Predictive Analytics – Regression and Classification
- Introduction to Predictive Analytics, Case studies
- Least-Squares Method, Overview of Supervised Learning
- Regression, Linear Models, Gauss-Markov Theorem, Multiple Linear
Regression, Variable Selection, Bayesian Linear Regression, Ridge
Regression, LASSO, Elastic Net, Principal Component Regression,
Functional Regression, Spline Regression
- Outlier detection, Influential points, Cook's distance, Model
Selection via AIC and BIC
- Classification, Linear Classifiers, Linear Discriminant Analysis
(LDA), Quadratic Discriminant Analysis, Logistic Regression, CART,
CHAID
Textbooks
- T. Hastie, R. Tibshirani and J. Friedman: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer Series in
Statistics.
-
Advanced Machine Learning
- Deep Learning Philosophy, Deep Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, LSTM
- PyTorch, Keras
- Reinforcement Learning
- Graphical Network Models
- Hidden Markov Models
Text and reference books:
- Ian Goodfellow, Yoshua Bengio and Aaron Courville:
Deep Learning,
MIT Press 2016
- Richard S. Sutton and Andrew G. Barto:
Reinforcement Learning: An Introduction,
MIT Press (2nd ed) 2018
- Francois Chollet:
Deep Learning with Python,
Manning Publications 2017
- Stuart J Russell and Peter Norvig:
Artificial Intelligence: A Modern Approach,
Pearson (3rd ed) 2015
- Daphne Koller and Nir Friedman:
Probabilistic Graphical Models – Principles and Techniques,
MIT Press 2009
-
Industry Project
- Students will be allowed to work on an industry project during their 3rd or 4th semester (not both).
- Such a project can count for a maximum of 4 credits.
- The project can either be a continuation of the summer internship into the 3rd semester or a separate internship.
- The placement cell will not actively participate in finding these internships. Its role may be limited to passing along information about any opportunities that arise.
- Students will have to obtain permission for such an internship from the
faculty advisor. A proposal containing the project description with the
name and signature of (or email from) the industry supervisor has to be
submitted at least two weeks prior to the course registration deadline.
- A report on the work done in the project during the semester will have
to be submitted to the faculty advisor before or during the final exam week.
- Course credits will be granted by the faculty advisor and/or a faculty
member chosen by the Dean, on proper completion of the project and all
formalities.