The application of topological techniques to traditional data analysis, which has so far developed mostly in a statistical setting,
has opened up new opportunities. There is growing interest in exploring this field further and in looking for new applications.
Some of the notable successes in recent years (such as the identification of a new type of breast cancer, or the discovery of new basketball
playing positions) have earned praise from the industry.
Indeed, with the explosion in the amount and variety of available data, identifying, extracting and exploiting their underlying
structure has become a problem of fundamental importance. Many such data come in the form of point clouds, sitting in potentially
high-dimensional spaces, yet concentrated around low-dimensional geometric structures that need to be uncovered.
The non-trivial topology of these structures is challenging for classical exploration techniques such as dimensionality reduction.
The goal of TDA is therefore to develop novel methods that can reliably capture geometric or topological information
(connectivity, loops, holes, curvature, etc.) from the data without the need for an explicit mapping to a lower-dimensional space.
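As a rough illustration of what "capturing connectivity" can mean, the sketch below counts the connected components of a small point cloud at a few distance scales. It is only a toy example and not a method prescribed by this course: the function name count_components, the sample points and the scale values are all made up for illustration, and it looks at connectivity alone (loops and holes need the homology machinery covered later).

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+


def count_components(points, epsilon):
    """Count connected components of the graph that joins every pair of
    points lying at distance at most epsilon (hypothetical helper)."""
    parent = list(range(len(points)))  # union-find forest, one tree per point

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= epsilon:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_i] = root_j  # merge the two components

    return len({find(i) for i in range(len(points))})


# Made-up sample data: two well-separated clusters in the plane.
cloud = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]

for eps in (0.05, 0.2, 10.0):
    print(eps, count_components(cloud, eps))
```

At a very small scale every point is its own component, at an intermediate scale the two clusters appear, and at a large scale everything merges into one piece. Tracking how such features appear and disappear as the scale grows is the basic idea behind persistence.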
Watch the following videos to learn more about TDA and its applications.
Learning goals of this course
This course is not about algebraic topology. We will not spend time, for example, proving various properties of simplicial homology.
The aim of this course is to introduce TDA as a tool for data analysts and to teach it using a hands-on approach.
We will explore many real-life examples using various software packages. In fact, the Moodle course page contains around 30 articles
about applications of TDA to fields such as developmental economics, image segmentation, cosmology, medical imaging, protein flexibility,
weather forecasting and robotics. Some of these will be discussed in class; some will be assigned as course projects.
By the end of the course we expect participants to have gained insight into how persistent homology works, to have used all the
important packages and to have seen enough data analysis examples. This should enable them to use TDA in new areas in the future.
Course Information
Lecture schedule:
Mondays and Wednesdays at 10:30 am
Venue:
LH 801
Instructors:
Sourish Das
Priyavrat Deshpande
Office hours:
TBA
Prerequisites:
A first course in topology/metric spaces. Proficiency in Python, R and Matlab.
Text:
Computational Topology: An Introduction by H. Edelsbrunner and J. Harer, AMS, 2013.
Topology and Data by Gunnar Carlsson, Bull. Amer. Math. Soc. 46 (2009).
Homework will be assigned bi-weekly and uploaded to Moodle.
It is your duty to submit the solutions on time. There are two types of homework: math problems and programming assignments.
The math problems are to be solved and the solutions submitted in written form.
Copying and/or plagiarism will not be tolerated. Here are a few writing guidelines you might want to follow.
Feel free to work together, but you should submit your own
work.
Your questions/comments/suggestions are most welcome. I will
also be fairly generous with the hints. However, do not expect
any kind of help, including extensions, on the day a homework is
due.
Please turn in a neat stapled stack of papers. As far as possible,
refrain from using blank printing paper; use ruled paper.
Your final version should be as polished as you can
make it. This probably means that you cannot submit sketchy
solutions or sloppily written first versions. Please expect to
do a fair amount of rewriting. Do not hand in work with parts
crossed out; either use a pencil and erase or rewrite.
Please write complete sentences that form paragraphs and so
forth. It might be a good idea to use short simple sentences;
avoid long complicated sentences.
Do use commonly accepted notation (e.g., for functions, sets,
etc.) and never invent new notation when there is already some
available.
For the programming assignments you may upload your source code and a screenshot of a successful run to Moodle.
However, do hand in a short write-up about your program and a few sample outputs.
Projects
The project is an important part of this course. A list of possible projects and the corresponding reference material is located on the Moodle
course page. Everybody is expected to have selected a project topic before the end of August. A typical project involves reading and
understanding a paper (that deals with a real-life application of TDA), reproducing its results and finally trying its algorithm on a different
data set. A project report will consist of a summary of the paper you read and a discussion of any new analysis you have performed.