Chennai Mathematical Institute


CMI Data Science Seminar
Date: 26 Feb 2021 (Friday)
Time: 2:00 pm to 3:00 pm
Hellinger net: A consistent statistical learning model for imbalanced pattern classification

Tanujit Chakraborty
IIIT Delhi.


Classification of objects is a basic chore of machine intelligence. Over the years, a number of classifiers from different genres have been developed by both the statistics and the machine learning communities. To increase the pertinence of machine learning algorithms in human lives, we have to work on the interface of algorithm design and its utility. Traditional classifiers are designed on the basis of a number of assumptions, such as well-balanced class cardinalities. Interestingly, datasets from a number of real-world domains have been shown to exhibit class imbalance. Class imbalance is the quantitative disproportion between the cardinalities of some or all classes of a dataset. In a two-class scenario, the class with a significantly higher number of instances is termed the majority class, whereas the other is the minority class. When a traditional classifier is trained on class-imbalanced data, it usually becomes biased towards the quantitatively abundant class. In this talk, the class imbalance problem is addressed by building a novel, statistically consistent Hellinger net model. Hellinger net, a tree-to-network mapped model, is a deep feedforward neural network with a built-in hierarchy, just like decision trees. Hellinger net also utilizes the strength of a skew-insensitive distance measure, namely Hellinger distance, in handling class imbalance problems. The talk also includes a discussion of the consistency of the Hellinger net model from a statistical learning theory perspective.
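
As an illustration of why Hellinger distance is called skew-insensitive, here is a minimal sketch of the distance as it is commonly used to score a binary split in decision trees for two-class data. The announcement does not give the talk's exact formulation, so the function name, arguments, and example counts below are assumptions made for illustration only. The score depends only on the fraction of each class sent to each branch, not on how many instances each class has.

```python
import math


def hellinger_split_distance(pos_left, pos_total, neg_left, neg_total):
    """Hellinger distance between the two class-conditional distributions
    over the branches of a binary split (hypothetical helper, not from the talk).

    pos_left / pos_total: minority-class instances sent left / in total.
    neg_left / neg_total: majority-class instances sent left / in total.
    """
    # Within-class proportions: class sizes cancel out here, which is
    # what makes the measure insensitive to class skew.
    p_left = pos_left / pos_total
    n_left = neg_left / neg_total
    p_right = 1.0 - p_left
    n_right = 1.0 - n_left
    return math.sqrt(
        (math.sqrt(p_left) - math.sqrt(n_left)) ** 2
        + (math.sqrt(p_right) - math.sqrt(n_right)) ** 2
    )


# Same within-class proportions (80% vs 20% sent left), very different class ratios.
balanced = hellinger_split_distance(80, 100, 20, 100)        # 100 vs 100 instances
imbalanced = hellinger_split_distance(80, 100, 2000, 10000)  # 100 vs 10000 instances

# A split that separates the two classes perfectly reaches the maximum, sqrt(2).
perfect = hellinger_split_distance(100, 100, 0, 10000)
```

Under these assumptions, `balanced` and `imbalanced` come out equal even though the majority class is 100 times larger in the second case, whereas criteria built on class priors would typically change with the class ratio.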

Bio: Tanujit Chakraborty is a researcher in the field of Statistical Machine Learning. Currently, he is working as a postdoctoral researcher at the Laboratory for Computational Social Systems (LCS2), IIIT Delhi. He also serves as a statistical consultant at Bajaj Finserv Limited, Pune. He received his Ph.D. degree from the Indian Statistical Institute, Kolkata. During his Ph.D., he contributed to statistical pattern recognition with applications to business analytics, quality control, software defect prediction, and macroeconomics.