Data Science Seminar
Time: 2 pm to 3 pm
Curse of Dimensionality
Anand Nath Jha
Assistant Vice President at Genpact, India.
“Curse of Dimensionality” refers to the need for more training data as more features (dimensions) are used in Machine Learning. Several associated problems come with it: the cost and time of data collection, difficulty in training a model, more computational time and memory, a narrower choice of Machine Learning models, and so on. The term was coined by Richard E. Bellman, who encountered the problem when dealing with multiple decision variables in optimization problems. Over the years, it came to be used in AI/ML because similar problems arise as the number of features increases. The topic has been widely studied and is highly recommended for ML learners and practitioners alike, because it manifests itself in multiple ways and there are multiple approaches to tackling it. Easier data integration and collection, big data technologies, the Internet of Things, computer vision, natural language processing and similar developments have made it a very relevant area of study. High-dimensional data appears across domains: 360-degree views of customers, sensor data from machines, text and image data in the form of tokens or pixel-level values, and so on. High-dimensional data enables holistic and accurate analysis, but we should also be aware of the concomitant problems.
About the speaker: Anand is an Assistant Vice President at Genpact (India), where he is a senior member of the Data Science & Insights CoE. He has more than 20 years of industry experience. Prior to Genpact, he worked at Infosys as a senior data scientist, at ITC Infotech as an analytics architect, and in similar roles at Honeywell Technology Solutions, General Electric and HAL. Anand earned his B.Tech in aerospace engineering from IIT Kanpur; he also holds an MBA from the University of Phoenix and a Certification in Business Analytics for Executives from IIM Lucknow.