Chennai Mathematical Institute


MSc Thesis Seminar
11 am, Lecture Hall 2
Using Reinforcement Learning to Learn Pallanguzhi

Chayan Banerjee
Chennai Mathematical Institute.


Reinforcement learning is a branch of machine learning in which a software agent is given no labelled training data and instead learns from the outcomes of its own actions, adjusting its behaviour to maximize its reward. We apply this to Pallanguzhi, a traditional South Indian game, and a very similar version of Mancala. The agent was given the boards to play on and the rules of the respective games. We discuss how policy gradients and the Adam optimizer are used with a convolutional neural network to train the agent. The trained agent is then made to play against a random AI player. After 2 million training iterations on Pallanguzhi, the trained player reaches a 99.6% win rate when it plays first against a random opponent.
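
For readers unfamiliar with the training method, the following is a minimal sketch of a policy-gradient (REINFORCE) update combined with Adam. It is not the setup from the thesis: as a simplifying assumption, the game is replaced by a toy problem in which each action wins with a fixed probability, and the convolutional network is replaced by one learnable score per action. The names softmax, train and win_probs, and all hyperparameter values, are hypothetical.

```python
import math
import random


def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(win_probs, steps=2000, lr=0.05, seed=0):
    """REINFORCE with Adam on a toy problem (assumption: not a real game).

    win_probs[i] is the chance that action i yields reward 1.
    Returns the learned action probabilities.
    """
    rng = random.Random(seed)
    n = len(win_probs)
    theta = [0.0] * n              # policy parameters: one score per action
    m = [0.0] * n                  # Adam first-moment estimate
    v = [0.0] * n                  # Adam second-moment estimate
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    baseline = 0.0                 # running average reward, reduces variance

    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(n), weights=probs)[0]
        reward = 1.0 if rng.random() < win_probs[action] else 0.0

        advantage = reward - baseline
        baseline += 0.01 * (reward - baseline)

        # Gradient of log pi(action) w.r.t. theta[i] is 1[i == action] - probs[i].
        grad = [((1.0 if i == action else 0.0) - probs[i]) * advantage
                for i in range(n)]

        # Adam step, applied as gradient ascent on the expected reward.
        for i in range(n):
            m[i] = beta1 * m[i] + (1 - beta1) * grad[i]
            v[i] = beta2 * v[i] + (1 - beta2) * grad[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            theta[i] += lr * m_hat / (math.sqrt(v_hat) + eps)

    return softmax(theta)


if __name__ == "__main__":
    # Action 1 always wins and action 0 never does, so the policy
    # should shift its probability mass towards action 1.
    print(train([0.0, 1.0]))
```

In the setting described above, the reward would presumably come from the result of a game against the random opponent, and the scores would be produced by the network from the board position; those details are not specified here.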

