Machine recognition of hand-printed Tamil musical notation
Paper presented at the Annual Convention of the Computer Society of India, Bombay, February 1980; also, STAT-41/80, January 1980
M. Chandrasekaran, R. Chandrasekaran and Gift Siromoney

ABSTRACT

This paper deals with computer recognition of hand-printed symbols of Tamil musical notation. Each symbol is converted into a binary picture matrix. Using the methods developed by the authors, information is extracted from each array in the form of strings. Features that remain stable in spite of variations in shape are obtained from these strings and stored in the memory of the computer. Over 350 symbols from a standard musical text in Mohana raga, which uses five swaras, have been chosen and digitized manually.

The sample used for the recognition process is fed into the computer symbol by symbol. Each symbol that is read is first tested as to whether it has a horizontal line below it. If such a line exists, it is recognized and removed. The rest of the symbol is reduced to a string pattern and the stable features are compared with the features of the original symbols already stored in the memory. If there is perfect agreement, the symbol is reckoned as recognized and printed in the Roman form.

Character recognition has been performed on an IBM 370/155 computer at the Indian Institute of Technology, Madras.

1. INTRODUCTION

Modern composers of western music and music researchers are now making use of the wide range of possibilities offered by the computer in experimental music composition and research (Lincoln, 1970). The computer can be effectively used in stylistic analysis and in other statistical applications (Siromoney and Rajagopalan, 1964). Music researchers wishing to use computers normally convert the musical notation into some form suitable for punching onto computer cards. In recent years attempts have been made to devise a method by which the computer can recognize musical scores directly. This is a problem in the area of pattern recognition, and in this paper we demonstrate a method of recognition of hand-printed Tamil musical notation.

The main problem in the field of pattern recognition is one of recognizing two-dimensional visual patterns, whether they be natural objects or conventional symbols. David Prerau of the Massachusetts Institute of Technology addressed himself to the following problem (Prerau, 1971). Given a sheet of printed music in western notation, can the computer recognize the scores and print out the names of the different notes and their durations in some standard notation? If so, such an output can be used as input to music analysis programs, music playing programs and commercial music printers. For all practical purposes the central problem has been solved for printed western musical scores.

To extend the same problem to South Indian music, we ask whether it is possible to devise a method by which, given a sheet of printed music in Tamil musical notation, a computer can recognize the different symbols and print them out in some standard notation. If the required output is the one used by Prerau, then it will be possible to obtain a printout of South Indian music written in Tamil musical notation in some western notation, and vice versa. From a commercial point of view this may be very expensive and not feasible. Is it possible to tackle the problem at an experimental and academic level? The answer is in the affirmative.

The recognition of Tamil musical notation is linked with the problem of computer recognition of Tamil characters and we have been successful in solving this problem (Siromoney et al, 1978a). In the Tamil musical notation we have to allow for the lines drawn above or below the letters to denote the shortening of notes.

Some time ago we took up the problem of recognition of machine-printed Mridangam mnemonics (Siromoney et al, 1978b). Here we had to reckon with a few additional features not met with in the recognition of printed Tamil characters, such as a single or double horizontal bar above a line of sound words and a series of dots below a line. We have successfully solved this problem of recognition of Mridangam mnemonics by adapting the methods developed by us for printed Tamil characters. The recognition of printed South Indian music in Tamil notation is a straightforward extension of this problem.

Having developed a method for recognizing printed Tamil characters, we took up the problem of recognizing hand-printed characters. With hand-printed characters there are many different forms of the same letter, which makes the problem more difficult than it is for machine-printed characters.

For recognizing the different characters, we use the method of 'Condensed runs', 'Coded runs' and their variations developed by us. These are described in the following paragraphs.

2. CONDENSED RUN METHOD

Each character is represented as an M x N binary matrix called PATN(I,J). The features present in the i-th row of the pattern matrix PATN(I,J) are extracted and represented in the form of a single numeral called RRUN(I), which is equal to the number of distinct runs of 1's in the i-th row of PATN(I,J). Similarly, CRUN(J) is the number of distinct runs of 1's in the j-th column of PATN(I,J) and represents the features present in the j-th column.

We now define RRUN of PATN(I,J) as a string of length M such that RRUN = r(1)r(2)...r(M) where r(I) = RRUN(I), I = 1,...,M. Thus RRUN is a string of RRUN(I)'s where I ranges over all M rows, and it contains all the necessary information extracted from all the rows. Similarly CRUN is defined as a string of CRUN(J)'s, J = 1,...,N.
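As a minimal sketch of these definitions, assuming the picture matrix is held as a list of rows of 0s and 1s, the run counts can be computed as follows. The function names and the sample pattern are illustrative and are not taken from the original program.

```python
def count_runs(line):
    """Return the number of distinct runs of 1's in a sequence of 0/1 values."""
    runs = 0
    previous = 0
    for value in line:
        # A new run starts wherever a 1 follows a 0 (or the start of the line).
        if value == 1 and previous == 0:
            runs += 1
        previous = value
    return runs


def row_run_string(patn):
    """RRUN: one run count per row, top to bottom."""
    return [count_runs(row) for row in patn]


def column_run_string(patn):
    """CRUN: one run count per column, left to right."""
    return [count_runs(column) for column in zip(*patn)]


# Illustrative 5 x 5 pattern (not from the paper): a hollow box.
PATN = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

print(row_run_string(PATN))     # [1, 2, 2, 2, 1]
print(column_run_string(PATN))  # [1, 2, 2, 2, 1]
```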

The strings RRUN and CRUN are further condensed as follows to the strings RCON and CCON respectively. Let RRUN be composed of k substrings such that RRUN = s(i₁)s(i₂)...s(i_k), where s(i_r) is a substring composed entirely of all consecutive occurrences of the same number i_r. Such a representation is unique. Then RCON = i₁i₂...i_k, where each i_r represents a single numeral. For example, if RRUN is the string 1133322211, the corresponding RCON is 1321. RCON is called the condensed row string. Similarly the condensed column string CCON is formed by retaining only one digit in each run of that digit in the corresponding CRUN. This is a kind of thinning process, and the condensed strings will be independent of the size and thickness of the letters as well as the proportions of the matrices. We note that the length of the string RCON is much less than that of the corresponding string RRUN. Let RCON = p(1)p(2)...p(RN) where RN is the length of the string RCON. Then p(I) ≠ p(I+1) for I = 1,...,RN-1, and each p(I) is equal to some RRUN(J), J = 1,...,M. We define RCON(I) = p(I).
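The condensing step can be sketched in a few lines, assuming the run string is available as a sequence of integers. The function name `condense` is illustrative; the first example is the one given in the text.

```python
from itertools import groupby


def condense(run_string):
    """Keep one symbol from each block of consecutive equal symbols."""
    return [key for key, _group in groupby(run_string)]


# Example from the text: RRUN = 1133322211 gives RCON = 1321.
rrun = [int(digit) for digit in "1133322211"]
rcon = condense(rrun)
print("".join(str(value) for value in rcon))  # 1321

# A taller rendering of the same shape repeats each row more often,
# but the condensed string is unchanged.
taller = [1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1]
print(condense(taller) == rcon)  # True
```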

3. CODED STRING METHOD

In this method the features present in the i-th row of the pattern matrix PATN(I,J) are extracted and encoded in the form of a single Roman letter. Such an alphabetic character is denoted by RCOD(I). RCOD(I) can take any of the 26 alphabetic characters and, if necessary, the ten numeric characters, depending upon the pattern found in the i-th row of the matrix. For our purposes 17 letters are sufficient. The row patterns are reckoned in terms of the number and the nature of the runs of 1's in the respective rows. If the length of a run of 1's is less than three we reckon it as a short run; if the length is greater than six it is a long run; otherwise it is a medium run. These threshold values depend upon the level of digitization and the average thickness of the letters. For example, if there is only one run of 1's and it is a short run, then the row pattern is encoded as A. If there are two runs of 1's, the first a medium run and the second a short run, then the corresponding code is E, and RCOD(I) will be represented by E. When there are three runs of 1's in a certain row, codes are given for the different combinations ignoring the order in which the runs occur, so as to keep the number of code values small. The same codes are used to represent the column vectors of the picture matrix. The j-th column is represented by CCOD(J). We now define RCOD of PATN(I,J) as a string of alphabetic characters of length M such that RCOD = RCOD(1)RCOD(2)...RCOD(M). Thus RCOD is a string of RCOD(I)'s where I ranges over all the M rows. Similarly CCOD is defined as a string of CCOD(J)'s, J = 1,...,N.

The strings of row and column codes, RCOD and CCOD, are further condensed to RCODC and CCODC respectively, using the condensing procedure explained in Section 2.
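The coding of a row can be sketched as below. The thresholds (short: length less than three; long: length greater than six) and the two codes A and E are the ones stated in the text. The full table of 17 letters is not reproduced in this paper, so the sketch assumes a placeholder "?" for every other pattern; the function names and the sample pattern are likewise illustrative.

```python
from itertools import groupby

SHORT_BELOW = 3  # a run shorter than this is a short run
LONG_ABOVE = 6   # a run longer than this is a long run

# Only these two entries are stated in the text; all other patterns are
# mapped to the placeholder "?" (an assumption of this sketch).
KNOWN_CODES = {("S",): "A", ("M", "S"): "E"}


def run_lengths(line):
    """Lengths of the runs of 1's in a row or column, in order."""
    return [len(list(group)) for value, group in groupby(line) if value == 1]


def classify(length):
    """Classify a run as short (S), medium (M) or long (L)."""
    if length < SHORT_BELOW:
        return "S"
    if length > LONG_ABOVE:
        return "L"
    return "M"


def line_code(line):
    """Code letter for one row or column."""
    kinds = tuple(classify(length) for length in run_lengths(line))
    if len(kinds) == 3:
        kinds = tuple(sorted(kinds))  # order is ignored for three runs
    return KNOWN_CODES.get(kinds, "?")


def coded_string(patn):
    """RCOD: one code letter per row."""
    return "".join(line_code(row) for row in patn)


def condense(codes):
    """RCODC: drop consecutive repetitions of the same code."""
    return "".join(key for key, _group in groupby(codes))


# Illustrative pattern (not from the paper).
PATN = [
    [0, 1, 1, 0, 0, 0, 0, 0],  # one short run           -> A
    [0, 1, 1, 0, 0, 0, 0, 0],  # one short run           -> A
    [1, 1, 1, 1, 0, 0, 1, 0],  # medium run, short run   -> E
]

rcod = coded_string(PATN)
print(rcod)            # AAE
print(condense(rcod))  # AE
```

The column string CCOD would be obtained in the same way by applying `line_code` to each column, for instance via `zip(*PATN)`.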

4. ZERO-ONE RUN METHOD

This method is the same as the condensed run method except that a row containing a single run of 1's is classified into one of four categories. If the run of 1's in the i-th row of PATN(I,J) has no run of 0's on either side of it, the row is given the code P. If the run of 1's is followed by a run of 0's on the right, then the i-th row is coded as R; a run of 0's followed by a run of 1's on the right is given the code Q; and if the run of 1's is flanked by a run of 0's on the left as well as on the right, then the i-th row gets the representation S. The same procedure is used for coding the columns of the picture matrix. After the row and column strings are obtained, they are condensed by omitting the consecutive occurrences of the same code.
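A minimal sketch of this coding follows. It assumes that P denotes a run of 1's with no 0's on either side, and that rows with zero, two or more runs of 1's keep their run count as in the condensed run method; the function names and the sample pattern are illustrative.

```python
from itertools import groupby


def zero_one_code(line):
    """Code one row or column by the position of its single run of 1's."""
    runs = [value for value, _group in groupby(line)]  # e.g. [0, 1, 0]
    ones = runs.count(1)
    if ones != 1:
        return str(ones)  # assumed: fall back to the plain run count
    zeros_left = runs[0] == 0
    zeros_right = runs[-1] == 0
    if zeros_left and zeros_right:
        return "S"
    if zeros_left:
        return "Q"
    if zeros_right:
        return "R"
    return "P"


def condensed_codes(lines):
    """Code every line, then omit consecutive occurrences of the same code."""
    codes = [zero_one_code(line) for line in lines]
    return "".join(key for key, _group in groupby(codes))


# Illustrative pattern (not from the paper): a bar with a stem below it.
PATN = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

print(condensed_codes(PATN))        # PS
print(condensed_codes(zip(*PATN)))  # RPR
```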

5. CODED ZERO-ONE RUN METHOD

This method is similar to the coded string method in the sense that the row patterns are reckoned in terms of the number and the nature of the runs of 0's. For example, if there is a short run of 0's followed by a run of 1's in the i-th row of PATN(I,J), then it is encoded as D. If there is a run of 1's preceded and followed by long runs of 0's, the corresponding code is I. Similarly, other combinations are given suitable codes. Here we do not make use of the definition of medium runs; the definitions of short and long runs are the same as in the coded string method. The process of condensing is then carried out on the row and column strings.

6. RECOGNITION PROCEDURE

For the recognition of characters one may use the template matching method (Kovalevsky, 1968), but it cannot be extended to include recognition of hand-printed characters. Narasimhan has proposed a linguistic method for the recognition of hand-printed English letters (Narasimhan and Reddy, 1971). Syntactic methods have also been used for the recognition of Devanagari script (Sinha, 1973). However, we have made use only of our own methods, described earlier, for recognizing hand-printed characters.

For our study we have selected a song in Mohana raga from Abraham Pandithar's Karunāmirtha Sāgara Thirattu (Abraham Pandithar, 1934). Mohana raga makes use of five swaras and each is represented by a letter of the Tamil alphabet. The lengthening of notes is denoted by the addition of a separate symbol. The short and long forms of the swara Ri are denoted by two separate letters. Thus the major symbols to be recognized are seven letters of the Tamil alphabet.

Over 350 symbols have been chosen and digitized manually into rectangular binary matrices. Information is extracted from each array in the form of strings and features that are stable in spite of the variations in shape are obtained from these strings and stored in the memory of the computer. The sample that is fed into the computer is read character by character. Each character that is read is first tested as to whether it has a horizontal line below it. If such a line exists, it is recognized and removed. The rest of the picture is reduced to a string pattern using the methods described earlier and stable features are extracted. These features are compared with the features of characters already stored in the memory. If there is agreement, the character is recognized and printed in the Roman form. The flow diagram for the recognition procedure is given in Fig. 1. We have achieved a high degree of success in this recognition problem.
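The steps described in this section can be sketched as follows. This is not the original program: it uses only the condensed run strings as features, and it assumes that the line below a symbol is separated from the symbol by at least one blank row and fills at least 80 per cent of the width. The names `split_underline`, `features`, `recognize`, the threshold `BAR_FRACTION`, the label "BOX" and the sample patterns are all illustrative.

```python
from itertools import groupby

BAR_FRACTION = 0.8  # assumed: a bar row fills at least this share of the width


def count_runs(line):
    """Number of distinct runs of 1's in a row or column."""
    return sum(1 for value, _group in groupby(line) if value == 1)


def is_bar(row):
    return count_runs(row) == 1 and sum(row) >= BAR_FRACTION * len(row)


def split_underline(patn):
    """Return (symbol_rows, has_underline), removing the line if present."""
    rows = [list(row) for row in patn]
    while rows and not any(rows[-1]):  # drop blank rows at the bottom
        rows.pop()
    start = len(rows)
    while start > 0 and is_bar(rows[start - 1]):
        start -= 1
    # Require at least one bar row with a blank row directly above it.
    if start == len(rows) or start == 0 or any(rows[start - 1]):
        return rows, False
    symbol = rows[:start]
    while symbol and not any(symbol[-1]):
        symbol.pop()
    return symbol, True


def features(patn):
    """Condensed row and column run strings (RCON, CCON)."""
    rcon = tuple(k for k, _g in groupby(count_runs(r) for r in patn))
    ccon = tuple(k for k, _g in groupby(count_runs(c) for c in zip(*patn)))
    return rcon, ccon


def recognize(patn, store):
    """Return (label, has_underline), or None if no stored entry agrees."""
    symbol, underlined = split_underline(patn)
    label = store.get(features(symbol))
    return None if label is None else (label, underlined)


# Stored reference: a hollow box, labelled "BOX" for illustration.
reference = [[1, 1, 1, 1, 1], [1, 0, 0, 0, 1], [1, 0, 0, 0, 1], [1, 1, 1, 1, 1]]
store = {features(reference): "BOX"}

# A wider box with a line below it.
sample = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
print(recognize(sample, store))  # ('BOX', True)
```

Requiring a blank row above the bar keeps a wide bottom stroke that belongs to the letter itself from being mistaken for the line below it.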

Further work has to be done to improve the method of recognition of hand-printed characters. The work can be extended to the problem of recognizing printed musical scores in different fonts and to cursive handwritten musical scores.

7. REFERENCES

1. Abraham Pandithar (1934), Karunāmirtha Sāgara Thirattu, Tanjore.
2. Kovalevsky V.A. (1968), Character Readers and Pattern Recognition, Spartan, New York.
3. Lincoln H.B. (1970), The Computer and Music, Ithaca.
4. Narasimhan R. and Reddy V.S.N. (1971), A syntax-aided recognition scheme for hand-printed English letters, Pattern Recognition, 5, 345-361.
5. Prerau D. (1971), Computer pattern recognition of printed music, Proceedings of the Fall Joint Computer Conference, 153-162.
6. Sinha R.M.K. (1973), A syntactic pattern analysis system and its application to Devanagari script recognition, Ph.D. Thesis, Dept. of Electrical Engineering, IIT, Kanpur.
7. Siromoney G. and Rajagopalan K.R. (1964), Style as information in Karnatic Music, Journal of Music Theory, 8, 267-272.
8. Siromoney G., Chandrasekaran R. and Chandrasekaran M. (1978a), Computer recognition of printed Tamil characters, Pattern Recognition, 10, 243-247.
9. Siromoney G., Chandrasekaran M. and Chandrasekaran R. (1978b), Computer recognition and transliteration of mridangam mnemonics, Quarterly Journal of the National Center for Performing Arts, 7, No. 1, 11-17.