Home | Biodata | Biography | Photo Gallery | Publications | Tributes

Tamil Studies


Efficient methods of telegraphy, typewriting and teleprinting in Tamil
Tamil Culture, X, pp. 107-120, 1963
Gift Siromoney

i. Telegraphy:

Sending a message in a given language in the most efficient way is a problem in Communication Engineering. The message can be communicated using an arbitrary code made up of dots and dashes, but such a method will not be efficient even though it may be practicable. For instance, a passage in Tamil can be written in the Roman script and transmitted in the same way as a passage in English, using the Morse code. This method is practicable, but it can easily be shown that it is not efficient.

The Morse code used for transmitting messages in English was an efficient method for the English language of the days of Morse. It was originally based on the frequencies of the letters of the alphabet in English. For example, the letter e is the most frequent letter in English and it is represented by a single dot (.). The letter z, however, occurs very infrequently and it is represented by a long sequence of dots and dashes (--..). For the most efficient method of coding, the more frequently a letter occurs in a language, the shorter its code symbol should be. The frequencies of the letters in English are quite different from those in Tamil. Therefore, for the most efficient and economical method of sending a message, there must be a separate code for Tamil.
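The principle that frequent letters should receive short code symbols can be sketched in a few lines of Python. This is only an illustration, not the procedure by which the Morse code or the code proposed in this paper was constructed: the function names and the toy frequency table are assumptions made for the example.

```python
from itertools import count, product


def dot_dash_sequences():
    """Yield every dot-dash sequence, shortest first: '.', '-', '..', '.-', ..."""
    for length in count(1):
        for symbols in product(".-", repeat=length):
            yield "".join(symbols)


def assign_codes(frequencies):
    """Give the shortest unused sequence to the most frequent letter, and so on."""
    ranked = sorted(frequencies, key=frequencies.get, reverse=True)
    return dict(zip(ranked, dot_dash_sequences()))


# Toy frequencies (illustrative only, not measured data).
toy = {"x": 5, "y": 40, "z": 15}
codes = assign_codes(toy)
print(codes)  # {'y': '.', 'z': '-', 'x': '..'}
```

Among sequences of equal length the order of assignment does not affect efficiency, so the order used here (dots before dashes) is arbitrary.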

One method is to give a separate symbol in dots and dashes to each of the 247 letters in Tamil. For one thing, this demands a tremendous mental effort from the operator, who must remember 247 long sequences of dots and dashes, including about 120 sequences of length 7 units. Out of the 247 letters, 216 (Uyirmei) are combinations of the 30 basic letters (12 vowels + 18 consonants), excluding the auxiliary (Aitham). For the combination we have the symbol but is not represented by a separate letter in Tamil. A statistical study shows that the combinations of a vowel (Uyir) following a consonant (Mei) and of a consonant following a vowel are equally frequent. This means that must be represented by a sequence of dots and dashes in the code if is represented by a separate sequence. When the telegraphic system is manually operated, we have to find some other method in which the number of symbols will not be large.
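The figure of about 120 sequences of length 7 follows from a simple count: there are 2^n distinct dot-dash sequences of length n, so lengths 1 to 6 together supply only 126 symbols. The short Python check below assumes that all shorter sequences are used up before longer ones are introduced; the function names are chosen here for illustration.

```python
def sequences_up_to(max_length):
    """Number of distinct dot-dash sequences of length 1..max_length."""
    return sum(2 ** n for n in range(1, max_length + 1))


def shortest_sufficient_length(alphabet_size):
    """Smallest maximum length that gives every letter its own sequence."""
    length = 1
    while sequences_up_to(length) < alphabet_size:
        length += 1
    return length


longest = shortest_sufficient_length(247)        # 7
at_longest = 247 - sequences_up_to(longest - 1)  # 247 - 126 = 121
print(longest, at_longest)  # 7 121
```

The same count shows that an alphabet of 30 basic letters needs no sequence longer than 4 units, since 2 + 4 + 8 + 16 = 30.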

It is possible to reduce the number of symbols from 247 to about 60, as in the case of the Tamil typewriter. I have worked out the relative frequencies of these symbols (Fig. II); to obtain efficiency in coding, the shortest code symbol must be assigned to the most frequent symbol, and so on. To transmit may be transmitted first, followed by the symbol for dot . For transmitting may be followed by the , as is the practice in writing Tamil.

A third method is to take the 30 basic letters: 12 vowels and 18 consonants. The word AM MA: will then be treated as a four-letter word, whereas it is treated as the three-letter word when the 247-letter alphabet is used. The frequencies of the different letters, which may be treated as the 30 basic sounds in Tamil, are given in Fig. I, along with the code symbols suggested by me.

Each symbol is represented by a sequence of dots and dashes, whose length does not exceed 4 units. This method compares favourably with the Morse code for English and our code represents the quickest and the most economical method of transmitting messages in Tamil.
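As a rough check on the economy of the proposed code, the average length of a code symbol can be computed from the frequencies and code symbols printed in Fig. I. The sketch below is an illustration under stated assumptions: each dot or dash is counted as one unit, the differing durations of dots and dashes and the gaps between letters are ignored, and only the length of each code symbol enters the calculation.

```python
# (sound, frequency per 1000, code symbol) as printed in Fig. I.
FIG_I = [
    ("a", 150, "."), ("k", 79, "-"), ("i", 78, ".."), ("u", 77, ".-"),
    ("th", 71, "-."), ("a:", 47, "--"), ("r", 45, "..."), ("p", 44, "..-"),
    ("m", 42, ".-."), ("n", 38, "-.."), ("t:", 36, ".-."), ("v", 35, "-.-"),
    ("l", 31, "--."), ("y", 29, "---"), ("ai", 27, "...."), ("t", 27, "...-"),
    ("c", 22, "..-."), ("l:", 21, ".-.."), ("nh", 21, "-..."), ("e", 18, "...-"),
    ("e:", 13, ".-.-"), ("n:", 11, "-..-"), ("o", 10, ".--."), ("o:", 7, "-.-."),
    ("l-", 7, "--.."), ("ng", 6, ".---"), ("i:", 4, "-.--"), ("u:", 4, "--.-"),
    ("nj", 1, "---."), ("au", 1, "----"),
]

total = sum(freq for _, freq, _ in FIG_I)
weighted = sum(freq * len(code) for _, freq, code in FIG_I)
average_length = weighted / total  # dots and dashes per letter, on average

print(total, weighted, round(average_length, 2))  # 1002 2475 2.47
```

The printed frequencies add up to 1002 rather than exactly 1000, presumably through rounding, so the average is taken over their actual sum.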

In practice, a few more symbols (including Aitham, the Grantha letters, numerals and period) will have to be represented by sequences of dots and dashes of length 5 units.

ii. Typewriter:

With the introduction of Tamil as the medium of official communication in Madras State, the need for a Tamil typewriter was keenly felt, and in 1958 the Government of Madras approved a "standardised" keyboard. Accepting the symbols and letters on this keyboard as they are, we shall analyse here whether the arrangement is efficient and whether the speed of typing can be increased by changing the positions of some of the keys.

The total number of symbols used on the keyboard is 69, which are sufficient to type all the letters (except which occur mostly in Tamil primers) including Aitham and the usual Grantha letters. For typing the Tamil letters alone there are 62 symbols arranged on 31 keys. 32 keys cover all letters, a comma and a period. Half the symbols are in the upper casing and the others in the lower casing. As in the English typewriter, the shift key has to be pressed before typing the letters in the upper casing. The letter is typed by typing first the dot (o) and then the letter . It is so arranged that the typewriter carriage does not move after the dot is typed. There are three such "dead stops". For typing are typed, first and then .

As the number of symbols increases, the effort needed to remember their positions increases. Compared to English, the effort is much greater in Tamil. To reduce the effort, the keyboard is constructed with a certain amount of regularity. For instance, . and . are arranged on the same key. However, there is no general rule such as in English, where each capital letter and the corresponding small letter belong to the same key.

To overcome the greater effort needed to remember the 62 symbols, the learning period for a typist must be increased. Once one is thoroughly trained, there should be little difficulty in remembering the different positions. Even if the initial difficulties are greater, the system that gives the faster speed must be adopted. This principle is generally accepted (if it were not so, the keyboard would be in direct alphabetic order, starting with a, a:, from the top left-hand corner) but it is not systematically put to practical use.

Therefore it is necessary that the letters which occur very frequently should be arranged in the most advantageous positions on the keyboard. This implies that the least frequent letters should be put in the upper casing thereby reducing the number of times the shift key has to be used. Among the letters which occur frequently, the most frequent letters should be arranged in such a way that they may be operated by the forefingers and the middle fingers in the middle rows of the typewriter.

To find out which letters are frequent and which are not, I undertook a statistical study to obtain reliable figures. The result is based on a sample drawn using random sampling techniques. Only prose works were considered, and the frequencies are based on about 500,000 pages published in Madras State during 1946-57. More than 20,000 letters were counted to make sure of the reliability of the frequencies. In Fig. II, the number in brackets gives the number of times the letter occurs (subject to fluctuations of sampling) in a sample of 10,000 symbols. For example, in a passage of length 10,000 letters, one may expect to occur 128 times and 68 times. In actual practice, the figures may not give the exact result but they will be very close. The larger the sample, the closer will be the approximation.
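The size of the "fluctuations of sampling" can be gauged with the usual binomial standard error. The sketch below is an approximation under assumptions not stated in the paper: it treats the counted letters as independent draws and takes the sample size as exactly 20,000, whereas the actual sampling was of printed pages, so the figures are only a rough guide.

```python
import math


def count_std_error(count_per_10000, sample_size):
    """Standard error, on the per-10,000 scale, of a frequency estimated
    from `sample_size` independently drawn letters (binomial model)."""
    p = count_per_10000 / 10_000
    return 10_000 * math.sqrt(p * (1 - p) / sample_size)


se = count_std_error(128, 20_000)
se_larger_sample = count_std_error(128, 80_000)
print(round(se, 1), round(se_larger_sample, 1))
```

On this model a letter expected 128 times in 10,000 carries an uncertainty of roughly 8 either way, and quadrupling the sample halves that uncertainty, which is the sense in which a larger sample gives a closer approximation.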

occurs 155 times, 128 times, 106 times and 81 times. All these are in the upper casing and it means that the shift key has to be used before typing out each letter. On the other hand, occurs once, 27 times, 59 times and 67 times. All these are in the lower casing. To minimize the use of shift keys (and to increase the speed of typing), , and must be brought to the lower casing and , and transferred to the upper casing.
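The saving in shift-key presses can be estimated directly from the counts quoted above. The sketch below is illustrative and makes two assumptions: each upper-casing letter costs exactly one press of the shift key, and the letters exchanged are the most frequent upper-casing letters and the least frequent lower-casing ones. The function name is hypothetical.

```python
# Counts per 10,000 symbols quoted in the text.
UPPER_CASING = [155, 128, 106, 81]  # letters at present needing the shift key
LOWER_CASING = [1, 27, 59, 67]      # letters at present typed without it


def shift_presses_saved(upper, lower, swaps):
    """Shift presses saved per 10,000 symbols by exchanging the `swaps` most
    frequent upper-casing letters with the `swaps` rarest lower-casing ones."""
    frequent = sorted(upper, reverse=True)[:swaps]
    rare = sorted(lower)[:swaps]
    return sum(frequent) - sum(rare)


saved_three = shift_presses_saved(UPPER_CASING, LOWER_CASING, 3)  # 389 - 87
saved_four = shift_presses_saved(UPPER_CASING, LOWER_CASING, 4)   # 470 - 154
print(saved_three, saved_four)  # 302 316
```

On these figures, exchanging three pairs saves about 302 shift presses in every 10,000 symbols; exchanging the fourth pair as well would add only 14 more.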

The dot used for all the pure consonants like , ... is the most frequent symbol occurring 1848 times. This makes the little finger of the right hand, the most hard-worked. Also, the symbol has a high frequency of 645 to be typed by the same little finger. If these symbols are operated by the middle finger, for instance, the fatigue on the little finger will be reduced and the speed may be increased.

In English, the space bar is more frequently used than the most frequent letter E. In Tamil, however, the dot is used about 39% more often than the space bar, and some adjustment has to be made in the Tamil typewriter. It will be worthwhile constructing a keyboard where the dot can be typed using part of the "bar" used for "space". For example, the bar can be divided into three equal sections and the middle portion used for the dot, so that it may be operated by either of the thumbs.
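The load on the right little finger and the implied use of the space bar follow from the figures already quoted. The arithmetic below is a sketch under assumptions: the counts per 10,000 symbols are taken not to include spaces, and any other keys assigned to the little finger are ignored.

```python
DOT_COUNT = 1848          # dot, per 10,000 symbols
SECOND_SYMBOL_COUNT = 645  # the other symbol typed by the same finger
DOT_TO_SPACE_RATIO = 1.39  # the dot is used about 39% more often than space

# Share of all symbols that fall to the right little finger for these two keys.
little_finger_share = (DOT_COUNT + SECOND_SYMBOL_COUNT) / 10_000

# Space-bar presses implied per 10,000 symbols typed.
implied_space_presses = DOT_COUNT / DOT_TO_SPACE_RATIO

print(f"{little_finger_share:.1%}", round(implied_space_presses))
```

On these assumptions about a quarter of all symbols are struck by one little finger, and the space bar is pressed roughly 1,330 times for every 10,000 symbols.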

The bar (-) and the question mark (?) need not be kept so close to the other letters but can be put on a key at the extreme right. This will increase the compactness, and all the necessary keys will be near the guide row. Some further improvements can be made by studying the frequencies given in Fig. II.

iii. Teleprinter:

For the purpose of constructing a teleprinter, the number of symbols must be drastically reduced. One solution is to do away with the Uyirmei and have the 30 basic letters only. One may even leave out the two diphthongs. Then the present English teleprinter can easily be converted to Tamil. It is very doubtful whether this suggestion will be accepted.

A more acceptable solution will be to reduce the number of symbols used in the present keyboard. are not frequent. They may be removed and written as Similarly ,... may be written as ,... thereby removing the symbol r. A new symbol may be introduced to take care of ... It must fit in with ,... to give ,... can be removed and written as + . If a new symbol is introduced, we can do away with ,... By this method 15 symbols can be omitted and the number of symbols is reduced to 47. For sending numbers some convention must be agreed upon as to what letters should represent each numeral (Fig. IV) and the Tamil numerals must be used following the decimal system. 320 will be written as O where O is a new symbol.
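Writing numbers in the decimal system with one symbol per digit can be sketched as follows. This is an illustration only: the letter assignments of Fig. IV are not reproduced here, so the Unicode Tamil digit characters (U+0BE7 to U+0BEF) stand in for them, the letter "O" stands in for the new zero symbol, and the function name is hypothetical.

```python
ZERO_SYMBOL = "O"  # placeholder for the proposed new symbol for zero


def to_decimal_tamil(number):
    """Write a non-negative integer digit by digit in the decimal system,
    using Unicode Tamil digits for 1-9 and ZERO_SYMBOL for 0."""
    digits = []
    for ch in str(number):
        d = int(ch)
        # U+0BE6 is TAMIL DIGIT ZERO; U+0BE7..U+0BEF are the digits 1..9.
        digits.append(ZERO_SYMBOL if d == 0 else chr(0x0BE6 + d))
    return "".join(digits)


written = to_decimal_tamil(320)
print(written)  # Tamil digit three, Tamil digit two, then the zero symbol
```

With a zero symbol available, every number needs only the nine digit symbols and the zero, and no separate symbols for ten, hundred or thousand.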

In the modern English teleprinter, 52 symbols can be printed and with our 47 symbols for Tamil letters, we can choose 5 more useful symbols, like the period, zero, Grantha letters like or some other symbols for Tamil numerals. Then the Grantha letters may be obtained by using the symbols combining them in suitable ways. Letters like can be represented by equivalent symbols. For instance can be printed as . The keyboards designed by the author are given in Figs.VI and VII.

In the Hindi teleprinter, 54 symbols can be printed, including the 10 Arabic numerals,  0, 1, 2,... 9. It must be possible to adapt such a keyboard to Tamil, provided the number of symbols for printing Tamil letters can be reduced to 44. may be represented as combinations of and a new symbol / can be represented as if the symbol | in is replaced by a longer symbol | . Instead of the separate symbol , we may have to use a combination of and . and as suggested here are very similar to the corresponding characters in the Raja Raja Chola's Tamil script. Now the total number of symbols reduces to 44 and the Hindi teleprinter can easily be converted into a Tamil one. In this case the Grantha letters and their combinations cannot be represented.
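The symbol budgets for the two teleprinters can be checked with simple arithmetic. The sketch below only restates the counts given in the text; the helper name is hypothetical.

```python
def spare_positions(capacity, reserved, letter_symbols):
    """Printable positions left after reserving some and placing the letters."""
    return capacity - reserved - letter_symbols


TYPEWRITER_LETTER_SYMBOLS = 62   # symbols for Tamil letters on the keyboard
FIRST_REDUCTION = 15             # symbols omitted by the first set of changes
tamil_symbols = TYPEWRITER_LETTER_SYMBOLS - FIRST_REDUCTION  # 47

# English teleprinter: 52 printable symbols, none reserved.
english_spare = spare_positions(52, 0, tamil_symbols)        # 5

# Hindi teleprinter: 54 printable symbols, 10 reserved for Arabic numerals.
hindi_letter_budget = 54 - 10                                # 44
still_to_remove = tamil_symbols - hindi_letter_budget        # 3

print(tamil_symbols, english_spare, hindi_letter_budget, still_to_remove)
# 47 5 44 3
```

The Hindi machine therefore requires three more symbols to be eliminated beyond the 47-symbol scheme, which is what the further substitutions described above are meant to achieve.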

The counting experiment, using random sampling techniques, was conducted under the supervision of Dr. W. F. Kibble, Professor of Mathematics, Madras Christian College, and the late Dr. R. P. Sethu Pillai was consulted at various stages of the experiment. The methods given are quite general and can be applied to other languages also. Work on some other Dravidian languages is being started in our Department of Mathematics, Madras Christian College, Tambaram.

FIG. I. FREQUENCIES OF SOUNDS AND THEIR PROPOSED CODE SYMBOLS FOR TAMIL

Sound  Frequency  Code      Sound  Frequency  Code
a      150        .         t      27         ...-
k      79         -         c      22         ..-.
i      78         ..        l:     21         .-..
u      77         .-        nh     21         -...
th     71         -.        e      18         ...-
a:     47         --        e:     13         .-.-
r      45         ...       n:     11         -..-
p      44         ..-       o      10         .--.
m      42         .-.       o:     7          -.-.
n      38         -..       l-     7          --..
t:     36         .-.       ng     6          .---
v      35         -.-       i:     4          -.--
l      31         --.       u:     4          --.-
y      29         ---       nj     1          ---.
ai     27         ....      au     1          ----
Total  1000

FIG. II. POSITION OF SYMBOLS AND THEIR CORRESPONDING FREQUENCIES ON THE STANDARD KEYBOARD



FIG. III. REDUCTION OF NUMBER OF LETTERS FOR THE TELEPRINTER



FIG. IV. PROPOSED CONVENTION FOR NUMERALS



FIG. V. FURTHER REDUCTION OF SYMBOLS

The symbol | can be lengthened upwards and a new symbol / introduced.



FIG. VI. TELEPRINTER KEYBOARD (MODEL I)



FIG. VII. TELEPRINTER KEYBOARD (MODEL II)


