A Computerized System for Translating Japanese Print into Braille

N. Ohtake

It is difficult for beginners to master the complex rules for translating Japanese text into braille. Unlike English braille, for example, a sentence is not translated letter by letter, and there is not a one-to-one matching between printed text and braille. To facilitate the production of braille and encourage volunteers to learn braille transcription, the author has developed a new braille-translation system, which automatically translates regular Japanese text files into braille files by using translation rules and dictionaries developed by the author.

DIFFICULTIES RELATED TO BRAILLE TRANSLATION

What makes Japanese more difficult than some other languages to translate into braille? First, in Japanese there are no delimiters between words. Second, Japanese has a variety of character sets (Kanji, Hiragana, and Katakana, as well as the Roman alphabet for borrowed words); thus, a single word can be expressed by more than one character set, and one word may be expressed by a combination of various character sets. Third, a Japanese Kanji character has two kinds of pronunciations; sometimes different meanings are represented by the same pronunciation; and sometimes a single word is pronounced in more than one way. Fourth, the most significant component of an automatic braille translation system is the programming of specific grammar rules. This task is difficult, because the Japanese language has so many exceptions to its rules.

HARDWARE AND SOFTWARE

The first version of the braille-translation program works only in the UNIX and MS-DOS environments. However, because the aim is to allow the program to work on any computer, the program was written in the C programming language, which allows users who know the language to modify the program for use on a specific computer. Versions of the program that are compatible with other user interfaces will be considered for development in the future.

TECHNIQUE AND DEVELOPMENT

To program the system to follow the rules of Japanese grammar, the author used Chomsky's (1976) phrase structure grammar to allow the program to perform structure analysis of the sentence. (Classification of productivity according to the phrase structure grammar is described in Hopcroft & Ullman, 1969.) However, there are some types of phrase structure grammar in which definitions are restricted to a syntactical level, making it necessary to develop another way to correct the analysis beyond the syntactical level. For such cases, the author devised a system of grammatical neighbor relationships between morphological elements. Each morphological element (MEn) has left and right grammatical neighbor relation factors (gnrn). If an element gnr2 is on the right side of ME1 and on the left side of ME2, then ME1 and ME2 are connectable. This concept is illustrated in Figure 1.

This grammatical information brings about a semantic relationship between morphological elements. The author applied the combination of Chomsky's (1976) transformational generative grammar and this semantic relationship for Japanese sentence analysis to the braille translation.
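The connectability test between morphological elements can be sketched in C as follows. This is a minimal illustration, not the author's implementation: the struct MorphElement, the factor names GNR1 to GNR3, the bitmask encoding, and the functions connectable and sequence_connectable are all assumptions introduced here. It shows only the stated rule that ME1 and ME2 are connectable when a factor appears on the right side of ME1 and on the left side of ME2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relation factors; the real system's factor set is not
   given in the article. Each factor occupies one bit. */
enum {
    GNR1 = 1u << 0,
    GNR2 = 1u << 1,
    GNR3 = 1u << 2
};

/* Assumed representation of a morphological element (ME). */
typedef struct {
    const char *surface;  /* text of the element (romanized placeholder) */
    uint32_t left_gnr;    /* factors on the element's left side */
    uint32_t right_gnr;   /* factors on the element's right side */
} MorphElement;

/* ME1 and ME2 are connectable when at least one factor appears both on
   the right side of ME1 and on the left side of ME2. */
bool connectable(const MorphElement *me1, const MorphElement *me2)
{
    assert(me1 != NULL && me2 != NULL);
    return (me1->right_gnr & me2->left_gnr) != 0;
}

/* Returns true when every adjacent pair in a candidate segmentation
   is connectable. */
bool sequence_connectable(const MorphElement *elems, size_t count)
{
    for (size_t i = 0; i + 1 < count; i++) {
        if (!connectable(&elems[i], &elems[i + 1]))
            return false;
    }
    return true;
}
```

Under these assumptions, a candidate segmentation of a sentence would be rejected as soon as one adjacent pair shares no factor. Note that the relation is directional: ME1 followed by ME2 may be connectable while ME2 followed by ME1 is not.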

An automatic braille-translation program must provide specific rules for translating into braille. There are two different types of rules: generative rules, which divide sentences into morphological segments, and rewriting rules, which are used for special pronunciations. The generative rules, which are the knowledge base of professional braille translators, are embedded in the braille translation program, and the rewriting rules are used to build a dictionary. There are 3,011 generative rules and 115 rewriting rules. Each rule is defined to comply with Japanese braille definitions.
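A rewriting rule for a special pronunciation might be represented as in the following sketch. The table layout, the type names RewriteRule and PartOfSpeech, and the function braille_reading are hypothetical, and romanized strings stand in for kana. The two rules shown reflect a well-known case: the particles written "ha" and "he" are pronounced "wa" and "e", and Japanese braille follows the pronunciation. The article does not say how its 115 rewriting rules are encoded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical part-of-speech tags for this illustration. */
typedef enum { POS_NOUN, POS_PARTICLE, POS_VERB } PartOfSpeech;

/* Assumed shape of one rewriting rule. */
typedef struct {
    const char *written;  /* form as written in print (romanized here) */
    PartOfSpeech pos;     /* the rule applies only to this part of speech */
    const char *spoken;   /* pronunciation to emit for braille */
} RewriteRule;

/* Two illustrative rules, not the author's rule set. */
static const RewriteRule REWRITE_RULES[] = {
    { "ha", POS_PARTICLE, "wa" },
    { "he", POS_PARTICLE, "e"  },
};

/* Returns the special pronunciation if a rule matches both the written
   form and the part of speech; otherwise returns the written form. */
const char *braille_reading(const char *written, PartOfSpeech pos)
{
    assert(written != NULL);
    size_t n = sizeof REWRITE_RULES / sizeof REWRITE_RULES[0];
    for (size_t i = 0; i < n; i++) {
        if (REWRITE_RULES[i].pos == pos &&
            strcmp(REWRITE_RULES[i].written, written) == 0)
            return REWRITE_RULES[i].spoken;
    }
    return written;
}
```

Keying the rule on the part of speech as well as the written form is the point of the example: the same kana inside a noun keeps its ordinary reading, which is why segmentation into morphological elements has to come first.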

Figure 1. A conceptual illustration of the grammatical neighbor relationship between morphological elements. NOTE: Figure not available at this site.

The program's dictionary consists of a listing of words, along with grammatical information about the word, its root, and its translated pronunciation, according to the rewriting rules. The dictionary is divided into two subdictionaries, a small one and a large one. The small subdictionary, which contains 30,400 words and requires 300K of memory, can be used when the computer's memory is limited. The large dictionary, which contains 97,000 words, requires 1,000K of memory. Even the large dictionary is restricted to a basic vocabulary and does not contain technical terms. Therefore, special technical dictionaries need to be developed. In addition, users can create a personal dictionary for such words as place names and personal names. Personal dictionaries are empty when the program is installed, and an option tool allows users to add to the dictionary as needed.
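The dictionary structure described above can be sketched as follows. The field names, the sorted-array layout, the binary search, and the choice to consult the personal dictionary before the main one are assumptions made for illustration; the article lists only what an entry contains (the word, grammatical information, its root, and its translated pronunciation) and states that a user-maintained personal dictionary exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One entry, following the fields named in the article. */
typedef struct {
    const char *word;     /* the word as it appears in text */
    const char *grammar;  /* grammatical information, e.g. a tag */
    const char *root;     /* root of the word */
    const char *reading;  /* translated pronunciation */
} DictEntry;

/* Assumed layout: an array of entries sorted by word using strcmp. */
typedef struct {
    const DictEntry *entries;
    size_t count;
} Dictionary;

/* Comparison callback for bsearch: the key is a C string. */
static int compare_word(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const DictEntry *)elem)->word);
}

/* Binary search in one dictionary; returns NULL when not found. */
const DictEntry *dict_find(const Dictionary *dict, const char *word)
{
    assert(dict != NULL && word != NULL);
    if (dict->count == 0)
        return NULL;  /* e.g. a freshly installed personal dictionary */
    return bsearch(word, dict->entries, dict->count,
                   sizeof(DictEntry), compare_word);
}

/* Assumed lookup order: personal dictionary first, then the main
   (small or large) subdictionary. */
const DictEntry *lookup_word(const Dictionary *personal,
                             const Dictionary *main_dict,
                             const char *word)
{
    const DictEntry *hit = dict_find(personal, word);
    return hit != NULL ? hit : dict_find(main_dict, word);
}
```

Checking the personal dictionary first would let a user's entry for a place name or personal name take priority over a basic-vocabulary entry with the same spelling. Whether the actual program resolves such conflicts this way is not stated.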

STRENGTHS AND LIMITATIONS

Multiple tests showed that the program achieves an accuracy of 93 percent. This result is high but not sufficient, and future research needs to be done to increase the program's accuracy. The lack of translation rules was the source of many mistakes. To correct this problem, the author added an output message to tell users the reason for errors and to allow them to correct the errors. An editorial tool for beginners has also been added to reduce inaccuracies. This tool allows users to correct errors in both the sentences that they input and the sentences that the braille-translation program outputs. The most effective method to improve accuracy is to add words that are not in the program's dictionary to the personal dictionary. This method is not ideal because it puts the burden on the user, but it drastically improves the program's accuracy.

An important limitation to the program is that it cannot translate special descriptions, such as some mathematical expressions, figures, tables, and special symbols. Another limitation is that the interface can be used by persons with low vision but not persons who are blind. Future research needs to be done to make the program compatible with speech synthesizers and braille displays.

CONCLUSION

One of the purposes of developing an automatic braille translation system is to increase the number of Japanese braille translators. Although there are many people who want to volunteer as braille translators, the complexity of Japanese braille rules creates many obstacles to learning braille translation. It is hoped that the program described in this Random Access column will help remove some of those obstacles. The major advantage of the system is its ease of use, which makes it especially well suited for beginners. However, further research needs to be done to improve its accuracy, make it more interactive with other computer programs, and increase its accessibility to blind users. The author plans to use the feedback from current users to make future improvements.

REFERENCES

Chomsky, N. (1976). Aspects of the theory of syntax. Cambridge: MIT Press.

Hopcroft, J.E. & Ullman, J.D. (1969). Formal languages and their relation to automata. Reading, MA: Addison-Wesley.

Nobuyuki Ohtake, Research Centers for Developing Educational Methods, Tsukuba College of Technology, 4-12 Kasuga, Tsukuba 305, Japan.