Abstract
The continual emergence of new font types has a great impact on the recognition performance of optical character recognition (OCR) systems. The greater the diversity of fonts, the lower the recognition accuracy, particularly for Thai fonts. To overcome this obstacle, this paper proposes a font descriptor for printed Thai character recognition. The descriptor serves as a representative of various fonts and sizes. It is constructed using principal component analysis (PCA) in combination with predefined patterns in a multi-level process. The proposed font descriptor is tested on a Thai character image corpus consisting of consonants, vowels, and tone marks. The experimental results show that the proposed font descriptor is efficient and robust to variations in font type and size.
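To make the PCA part of this idea concrete, the following is a minimal sketch, not the method of the paper: the abstract does not specify the predefined patterns or the multi-level processing, so both are omitted. It assumes that one PCA subspace is fitted per character from glyphs rendered in several fonts, and that a test glyph is assigned to the character whose subspace reconstructs it best. The function names (`build_descriptor`, `reconstruction_error`, `classify`, `render_variants`), the 8x8 synthetic templates, and the labels "A" and "B" are all hypothetical placeholders, not Thai glyph data.

```python
import numpy as np

def build_descriptor(images, n_components):
    """Fit a PCA subspace to glyphs of one character in several fonts.

    images: array of shape (n_samples, height, width).
    Returns (mean, components); components has shape (n_components, height*width).
    """
    X = images.reshape(len(images), -1).astype(float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(image, descriptor):
    """Distance between a glyph and its projection onto the PCA subspace."""
    mean, components = descriptor
    x = image.reshape(-1).astype(float) - mean
    projected = components.T @ (components @ x)
    return float(np.linalg.norm(x - projected))

def classify(image, descriptors):
    """Return the label whose descriptor reconstructs the glyph best."""
    return min(descriptors,
               key=lambda label: reconstruction_error(image, descriptors[label]))

# Synthetic stand-in data (an assumption for illustration only):
# each "font variant" is a template plus small random noise.
rng = np.random.default_rng(0)

def render_variants(template, n_variants):
    noise = rng.standard_normal((n_variants,) + template.shape)
    return template + 0.1 * noise

template_a = np.zeros((8, 8))
template_a[:, 3] = 1.0   # vertical stroke
template_b = np.zeros((8, 8))
template_b[3, :] = 1.0   # horizontal stroke

descriptors = {
    "A": build_descriptor(render_variants(template_a, 5), n_components=2),
    "B": build_descriptor(render_variants(template_b, 5), n_components=2),
}

test_a = render_variants(template_a, 1)[0]
test_b = render_variants(template_b, 1)[0]
print(classify(test_a, descriptors), classify(test_b, descriptors))
```

In this sketch the stored mean and principal directions play the role of a per-character descriptor that absorbs font-to-font variation; a real system would replace the synthetic templates with size-normalized glyph images.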
Ref: http://www.mva-org.jp