Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis.

Journal: Neuroinformatics
Published Date:

Abstract

Reading Uyghur text from biomedical graphic images is a challenging problem because of the complex layouts and the cursive script of Uyghur. In this paper, we propose a system that extracts text from Uyghur biomedical images and matches it against a specific lexicon for semantic analysis. The proposed system has the following distinctive properties. First, it is an integrated system: a single fully convolutional neural network detects and crops the Uyghur text lines, and a carefully designed matching network then matches keywords from the lexicon. Second, to train the matching network effectively, an online sampling method continually generates synthetic data. Finally, we propose a GPU acceleration scheme that lets the matching network process a complete Uyghur text line directly rather than a single window at a time. Experimental results on a benchmark dataset show that our method achieves an F-measure of 74.5%. In addition, thanks to the GPU acceleration scheme, the system remains efficient, with a running time of 0.5 s per image.
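
For readers unfamiliar with the reported metric, the sketch below shows how an F-measure is typically computed from keyword-match counts. It assumes the standard balanced F1 definition (the harmonic mean of precision and recall); the abstract does not state which variant the authors used. The function name `f_measure` and the counts in the usage lines are hypothetical illustrations, not values from the paper.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """Return the F-beta score from raw match counts.

    Assumption: beta=1.0 gives the balanced F1 score. The abstract does
    not specify the variant, so this is an illustrative default only.
    """
    predicted = true_positives + false_positives   # keywords the system reported
    actual = true_positives + false_negatives      # keywords actually present
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, chosen only to make the arithmetic easy to follow:
# 3 correct matches, 1 spurious match, 1 missed keyword.
score = f_measure(true_positives=3, false_positives=1, false_negatives=1)
print(f"F-measure: {score:.3f}")
```

With these hypothetical counts, precision and recall are both 3/4, so their harmonic mean is also 3/4.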

Authors

  • Shancheng Fang
    National Engineering Laboratory for Information Security Technologies, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China.
  • Hongtao Xie
    School of Information Science and Technology, University of Science and Technology of China, Hefei, China. htxie@ustc.edu.cn.
  • Zhineng Chen
    Institute of Automation, Chinese Academy of Sciences (CAS), China.
  • Yizhi Liu
    School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China.
  • Yan Li
    Interdisciplinary Research Center for Biology and Chemistry, Liaoning Normal University, Dalian, China.