A method for English paragraph grammar correction based on differential fusion of syntactic features.

Journal: PLOS ONE
Published Date:

Abstract

Recent progress in deep learning and natural language processing has strongly advanced English grammatical error correction. However, existing methods mostly rely on large-scale corpora and often ignore the fine-grained syntactic relationships between sentences in a paragraph, which limits their effectiveness in complex error correction scenarios. To address this bottleneck, this study proposes a method that uses syntactic features to improve the quality and accuracy of paragraph-level grammar correction. First, sentence vector representations are constructed with BERT, and the syntactic structure of each sentence is extracted by dependency parsing. Next, a differential fusion analysis is carried out: the syntactic difference between adjacent sentences is measured by cosine similarity, significant differences caused by grammatical errors are identified against a preset threshold, and the position and type of each error are located. The original sentence vectors are then fed into a Transformer-based Seq2Seq model, which uses an attention mechanism to focus on the erroneous regions and generate correction suggestions. Preliminary results show that the method significantly outperforms existing grammatical error correction systems. On the CoLA dataset, accuracy is 0.88, three percentage points higher than BERT-GC. On the LCoLE dataset, accuracy is 0.86, ahead of the baseline model. On the FCE dataset, accuracy is 0.89. The reported accuracy gain is 3%. These results indicate that the method is effective at recognizing and correcting grammatical errors, and that it can provide accurate correction suggestions, help English learners improve their writing, and support the quality of English writing.
This study not only presents a powerful approach to English grammar error correction, but also highlights the key value of syntactic features in optimizing natural language processing applications.
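
The difference-detection step described in the abstract (cosine similarity between the syntactic features of adjacent sentences, compared against a preset threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how syntactic features are encoded, so the sketch assumes a simple count vector over a small, hypothetical set of dependency labels (`LABELS`), and the threshold value `0.5` is likewise an assumption chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical dependency-label inventory (an assumption for this sketch);
# a real dependency parser defines its own, larger label set.
LABELS = ["nsubj", "obj", "det", "amod", "aux", "advmod", "root"]


def label_vector(dep_labels):
    """Encode one sentence as a count vector over LABELS."""
    counts = Counter(dep_labels)
    return [float(counts[label]) for label in LABELS]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def flag_divergent_pairs(sentence_labels, threshold=0.5):
    """Return each index i where sentences i and i+1 fall below the threshold.

    The threshold is an assumed value, not one taken from the paper.
    """
    vectors = [label_vector(labels) for labels in sentence_labels]
    flagged = []
    for i in range(len(vectors) - 1):
        if cosine_similarity(vectors[i], vectors[i + 1]) < threshold:
            flagged.append(i)
    return flagged


# Toy paragraph: dependency labels per sentence, as a parser might emit them.
paragraph = [
    ["det", "nsubj", "root", "obj"],
    ["det", "nsubj", "root", "obj"],
    ["advmod", "advmod", "amod"],
]
divergent = flag_divergent_pairs(paragraph)
```

In this toy input the first two sentences share the same label profile, while the third shares no labels with the second, so only the second adjacent pair is flagged. In the method described above, flagged positions would then guide the attention of the Transformer-based Seq2Seq corrector; that stage is not sketched here.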

Authors

  • Weiling Liu
    School of Mechanical Engineering, Tianjin Key Laboratory of Power Transmission and Safety Technology for New Energy Vehicles, Hebei University of Technology, Tianjin, China.
  • Caijun Zhao
    College of International Studies, College of Computer Science, Beibu Gulf University, Qinzhou, P. R. China.
  • Yongyi Li
    Rehabilitation Department of Traditional Chinese Medicine, The Second Rehabilitation Hospital of Shanghai, Shanghai, China.
  • Chenglong Cai
    College of International Studies, College of Computer Science, Beibu Gulf University, Qinzhou, P. R. China.
  • Hong Liu
    Key Laboratory of Grain and Oil Processing and Food Safety of Sichuan Province, College of Food and Bioengineering, Xihua University, Chengdu 610039, China. xingyage1@163.com
  • Ruilin Qiu
    College of International Studies, College of Computer Science, Beibu Gulf University, Qinzhou, P. R. China.
  • Ruoci Su
    College of International Studies, College of Computer Science, Beibu Gulf University, Qinzhou, P. R. China.
  • Bingbing Li
    Department of Pathology, Nanfang Hospital and Basic Medical College, Southern Medical University, Guangzhou 510515, Guangdong Province, China; Guangdong Province Key Laboratory of Molecular Tumor Pathology, Guangzhou 510515, Guangdong Province, China.