Authorship identification for Chinese literature based on a pyramid deep bidirectional gated recurrent unit network with voting strategy.
Journal:
Scientific Reports
Published Date:
May 29, 2026
Abstract
Identifying the author of a given text excerpt is a challenging and significant task in computational linguistics. In this paper, we propose a novel deep-learning model built on a Pyramid Deep Bidirectional Gated Recurrent Unit (biGRU) network for authorship identification of Chinese literature. The model uses a hierarchical architecture in which the output sequence of each GRU layer is progressively downsampled by a trainable convolutional module before serving as the input to the next layer. To improve computational efficiency, we share parameters across all layers. We further incorporate a multi-level soft-voting mechanism in which the hidden features extracted at each layer contribute independently to the final prediction, capturing several levels of feature abstraction. We evaluate our approach on a Chinese literary dataset of over 90k tokens drawn from short, fragmented utterances, where coherent and extensive context is inherently limited. Experimental results show that our model achieves an F1 score of 0.8224 while requiring only about 41% of the FLOPs of a conventional deep biGRU network of equal depth, confirming its effectiveness and efficiency for authorship identification.
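The two mechanisms named in the abstract, pyramid downsampling and multi-level soft voting, can be illustrated with a minimal sketch. This is not the authors' implementation. The downsampling factor of 2, the depth of 4, the uniform vote weights, and all function names are assumptions made for illustration. The cost estimate counts only recurrent time steps and ignores the convolutional module, so it is not expected to reproduce the reported 41% figure.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(per_layer_logits, weights=None):
    """Combine per-layer predictions by a weighted average of class probabilities.

    Uniform weights are an assumption; the abstract does not state the weighting.
    """
    n_layers = len(per_layer_logits)
    if weights is None:
        weights = [1.0 / n_layers] * n_layers
    probs = [softmax(layer) for layer in per_layer_logits]
    n_classes = len(probs[0])
    return [sum(w * p[c] for w, p in zip(weights, probs)) for c in range(n_classes)]


def pyramid_lengths(seq_len, depth, factor=2):
    """Sequence length seen by each layer when every layer's output is downsampled.

    The factor of 2 is hypothetical; the abstract does not give the stride.
    """
    lengths = []
    for _ in range(depth):
        lengths.append(seq_len)
        seq_len = max(1, seq_len // factor)
    return lengths


def relative_recurrent_cost(seq_len, depth, factor=2):
    """Recurrent steps in the pyramid divided by steps in a same-depth stack
    that keeps the full sequence length at every layer."""
    return sum(pyramid_lengths(seq_len, depth, factor)) / (seq_len * depth)


if __name__ == "__main__":
    # Hypothetical logits from three layers for a three-author problem.
    layer_logits = [[2.0, 0.5, 0.1], [1.0, 1.2, 0.3], [0.2, 0.4, 2.5]]
    probs = soft_vote(layer_logits)
    predicted_author = max(range(len(probs)), key=probs.__getitem__)
    print(probs, predicted_author)
    print(pyramid_lengths(64, 4))          # [64, 32, 16, 8]
    print(relative_recurrent_cost(64, 4))  # 0.46875
```

Under these assumptions, a four-layer pyramid over 64 time steps runs 120 recurrent steps instead of 256, which shows why shortening the sequence at each level reduces compute relative to a conventional stack of the same depth.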