An Efficient Deep Learning Framework for Revealing the Evolution of Characterization Methods in Nanoscience.
Journal:
Nano-Micro Letters
Published Date:
Jun 13, 2025
Abstract
Text mining has emerged as a powerful strategy for extracting the knowledge structure of a domain from large volumes of text. To date, most text mining methods are restricted to specific types of literature information, which results in incomplete knowledge graphs. Here, we report a method that combines citation analysis with topic modeling to describe the hidden development patterns in the history of science. Leveraging this method, we construct a knowledge graph for the field of Raman spectroscopy. The traditional Latent Dirichlet Allocation (LDA) model serves as the baseline for validating the performance of our model. Compared with this traditional text mining method, our method improves topic coherence with a minimum growth rate of 100%. It also outperforms the baseline on topic diversity, with growth rates ranging from 0 to 126%. The results show the effectiveness of the rule-based tokenizer we designed in solving the word tokenization problem caused by entity naming rules in the field of chemistry. The method is versatile: it reveals the distribution of topics, establishes similarity and inheritance relationships between them, and identifies the important moments in the history of Raman spectroscopy. Our work provides a comprehensive tool for science of science research and promises to offer new insights into the historical survey and development forecast of a research field.
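The abstract does not spell out the tokenizer rules or the exact metric definitions, so the following is only a minimal sketch under stated assumptions. The regular expression `CHEM_TOKEN` and the functions `tokenize`, `topic_diversity`, and `growth_rate` are hypothetical names introduced for illustration. The sketch assumes that the tokenization problem arises because chemical names such as "1,2-dichloroethane" are split at hyphens and commas, that diversity is the fraction of unique words among the top words of all topics (a common definition, not necessarily the one used in the paper), and that growth rate is the relative improvement over the baseline in percent.

```python
import re

# Assumed rule: alphanumeric runs joined by '-', ',' or '@' with no spaces
# form a single token, so chemical entity names are not broken apart.
CHEM_TOKEN = re.compile(r"[A-Za-z0-9]+(?:[-,@][A-Za-z0-9]+)*")


def tokenize(text):
    """Split text into tokens while keeping chemical-style names intact."""
    return CHEM_TOKEN.findall(text)


def topic_diversity(topics, top_k=10):
    """Fraction of unique words among the top_k words of every topic."""
    top_words = [word for topic in topics for word in topic[:top_k]]
    return len(set(top_words)) / len(top_words)


def growth_rate(baseline, improved):
    """Relative improvement of a score over the baseline, in percent."""
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    sentence = "SERS of 1,2-dichloroethane and 4-mercaptobenzoic acid on Au@SiO2."
    print(tokenize(sentence))

    # Toy topics, invented purely to exercise the functions.
    topics = [["raman", "sers", "gold"], ["raman", "graphene", "band"]]
    print(topic_diversity(topics, top_k=3))
    print(growth_rate(0.25, 0.5))
```

Under these assumptions, a growth rate of 100% means the score doubled relative to the baseline, and a growth rate of 0% means the score equals the baseline.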