A machine learning assisted identification of optimum set of order parameters for study of gas hydrate nucleation in a molecular simulation.
Journal:
The Journal of Chemical Physics
Published Date:
Aug 14, 2025
Abstract
The study of crystalline materials is of scientific and technological importance, and molecular simulations are widely used to characterize their structure and study their formation mechanisms. In this work, we develop models for identifying and classifying crystal polymorphs during a molecular simulation. The models are based on XGBoost, a scalable, distributed implementation of gradient-boosted decision trees. The inputs to the models are a set of generic order parameters selected from a large pool using machine learning techniques. This study focuses on gas hydrates, which are naturally occurring crystalline compounds of light gases and water. These materials are of considerable scientific and technological importance, and their formation mechanisms under natural and laboratory conditions are an area of active research. The XGBoost models developed in this work accurately classify gas hydrate polymorphs and can also be used to compute nucleation rates using the mean first passage time technique. The novelty of this work lies in demonstrating that machine learning techniques reduce the need for considerable crystallographic expertise when identifying crystal order parameters for polymorph classification.
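The mean first passage time (MFPT) technique named in the abstract can be illustrated with a minimal sketch. It assumes each simulation yields a time series of the largest crystalline cluster size, for example the number of molecules a classifier labels as hydrate-like at each frame. The function names, the toy trajectories, and the time step below are hypothetical and are not taken from the article.

```python
from statistics import mean


def first_passage_time(sizes, threshold, dt):
    """Return the first time at which the cluster size reaches `threshold`.

    `sizes` is the largest-cluster size per saved frame and `dt` is the
    time between frames. Returns None if the threshold is never reached.
    """
    for step, size in enumerate(sizes):
        if size >= threshold:
            return step * dt
    return None


def mean_first_passage_times(trajectories, thresholds, dt):
    """Average the first passage time over independent trajectories.

    A threshold that some trajectory never reaches maps to None, since
    averaging only over the trajectories that did reach it would bias
    the estimate toward short times.
    """
    result = {}
    for threshold in thresholds:
        times = [first_passage_time(s, threshold, dt) for s in trajectories]
        if any(t is None for t in times):
            result[threshold] = None
        else:
            result[threshold] = mean(times)
    return result


# Toy data (assumed, for illustration only): three short trajectories.
trajectories = [
    [0, 1, 2, 3, 5, 8],
    [0, 0, 1, 2, 4, 8],
    [1, 2, 2, 4, 6, 9],
]
mfpt = mean_first_passage_times(trajectories, thresholds=[2, 4, 8, 10], dt=0.5)
```

In a typical MFPT analysis, the resulting curve of mean time against cluster size is then fitted to a sigmoidal form, and the nucleation rate is estimated from the plateau time and the system volume. That fitting step is omitted here, and the abstract does not say how the authors perform it.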