Back translation for molecule generation.

Journal: Bioinformatics (Oxford, England)

Published Date: Feb 7, 2022

Abstract

MOTIVATION: Molecule generation, the task of generating new molecules, is an important problem in bioinformatics. Typical tasks include generating molecules with given properties, molecular property improvement (i.e. improving specific properties of an input molecule) and retrosynthesis (i.e. predicting the molecules that can be used to synthesize a target molecule). Recently, deep-learning-based methods have received increasing attention for molecule generation. Labeled data in bioinformatics is usually costly to obtain, whereas millions of unlabeled molecules are available. Inspired by the success of sequence generation with unlabeled data in natural language processing, we explore an effective way of using unlabeled molecules for molecule generation.

Authors

Yang Fan
Colby College, Waterville, Maine, United States of America.

Yingce Xia
Microsoft Research, Beijing 100080, China.

Jinhua Zhu
University of Science and Technology of China, Hefei, Anhui 230027, China.

Lijun Wu
Department of Rheumatism and Immunology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China.

Shufang Xie
Microsoft Research, Beijing 100080, China.

Tao Qin
Department of Hepatobiliary and Pancreatic Surgery, Henan Provincial People's Hospital, Zhengzhou University People's Hospital, Zhengzhou, Henan, China.

Keywords

Natural Language Processing

External Resources

PubMed ID: 34875015
