Machine Learning Methods in Applied Chemical Research

Title: Machine Learning Methods in Applied Chemical Research
Author: Buin, Andrei
Department: School of Engineering
Program: Engineering
Advisor: Gadsden, S. Andrew
Abstract: Recent advancements in machine learning have led to widespread application of its algorithms to synthetic planning and reaction prediction in the field of chemistry. One major area, supervised learning, is being explored for predicting properties such as reaction yields and reaction types. Many chemical descriptors, known as fingerprints, are being explored as candidates for predicting reaction properties. However, only a few studies address the permutational invariance of chemical fingerprints, which are concatenated at some stage before being fed to a deep learning architecture. In this thesis, we demonstrate that exploiting permutational invariance consistently improves accuracy relative to previously published studies. Furthermore, we are able to accurately predict hydrogen peroxide loss (R=0.78 vs. R=0.61) on our own dataset, in which each chemical formulation consists of more than 20 ingredients. Additionally, we present results for three autoregressive models, namely Recurrent Neural Networks (RNN), the Bidirectional Reaction Model with Alternated Learning (BIMODAL), and Temporal Convolutional Neural Networks (TCN), trained on textual representations of reaction graphs, with more than 90% of generated reactions being correct vs. 72% previously reported. For the textual encodings, we concentrate on the relatively new Condensed Graph of Reaction SMILES-like representation (CGRSmiles), which represents a reaction graph together with all bonds broken and formed. The results shown in this thesis demonstrate that deep learning architectures can learn the underlying CGRSmiles grammar and generate new reactions with new and unseen reaction centers. The results also demonstrate that the learned approach can be transferred to reactions of a specific type, which in our case are the reactions used in hydrogen peroxide synthesis.
URI: https://hdl.handle.net/10214/26295
Date: 2021-08
Rights: Attribution 4.0 International
Related Publications: Buin, Andrei; Chiang, Hung Yi; Gadsden, S. Andrew; Alderson, Faraz A. Permutationally Invariant Deep Learning Approach to Molecular Fingerprinting with Application to Compound Mixtures. J. Chem. Inf. Model. 2021, 61 (2), 631–640. https://doi.org/10.1021/acs.jcim.0c01097


Files in this item

File: Buin_Andrei_202108_PhD.pdf
Size: 17.31 MB
Format: PDF

Except where otherwise noted, this item's license is described as Attribution 4.0 International.