CASCADE: ChemicAl Shift CAlculator using DEep learning

CASCADE stands for ChemicAl Shift CAlculation with DEep learning. It is a stereochemically aware online calculator for NMR chemical shifts that uses a graph network approach developed at Colorado State University. Molecular input can be specified as a SMILES string or drawn in the graphical interface. An automated workflow then performs 3D structure embedding and an MMFF conformer search. The full ensemble of optimized conformations is passed to a trained graph neural network, which predicts the NMR chemical shift (in ppm) for each C atom.

Graph Convolutional Neural Network

The graph network treats each atom as a node in a molecular graph, and atoms communicate through the edges that connect atom pairs. Each molecule is represented by a 3D molecular graph in which every pair of atoms is connected by an edge carrying their interatomic distance (as opposed to a 2D molecular graph, whose edges represent only bonds). The architecture of the network used is shown below.


Figure 1. Graph Network used in this work
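
The 3D molecular graph described above can be illustrated with a minimal sketch. It is not CASCADE's actual featurization code: the function name `build_distance_graph` and the example coordinates are assumptions made for illustration. It only shows the idea that every atom pair, bonded or not, gets an edge labelled with its distance.

```python
import math
from itertools import combinations


def build_distance_graph(symbols, coords):
    """Build a fully connected 3D molecular graph (hypothetical helper).

    symbols: list of element symbols, one per atom (the nodes).
    coords:  list of (x, y, z) positions in angstroms, same order.
    Returns (nodes, edges), where edges is a list of (i, j, distance)
    for every atom pair i < j, regardless of bonding.
    """
    if len(symbols) != len(coords):
        raise ValueError("symbols and coords must have the same length")
    edges = []
    for i, j in combinations(range(len(symbols)), 2):
        distance = math.dist(coords[i], coords[j])
        edges.append((i, j, distance))
    return list(symbols), edges


# Illustrative water-like geometry (angstroms); not output from CASCADE.
symbols = ["O", "H", "H"]
coords = [(0.000, 0.000, 0.000), (0.957, 0.000, 0.000), (-0.240, 0.927, 0.000)]
nodes, edges = build_distance_graph(symbols, coords)
```

Note that the two hydrogens are connected by an edge even though they share no bond, which is the difference from a 2D bond graph.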

Performance evaluation

We evaluated the performance of three neural network models. The first model, DFTNN, takes DFT-optimized structures as input and was trained against DFT-calculated shielding tensors. Its mean absolute error (MAE) for the 500 testing molecules is 1.26 ppm relative to mPW1PW91/6-311+G(d,p) DFT calculations. The second model, ExpNN-dft, was trained through transfer learning against experimental chemical shifts, using DFT-optimized structures as input; it achieves an MAE of 1.25 ppm for the 500 testing molecules relative to experimentally observed chemical shifts. The final model, ExpNN-ff, which serves as the backend predictor for this web app, takes MMFF-optimized structures as input and predicts experimental chemical shifts. Its MAE on the testing molecules is 1.43 ppm relative to the experimental 13C chemical shifts. The correlation between predicted and experimental chemical shifts is illustrated by the scatter plots and histograms below.

Figure 2. Prediction of DFT-computed chemical shifts using DFTNN for the testing set. Scatter plots and histograms show the correlation between chemical shifts predicted by DFTNN and those computed with DFT for 1H (a) and 13C (b). The held-out testing set contains 500 randomly sampled structures (testing/training ratio: 1/12). The grey dashed line in each scatter plot indicates perfect correlation.
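
For readers unfamiliar with the metric, the MAE values quoted above are averages of the absolute differences between predicted and reference shifts. The sketch below shows the calculation only; the function name and the shift values are made up for illustration and are not taken from the CASCADE test set.

```python
def mean_absolute_error(predicted, reference):
    """Mean absolute error between two equal-length sequences of shifts (ppm)."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Illustrative 13C shifts in ppm (invented values, not CASCADE results).
predicted_ppm = [128.4, 77.1, 21.0, 170.3]
reference_ppm = [127.9, 77.0, 22.5, 169.0]
mae = mean_absolute_error(predicted_ppm, reference_ppm)
print(f"MAE = {mae:.2f} ppm")
```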

Some Guidelines for using CASCADE:

  • Remember to specify stereochemistry where appropriate
  • Molecular composition is currently restricted to the following elements: C, H, N, O, S, P, F, Cl
  • Molecules with formal charges are not processed.
  • CASCADE generates conformers and performs energy minimizations with MMFF before predicting the chemical shifts. While there is no theoretical size limit to this approach, extremely large molecules cause difficulties for webserver performance, particularly if multiple queries are submitted. For this reason we ask that you limit your molecules to 50 heavy atoms.
  • For support with larger molecules, please install and run CASCADE on your own machine. The code is openly available: GitHub.
  • For further support please contact us by email: patonlab@colostate.edu
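
The composition, charge, and size guidelines above can be checked before submitting a molecule. The following is a minimal sketch, not part of CASCADE: the function `check_guidelines` and its input format, a list of (element symbol, formal charge) tuples, are assumptions chosen so the example runs without a cheminformatics toolkit.

```python
# Limits taken from the guidelines above.
ALLOWED_ELEMENTS = {"C", "H", "N", "O", "S", "P", "F", "Cl"}
MAX_HEAVY_ATOMS = 50


def check_guidelines(atoms):
    """Return a list of guideline violations (hypothetical helper).

    atoms: list of (element_symbol, formal_charge) tuples, one per atom.
    An empty list means no violation was found.
    """
    problems = []
    unsupported = sorted({el for el, _ in atoms if el not in ALLOWED_ELEMENTS})
    if unsupported:
        problems.append("unsupported elements: " + ", ".join(unsupported))
    if any(charge != 0 for _, charge in atoms):
        problems.append("molecule contains formal charges")
    heavy_atoms = sum(1 for el, _ in atoms if el != "H")
    if heavy_atoms > MAX_HEAVY_ATOMS:
        problems.append(f"{heavy_atoms} heavy atoms exceeds limit of {MAX_HEAVY_ATOMS}")
    return problems


# Ethanol (C2H6O), neutral: expected to pass all checks.
ethanol = [("C", 0)] * 2 + [("O", 0)] + [("H", 0)] * 6
print(check_guidelines(ethanol))
```

Stereochemistry is not covered by this check, since it depends on how the structure is written rather than on its composition.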



For more details, please see our paper: Real-time Prediction of 1H and 13C Chemical Shifts with DFT accuracy using a 3D Graph Neural Network. Guan, Y.; Sowndarya, S. S. V.; Gallegos, L. C.; St. John, P. C.; Paton, R. S. Chem. Sci. 2021, 12, 12012-12026. DOI: https://doi.org/10.1039/D1SC03343C.