Research Article

A Novel Molecular Representation Learning for Molecular Property Prediction with a Multiple SMILES-Based Augmentation

Table 2

RMSE and MAE values of various approaches on the ESOL, Lipophilicity, and FreeSolv datasets. Some of the reported values are taken from the related references [2, 8].

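| Category | Model | ESOL RMSE | ESOL MAE | Lipophilicity RMSE | Lipophilicity MAE | FreeSolv RMSE | FreeSolv MAE |
|---|---|---|---|---|---|---|---|
| CheMixNet | CNN_RNN | 1.0419 | 0.8010 | 1.0513 | 0.8282 | 1.3553 | 1.0156 |
| Conventional methods | XGBoost | 0.9900 | - | 0.7990 | - | 1.7400 | - |
| Conventional methods | Multitask | 1.1200 | - | 0.8590 | - | 1.8700 | - |
| Graph-based methods | MPNN | 0.5800 | - | 0.7190 | - | 1.1500 | - |
| Graph-based methods | Weave | 0.6100 | - | 0.7150 | - | 1.2200 | - |
| 3D-based models | Drug3D-Net | 0.9683 | 0.7841 | 0.9930 | 0.8404 | 1.4709 | 1.1598 |
| Our method (multiple SMILES) | RNN (one layer) | 0.6585 | 0.5105 | 0.7929 | 0.6211 | 1.6051 | 1.2313 |
| Our method (multiple SMILES) | RNN (two layers) | 0.6394 | 0.4940 | 0.7960 | 0.6217 | 1.3575 | 1.0468 |
| Our method (multiple SMILES) | CNN_RNN | 0.5916 | 0.4448 | 0.7054 | 0.5481 | 1.0033 | 0.7859 |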

The best results are highlighted in bold.
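
For reference, the two metrics reported in Table 2 can be computed as in the minimal sketch below, which assumes the standard definitions of root-mean-square error and mean absolute error. The function names and the sample values are illustrative only; they are not taken from the paper or from any of the three datasets.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error: sqrt of the mean squared difference."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute differences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical measured vs. predicted property values (not real data).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 4.5]

print(f"RMSE = {rmse(y_true, y_pred):.4f}")
print(f"MAE  = {mae(y_true, y_pred):.4f}")
```

Because RMSE squares each error before averaging, it penalizes large deviations more heavily than MAE and is never smaller than MAE on the same predictions, which is consistent with every row of the table that reports both values.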