ClickFF/density-nD
Please first unzip the compressed files for all 12 models. For each model (e.g., density_GAFF_M1), the descriptor set and model category are indicated in the directory name:

- density_GAFF_M1
- density_GAFF_M2
- density_GAFF_M3
- density_RDKIT_M1
- density_RDKIT_M2
- density_RDKIT_M3
- nD_GAFF_M1
- nD_GAFF_M2
- nD_GAFF_M3
- nD_RDKIT_M1
- nD_RDKIT_M2
- nD_RDKIT_M3

We constructed three categories of models using two types of molecular descriptors, GAFF and RDKIT:

- M1 models were trained using all available data.
- M2 models included temperature as an input feature during training.
- M3 models were trained using datasets containing temperature information, although temperature itself was not used as a training feature.

Each model directory contains:

- Training and test datasets in CSV format
- Three MATLAB code files:
  - retrain.m: retrains the model over 20 independent runs
  - prediction.m: performs predictions using the best-performing model selected from the 20 runs and saves the resulting MATLAB data file
  - trainRegressionModel.m: function called by retrain.m for model training
- train_log.docx: the performance of the 20 independent runs, with the summary of the best model highlighted in yellow

The trained model is stored as a MATLAB data file (.mat) and can be loaded directly into MATLAB. After loading, the trained model object stored in the variable model_name can be used for prediction. Associated model parameters can also be inspected within the MATLAB workspace. The variable validationRMSE provides the cross-validation RMSE of the model.

To use a trained model, navigate to the corresponding model directory, load the MATLAB data file into MATLAB, and execute prediction.m to generate predictions. The code may also be modified to perform predictions on new datasets.
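The load-then-predict workflow described above can also be run from a terminal. The following is only a minimal sketch, not part of the repository. It assumes that `matlab` is on the PATH and supports the `-batch` startup option (MATLAB R2019a or later), that prediction.m runs without user interaction, and that the data file to load is named bestmodel.mat. The `MODEL_DIR` variable and the `model_ready` helper are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch: load the trained model and run prediction.m for one model directory.
# Assumptions (not from the README): matlab is on PATH, supports -batch,
# and prediction.m can run non-interactively.

MODEL_DIR="${MODEL_DIR:-density_GAFF_M1}"

# Succeeds only if the directory holds both the prediction script and the model file.
model_ready() {
    [ -f "$1/prediction.m" ] && [ -f "$1/bestmodel.mat" ]
}

if model_ready "$MODEL_DIR" && command -v matlab >/dev/null 2>&1; then
    # Load the model into the workspace first, then run the prediction script.
    (cd "$MODEL_DIR" && matlab -batch "load('bestmodel.mat'); prediction")
else
    echo "Skipping: need matlab on PATH plus prediction.m and bestmodel.mat in $MODEL_DIR" >&2
fi
```

To target another model, set the variable when calling the script, for example `MODEL_DIR=nD_RDKIT_M2 sh run_prediction.sh` (the script name is also hypothetical).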
Because the files are extremely large, the best-performing model file (bestmodel.mat) for each model must be downloaded separately, either in a web browser or with wget:

- https://mulan.pharmacy.pitt.edu/group/github/density-nD/density_GAFF_M1/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/density_GAFF_M2/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/density_GAFF_M3/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/density_RDKIT_M1/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/density_RDKIT_M2/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/density_RDKIT_M3/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/nD_GAFF_M1/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/nD_GAFF_M2/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/nD_GAFF_M3/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/nD_RDKIT_M1/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/nD_RDKIT_M2/bestmodel.mat
- https://mulan.pharmacy.pitt.edu/group/github/density-nD/nD_RDKIT_M3/bestmodel.mat

After download, place each bestmodel.mat in the corresponding model directory. For example, the file downloaded from https://mulan.pharmacy.pitt.edu/group/github/density-nD/density_GAFF_M1/bestmodel.mat must be moved to density_GAFF_M1.

Alternatively, run download_model.sh to fetch the models with wget.
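The URLs above follow a single pattern (base URL, model directory, bestmodel.mat), so the download and placement steps can be sketched as a loop. This is an illustrative sketch and not the contents of download_model.sh, which is not reproduced here. The `model_names` helper and the `DRY_RUN` switch are hypothetical. Because the files are large, the sketch only prints the wget commands by default.

```shell
#!/bin/sh
# Sketch: fetch bestmodel.mat for all 12 models into their matching directories.
# DRY_RUN is an assumption of this sketch: 1 (default) prints the commands,
# 0 actually downloads.

BASE_URL="https://mulan.pharmacy.pitt.edu/group/github/density-nD"
DRY_RUN="${DRY_RUN:-1}"

# Prints the 12 model directory names, one per line, in the README's order.
model_names() {
    for property in density nD; do
        for descriptor in GAFF RDKIT; do
            for category in M1 M2 M3; do
                echo "${property}_${descriptor}_${category}"
            done
        done
    done
}

for model in $(model_names); do
    url="$BASE_URL/$model/bestmodel.mat"
    if [ "$DRY_RUN" = "1" ]; then
        echo "wget -c -P $model $url"
    else
        # -c resumes a partial download; -P saves the file inside the model directory.
        mkdir -p "$model" && wget -c -P "$model" "$url"
    fi
done
```

Run the script from the repository root with `DRY_RUN=0` to perform the downloads, so that each file lands in its model directory.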