To build a classification model on continuous data (e.g. pIC50 data), an interval defining the desired value range must be specified when the predictor is created. Predictors are managed in the "Predictors" tab of Makya. Please see QSAR Set Up for a brief video explanation.
To begin defining a predictor, click the New Predictor button. After a name for the predictor is entered, the setup panel opens, as shown in the figure below.
In this setup panel, select the dataset containing the molecules on which the QSAR models will be built. If the dataset contains more than one target (or objective), targets can be selected individually or all at once with a single click. Targets can still be added or removed at a later stage, prior to training. Targets from multiple datasets can be combined in the same Makya predictor object.
Once selected, the objectives must be configured. This is done in the Predictor setup panel, as shown in the annotated figure below:
The Predictor setup panel contains the following elements:
1. Information about the molecules matching all the objectives simultaneously. Clicking the eye button displays these molecules.
2. Panel for adding further targets.
3. The trash button deletes the target from the predictor. Deleted targets can be added again (see item 2 above).
4. Information about the molecules matching this particular objective.
5. Lower and upper bounds of the desired interval for this target. Each value can either be typed in the box or set with the slider. It is important that the chosen values do not yield classes that are too imbalanced. We recommend that the desired category contain at least 5% of the dataset and at least 20 molecules.
NOTE: when only a lower or upper bound is selected, a continuous classification model is applied.
6. Clicking the Save button saves this predictor configuration. The predictor is then ready to be trained.
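The class-balance recommendation above (at least 5% of the dataset and at least 20 molecules in the desired category) can be checked on the raw values before the bounds are set in Makya. The following is a minimal Python sketch and is not part of Makya: the function names `in_interval` and `check_balance` are hypothetical, and treating both bounds as inclusive is an assumption, since this guide does not state how values lying exactly on a bound are handled.

```python
from typing import Optional, Sequence


def in_interval(value: float,
                lower: Optional[float] = None,
                upper: Optional[float] = None) -> bool:
    """Return True if value lies in the desired interval.

    A bound left as None is treated as open-ended (one-sided interval).
    Assumption: both bounds are inclusive.
    """
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


def check_balance(values: Sequence[float],
                  lower: Optional[float] = None,
                  upper: Optional[float] = None,
                  min_fraction: float = 0.05,
                  min_count: int = 20) -> dict:
    """Count molecules in the desired category and test both thresholds."""
    n_total = len(values)
    n_in = sum(in_interval(v, lower, upper) for v in values)
    fraction = n_in / n_total if n_total else 0.0
    return {
        "n_in": n_in,
        "fraction": fraction,
        # Both conditions from the recommendation must hold.
        "ok": n_in >= min_count and fraction >= min_fraction,
    }


# Illustrative (made-up) pIC50 values: 70 inactive, 30 active molecules.
example_values = [5.0] * 70 + [8.0] * 30
report = check_balance(example_values, lower=7.0)
print(report)  # {'n_in': 30, 'fraction': 0.3, 'ok': True}
```

If `ok` is False, widening the interval or adding data for the target is worth considering before the predictor is saved and trained.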