References
For more information, please check our paper:
Clarisse Descamps, Vincent Bouttier, Juan Sanz García, Maoussi Lhuillier-Akakpo, Quentin Perron, Hamza Tajmouati, Growing and linking optimizers: synthesis-driven molecule design, Briefings in Bioinformatics, Volume 26, Issue 5, September 2025, bbaf482, https://doi.org/10.1093/bib/bbaf482
Introduction
The Fragment Linking generator proposes new linkers and/or scaffolds between two building blocks at their reaction centers. It is purely chemistry-driven, so a good understanding of organic chemistry is required. Once you define the exit vectors (the positions where the chemistry will take place), the generator searches for commercial building blocks that can react at those positions.
The Fragment Linking generator is particularly useful to guide molecule generation in scaffold hopping use-cases and during the hit-to-lead phase.
To access the Fragment Linking generator, click on the New Generator box in the "Generator" tab. Specify a name for this generator, and under "Generator Engine" select Fragment Linking.
Example Use-Case
Given a pair of building blocks, A and B, with reaction centers (exit vectors) -NH2 and -Br respectively, the Fragment Linking generator proposes novel compounds by attaching new scaffolds at those centers while keeping the rest of each building block intact.
Such scaffolds are selected from a database of commercially available compounds.
Set-up
- Exit Vectors (Required): In each of the text fields below "Enter SMILES", specify a building block and set the corresponding exit vector where the linking will happen. You can either sketch the building block (by clicking on the pen icon) or copy-paste its SMILES string. Once this is done, the building blocks are displayed in two panels with an ID assigned to each atom. Click on Set for each panel and, in the box provided below the structure, type atom IDs to start filtering possible exit vectors. Type one ID at a time, pressing Enter after each.
How to select a proper Exit Vector
NOTES ON EXIT VECTORS:
The fragment should not contain charged atoms. Protonation is performed directly inside Makya where needed.
When you select the exit vectors, the atoms shown in red are the ones that will be involved in the reaction. The Fragment Linking generator is chemistry-driven, so the choice of exit vectors needs to be informed by organic chemistry knowledge.
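As a quick local check of the "no charged atoms" note above, the following sketch scans a SMILES string for bracket atoms that carry a formal charge before you paste it into the generator. It is only a heuristic written for illustration, not Makya's own validation: the function name `charged_atoms` and the two example fragments are hypothetical, and it relies solely on the SMILES rule that `+` and `-` inside square brackets denote charge.

```python
import re

# Bracket atoms such as [NH3+], [O-] or [N+] carry an explicit formal charge.
# In SMILES, '+' and '-' inside square brackets always denote charge;
# a '-' outside brackets is a single bond and is ignored here.
BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")


def charged_atoms(smiles: str) -> list:
    """Return the bracket atoms of a SMILES string that carry a formal charge."""
    return [
        f"[{body}]"
        for body in BRACKET_ATOM.findall(smiles)
        if "+" in body or "-" in body
    ]


# Hypothetical fragments, used only to illustrate the check.
fragment_a = "Nc1ccccc1"            # neutral aromatic amine
fragment_b = "[NH3+]Cc1ccc(Br)cc1"  # protonated amine, should be neutralized first

for smi in (fragment_a, fragment_b):
    hits = charged_atoms(smi)
    print(smi, "->", "no charged atoms" if not hits else f"charged atoms: {hits}")
```

Note that a nitro group written in charge-separated form, such as `[N+](=O)[O-]`, is also reported by this check; whether such a fragment is acceptable is not stated here, so treat the output as a prompt to review the structure.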
Optional:
- Chemical Space: Select one or more datasets uploaded to this project, or specify SMILES, to guide the generation toward compounds similar to that chemical space.
- Building Blocks: In this panel, add constraints on the building blocks that will react with your intermediates. For each compound, specify structural constraints based on RDKit descriptors in the "Descriptors" tab, or input the SMARTS for required and/or forbidden substructures in the "Substructures" tab. For instance, the screenshot below shows constraints that restrict the reaction to building blocks containing either iodine and chlorine, or bromine and chlorine.
NOTE: building block filters can drastically accelerate the generation, because they reduce the size of the catalog of commercially available building blocks that the algorithm explores.
- Products: In this panel, add constraints on the final generated molecules. Specify structural constraints based on RDKit descriptors in the "Descriptors" tab; input the SMARTS for required and/or forbidden substructures in the "Substructures" tab; or upload a CSV file containing a custom PAINS/Tox list of SMILES in the "Pains/Tox" tab.
NOTE: product filters are applied after molecule generation, to filter the results. As such, they are slower and less computationally efficient than building block filters (see above). Where a constraint can be expressed on the building blocks, we recommend using building block filters instead.
- QSAR: Select among trained QSAR models to guide the generation around a defined target product profile (see QSAR). Be careful to select models whose applicability domain covers the chemical space you are exploring.
- API: Plug in external scores or models that can be accessed to guide this generation (see Scoring APIs).
- Scorers: Select any scores that will be calculated during the generation and available in post-processing (see Scorers).
- 3D Ligand based / 3D Structure based: Select 3D parameters to optimize docking score, contact score, shape similarity to a reference ligand, or other 3D objectives.
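The notes on building block filters and product filters above come down to when candidates are discarded: before the linking step or after it. The toy sketch below illustrates that reasoning by counting how many building blocks go through a stand-in "linking" step in each case. Everything in it is hypothetical (the five-entry catalog, the `block_ok` constraint mirroring the iodine/bromine plus chlorine example, and `enumerate_products`); it does not represent Makya's actual algorithm or catalog.

```python
# Hypothetical toy catalog: each building block is a name plus the halogens it contains.
catalog = [
    {"name": "bb1", "halogens": {"Cl", "I"}},
    {"name": "bb2", "halogens": {"Cl", "Br"}},
    {"name": "bb3", "halogens": {"F"}},
    {"name": "bb4", "halogens": set()},
    {"name": "bb5", "halogens": {"Br"}},
]


def block_ok(entry):
    """Constraint from the example: (iodine and chlorine) or (bromine and chlorine)."""
    halogens = entry["halogens"]
    return {"I", "Cl"} <= halogens or {"Br", "Cl"} <= halogens


def enumerate_products(blocks, counter):
    """Stand-in for the expensive linking step; counts how many blocks it processes."""
    products = []
    for bb in blocks:
        counter["linked"] += 1
        products.append({"linker": bb["name"], "halogens": bb["halogens"]})
    return products


# Product-style filtering: link every building block, then discard results.
post = {"linked": 0}
kept_post = [p for p in enumerate_products(catalog, post) if block_ok(p)]

# Building-block-style filtering: discard first, then link only what remains.
pre = {"linked": 0}
kept_pre = enumerate_products([bb for bb in catalog if block_ok(bb)], pre)

print("linked with product-style filter:", post["linked"])        # 5
print("linked with building-block-style filter:", pre["linked"])  # 2
print("same survivors:", [p["linker"] for p in kept_post] == [p["linker"] for p in kept_pre])
```

Both routes keep the same two entries in this toy case, but the second one processes two building blocks instead of five; on a full commercial catalog, that difference is what makes early filtering attractive.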