References
For more information, please check our paper:
Clarisse Descamps, Vincent Bouttier, Juan Sanz García, Maoussi Lhuillier-Akakpo, Quentin Perron, Hamza Tajmouati, Growing and linking optimizers: synthesis-driven molecule design, Briefings in Bioinformatics, Volume 26, Issue 5, September 2025, bbaf482, https://doi.org/10.1093/bib/bbaf482
Introduction
The Fragment Linking generator generates compounds by proposing new linkers and/or scaffolds between two building blocks and their reaction centers. It is purely chemistry-driven, so a good understanding of organic chemistry is required. Once you define the exit vector (where the chemistry will take place), the generator searches for commercial building blocks that can react at that position.
The Fragment Linking generator is particularly useful to guide molecule generation in scaffold hopping use-cases and during the hit-to-lead phase.
To access the Fragment Linking generator, click on the New Generator box in the "Generator" tab. Specify a name for this generator, and under "Generator Engine" select Fragment Linking.
Example Use-Case
Set-up
Exit Vectors
In this tab, you can set up the two fragments that the Fragment Linking generator will connect, as well as the corresponding exit vectors where the linking will happen. This set-up is required.
To set up your starting fragments, sketch the fragment in the "Fragment" text field (by clicking on the pen icon) or paste its SMILES string.
Guide: How to select a proper Exit Vector
NOTES ON EXIT VECTORS:
The fragment should not contain charged atoms. Protonation is performed directly inside Makya where needed.
When you select the exit vectors, the atoms in red are the ones that will be involved in the reaction. The Fragment Linking generator is chemistry-driven, and the choice of exit vectors needs to be informed by organic chemistry knowledge.
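The note above on uncharged fragments can be checked before pasting a SMILES string into the "Fragment" field. The following is a minimal sketch, not part of Makya: the function name `charged_atoms` is hypothetical, and it relies only on the SMILES convention that formal charges are written inside bracket atoms (for example `[O-]` or `[NH3+]`).

```python
import re
from typing import List

# In SMILES, formal charges only appear inside bracket atoms such as [O-] or [NH3+].
_BRACKET_ATOM = re.compile(r"\[([^\]]*)\]")


def charged_atoms(smiles: str) -> List[str]:
    """Return the bracket atoms in `smiles` that carry a formal charge.

    Hypothetical helper for illustration; it does not validate the SMILES.
    """
    return [
        "[" + body + "]"
        for body in _BRACKET_ATOM.findall(smiles)
        if "+" in body or "-" in body
    ]


if __name__ == "__main__":
    for fragment in ["Nc1ccccc1Br", "CC(=O)[O-]", "C[NH3+]"]:
        found = charged_atoms(fragment)
        status = "charged: " + ", ".join(found) if found else "neutral"
        print(fragment, "->", status)
```

An empty list means no charged atom was written in the string; a non-empty list points at the atoms to neutralize before using the fragment.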
To specify exit vector(s), click on the arrow near the fragment text box to compute possible exit vectors.
In the exit vector text fields that appear once the calculation is done, type atom IDs one at a time and press Enter to start filtering the possible exit vectors.
Advanced search options in the Exit vectors tab
At the bottom of the Exit Vectors tab, you can find the "Advanced search" section. All features here are optional. The "Advanced search" lets you set the following options:
1) Multistep reactions
Here, you can allow the generator to perform multiple reaction steps, and not just the linking step. Doing so allows single-reactant steps (such as linker deprotection), thereby widening the chemical space available for exploration.
2) Name reactions
Choose here one or more name reactions from the list of 2000+ available name reactions to drive the generation to your liking. You can specify reactions you want to use, or reactions you want to forbid.
Chemical constraints
Building Blocks: In this panel, add constraints on the building blocks that can be used to connect your fragments. Specify structural constraints based on RDKit descriptors, or input the SMARTS for required and/or forbidden substructures. For instance, the screenshot below shows constraints to react on a building block which contains either iodine and chlorine, or bromine and chlorine.
NOTE: compound filters will drastically accelerate the generation by reducing the size of the catalog of commercially available building blocks that is explored by the algorithm.
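The halogen example above (iodine and chlorine, or bromine and chlorine) can be reproduced outside Makya to preview which building blocks such a constraint would keep. This is a minimal sketch using RDKit substructure matching; the SMARTS patterns and the function name `passes_halogen_constraint` are illustrative assumptions, not the exact input Makya expects.

```python
from rdkit import Chem

# Illustrative SMARTS patterns for the three halogens in the example.
CHLORINE = Chem.MolFromSmarts("[Cl]")
BROMINE = Chem.MolFromSmarts("[Br]")
IODINE = Chem.MolFromSmarts("[I]")


def passes_halogen_constraint(smiles):
    """True if the molecule contains (I and Cl) or (Br and Cl).

    Hypothetical helper; unparsable SMILES are rejected.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return False
    has_cl = mol.HasSubstructMatch(CHLORINE)
    has_br = mol.HasSubstructMatch(BROMINE)
    has_i = mol.HasSubstructMatch(IODINE)
    # (I and Cl) or (Br and Cl) simplifies to Cl and (I or Br).
    return has_cl and (has_i or has_br)


if __name__ == "__main__":
    for bb in ["Clc1ccc(I)cc1", "Clc1ccc(Br)cc1", "Clc1ccccc1", "Brc1ccc(I)cc1"]:
        print(bb, passes_halogen_constraint(bb))
```

Running the same check over a local list of candidate building blocks gives a rough idea of how strongly a constraint narrows the catalog.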
Products: In this panel, add constraints on the final generated molecules. Specify structural constraints based on RDKit descriptors, input the SMARTS for required and/or forbidden substructures, or select a list of forbidden PAINS/Tox.
NOTE: substructure constraints on the final product are applied after molecule generation, to filter the results. As such, they are slower and less computationally efficient than substructure constraints in building block filters (see above). Where appropriate, we recommend applying substructure constraints to the building block catalog instead.
Rewards
Scoring APIs: Plug in external scores or models that can be accessed to guide this generation (see Scoring APIs).
QSAR: Select among trained QSAR Models to guide the generation around a defined target product profile (see QSAR). (Be careful to select Models whose applicability domain covers the chemical space you are exploring!)
Chemical Spaces: Select one or several datasets, or specify SMILES that can be used to guide the generation in terms of Tanimoto Similarity to the chemical space.
3D Ligand based / 3D Structure based: Select 3D parameters in order to optimize the docking score, contact score, shape similarity to a reference ligand, or pharmacophore fingerprint.
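For the Chemical Spaces reward, Tanimoto similarity between two fingerprints is the number of shared "on" bits divided by the number of bits set in either one. The sketch below illustrates that formula on plain Python sets of bit indices. How Makya computes fingerprints and aggregates similarity over a dataset is not described here, so both helper names and the choice of taking the maximum over the reference set are assumptions made for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Two empty fingerprints: define the similarity as 0.0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


def max_similarity(candidate_bits, reference_space):
    """Highest Tanimoto similarity of a candidate to any reference fingerprint.

    Assumption: similarity to a chemical space is taken as the nearest-neighbour
    (maximum) similarity; other aggregations are possible.
    """
    return max((tanimoto(candidate_bits, ref) for ref in reference_space), default=0.0)


if __name__ == "__main__":
    candidate = {1, 2, 3}
    references = [{2, 3, 4}, {7, 8}]
    print(tanimoto(candidate, references[0]))
    print(max_similarity(candidate, references))
```

A value of 1 means identical fingerprints and 0 means no shared bits, so a higher reward steers generation toward the selected chemical space.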
Additional Scores
Post-Processors: Select any scores that will be calculated during the generation and available in post-processing (see Scorers).
Retrosynthesis: Select a set of retrosynthesis search parameters so that all molecules are given a synthetic accessibility score (Rscore, between 0 and 1) indicating their feasibility under the specified constraints.