References
For more information, please check our paper:
Clarisse Descamps, Vincent Bouttier, Juan Sanz García, Maoussi Lhuillier-Akakpo, Quentin Perron, Hamza Tajmouati, Growing and linking optimizers: synthesis-driven molecule design, Briefings in Bioinformatics, Volume 26, Issue 5, September 2025, bbaf482, https://doi.org/10.1093/bib/bbaf482
Introduction
The Fragment Growing generator builds molecules block by block, combining commercially available building blocks (1 million) according to the rules defined by organic reaction templates. Molecules are then evaluated against a set of user-defined scores, and this information is used to guide the generator towards optimized molecules. The Fragment Growing generator is particularly useful for guiding molecule generation in the hit discovery and hit-to-lead phases.
This generator is purely chemistry-driven, so a good understanding of organic chemistry is required. For instance, you need to be able to draw the appropriate leaving group for an SN2 reaction to happen; otherwise the molecules won't be generated as expected.
Example Use-Cases
- Hit-To-Lead using Fragment Growing
- Growing around a hinge binding central core (kinase inhibitor)
- Growing around a fragment using a 3D reference molecule
- Growing around a fragment using 3D structure-based design
Set-Up
Exit Vectors
In this tab, you can set up a starting fragment and 1 or 2 exit vectors.
Both are optional features:
- If no starting fragment is selected, for each new generated molecule, Makya will select a starting fragment from our catalogue of commercially available building blocks.
- Specifying a good starting fragment can be extremely powerful and is often necessary when using scoring functions that require a reference structure for alignment (anchor), such as 3D scores.
- If no exit vector is selected, Makya will use our reaction predictor to select not only the next building block to attach to your fragment, but also the reaction and reaction site on the fragment in order to maximize the scoring function.
To set up the starting fragment that molecules are grown from, go to the "Fragment" text field and either sketch the fragment (by clicking on the pen icon) or paste its SMILES string.
Guide: How to select a proper Exit Vector
NOTES ON EXIT VECTORS:
The fragment should not contain charged atoms. Protonation is performed directly inside Makya where needed.
When you select the exit vectors, the atoms in red are the ones that will be involved in the reaction. The Fragment Growing generator is chemistry-driven, and the choice of exit vectors needs to be informed by organic chemistry knowledge.
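As a quick pre-check for the first note above, the sketch below flags SMILES strings that contain charged atoms before they are pasted into the "Fragment" field. It is a simple string heuristic and not part of Makya: it relies on the SMILES convention that formal charges are written inside square brackets, and the function name `has_charged_atom` is hypothetical. A cheminformatics toolkit would be a more robust choice for real use.

```python
import re

# In SMILES, a "+" or "-" inside a bracket atom (e.g. [O-], [NH4+]) is a
# formal charge. A "-" outside brackets is a single bond and is ignored.
_CHARGED_BRACKET_ATOM = re.compile(r"\[[^\]]*[+-][^\]]*\]")


def has_charged_atom(smiles: str) -> bool:
    """Return True if any bracket atom in the SMILES carries a charge sign."""
    return _CHARGED_BRACKET_ATOM.search(smiles) is not None


# Acetate and ammonium are flagged; acetic acid and ethane are not.
for smi in ["CC(=O)[O-]", "CC(=O)O", "C-C", "[NH4+]"]:
    print(smi, has_charged_atom(smi))
```

If a fragment is flagged, redraw it in its neutral form and let Makya handle protonation.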
To specify exit vector(s), click on the arrow near the fragment text box to compute possible exit vectors.
In the exit vectors text box that appears once the calculation is done, type atom IDs one at a time and press Enter to start filtering the possible exit vectors. The two side-by-side panels let you specify up to 2 exit vectors. If 2 exit vectors are specified, molecules are grown first from exit vector #1, then the resulting intermediate is grown from exit vector #2.
Advanced search options in the Exit vectors tab
At the bottom of the Exit vectors tab, you can find the "Advanced search" section. All features here are optional. The "Advanced search" section lets you set the following options:
1) Macrocyclization
See more details in the generation with macrocyclization section. When a macrocyclization option is selected, the number of generation steps is necessarily ≥ 2.
2) Generation steps
Define here the number of algorithmic steps that Makya will use to generate a new molecule. Steps can be single-reactant (a modification of a fragment or intermediate without adding a building block, such as protection and deprotection steps, which are discovered automatically) or double-reactant (the combination of two fragments). By adjusting this parameter, you can steer the generator towards simpler or more complex structures.
NOTE: this value is an upper bound on the generation process, not a target; Makya may stop adding steps before reaching the specified maximum in order to optimize the fitness function.
NOTE: we recommend starting with a small maximum number of steps, such as 2, to see if you are satisfied with the results; if not, you can increase the number to give more freedom of exploration to the generator.
Example of a reaction tree generating a molecule through one double-reactant growing step.
3) Name reactions
Choose here one or more name reactions from the list of 2000+ available name reactions to drive the generation to your liking. You can specify reactions you want to use, or reactions you want to forbid.
4) Initial fragment protection
Choose here whether the generator is allowed to perform reactions on atoms of the initial fragment that are not selected as an exit vector, if such reactions can generate molecules with interesting properties. If the button is not checked, the starting fragment will be protected and only the atoms selected as exit vectors will be modified by the generation process.
Example of generation with fragment protection (Option = "No")
Example of generation without fragment protection (Option = "Yes")
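The behaviour of the "Generation steps" limit described above (an upper bound that the generator may stop short of) can be illustrated with a toy loop. This is a minimal sketch under strong assumptions: it uses a greedy search instead of Makya's actual learned generator, molecules are plain strings, and `grow_with_step_limit`, `propose_step` and `score` are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Tuple


def grow_with_step_limit(
    start: str,
    propose_step: Callable[[str], List[str]],
    score: Callable[[str], float],
    max_steps: int = 2,
) -> Tuple[str, int]:
    """Take at most max_steps steps; stop early if no candidate improves the score."""
    current, current_score = start, score(start)
    steps_used = 0
    for _ in range(max_steps):
        candidates = propose_step(current)
        if not candidates:
            break
        best = max(candidates, key=score)
        if score(best) <= current_score:
            break  # stopping before the maximum is allowed
        current, current_score = best, score(best)
        steps_used += 1
    return current, steps_used


# Toy setup: each step appends one "block"; the score prefers three blocks.
def toy_propose(mol: str) -> List[str]:
    return [mol + "B", mol + "C"]


def toy_score(mol: str) -> float:
    return -abs(len(mol) - 3)


print(grow_with_step_limit("A", toy_propose, toy_score, max_steps=1))
print(grow_with_step_limit("A", toy_propose, toy_score, max_steps=5))
```

With a maximum of 5, the toy loop still stops after 2 steps because a third step would lower the score, which mirrors the first NOTE under "Generation steps".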
Chemical constraints
Building Blocks: In this panel, add constraints on the building blocks that are attached to your intermediate. For each compound, specify structural constraints based on RDKit descriptors, or input the SMARTS for required and/or forbidden substructures. For instance, you can require a bromine atom as a substructure match if you want to perform a Suzuki-type reaction (with the boronic acid on your initial fragment). You can also forbid specific substructures.
NOTE: compound filters will drastically accelerate the generation by reducing the size of the catalog of commercially available building blocks that is explored by the algorithm.
Products: In this panel, add constraints on the final generated molecules. Specify structural constraints based on RDKit descriptors, input the SMARTS for required and/or forbidden substructures, or select a list of forbidden PAINS/Tox.
NOTE: substructure constraints on the final product are applied after molecule generation, to filter the results. As such, they are slower and less computationally efficient than substructure constraints in building block filters (see above). Where appropriate, we recommend applying substructure constraints to the building block catalogue instead.
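The two NOTES above can be illustrated by counting how many candidates have to be processed in each case. The sketch below is a toy model, not Makya's implementation: the mini-catalogue, the precomputed `has_bromine` flag, and the `generate` function are all hypothetical, and the counter `evaluated` merely stands in for the costly reaction and scoring work.

```python
from typing import Callable, Dict, List, Optional, Tuple

Block = Dict[str, object]

# Hypothetical mini-catalogue; names and flags are made up for illustration.
CATALOG: List[Block] = [
    {"name": "bb1", "has_bromine": True},
    {"name": "bb2", "has_bromine": False},
    {"name": "bb3", "has_bromine": True},
    {"name": "bb4", "has_bromine": False},
]


def generate(
    fragment: str,
    catalog: List[Block],
    block_filter: Optional[Callable[[Block], bool]] = None,
    product_filter: Optional[Callable[[dict], bool]] = None,
) -> Tuple[List[dict], int]:
    """Return the kept products and the number of candidates that were built."""
    kept, evaluated = [], 0
    for block in catalog:
        if block_filter is not None and not block_filter(block):
            continue  # rejected before any product is built
        evaluated += 1  # stands in for the costly reaction + scoring
        product = {"fragment": fragment, "block": block}
        if product_filter is not None and not product_filter(product):
            continue  # rejected only after the product was built
        kept.append(product)
    return kept, evaluated


pre_kept, pre_cost = generate(
    "frag", CATALOG, block_filter=lambda b: bool(b["has_bromine"])
)
post_kept, post_cost = generate(
    "frag", CATALOG, product_filter=lambda p: bool(p["block"]["has_bromine"])
)
print(len(pre_kept), pre_cost)
print(len(post_kept), post_cost)
```

Both runs keep the same products, but the building block filter builds only the candidates that can pass, while the product filter builds all of them first.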
Rewards
Scoring APIs: Plug in external scores or models that can be accessed to guide this generation (see Scoring APIs).
QSAR: Select among trained QSAR Models to guide the generation around a defined target product profile (see QSAR). (Be careful to select Models whose applicability domain covers the chemical space you are exploring!)
Chemical Spaces: Select one or several datasets, or specify SMILES that can be used to guide the generation in terms of Tanimoto Similarity to the chemical space.
3D Ligand based / 3D Structure based: Select 3D parameters in order to optimize the docking score, contact score, shape similarity to a reference ligand, or pharmacophore fingerprint.
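For readers unfamiliar with the Tanimoto similarity used by the Chemical Spaces reward above, the sketch below computes it on fingerprints represented as sets of "on" bit indices. It is only an illustration: the fingerprint type Makya uses is not stated here, and taking the maximum similarity over the reference set is an assumption (a mean or another aggregation is equally plausible). The function names are hypothetical.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def similarity_to_space(query: Set[int], space: Iterable[Set[int]]) -> float:
    """Assumed aggregation: similarity to the closest molecule in the space."""
    return max(tanimoto(query, fp) for fp in space)


# Made-up fingerprints (sets of "on" bit indices).
query_fp = {1, 2, 3, 4}
reference_space = [{1, 2, 5}, {1, 2, 3, 4, 6}]

print(tanimoto(query_fp, reference_space[0]))
print(similarity_to_space(query_fp, reference_space))
```

A value of 1 means identical fingerprints and 0 means no shared bits, so a reward based on this score pulls generated molecules towards the selected chemical space.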
Additional Scores
Post-Processors: Select any scores that will be calculated during the generation and available in post-processing (see Scorers).
Retrosynthesis: Select a set of retrosynthesis search parameters so that all molecules are given a synthetic accessibility score (Rscore, between 0 and 1) indicating their feasibility under the specified constraints.
Advanced
The Fragment Growing generator is a deep-learning algorithm that has been trained to generate molecules block by block. This generation is guided by reinforcement learning so that the selection of plugged building blocks is optimized for a combination of given rewards.
The Fragment Growing generator complements the fine-tuning generator. If an advanced intermediate is available, either use the Fragment Growing generator on this intermediate (with suitable descriptors to avoid undesirable fragments) or use the fine-tuning generator with the intermediate as a substructure constraint. The fine-tuning generator will produce more conservative compounds, while the Fragment Growing generator (which has no inherent applicability domain constraint) can be more creative.
NOTE: the output of the Fragment Growing generator can be used as a chemical space for the fine-tuning generator. This is a way to enrich a selection of interesting compounds and optimize them.