This article describes how Makya generators use the memory of the GPUs on which they run. Understanding this can help you allocate GPU resources properly and anticipate how jobs are queued.
GPU autoscaling
Makya uses GPU autoscaling, which automatically adjusts the number of available GPUs based on the workload and performance requirements of the system. Provisioning a new GPU when a generator starts can take up to 5 minutes (this is when you see "Waiting for next worker").
Generator stopping
Generators stop automatically after generating 10,000 molecules for generators that use only 2D scores (such as QSAR models), and 5,000 molecules for generators that use 3D scores (3D ligand-based or 3D structure-based). A generator can be restarted to continue exploring, either where it left off or with a new seed (to explore in a new direction).
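As a rough illustration of the stopping rule above (the threshold values come from this article; the function names and the boolean flag are illustrative, not part of Makya's API):

```python
def stop_threshold(uses_3d_scores: bool) -> int:
    """Molecule count at which a generator stops automatically.

    Generators scored only with 2D models (e.g. QSAR) stop at 10,000
    molecules; generators using 3D scores stop at 5,000.
    """
    return 5_000 if uses_3d_scores else 10_000

def should_stop(molecules_generated: int, uses_3d_scores: bool) -> bool:
    return molecules_generated >= stop_threshold(uses_3d_scores)

print(should_stop(9_500, uses_3d_scores=False))  # → False (2D-only generator keeps running)
print(should_stop(9_500, uses_3d_scores=True))   # → True  (3D-scored generator has stopped)
```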
After a generator has completed its run, it takes a few minutes for the GPUs to shut down.
GPU sharing
Makya uses GPU sharing, which allows several generators to run on the same GPU.
The GPUs used by default are Nvidia T4 devices, which have 15 GB of usable memory. It is important to understand how individual generators make use of GPUs:
1. Fine Tuning generator: when you click Run on a Fine Tuning generator that has been set up, it starts with a pre-training step that goes over the user-defined configuration and the chemical space. This step trains an initial model that determines a number of generation agents (which generate the molecules) and the parameters for these agents. You can view it as a hyper-parameter optimisation step: the pre-training tries to find the best set of agents. These agents are then pushed to a queue for the actual generation of the final molecules.
Each agent (as described above) runs in a separate process that also takes about 3 GB of memory on average, so one Fine Tuning generation can require between 1 and 13 processes to finish. Up to 5 agent processes can run at the same time.
2. Fragment Growing and Fragment Linking generators: each needs only one process, which takes about 10 to 14 GB of memory. You can therefore run only one of them per GPU, and if you are lucky (when it takes less than 12 GB of memory) a Fine Tuning agent process can also run on the same GPU.
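The memory figures above can be turned into a quick feasibility check. The sketch below is illustrative only: the function name and the simple additive memory model are assumptions, not Makya's actual scheduler.

```python
GPU_MEMORY_GB = 15  # usable memory on a default Nvidia T4

def fits_on_gpu(process_memory_gb: list[float]) -> bool:
    """Check whether a set of processes fits on one T4,
    assuming their memory footprints simply add up."""
    return sum(process_memory_gb) <= GPU_MEMORY_GB

# Five Fine Tuning agent processes at ~3 GB each fit on one GPU:
print(fits_on_gpu([3] * 5))   # → True

# A Fragment Growing process at 12 GB plus one 3 GB agent just fits:
print(fits_on_gpu([12, 3]))   # → True

# A 14 GB Fragment Linking process leaves no room for an agent:
print(fits_on_gpu([14, 3]))   # → False
```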
Runtime estimation
It is difficult to predict the runtime of each process, as it depends on the type of generator, its configuration, and the speed of your APIs if you use any. The ETA gives you an estimate based on the elapsed time, but it can be inaccurate if your APIs slow down or if the algorithm finds fewer and fewer molecules as it advances.
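One simple way to think about such an ETA (the article does not describe Makya's exact formula; this is a naive linear extrapolation from elapsed time, which also shows why a slowing generation rate makes the estimate optimistic):

```python
def naive_eta_seconds(elapsed_s: float, molecules_done: int,
                      molecules_target: int) -> float:
    """Extrapolate the remaining time, assuming the generation
    rate observed so far stays constant."""
    if molecules_done == 0:
        raise ValueError("cannot estimate before any molecule is generated")
    rate = molecules_done / elapsed_s             # molecules per second so far
    remaining = molecules_target - molecules_done
    return remaining / rate

# 2,000 molecules in one hour, targeting 10,000 → 4 more hours predicted.
# If the true rate drops as the chemical space is exhausted, the real
# runtime will exceed this estimate.
print(naive_eta_seconds(3600, 2_000, 10_000))  # → 14400.0
```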
In the case of a generator that uses structure-based parameters, the total runtime is heavily impacted by the docking time (several minutes per generated molecule). Makya performs docking in parallel, on up to several thousand molecules at a time.