Automatic fragmentation driver.
The autofrag module was created with the intent of making fragmentation-based calculations (FLMO, iOI, etc.) easy to perform. In particular, the following tasks, which normally require intensive user intervention, can now be done completely automatically in an almost black-box manner:
- Generate atomic connectivity information (in other words, generate bonds between atoms), when only the atomic coordinates but not the connectivity information are known
- Cut some of the bonds to give reasonably sized, chemically sensible fragments
- Add buffer atoms to the fragments
- Add link atoms (hydrogens, or projected hybrid orbital (PHO) boundary atoms) to saturate dangling bonds
- Prepare subsystem input files, and run them in an embarrassingly parallel fashion
- Prepare and run the global system input file
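The first task in the list, generating bonds from bare coordinates, can be illustrated with a distance criterion. The following is a minimal sketch only, not autofrag's actual algorithm: the function name `guess_bonds`, the radii table and the 1.2 scaling factor are all assumptions made for illustration.

```python
import math

# Approximate covalent radii in Angstrom. Illustrative values, assumed here;
# they are not taken from autofrag or bdffrag.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "Cl": 1.02}


def guess_bonds(atoms, scale=1.2):
    """Return index pairs (i, j), i < j, closer than scale * (r_i + r_j).

    `atoms` is a list of (symbol, x, y, z) tuples with coordinates in Angstrom.
    """
    bonds = []
    for i, (sym_i, *xyz_i) in enumerate(atoms):
        for j in range(i + 1, len(atoms)):
            sym_j, *xyz_j = atoms[j]
            cutoff = scale * (COVALENT_RADII[sym_i] + COVALENT_RADII[sym_j])
            if math.dist(xyz_i, xyz_j) < cutoff:
                bonds.append((i, j))
    return bonds


# A water molecule: the two O-H contacts are bonds, the H...H contact is not.
water = [
    ("O", 0.0000, 0.0000, 0.0000),
    ("H", 0.9572, 0.0000, 0.0000),
    ("H", -0.2400, 0.9266, 0.0000),
]
print(guess_bonds(water))
```

A real implementation also has to deal with metals, unusual bonding patterns and bond orders, which is one reason the keyword charge exists.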
Unlike most other modules, which are written in Fortran, autofrag is written in Python, yet it is called in exactly the same way as other modules. One writes an $autofrag block in the input file, which must precede all other modules, including $compass, i.e.
$autofrag
...
$end
$compass
...
$end
# remaining input blocks...
When creating subsystem and global system input files, autofrag uses the input blocks after the $autofrag block as a template for creating the input files. Below is a full example:
$autofrag
method
 ioi # To request a conventional FLMO calculation, change ioi to flmo
nprocs
 2 # Use at most 2 parallel processes in calculating the subsystems
$end
$compass
Title
 hydroxychloroquine (diprotonated)
Basis
 6-31G(d)
Geometry # snapshot of GFN2-xTB molecular dynamics at 298 K
C   -4.2028  -1.1506   2.9497
C   -4.1974  -0.4473   4.1642
C   -3.7828   0.9065   4.1812
C   -3.4934   1.5454   2.9369
C   -3.4838   0.8240   1.7363
C   -3.7584  -0.5191   1.7505
H   -4.6123  -0.8793   5.0715
C   -3.3035   3.0061   2.9269
H   -3.1684   1.2214   0.8030
H   -3.7159  -1.1988   0.9297
C   -3.1506   3.6292   4.2183
C   -3.3495   2.9087   5.3473
H   -2.8779   4.6687   4.2878
H   -3.2554   3.3937   6.3124
N   -3.5923   1.5989   5.4076
Cl  -4.6402  -2.7763   3.0362
H   -3.8651   1.0100   6.1859
N   -3.3636   3.6632   1.7847
H   -3.4286   2.9775   1.0366
C   -3.5305   5.2960  -0.0482
H   -2.4848   5.4392  -0.0261
H   -3.5772   4.3876  -0.6303
C   -4.1485   6.5393  -0.7839
H   -3.8803   6.3760  -1.8559
H   -5.2124   6.5750  -0.7031
C   -3.4606   7.7754  -0.2653
H   -2.3720   7.6699  -0.3034
H   -3.7308   7.9469   0.7870
N   -3.8415   8.9938  -1.0424
H   -3.8246   8.8244  -2.0837
C   -2.7415   9.9365  -0.7484
H   -1.7736   9.4887  -0.8943
H   -2.8723  10.2143   0.3196
C   -2.7911  11.2324  -1.6563
H   -1.7773  11.3908  -2.1393
H   -3.5107  10.9108  -2.4646
H   -3.0564  12.0823  -1.1142
C   -5.1510   9.6033  -0.7836
H   -5.5290   9.1358   0.1412
H   -5.0054  10.6820  -0.6847
C   -6.2224   9.3823  -1.8639
H   -6.9636  10.1502  -1.7739
H   -5.8611   9.4210  -2.8855
O   -6.7773   8.0861  -1.6209
H   -7.5145   7.9086  -2.2227
C   -4.0308   4.9184   1.3736
H   -3.7858   5.6522   2.1906
C   -5.5414   4.6280   1.3533
H   -5.8612   3.8081   0.7198
H   -5.9086   4.3451   2.3469
H   -6.1262   5.5024   1.0605
End geometry
Skeleton
$end
$xuanyuan
Direct
Schwarz
rs # the range separation parameter omega (or mu) of wB97X
 0.3
$end
$scf
rks
dft
 wB97X
iprt
 2
charge
 2
coulpot+cosx
$end
$localmo
FLMO
$end
This performs an iOI calculation of hydroxychloroquine dication at the wB97X/6-31G(d) level of theory, using MPEC+COSX to accelerate the calculation (though given the small size of the system there is hardly any acceleration). Some comments:
- The job will generate several input files xxx.fragment*.inp (the input files for the subsystem calculations) and one input file xxx.global.inp (the input file for the global calculation), and run them automatically in the correct order.
- If the $autofrag block is removed, the rest of the input file is still a valid input file, which calculates the global system without using fragmentation methods.
- The charge and spin of the global system (when they differ from 0 and 1, respectively) must be given explicitly in the $scf block.
- All input blocks are simply copied to the subsystem input files; all input blocks except the final $localmo block are copied to the global input file. Thus, while all subsystem SCF calculations are followed by an orbital localization step, the global system SCF calculation is not followed by orbital localization; instead, the converged FLMOs are obtained by the bottom-up least-change method. Of course, for the subsystem input files, the geometry in the $compass block will be replaced by the subsystems' geometries, and the charge and spin in the $scf block will be modified accordingly.
- Input blocks other than $compass, $xuanyuan, $scf and $localmo may not be handled correctly. Thus, if you want to do, e.g. a fragmentation-based SCF gradient or TDDFT calculation, you have to prepare a separate input file and read in the converged global wavefunction from xxx.global.scforb.
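The template rule in the comments above (every block after $autofrag goes to the subsystem inputs, and every block except the final $localmo goes to the global input) can be sketched as follows. This is a hedged illustration, not autofrag's own code: `split_blocks` and `make_templates` are hypothetical names, and the sketch ignores the geometry, charge and spin substitutions that autofrag performs for each subsystem.

```python
def split_blocks(text):
    """Split an input deck into (name, body_lines) pairs, one per $name ... $end block."""
    blocks, name, body = [], None, []
    for line in text.splitlines():
        stripped = line.strip()
        if name is None:
            # Outside a block: only a "$name" line (other than "$end") opens one.
            if stripped.startswith("$") and stripped.lower() != "$end":
                name, body = stripped[1:].lower(), []
        elif stripped.lower() == "$end":
            blocks.append((name, body))
            name = None
        else:
            body.append(line)
    return blocks


def make_templates(text):
    """Return (subsystem_blocks, global_blocks) following the rule described above."""
    blocks = [b for b in split_blocks(text) if b[0] != "autofrag"]
    drop_last = bool(blocks) and blocks[-1][0] == "localmo"
    return blocks, (blocks[:-1] if drop_last else blocks)


deck = """$autofrag
method
 ioi
$end
$compass
Basis
 6-31G(d)
$end
$scf
rks
$end
$localmo
FLMO
$end
"""
subsystem, global_ = make_templates(deck)
print([name for name, _ in subsystem])
print([name for name, _ in global_])
```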
Autofrag depends on bdffrag as a low-level facility. For certain tasks it may be necessary to call bdffrag from the command line; please see the bdffrag page for details.
Keywords
method
Fragmentation method. Valid values: FLMO; iOI.
Note that, for now, iOI supports only RKS calculations, whereas FLMO supports RKS, UKS and ROKS calculations.
nprocs
Maximum number of subsystems allowed to run concurrently in the embarrassingly parallel subsystem calculations. When nprocs is smaller than OMP_NUM_THREADS, autofrag adopts a hybrid parallelization scheme: each subsystem calculation is itself OpenMP-parallelized, so that the total number of concurrent OpenMP threads does not exceed OMP_NUM_THREADS. For example, if OMP_NUM_THREADS=16 and
nprocs 2
then a maximum of 2 subsystems will be calculated at the same time, each using 8 OpenMP threads. For obvious reasons, it is recommended that OMP_NUM_THREADS be an integer multiple of nprocs.
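The arithmetic behind this example can be written down in a few lines. This is a sketch under the assumption that the thread budget is divided evenly with integer (floor) division; the function name `threads_per_subsystem` is hypothetical, and only the case nprocs <= OMP_NUM_THREADS described above is covered.

```python
def threads_per_subsystem(omp_num_threads, nprocs):
    """OpenMP threads given to each concurrently running subsystem (assumed floor division)."""
    if nprocs < 1 or omp_num_threads < nprocs:
        raise ValueError("this sketch only covers 1 <= nprocs <= OMP_NUM_THREADS")
    return omp_num_threads // nprocs


print(threads_per_subsystem(16, 2))  # the example from the text: 8 threads each
print(threads_per_subsystem(16, 3))  # 5 threads each, so one of the 16 threads stays idle
```

The second call shows why an integer multiple is recommended: with 3 concurrent subsystems only 15 of the 16 threads are used.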
radcent
A parameter used for determining the size of the primitive fragments. Increasing radcent increases the size of the subsystems and decreases their number. Default: 3.0.
radbuff
Buffer radius (in Angstrom). Unlike radcent, increasing radbuff only increases the size of the subsystems and does not decrease their number. Default: 2.0.
For iOI calculations, radbuff refers to the buffer radius of macroiteration 0. The buffer radius will be automatically enlarged for subsequent macroiterations.
ioithresh
Subsystem convergence threshold in an iOI calculation. Decreasing ioithresh will lead to more iOI macroiterations, but better initial guess LMOs for the global SCF and therefore fewer global SCF iterations. Default: 0.1.
PHO
If specified, use PHO boundary atoms instead of hydrogen link atoms; the former is slightly more expensive, but more accurate, than the latter, and is especially recommended when the buffer radius is small or even zero. This is the default.
NoPHO
Turn off PHO, and use hydrogen link atoms instead.
Charge
Specify formal charges of certain atoms, to help the program fragment the molecule correctly and assign correct charges to the subsystems. This is sometimes necessary for molecules with unconventional bonding patterns, especially molecules that contain metals. For example,
charge 10 +2 25 -1 78 -1
sets atom 10 to +2 charge, and atoms 25 and 78 to -1 charge each.
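The value line is a flat list of atom-index/value pairs. A minimal sketch of how such a line could be interpreted is given below; `parse_atom_values` is a hypothetical helper written for illustration and is not part of autofrag.

```python
def parse_atom_values(spec):
    """Turn '10 +2 25 -1 78 -1' into {10: 2, 25: -1, 78: -1}."""
    tokens = spec.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of tokens: atom index, value, ...")
    # Even positions are atom indices, odd positions are the signed values.
    return {int(atom): int(value) for atom, value in zip(tokens[0::2], tokens[1::2])}


charges = parse_atom_values("10 +2 25 -1 78 -1")
print(charges)
print(sum(charges.values()))  # net contribution of the listed atoms
```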
SpinOcc
Specify formal spins of certain atoms, to guide the calculation towards a desired spin state. The grammar is the same as that of keyword "charge". For example,
spinocc 13 +1 17 -1
specifies one unpaired alpha electron on atom 13 and one unpaired beta electron on atom 17. Note that all open-shell atoms must be explicitly indicated. When one or more of the spins are delocalized over more than one atom, one should draw the dominant resonance structure and assign the spins as if they were localized according to that resonance structure.
MaxIter
Maximum number of iOI macroiterations.
Dryrun
Only generate the subsystem and global input files, but do not run them.
Interactive
Pause the program after the subsystems have been calculated, so that the user can manually adjust the orbital occupations by inspecting the subsystem wavefunctions and modifying the global calculation's input file. When the editing is finished, the user should create a file named $BDFTASK.done in the working directory to signal this to the program. The program then deletes the $BDFTASK.done file and goes on with the global calculation.
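Creating the marker file can be scripted. The sketch below assumes that $BDFTASK expands to the task name of the job (passed in explicitly here as `task_name`); the helper `signal_editing_done` is hypothetical and only creates the empty file that the program waits for.

```python
from pathlib import Path


def signal_editing_done(task_name, workdir="."):
    """Create an empty <task_name>.done file in workdir and return its path."""
    marker = Path(workdir) / f"{task_name}.done"
    marker.touch()
    return marker
```

For a job whose task name is, say, mytask, calling `signal_editing_done("mytask")` from the working directory has the same effect as creating mytask.done by hand.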