STRUCTURE multi PBS Pro scripts

Submitted by vojta on Thu, 02/25/2021 - 17:24

Set of scripts to run STRUCTURE in parallel on computing grids like MetaCentrum. The scripts are designed for grids and clusters using PBS Pro, but can easily be adapted to another queue system.

Homepage and reporting issues

See, ask about usage or so at and report any issues or wishes using


GNU General Public License 3.0, see

About STRUCTURE and its parallelization

STRUCTURE itself processes a single file at a time. It has a simple Java GUI for creating batch tasks and running them on a desktop, or possibly also on MetaCentrum. Another option is the ParallelStructure R package (see my example and slides), but it has problems with some input file formats, and it runs on a single computer, using multiple cores. The provided scripts distribute individual runs of STRUCTURE among multiple computers in a computing cluster/grid, which speeds up the whole analysis considerably.
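To illustrate the idea of splitting the analysis into independent runs, here is a minimal BASH sketch that only prints one STRUCTURE command line per combination of K and replicate. It is not the code of the provided scripts. The ranges (KMIN, KMAX, REPS) and the file names mainparams, extraparams and input.str are assumptions made up for this example; the output name follows the res.k.X.rep.Y.out pattern, to which STRUCTURE appends _f.

```shell
#!/bin/bash
# Hypothetical ranges for this sketch; adjust them to your analysis.
KMIN=1
KMAX=3
REPS=2

# Print one STRUCTURE command line per (K, replicate) pair.
# Each printed line is an independent task that could run on a different node.
# The file names mainparams, extraparams and input.str are placeholders.
list_tasks() {
  local k rep
  for ((k = KMIN; k <= KMAX; k++)); do
    for ((rep = 1; rep <= REPS; rep++)); do
      echo "structure -m mainparams -e extraparams -K $k -i input.str -o res.k.$k.rep.$rep.out"
    done
  done
}

list_tasks
```

Because the runs do not depend on each other, each printed line could be handed to the queue system as a separate job, so the total time is limited mainly by the slowest single run and by how many jobs the grid starts at once.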

Requirements to use the scripts

The scripts are written for Linux servers. They might also run on other UNIX systems. Apart from BASH, the only requirement is STRUCTURE. It is already installed on MetaCentrum, so the user can simply load the module. If you use your own installation of STRUCTURE, either comment out or update the respective line in the script. If you are unsure how to work in the Linux command line on a computing cluster, consult e.g. my slides or the MetaCentrum wiki.
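Before submitting anything, it can help to verify that the STRUCTURE binary is actually reachable. The following is a small sketch of such a check, not part of the provided scripts. The variable STRUCTURE_BIN and the helper have_cmd are hypothetical names introduced here; the sketch assumes the binary is called structure unless STRUCTURE_BIN says otherwise.

```shell
#!/bin/bash
# STRUCTURE_BIN is a hypothetical variable for this sketch;
# set it to the path of your own binary if it is not on PATH.
STRUCTURE_BIN="${STRUCTURE_BIN:-structure}"

# Succeed only if the given command can be found.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd "$STRUCTURE_BIN"; then
  echo "Using STRUCTURE at: $(command -v "$STRUCTURE_BIN")"
else
  echo "STRUCTURE not found: load the module or set STRUCTURE_BIN" >&2
fi
```

Run it in the same environment in which the jobs will run, i.e. after loading the module on the cluster.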

Postprocessing of the results

For the next step, collect all res.k.X.rep.Y.out_f files in the output directory. Select the best K using e.g. the Structure_sum R script (see my example and slides) or Structure Harvester. Align and reorder the results with CLUMPP and draw the final plots with e.g. distruct. See also my complete example.
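Before selecting the best K, it is worth checking that every K has the expected number of finished replicates. A minimal BASH sketch of such a check follows; the function name count_runs_per_k is hypothetical, and the sketch assumes only the res.k.X.rep.Y.out_f naming mentioned above.

```shell
#!/bin/bash
# Count STRUCTURE result files per K in a directory (default: current directory).
count_runs_per_k() {
  local dir="${1:-.}" f name k
  for f in "$dir"/res.k.*.rep.*.out_f; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    name="${f##*/}"           # strip the directory part
    k="${name#res.k.}"        # drop the "res.k." prefix
    k="${k%%.rep.*}"          # keep only the K value
    echo "$k"
  done | sort -n | uniq -c | awk '{print "K=" $2 ": " $1 " run(s)"}'
}
```

Calling count_runs_per_k with the path to the output directory prints one line per K; a K with fewer runs than the others points to a failed or unfinished job that should be resubmitted before summarizing the results.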