Arlequin and R under Linux

Submitted by vojta on Tue, 07/01/2014 - 12:42

Arlequin is a very popular tool for population genetics. Its recent version (3.5) includes a build that runs on Linux (arlecore, the computational core only, without a GUI) and offers the possibility of parsing the output with the R statistical language. Both features are described only briefly in the official manual. I ran into some issues when running Arlequin on Linux and parsing its output with R, so I describe my solutions here in case someone else has similar needs and problems. :-) I work on openSUSE 13.1, but it should work the same way on any other Linux distribution.

Running Arlequin on Linux

The Windows version of Arlequin runs perfectly on Linux under Wine. Just install the wine package and launch WinArl35.exe. It is possible to run all analyses in the Windows version under Wine, but the native Linux version is much faster. It is still necessary to launch the Windows version at least once to create the Arlequin project (*.arp) and settings (*.ars) files. The Linux version is only the computational core without a graphical user interface, and the settings file in particular is too complicated to create manually. I'm not going to describe those files or how to create them; please consult the original documentation.

Arlecore (the Arlequin computational core) is available in 32-bit and 64-bit versions. They are functionally the same; the 32-bit version is only supposed to be about 10 % slower. The 64-bit version kept crashing for me, but the 32-bit version ran well. Basic information is given in arlecore_readme.txt (included in the downloaded ZIP file together with the binaries). To start the computations, copy the arlecore binary, the settings file and the project file to the same directory and type something like ./arlecore project_file.arp settings_file.ars. When the computations are over, a directory named project_name.res containing all the results is created. You can view the results by opening project_name_main.htm in a web browser.
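The launch step can be wrapped in a small shell function that refuses to start when one of the files is missing. This is only a sketch: the function name run_arlecore is my own invention (it is not part of Arlequin), and it assumes that the binary, the project file and the settings file all sit in the current directory.

```shell
#!/bin/bash
# run_arlecore is a hypothetical helper, not part of Arlequin.
# Usage: run_arlecore BINARY PROJECT.arp SETTINGS.ars
# Assumption: all three files are in the current directory.
run_arlecore() {
    local bin="$1" project="$2" settings="$3" f
    for f in "$bin" "$project" "$settings"; do
        if [ ! -f "$f" ]; then
            echo "Missing file: $f" >&2
            return 1
        fi
    done
    # A freshly unpacked binary may lack the executable bit
    [ -x "$bin" ] || chmod +x "$bin" || return 1
    "./$bin" "$project" "$settings"
}
```

With the file names used above, the call would be run_arlecore arlecore project_file.arp settings_file.ars.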

How to view results correctly

To display the JavaScript tree menu in the left column and format the HTML output correctly, Arlequin expects the required images, scripts and styles in a certain location, which will most likely be wrong on your system. This section describes how to correct that. Create a directory named style in the Arlequin results directory and copy into it all GIF, JS and XSL files from the ZIP file containing the Arlequin installation downloaded from its website. The results directory with the copied images and style files should look like this:

├── Arlequin_log.txt
├── project_name.js
├── project_name_main.htm
├── project_name_sim1.arp
├── project_name_tree.htm
├── project_name.xml
├── fdist2_ObsOut.txt
├── fdist2_simOut.txt
├── ld_dis.xl
├── MissDataPerLocus.txt
└── style
    ├── ArlequinStyleSheet.xsl
    ├── diffDoc.gif
    ├── diffFolder.gif
    ├── ftiens4.js
    ├── ftv2blank.gif
    ├── ftv2doc.gif
    ├── ftv2folderclosed.gif
    ├── ftv2folderopen.gif
    ├── ftv2lastnode.gif
    ├── ftv2link.gif
    ├── ftv2mlastnode.gif
    ├── ftv2mnode.gif
    ├── ftv2node.gif
    ├── ftv2plastnode.gif
    ├── ftv2pnode.gif
    ├── ftv2vertline.gif
    └── ua.js
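
The copying can be done in one go from the shell. The following is a minimal sketch, assuming the GIF, JS and XSL files sit directly in the directory where you unzipped the Arlequin installation; the function name copy_arlequin_style and both arguments are hypothetical.

```shell
#!/bin/bash
# copy_arlequin_style is a hypothetical helper, not part of Arlequin.
# Usage: copy_arlequin_style UNZIPPED_INSTALL_DIR RESULTS_DIR
# Assumption: the GIF, JS and XSL files are at the top level of UNZIPPED_INSTALL_DIR.
copy_arlequin_style() {
    local src="$1" res="$2"
    if [ ! -d "$src" ] || [ ! -d "$res" ]; then
        echo "Both arguments must be existing directories" >&2
        return 1
    fi
    mkdir -p "$res/style" || return 1
    # Match the extensions case-insensitively and copy only regular files
    find "$src" -maxdepth 1 -type f \
        \( -iname '*.gif' -o -iname '*.js' -o -iname '*.xsl' \) \
        -exec cp {} "$res/style/" \;
}
```

If the files are nested deeper in the unzipped archive, point the first argument at the subdirectory that holds them.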

Now it is necessary to correct the paths to the images and styles. Open the JS file, change the ICONPATH line to ICONPATH = 'style/' and delete repeated lines (if there are any). Don't be surprised if the file shrinks from ~10 MB to ~4 kB. :-) In the XML file, change the line starting with <?xml-stylesheet... (in the head of the document) to <?xml-stylesheet type="text/xsl" href="style/ArlequinStyleSheet.xsl"?>. Then make sure the frames in project_name_main.htm use relative paths. It should look like this:

<FRAMESET cols="200,*">
  <FRAME src="project_name_tree.htm" name="treeframe">
  <FRAME SRC="project_name.xml" name="basefrm">

The last file to edit is the second HTM file, project_name_tree.htm, where you must again set relative paths to the scripts ua.js and ftiens4.js, so that the respective lines read <script src="style/ua.js"></script> and <script src="style/ftiens4.js"></script>. Now the results should look fine in any browser and you can safely move the folder to any other location.
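
All four edits can also be scripted with sed and awk. Take the following as a sketch under assumptions, not a tested recipe for every Arlequin output: the helper names are hypothetical, the repeated lines in the JS file are assumed to be exact duplicates that are safe to drop, and the ICONPATH and xml-stylesheet statements are assumed to each occupy a whole line. Keep a backup of the results directory before trying it.

```shell
#!/bin/bash
# Hypothetical helpers, not part of Arlequin.

# rewrite_file FILE COMMAND...: filter FILE through COMMAND and save the result back
rewrite_file() {
    local file="$1" tmp
    shift
    tmp="$(mktemp)" || return 1
    "$@" < "$file" > "$tmp" && cat "$tmp" > "$file"
    local status=$?
    rm -f "$tmp"
    return "$status"
}

# fix_arlequin_paths RESULTS_DIR PROJECT_NAME
fix_arlequin_paths() {
    local res="$1" name="$2"
    # JS file: point ICONPATH at style/, then keep only the first copy of each line
    rewrite_file "$res/$name.js" \
        sed "s|^[[:space:]]*ICONPATH[[:space:]]*=.*|ICONPATH = 'style/'|" || return 1
    rewrite_file "$res/$name.js" awk '!seen[$0]++' || return 1
    # XML file: use the stylesheet copied to style/
    rewrite_file "$res/$name.xml" \
        sed 's|^<?xml-stylesheet.*|<?xml-stylesheet type="text/xsl" href="style/ArlequinStyleSheet.xsl"?>|' || return 1
    # Main HTM file: strip any directory part from the frame sources
    rewrite_file "$res/${name}_main.htm" \
        sed 's|\([sS][rR][cC]="\)[^"]*[/\\]|\1|g' || return 1
    # Tree HTM file: load both scripts from style/
    rewrite_file "$res/${name}_tree.htm" \
        sed -e 's|[sS][rR][cC]="[^"]*ua\.js"|src="style/ua.js"|' \
            -e 's|[sS][rR][cC]="[^"]*ftiens4\.js"|src="style/ftiens4.js"|'
}
```

A call like fix_arlequin_paths project_name.res project_name would then process project_name.js, project_name.xml and both HTM files in that directory.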

Generating graphical output with R

From the Arlequin download page, download the file with the R functions and unzip it anywhere. One of the unpacked files, rParsingSettings.r, contains paths you need to edit. As infile, set the path to the XML file within the Arlequin results directory (for example /home/vojta/Documents/analysis/project_name.res/project_name.xml); as outfiles, set a directory within the Arlequin results directory (for example /home/vojta/Documents/analysis/project_name.res/Graphics/). The Graphics directory doesn't exist yet, so you have to create it in advance. The last parameter is sourcePath; set it to the directory with the R functions (for example /home/vojta/bin/arlequin/Rfunctions/).
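
Since everything here depends on correct paths, it can help to check them from the shell before starting R. A minimal sketch: the function name prepare_r_parsing is hypothetical, and its three arguments simply mirror the infile, outfiles and sourcePath values you put into rParsingSettings.r.

```shell
#!/bin/bash
# prepare_r_parsing is a hypothetical helper, not part of Arlequin.
# Usage: prepare_r_parsing INFILE OUTDIR SOURCEPATH
prepare_r_parsing() {
    local infile="$1" outdir="$2" sourcepath="$3"
    if [ ! -f "$infile" ]; then
        echo "XML file not found: $infile" >&2
        return 1
    fi
    if [ ! -d "$sourcepath" ]; then
        echo "Directory with R functions not found: $sourcepath" >&2
        return 1
    fi
    # Create the output directory (for example Graphics/) if it is missing
    mkdir -p "$outdir"
}
```

For the example paths above, the call would be prepare_r_parsing /home/vojta/Documents/analysis/project_name.res/project_name.xml /home/vojta/Documents/analysis/project_name.res/Graphics/ /home/vojta/bin/arlequin/Rfunctions/.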

Now you can launch R. You can work from the command line (just type R in a terminal) or use a graphical environment such as RStudio or RKWard. First, you need to install the R XML package. If you are going to compile the package yourself (a common situation on Linux), you also need the development package of the XML library (called libxml2-devel in openSUSE) and the common compilation tools (usually preinstalled by most Linux distributions). Install the XML package in R with install.packages("XML") and load it with library(XML). Then just run the script rParsingSettings_fst.r, which adds some graphics to your Arlequin output, with a command like source("/home/vojta/bin/arlequin/Rfunctions/rParsingSettings_fst.r"), and that's it. Done. Now check your Arlequin results in the browser as usual.

One script to rule them all

You can of course write a simple script to do all the work for you, keep it and reuse it in any future analysis. It can look like this:

#!/bin/bash
# Ensure we are in the correct directory (stop if it does not exist)
cd ~/Documents/research/projectX/arlequin/ || exit 1
# Launch Arlequin and record its output to a log file
./arlecore3513_32bit projectx.arp project.ars | tee projectx_arlequin.log
# Create the directory for graphics (no error if it already exists)
mkdir -p projectx.res/Graphics
# Launch a simple R script to add some graphical output (see below)
# You can also use the command Rscript arlequin.r, which prints the R output to the terminal
# R CMD BATCH logs the R output to the file arlequin.r.Rout and doesn't keep an R session open when done
R CMD BATCH arlequin.r

The R script can look like this:

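# Ensure we are in the correct directory (same path as in the shell script)
setwd("~/Documents/research/projectX/arlequin/")
# Load the XML library
library(XML)
# Do the work (path to the R functions as in the example above)
source("/home/vojta/bin/arlequin/Rfunctions/rParsingSettings_fst.r")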

You can of course copy rParsingSettings.r anywhere and edit it. It all comes down to correct paths, so that all files are found in the right locations. Once they are, the whole procedure is easy and straightforward.