encore

Encore is a free, open-source command-line tool for analysis of GWAS (SNP) and other types of biological data. Several modes are available for various types of analysis, including:

  • Regression of SNPs and quantitative data in epistasis interaction networks (Regression GAIN/reGAIN)
  • Eigenvector centrality ranking of top SNPs/quantitative attributes (SNPrank)
  • Machine learning feature selection algorithms useful for filtering initially large data sets to the top few thousand attributes for subsequent analysis (Evaporative Cooling/EC)
  • Data formats and encoding of SNP and quantitative data types (PLINK)
  • Several additional options ported from the PLINK library
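
To make the idea behind eigenvector centrality ranking concrete, here is a minimal sketch in Python. It is not Encore's SNPrank code: it applies plain power iteration to a small symmetric matrix of non-negative weights, and the function name, attribute names, and matrix values are all hypothetical. Encore's actual implementation may differ, for example in how it normalizes or weights the network.

```python
def eigenvector_centrality(weights, max_iter=1000, tol=1e-12):
    """Return centrality scores (summing to 1) for a non-negative square matrix.

    Generic power iteration; an illustrative stand-in, not Encore's SNPrank.
    """
    n = len(weights)
    scores = [1.0 / n] * n  # start from a uniform score vector
    for _ in range(max_iter):
        # Multiply the matrix by the current score vector.
        updated = [sum(w * s for w, s in zip(row, scores)) for row in weights]
        total = sum(updated)
        if total == 0:
            raise ValueError("matrix has no non-zero weights")
        updated = [u / total for u in updated]  # renormalize to sum to 1
        change = sum(abs(u - s) for u, s in zip(updated, scores))
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical 3-attribute interaction matrix: the diagonal holds main
# effects, the off-diagonal entries hold pairwise interaction strengths.
names = ["snp_a", "snp_b", "snp_c"]
gain = [
    [0.2, 0.9, 0.1],
    [0.9, 0.3, 0.4],
    [0.1, 0.4, 0.1],
]

scores = eigenvector_centrality(gain)
ranking = [name for _, name in sorted(zip(scores, names), reverse=True)]
print(ranking)
```

An attribute scores highly when it is strongly connected to other high-scoring attributes, which is why the ranking reflects interactions as well as main effects.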

Encore builds on several libraries. The most prominent is PLINK, a widely used third-party GWAS analysis tool. We have modified the latest stable source release of PLINK into a library that handles GWAS file formats, filters data, and provides statistical and association tests. See our PLINK project page for more details.

Evaporative Cooling (EC) is another key library. It performs feature selection on SNPs and quantitative data, using ReliefF to capture interactions and Random Jungle to capture main effects. EC is also available as a standalone tool, in both source and binary releases.

If you are new to Encore, start with the tutorial. It describes several usage scenarios and provides full command-line examples.

Encore is available as a binary release for Linux, Mac, and Windows. The download links below point to either a compressed archive (.zip on Windows, .tar.bz2 on Linux) or a package installer (Mac OS X). We also provide the complete source code used to build the binaries. The source and associated distribution files are hosted on GitHub. Instructions for compiling Encore from source, along with the required dependencies, are provided on the GitHub project page.

The latest version of Encore is 1.0, released February 29, 2012.