iQTL Tutorial

Below is a summary of command-line options, usage scenarios, and output files for analyses performed with iQTL.
iQTL requires three inputs: genotype data in PLINK format (.bed/.bim/.fam or .map/.ped), a transcript expression matrix file, and a transcript coordinates file.

Summary of Commands

--iqtl
Runs the epistatic eQTL (iQTL) analysis.
--transcript-matrix [transcript.matrix]
Tab-separated file with dimensions (n+1) by (m+2), where n is the number of individuals (plus one header row) and m is the number of transcripts (plus two columns for FID and IID).
--coordinates [transcripts.bed]
Tab-separated file with four columns: CHR, BP_Start, BP_End, Transcript_Name. Each Transcript_Name should match a column header in transcript.matrix.
--local-cis
Treats only the region surrounding the transcript as cis, rather than the whole chromosome.
--radius [default=1000]
Number of kilobases around the transcript (the “radius”) to treat as cis.
--TF
Selects SNPs within range of one or more transcription factors.
--TF-radius [default=0] arg
Number of kilobases by which to extend each transcription factor range.
--TF-file arg
Uses a transcription factor coordinates file instead of the default, built-in lookup table.
--full
Computes pairwise interactions for all SNPs (including trans-trans pairs) and all transcripts. Output follows the same format as other iQTL output. Not recommended unless a large amount of disk space is available.
--covar [covars.txt]
Covariate file for the eQTL and epistatic eQTL models, in PLINK covariate file format.
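
As a quick sanity check before a run, the two tab-separated inputs can be compared to confirm that every transcript in the matrix header has a coordinates entry. The following is a minimal sketch, not part of iQTL: the function names are hypothetical, it assumes the matrix header row starts with FID and IID labels and that the coordinates file has no header row, and ILMN_000001 is a made-up transcript name used only for illustration.

```python
import csv
import io


def read_transcript_names(matrix_text):
    """Return transcript names from the header row (columns after FID, IID)."""
    header = next(csv.reader(io.StringIO(matrix_text), delimiter="\t"))
    return header[2:]


def read_coordinates(bed_text):
    """Map Transcript_Name -> (CHR, BP_Start, BP_End)."""
    coords = {}
    for row in csv.reader(io.StringIO(bed_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        chrom, start, end, name = row
        coords[name] = (chrom, int(start), int(end))
    return coords


def missing_coordinates(matrix_text, bed_text):
    """List transcripts in the matrix header with no coordinates entry."""
    coords = read_coordinates(bed_text)
    return [t for t in read_transcript_names(matrix_text) if t not in coords]


# Tiny in-memory example: 2 individuals, 2 transcripts (values are made up).
matrix_text = (
    "FID\tIID\tILMN_113452\tILMN_000001\n"
    "F1\tI1\t7.1\t5.3\n"
    "F2\tI2\t6.8\t5.9\n"
)
bed_text = "11\t45000000\t50000000\tILMN_113452\n"

print(missing_coordinates(matrix_text, bed_text))
```

In this example the second transcript has no coordinates line, so it is the one reported as missing.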

Use Cases

Setup

Imagine a transcript with the following line in the coordinate file:

11 45000000 50000000 ILMN_113452

Case 1
All SNPs on chromosome 11 are considered cis. iQTL computes all interactions in which at least one SNP in the regression model is on chromosome 11.
Case 2
With --local-cis and the default radius of 1000 kb, only SNPs on chromosome 11 within the coordinates 44000000-51000000 are considered cis. iQTL computes all interactions in which at least one SNP is in this region.
Case 3
With --local-cis and a radius of 50 kb, only SNPs on chromosome 11 within the coordinates 44950000-50050000 are considered cis. iQTL computes all interactions in which at least one SNP is in this region.
Case 4
Same cis region as Case 3, but only interactions with transcription factors are considered.
Case 5
Same as Case 4, with a radius that extends the transcription factor ranges taken from the lookup table.
Case 6
Same as Case 5, but with a user-supplied file of transcription factor ranges rather than the default, built-in lookup table.
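
The cis region arithmetic can be sketched in a few lines. This is an illustration only, not iQTL's own code: the function names are hypothetical, and both the clamping at position 0 and the inclusive boundaries are assumptions, since the tutorial does not specify either.

```python
def cis_window(bp_start, bp_end, radius_kb=1000):
    """Return (start, end) of the cis region around a transcript, in base pairs."""
    radius_bp = radius_kb * 1000  # the radius is given in kilobases
    # Assumption: the window is clamped at 0 near the start of a chromosome.
    return max(0, bp_start - radius_bp), bp_end + radius_bp


def is_cis(snp_chr, snp_bp, transcript_chr, window):
    """True if the SNP is on the transcript's chromosome and inside the window."""
    start, end = window
    # Assumption: window boundaries are inclusive.
    return snp_chr == transcript_chr and start <= snp_bp <= end


# Transcript from the Setup line: 11 45000000 50000000 ILMN_113452
window = cis_window(45_000_000, 50_000_000)  # default radius of 1000 kb
print(window)  # (44000000, 51000000), the region given in Case 2
print(is_cis("11", 44_200_000, "11", window))  # True
```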

Command Lines

Each numbered command below corresponds to the case with the same number under Use Cases.


  1. ./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --out run

  2. ./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --out run2

  3. ./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --radius 50 --out run3

  4. ./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --radius 50 --TF --out run4

  5. ./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --radius 50 --TF --TF-radius 50 --out run5

  6. ./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --radius 50 --TF --TF-radius 50 --TF-file TF-coord.bed --out run6
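
When the same analysis is launched with several settings, the command lines above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not something shipped with iQTL; it uses only the flags shown in this tutorial and returns an argument list suitable for subprocess.run.

```python
import shlex


def build_iqtl_command(bfile, matrix, coordinates, out, local_cis=False,
                       radius=None, tf=False, tf_radius=None, tf_file=None):
    """Assemble an iqtl argument list from the options in this tutorial."""
    args = ["./iqtl", "--bfile", bfile,
            "--transcript-matrix", matrix,
            "--coordinates", coordinates,
            "--iqtl"]
    if local_cis:
        args.append("--local-cis")
    if radius is not None:
        args += ["--radius", str(radius)]  # kilobases
    if tf:
        args.append("--TF")
    if tf_radius is not None:
        args += ["--TF-radius", str(tf_radius)]  # kilobases
    if tf_file is not None:
        args += ["--TF-file", tf_file]
    args += ["--out", out]
    return args


# Reproduces command 3 above.
cmd = build_iqtl_command("plinkBfile", "transcript.matrix", "transcripts.bed",
                         "run3", local_cis=True, radius=50)
print(shlex.join(cmd))
```

Passing the list directly to subprocess.run(cmd, check=True) avoids shell quoting issues with file names.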

Outputs

To reduce output file sizes, each unique transcript analyzed has its own eQTL and iQTL output file.
run.testnumbers.txt
Two tab-separated columns. The first is the transcript analyzed; the second is the number of tests performed for that transcript, which is later used for Benjamini-Hochberg (BH) correction.
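
The role of the test count can be illustrated with a generic Benjamini-Hochberg adjustment. This is a sketch of the standard BH procedure, not a description of how iQTL applies it internally. The function names are hypothetical, the file is assumed to have no header row, and the p-values are made-up numbers. Passing the test count explicitly matters when only a subset of the results is at hand.

```python
def parse_test_numbers(lines):
    """Map transcript -> number of tests, from two tab-separated columns."""
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        transcript, n_tests = line.split("\t")
        counts[transcript] = int(n_tests)
    return counts


def bh_adjust(pvalues, n_tests=None):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    n_tests is the total number of tests performed; it may exceed
    len(pvalues) when only some of the results are supplied.
    """
    m = n_tests if n_tests is not None else len(pvalues)
    if m < len(pvalues):
        raise ValueError("n_tests cannot be smaller than the number of p-values")
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    adjusted = [0.0] * len(pvalues)
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


counts = parse_test_numbers(["ILMN_113452\t8\n"])
pvalues = [0.01, 0.04, 0.03, 0.005]  # made-up p-values for illustration
print(bh_adjust(pvalues, counts["ILMN_113452"]))
```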
run.loopinfo.txt
Three tab-separated columns. The first is the transcript analyzed. The second is the number of SNPs found in cis (or all SNPs under the full model). The third is the number of SNPs found in transcription factor ranges (or all SNPs under the full model).
run.*transcript*.eqtl.txt
Because the single-locus regression fit (Gene Expression ~ B0 + B1*SNP) is needed for the interaction model, iQTL automatically computes the single-locus eQTLs. This makes it possible to check whether either SNP in an implicated epistatic eQTL has a marginal effect on its own. Four tab-separated columns: the SNP, the transcript, the B1 coefficient, and the p-value of B1.
run.*transcript*.iqtl.txt
This is the primary output of iQTL. It assumes the regression model Gene Expression ~ B0 + B1*SNPa + B2*SNPb + B3*(SNPa x SNPb). Five tab-separated columns: SNPa, SNPb, the transcript name, the B3 coefficient, and the B3 p-value. In the file name, *transcript* is the name of the transcript as specified in transcript.matrix.
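
To make the interaction model concrete, the sketch below fits Gene Expression ~ B0 + B1*SNPa + B2*SNPb + B3*(SNPa x SNPb) by ordinary least squares in plain Python. It is an illustration under stated assumptions, not iQTL's implementation: the function names are hypothetical, genotypes are assumed to be coded as 0/1/2 allele counts, the expression values are synthetic, and p-values and covariates are omitted.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_interaction_model(expr, snp_a, snp_b):
    """Return [B0, B1, B2, B3] for expr ~ B0 + B1*a + B2*b + B3*(a*b)."""
    rows = [[1.0, a, b, a * b] for a, b in zip(snp_a, snp_b)]
    k = 4
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, expr)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data generated with B0=1, B1=0.5, B2=-0.25, B3=2.
snp_a = [0, 1, 2, 0, 1, 2, 0, 1, 2]
snp_b = [0, 0, 0, 1, 1, 1, 2, 2, 2]
expr = [1 + 0.5 * a - 0.25 * b + 2 * a * b for a, b in zip(snp_a, snp_b)]

b0, b1, b2, b3 = fit_interaction_model(expr, snp_a, snp_b)
print(round(b3, 6))
```

Because the data are generated without noise, the fitted B3 recovers the interaction coefficient used to simulate them, up to floating-point error.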