Below is a summary of the command-line options, usage scenarios, and output files for analyses performed with iQTL.
iQTL requires three inputs: a PLINK genotype file set (.bed/.bim/.fam or .map/.ped), a transcript expression matrix, and a transcript coordinates file.
Summary of Commands
--iqtl
- Runs the epistatic eQTL (iQTL) analysis
--transcript-matrix [transcript.matrix]
- Dimensions: (n+1) by (m+2); n = number of individuals (plus one for the header row); m = number of transcripts (plus two for the FID and IID columns). Tab-separated.
--coordinates [transcripts.bed]
- Format is four tab-separated columns: CHR, BP_Start, BP_End, Transcript_Name. Transcript_Name should match the corresponding column header in transcript.matrix.
--local-cis
- Treats only the region surrounding the transcript as cis, rather than the whole chromosome
--radius [default=1000]
- Specifies the number of kilobases to use as the “radius”, that is, the region around the transcript that is considered “cis”
--TF
- Selects SNPs that fall within the range of one or more transcription factors
--TF-radius [default=0]
- Number of kilobases by which to extend each transcription factor range
--TF-file [TF-coord.bed]
- Uses a file of transcription factor ranges in place of the default, built-in lookup table
--full
- Computes pairwise interactions for all SNP pairs (including trans-trans) and all transcripts. The output format matches the other iQTL output files. Not recommended unless a large amount of disk space is available.
--covar [covars.txt]
- Covariate file, in PLINK covariate format, used in both the eQTL and epiQTL models
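The input formats described for --transcript-matrix and --coordinates can be checked before a run. The following is a minimal sketch, not part of iQTL itself: the function names are hypothetical, the sample values (including the transcript ILMN_000001) are made up, and it assumes the coordinates file has no header row.

```python
import csv
import io


def read_transcript_matrix(handle):
    """Parse a tab-separated matrix: one header row, then one row per individual."""
    rows = list(csv.reader(handle, delimiter="\t"))
    header, body = rows[0], rows[1:]
    transcripts = header[2:]  # the first two columns are FID and IID
    for row in body:
        if len(row) != len(header):
            raise ValueError(
                f"row {row[:2]} has {len(row)} fields, expected {len(header)}"
            )
    return transcripts, body


def read_coordinates(handle):
    """Parse four columns: CHR, BP_Start, BP_End, Transcript_Name (assumed headerless)."""
    coords = {}
    for chrom, start, end, name in csv.reader(handle, delimiter="\t"):
        coords[name] = (chrom, int(start), int(end))
    return coords


def missing_coordinates(transcripts, coords):
    """Return matrix transcripts that have no entry in the coordinates file."""
    return [t for t in transcripts if t not in coords]


# Toy in-memory inputs with made-up expression values.
matrix_text = (
    "FID\tIID\tILMN_113452\tILMN_000001\n"
    "F1\tI1\t7.2\t5.1\n"
    "F2\tI2\t6.9\t5.4\n"
)
coords_text = "11\t45000000\t50000000\tILMN_113452\n"

transcripts, body = read_transcript_matrix(io.StringIO(matrix_text))
coords = read_coordinates(io.StringIO(coords_text))
print(missing_coordinates(transcripts, coords))
```

Here the second transcript has no coordinates entry, so it is reported as missing. For real files, pass open file handles instead of the io.StringIO objects.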
Use Cases
Setup
Imagine a transcript with the following line in the coordinate file:
11 45000000 50000000 ILMN_113452
- Case 1
- All SNPs on chromosome 11 are considered cis, and iQTL computes all interactions in which at least one SNP in the regression model is on chromosome 11.
- Case 2
- All SNPs on chromosome 11 within the coordinates 44000000-51000000 are considered cis. Thus, all interactions would be computed where at least one SNP is in this region.
- Case 3
- All SNPs on chromosome 11 within the coordinates 44950000-50050000 (a 50 kb radius) are considered cis. Thus, all interactions would be computed where at least one SNP is in this region.
- Case 4
- All SNPs on chromosome 11 within the coordinates 44950000-50050000 are considered cis. Thus, all interactions would be computed where at least one SNP is in this region. Only interactions with transcription factors are considered.
- Case 5
- All SNPs on chromosome 11 within the coordinates 44950000-50050000 are considered cis. Thus, all interactions would be computed where at least one SNP is in this region. Only interactions with transcription factors are considered. A radius extends the ranges in the transcription factor lookup table.
- Case 6
- All SNPs on chromosome 11 within the coordinates 44950000-50050000 are considered cis. Thus, all interactions would be computed where at least one SNP is in this region. Only interactions with transcription factors are considered. A radius extends the transcription factor ranges, which are read from a user-supplied file rather than the default, built-in lookup table.
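The cis definitions in Cases 1 and 2 can be illustrated with a short sketch. This is not iQTL's implementation: the function names and SNP positions are hypothetical, and treating the window boundaries as inclusive and clamping the start at zero are assumptions.

```python
from itertools import combinations


def cis_window(start_bp, end_bp, radius_kb=1000):
    """Extend a transcript's coordinates by radius_kb kilobases on each side."""
    pad = radius_kb * 1000
    return max(0, start_bp - pad), end_bp + pad


def is_cis(snp, transcript, radius_kb=None):
    """snp = (chrom, bp); transcript = (chrom, start, end).

    radius_kb=None treats the whole chromosome as cis (no --local-cis).
    """
    snp_chrom, snp_bp = snp
    chrom, start, end = transcript
    if snp_chrom != chrom:
        return False
    if radius_kb is None:
        return True
    low, high = cis_window(start, end, radius_kb)
    return low <= snp_bp <= high  # inclusive boundaries are an assumption


def candidate_pairs(snps, transcript, radius_kb=None):
    """Keep SNP pairs in which at least one SNP is cis to the transcript."""
    return [
        (a, b)
        for a, b in combinations(sorted(snps), 2)
        if is_cis(snps[a], transcript, radius_kb)
        or is_cis(snps[b], transcript, radius_kb)
    ]


transcript = ("11", 45000000, 50000000)  # the ILMN_113452 line from the setup
snps = {  # hypothetical SNP positions
    "rsA": ("11", 44200000),
    "rsB": ("11", 10000000),
    "rsC": ("2", 500000),
}

print(cis_window(45000000, 50000000))                     # default radius of 1000 kb
print(candidate_pairs(snps, transcript))                  # whole chromosome is cis
print(candidate_pairs(snps, transcript, radius_kb=1000))  # local cis window only
```

With the whole chromosome as cis, every pair contains a chromosome 11 SNP and is kept. With the default 1000 kb radius, only pairs that include rsA remain.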
Command Lines
- Case 1
./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --out run
- Case 2
./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --out run2
- Case 3
./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --radius 50 --out run3
- Case 4
./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --radius 50 --TF --out run4
- Case 5
./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --radius 50 --TF --TF-radius 50 --out run5
- Case 6
./iqtl --bfile plinkBfile --transcript-matrix transcript.matrix --coordinates transcripts.bed --iqtl --local-cis --radius 50 --TF --TF-radius 50 --TF-file TF-coord.bed --out run6
Outputs
To reduce output file sizes, each unique transcript analyzed has its own eQTL and iQTL output file.
run.testnumbers.txt
- Format is two tab-separated columns. The first column is the transcript analyzed. The second is the number of tests performed, which is later used for Benjamini-Hochberg (BH) correction.
run.loopinfo.txt
- Format is three tab-separated columns. The first column is the transcript analyzed. The second is the number of SNPs found in cis (or all SNPs under the full model). The third is the number of SNPs found in transcription factor ranges (or all SNPs under the full model).
run.*transcript*.eqtl.txt
- Because the single-locus regression fit (Gene Expression ~ B0 + B1*SNP) is necessary for the interaction model, iQTL automatically computes the single-locus eQTLs. This makes it possible to check whether either SNP in an implicated epistatic eQTL has a marginal effect on its own. Format is four tab-separated columns: the SNP, the transcript, the B1 coefficient, and the p-value of B1.
run.*transcript*.iqtl.txt
- This is the primary output of iQTL. It assumes the regression model Gene Expression ~ B0 + B1*SNPa + B2*SNPb + B3*(SNPa x SNPb). Output is five tab-separated columns: SNPa, SNPb, the transcript name, the B3 coefficient, and the B3 p-value. In the file name, *transcript* is the name of the transcript as specified in transcript.matrix.
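The test counts in run.testnumbers.txt can be combined with the interaction p-values for BH correction. The following is a minimal sketch of the standard Benjamini-Hochberg step-up adjustment, not iQTL's own post-processing. The function names and sample rows are hypothetical. It assumes the output files have no header row and that any tests absent from the iqtl file have larger p-values than those listed.

```python
import csv
import io


def read_test_counts(handle):
    """Parse run.testnumbers.txt: transcript name, number of tests."""
    return {name: int(count) for name, count in csv.reader(handle, delimiter="\t")}


def read_iqtl(handle):
    """Parse an iqtl file: SNPa, SNPb, transcript, B3 coefficient, B3 p-value."""
    return [
        (snp_a, snp_b, transcript, float(beta3), float(pval))
        for snp_a, snp_b, transcript, beta3, pval in csv.reader(handle, delimiter="\t")
    ]


def bh_adjust(pvalues, n_tests=None):
    """Benjamini-Hochberg adjusted p-values, using n_tests as the total test count."""
    m = n_tests if n_tests is not None else len(pvalues)
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    adjusted = [0.0] * len(pvalues)
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up file contents for illustration only.
counts_text = "ILMN_113452\t10\n"
iqtl_text = (
    "rsA\trsB\tILMN_113452\t0.41\t0.001\n"
    "rsA\trsC\tILMN_113452\t-0.12\t0.02\n"
    "rsB\trsC\tILMN_113452\t0.77\t0.0004\n"
)

counts = read_test_counts(io.StringIO(counts_text))
rows = read_iqtl(io.StringIO(iqtl_text))
adjusted = bh_adjust([row[4] for row in rows], n_tests=counts["ILMN_113452"])
for row, q in zip(rows, adjusted):
    print(row[0], row[1], row[2], row[4], round(q, 4))
```

Passing the count from run.testnumbers.txt as n_tests keeps the correction tied to the number of tests actually performed rather than the number of rows in the file.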