Association Studies

Genome-Wide Association Studies (GWAS) with PLINK

Version: v0.0.1

Use Cases

Determine the association between a trait and genotypes. Here, an association is a statistically significant difference in genotype frequencies between affected individuals (those with the trait) and unaffected individuals (those without the trait).
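To make this definition concrete, the sketch below computes a two-sided Fisher's exact test p-value for a 2x2 table of allele counts in affected versus unaffected individuals. The function name and the counts are hypothetical and chosen only for illustration; this is a plain-Python sketch, not PLINK's own implementation.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: affected / unaffected. Columns: minor allele / major allele.
    (Hypothetical helper for illustration only.)
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of seeing x minor alleles in row 1
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is at most as probable as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

# Illustrative counts: 3 minor / 1 major alleles in affected,
# 1 minor / 3 major alleles in unaffected.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```

A small p-value would indicate that the allele distribution differs between the two groups more than expected by chance; with counts this small the test has very little power.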


This workflow uses PLINK [1, 2] to determine associations between traits and genotypes. It first removes positions and samples with too much missing data (> 95%), removes positions whose minor allele frequency (MAF) falls below the threshold (0.05), and filters out all variants whose Hardy-Weinberg equilibrium (HWE) exact test p-value is below the provided threshold (default 0.001). Population stratification is then assessed by complete-linkage clustering based on identity-by-state (IBS) calculations, and positions exhibiting excessive linkage disequilibrium are removed. Finally, association is calculated using Fisher's exact test.
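The steps above could be expressed as a sequence of PLINK 1.9 command lines, sketched here as a small Python helper that only builds the argument lists. The exact commands the workflow runs are not shown in this document, so the helper name, the output prefixes, the missing-data rate, and the linkage disequilibrium window settings are all assumptions; only the MAF (0.05) and HWE (0.001) values come from the description.

```python
def plink_commands(prefix, out, missing_rate, maf=0.05, hwe=0.001,
                   ld_window=50, ld_step=5, ld_r2=0.5):
    """Build PLINK 1.9 argument lists for the described steps.

    Hypothetical helper: the workflow's real invocation may differ.
    `prefix` names the input pair prefix.ped / prefix.map.
    """
    filt = f"{out}_filt"      # assumed name for the filtered dataset
    pruned = f"{out}_pruned"  # assumed name for the LD-pruning output
    return [
        # 1. Missing-data, MAF, and HWE filters; write a binary fileset.
        ["plink", "--file", prefix,
         "--geno", str(missing_rate), "--mind", str(missing_rate),
         "--maf", str(maf), "--hwe", str(hwe),
         "--make-bed", "--out", filt],
        # 2. IBS-based clustering (complete linkage is PLINK's default).
        ["plink", "--bfile", filt, "--cluster", "--out", filt],
        # 3. Identify variants in excessive linkage disequilibrium.
        ["plink", "--bfile", filt,
         "--indep-pairwise", str(ld_window), str(ld_step), str(ld_r2),
         "--out", pruned],
        # 4. Fisher's exact test on the variants kept by pruning.
        ["plink", "--bfile", filt, "--extract", f"{pruned}.prune.in",
         "--assoc", "fisher", "--out", out],
    ]

# Illustrative call; 0.05 as the missing-data rate is an assumed value.
for cmd in plink_commands("study", "AnalysisName", missing_rate=0.05):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where PLINK 1.9 is installed. Building the lists separately from running them keeps the thresholds easy to review before anything is executed.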


  • PED file: the original standard text format for sample pedigree information and genotype calls.
  • MAP file: the variant information file. A text file with no header line, and one line per variant with the following 3-4 fields:
    1. Chromosome code. PLINK 1.9 also permits contig names here, but most older programs do not.
    2. Variant identifier
    3. Position in morgans or centimorgans (optional; also safe to use dummy value of '0')
    4. Base-pair coordinate
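The variant information (MAP) layout described above can be read with a few lines of Python. This is a minimal sketch: the record type, the function name, and the sample variant identifiers are hypothetical, and it assumes whitespace-delimited fields with the genetic position defaulting to 0 when the optional third field is absent.

```python
from typing import NamedTuple

class MapRecord(NamedTuple):
    chrom: str          # chromosome code or contig name
    variant_id: str     # variant identifier
    genetic_pos: float  # morgans or centimorgans; 0 when not given
    bp: int             # base-pair coordinate

def parse_map_line(line: str) -> MapRecord:
    """Parse one MAP line with either 3 or 4 whitespace-separated fields."""
    fields = line.split()
    if len(fields) == 4:
        chrom, variant_id, genetic_pos, bp = fields
    elif len(fields) == 3:
        # Optional genetic position omitted: use the dummy value 0.
        chrom, variant_id, bp = fields
        genetic_pos = "0"
    else:
        raise ValueError(f"expected 3 or 4 fields, got {len(fields)}: {line!r}")
    return MapRecord(chrom, variant_id, float(genetic_pos), int(bp))

# Illustrative lines with made-up variant identifiers.
print(parse_map_line("1 snp1 0 1000"))
print(parse_map_line("2\tsnp2\t2500"))
```

Because the file has no header line, every non-empty line can be passed to the parser directly.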


  • Association Data: AnalysisName.prune
  • Filtered Data: AnalysisName.filt
  • Hardy Weinberg Equilibrium: AnalysisName.hwe

Workflow Walkthrough

  1. Navigate to the GWAS with PLINK launcher card. You can also find this workflow using the search bar in the top right corner, or by using the “Association Studies” filter on the left-hand side.
  2. Select the version from the dropdown menu in the top right corner, and click “Run Workflow” when ready.
  3. Launcher Tabs
    1. Select a MAP file for variant information, and a PED file for pedigree information and genotype calls.
    2. Tune workflow parameters, including the minor allele frequency (MAF) and Hardy-Weinberg equilibrium (HWE) thresholds.
    3. Name the workflow run and review workflow inputs and parameters. When ready to submit, click “Run Workflow” in the bottom left corner.


  1. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ & Sham PC (2007) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81.
  2. PLINK 2.0, Shaun Purcell, http://pngu.mgh.harvard.edu/purcell/plink/
