plink --vcf chr19-clean.vcf.gz --genome gz --out chr19-clean --extract chr19-clean.prune.in

The --genome option tells plink to compute genome-wide IBD estimates. As the resulting file is rather large, we add the gz modifier so that plink stores it gzipped. We've used the --extract option to tell plink to use only the LD-pruned set of SNPs we computed above.
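Once the IBD estimates are written, a common next step is to list the pairs of samples that look related. The following is a minimal sketch in Python, assuming the usual whitespace-delimited .genome layout with a header row containing IID1, IID2 and PI_HAT. The function names, the 0.2 cutoff and the three sample rows are hypothetical and only illustrate the idea.

```python
import gzip
import io


def read_genome(handle):
    """Yield one dict per sample pair from a whitespace-delimited .genome stream."""
    header = handle.readline().split()
    for line in handle:
        fields = line.split()
        if fields:  # skip blank lines
            yield dict(zip(header, fields))


def related_pairs(handle, min_pi_hat=0.2):
    """Return (IID1, IID2, PI_HAT) for pairs at or above the cutoff.

    The default cutoff of 0.2 is an illustrative assumption, not a recommendation.
    """
    pairs = []
    for row in read_genome(handle):
        pi_hat = float(row["PI_HAT"])
        if pi_hat >= min_pi_hat:
            pairs.append((row["IID1"], row["IID2"], pi_hat))
    return pairs


# Hypothetical three-pair excerpt standing in for the real output file.
SAMPLE = """\
 FID1 IID1 FID2 IID2 RT EZ Z0 Z1 Z2 PI_HAT PHE DST PPC RATIO
 f1 s1 f2 s2 UN NA 1.0000 0.0000 0.0000 0.0000 -1 0.75 0.5 2.0
 f1 s1 f3 s3 UN NA 0.0000 1.0000 0.0000 0.5000 -1 0.85 1.0 NA
 f2 s2 f3 s3 UN NA 0.8000 0.2000 0.0000 0.1000 -1 0.76 0.6 2.1
"""

if __name__ == "__main__":
    print(related_pairs(io.StringIO(SAMPLE)))
    # For the file produced above, one would instead open it in text mode:
    # with gzip.open("chr19-clean.genome.gz", "rt") as fh:
    #     print(related_pairs(fh))
```

Reading the header row rather than hard-coding column positions keeps the sketch working if the column order differs from what is assumed here.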

This is basically a matrix whose first five columns carry the same fields as a VCF, i.e. chromosome, ID, position, reference allele, alternative allele. After this, each entry is a phased allele for an individual, where 0 is the reference allele and 1 is the alternative.
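To make the layout concrete, here is a small Python sketch that splits one row of such a matrix into its five leading fields and its phased alleles. It assumes each individual occupies two consecutive allele columns (one per haplotype), which the description above does not state explicitly. The function names and the example row, including the variant ID rs123, are hypothetical.

```python
def parse_haps_line(line):
    """Split one matrix row into variant fields and per-individual haplotype pairs."""
    fields = line.split()
    chrom, snp_id, pos, ref, alt = fields[:5]
    alleles = [int(a) for a in fields[5:]]
    if len(alleles) % 2:
        # Assumption: two allele columns per individual.
        raise ValueError("expected two haplotype columns per individual")
    # Pair consecutive columns: (first haplotype, second haplotype).
    haplotypes = list(zip(alleles[0::2], alleles[1::2]))
    return {
        "chrom": chrom,
        "id": snp_id,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "haplotypes": haplotypes,
    }


def to_phased_gt(hap_pair):
    """Render a haplotype pair as a VCF-style phased genotype string."""
    first, second = hap_pair
    return f"{first}|{second}"


if __name__ == "__main__":
    # Hypothetical row: one variant, three individuals.
    row = parse_haps_line("19 rs123 10000 A G 0 1 1 1 0 0")
    print(row["haplotypes"])
    print([to_phased_gt(pair) for pair in row["haplotypes"]])
```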

Errors and warnings. When PLINK detects that something is nonstandard and/or wrong, it will usually display and log a message to that effect. In order of increasing severity, there are three classes of such messages: 'Note', 'Warning', and 'Error'.
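When running many jobs it can help to scan the log files for these messages automatically. The sketch below is written in Python and assumes each message starts its line with the class name followed by a colon (for example "Warning: ..."). The function names and the sample log lines are hypothetical.

```python
import re

# Message classes in order of increasing severity, as described above.
SEVERITY = ("Note", "Warning", "Error")

# Assumption: each message begins its line with "<Class>: ".
PATTERN = re.compile(r"^(Note|Warning|Error):\s*(.*)$")


def classify(log_lines):
    """Group message texts by class; lines without a class prefix are ignored."""
    found = {level: [] for level in SEVERITY}
    for line in log_lines:
        match = PATTERN.match(line.strip())
        if match:
            found[match.group(1)].append(match.group(2))
    return found


def worst(found):
    """Return the most severe class that occurred, or None if there were none."""
    for level in reversed(SEVERITY):
        if found[level]:
            return level
    return None


if __name__ == "__main__":
    # Hypothetical log excerpt for illustration only.
    sample_log = [
        "Options in effect:",
        "Note: example informational message.",
        "Warning: example warning message.",
        "Some unrelated progress line.",
    ]
    messages = classify(sample_log)
    print(messages)
    print(worst(messages))
```

Returning the most severe class makes it easy to decide, per log file, whether a run needs manual attention.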

A toy dataset in VCF. To introduce the four approaches mentioned above concretely, here we consider a very simple toy project that consists of data on 8 unique individuals and 4 variants from 2 VCF files, called ex1.vcf

Import Sorted VCF Files. This function imports 1000 Genomes .vcf file data into multiple spreadsheets. Special handling is provided for genotype data. The user can choose to import one VCF file or several VCF files simultaneously.

BCF1. The BCF1 format output by versions of samtools <= 0.1.19 is not compatible with this version of bcftools. To read BCF1 files one can use the view command from old versions of bcftools packaged with samtools versions <= 0.1.19 to convert to VCF, which can then be read by this version of bcftools.

