snpFreq

significant SNPs in case-control data (Galaxy Version 1.0.1)

Tool Parameters

Format of input
Please provide a value for this option.
* required
Parameter 'group1_1': an invalid option (None) was selected, please verify
* required
Parameter 'group1_2': an invalid option (None) was selected, please verify
* required
Parameter 'group1_3': an invalid option (None) was selected, please verify
* required
Parameter 'group2_1': an invalid option (None) was selected, please verify
* required
Parameter 'group2_2': an invalid option (None) was selected, please verify
* required
Parameter 'group2_3': an invalid option (None) was selected, please verify
* required

Additional Options

Send an email notification when the job completes.

Help

Dataset formats

The input is tabular, with six columns of allele counts. The output is also tabular, and includes all of the input data plus the additional columns described below. (Dataset missing?)


What it does

This tool performs a basic analysis of bi-allelic SNPs in case-control data, using the R statistical environment and Fisher's exact test to identify SNPs with a significant difference in the allele frequencies between the two groups. R's "qvalue" package is used to correct for multiple testing.

The input file includes counts for each allele combination (AA aa Aa) for each group at each SNP position. The assignment of codes (1 2 3) to these genotypes is arbitrary, as long as it is consistent for both groups. Any other input columns are ignored in the computation, but are copied to the output. The output appends eight additional columns, namely the minimum expected counts of the three genotypes for each group, the p-value, and the q-value.


Example

  • input file:

    chr1  210  211  38  4  15  56  0   1   x
    chr1  228  229  55  0  2   56  0   1   x
    chr1  230  231  46  0  11  55  0   2   x
    chr1  234  235  43  0  14  55  0   2   x
    chr1  236  237  55  0  2   13  10  34  x
    chr1  437  438  55  0  2   46  0   11  x
    chr1  439  440  56  0  1   55  0   2   x
    chr1  449  450  56  0  1   13  20  24  x
    chr1  518  519  56  0  1   38  4   15  x
    

Here the group 1 genotype counts are in columns 4 - 6, while those for group 2 are in columns 7 - 9.

Note that the "x" column has no meaning. It was added to this example to show that extra columns can be included, and to make it easier to see where the new columns are appended in the output.

  • output file:

    chr1  210  211  38  4  15  56  0   1   x  47    2   8     47    2   8     1.50219088598917e-05  6.32501425679652e-06
    chr1  228  229  55  0  2   56  0   1   x  55.5  0   1.5   55.5  0   1.5   1                     0.210526315789474
    chr1  230  231  46  0  11  55  0   2   x  50.5  0   6.5   50.5  0   6.5   0.0155644201009862    0.00409590002657532
    chr1  234  235  43  0  14  55  0   2   x  49    0   8     49    0   8     0.00210854461554067   0.000739840215979182
    chr1  236  237  55  0  2   13  10  34  x  34    5   18    34    5   18    6.14613878554783e-17  4.31307984950725e-17
    chr1  437  438  55  0  2   46  0   11  x  50.5  0   6.5   50.5  0   6.5   0.0155644201009862    0.00409590002657532
    chr1  439  440  56  0  1   55  0   2   x  55.5  0   1.5   55.5  0   1.5   1                     0.210526315789474
    chr1  449  450  56  0  1   13  20  24  x  34.5  10  12.5  34.5  10  12.5  2.25757007974134e-18  2.37638955762246e-18
    chr1  518  519  56  0  1   38  4   15  x  47    2   8     47    2   8     1.50219088598917e-05  6.32501425679652e-06
    

Help Forum

There are no questions on the Help Forum about this tool.

Ask a new question

Unnamed history

Draggable