BACKGROUND: CHROMOSOME Y STRUCTURE

Genes on ChrY

SEQUENCE DATA SOURCE

1000 Genomes Project

 

1000 Genomes Pilot Project release 2010_7 - vcf file (genotype) (hg18)
File shows 2788 SNPs, of which 2095 (75.1%) have rs#

Paper describes 2870 ChrY SNPs in 79 males, of which 74% are novel (2 males sequenced at 15.2x depth, the rest at 1.8x)

DATA ANALYSIS

1. Find ethnicity of male samples

Done in Excel. Imported into Galaxy as ChrY SNP low_cov 2010_06b.txt

1.1 Parse vcf format to plain genotype format in Excel

#CHROM	POS	ID	        REF	ALT	QUAL	FILTER	INFO	                        FORMAT	        NA06986	NA06994	NA07051	...
Y	2715180	rs11575897	G	A	39	.	AC=5;AN=58;DB;DP=168;NS=61	GT:GQ:DP	0:60:5	0:19:1	0:51:3	...
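
The same conversion could be scripted rather than done in Excel. A minimal Python sketch, assuming tab-separated columns as in the example above and a haploid GT field (0 = REF, 1 = first ALT); the function name vcf_line_to_genotypes is hypothetical and is not part of the original workflow:

```python
def vcf_line_to_genotypes(header_line, data_line):
    """Return (POS, ID, {sample: base}) for one VCF record.

    Assumes haploid calls, so GT is a single allele index such as "0" or "1".
    Anything else (e.g. "." for a missing call) is reported as "N".
    """
    header = [c.strip() for c in header_line.lstrip("#").rstrip("\n").split("\t")]
    fields = [c.strip() for c in data_line.rstrip("\n").split("\t")]
    record = dict(zip(header, fields))

    # Allele index 0 is REF; indices 1.. are the comma-separated ALT alleles.
    alleles = [record["REF"]] + record["ALT"].split(",")
    gt_index = record["FORMAT"].split(":").index("GT")

    genotypes = {}
    for sample in header[9:]:  # sample columns follow the 9 fixed VCF columns
        gt = record[sample].split(":")[gt_index]
        if gt.isdigit() and int(gt) < len(alleles):
            genotypes[sample] = alleles[int(gt)]
        else:
            genotypes[sample] = "N"
    return record["POS"], record["ID"], genotypes


header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA06986\tNA06994\tNA07051"
row = "Y\t2715180\trs11575897\tG\tA\t39\t.\tAC=5;AN=58;DB;DP=168;NS=61\tGT:GQ:DP\t0:60:5\t0:19:1\t0:51:3"
pos, snp_id, calls = vcf_line_to_genotypes(header, row)
```

For the example record, all three samples carry allele index 0, so each is reported as the reference base G.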

1.2 Obtain Sample ID and Ethnicity from 1000 Genomes Site

 

2. Compare SNPs with dbSNP 129, 130

2.1 Compare against dbSNP 129, 130 in hg18

2.2 Compare against dbSNP131 and dbSNP132

2.2.1 Convert 1kG data into hg19

2.2.2 Compare 1kG against dbSNP131 in hg19

2.2.3 Compare 1kG against dbSNP132, which has to be downloaded directly from NCBI (not yet available from the UCSC Table Browser). Get file: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/reference/dbsnp132_20101103.vcf.gz. Select ChrY rows. Convert from vcf format.
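
Selecting the ChrY rows from the downloaded file could be scripted as follows. This is a sketch only: the function name select_chry_rows is hypothetical, and accepting both "Y" and "chrY" as chromosome labels is an assumption about how the file names the chromosome.

```python
import gzip


def select_chry_rows(vcf_gz_path, chrom_names=("Y", "chrY")):
    """Yield (position, id) for every ChrY record in a gzipped VCF file."""
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            # Only the first three columns (CHROM, POS, ID) are needed here.
            chrom, pos, snp_id = line.split("\t", 3)[:3]
            if chrom in chrom_names:
                yield int(pos), snp_id
```

Writing a generator keeps memory use low, which helps because the whole-genome dbSNP file is large and only a small fraction of its rows are on ChrY.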

dbSNP   v129         v130         v131         v132
Known   704 (25%)    934 (33%)    1012 (36%)   1187 (43%)
Novel   2095 (75%)   1870 (67%)   1801 (64%)   1603 (57%)
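
Known/novel counts of this kind can be derived by set comparison. A minimal sketch, assuming SNPs are matched by position on the same assembly; the notes do not say how the matching was actually done, and the function name known_vs_novel is hypothetical:

```python
def known_vs_novel(kg_positions, dbsnp_positions):
    """Split 1kG SNP positions into known (present in dbSNP) and novel.

    Returns {"known": (count, percent), "novel": (count, percent)},
    with percentages rounded to whole numbers.
    """
    kg = set(kg_positions)
    known = kg & set(dbsnp_positions)
    novel = kg - known
    total = len(kg)

    def percent(n):
        return round(100.0 * n / total) if total else 0

    return {
        "known": (len(known), percent(len(known))),
        "novel": (len(novel), percent(len(novel))),
    }
```

Both position lists must be on the same assembly (hg18 for dbSNP 129 and 130, hg19 for dbSNP 131 and 132) for a position match to be meaningful.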

All analyses from here on use hg19 coordinates.

3. Find Genes which have SNPs

Using RefSeq genes (although one could use UCSC, Ensembl and other gene lists)

This is only a cursory analysis. The exact location and effect of each SNP still need to be assessed.
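
A rough way to list the genes that contain SNPs is an interval overlap. The sketch below assumes gene intervals are 0-based half-open (txStart, txEnd) as in UCSC table downloads and SNP positions are 1-based as in VCF; the function name genes_with_snps and the gene names in the example are placeholders, not results.

```python
from bisect import bisect_left, bisect_right


def genes_with_snps(genes, snp_positions):
    """Count distinct SNP positions falling inside each gene.

    genes: iterable of (name, tx_start, tx_end), 0-based half-open.
    snp_positions: iterable of 1-based positions on the same chromosome.
    Returns {gene_name: snp_count} for genes with at least one SNP.
    """
    positions = sorted(set(snp_positions))
    hits = {}
    for name, tx_start, tx_end in genes:
        # [tx_start, tx_end) in 0-based terms covers 1-based tx_start+1 .. tx_end
        lo = bisect_left(positions, tx_start + 1)
        hi = bisect_right(positions, tx_end)
        if hi > lo:
            # A set per gene avoids double counting across multiple transcripts.
            hits.setdefault(name, set()).update(positions[lo:hi])
    return {name: len(found) for name, found in hits.items()}


example = genes_with_snps(
    [("GENE_A", 100, 200), ("GENE_B", 300, 400), ("GENE_A", 150, 250)],
    [100, 101, 200, 201, 500],
)
```

This only reports that a SNP lies within the transcribed region; it does not distinguish exonic, intronic or UTR positions.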

4. Find Function of All Genes / Genes with SNPs on ChrY

Using GSEA

Gene Set name [# Genes (K)] Description # Genes in Overlap (k) p value
RUNNE_GENDER_EFFECT_UP [10] Up-regulated genes detecting gender effects in global expression profiling studies. 9 0 e
PYEON_CANCER_HEAD_AND_NECK_VS_CERVICAL_DN [29] Down-regulated genes in head and neck cancer compared to cervical carcinoma samples. 6 1.91 e-9
SEXUAL_REPRODUCTION [137] Genes annotated by the GO term GO:0019953. The regular alternation, in the life cycle of haplontic, diplontic and diplohaplontic organisms, of meiosis and fertilization which provides for the production of offspring. In diplontic organisms there is a life cycle in which the products of meiosis behave directly as gametes, fusing to form a zygote from which the diploid, or sexually reproductive polyploid, adult organism will develop. In diplohaplontic organisms a haploid phase (gametophyte) exists in the life cycle between meiosis and fertilization (e.g. higher plants, many algae and Fungi); the products of meiosis are spores that develop as haploid individuals from which haploid gametes develop to form a diploid zygote; diplohaplontic organisms show an alternation of haploid and diploid generations. In haplontic organisms meiosis occurs in the zygote, giving rise to four haploid cells (e.g. many algae and protozoa), only the zygote is diploid and this may form a resistant spore, tiding organisms over hard times. 8 1.05 e-7
ACEVEDO_LIVER_CANCER_WITH_H3K9ME3_UP [141] Genes whose promoters display higher histone H3 trimethylation mark at K9 (H3K9me3) in hepatocellular carcinoma (HCC) compared to normal liver. 7 2.03 e-6
GAMETE_GENERATION [113] Genes annotated by the GO term GO:0007276. The generation and maintenance of gametes. A gamete is a haploid reproductive cell. 6 7.69 e-6
REPRODUCTION [261] Genes annotated by the GO term GO:0000003. The production by an organism of new individuals that contain some portion of their genetic material inherited from that organism. 8 1.32 e-5
TAVAZOIE_METASTASIS [113] Putative metastasis genes: up-regulated in metastatic cell lines LM2 (lung) and BoM2 (bone) relative to the parental MDA-MB-231 line (breast adenocarcinoma). 5 1.09 e-4

Gene Set name [# Genes (K)] Description # Genes in Overlap (k) p value
RUNNE_GENDER_EFFECT_UP [10] Up-regulated genes detecting gender effects in global expression profiling studies. 8 0 e
PYEON_CANCER_HEAD_AND_NECK_VS_CERVICAL_DN [29] Down-regulated genes in head and neck cancer compared to cervical carcinoma samples. 5 1.46 e-10
RICKMAN_HEAD_AND_NECK_CANCER_B [54] Cluster b: genes identifying an intrinsic group in head and neck squamous cell carcinoma (HNSCC). 3 3.1 e-5
CTCAAGA,MIR-526B [64] Targets of MicroRNA CTCAAGA,MIR-526B 3 5.17 e-5
TAVAZOIE_METASTASIS [113] Putative metastasis genes: up-regulated in metastatic cell lines LM2 (lung) and BoM2 (bone) relative to the parental MDA-MB-231 line (breast adenocarcinoma). 3 2.8 e-4
BOYLAN_MULTIPLE_MYELOMA_D_DN [86] Genes down-regulated in group D of tumors arising from overexpression of BCL2L1 and MYC [Gene ID=598, 4609] in plasma cells. 2 4.27 e-3
SENGUPTA_NASOPHARYNGEAL_CARCINOMA_WITH_LMP1_UP [399] Genes up-regulated in nasopharyngeal carcinoma (NPC) positive for LMP1 [Gene ID=9260], a latent gene of Epstein-Barr virus (EBV). 3 1.01 e-2
SEXUAL_REPRODUCTION [137] Genes annotated by the GO term GO:0019953. The regular alternation, in the life cycle of haplontic, diplontic and diplohaplontic organisms, of meiosis and fertilization which provides for the production of offspring. In diplontic organisms there is a life cycle in which the products of meiosis behave directly as gametes, fusing to form a zygote from which the diploid, or sexually reproductive polyploid, adult organism will develop. In diplohaplontic organisms a haploid phase (gametophyte) exists in the life cycle between meiosis and fertilization (e.g. higher plants, many algae and Fungi); the products of meiosis are spores that develop as haploid individuals from which haploid gametes develop to form a diploid zygote; diplohaplontic organisms show an alternation of haploid and diploid generations. In haplontic organisms meiosis occurs in the zygote, giving rise to four haploid cells (e.g. many algae and protozoa), only the zygote is diploid and this may form a resistant spore, tiding organisms over hard times. 2 1.05 e-2
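
The notes do not state how the overlap p values were computed. Gene set overlap p values of this kind are commonly obtained from a hypergeometric test, and the sketch below shows that calculation under that assumption. K and k are the table columns; N (number of genes in the background universe) and n (number of genes in the query list) are not given in these notes and must be supplied by the user. The function name overlap_p_value is hypothetical.

```python
from math import comb


def overlap_p_value(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the background universe (assumed, not given in the notes)
    K: genes in the gene set
    n: genes in the query list
    k: genes in the overlap
    """
    upper = min(K, n)
    # Sum the probability of every overlap size from k up to the maximum possible.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

A smaller p value means the observed overlap is less likely to arise by drawing n genes at random from the universe. The result depends strongly on the choice of N, so values computed this way will only match the table if the same universe is used.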

As expected, most genes on ChrY are involved in sex determination and sexual reproduction. There does not seem to be any notable functional difference between the SNP-containing genes and the other ChrY genes.