MafDb-class {GenomicScores}    R Documentation

MafDb class

Description

Class for annotation packages storing minor allele frequency data.

Usage

## S4 method for signature 'MafDb'
mafByOverlaps(x, ranges, pop="AF", type=c("snvs", "nonsnvs"), maf.only=FALSE, caching=TRUE)
## S4 method for signature 'MafDb'
mafById(x, ids, pop="AF", maf.only=FALSE, caching=TRUE)
## S4 method for signature 'MafDb'
populations(x)

Arguments

x

A MafDb object.

ranges

Either a GRanges object, a GPos object or a character string vector with the format "CHR:START[-END]".

ids

A character string vector with variant identifiers annotated by the MAF data source, typically dbSNP 'rs' identifiers. Note that the identifiers mapped to genomic positions and MAF values in this data source may be only a subset of the most up-to-date dbSNP 'rs' identifier assignments. To access the latter, please use the snpsById() method from the BSgenome package with the desired SNPlocs.* package.

pop

Character string vector with the populations for which we want to retrieve MAF values.

type

Character string setting the type of variant to look up: either 'snvs' (default) for single nucleotide variants, or 'nonsnvs' for any other type of variant.

maf.only

Logical flag. When FALSE (default), MAF values are returned as metadata columns of the input genomic ranges object. When TRUE, only the MAF values are returned, as a DataFrame object.

caching

logical; TRUE (default) stores the MAF data in main memory as it is loaded from disk, which improves performance; FALSE forces the MAF data to be loaded from disk on every call, which lowers memory requirements at the cost of performance.

Details

This class was deprecated in Bioconductor 3.7 in favor of the GScores class, and it will become defunct and unavailable in Bioconductor 3.8.

The MafDb class is derived from the GScores class and supports storing and accessing minor allele frequency (MAF) data from R and Bioconductor. Two annotation packages using the MafDb class are:

MafDb.1Kgenomes.phase1.hs37d5: MAF values from the 1000 Genomes Project Phase 1.
MafDb.1Kgenomes.phase3.hs37d5: MAF values from the 1000 Genomes Project Phase 3.

This object class reduces the disk space required to store MAF values for millions of SNPs by coding each double-precision value, which ranges between 0 and 1, into a single byte of raw object type. To achieve this, the original MAF values are rounded to one significant digit when they are below 0.1 and to two significant digits when they are 0.1 or larger. When a variant has multiple alternate alleles, only the largest MAF value is stored.
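
The rounding and single-byte coding described above can be illustrated with a minimal base R sketch. It is not the package's implementation: quantize_maf() is a hypothetical helper, and the codebook built from the distinct rounded values is an assumption made only to show how rounded frequencies can fit into one raw byte each.

```r
## Illustrative sketch only: quantize_maf() is a hypothetical helper, not a
## GenomicScores function, and the per-vector codebook below is an assumption.

## Round MAF values as the text describes: one significant digit below 0.1,
## two significant digits at or above 0.1.
quantize_maf <- function(maf) {
  stopifnot(is.numeric(maf), all(maf >= 0 & maf <= 1, na.rm = TRUE))
  ifelse(maf < 0.1, signif(maf, 1), signif(maf, 2))
}

maf <- c(0.00123, 0.0456, 0.1234, 0.5, 0.0456)
q <- quantize_maf(maf)

## Build a codebook of the distinct rounded values and store one byte per
## variant: the zero-based position of its value in the codebook.
codebook <- sort(unique(q))
stopifnot(length(codebook) <= 256)
codes <- as.raw(match(q, codebook) - 1L)

## Decoding is a table lookup, so it recovers the rounded values exactly.
decoded <- codebook[as.integer(codes) + 1L]
```

Because rounding leaves few distinct values, one byte (256 codes) per variant can be enough, compared with the eight bytes of a double. The price is the precision lost in the rounding step.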

Author(s)

R. Castelo

Examples


## Not run: 
  ## look up allele frequencies for rs1129038, a SNP associated with blue and brown eye colors
  ## as reported by Eiberg et al. Blue eye color in humans may be caused by a perfectly associated
  ## founder mutation in a regulatory element located within the HERC2 gene inhibiting OCA2 expression.
  ## Human Genetics, 123(2):177-87, 2008 [http://www.ncbi.nlm.nih.gov/pubmed/18172690]

  if (require(MafDb.1Kgenomes.phase1.hs37d5)) {
    mafdb <- MafDb.1Kgenomes.phase1.hs37d5
    mafdb

    ## specialized interface
    populations(mafdb)

    rng <- GRanges("15", IRanges(28356859, 28356859))
    mafByOverlaps(mafdb, rng)
    mafByOverlaps(mafdb, "15:28356859-28356859")
    mafByOverlaps(mafdb, "15:28356859")
    mafById(mafdb, "rs1129038")
  }

## End(Not run)


[Package GenomicScores version 1.4.1 Index]