resourcerer2BioC {Resourcerer}          R Documentation

A function that downloads an annotation file from TIGR Resourcerer and then creates a bioC annotation data package

Description

TIGR Resourcerer maintains various annotation files for Affymetrix or cDNA chips. This function allows users to create a bioC annotation data package for the probes contained in the Resourcerer annotation file.

Usage

getProbe2ID(tigrFile, baseMapType = c("gb", "ll", "ug"))

resourcerer2BioC(which, organism = c("human", "mouse", "rat"),
    destDir = file.path(.path.package("Resourcerer"), "temp"),
    pkgName, pkgPath, srcUrls = getSrcUrl("all", organism),
    otherSrc = NULL, baseMapType = c("gb", "ug", "ll"),
    version = "1.1.0", makeXML = TRUE, fromWeb = TRUE,
    baseUrl = "ftp://ftp.tigr.org/pub/data/tgi/Resourcerer",
    check = FALSE,
    author = list(author = "Anonymous", maintainer = "anonymous@email.com"),
    exten = "zip")

checkMapping(pkgName, map2LL, llRda,
    outFile = file.path(.path.package("Resourcerer"), "temp",
                        "checkMapping.out"))

Arguments

which        a character string indicating which Resourcerer annotation file is to be read in
destDir      a character string for the path of a directory where the downloaded file will be stored. If missing, the temp directory of the Resourcerer package is used
baseUrl      a character string for the url of the Resourcerer ftp site where directories containing annotation files for human, rat, mouse ... are stored
tigrFile     a character string for the name of the downloaded TIGR Resourcerer annotation file
srcUrls      a vector of named character strings for the urls where source data files are kept. Valid sources are LocusLink, UniGene, Golden Path, Gene Ontology, and KEGG. The names for the character strings should be LL, UG, GP, GO, and KEGG, respectively. LL and UG are required
baseMapType  a character string that is either "gb", "ug", or "ll" to indicate whether the probe ids in baseName are mapped to GenBank accession numbers, UniGene ids, or LocusLink ids
otherSrc     a vector of named character strings for the names of files that contain mappings between probe ids of baseName and LocusLink ids. These files are used to obtain the unified mappings between probe ids of baseName and LocusLink ids based on all the sources. The strings should not contain any number, and the files must have the same structure as baseName
pkgName      a character string for the name of the data package to be built (e.g. hgu95a, rgu34a)
pkgPath      a character string for the full path of an existing directory where the built package will be stored
organism     a character string for the name of the organism of concern (currently only "human", "mouse", or "rat")
version      a character string for the version number
makeXML      a boolean indicating whether an XML version will also be generated
author       a list of character strings with an author element for the name of the author and a maintainer element for the email address of the maintainer
fromWeb      a boolean indicating whether the source files used to build a data package will be obtained online
map2LL       a character string for the name of a tab-separated file with probe ids in the first column and the matching LL ids in the second column
llRda        a character string for the name of an rda file for an environment with probe ids as keys and the matching LL ids as values
outFile      a character string for the name of a file to store the results of comparisons conducted by function checkMapping
check        a boolean indicating whether function checkMapping will be called to check the mappings between probe and LocusLink ids obtained using AnnBuilder against those provided by Resourcerer
exten        a character string for the extension of the source data file to be processed
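
As a rough illustration of the argument shapes described above, the following sketch builds the named vectors and the list that would be passed as srcUrls, otherSrc, and author. All URLs, file paths, and names here are placeholders made up for the sketch; they are not real Resourcerer or Bioconductor locations.

```r
# Named character vector of source URLs; the LL and UG entries are required.
# The URLs are placeholders invented for this sketch.
srcUrls <- c(LL = "ftp://example.org/locuslink/LL_tmpl.gz",
             UG = "ftp://example.org/unigene/Hs.data.gz",
             GO = "ftp://example.org/go/termdb.xml.gz")

# Optional extra probe-to-LocusLink mapping files (hypothetical paths).
# The documentation asks that these strings contain no numbers.
otherSrc <- c(srcone = "/data/mappings/sourceOne.txt",
              srctwo = "/data/mappings/sourceTwo.txt")

# Author information: a name and a maintainer email address.
author <- list(author = "A. Researcher",
               maintainer = "a.researcher@example.org")

# Check the documented constraints before passing the arguments on.
stopifnot(all(c("LL", "UG") %in% names(srcUrls)))
stopifnot(!any(grepl("[0-9]", otherSrc)))
```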

Details

Function getProbe2ID reads an annotation file downloaded from Resourcerer and extracts a matrix from it. The matrix has probe ids as one column and the ids defined by baseMapType as the other column.
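
The idea behind getProbe2ID can be sketched in base R as follows. This is not the package's implementation: the helper name extractProbe2ID, the column names, and the file layout are assumptions made for illustration, and the real Resourcerer file format may differ.

```r
# Write a tiny tab-delimited stand-in for an annotation file.
# The column names and layout are assumptions for this sketch only.
annoFile <- tempfile(fileext = ".txt")
writeLines(c("probe_id\tgenbank\tunigene\tlocuslink",
             "probe_a\tAA000001\tHs.100\t1001",
             "probe_b\tAA000002\tHs.200\t1002",
             "probe_c\tAA000003\tHs.300\t1003"),
           annoFile)

# Hypothetical helper: keep the probe id column plus the id column
# selected by baseMapType, and return them as a two-column matrix.
extractProbe2ID <- function(file, baseMapType = c("gb", "ll", "ug")) {
  baseMapType <- match.arg(baseMapType)
  idCol <- switch(baseMapType,
                  gb = "genbank",
                  ll = "locuslink",
                  ug = "unigene")
  anno <- read.delim(file, header = TRUE, sep = "\t", quote = "",
                     colClasses = "character")
  as.matrix(anno[, c("probe_id", idCol)])
}

probe2gb <- extractProbe2ID(annoFile, "gb")
probe2ll <- extractProbe2ID(annoFile, "ll")
```

Reading every column as character keeps ids such as LocusLink numbers from being converted to numeric values.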

Function checkMapping compares the mappings of LocusLink ids to probe ids for the probes that are common to both llRda and map2LL. The results of the comparison are written to outFile, which by default is a file named checkMapping.out in the temp directory of the Resourcerer package. If a file by the same name already exists, the results are appended to the end of the file.
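
A minimal sketch of this kind of comparison is given below. It is not the code used by checkMapping: the probe ids and LocusLink ids are made up, the environment is built in memory rather than loaded from an rda file, and the summary line format is an assumption.

```r
# Mapping held in an environment (keys are probe ids, values are
# LocusLink ids), standing in for the content of llRda.
builtMap <- new.env()
assign("probe_a", "1001", envir = builtMap)
assign("probe_b", "2002", envir = builtMap)
assign("probe_d", "1004", envir = builtMap)

# Mapping as a two-column table, standing in for the content of map2LL.
fileMap <- data.frame(probe = c("probe_a", "probe_b", "probe_c"),
                      ll = c("1001", "1002", "1003"),
                      stringsAsFactors = FALSE)

# Compare only the probes that are present in both mappings.
common <- intersect(ls(builtMap), fileMap$probe)
builtIds <- unlist(mget(common, envir = builtMap))
fileIds <- fileMap$ll[match(common, fileMap$probe)]
agree <- builtIds == fileIds

# Append a summary line so that earlier results in the file are kept.
outFile <- file.path(tempdir(), "checkMapping.out")
cat(sprintf("common probes: %d, matching: %d, mismatching: %d\n",
            length(common), sum(agree), sum(!agree)),
    file = outFile, append = TRUE)
```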

baseUrl is the root directory of the TIGR ftp site for Resourcerer, which contains subdirectories holding data for different organisms.

Value

Function getProbe2ID returns a matrix with probe ids and the selected public ids as two columns.
Function resourcerer2BioC returns invisible() if successfully executed.

Note

This function is part of the Bioconductor project at Dana-Farber Cancer Institute to provide bioinformatics functionality through R.

Author(s)

Jianhua Zhang

References

http://pga.tigr.org/tigr-scripts/magic/r1.pl

See Also

getResourcerer

Examples

  #############################################################
  ## The example takes a loooong time (about an hour) to run ##
  #############################################################
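  if(interactive()){
    # Download the annotation file and build the data package
    resourcerer2BioC("Agilent_Human1_cDNA.zip")
    # Remove the package directory created in the temp directory
    unlink(file.path(.path.package("Resourcerer"), "temp",
           "AgilentHuman1cDNA"), recursive = TRUE)
  }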
