GDCprepare {TCGAbiolinks} | R Documentation |
Reads the downloaded data and prepares it as an R object
GDCprepare(query, save = FALSE, save.filename, directory = "GDCdata",
  summarizedExperiment = TRUE, remove.files.prepared = FALSE,
  add.gistic2.mut = NULL, mut.pipeline = "mutect2",
  mutant_variant_classification = c("Frame_Shift_Del", "Frame_Shift_Ins",
    "Missense_Mutation", "Nonsense_Mutation", "Splice_Site", "In_Frame_Del",
    "In_Frame_Ins", "Translation_Start_Site", "Nonstop_Mutation"))
query |
A query object returned by the GDCquery function |
save |
Save result as RData object? |
save.filename |
Name of the file to be saved. If empty, a name will be generated automatically. |
directory |
Directory/Folder where the data was downloaded. Default: GDCdata |
summarizedExperiment |
Create a SummarizedExperiment object? Default: TRUE (if possible) |
remove.files.prepared |
Remove the files after they have been read? Default: FALSE. This argument is considered only if the save argument is set to TRUE. |
add.gistic2.mut |
If a list of genes (gene symbols) is given, two kinds of columns will be added to the sample matrix of the SummarizedExperiment object: columns with GISTIC2 results from GDAC Firehose (hg19), and a column indicating whether each gene is mutated (hg38; TRUE or FALSE, see the MAF file for more information). |
mut.pipeline |
This argument is taken into consideration only if add.gistic2.mut is not NULL. Four separate variant calling pipelines are implemented for GDC data harmonization. Options: muse, varscan2, somaticsniper, mutect2. For more information: https://gdc-docs.nci.nih.gov/Data/Bioinformatics_Pipelines/DNA_Seq_Variant_Calling_Pipeline/ |
mutant_variant_classification |
Variant classifications for which a sample is considered mutant; a sample with no variant in one of these classes is considered non-mutant. Default: "Frame_Shift_Del", "Frame_Shift_Ins", "Missense_Mutation", "Nonsense_Mutation", "Splice_Site", "In_Frame_Del", "In_Frame_Ins", "Translation_Start_Site", "Nonstop_Mutation" |
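As an illustration of how the mutant_variant_classification argument could translate into a TRUE/FALSE column per gene, here is a minimal base R sketch. It is not the package's internal implementation. The toy table, its values, and the helper is_mutant are hypothetical; only the column names Hugo_Symbol, Tumor_Sample_Barcode and Variant_Classification follow the MAF convention.

```r
# Toy MAF-like table with made-up values (assumption, for illustration only).
maf <- data.frame(
  Hugo_Symbol            = c("PTEN", "PTEN", "FOXJ1", "PTEN"),
  Tumor_Sample_Barcode   = c("S1", "S2", "S1", "S3"),
  Variant_Classification = c("Missense_Mutation", "Silent",
                             "Nonsense_Mutation", "Intron"),
  stringsAsFactors = FALSE
)

# Same classes as the documented default of mutant_variant_classification.
mutant_classes <- c("Frame_Shift_Del", "Frame_Shift_Ins", "Missense_Mutation",
                    "Nonsense_Mutation", "Splice_Site", "In_Frame_Del",
                    "In_Frame_Ins", "Translation_Start_Site",
                    "Nonstop_Mutation")

# Hypothetical helper: TRUE for each sample that has at least one variant
# in `gene` whose classification is in `classes`, FALSE otherwise.
is_mutant <- function(maf, gene, samples, classes) {
  hits <- maf$Hugo_Symbol == gene & maf$Variant_Classification %in% classes
  samples %in% maf$Tumor_Sample_Barcode[hits]
}

samples   <- c("S1", "S2", "S3")
pten_mut  <- is_mutant(maf, "PTEN", samples, mutant_classes)
foxj1_mut <- is_mutant(maf, "FOXJ1", samples, mutant_classes)
```

In this toy data, sample S2 carries only a "Silent" PTEN variant and S3 only an "Intron" one, so neither counts as mutant under the default classes.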
A SummarizedExperiment or a data.frame
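Because the result can be either of two object types, downstream code may want to branch on the class. The following is a small sketch under that assumption. The helper describe_prepared and the toy data.frame are hypothetical and not part of TCGAbiolinks.

```r
library(methods)

# Hypothetical helper: report which kind of object was returned.
describe_prepared <- function(x) {
  if (is(x, "SummarizedExperiment")) {
    "SummarizedExperiment"
  } else if (is.data.frame(x)) {
    sprintf("data.frame with %d rows and %d columns", nrow(x), ncol(x))
  } else {
    paste("other:", class(x)[1])
  }
}

# Toy stand-in for a prepared table (made-up content).
toy <- data.frame(Hugo_Symbol = c("PTEN", "FOXJ1"), stringsAsFactors = FALSE)
result <- describe_prepared(toy)
```

With a real result, the same check could decide whether to use SummarizedExperiment accessors or ordinary data.frame indexing.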
query <- GDCquery(project = "TCGA-KIRP",
                  data.category = "Simple Nucleotide Variation",
                  data.type = "Masked Somatic Mutation",
                  workflow.type = "MuSE Variant Aggregation and Masking")
GDCdownload(query, method = "api", directory = "maf")
maf <- GDCprepare(query, directory = "maf")

## Not run: 
query <- GDCquery(project = "TCGA-ACC",
                  data.category = "Copy number variation",
                  legacy = TRUE,
                  file.type = "hg19.seg",
                  barcode = c("TCGA-OR-A5LR-01A-11D-A29H-01",
                              "TCGA-OR-A5LJ-10A-01D-A29K-01"))
# data will be saved in
# GDCdata/TCGA-ACC/legacy/Copy_number_variation/Copy_number_segmentation
GDCdownload(query, method = "api")
acc.cnv <- GDCprepare(query)

query <- GDCquery(project = "TCGA-GBM",
                  legacy = TRUE,
                  data.category = "Gene expression",
                  data.type = "Gene expression quantification",
                  platform = "Illumina HiSeq",
                  file.type = "normalized_results",
                  experimental.strategy = "RNA-Seq")
GDCdownload(query, method = "api")
data <- GDCprepare(query, add.gistic2.mut = c("PTEN", "FOXJ1"))

## End(Not run)