createDB {RCAS} | R Documentation |
Creates an SQLite database consisting of tables of data obtained from processed BED files.
createDB(dbPath = file.path(getwd(), "rcasDB.sqlite"), projDataFile,
  gtfFilePath = "", update = FALSE, genomeVersion,
  annotationSummary = TRUE, coverageProfiles = TRUE,
  motifAnalysis = TRUE, nodeN = 1)
dbPath | Path to the SQLite database file: either an existing file or the path at which a new file will be created. |
projDataFile | A file containing metadata about the input samples. It must contain at least two columns: 1. sampleName (name of the sample) 2. bedFilePath (full path to the BED file containing the data for the sample). |
gtfFilePath | Path to the GTF file (preferably downloaded from the Ensembl database) that contains genome annotations. |
update | TRUE/FALSE (default: FALSE): whether an existing database should be updated. |
genomeVersion | A character string denoting the genome version for which the analysis is being done. Available options are hg19/hg38 (human), mm9/mm10 (mouse), ce10 (worm) and dm3 (fly). |
annotationSummary | TRUE/FALSE (default: TRUE): whether the annotation summary module should be run. |
coverageProfiles | TRUE/FALSE (default: TRUE): whether the coverage profiles module should be run. |
motifAnalysis | TRUE/FALSE (default: TRUE): whether the motif discovery module should be run. |
nodeN | Number of CPUs to use for parallel processing (default: 1). |
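The two columns required in projDataFile can be checked before createDB is called. The following is a minimal sketch in base R, assuming a tab-separated file with a header row. The helper name checkProjData and the placeholder BED paths are hypothetical and are not part of RCAS.

```r
# Hypothetical helper (not part of RCAS): read a tab-separated project data
# file and stop if a required column is missing.
checkProjData <- function(path) {
  projData <- read.table(path, header = TRUE, sep = "\t",
                         stringsAsFactors = FALSE)
  required <- c("sampleName", "bedFilePath")
  missingCols <- setdiff(required, colnames(projData))
  if (length(missingCols) > 0) {
    stop("Missing required column(s): ", paste(missingCols, collapse = ", "))
  }
  projData
}

# Placeholder sample sheet written to a temporary file (paths are made up)
demoFile <- tempfile(fileext = ".tsv")
demo <- data.frame(sampleName = c("FUS", "FMR1"),
                   bedFilePath = c("/path/to/FUS.bed", "/path/to/FMR1.bed"),
                   stringsAsFactors = FALSE)
write.table(demo, demoFile, sep = "\t", quote = FALSE, row.names = FALSE)

projData <- checkProjData(demoFile)
print(colnames(projData))
```

The check only confirms that the columns exist; it does not verify that the BED files themselves are present or well formed.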
Path to an SQLiteConnection object created by the RSQLite package.
FUS_path <- system.file("extdata", "FUS_Nakaya2013c_hg19.bed",
                        package = 'RCAS')
FMR1_path <- system.file("extdata", "FMR1_Ascano2012a_hg19.bed",
                         package = 'RCAS')

projData <- data.frame('sampleName' = c('FUS', 'FMR1'),
                       'bedFilePath' = c(FUS_path, FMR1_path),
                       stringsAsFactors = FALSE)

write.table(projData, 'myProjDataFile.tsv', sep = '\t', quote = FALSE,
            row.names = FALSE)

gtfFilePath <- system.file("extdata", "hg19.sample.gtf", package = 'RCAS')

createDB(dbPath = 'hg19.RCASDB.sqlite',
         projDataFile = './myProjDataFile.tsv',
         gtfFilePath = gtfFilePath,
         genomeVersion = 'hg19',
         motifAnalysis = FALSE,
         coverageProfiles = FALSE)

# Note: to add new data to an existing database, set update = TRUE
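An SQLite file such as the one written by createDB can be opened with the DBI and RSQLite packages to see which tables it holds. The sketch below assumes both packages are installed. The helper name listTables is hypothetical, and the table names that RCAS actually creates are not stated here, so a stand-in database with a made-up table is used to keep the sketch runnable without RCAS.

```r
# Hypothetical helper (not part of RCAS): open an SQLite file, return the
# names of its tables, and close the connection when the function exits.
listTables <- function(dbPath) {
  con <- DBI::dbConnect(RSQLite::SQLite(), dbPath)
  on.exit(DBI::dbDisconnect(con))
  DBI::dbListTables(con)
}

# Stand-in database with one made-up table; in practice, pass the dbPath
# that was given to createDB (for example 'hg19.RCASDB.sqlite').
demoPath <- tempfile(fileext = ".sqlite")
con <- DBI::dbConnect(RSQLite::SQLite(), demoPath)
DBI::dbWriteTable(con, "demoTable",
                  data.frame(sampleName = c("FUS", "FMR1"),
                             stringsAsFactors = FALSE))
DBI::dbDisconnect(con)

tables <- listTables(demoPath)
print(tables)
```

Closing the connection through on.exit keeps the file from staying locked if the query fails partway.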