| Title: | Enable Cross-Study Analysis of 'CDISC' 'SEND' Datasets | 
| Version: | 1.0.0 | 
| Description: | A system enables cross study Analysis by extracting and filtering study data for control animals from 'CDISC' 'SEND' Study Repository. These data types are supported: Body Weights, Laboratory test results and Microscopic findings. These database types are supported: 'SQLite' and 'Oracle'. | 
| License: | MIT + file LICENSE | 
| URL: | https://github.com/phuse-org/sendigR | 
| BugReports: | https://github.com/phuse-org/sendigR/issues | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.1.2 | 
| Imports: | RSQLite, data.table, readxl, magrittr, xfun, stringr, DescTools, parsedate, shiny, shinydashboard, htmltools, DT, dplyr, ggplot2, Hmisc, haven, plotly, cicerone, reticulate, sjlabelled | 
| Suggests: | knitr, rmarkdown, logr, shinycssloaders, testthat | 
| VignetteBuilder: | knitr | 
| Config/testthat/edition: | 3 | 
| Depends: | R (≥ 4.1.0) | 
| SystemRequirements: | Python(>=3.9.6) | 
| NeedsCompilation: | no | 
| Packaged: | 2022-08-17 02:48:35 UTC; wangw106 | 
| Author: | Bo Larsen [aut], Yousuf Ali [aut], Kevin Snyder [aut], William Houser [aut], Brianna Paisley [aut], Cmsabbir Ahmed [aut], Susan Butler [aut], Michael Rosentreter [aut], Michael Denieu [aut], Wenxian Wang [cre, aut], BioCelerate [cph] | 
| Maintainer: | Wenxian Wang <wenxian.wang@bms.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2022-08-18 08:50:02 UTC | 
Create indexes in SEND database
Description
Create a set of indexes on the tables in an SQLite SEND database to optimize performance of extraction of data from the different functions in the package.
Usage
dbCreateIndexes(dbToken, replaceExisting = FALSE)
Arguments
| dbToken | Mandatory | 
| replaceExisting | Mandatory, character | 
Details
All the indexes are named <domain name>_sendigr_<nn> - .e.g.
TS_sendigr_01.
If any additional indexes are manually created in the database, avoid to
include 'sendigr' in the name, because all existing indexes with that
included in the name will be initially deleted when execution the function
with replaceExisting = TRUE.
It's recommended to wait with the creation of the indexes until the major
amount of studies to be loaded in to the database are loaded.
The database must be an SQLite database - no other types of databases are supported by this function.
Value
No return value, called for side effects
Examples
## Not run: 
createAllIndexes(myDbToken)
## End(Not run)
Create a SEND schema in an open and empty database
Description
Create all the domains and variables which are described in the SEND IG versions 3.0 and 3.1 in the database - i.e. a union of domains from the SEND IG versions and in each domain a union of variables from the SEND IG versions.
Usage
dbCreateSchema(dbToken)
Arguments
| dbToken | Mandatory | 
Details
The database must be an SQLite database - no other types of databases are supported by this function.
Value
No return value, called for side effects
Examples
## Not run: 
# Create an empty SQLite database and create the SEND schema
myDbToken <- initEnvironment(dbType = 'sqlite',
                             dbPath ='/mydatapath/db/send.db',
                             dbCreate = TRUE)
dbCreateSchema(myDbToken)
## End(Not run)
Delete one or more studies in SEND database
Description
Deletes data from all domains for one or more studies in an SQLite based SEND database
Usage
dbDeleteStudies(dbToken, studyIdList)
Arguments
| dbToken | Mandatory | 
| studyIdList | Mandatory, character | 
Details
The database must be a SQLite database - no other types of databases are supported by this function.
Value
No return value, called for side effects
Examples
## Not run: 
# delete one study
dbDeleteStudies(myDbToken, '122312')
# delete multiple studies
dbDeleteStudies(myDbToken, list('122312', '552343', '0942347'))
## End(Not run)
Import SEND study data in SAS xport format into a SEND database from a single study folder
Description
Check each of the SAS xpt file located in the specified folder - import content from file and load it into the corresponding SEND domain table in the open database.
Usage
dbImportOneStudy(dbToken, xptPath, overWrite = FALSE, checkRequiredVars = TRUE)
Arguments
| dbToken | Mandatory | 
| xptPath | Mandatory, character | 
| overWrite | Mandatory, boolean | 
| checkRequiredVars | Mandatory, boolean | 
Details
These requirements to the content of the folder must be fulfilled:
- The folder must contain some SAS xport files named - [send domain].xpt- the case of the file names doesn't care
- A minimum set of required domain files must be included: - ts.xpt,- tx.xpt,- dm.xpt.
- Each xpt file must contain one data table with same name as the file name - i.e. a send domain name. 
- Each xpt file must contain a non-empty STUDYID value in each row equal to the value of TS.STUDYID. 
- Each xpt file must contain a set of required column(s). 
 In general it's (where relevant for the different kinds of domains):
 - STUDYID, DOMAIN, --SEQ, USUBJID, --TESTCD, --TEST,--ORRES, --ORRESU, --STRESC, --STRESN, --STRESU
- The DOMAIN variable must contain the name of the actual domain in all rows 
The last two requirements are checked for the required domains in all cases.
For other domains, these two requirements are only checked if parameter
checkRequiredVars = TRUE.
If an error is detected, the import and load of data is canceled, and further
execution is aborted (i.e. error message is written to the console).
These error situations are checked and reported:
- Any of the requirements 1 to 3 are not fulfilled or any of the following requirements are not fulfilled for one of the required domains 
- A study with the same value if STUDYID exists in the database and parameter - overWrite = FALSE.
If one of the requirements 4 to 6 are not fulfilled for a not-required
domain, this domain is excluded from the import. These kinds of issues are
reported as one warning message to the console when data has been loaded.
Some non-critical issues, which doesn't prohibit data to be loaded to the
database may be detected. These are reported as one warning message to the
console when data has been loaded (together with eventual warning messages
for skipped domains).
These non-critical issues are checked and reported:
- The study folder contains one or more xpt file(s) with names(s) not matching SEND domain name(s). 
 Such files are ignored by the import/load process.
- An imported data tables contains one or more column(s) which do(es)n't exist(s) in the corresponding domain. 
The database must be an SQLite database - no other types of databases are supported by this function.
Value
No return value, called for side effects
Examples
## Not run: 
# Do not overwrite if study already exists in the database
dbImportOneStudy(myDbToken,'/mydatapath/studies/1213443')
# Allow to overwrite data if study already exists in the database
dbImportOneStudy(myDbToken,'/mydatapath/studies/786756', overwrite = TRUE)
## End(Not run)
Import SEND study data in SAS xport format into a SEND database from a hierarchy study folders.
Description
For each non-empty folder below the specified root folder, the actions to import a set of SAS xpt files into the opened SQLlite database described for function dbImportOneStudy.
Usage
dbImportStudies(
  dbToken,
  xptPathRoot,
  overWrite = FALSE,
  checkRequiredVars = TRUE,
  verbose = FALSE,
  logFilePath = NULL
)
Arguments
| dbToken | Mandatory | 
| xptPathRoot | Mandatory, character | 
| overWrite | Mandatory, boolean | 
| checkRequiredVars | Mandatory, boolean | 
| verbose | Mandatory, boolean | 
| logFilePath | Optional, character | 
Details
The status for the processing of each sub folder is caught and returned as
described below.
If parameter verbose = TRUE, the status for each processed sub folder
is also printed to the console each time a sub folder has been processed -
i.e. it's possible to followed the progress of the import process.
If parameter logFilePath has been specified with an existing path to a
folder, the status for each processed sub folder is also printed to a log
file in this folder each time a sub folder has been processed.
The database must be an SQLite database - no other types of databases are supported by this function.
Value
A list containing a named element with the import status for each
of the processed sub folders.
Each of the statuses are one of three variants:
- 'OK' - the SAS xport files has been imported to the database with no errors or warnings 
- 'Warning: [list of warnings]' - the SAS xport files has been imported to the database but have one or more warnings 
- 'Cancelled: [error message]' - the SAS xport files have not been imported to the database because an error has been detected. 
Examples
## Not run: 
# Import studies from a set of folders - do not allow to overwrite
# existing study data in the database, follow the progress
dbImportStudies(myDbToken,'/mydatapath/studies', verbose = TRUE)
# Import studies from another set of folders - allow to overwrite existing
# study data in the database
dbImportStudies(myDbToken,'/mydatapath/project123/studies', overwrite = TRUE)
# Import studies from a set of folders , save the status of each study load
# in a log file
dbImportStudies(myDbToken,'/mydatapath/studies',
                logFilePath = '/my/log file/path')
## End(Not run)
Disconnect from the open database.
Description
Close database session and disconnect from open database.
Usage
disconnectDB(dbToken)
Arguments
| dbToken | Mandatory | 
Value
No return value, called for side effects
Examples
## Not run: 
disconnectDB()
## End(Not run)
Execute sendDashboard app
Description
Executes an encapsulated Shiny which to query, visualize and extract historical control data from a SEND database.
Usage
execSendDashboard(dbToken)
Arguments
| dbToken | Mandatory - token for the open database connection | 
Value
The function does not return anything, but it is possible to extract data from the app in different formats to use for further processing
Examples
## Not run: 
dbToken <- initEnvironment(dbType='sqlite', dbPath='/path/to/database/send.db')
execSendDashboard(dbToken)
disconnectDB(dbToken)
## End(Not run)
Create json file for vocabulary mappings. Keys are synonyms and values are the CDISC Controlled Terminology Submission values. Vocabularies are defined by column values from the tab-delimited files.
Description
Create json file for vocabulary mappings. Keys are synonyms and values are the CDISC Controlled Terminology Submission values. Vocabularies are defined by column values from the tab-delimited files.
Usage
gen_vocab(in_file, out_path)
Arguments
| in_file | Mandatory. | 
| out_path | Mandatory. | 
Value
No return value, called for side effects
Examples
## Not run: 
gen_vocab(list(infile1, infile2),jsonfile)
## End(Not run)
Execute database query and returns fetched rows.
Description
The function executes a SQL select statements in the database and returns the fetched set of rows as a data.table.
Usage
genericQuery(dbToken, queryString, queryParams = NULL)
Arguments
| dbToken | Mandatory | 
| queryString | Mandatory, character. | 
| queryParams | Optional, character. | 
Value
Data.table with the set of fetched rows
Examples
## Not run: 
genericQuery(dbToken,
             'select studyid, tsseq, tsgrpid, tsparmcd, tsval from ts')
genericQuery(dbToken,
             'select studyid, tsval from ts where tsprmcd = "SDESIGN" and studyid in (:1)',
             list("1234546","222333","444555"))
## End(Not run)
Extract a list of control animals for a list of studies
Description
Returns a data table with a list of animals belonging to the groups for
negative control in the given list of studies.
Usage
getControlSubj(dbToken, studyList, inclUncertain = FALSE)
Arguments
| dbToken | Mandatory. | 
| studyList | Mandatory, data.table. | 
| inclUncertain | Mandatory, boolean. | 
Details
The set of animals contains all animals from DM where the SETCD is associated with a TX parameter 'TCNTRL'. Negative control animals are further defined by
- either containing a word from a set of words, to automatically distinguish it as a negative control: - ['placebo', 'untreated', 'sham'] 
 
- or containing a combination of a word from of two lists: - ['negative', 'saline', 'peg', 'vehicle', 'citrate', 'dextrose', 'water', 'air'] 
- ['item', 'control', 'article'] 
 
Animals are in all cases excluded (i.e. whether inclUncertain=TRUE or
inclUncertain=FALSE) from the output set, when they are identified as
positive control animals - i.e. they are associated with a TX parameter
'TCNTRL' containing a word from this set of words:
- ['positive','reference'] 
The age in days at reference start date is calculated for each animal based on the age related variables in DM:
- If BRTHDTC is populated compute DM.RFSTDTC – DM.BRTHDTC + 1 
- Else If AGE is populated convert from units specified in AGEU to days. 
- Else If AGETXT is populated convert the mid-point of the range from units specified in AGEU to days. 
 These AGEU units are handled with the described conversion from value to number of days:- DAYS 
- WEEKS : value * 7 
- MONTHS : value * 365/12 
- YEARS : value * 365 
 
If input parameter inclUncertain=TRUE, uncertain animals are included
in the output set. These uncertain situations are identified and reported (in
column UNCERTAIN_MSG):
- TX parameter 'TCNTRL' is missing 
- TXVAL for TX parameter 'TCNTRL' cannot be identified as Negative or Positive control according to the algorithm described above 
Value
The function return a data.table with columns:
- STUDYID (character) 
- Additional columns contained in the - studyListtable
- TCNTRL (character) 
 The value of the TX parameter TCNTRL which is used for identification of whether it is a negative control group or not
- USUBJID (character) 
- RFSTDTC (character) 
- DM_AGEDAYS (integer) 
 The calculated age in days of the animal at the reference start day - i.e. the age registered in DM.
- DSDECOD (character) 
 The standardized disposition term for the animal
- DS_AGEDAYS (integer) 
 The calculated age in days of the animal at the disposition
- NO_AGE_MSG (character) 
 Empty or contains the reason if a DM_AGEDAYS couldn't be calculated
- UNCERTAIN_MSG (character) 
 Included when parameter- inclUncertain=TRUE.
 Contains the reason for an uncertain animal is NA for rows for confident identified negative control animals.
- NOT_VALID_MSG (character) 
 Included if the column is included in data table specified in- studyList,
Examples
## Not run: 
controlAnimals <- getControlSubj(myDbToken, allSTudies)
## End(Not run)
Extract a set of findings for a specified study phase - or just add phase for each animal.
Description
Returns a data table with the set of findings rows included in the
findings of the phase(s) specified in the phaseFilter.
If the phaseFilter is empty (null, na or empty string), all rows from
findings are returned with the an additional PHASE column.
Usage
getFindingsPhase(
  dbToken,
  findings,
  phaseFilter = NULL,
  inclUncertain = FALSE,
  noFilterReportUncertain = TRUE
)
Arguments
| dbToken | Mandatory | 
| findings | Mandatory, data.table. 
 where domain is the name of the actual findings domain - e.g. LBSEQ and LBDTC | 
| phaseFilter | Optional, character. | 
| inclUncertain | Mandatory, boolean. | 
| noFilterReportUncertain | Mandatory, boolean. | 
Details
The logic for the extraction is based on the subject elements and the trial design domains - for each finding row:
- The related subject element is found in SE as the row where the value of domainDTC is within the interval from SESTDTC to SEENDTC 
- The actual EPOCH is found in TA in the row matching the found element (via the ETCD value) 
- The actual study phase is derived from the EPOCH value matching at set of text patterns 
For pooled findings rows - i.e. POOLID is populated instead of USUBJID - the phase is identified per animal included in the each pool and finding, and if all identified phases are equal per pool and finding, the identified phase are returned per pool and finding.
The populated value of a phase is one of:
- 'Screening' 
 If TA.EPOCH fulfills one:- contains 'pre' followed by one of ['treat','trt','dos',test','study','exposure'] 
- contains one of ['acclimat','screen','baseline','allocat','random'] 
 
- 'Recovery' 
 If TA.EPOCH doesn't fulfill the pattern for 'Screening' and fulfills one of:- contains 'recovery' 
- contains 'post' followed by one of ['treat','trt','dos','test','study','exposure'] 
 
- 'Treatment' 
 If TA.EPOCH doesn't fulfill the patterns for 'Screening' or 'Recovery' and fulfills both:- contains one of ['treat','trt','dos','test','exposure'] 
- does not contain any of ['off','non'|','free'|','holiday'] 
 
- 'Uncertain' 
 If the TA.EPOCH is empty or does not fulfills any of the requirements described for the three phases above.
If input parameter inclUncertain=TRUE, findings rows where the phase
cannot be confidently identified are included in the output set. These
uncertain situations are identified and reported (in column UNCERTAIN_MSG):
- One of the date/time values SESTDTC, SEENDTC or domainDTC is empty or contains an invalid ISO 8601 value 
- The value of domainDTC is included in more then one SESTDTC/SEENDTC interval 
- The EPOCH value does not match any of the patterns identifying the set of possible study phases. 
- Different phases have been identified for individual subjects in a pool for a given finding 
The same checks are performed and reported in column NOT_VALID_MSG if
phaseFilter is empty and noFilterReportUncertain=TRUE.
Value
The function returns a data.table with columns in this order:
- All columns contained in the - findingsinput table (original order except optional UNCERTAIN_MSG and NOT_VALID_MSG)
- PHASE (character) 
- UNCERTAIN_MSG (character) 
 Included when parameter- inclUncertain=TRUE.
 In case the phase cannot be confidently matched during the filtering of data, the column contains an indication of the reason.
 If any uncertainties have been identified for individual subjects included in pools for pooled finding rows, all messages for subjects per pool/findings are merged together and reported as one message per pool/finding.
 Is NA for rows where phase can be confidently matched.
 A non-empty UNCERTAIN_MSG value generated by this function is merged with non-empty UNCERTAIN_MSG values which may exist in the input set of findings specified in- findings- separated by '|'.
- NOT_VALID_MSG (character) 
 Included when parameter- noFilterReportUncertain=TRUE.
 In case the phase cannot be confidently decided, the column contains an indication of the reason.
 Is NA for rows where phase can be confidently decided.
 A non-empty NOT_VALID_MSG value generated by this function is merged with non-empty NOT_VALID_MSG values which may exist in the input set of findings- findings- separated by '|'.
Examples
## Not run: 
# Extract LB rows for the Treatment phase - include uncertain rows
getFindingsPhase(dbToken, lb,
                 phaseFilter = 'Treatment',
                 inclUncertain = TRUE)
# No filtering, just add PHASE to FW rows - do not include messages when
# the phase cannot be confidently identified
getFindingsPhase(dbToken, fw,
                 noFilterReportUncertain = FALSE)
## End(Not run)
Add the subject age at finding time - and optionally extract the set of findings within a specified range of age.
Description
Returns a data table with the set of findings rows included in the
findings where the age of subjects at finding time is within the
interval specified in fromAge to fromAge.
If the fromAge and fromAge are empty (null, na or empty
string), all rows from findings are returned.
Usage
getFindingsSubjAge(
  dbToken,
  findings,
  animalList,
  fromAge = NULL,
  toAge = NULL,
  inclUncertain = FALSE,
  noFilterReportUncertain = TRUE
)
Arguments
| dbToken | Mandatory | 
| findings | Mandatory, data.table. 
 where  | 
| animalList | Mandatory, data.table. 
 | 
| fromAge | Optional, character  
 The unit is case-insensitive, space(s) between age value and unit is allowed. | 
| toAge | Optional. character  | 
| inclUncertain | Mandatory, boolean. | 
| noFilterReportUncertain | Optional, boolean. | 
Details
In both situation, the subject age at finding time is calculated into an
additional column AGEDAYS for each row in findings combined
with the the additional input data.table animalList using this
algorithm:
- Determine the number of study days between study start and findings - if - findings.[domain]DYis populated- If - findings.[domain]DY > 0then use- findings.[domain]DY - 1
- Else use - findings.[domain]DY
 
- Else If - findings.[domain]DTCis populated compute- animalList.RFSTDTC – findings.[domain]DTCin days
 where- animalList.RFSTDTCis each subject's reference start date (- DM.RFSTDTC)
 
- Animal age at time of finding is then calculated as - animalList.AGEDAYS + [study days between study start and findings]
 where- animalList.AGEDAYSis the subject age at reference start date(calculated during extraction of control subjects in- getControlSubj.
- For pooled findings rows - i.e. POOLID is populated instead of USUBJID - the animal age at time of finding is calculated per animal included in the each pool and finding. - If all calculated ages are equal within a pool and finding, the calculated age is populated for this pool/finding. 
- If all calculated ages are within the same time internal (2 days) within a pool and finding, the minimum calculated age plus 1 day is populated for this pool/finding. 
 
If both fromAge and toAge values are specified - all the rows
from the input table findings where value of the calculated
AGEDYAS is within the interval of the specified start/end age interval
are returned - including the values equal to the start/end age values.
If only a fromAge value is specified - all the rows from the input
table findings where value of AGEDYAS equal to or greater than
the input age are returned.
If only a toAge value is specified - all the rows from input table
findings where value of AGEDAYS is equal to or less than the input age
are extracted and returned.
The input age value(s) is/are converted to days before extraction of rows
from the input data tables using the input value(s) as filter - using this
conversion:
-  DAYS
-  WEEKS : value * 7
-  MONTHS : value * 365/12
-  YEARS : value * 365
If input parameter inclUncertain=TRUE, findings rows where the age at
finding time cannot be confidently identified are included in the output set.
These uncertain situations are identified and reported (in column
UNCERTAIN_MSG):
- No age at reference time has been calculated for subject ( - animalList.AGEDAYS)
- Reference start time is missing or contains invalid ISO8601 date value for subject ( - animalList.RFSTDTC).
- Missing - [domain]DYvalue and missing or invalid ISO8601 date- [domain]DTCvalue for finding
- For pooled findings: - More than two days between minimum and maximum of - animalList.AGEDAYSfor the set of animals in a pool.
- Different values in - animalList.RFSTDTCfor the set of animals in a pool.
 
The same checks are performed and reported in column NOT_VALID_MSG if
fromAge and fromAge are empty and noFilterReportUncertain
= TRUE.
Value
The function returns a data.table with columns in this order:
- All columns contained in the - findingsinput table (original order except optional- UNCERTAIN_MSGand- NOT_VALID_MSG)
-  AGEDAYS(character)
 The subject age at finding time calculated in days. IsNAif thge age cannot be confidently calculated.
-  UNCERTAIN_MSG(character)
 Included when parameterinclUncertain=TRUE.
 In case the age at finding time cannot be confidently matched during the filtering of data, the column contains an indication of the reason.
 If any uncertainties have been identified for individual subjects included in pools for pooled finding rows, one message for is reported per pool/finding.
 Is NA for rows where the age at finding time can be confidently matched.
 A non-emptyUNCERTAIN_MSGvalue generated by this function is merged with non-emptyUNCERTAIN_MSGvalues which may exist in the input set of findings specified infindings- separated by '|'.
-  NOT_VALID_MSG(character)
 Included when parameternoFilterReportUncertain=TRUE.
 In case the age at finding time cannot be confidently calculated, the column contains an indication of the reason.
 Is NA for rows where age at finding time can be confidently calculated.
 A non-emptyNOT_VALID_MSGvalue generated by this function is merged with non-emptyNOT_VALID_MSGvalues which may exist in the input set of findingsfindings- separated by '|'.
Examples
## Not run: 
# Extract LB rows for the animals at age between 8 and 12 weeks at finding
# time - include uncertain rows
getFindingsSubjAge(dbToken = db,
                   findings = lb,
                   animalList = animals,
                   fromAge = '8w',
                   toAge = '12w',
                   inclUncertain = TRUE)
# No filtering, just add AGEDAYS to FW rows - do not include messages when
# the AGEDAYS cannot be confidently identified
getFindingsSubjAge(dbToken = db, findings = fw,  animalList = animals,
                   noFilterReportUncertain = FALSE)
## End(Not run)
Extract a list of SEND studies with a specified study design - or just add actual study design for each study.
Description
Returns a data table with the list of study ids from TS where the value of
TSVAL for the TSPARMCD 'SDESIGN' is equal to a given study design.
If the studyDesignFilter is empty (null, na or empty string) - all
rows for the TSPARMCD 'SDESIGN' are returned.
Usage
getStudiesSDESIGN(
  dbToken,
  studyList = NULL,
  studyDesignFilter = NULL,
  exclusively = TRUE,
  inclUncertain = FALSE,
  noFilterReportUncertain = TRUE
)
Arguments
| dbToken | Mandatory. | 
| studyList | Optional, data.table. | 
| studyDesignFilter | Mandatory, character. The study design to use as criterion for filtering of the study id values. It can be a single string, a vector or a list of multiple strings. | 
| exclusively | Mandatory, boolean. 
 | 
| inclUncertain | Mandatory, boolean. | 
| noFilterReportUncertain | Mandatory, boolean | 
Details
Extracts the set of studies from TS where the value of TSVAL for the TSPARMCD
'SDESIGN' is equal to a given study design.
The comparison of study design values are done case insensitive.
If a data table with a list of studies is specified in studyList, only
the subset of studies included in that set is processed.
If input parameter inclUncertain=TRUE, uncertain animals are included
in the output set. These uncertain situations are identified and reported (in
column UNCERTAIN_MSG):
- without any row for TSPARMCD='SDESIGN' or 
- TSVAL doesn't contain a value included in the CDISC CT list 'DESIGN' for TSPARMCD='SDESIGN' (case insensitive comparison) 
The same checks are performed and reported in column NOT_VALID_MSG if
studyDesignFilter is empty and noFilterReportUncertain=TRUE.
Value
The function returns a data.table with columns:
- STUDYID (character) 
- Additional columns contained in the - studyListtable (if such an input table is given)
- SDESIGN (character) 
 If multiple TSPARMCD 'SDESIGN' values are extratced for a studies, all the values are merged into a comma separated string.
- UNCERTAIN_MSG (character) 
 Included when parameter- inclUncertain=TRUE.
 Contains indication of whether STSTDTC is missing of has wrong format.
 Is NA for rows where SDESIGN is valid.
 A non-empty UNCERTAIN_MSG value generated by this function is merged with non-empty UNCERTAIN_MSG values which may exist in the optional input set of studies specified in- studyList- separated by '|'.
- NOT_VALID_MSG (character) 
 Included when parameter- noFilterReportUncertain=TRUE.
 In case the SDESIGN cannot be confidently decided, the column contains an indication of the reason.
 Is NA for rows where SDESIGN can be confidently decided.
 A non-empty NOT_VALID_MSG value generated by this function is merged with non-empty NOT_VALID_MSG values which may exist in the input set of studies specified in- studyList- separated by '|'.
Examples
## Not run: 
GetStudyListSDESIGN(myDbToken, 'PARALLEL')
## End(Not run)
Extract a list of SEND studies with study start date within a specified interval - or just add actual study start date for each study
Description
Returns a data table with the list of study ids from TS where the value of
TSVAL for the TSPARMCD 'STSTDTC' is within a a given date interval.
If the fromDTC andtoDTC are empty (null, na or empty string)
- all rows for the TSPARMCD 'STSTDTC' are returned. 
Usage
getStudiesSTSTDTC(
  dbToken,
  studyList = NULL,
  fromDTC = NULL,
  toDTC = NULL,
  inclUncertain = FALSE,
  noFilterReportUncertain = TRUE
)
Arguments
| dbToken | Mandatory. | 
| studyList | Optional. | 
| fromDTC | Optional (either or both of  | 
| toDTC | Optional (either or both of  | 
| inclUncertain | Mandatory, boolean. | 
| noFilterReportUncertain | Mandatory, boolean | 
Details
Extracts the set of study ids from TS where the value of TSVAL for the
TSPARMCD 'STSTDTC' falls within a specified start/end date interval in IS8601
format (input parameters fromDTC/toDTC).
Both complete and incomplete input start/end dates can be handled.
- If only a year is specified - the date set to the first of January that year. 
- If only a year and month is specified - the date set to the first day in that month. 
- If a time part is included in a specified input start/end date, it is ignored. 
If both a start and end input date are specified - all the STUDYID values
from TS where TSVAL for TSPARMCD 'STSTDTC' is with the interval of the
specified start/end date interval are extracted and returned - including the
values equal to the start/end dates. are included.
If only a start input date is specified - all the STUDYID values from TS
where TSVAL for TSPARMCD 'STSTDTC' is equal to or later than the input date
are extracted and returned.
If only an end date is specified - all the STUDYID values from TS where TSVAL
for TSPARMCD 'STSTDTC' is equal to or earlier than the are date are extracted
and returned.
If a data table with a list of studies is specified in studyList, only
the subset of studies included in that set is processed.
If input inclUncertain is TRUE, uncertain studies are included in the
output set. These uncertain situations are identified and reported (in column
UNCERTAIN_MSG):
- TS contains now row for TSPARMCD='STSTDTC' 
- TSVAL contains an invalid ISO8601 date format for TSPARMCD='STSTDTC' 
The same checks are performed and reported in column NOT_VALID_MSG if
fromDTC and toDTC are empty and
noFilterReportUncertain=TRUE.
Value
The function return a data.table with columns:
- STUDYID (character) 
- Additional columns contained in the - studyListtable (if such an input table is given)
- STSTDTC (character - ISO8601 format) 
- UNCERTAIN_MSG (character) 
 Only included when parameter- inclUncertain=TRUE.
 Contains indication of whether STSTDTC is missing of has wrong format.
 Is NA for rows where SDESIGN is valid.
 A non-empty UNCERTAIN_MSG value generated by this function is merged with non-empty UNCERTAIN_MSG values which may exist in the optional input set of studies specified in- studyList- separated by '|'.
- NOT_VALID_MSG (character) 
 Included when parameter- noFilterReportUncertain=TRUE.
 In case the STSTDTC cannot be confidently decided, the column contains an indication of the reason.
 Is NA for rows where STSTDTC can be confidently decided.
 A non-empty NOT_VALID_MSG value generated by this function is merged with non-empty NOT_VALID_MSG values which may exist in the input set of studies specified in- studyList- separated by '|'.
Examples
## Not run: 
GetStudyListSTSTDTC(myDbToken, allSTudies, '2018','2020')
## End(Not run)
Extract data from a subject level domain.
Description
Extracts and returns all rows from the specified domain for the set
of subjects included in animalList.
Usage
getSubjData(dbToken, animalList, domain, colList = NULL)
Arguments
| dbToken | Mandatory | 
| animalList | Mandatory, data.table. | 
| domain | Mandatory, character, not case sensitive. | 
| colList | Optional, character, not case sensitive. | 
Value
The function returns a data.table with all the rows for the animals
included in animalList.
If no columns have been specified in colList, all the columns in
the table colList are included.
If a list of columns have been specified in colList, these are
included. In addition, a set of columns are always included, whether they
are included in colList or not:
- To ensure each row can be uniquely identified: - DOMAIN 
- STUDYID 
- USUBJID 
- POOLID (if it exists) 
- domainSEQ (if it exists) 
 
- For finding tables - to support age calculation and evaluation of study phase: - domainDTC 
- domainDY 
 
The order of the columns are as they are defined for the domain in the
SEND IG.
The data table contains both
- subject level data - i.e. rows where USUBJID is not empty 
- if applicable for the - domain, pool level data - i.e. rows where POOLID is not empty.
 In this case, all pools, which includes any of the subjects included in- animalList, are included
Examples
## Not run: 
# Extract all columns from DM:
getSubjData(myDbToken, myControlAnimals, 'dm')
# Extract selected columns from LB:
getSubjData(myDbToken, myControlAnimals, 'LB',
            list('LBTESTCD', 'LBCAT',
                 'LBSTRESC', 'LBSTRESN', 'LBSTRESU',
                 'LBSTAT', 'LBREASND',
                 'LBTPT'))
## End(Not run)
Extract the set of animals of the specified route of administration - or just add actual route of administration for each animal.
Description
Returns a data table with the set of animals included in the
animalList matching the route of administration specified in the
routeFilter.
If the routeFilter is empty (null, na or empty string) - all rows from
animalList are returned with an additional populated ROUTE column.
Usage
getSubjRoute(
  dbToken,
  animalList,
  routeFilter = NULL,
  exclusively = FALSE,
  matchAll = FALSE,
  inclUncertain = FALSE,
  noFilterReportUncertain = TRUE
)
Arguments
| dbToken | Mandatory | 
| animalList | Mandatory, data.table. | 
| routeFilter | Optional, character. | 
| exclusively | Mandatory if  
 | 
| matchAll | Mandatory if  
 | 
| inclUncertain | Mandatory if  | 
| noFilterReportUncertain | Mandatory if  | 
Details
The route of administration per animal are identified by a hierarchical lookup in these domains
- EX - If a distinct not empty EXROUTE value is found for animal, this is included in the output. 
 
- TS - if a distinct TS parameter 'ROUTE' value exists for the study, this is included in the output. 
 
The comparison of route values is done case insensitive and trimmed for leading/trailing blanks.
If input parameter inclUncertain=TRUE, uncertain animals are included
in the output set. These uncertain situations are identified and reported (in
column UNCERTAIN_MSG):
- TS parameter ROUTE is missing for study and no EX rows contain a EXROUTE value for the animal 
- The selected EXROUTE or TS parameter ROUTE value is invalid (not CT value - CDISC SEND code list ROUTE) 
- Multiple EXROUTE values have been found for the animal 
- Multiple TS parameter ROUTE values are registered for study but no EX rows contain a EXROUTE value for the animal 
- The found EXROUTE value for animal is not included in the TS parameter ROUTE value(s) registered for study 
The same checks are performed and reported in column NOT_VALID_MSG if
routeFilter is empty and noFilterReportUncertain=TRUE.
Value
The function returns a data.table with columns:
- STUDYID (character) 
- Additional columns contained in the - animalListtable
- ROUTE (character) 
 The value is always returned in uppercase and trimmed for leading/trailing blanks.
- UNCERTAIN_MSG (character) 
 Included when parameter- inclUncertain=TRUE.
 In case the ROUTE cannot be confidently matched during the filtering of data, the column contains an indication of the reason.
 Is NA for rows where ROUTE can be confidently matched.
 A non-empty UNCERTAIN_MSG value generated by this function is merged with non-empty UNCERTAIN_MSG values which may exist in the input set of animals specified in- animalList- separated by '|'.
- NOT_VALID_MSG (character) 
 Included when parameter- noFilterReportUncertain=TRUE.
 In case the ROUTE cannot be confidently decided, the column contains an indication of the reason.
 Is NA for rows where the ROUTE can be confidently decided.
 A non-empty NOT_VALID_MSG value generated by this function is merged with non-empty NOT_VALID_MSG values which may exist in the input set of animals- animalList- separated by '|'.
Examples
## Not run: 
# Extract animals administered oral or oral gavage plus uncertain animals
getSubjRoute(dbToken, controlAnimals,
             routeFilter = c('ORAL', 'ORAL GAVAGE'),
             inclUncertain = TRUE)
# Extract animals administered oral or oral gavage.
# Do only include studies which include both route values
getSubjRoute(dbToken, controlAnimals,
             routeFilter = c('ORAL', 'ORAL GAVAGE'),
             matchAll = TRUE)
# Extract animals administered subcutaneous.
# Include only animals from studies which do not contain other route values
getSubjRoute(dbToken, controlAnimals,
             routeFilter = 'subcutaneous',
             exclusively = TRUE)
# No filtering, just add ROUTE - do not include messages when
# these values cannot be confidently found
getSubjRoute(dbToken, controlAnimals,
             noFilterReportUncertain = FALSE)
## End(Not run)
Extract the set of animals of the specified sex - or just add the sex of each animal.
Description
Returns a data table with the set of animals included in the
animalList of the sex specified in the sexFilter.
If the sexFilter is empty (null, na or empty string) - all rows from
animalList are returned with the an additional populated SEX column.
Usage
getSubjSex(
  dbToken,
  animalList,
  sexFilter = NULL,
  inclUncertain = FALSE,
  noFilterReportUncertain = TRUE
)
Arguments
| dbToken | Mandatory | 
| animalList | Mandatory, data.table. | 
| sexFilter | Optional, character. | 
| inclUncertain | Mandatory, boolean. | 
| noFilterReportUncertain | Mandatory, boolean. | 
Details
The sex value is decided from the DM.SEX variable.
The comparison of DM.SEX with the given value(s) in sexFilter is done
case-insensitive.
If input parameter inclUncertain=TRUE, uncertain animals are included
in the output set. These uncertain situations are identified and reported (in
column UNCERTAIN_MSG):
- The DM.SEX value is empty or invalid (not CT value - CDISC codelist SEX - case insensitive comparison) 
The same checks are performed and reported in column NOT_VALID_MSG if
sexFilter is empty and noFilterReportUncertain=TRUE.
Value
The function returns a data.table with columns:
- STUDYID (character) 
- Additional columns contained in the - animalListtable
- SEX (character) 
- UNCERTAIN_MSG (character) 
 Included when parameter- inclUncertain=TRUE.
 In case the sex cannot be confidently matched during the filtering of data, the column contains an indication of the reason.
 Is NA for rows where SEX can be confidently matched.
 A non-empty UNCERTAIN_MSG value generated by this function is merged with non-empty UNCERTAIN_MSG values which may exist in the input set of animals specified in- animalList- separated by '|'.
- NOT_VALID_MSG (character) 
 Included when parameter- noFilterReportUncertain=TRUE.
 In case the sex cannot be confidently decided, the column contains an indication of the reason.
 Is NA for rows where sex can be confidently decided.
 A non-empty NOT_VALID_MSG value generated by this function is merged with non-empty NOT_VALID_MSG values which may exist in the input set of animals- animalList- separated by '|'.
Examples
## Not run: 
getSubjSex(myDbToken, controlAnimals, 'M')
## End(Not run)
Extract the set of animals of the specified species and strain - or just add the species and strain for each animal.
Description
Returns a data table with the set of animals included in the
animalList matching the species and strain specified in the
speciesFilter and strainFilter.
If the speciesFilter and strainFilter are empty (null, na or
empty string) - all rows from animalList are returned with additional
populated SPECIES and STRAIN columns.
Usage
getSubjSpeciesStrain(
  dbToken,
  animalList,
  speciesFilter = NULL,
  strainFilter = NULL,
  inclUncertain = FALSE,
  exclusively = FALSE,
  noFilterReportUncertain = TRUE
)
Arguments
| dbToken | Mandatory | 
| animalList | Mandatory, data.table. | 
| speciesFilter | Optional, character. | 
| strainFilter | Optional, character. | 
| inclUncertain | Mandatory, boolean. | 
| exclusively | Mandatory, boolean. 
 | 
| noFilterReportUncertain | Optional, boolean. | 
Details
The species and strain per animal respectively are identified by a hierarchical lookup in these domains
- DM - If the DM.SPECIES (DM.STRAIN) isn't empty, this value is included in the output. 
 
- TX - if a TX parameter 'SPECIES' ('STRAIN') exists for the group related to the animal, the TXVAL value for this is included in the output. 
 
- TS - if a TS parameter 'SPECIES' ('STRAIN') exists, this is included in the output. 
 
The comparisons of species/strain values is done case insensitive and trimmed for leading/trailing blanks.
If input parameter inclUncertain=TRUE, uncertain animals are included
in the output set. These uncertain situations are identified and reported for
SPECIES and STRAIN respectively (in column UNCERTAIN_MSG):
- TS parameter SPECIES/STRAIN is missing or invalid (not CT value - CDISC SEND code list SPECIES/STRAIN) and TX parameter SPECIES/STRAIN is missing or invalid (not CT value) and DM.SPECIES/STRAIN is missing or invalid (not CT value) 
- Different values of SPECIES/STRAIN across TS, TX and DM for studies where no or only one TS parameter SPECIES/STRAIN is registered 
- Multiple TS parameter SPECIES/STRAIN values are registered for study and TX parameter SPECIES/STRAIN and/or DM.SPECIES/STRAIN do not match any of the TS values. 
- Multiple TS parameter SPECIES/STRAIN values are registered for study and TX parameter SPECIES/STRAIN and DM.SPECIES/STRAIN are unequal. 
The same checks are performed and reported in column NOT_VALID_MSG if
speciesFilter and strainFilter are empty and
noFilterReportUncertain=TRUE.
Value
The function returns a data.table with columns:
- STUDYID (character) 
- Additional columns contained in the - animalListtable
- SPECIES (character) The value is always returned in uppercase and trimmed for leading/trailing blanks. 
- STRAIN (character) The value is always returned in uppercase and trimmed for leading/trailing blanks. 
- UNCERTAIN_MSG (character) 
 Included when parameter- inclUncertain=TRUE.
 In case the species or strain cannot be confidently matched during the filtering of data, the column contains an indication of the reason.
 Is NA for rows where species and strain can be confidently matched.
 A non-empty UNCERTAIN_MSG value generated by this function is merged with non-empty UNCERTAIN_MSG values which may exist in the input set of animals specified in- animalList- separated by '|'.
- NOT_VALID_MSG (character) 
 Included when parameter- noFilterReportUncertain=TRUE.
 In case the species or strain cannot be confidently decided, the column contains an indication of the reason.
 Is NA for rows where species and strain can be confidently decided.
 A non-empty NOT_VALID_MSG value generated by this function is merged with non-empty NOT_VALID_MSG values which may exist in the input set of animals- animalList- separated by '|'.
Examples
## Not run: 
# Extract rats and mice plus uncertain animals
getSubjSpeciesStrain(dbToken, controlAnimals,
                     speciesFilter = c('RAT', 'MOUSE'),
                     inclUncertain = TRUE)
# Extract Spargue-Dawley rats plus uncertain animals.
# Include only animals from studies which do not contain other species or
# strains
getSubjSpeciesStrain(dbToken, controlAnimals,
                     speciesFilter = 'RAT',
                     strainFilter = 'SPRAGUE-DAWLEY',
                     inclUncertain = TRUE,
                     exclusively = TRUE,
                     noFilterReportUncertain = TRUE)
# Extract Wistar rats and and Beagle dogs - and no uncertain animals
getSubjSpeciesStrain(dbToken, controlAnimals,
                     speciesFilter = c('RAT', 'DOG'),
                     strainFilter = c('RAT: WISTAR', 'DOG: BEAGLE'))
# No filtering, just add SPECIES and STRAIN - do not include messages when
# these values cannot be confidently found
getSubjSpeciesStrain(dbToken, controlAnimals,
                     noFilterReportUncertain = FALSE)
## End(Not run)
Get labels for columns in a data.table
Description
Get labels for columns in a data.table
Usage
getTabColLabels(table)
Arguments
| table | Mandatory | 
Value
A named vector with each column/label pair. If a column have no defined label, the label is 'na'
Examples
## Not run: 
colLabels = getTabColLabels(controlAnimalsAll)
## End(Not run)
Initialize the environment.
Description
Open or create a SEND database and return a token for the open database connection.
Usage
initEnvironment(
  dbType = NULL,
  dbPath = NULL,
  dbCreate = FALSE,
  dbUser = NULL,
  dbPwd = NULL,
  dbSchema = NULL,
  ctFile = NULL
)
Arguments
| dbType | Mandatory, character 
 | 
| dbPath | Mandatory, character | 
| dbCreate | Mandatory, boolean | 
| dbUser | Mandatory, character - if login credentials are required for the
specific db type | 
| dbPwd | Mandatory, character - if login credentials are required for the
specific db type | 
| dbSchema | Optional, character | 
| ctFile | Optional, character. | 
Details
If the function is executed with parameter dbCreate=FALSE (default),
a connection to the specified database is opened. Dependent of the type of
database (parameter dbType), a login using specified user credentials
(parameters dbUser and dbPwd) may be done.
The database must contain a set of tables representing the SEND domains
compliant with SEND IG version 3.0 and/on 3.1.
If the function is executed with parameter dbCreate=TRUE, an empty
database is created and opened. This is only supported for a SQLite database,
i.e. parameter dbType='sqlite'. The SEND domain tables may then be
created by execution of the function dbCreateSchema.
Besides the open database connection, a set of CDISC SEND controlled
terminology values are imported. If parameter ctFile is specified with
a path to an Excel file containing a CDISC SEND ct version downloaded from
https://evs.nci.nih.gov/ftp1/CDISC/SEND/, the content
from this file is imported and used by some of the package's functions.
Else a set of CDISC SEND CT values which are included in the  packages is
used by the package's functions. It's the newest CDISC SEND CT version at the
time of the build of the current version of the package which is included.
Value
The function returns a token which is a data structure describing the open database connection. This token must be given as input parameter to all functions accessing the actual database.
Examples
## Not run: 
db <- initEnvironment(dbType='sqlite',
                      dbPath='//servername/SendData/db/send.db',
                      ctFile='//servername/SendData/metadata/SEND_Terminology_2019-12-27.xls')
db <- initEnvironment(dbType='oracle',
                      dbPath='dbserver:1521/send_db',
                      dbUser='ME',
                      dbPwd='mypassword',
                      dbSchema = 'send',
                      ctFile='//servername/SendData/metadata/SEND_Terminology_2019-12-27.xls')
## End(Not run)
Standardizes SEND xpt files using CDISC controlled terminologies
Description
Standardizes SEND xpt files using CDISC controlled terminologies
Usage
standardize_file(input_xpt_dir, output_xpt_dir, json_file)
Arguments
| input_xpt_dir | Mandatory. | 
| output_xpt_dir | Mandatory. | 
| json_file | Mandatory. | 
Value
No return value, called for side effects