Data owners can manage privacy settings in two main ways: (i) setting
the privacy control level, and (ii) controlling which additional
functions can be passed to dsTidyverse functions.

DataSHIELD implements privacy control levels, which allow data owners
to control which functions researchers can use. The table below shows
which dsTidyverse functions are permitted at each privacy control
level. The level is set as an option on the server; for example, to
select non-permissive mode, use
`datashield.privacyControlLevel = "non-permissive"`.

| Function | Permissive | Banana | Avocado | Non-Permissive |
|---|---|---|---|---|
| `arrangeDS` | ✔ | ✔ | | |
| `asTibbleDS` | ✔ | ✔ | ✔ | ✔ |
| `bindColsDS` | ✔ | ✔ | | |
| `bindRowsDS` | ✔ | ✔ | | |
| `caseWhenDS` | ✔ | ✔ | | |
| `distinctDS` | ✔ | ✔ | ✔ | ✔ |
| `filterDS` | ✔ | ✔ | | |
| `groupByDS` | ✔ | ✔ | | |
| `groupKeysDS` | ✔ | ✔ | | |
| `mutateDS` | ✔ | ✔ | | |
| `renameDS` | ✔ | ✔ | ✔ | ✔ |
| `selectDS` | ✔ | ✔ | ✔ | ✔ |
| `sliceDS` | ✔ | ✔ | | |
| `ungroupDS` | ✔ | ✔ | | |
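
As a minimal sketch of setting the privacy control level, assuming the server exposes DataSHIELD settings as ordinary R options in the session that serves requests (how options are actually configured depends on the server software in use), the level could be set and read back like this:

```r
# Assumption: this runs in the server-side R session, for example from a
# startup profile or via the server's administration interface.
options(datashield.privacyControlLevel = "non-permissive")

# Read the value back to confirm which level is active.
level <- getOption("datashield.privacyControlLevel")
print(level)
```

With this level active, only the functions ticked in the Non-Permissive column of the table would be available to researchers.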

dsTidyverse allows additional functions to be passed via the
`tidy_expr` argument. For example, using `ds.mutate` you can
pass `as.numeric`:

    ds.mutate("mtcars", list(cyl = as.numeric(cyl)), "newobj")

Only functions that do not risk disclosing individual-level data may be passed. The default list of permitted functions is:

    "everything", "last_col", "group_cols", "starts_with", "ends_with", "contains",
    "matches", "num_range", "all_of", "any_of", "where", "rename", "mutate", "if_else",
    "case_when", "mean", "median", "mode", "desc", "last_col", "nth", "where", "num_range",
    "exp", "sqrt", "scale", "round", "floor", "ceiling", "abs", "sd", "var",
    "sin", "cos", "tan", "asin", "acos", "atan", "c", "as.character", "as.integer",
    "lag", "diff", "cumsum"

The data owner can change these defaults on their server using
the option `tidyverse.permitted.functions`. For example, if
as a data owner you want to restrict the permitted functions to
`as.numeric` and `as.integer`, you can set the option
`tidyverse.permitted.functions = c("as.numeric", "as.integer")`.
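
To illustrate the effect of restricting the permitted functions, here is a rough sketch. It assumes the option is set as an ordinary R option in the server-side session. The helper `uses_only_permitted` is hypothetical and is not part of dsTidyverse; it only mimics the idea of an allow-list check and does not reflect how the package validates expressions internally.

```r
# Assumption: set in the server-side R session by the data owner.
options(tidyverse.permitted.functions = c("as.numeric", "as.integer"))

# Hypothetical helper (not a dsTidyverse function): TRUE when every
# function called in `expr` appears in the permitted list.
uses_only_permitted <- function(expr,
                                permitted = getOption("tidyverse.permitted.functions")) {
  # all.names() returns function and variable names; all.vars() returns
  # variable names only, so the difference is the set of functions called.
  # Simplification: a variable sharing its name with a function is ignored.
  called <- setdiff(all.names(expr), all.vars(expr))
  all(called %in% permitted)
}

ok_numeric <- uses_only_permitted(quote(as.numeric(cyl)))  # as.numeric is permitted
ok_sqrt    <- uses_only_permitted(quote(sqrt(cyl)))        # sqrt is not in the restricted list
print(c(as.numeric = ok_numeric, sqrt = ok_sqrt))
```

Under this restricted setting, an expression using `sqrt` would be rejected by such a check even though `sqrt` is in the default list, because setting the option replaces the list of permitted functions.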

WARNING: This feature gives the data owner the option to restrict the permitted functions, but also to permit additional ones. If you choose to permit functions not included in the default list, please take steps to ensure that they are compatible with your research setting and that, in sensitive settings with secure data, they do not risk returning individual-level data to the researcher. If you have doubts, please contact the maintainers of dsTidyverse, who can discuss the risks with you.