| Title: | 'NoSQL' Database Connector | 
| Description: | Simplified JSON document database access and manipulation, providing a common API across supported 'NoSQL' databases 'Elasticsearch', 'CouchDB', 'MongoDB' as well as 'SQLite/JSON1', 'PostgreSQL', and 'DuckDB'. | 
| Version: | 0.13.1 | 
| License: | MIT + file LICENSE | 
| LazyData: | true | 
| URL: | https://docs.ropensci.org/nodbi/, https://github.com/ropensci/nodbi | 
| BugReports: | https://github.com/ropensci/nodbi/issues | 
| Depends: | R (≥ 3.4.0) | 
| Encoding: | UTF-8 | 
| Language: | en-US | 
| Imports: | stringi, jsonlite, uuid, jqr, DBI, V8, R.utils | 
| Suggests: | sofa, elastic, mongolite, RSQLite, duckdb, RPostgres, testthat, withr, callr, webfakes, knitr, rmarkdown, tibble | 
| RoxygenNote: | 7.3.2 | 
| X-schema.org-applicationCategory: | Databases | 
| X-schema.org-keywords: | database, MongoDB, Elasticsearch, CouchDB, SQLite, PostgreSQL, DuckDB, NoSQL, JSON, documents | 
| X-schema.org-isPartOf: | https://ropensci.org | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | no | 
| Packaged: | 2025-06-26 19:27:03 UTC; ralfherold | 
| Author: | Ralf Herold | 
| Maintainer: | Ralf Herold <ralf.herold@mailbox.org> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-06-26 19:50:02 UTC | 
Document database connector
Description
Simplified document database access and manipulation, providing a common API across supported 'NoSQL' databases 'Elasticsearch', 'CouchDB', 'MongoDB' as well as 'SQLite/JSON1', 'PostgreSQL' and 'DuckDB'.
Author(s)
Ralf Herold ralf.herold@mailbox.org
Scott Chamberlain sckott@protonmail.com
Rich FitzJohn rich.fitzjohn@gmail.com
Jeroen Ooms jeroen.ooms@stat.ucla.edu
See Also
Useful links:
- Report bugs at https://github.com/ropensci/nodbi/issues 
Data set 'contacts'
Description
Data set 'contacts'
Usage
contacts
Format
A JSON string with ragged, nested contact details
Data set 'diamonds'
Description
Data set 'diamonds'
Format
A data frame with 53940 rows and 10 variables:
- price price in US dollars (326-18,823 USD) 
- carat weight of the diamond (0.2-5.01) 
- cut quality of the cut (Fair, Good, Very Good, Premium, Ideal) 
- color diamond colour, from J (worst) to D (best) 
- clarity a measurement of how clear the diamond is (I1 (worst), SI1, SI2, VS1, VS2, VVS1, VVS2, IF (best)) 
- x length in mm (0-10.74) 
- y width in mm (0-58.9) 
- z depth in mm (0-31.8) 
- depth total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43-79) 
- table width of top of diamond relative to widest point (43-95) 
Source
from ggplot2
Create documents in a database
Description
A message is emitted if the container key already exists.
Usage
docdb_create(src, key, value, ...)
Arguments
| src | Source object, result of call to any of functions
 | 
| key | (character) The name of the container in the
database backend (corresponds to  | 
| value | The data to be created in the database: a single data.frame, a JSON string, a list, or a file name or URL that points to NDJSON documents | 
| ... | Passed to functions  | 
Details
An error is raised for document(s) in value when their
_id already exist(s) in the collection key;
use docdb_update() to update such document(s).
Value
(integer) Number of successfully created documents
Identifiers
If value is a data.frame that has a column _id,
or is a JSON string having a key _id at root level,
or is a list having an item _id at its top level,
this will be used as _id's and primary index
in the database. If there are no such _id's in value,
row names (if any exist) of value will be used as _id's,
otherwise random _id's will be created (using
uuid::UUIDgenerate() with use.time = TRUE for
SQLite and PostgreSQL, or using DuckDB's built-in uuid()).
Examples
## Not run: 
src <- src_sqlite()
docdb_create(src,
  key = "diamonds_small",
  value = as.data.frame(diamonds[1:3000L, ])
)
head(docdb_get(src, "diamonds_small"))
docdb_create(src, key = "contacts", value = contacts)
docdb_get(src, "contacts")[["friends"]]
## End(Not run)
Delete documents or container
Description
Delete documents or container
Usage
docdb_delete(src, key, ...)
Arguments
| src | Source object, result of call to any of functions
 | 
| key | (character) The name of the container in the
database backend (corresponds to  | 
| ... | Optionally, specify  | 
Value
(logical) Success of operation. Typically TRUE if
document(s) or collection existed, and FALSE if document(s)
did not exist, or collection did not exist, or delete was not successful.
Examples
## Not run: 
src <- src_sqlite()
docdb_create(src, "iris", iris)
docdb_delete(src, "iris", query = '{"Species": {"$regex": "a$"}}')
docdb_delete(src, "iris")
## End(Not run)
Check if container exists in database
Description
Check if container exists in database
Usage
docdb_exists(src, key, ...)
Arguments
| src | Source object, result of call to any of functions
 | 
| key | (character) The name of the container in the
database backend (corresponds to  | 
| ... | Passed to functions  | 
Value
(logical) TRUE or FALSE to indicate
existence of container key in database.
Note this does not indicate if the container
holds any documents.
Examples
## Not run: 
src <- src_sqlite()
docdb_exists(src, "nonexistingcontainer")
docdb_create(src, "mtcars", mtcars)
docdb_exists(src, "mtcars")
## End(Not run)
Get all documents from container in database
Description
Get all documents from container in database
Usage
docdb_get(src, key, limit = NULL, ...)
Arguments
| src | Source object, result of call to any of functions
 | 
| key | (character) The name of the container in the
database backend (corresponds to  | 
| limit | (integer) Maximum number of documents
to be returned. If  | 
| ... | Passed on to functions: 
 | 
Value
Data frame, one document per row
Examples
## Not run: 
src <- src_sqlite()
docdb_create(src, "mtcars", mtcars)
docdb_get(src, "mtcars", limit = 10L)
## End(Not run)
List containers in database
Description
List containers in database
Usage
docdb_list(src, ...)
Arguments
| src | Source object, result of call to any of functions
 | 
| ... | Passed to function  | 
Value
Vector of names of containers that can be
used as parameter key with other functions such as
docdb_create().
Examples
## Not run: 
src <- src_sqlite()
docdb_create(src, "iris", iris)
docdb_list(src)
## End(Not run)
Get documents or parts with filtering query
Description
Complements the databases' native query and filtering functions
by using jqr::jqr().
If query = "{}" and neither fields
nor listfields is specified, runs docdb_get().
Usage
docdb_query(src, key, query, ...)
Arguments
| src | Source object, result of call to any of functions
 | 
| key | (character) The name of the container in the
database backend (corresponds to  | 
| query | (character) A JSON query string, see examples.
Can use comparisons / tests ( | 
| ... | Optional parameters: 
 | 
Value
Data frame with requested documents, one per row,
may have nested lists in columns;
NULL if no documents could be found.
If listfields is specified: vector of all field names
in dot path notation.
Note
A dot in query or fields is interpreted as a dot path,
pointing to a field nested within another, e.g. friends.id in
the example.
Examples
## Not run: 
src <- src_sqlite()
docdb_create(src, "myKey", mtcars)
docdb_create(src, "myKey", contacts)
docdb_create(src, "myKey", mapdata)
docdb_query(src, "myKey", query = '{"mpg":21}')
docdb_query(src, "myKey", query = '{"mpg":21, "gear": {"$lte": 4}}')
docdb_query(src, "myKey", query = '{"mpg":21}', fields = '{"_id":0, "mpg":1, "cyl":1}')
docdb_query(src, "myKey", query = '{"_id": {"$regex": "^.+0.*$"}}', fields = '{"gear": 1}')
docdb_query(src, "myKey", query = '{"$and": [{"mpg": {"$lte": 18}}, {"gear": {"$gt": 3}}]}')
docdb_query(src, "myKey", query = '{}', fields = '{"_id":0, "mpg":1, "cyl":1}')
docdb_query(src, "myKey", query = '{"$and": [{"age": {"$gt": 21}},
 {"friends.name": {"$regex": "^B[a-z]{3,9}.*"}}]}')
docdb_query(src, "myKey", query = '{"$or": [{"rows.elements.status": "OK"}, {"$and": [
 {"_id": "5cd6785325ce3a94dfc54096"}, {"friends.name": {"$regex": "^B[a-z]{3,90}.*"}}]}]}')
docdb_query(src, "myKey", query = '{"$and": [{"_id": "5cd6785325ce3a94dfc54096"},
 {"friends.name": {"$regex": "^B[a-z]{3,90}.*"}}]}')
docdb_query(src, "myKey", query = '{"origin_addresses": {"$in": ["Santa Barbara, CA, USA",
 "New York, NY, USA"]}}', fields = '{"age": 1, "friends.id": 1, "_id": 0,
 "rows.elements.status": 1}')
docdb_query(src, "myKey", query = '{"rows.elements.status": "OK"}', listfields = TRUE)
## End(Not run)
Update documents
Description
Documents are updated by patching their JSON with
value. Documents are identified by a query
or by _id's in value, where the latter takes
precedence. value can have multiple documents (with
_id's), which then are iteratively updated.
Usage
docdb_update(src, key, value, query, ...)
Arguments
| src | Source object, result of call to any of functions
 | 
| key | (character) The name of the container in the
database backend (corresponds to  | 
| value | The data to be created in the database: a single data.frame, a JSON string, a list, or a file name or URL that points to NDJSON documents | 
| query | (character) A JSON query string, see examples.
Can use comparisons / tests ( | 
| ... | Passed on to functions  | 
Details
Uses native functions in MongoDB (mongolite::mongo()$update()),
SQLite (jsonb_update()), DuckDB (jsonb_merge_patch()),
Elasticsearch (elastic::docs_bulk_update());
a plpgsql function added when calling src_postgres(),
and a jqr::jqr() programme for CouchDB.
Value
(integer) Number of successfully updated documents
Examples
## Not run: 
src <- src_sqlite()
docdb_create(src, "mtcars", mtcars)
docdb_update(src, "mtcars", value = mtcars[3, 4:5], query = '{"gear": 3}')
docdb_update(src, "mtcars", value = '{"carb":999}', query = '{"gear": 5}')
docdb_update(src, "mtcars", value = '{"_id":"Fiat 128", "carb":888}', query = '{}')
docdb_get(src, "mtcars")
## End(Not run)
Data set 'mapdata'
Description
Data set 'mapdata'
Usage
mapdata
Format
A JSON string with ragged, nested travel details
Setup database connections
Description
There is a src_*() function to setup a connection to each
of the database backends. The backends may have specific parameters
in the respective function src_*(), but all other nodbi functions
are independent of the backend (e.g., see docdb_query()).
Details
- MongoDB - - src_mongo()
- SQLite - - src_sqlite()
- Elasticsearch - - src_elastic()
- CouchDB - - src_couchdb()
- PostgreSQL - - src_postgres()
- DuckDB - - src_duckdb()
Documentation details for each database:
- SQLite/JSON1 - https://www.sqlite.org/json1.html 
- Elasticsearch - https://www.elastic.co/docs/get-started 
- PostgreSQL - https://www.postgresql.org/docs/current/functions-json.html 
- DuckDB - https://duckdb.org/docs/stable/data/json/overview.html 
Documentation of R packages used by nodbi for the databases:
- mongolite - https://CRAN.R-project.org/package=mongolite 
- RPostgres - https://CRAN.R-project.org/package=RPostgres 
Setup a CouchDB database connection
Description
Setup a CouchDB database connection
Usage
src_couchdb(
  host = "127.0.0.1",
  port = 5984,
  path = NULL,
  transport = "http",
  user = NULL,
  pwd = NULL,
  headers = NULL
)
Arguments
| host | (character) host value, default: 127.0.0.1 | 
| port | (integer/numeric) Port. Remember that if you don't want a port set, set this parameter to NULL. Default: 5984 | 
| path | (character) context path that is appended to the end of the url, e.g., bar in http://foo.com/bar. Default: NULL, ignored | 
| transport | (character) http or https. Default: http | 
| user | (character) Username, if any | 
| pwd | (character) Password, if any | 
| headers | (list) list of named headers | 
Details
Uses sofa as backend. nodbi creates or uses
a CouchDB database with JSON documents. If documents do not have
root-level _id's, UUID's are created as _id's. Function
docdb_update() uses jqr::jqr() to implement patching JSON.
For a benchmark, see https://github.com/ropensci/nodbi#benchmark.
Value
A nodbi source object
Examples
## Not run: 
con <- src_couchdb()
print(con)
## End(Not run)
Setup a DuckDB database connection
Description
Setup a DuckDB database connection
Usage
src_duckdb(drv = duckdb::duckdb(), dbdir = attr(drv, "dbdir"), ...)
Arguments
| drv | Object returned by  | 
| dbdir | Location for database files. Should be a path to an existing
directory in the file system. With the default (or  | 
| ... | Additional named parameters passed on to  | 
Details
Uses duckdb::duckdb() as backend. nodbi creates or
uses a DuckDB table, with columns _id and json created and used
by package nodbi, applying SQL functions as per
https://duckdb.org/docs/extensions/json to the json column.
Each row in the table represents a JSON document.
Any root-level _id is extracted from the document(s) and used for
column _id, otherwise a UUID is created as _id.
The table is indexed on _id.
For a benchmark, see https://github.com/ropensci/nodbi#benchmark.
Value
A nodbi source object
Examples
## Not run: 
con <- src_duckdb()
print(con)
## End(Not run)
Setup an Elasticsearch database connection
Description
Setup an Elasticsearch database connection
Usage
src_elastic(
  host = "127.0.0.1",
  port = 9200,
  path = NULL,
  transport_schema = "http",
  user = NULL,
  pwd = NULL,
  force = FALSE,
  ...
)
Arguments
| host | (character) the base url, defaults to localhost (http://127.0.0.1) | 
| port | (character) port to connect to, defaults to 9200 (optional) | 
| path | (character) context path that is appended to the end of the
url. Default:  | 
| transport_schema | (character) http or https. Default: http | 
| user | (character) User name, if required for the connection. You can specify, but ignored for now. | 
| pwd | (character) Password, if required for the connection. You can specify, but ignored for now. | 
| force | (logical) Force re-load of connection details | 
| ... | Further args passed on to  | 
Details
Uses elastic as backend. nodbi creates or uses
an Elasticsearch index, in which nodbi creates JSON documents.
Any root-level _id is extracted from the document(s) and used as
document ID _id, otherwise a UUID is created as document ID _id.
Only lowercase is accepted for container names (in parameter key).
Opensearch can equally be used.
For a benchmark, see https://github.com/ropensci/nodbi#benchmark
Value
A nodbi source object
Examples
## Not run: 
con <- src_elastic()
print(con)
## End(Not run)
Setup a MongoDB database connection
Description
Setup a MongoDB database connection
Usage
src_mongo(collection = "test", db = "test", url = "mongodb://localhost", ...)
Arguments
| collection | (character) Name of collection | 
| db | (character) Name of database | 
| url | (character) Address of the MongoDB server in Mongo connection
string URI format, see to  | 
| ... | Additional named parameters passed on to  | 
Details
Uses monoglite as backend. nodbi creates or uses
a MongoDB collection, in which nodbi creates JSON documents.
If documents do not have root-level _id's, UUID's are created as _id's.
MongoDB but none of the other databases require to specify the container
already in the src_mongo() function.
For a benchmark, see https://github.com/ropensci/nodbi#benchmark
Value
A nodbi source object
Examples
## Not run: 
con <- src_mongo()
print(con)
## End(Not run)
Setup a PostgreSQL database connection
Description
Setup a PostgreSQL database connection
Usage
src_postgres(dbname = "test", host = "localhost", port = 5432L, ...)
Arguments
| dbname | (character) name of database, has to exist to open a connection | 
| host | (character) host of the database,
see  | 
| port | (integer) port of the database,
see  | 
| ... | additional named parameters passed
on to  | 
Details
Uses RPostgres as backend. nodbi creates or uses
a PostgreSQL table, with columns _id and json created and used
by package nodbi, applying SQL functions as per
https://www.postgresql.org/docs/current/functions-json.html
to the json column.
Each row in the table represents a JSON document.
Any root-level _id is extracted from the document(s) and used
for column _id, otherwise a UUID is created as _id.
The table is indexed on _id. A custom plpgsql function
is used for docdb_update().
The order of variables in data frames returned by docdb_get()
and docdb_query() can differ from their order the input to
docdb_create().
For a benchmark, see https://github.com/ropensci/nodbi#benchmark
Value
A nodbi source object
Examples
## Not run: 
con <- src_postgres()
print(con)
## End(Not run)
Setup a RSQLite database connection
Description
Setup a RSQLite database connection
Usage
src_sqlite(dbname = ":memory:", ...)
Arguments
| dbname | (character) name of database file,
defaults to ":memory:" for an in-memory database,
see  | 
| ... | additional named parameters passed
on to  | 
Details
Uses RSQLite as backend. nodbi creates or uses
an SQLite table, with columns _id and json created and used by
package nodbi, applying SQL functions as per
https://www.sqlite.org/json1.html to the json column.
Each row in the table represents a JSON document.
Any root-level _id is extracted from the document(s) and used for
column _id, otherwise a UUID is created as _id.
The table is indexed on _id.
For a benchmark, see https://github.com/ropensci/nodbi#benchmark
Value
A nodbi source object
Examples
## Not run: 
con <- src_sqlite()
print(con)
## End(Not run)