The R package easybgm provides a
user-friendly package for performing a Bayesian analysis of psychometric
networks. In particular, it helps to fit, extract, and visualize the
results of a Bayesian graphical model commonly used in the
social-behavioral sciences. The package is a wrapper around existing
packages. So far, the package supports fitting and extracting results of
cross-sectional network models using BDgraph (Mohammadi
& Wit, 2015), BGGM(Williams & Mulder, 2019), and
bgms (Marsman & Haslbeck, 2023). As output, the package
extracts the posterior parameter estimates, the posterior inclusion
probability, the inclusion Bayes factor, and optionally posterior
samples of the parameters and the nodes centrality. The package comes
with an extensive suite of visualization functions.
To install this package from CRAN use:
install.packages("easybgm")To install this package from Github use
install.packages("remotes")
remotes::install_github("KarolineHuth/easybgm")To rather install the most up-to-date developer version, use
install.packages("remotes")
remotes::install_github("KarolineHuth/easybgm", ref = "developer")The package consists of wrapper functions around existing R packages
(i.e., BDgraph, bgms, and BGGM).
To initiate estimation, researchers must specify the data set and the
data type (i.e., continuous, mixed, ordinal, or binary). Based on the
data type specification, easybgm estimates the network
using the appropriate R package (i.e., BDgraph for
continuous and mixed data, and bgms for ordinal and binary
data). Users can override the default package selection by specifying
their preferred R package with the package argument. All
other arguments, such as package-specific informed prior specifications,
can be passed to easybgm. As output, easybgm
returns the posterior parameter estimates, the posterior inclusion
probability, and the inclusion Bayes factor. In addition, the package
extracts the posterior samples of the parameters by setting
save = TRUE and the strength centrality samples by setting
centrality = TRUE.
The package comes with an extensive suite of functions to visualize
the results of the Bayesian analysis of networks. We provide more
information on each of the plots below. The visualization functions use
qgraph (Epskamp et al., 2012) or ggplot2
(Wickham, 2016) as the backbone.
The edge evidence plot aids researchers in deciding which edges provide robust inferential conclusions. In the edge evidence plot, edges represent the inclusion Bayes factor \(`\text{BF}_{10}`\). Red edges indicate evidence for edge absence (i.e., conditional independence), grey edges indicate the absence of evidence, and blue edges indicate evidence for edge presence (i.e., conditional dependence). By default, a \(`\text{BF}_{10} > 10`\) is considered strong evidence for inclusion and \(`\text{BF}_{01} > 10`\) for exclusion. Users can specify the threshold for Bayes factors.
In the network plot, edges indicate the strength of partial association between two nodes. The network plot shows all edges with an inclusion Bayes factor greater than \(1\), i.e. all edges that have some evidence of inclusion. Edge thickness and saturation represent the strength of the association; the thicker the edge, the stronger the association. Red edges indicate negative associations and blue edges indicate positive associations.
The structure uncertainty can be assessed with the posterior structure probability plot and the posterior complexity plot. The posterior structure probability plot shows the posterior probabilities of the visited structures, sorted from the most to the least probable. Each dot represents one structure. The more structures with similar posterior probability, the more uncertain the true structure. If one structure dominates the posterior structure probability, we can be relatively certain about the true structure.
The posterior complexity plot shows the posterior probability of a structure complexity (i.e., number of present edges in a network). Here, the posterior probability of all structures with the same complexity are aggregated into one plot.
The 95 % highest density intervals (HDI) of the parameters are visualized with a parameter forest plot. In the plot, dots represent the median of the posterior samples and the lines indicate the shortest interval that covers 95% of the posterior distribution. The narrower an interval, the more stable a parameter.
Researchers often use centrality measures to obtain aggregated information for each node, such as the connectedness quantified by strength centrality. Credible intervals for strength centrality can be obtained by calculating the centrality measure for each sample of the posterior distribution. The higher the centrality, the more connected the node; error bars represent the 95% High Density Interval (HDI).
We want to illustrate the package use with an example. In particular,
we use the women and mathematics data which can be loaded with the
package BGGM. We fit the model and extract its results with
the function easybgm. We specify the data and the data
type, which in this case is binary.
library(easybgm)
library(BGGM)
data <- na.omit(women_math)
colnames(women_math) <- c("LCA", "GND", "SCT", "NMF", "SBP", "FTP")
res <- easybgm(data, type = "binary")Having fitted the model, we can now take a look at its results.
summary(res)
#>  BAYESIAN ANALYSIS OF NETWORKS 
#>  Model type: binary 
#>  Number of nodes: 6 
#>  Fitting Package: bgms 
#> --- 
#>  EDGE SPECIFIC OVERVIEW 
#>  Relation Estimate Posterior Incl. Prob. Inclusion BF Category
#>   LCA-GND    0.000                 0.034        0.035 excluded
#>   LCA-SCT    0.001                 0.027        0.028 excluded
#>   GND-SCT    0.003                 0.043        0.044 excluded
#>   LCA-NMF    0.001                 0.038        0.040 excluded
#>   GND-NMF    0.508                 1.000          Inf included
#>   SCT-NMF   -0.012                 0.084        0.092 excluded
#>   LCA-SBP    0.001                 0.031        0.032 excluded
#>   GND-SBP   -0.756                 1.000          Inf included
#>   SCT-SBP    0.337                 0.982       54.556 included
#>   NMF-SBP   -0.980                 1.000          Inf included
#>   LCA-FTP   -0.004                 0.051        0.054 excluded
#>   GND-FTP    0.000                 0.040        0.042 excluded
#>   SCT-FTP    1.176                 1.000          Inf included
#>   NMF-FTP    0.670                 1.000          Inf included
#>   SBP-FTP   -0.014                 0.090        0.099 excluded
#> 
#> Bayes Factors larger than 10 were considered sufficient evidence for the categorization. 
#>  --- 
#>  AGGREGATED EDGE OVERVIEW 
#>  Number of included edges: 6 
#>  Number of inconclusive edges: 0 
#>  Number of excluded edges: 9 
#>  Number of possible edges: 15 
#>  
#>  --- 
#>  STRUCTURE OVERVIEW 
#>  Number of visited structures: 109 
#>  Number of possible structures: 32768 
#>  Posterior probability of most likely structure: 0.6264 
#> ---Furthermore, we can visualize the results with plots. In a first
step, we assess the edge evidence plot in which edges represent the
inclusion Bayes factor \(`\text{BF}_{10}`\). Especially in a large
network like ours, it is useful to split the edge evidence plot in two
parts by setting the split argument to TRUE.
As such, the left plot shows edges with some evidence for inclusion
(i.e., \(`\text{BF}_{10} > 1`\)),
where blue edges represent evidence for inclusion (\(`\text{BF}_{10} > 10`\)) and grey edges
absence of evidence (\(`1 < \text{BF}_{10}
< 10`\)). The right edge evidence plot shows edges with some
evidence for exclusion (i.e., \(`\text{BF}_{10} < 1`\)) with evidence
for exclusion shown as red (\(`\text{BF}_{01}
> 10`\)) and inconclusive evidence as grey (\(`0.1 < \text{BF}_{10} < 1`\)).
plot_edgeevidence(res, edge.width = 2, split = T)
Furthermore, we can look at the network plot in which edges represent the partial associations.
plot_network(res, layout = "spring", 
             layoutScale = c(.8,1), palette = "R",
             theme = "TeamFortress", vsize = 6)
We can also assess the structure specifically with three plots. Note
that this only works, if we use either the BDgraph or
bgms package.
plot_structure_probabilities(res, as_BF = FALSE)
plot_complexity_probabilities(res, as_BF = FALSE)
plot_structure(res, layoutScale = c(.8,1), palette = "R",
               theme = "TeamFortress", vsize = 6, edge.width = .3, layout = "spring")
In addition we can obtain posterior samples from the posterior
distribution by setting save = TRUE in the
easybgm function and thereby open up new possibilities of
assessing the model. We can extract the posterior density of the
parameters with a parameter forest plot.
res <- easybgm(data, type = "binary", save = TRUE, centrality = TRUE)
plot_parameterHDI(res)
Furthermore, researcher can wish to aggregate the findings of the
network model, commonly done with centrality measures. Due to the
discussion around the meaningfulness of centrality measures in
psychometric network models, we recommend users to stick to the strength
centrality. To obtain the centrality measures, users need to set
save = TRUE and centrality = TRUE, when
estimating the network model with easybgm. The centrality
measures can be inspected with the centrality plot.
plot_centrality(res, measures = "Strength")
For more information on the package, the Bayesian background, its application to networks and the respective plots, check out:
Huth, K., Keetelaar, S., Sekulovski, N., van den Bergh, D., & Marsman, M. (2023). Simplifying Bayesian analysis of graphical models for the social sciences with easybgm: A user-friendly R-package. https://doi.org/10.31234/osf.io/8f72p
Huth, K., de Ron, J., Luigjes, J., Goudriaan, A., Mohammadi, R., van Holst, R., Wagenmakers, E.J., & Marsman, M. (2023). Bayesian Analysis of Cross-sectional Networks: A Tutorial in R and JASP. PsyArXiv https://doi.org/10.31234/osf.io/ub5tc.
If you encounter any bugs or have ideas for new features, you can
submit them by creating an issue on Github. Additionally, if you want to
contribute to the development of easybgm, you can initiate
a branch with a pull request; we can review and discuss the proposed
changes.
Epskamp, S., Cramer, A. O. J., Waldorp, L. J., Schmittmann, V. D., & Borsboom, D. (2012). qgraph: Network Visualizations of Relationships in Psychometric Data. Journal of Statistical Software, 48 . doi: 10.18637/jss.v048.i04
Huth, K., de Ron, J., Luigjes, J., Goudriaan, A., Mohammadi, R., van Holst, R., Wagenmakers, E.J., & Marsman, M. (2023). Bayesian Analysis of Cross-sectional Networks: A Tutorial in R and JASP. PsyArXiv https://doi.org/10.31234/osf.io/ub5tc.
Marsman, M., Haslbeck, J. M. B. (2023). Bayesian Analysis of the Ordinal Markov Random Field. PsyArXiv https://doi.org/10.31234/osf.io/ukwrf.
Mohammadi, Reza, and Ernst C Wit. (2015). “BDgraph: An R Package for Bayesian Structure Learning in Graphical Models.” Journal of Statistical Software 89 (3). https://doi.org/10.18637/jss.v089.i03.
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag New York. Retrieved from https://ggplot2.tidyverse.org
Williams, Donald R, and Joris Mulder. (2019). “Bayesian Hypothesis Testing for Gaussian Graphical Models: Conditional Independence and Order Constraints.” PsyArXiv. https://doi.org/10.31234/osf.io/ypxd8.