| Type: | Package | 
| Title: | Automate Latent Growth Mixture Modelling in 'Mplus' | 
| Version: | 1.0.0 | 
| Description: | Provide a suite of functions for conducting and automating Latent Growth Modeling (LGM) in 'Mplus', including Growth Curve Model (GCM), Growth-Based Trajectory Model (GBTM) and Latent Class Growth Analysis (LCGA). The package builds upon the capabilities of the 'MplusAutomation' package (Hallquist & Wiley, 2018) to streamline large-scale latent variable analyses. “MplusAutomation: An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus.” Structural Equation Modeling, 25(4), 621–638. <doi:10.1080/10705511.2017.1402334> The workflow implemented in this package follows the recommendations outlined in Van Der Nest et al. (2020). “An Overview of Mixture Modeling for Latent Evolutions in Longitudinal Data: Modeling Approaches, Fit Statistics, and Software.” Advances in Life Course Research, 43, Article 100323. <doi:10.1016/j.alcr.2019.100323>. | 
| Depends: | R (≥ 4.1.0), | 
| License: | GPL (≥ 3) | 
| Imports: | MplusAutomation, magrittr, tibble, dplyr, tidyr, tidyselect, stringr, purrr, ggplot2, glue, parallel | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.3.2 | 
| URL: | https://github.com/OlivierPDS/MplusLGM | 
| BugReports: | https://github.com/OlivierPDS/MplusLGM/issues | 
| LazyData: | true | 
| NeedsCompilation: | no | 
| Packaged: | 2025-02-01 20:20:08 UTC; olivierpercie | 
| Author: | Olivier Percie du Sert | 
| Maintainer: | Olivier Percie du Sert <olivier.perciedusert@mail.mcgill.ca> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-02-03 12:20:02 UTC | 
Create Mplus model objects for Latent Growth Modelling (LGM)
Description
Provide flexibility for specifying Mplus LGM objects with various latent class and residual variance structures, and capturing individual differences in growth trajectories.
Support Growth Curve Models (GCM), Growth-Based Trajectory Models (GBTM) and Latent Class Growth Analysis (LCGA).
Once created, the model can be estimated using the runLGM function.
Usage
LGMobject(
  data,
  outvar,
  catvar = FALSE,
  idvar,
  k,
  starting_val,
  estimator = c("MLR", "ML", "WLSMV", "WLS"),
  transformation = c("LOGIT", "PROBIT"),
  lgm_type = c("gcm", "gbtm", "lcga_t", "lcga_c", "lcga_tc"),
  polynomial = 1,
  timescores,
  timescores_indiv = FALSE,
  output,
  plot,
  save
)
Arguments
| data | A data frame containing all variables for the trajectory analysis. | 
| outvar | A character vector specifying the outcome variables at different times. | 
| catvar | A logical value indicating whether the outcome variable is categorical. Default is  | 
| idvar | A character string specifying the ID variable. | 
| k | An integer specifying the number of latent classes for the model. | 
| starting_val | A numeric value specifying the number of random starting values to generate for the initial optimization stage. Note that the number of final stage optimizations will be set as equal to half of this value. | 
| estimator | A character string to specify the estimator to use in the analysis. Default is 'MLR'. | 
| transformation | A character string to specify the latent response variable transformation to use when the outcome variable is categorical. Default is  | 
| lgm_type | A character string specifying the residual variance structure of the growth model. Options include: 
 | 
| polynomial | An integer specifying the order of the polynomial used to model trajectories. Supported values are: 1 (linear), 2 (quadratic), 3 (cubic). Default is 1. | 
| timescores | A numeric vector specifying the time scores for the model. If  | 
| timescores_indiv | A logical value indicating whether to use individually varying times of observation for the outcome variable. Default is  | 
| output | A character vector specifying the requested Mplus output options for the model. | 
| plot | A character string specifying the requested Mplus plot options for the model. | 
| save | A character string specifying the type of results to be saved by Mplus. | 
Details
The LGMobject function facilitates and automates the appropriate model specification for conducting latent growth modeling in Mplus.
It creates the relevant sections of an Mplus input file, including: TITLE, VARIABLE, ANALYSIS, MODEL, OUTPUT, PLOT, and SAVEDATA.
This function builds upon the capabilities of the mplusObjectfunction
from the MplusAutomation package.
Value
A list of class mplusObject with elements specifying sections of an Mplus input file for conducting latent growth modeling.
See Also
mplusObject for creating an mplusObject.
runLGM for conducting latent growth modelling with an mplusObject.
Examples
# Example usage:
GBTM_object <- LGMobject(
  data = symptoms,
  outvar = paste("sx", seq(from = 0, to = 24, by = 6), sep = "_"),
  idvar = "id",
  catvar = FALSE,
  k = 3L,
  starting_val = 500,
  lgm_type = "gbtm",
  polynomial = 3,
  timescores = seq(from = 0, to = 24, by = 6),
  timescores_indiv = FALSE,
  output = c("TECH1", "TECH14", "SAMPSTAT", "STANDARDIZED"),
  plot = "PLOT3",
  save = "FSCORES"
  )
Fit Group-Based Trajectory Models (GBTM) for class enumeration.
Description
Perform class enumeration by fitting a series of GBTM in Mplus across a predetermined range of classes, and returning a list of fitted models for evaluation and comparison.
Usage
fitGBTM(
  data,
  outvar,
  catvar = FALSE,
  idvar,
  min_k = 2L,
  max_k = 6L,
  starting_val = 500,
  polynomial = 1,
  timescores,
  timescores_indiv = FALSE,
  estimator = c("MLR", "ML", "WLSMV", "WLS"),
  transformation = c("LOGIT", "PROBIT"),
  output = c("TECH1", "TECH11", "SAMPSTAT", "STANDARDIZED"),
  plot = "PLOT3",
  save = "FSCORES",
  wd = "Results"
)
Arguments
| data | A data frame containing all variables for the trajectory analysis. | 
| outvar | A character vector specifying the outcome variables at different times. | 
| catvar | A logical value indicating whether the outcome variable is categorical. Default is  | 
| idvar | A character string specifying the ID variable. | 
| min_k | An integer specifying the minimum number of latent classes to evaluate. Default is 2. | 
| max_k | An integer specifying the maximum number of latent classes to evaluate. Default is 6. | 
| starting_val | A numeric value specifying the number of random starting values to generate for the initial optimization stage. Note that the number of final stage optimizations will be set as equal to half of this value. | 
| polynomial | An integer specifying the order of the polynomial used to model trajectories. Supported values are: 1 (linear), 2 (quadratic), 3 (cubic). Default is 1. | 
| timescores | A numeric vector specifying the time scores for the model. If  | 
| timescores_indiv | A logical value indicating whether to use individually varying times of observation for the outcome variable. Default is  | 
| estimator | A character string to specify the estimator to use in the analysis. Default is 'MLR'. | 
| transformation | A character string to specify the latent response variable transformation to use when the outcome variable is categorical. Default is  | 
| output | A character vector specifying the requested Mplus output options for the model. | 
| plot | A character string specifying the requested Mplus plot options for the model. Default is PLOT3. | 
| save | A character string specifying the type of results to be saved by Mplus. Default is FSCORES. | 
| wd | A character string specifying the directory where the results folder will be created for saving Mplus input, output, and data files. Default is the current working directory. | 
Details
The fitGBTM function automates the process of fitting GBTM, iterating through an increasing number of class.
This function is designed for conducting class enumeration and help identifying the optimal number of latent classes.
GBTM should converge the quickest to a solution given its lower number of free parameters when compared to other LGM.
The function operates as follows:
- 1. Iterate over an increasing number of classes, ranging from - min_kto- max_k.
- 2. Create GBTM - mplusObjectwith appropriate class specification using the- LGMobjectfunction.
- 3. Fit models using the - runLGMfunction, ensuring convergence by increasing the number of random starting values until the best log-likelihood is replicated.
- 4. Return a list of - mplusObjectincluding results for the fitted GBTM models with each class structures.
The function automates the procedure outlined for model selection in: Van Der Nest et al,. (2020). "An overview of mixture modelling for latent evolutions in longitudinal data: Modelling approaches, fit statistics and software." Advances in Life Course Research 43: 100323.
This function builds upon the capabilities of the mplusObject and mplusModeler functions
from the MplusAutomation package.
Value
A list of mplusObject including the fitted GBTM models for each class specification.
See Also
LGMobject for creating the mplusObject of a latent growth model.
runLGM for conducting latent growth modelling with an mplusObject.
Examples
# Example usage:
GBTM_models <- fitGBTM(
  data = symptoms,
  outvar = paste("sx", seq(from = 0, to = 24, by = 6), sep = "_"),
  catvar = FALSE,
  idvar = "id",
  starting_val = 500,
  min_k = 2L,
  max_k = 6L,
  timescores = seq(from = 0, to = 24, by = 6),
  timescores_indiv = FALSE,
  polynomial = 1,
  output = c("TECH1", 'TECH14', "SAMPSTAT", "STANDARDIZED"),
  plot = "PLOT3",
  save = "FSCORES",
  wd = file.path("Results", "Trajectories")
)
# Accessing the model:
GBTM2 <- GBTM_models[[1]] #with 2 latent classes
GBTM3 <- GBTM_models[[2]] #with 3 latent classes
GBTM4 <- GBTM_models[[3]] #with 4 latent classes
Fit Growth Curve Models (GCM)
Description
Customize and execute GCM in Mplus, offering flexibility in model configuration and parameter estimation.
Usage
fitGCM(
  data,
  outvar,
  catvar = FALSE,
  idvar,
  starting_val = 500,
  polynomial = 1,
  timescores = timescores,
  timescores_indiv = FALSE,
  estimator = c("MLR", "ML", "WLSMV", "WLS"),
  transformation = c("LOGIT", "PROBIT"),
  output = c("TECH1", "SAMPSTAT", "STANDARDIZED"),
  plot = "PLOT3",
  save = "FSCORES",
  wd = "Results"
)
Arguments
| data | A data frame containing all variables for the trajectory analysis. | 
| outvar | A character vector specifying the outcome variables at different times. | 
| catvar | A logical value indicating whether the outcome variable is categorical. Default is  | 
| idvar | A character string specifying the ID variable. | 
| starting_val | A numeric value specifying the number of random starting values to generate for the initial optimization stage. Note that the number of final stage optimizations will be set as equal to half of this value. | 
| polynomial | An integer specifying the order of the polynomial used to model trajectories. Supported values are: 1 (linear), 2 (quadratic), 3 (cubic). Default is 1. | 
| timescores | A numeric vector specifying the time scores for the model. If  | 
| timescores_indiv | A logical value indicating whether to use individually varying times of observation for the outcome variable. Default is  | 
| estimator | A character string to specify the estimator to use in the analysis. Default is 'MLR'. | 
| transformation | A character string to specify the latent response variable transformation to use when the outcome variable is categorical. Default is  | 
| output | A character vector specifying the requested Mplus output options for the model. | 
| plot | A character string specifying the requested Mplus plot options for the model. | 
| save | A character string specifying the type of results to be saved by Mplus. | 
| wd | A character string specifying the directory where the results folder will be created for saving Mplus input, output, and data files. Default is the current working directory. | 
Details
The fitGCM function automates the process of specifying, customizing and fitting GCM in Mplus.
This function builds upon the capabilities of the mplusObject and mplusModeler functions
from the MplusAutomation package.
Value
A list of class mplusObjects including results for the fitted GCM.
See Also
LGMobject for creating the mplusObject of a latent growth model.
runLGM for conducting latent growth modelling with an mplusObject.
Examples
# Example usage:
GCM_model <- fitGCM(
  data = symptoms,
  outvar = paste("sx", seq(from = 0, to = 24, by = 6), sep = "_"),
  catvar = FALSE,
  idvar = "id",
  starting_val = 500,
  polynomial = 3,
  timescores = seq(from = 0, to = 24, by = 6),
  timescores_indiv = FALSE,
  output = c("TECH1", "SAMPSTAT", "STANDARDIZED"),
  plot = "PLOT3",
  save = "FSCORES",
  wd = file.path("Results", "Trajectories")
  )
Fit Latent Class Growth Analysis (LCGA) models to refine covariance structure.
Description
Refine the residual variance structure by fitting a series of LCGA models progressively allowing for the dependence of residuals on time and/or class, and returning a list of fitted models for evaluation and comparison.
Usage
fitLCGA(
  data,
  outvar,
  catvar = FALSE,
  idvar,
  k,
  starting_val = 500,
  polynomial = 1,
  timescores,
  timescores_indiv = FALSE,
  estimator = c("MLR", "ML", "WLSMV", "WLS"),
  transformation = c("LOGIT", "PROBIT"),
  output = c("TECH1", "TECH11", "SAMPSTAT", "STANDARDIZED"),
  plot = "PLOT3",
  save = "FSCORES",
  wd = "Results"
)
Arguments
| data | A data frame containing all variables for the trajectory analysis. | 
| outvar | A character vector specifying the outcome variables at different times. | 
| catvar | A logical value indicating whether the outcome variable is categorical. Default is  | 
| idvar | A character string specifying the ID variable. | 
| k | An integer specifying the number of latent classes for the model. | 
| starting_val | A numeric value specifying the number of random starting values to generate for the initial optimization stage. Note that the number of final stage optimizations will be set as equal to half of this value. | 
| polynomial | An integer specifying the order of the polynomial used to model trajectories. Supported values are: 1 (linear), 2 (quadratic), 3 (cubic). Default is 1. | 
| timescores | A numeric vector specifying the time scores for the model. If  | 
| timescores_indiv | A logical value indicating whether to use individually varying times of observation for the outcome variable. Default is  | 
| estimator | A character string to specify the estimator to use in the analysis. Default is 'MLR'. | 
| transformation | A character string to specify the latent response variable transformation to use when the outcome variable is categorical. Default is  | 
| output | A character vector specifying the requested Mplus output options for the model. | 
| plot | A character string specifying the requested Mplus plot options for the model. | 
| save | A character string specifying the type of results to be saved by Mplus. | 
| wd | A character string specifying the directory where the results folder will be created for saving Mplus input, output, and data files. Default is the current working directory. | 
Details
The fitLCGA function automates the process of fitting LCGA models, iterating through 3 varying residual variance specifications:
- - Relaxed residual variance across time 
- - Relaxed residual variance across class 
- - Relaxed residual variance across both time and class 
This function is designed to help identify the optimal residual variance structure while examining convergence issues as model complexity increases.
The function operates as follows:
- 1. Iterate over the 3 residual variance specifications 
- 2. Create LCGA - mplusObjectwith appropriate residual variance specification using the- LGMobjectfunction.
- 3. Fit models using the - runLGMfunction, ensuring convergence by increasing the number of random starting values until the best log-likelihood is replicated.
- 4. Return a list of - mplusObjectincluding results for the fitted LCGA models with each residual variance structures
The function automates the procedure outlined for model selection in: Van Der Nest et al., (2020). "An overview of mixture modelling for latent evolutions in longitudinal data: Modelling approaches, fit statistics and software." Advances in Life Course Research 43: 100323.
This function builds upon the capabilities of the mplusObject and mplusModeler functions
from the MplusAutomation package.
Value
A list of mplusObject including results for the fitted LCGA models.
See Also
LGMobject for creating the mplusObject of a latent growth model.
runLGM for conducting latent growth modelling with an mplusObject.
Examples
# Example usage:
LCGA_models <- fitLCGA(
 data = symptoms,
 outvar = paste('sx', seq(from = 0, to = 24, by = 6), sep = "_"),
 catvar = FALSE,
 idvar = "id",
 starting_val = 500,
 k = 3L,
 timescores = seq(from = 0, to = 24, by = 6),
 timescores_indiv = FALSE,
 polynomial = 3,
 output = c('TECH1', 'TECH14', 'SAMPSTAT', 'STANDARDIZED'),
 wd = file.path('Results', 'Trajectories')
)
# Accessing the models:
LCGA_t <- LCGA_models[[1]] #with relaxed residual variance across time
LCGA_c <- LCGA_models[[2]] #with relaxed residual variance across class
LCGA_tc <- LCGA_models[[3]] #with relaxed residual variance across time and class
Select best-fitting model from a list of Latent Growth Models (LGM)
Description
Identify and extract the best-fitting model from a list of LGM based on a specified set of criteria applied to a summary table of the models fit indices.
Usage
getBest(
  lgm_object,
  ic = c("BIC", "aBIC", "AIC", "CAIC", "AICC"),
  lrt = c("none", "aLRT", "BLRT"),
  p = 0.05
)
Arguments
| lgm_object | A list of LGM  | 
| ic | A character string specifying the information criterion (IC) to use for selecting the best-fitting model. Supported options are Bayesian Information Criterion (BIC), sample-size-adjusted BIC (aBIC), Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (CAIC), and AIC corrected (AICC). The default is BIC. | 
| lrt | A character string specifying the likelihood ratio test (LRT) to use for selecting the best-fitting model. Supported options are Bootstrap LRT (BLRT) and Lo-Mendel-Rubin adjusted LRT (aLRT). Default is "none", the selection of the the best-fitting model is only made based on the selected IC. | 
| p | A numeric value specifying the p-value threshold for statistical significance when using LRT-based selection of the best-fitting model. Default is 0.05. | 
Details
The function select the best-fitting model based on the following criteria:
- 1. Models with convergence errors are excluded. 
- 2. The model with the lowest information criterion (IC) is selected. 
- 3. If specified, the likelihood ratio test (LRT) is used to determine whether the K-class model can be reduced to K-1 classes. 
- 4. The resulting model throw a warning if it meet any of the following conditions: - - Entropy is below 0.5. 
- - Any class has an average posterior probability of assignment (APPA) below 0.7. 
- - Any class represents less than 5% of the sample size. 
 
Value
The LGM mplusObject of the best-fitting model.
Examples
# Example usage:
GBTM_best <- getBest(
  lgm_object = GBTM_models,
  ic = "BIC",
  lrt = "aLRT",
  p = 0.05
)
best_fit <- getFit(GBTM_best)
print(best_fit)
Get fit indices from Latent Growth Models (LGM)
Description
Extract key information from Mplus LGM objects, including model summaries, fit statistics, class details, warnings, and errors. The function accounts for non-converging models and compiles the extracted information into a single data frame to facilitate model evaluation and comparison.
Usage
getFit(lgm_object)
Arguments
| lgm_object | A single LGM  | 
Details
- - Model summaries such as the title, log-likelihood value and number of observations, parameters and latent classes. 
- - Model fit indices such as the BIC, aBIC, AIC, AICC and CAIC along with statistics from the BLRT and adjusted LMR-LRT, if requested. 
- - Latent class counts and proportions. 
- - Classification confidence measures such as the average posterior probabilities (APPA) and entropy. 
- - Mplus warnings or errors encountered during model estimation. 
This output facilitates side-by-side comparison of models to support model evaluation and selection.
Value
A data frame with a row for each LGM of the input list.
Examples
# Example usage:
fit_indices <- getFit(lgm_object = GCM_model)
fit_indices <- getFit(lgm_object = list(GCM_model, GBTM_models, LCGA_models))
print(fit_indices)
Refine Polynomial Order in Latent Growth Modelling (LGM)
Description
Refine the polynomial order for each class of a LGM by iteratively removing non-significant growth factors, and running the updated models.
Usage
getPoly(lgm_object, wd = "Results")
Arguments
| lgm_object | A LGM  | 
| wd | A character string specifying the directory where the results folder will be created for saving Mplus input, output, and data files. Default is the current working directory. | 
Details
The getPoly function refines the polynomial order of a LGM mplusObject through an iterative process.
In addition to ensuring the statistical significance of growth factors in each latent class,
the function ensure that the best loglikelihood value of the updated model is replicated.
The function works as follows:
- 1. Extract model information from the provided LGM - mplusObject.
- 2. Evaluate the statistical significance of the highest-order growth factor in each class. 
- 3. Remove non-significant growth factors (p-value > 0.05) from the model. 
- 4. Update the LGM - mplusObjectto reflect changes in the growth factor structure.
- 5. Re-run the updated - mplusObjectuntil log-likelihood values are successfully replicated using the- runLGMfunction.
- 6. Repeat the process until the highest-order growth factor of all classes are statistically significant or reduce to intercept-only. 
The function automates the procedure outlined for model selection in: Van Der Nest et al,. (2020). "An overview of mixture modelling for latent evolutions in longitudinal data: Modelling approaches, fit statistics and software." Advances in Life Course Research 43: 100323.
Value
A LGM mplusObject including the results of the updated model with the refined polynomial order.
See Also
LGMobject for creating the mplusObject of a latent growth model.
runLGM for conducting latent growth modelling with a mplusObject.
Examples
# Example usage:
final_model <- getPoly(
  lgm_object = LCGA_best,
  wd = "Results"
  )
final_fit <- getFit(final_model)
 print(final_fit)
Plot individual trajectories of outcome - Spaghetti plot
Description
Generate a spaghetti plot to visualize the individual trajectories of a given outcome across time..
Usage
getSpaghetti(data, outvar)
Arguments
| data | A data frame containing all variables for the trajectory analysis. | 
| outvar | A character vector specifying the outcome variables at different times. | 
Value
A ggplot object displaying the spaghetti plot of individual trajectories.
Examples
# Example usage:
plot <- getSpaghetti(
  data = symptoms,
  outvar = paste("sx", seq(from = 0, to = 24, by = 6), sep = "_"))
print(plot)
Run Latent Growth Models (LGM) and replicate the best loglikelihood value (LL)
Description
Run iterations of an LGM, doubling the number of starting values until the best LL value has replicated at least twice, both within and between models.
Usage
runLGM(lgm_object, wd)
Arguments
| lgm_object | An  | 
| wd | A character string specifying the directory where the results folder will be created for saving the Mplus input, output, and data files. Default is the current working directory. | 
Details
The runLGM function run iterations of an LGM in Mplus while gradually increasing the number of random starting values used to optimize the loglikelihood.
This approach aims to prevent estimation issues related to local maxima, which can result in selecting the inappropriate model during class enumeration.
The function works as follows:
- 1. Estimate the model using the predefined number of random starting values. 
- 2. Rerun the model with double the number of starting values. 
- 3. Continue until the best LL value is successfully replicated both within the model and between 2 consecutive model run, or the maximum number of allowed starting values is reached. By default the maximum number of allowed starting values is set 2 times the number of initial starting values raised to the power of 5. 
- 4. Return the - mplusObjectfrom the replicated model.
This function builds upon the capabilities of the mplusModeler function
from the MplusAutomation package.
Value
A list of class mplusObject including results for the replicated model, alongside with :
- - The Mplus input and data files used for the model. 
- - The output files generated by Mplus. 
- - The data results files saved by Mplus. 
See Also
mplusModeler for running, and reading an mplusObject.
LGMobject for creating the mplusObject for a latent growth model.
Examples
# Example usage:
GBTM_model <-  runLGM(
  lgm_object = GBTM_object,
  wd = file.path("Results", "Trajectories"))
Symptoms Data
Description
A simulated, longitudinal dataset capturing symptom severity with an arbitrary scale (total score: 0-28), over a 2-year follow-up period for 350 individuals. The data is not normally distributed and exhibits heterogeneity, including latent (unobserved) trajectories of symptom severity that reflect diverse progression patterns across individuals. The dataset contains no missing data.
Usage
symptoms
Format
A dataframe with 1 row per individual, 350 observations and 10 variables.
- id
- Individual identifier, numeric. 
- sx_0
- Symptoms severity at month 0, numeric. 
- sx_3
- Symptoms severity at month 3, numeric. 
- sx_6
- Symptoms severity at month 6, numeric. 
- sx_9
- Symptoms severity at month 9, numeric. 
- sx_12
- Symptoms severity at month 12, numeric. 
- sx_15
- Symptoms severity at month 15, numeric. 
- sx_18
- Symptoms severity at month 18, numeric. 
- sx_21
- Symptoms severity at month 21, numeric. 
- sx_24
- Symptoms severity at month 24, numeric. 
Source
Data simulated using modgo