Estimates latent profiles (finite mixture models) using either the open-source packages mclust or OpenMx, or the commercial program Mplus (through the R interface provided by MplusAutomation).

estimate_profiles(
  df,
  n_profiles,
  models = NULL,
  variances = "equal",
  covariances = "zero",
  package = "mclust",
  select_vars = NULL,
  ...
)

Arguments

df

data.frame of numeric data; continuous indicators are required for mixture modeling.

n_profiles

Integer vector of the number of profiles (or mixture components) to be estimated.

models

Integer vector. Set to NULL by default, and models are constructed from the variances and covariances arguments. See Details for the six models available in tidyLPA.

variances

Character vector. Specifies which variance components to estimate. Defaults to "equal" (constrain variances across profiles); the other option is "varying" (estimate variances freely across profiles). Each element of this vector refers to one of the models you wish to run.

covariances

Character vector. Specifies which covariance components to estimate. Defaults to "zero" (do not estimate covariances; this corresponds to an assumption of conditional independence of the indicators); other options are "equal" (estimate covariances between items, constrained across profiles), and "varying" (free covariances across profiles).

package

Character. Which package to use: 'mclust', 'OpenMx', or 'MplusAutomation' (requires Mplus to be installed). Default: 'mclust'.

select_vars

Character. Optional vector of variable names in df, to be used for model estimation. Defaults to NULL, which means all variables in df are used.

...

Additional arguments are passed to the estimating function; i.e., mxRun, Mclust, or mplusModeler.

Value

A list of class 'tidyLPA'.
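
As a rough sketch of how the returned object might be inspected, the snippet below fits one solution and pulls out its pieces. It assumes that tidyLPA and mclust are installed and that the accessor functions get_fit(), get_estimates(), and get_data(), which are not described on this page, are available in the installed version of tidyLPA.

```r
library(tidyLPA)

# Fit a single 3-profile solution on three iris indicators,
# chosen with select_vars rather than by subsetting the data first.
fit <- estimate_profiles(
  iris,
  n_profiles = 3,
  select_vars = c("Sepal.Length", "Sepal.Width", "Petal.Length")
)

class(fit)           # should include "tidyLPA", per the Value section

# Assumed accessors (not documented on this page):
get_fit(fit)         # fit indices for each estimated solution
get_estimates(fit)   # parameter estimates
head(get_data(fit))  # the data used, with profile assignments
```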

Details

Six models are currently available in tidyLPA, corresponding to the most common requirements. All models estimate the observed variable means for each class. The remaining parameters are:

  1. Equal variances across classes; no covariances between observed variables

  2. Varying variances across classes; no covariances between observed variables

  3. Equal variances and equal covariances across classes

  4. Varying variances and equal covariances (not available for package = 'mclust')

  5. Equal variances and varying covariances (not available for package = 'mclust')

  6. Varying variances and varying covariances
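
To make the constraints above concrete, here is a minimal base-R sketch of the within-profile covariance matrices implied by some of these models for two indicators and two profiles. The helper profile_sigma() is hypothetical (not part of tidyLPA), it simplifies by using a single value for every off-diagonal element, and all numbers are made up for illustration.

```r
# Hypothetical helper (not part of tidyLPA): build a covariance matrix
# from a vector of variances and a single common covariance value.
profile_sigma <- function(variances, covariance = 0) {
  p <- length(variances)
  sigma <- matrix(covariance, nrow = p, ncol = p)
  diag(sigma) <- variances
  sigma
}

# Illustrative values only; these are not estimates from any data set.
v_equal   <- c(1.0, 2.0)                     # variances shared by all profiles
v_profile <- list(c(1.0, 2.0), c(0.5, 3.0))  # variances that differ by profile

# Model 1: equal variances, zero covariances (same diagonal matrix everywhere)
model_1 <- list(profile_sigma(v_equal), profile_sigma(v_equal))

# Model 2: varying variances, zero covariances (diagonal, differs by profile)
model_2 <- lapply(v_profile, profile_sigma)

# Model 3: equal variances and equal covariances (one shared full matrix)
model_3 <- list(profile_sigma(v_equal, 0.3), profile_sigma(v_equal, 0.3))

# Model 6: varying variances and varying covariances (one full matrix per profile)
model_6 <- list(profile_sigma(v_profile[[1]], 0.3),
                profile_sigma(v_profile[[2]], -0.2))

model_1[[1]]
model_6[[2]]
```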

Two interfaces are available to estimate these models: specify their numbers in the models argument (e.g., models = 1, or models = c(1, 2, 3)), or specify the variances and covariances to be estimated (e.g., variances = c("equal", "varying"), covariances = c("zero", "equal")). Note that models 4 and 5 are not available when package = 'mclust'; use package = 'OpenMx' or package = 'MplusAutomation' to estimate these models.
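
The correspondence between the two interfaces can be sketched as a small lookup table, transcribed from the numbered list above. The names model_specs and spec_to_model() are hypothetical and only illustrate the mapping; they are not part of tidyLPA. Following the description of the variances argument, elements are paired position by position.

```r
# Lookup table transcribed from the six-model list in Details.
model_specs <- data.frame(
  model       = 1:6,
  variances   = c("equal", "varying", "equal", "varying", "equal", "varying"),
  covariances = c("zero", "zero", "equal", "equal", "varying", "varying"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of tidyLPA): translate paired
# variances/covariances settings into model numbers.
spec_to_model <- function(variances, covariances) {
  stopifnot(length(variances) == length(covariances))
  key <- paste(model_specs$variances, model_specs$covariances)
  model_specs$model[match(paste(variances, covariances), key)]
}

spec_to_model(c("equal", "varying"), c("zero", "equal"))
#> [1] 1 4
```

In this sketch, the pair variances = c("equal", "varying"), covariances = c("zero", "equal") resolves to models 1 and 4, so it includes a model that is not available with package = 'mclust'.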

Examples

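# Load tidyLPA; dplyr supplies the %>% pipe used below
library(tidyLPA)
library(dplyr)

# Use a subset of iris so the examples run more quickly
iris_sample <- iris[c(1:10, 51:60, 101:114), ]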

# Example 1:
iris_sample %>%
  subset(select = c("Sepal.Length", "Sepal.Width",
    "Petal.Length")) %>%
  estimate_profiles(3)
#> tidyLPA analysis using mclust: 
#> 
#>   Model Classes    AIC    BIC Entropy prob_min prob_max n_min n_max BLRT_p
#> 1     1       3 178.42 199.79    0.98     0.99     1.00  0.21  0.50   0.01

# \donttest{
# Example 2:
iris %>%
  subset(select = c("Sepal.Length", "Sepal.Width",
    "Petal.Length")) %>%
  estimate_profiles(n_profiles = 1:4, models = 1:3)
#> The 'variances'/'covariances' arguments were ignored in favor of the 'models' argument.
#> tidyLPA analysis using mclust: 
#> 
#>    Model Classes     AIC     BIC Entropy prob_min prob_max n_min n_max BLRT_p
#> 1      1       1 1150.81 1168.87    1.00     1.00     1.00  1.00  1.00       
#> 2      1       2  885.64  915.74    0.99     1.00     1.00  0.34  0.66   0.01
#> 3      1       3  765.16  807.31    0.93     0.94     1.00  0.27  0.40   0.01
#> 4      1       4  758.71  812.90    0.87     0.70     0.96  0.09  0.39   0.01
#> 5      2       1 1150.81 1168.87    1.00     1.00     1.00  1.00  1.00       
#> 6      2       2  770.96  810.10    1.00     1.00     1.00  0.33  0.67   0.01
#> 7      2       3  702.55  762.76    0.89     0.91     1.00  0.31  0.35   0.01
#> 8      2       4  695.51  776.80    0.88     0.91     0.95  0.09  0.35   0.06
#> 9      3       1  857.33  893.46    1.00     1.00     1.00  1.00  1.00       
#> 10     3       2  699.26  747.43    1.00     1.00     1.00  0.33  0.67   0.01
#> 11     3       3  656.14  716.36    0.92     0.94     1.00  0.33  0.34   0.01
#> 12     3       4  652.81  725.06    0.90     0.71     0.97  0.05  0.34   0.05

# Example 3:
iris_sample %>%
  subset(select = c("Sepal.Length", "Sepal.Width",
    "Petal.Length")) %>%
  estimate_profiles(n_profiles = 1:4, variances = c("equal", "varying"),
                    covariances = c("zero", "zero"))
#> tidyLPA analysis using mclust: 
#> 
#>   Model Classes    AIC    BIC Entropy prob_min prob_max n_min n_max BLRT_p
#> 1     1       1 266.05 275.21    1.00     1.00     1.00  1.00  1.00       
#> 2     1       2 210.85 226.12    0.97     0.98     1.00  0.32  0.68   0.01
#> 3     1       3 178.42 199.79    0.98     0.99     1.00  0.21  0.50   0.01
#> 4     1       4 182.67 210.14    0.94     0.87     1.00  0.09  0.50   0.45
#> 5     2       1 266.05 275.21    1.00     1.00     1.00  1.00  1.00       
#> 6     2       2 178.74 198.58    1.00     1.00     1.00  0.29  0.71   0.01
#> 7     2       3 161.79 192.32    0.97     0.98     1.00  0.21  0.50   0.01
#> 8     2       4 164.95 206.16    0.96     0.95     1.00  0.12  0.50   0.58
# }