Tidy summarizes information about the components of a model. A model component might be a single term in a regression, a single hypothesis, a cluster, or a class. Exactly what tidy considers to be a model component varies across models but is usually self-evident. If a model has several distinct types of components, you will need to specify which components to return.

# S3 method for Mclust
tidy(x, ...)

Arguments

x

An Mclust object returned from mclust::Mclust().

...

Additional arguments. Not used. Needed to match generic signature only. Cautionary note: Misspelled arguments will be absorbed in ..., where they will be ignored. If the misspelled argument has a default value, the default value will be used. For example, if you pass conf.lvel = 0.9, all computation will proceed using conf.level = 0.95. Additionally, if you pass newdata = my_tibble to an augment() method that does not accept a newdata argument, it will use the default value for the data argument.
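
The cautionary note above can be illustrated with a minimal base R sketch. The function toy_tidy() is a hypothetical stand-in written for this illustration only; it is not part of broom, and tidy.Mclust() itself has no conf.level argument. The sketch only shows how ... absorbs a misspelled argument without any warning.

```r
# Hypothetical stand-in for a tidier (an assumption for illustration, not broom code).
toy_tidy <- function(x, conf.level = 0.95, ...) {
  # Anything captured by `...` is never inspected, so it has no effect.
  conf.level
}

# Correctly spelled argument: the supplied value is used.
level_ok <- toy_tidy(1, conf.level = 0.9)

# Misspelled argument: it is absorbed by `...`, so the default applies silently.
level_typo <- toy_tidy(1, conf.lvel = 0.9)

print(level_ok)
print(level_typo)
```

Because no error or warning is raised for the misspelled call, it is worth double-checking argument names whenever a result looks as if a default was used.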

See also

tidy(), mclust::Mclust()

Other mclust tidiers: augment.Mclust()

Value

A tibble::tibble() with columns:

proportion

The mixing proportion of each component

size

Number of points assigned to cluster.

mean

The mean for each component. For models with two or more dimensions, a mean column is added for each dimension. NA for the noise component.

variance

For one-dimensional and spherical models, the variance for each component; omitted otherwise. NA for the noise component.

component

Cluster id as a factor.

Examples

library(broom) # provides tidy() and augment()
library(dplyr)
library(mclust)

set.seed(27)

centers <- tibble::tibble(
  cluster = factor(1:3),
  num_points = c(100, 150, 50), # number of points in each cluster
  x1 = c(5, 0, -3), # x1 coordinate of cluster center
  x2 = c(-1, 1, -2) # x2 coordinate of cluster center
)

points <- centers %>%
  mutate(
    x1 = purrr::map2(num_points, x1, rnorm),
    x2 = purrr::map2(num_points, x2, rnorm)
  ) %>%
  dplyr::select(-num_points, -cluster) %>%
  tidyr::unnest(c(x1, x2)) # c() form avoids the old-interface warning

m <- mclust::Mclust(points)

tidy(m)
#> # A tibble: 3 x 6
#>   component  size proportion variance mean.x1 mean.x2
#>       <int> <int>      <dbl>    <dbl>   <dbl>   <dbl>
#> 1         1   101      0.335     1.12  5.01     -1.04
#> 2         2   150      0.503     1.12  0.0594    1.00
#> 3         3    49      0.161     1.12 -3.20     -2.06
augment(m, points)
#> # A tibble: 300 x 4
#>       x1     x2 .class .uncertainty
#>    <dbl>  <dbl> <fct>         <dbl>
#>  1  6.91 -2.74  1          3.98e-11
#>  2  6.14 -2.45  1          1.99e- 9
#>  3  4.24 -0.946 1          1.47e- 4
#>  4  3.54  0.287 1          2.94e- 2
#>  5  3.91  0.408 1          7.48e- 3
#>  6  5.30 -1.58  1          4.22e- 7
#>  7  5.01 -1.77  1          1.06e- 6
#>  8  6.16 -1.68  1          7.64e- 9
#>  9  7.13 -2.17  1          4.16e-11
#> 10  5.24 -2.42  1          1.16e- 7
#> # … with 290 more rows
glance(m)
#> # A tibble: 1 x 7
#>   model     G    BIC logLik    df hypvol  nobs
#>   <chr> <int>  <dbl>  <dbl> <dbl>  <dbl> <int>
#> 1 EII       3 -2402. -1175.     9     NA   300