Effect Size Statistics for Anova Tables #rstats

My sjstats-package has been updated on CRAN. The past updates introduced new functions for various purposes, e.g. predictive accuracy of regression models or improved support for the marvelous glmmTMB-package. The current update, however, added some ANOVA tools to the package.

In this post, I want to give a short overview of these new functions, which report different effect size measures. These are useful beyond significance tests (p-values), because they estimate the magnitude of effects, independent of sample size. sjstats provides the following functions:

  • eta_sq()
  • omega_sq()
  • cohens_f()
  • anova_stats()

First, we need a sample model:

library(sjstats)

# load sample data (the efc data set used below)
data(efc)

# fit linear model
fit <- aov(
  c12hour ~ as.factor(e42dep) + as.factor(c172code) + c160age,
  data = efc
)
All functions accept objects of class aov or anova, so you can also use model fits from the car-package, which allows fitting ANOVAs with different types of sums of squares. Other objects, like lm, will be coerced to anova internally.

The following functions return the effect size statistic as a named numeric vector, using the model’s term names.

Eta Squared

Eta squared is the proportion of the total variability in the dependent variable that is accounted for by the variation in the independent variable. It is the ratio of the sum of squares for each term to the total sum of squares. It can be interpreted as the percentage of variance accounted for by a variable.

For variables with 1 degree of freedom (in the numerator), the square root of eta squared is equal to the correlation coefficient r. For variables with more than 1 degree of freedom, eta squared equals R². This makes eta squared easily interpretable. Furthermore, these effect sizes can easily be converted into effect size measures that can be, for instance, further processed in meta-analyses.

Eta squared can be computed simply with:

eta_sq(fit)
#>   as.factor(e42dep) as.factor(c172code)             c160age 
#>         0.266114185         0.005399167         0.048441046
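To make the definition concrete, here is a minimal base-R sketch that computes eta squared by hand from an ANOVA table. It is not the sjstats implementation; the built-in warpbreaks data and the object names (fit_wb, tab, ss, eta2) are assumptions chosen so the example runs without extra packages.

```r
# Base-R sketch of the definition; this is not the sjstats source code.
# Assumption: the built-in warpbreaks data stands in for the efc example.
fit_wb <- aov(breaks ~ wool + tension, data = warpbreaks)

tab <- anova(fit_wb)                             # sequential (type I) ANOVA table
ss  <- setNames(tab[["Sum Sq"]], rownames(tab))  # sums of squares by row name
ss_terms <- ss[names(ss) != "Residuals"]

# eta squared: sum of squares of each term divided by the total sum of squares
eta2 <- ss_terms / sum(ss)
print(eta2)

# with sequential sums of squares, the values add up to the model's R squared
r2 <- summary(lm(breaks ~ wool + tension, data = warpbreaks))$r.squared
print(all.equal(sum(eta2), r2))
```

The same ratio reproduces the value printed above for e42dep: using the sums of squares from the summary table further down, 577756.33 divided by the total of about 2171084.32 gives roughly 0.266.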

Partial Eta Squared

The partial eta squared value is the ratio of the sum of squares for each term to the sum of squares for that term plus the residual sum of squares. It is more difficult to interpret, because its value strongly depends on the variability of the residuals. Partial eta squared values should be reported with caution, and Levine and Hullett (2002) recommend reporting eta or omega squared rather than partial eta squared.

Use the partial-argument to compute partial eta squared values:

eta_sq(fit, partial = TRUE)
#>   as.factor(e42dep) as.factor(c172code)             c160age 
#>         0.281257128         0.007876882         0.066495448
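The difference to plain eta squared is only the denominator. The following base-R sketch (again on the built-in warpbreaks data, with hypothetical object names, and not taken from the sjstats source) computes both side by side.

```r
# Base-R sketch; assumption: warpbreaks stands in for the efc example.
fit_wb <- aov(breaks ~ wool + tension, data = warpbreaks)

tab <- anova(fit_wb)
ss  <- setNames(tab[["Sum Sq"]], rownames(tab))
ss_resid <- ss[["Residuals"]]
ss_terms <- ss[names(ss) != "Residuals"]

# eta squared: term SS relative to the total SS
eta2 <- ss_terms / sum(ss)

# partial eta squared: term SS relative to term SS plus residual SS only
partial_eta2 <- ss_terms / (ss_terms + ss_resid)

print(rbind(eta2, partial_eta2))
```

Because the denominator leaves out the sums of squares of all other terms, partial eta squared can never be smaller than eta squared for the same term.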

Omega Squared

While eta squared estimates tend to be biased in certain situations, e.g. when the sample size is small or the independent variables have many group levels, omega squared estimates are corrected for this bias.

Omega squared can be computed simply with:

omega_sq(fit)
#>   as.factor(e42dep) as.factor(c172code)             c160age 
#>         0.263453157         0.003765292         0.047586841
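A common textbook formula for omega squared is (SS_term - df_term * MS_residual) / (SS_total + MS_residual). The values printed above are consistent with this formula when it is applied to the sums of squares in the summary table further down, but the sketch below is only a base-R illustration on the built-in warpbreaks data (an assumption), not the sjstats code.

```r
# Base-R sketch of a common omega squared formula; not the sjstats source code.
fit_wb <- aov(breaks ~ wool + tension, data = warpbreaks)

tab <- anova(fit_wb)
is_resid <- rownames(tab) == "Residuals"

ss       <- setNames(tab[["Sum Sq"]], rownames(tab))
ss_terms <- ss[!is_resid]
df_terms <- tab[["Df"]][!is_resid]
ms_resid <- tab[["Mean Sq"]][is_resid]

# the correction subtracts the SS expected from noise alone (df * MS_residual)
omega2 <- (ss_terms - df_terms * ms_resid) / (sum(ss) + ms_resid)

# plain eta squared for comparison
eta2 <- ss_terms / sum(ss)
print(rbind(eta2, omega2))
```

The correction shrinks every estimate, and it shrinks terms with many degrees of freedom the most, which is where eta squared tends to be most optimistic.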

Cohen’s F

Finally, cohens_f() computes Cohen’s F effect size for all independent variables in the model:

cohens_f(fit)
#>   as.factor(e42dep) as.factor(c172code)             c160age 
#>          0.62555427          0.08910342          0.26689334
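The values above are consistent with the usual conversion from partial eta squared, f = sqrt(partial eta² / (1 - partial eta²)); for e42dep, sqrt(0.2813 / 0.7187) is roughly 0.6256. Here is a minimal base-R sketch of that conversion, assuming the built-in warpbreaks data as a stand-in; it is an illustration, not the package's implementation.

```r
# Base-R sketch: Cohen's f derived from partial eta squared.
fit_wb <- aov(breaks ~ wool + tension, data = warpbreaks)

tab <- anova(fit_wb)
ss  <- setNames(tab[["Sum Sq"]], rownames(tab))
ss_resid <- ss[["Residuals"]]
ss_terms <- ss[names(ss) != "Residuals"]

partial_eta2 <- ss_terms / (ss_terms + ss_resid)

# f = sqrt(pes / (1 - pes)), which simplifies to sqrt(SS_term / SS_residual)
cohens_f_manual <- sqrt(partial_eta2 / (1 - partial_eta2))
print(cohens_f_manual)
```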

Complete Statistical Table Output

The anova_stats() function takes a model input and computes a comprehensive summary, including the above effect size measures, returned as a tidy data frame (as tibble, to be exact):

anova_stats(fit)
#> # A tibble: 4 x 11
#>                  term    df      sumsq     meansq statistic p.value etasq partial.etasq omegasq cohens.f power
#> 1   as.factor(e42dep)     3  577756.33 192585.444   108.786   0.000 0.266         0.281   0.263    0.626  1.00
#> 2 as.factor(c172code)     2   11722.05   5861.024     3.311   0.037 0.005         0.008   0.004    0.089  0.63
#> 3             c160age     1  105169.60 105169.595    59.408   0.000 0.048         0.066   0.048    0.267  1.00
#> 4           Residuals   834 1476436.34   1770.307        NA      NA    NA            NA      NA       NA    NA

Like the other functions, the input may also be an object of class anova, so you can also use model fits from the car package, which allows fitting ANOVAs with different types of sums of squares:

anova_stats(car::Anova(fit, type = 3))
#> # A tibble: 5 x 11
#>                  term       sumsq     meansq    df statistic p.value etasq partial.etasq omegasq cohens.f power
#> 1         (Intercept)   26851.070  26851.070     1    15.167   0.000 0.013         0.018   0.012    0.135 0.973
#> 2   as.factor(e42dep)  426461.571 142153.857     3    80.299   0.000 0.209         0.224   0.206    0.537 1.000
#> 3 as.factor(c172code)    7352.049   3676.025     2     2.076   0.126 0.004         0.005   0.002    0.071 0.429
#> 4             c160age  105169.595 105169.595     1    59.408   0.000 0.051         0.066   0.051    0.267 1.000
#> 5           Residuals 1476436.343   1770.307   834        NA      NA    NA            NA      NA       NA    NA
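Why the type of sums of squares matters can be illustrated without car. The base-R sketch below uses the built-in mtcars data (an assumption, as are all object names) to show that sequential (type I) sums of squares depend on the order of the terms, while the marginal sums of squares from drop1() do not. For a model without interactions, these marginal values correspond to what is usually called type II.

```r
# Base-R sketch: sequential vs. marginal sums of squares with correlated predictors.
m_wt_first <- lm(mpg ~ wt + hp, data = mtcars)
m_hp_first <- lm(mpg ~ hp + wt, data = mtcars)   # same model, terms reordered

seq_ss <- function(model) {
  tab <- anova(model)                            # sequential (type I) table
  setNames(tab[["Sum Sq"]], rownames(tab))
}
ss_wt_first <- seq_ss(m_wt_first)
ss_hp_first <- seq_ss(m_hp_first)

# marginal SS: each term is tested after all other terms are in the model
marg <- drop1(m_wt_first, test = "F")
ss_marginal <- setNames(marg[["Sum of Sq"]], rownames(marg))

print(ss_wt_first)
print(ss_hp_first)
print(ss_marginal)
```

In each sequential table, only the term entered last has the same sum of squares as in the marginal table. Effect sizes derived from these tables differ accordingly, which is why the two anova_stats() outputs above do not report the same eta squared for e42dep.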


Levine TR, Hullett CR. Eta Squared, Partial Eta Squared, and Misreporting of Effect Size in Communication Research. Human Communication Research 28(4); 2002: 612-625

14 comments on “Effect Size Statistics for Anova Tables #rstats”

  1. Simple and tasteful.
    I just regret that your mention of car::Anova was not more explicit about the differences between the sums of squares involved (type I to type III or so, maybe).

  3. Hi daniel,
    Thanks for these clarifications. You say that you’ve implemented functions for glmmTMB as well, but does anova_stats() work for glmmTMB outputs? Thanks in advance.

    1. No, anova_stats() is only for Anova-objects. The other functions implemented for glmmTMB objects are, for instance icc() or so.

      1. Thanks for the answer… and too bad. Any chance you know a function that does more or less an anova-like analysis of a glmmTMB output? lsmeans is kind of doing it, but it’d be nice to have an all-in-one function.
        Anyway, thanks for your work.

    1. Indeed, but I have a continuous variable in interaction with two factors, and I would like to have the overall effect on this continuous variable (this is what I call anova-like). lsmeans and ggeffects work only for factors.

      1. No, ggeffects does work for continuous variables as well. There is the emm() function for the marginal mean of the response, or you can use ggpredict() when you’re interested in predictors as well. There is also the emmeans package, which is the successor of lsmeans and by the same author. Maybe that package is also more flexible than lsmeans?

  4. Hi Daniel, this looks like a really cool package, but I am having problems with getting output for my mixed ANOVA:
    I run the following mixed ANOVA:

    aov <- aov(measurement ~ 5_LEVEL_BETWEEN_IV * 2_LEVEL_WITHIN_IV + Error(Participant_ID/2_LEVEL_WITHIN_IV), data=dataframe)


    #that works and I then turn to sjstats to get effect sizes (having successfully installed sjstats):


    eta_sq(aov, partial = TRUE)

    #this does not work, and neither does the following:

    anova_stats(car::Anova(aov, type = 1))

    #the error message for both the above is:

    Error in UseMethod("anova") :
    no applicable method for 'anova' applied to an object of class "c('aovlist', 'listof')"

    #Then I guessed that the error could be attributed to what you write at the end of your blog "the input may also be an object of class anova, so you can also use model fits from the car package, which allows fitting Anova’s with different types of sum of squares"…

    #So I wrote the following command

    anova_stats(car::Anova(aov, type = 3))

    #and it returned the following error:

    Error in vcov.default(mod, complete = FALSE) :
    there is no vcov() method for models of class aovlist, listof

    Please can you inform me if you spot any reasons for these errors – I figure others are bound to encounter similar problems?

