No need for SPSS – beautiful output in R #rstats

Note: There’s a second part of this series here.

About one year ago, I seriously started migrating from SPSS to R. Though I’m still using SPSS (because I have to in some situations), I’m quite comfortable and happy with R now and have learnt a lot in the past months. But since SPSS is still very widespread in the social sciences, I get asked every now and then whether I really needed to learn R, because SPSS meets all my needs…

Well, learning R had at least two major benefits for me: 1.) I could improve my statistical knowledge a lot, simply by using formulas, asking why certain R commands do not automatically give the same results as SPSS, reading R resources and papers etc. and 2.) the possibilities of data visualization are way better in R than in SPSS (though SPSS does well, too…). Of course, there are many more reasons to use R.

Still, one thing I often miss in R is a beautiful output of simple statistics or maybe even advanced statistics. Not always as a plot or graph, but not as “cryptic” console output either. I’d like to have a simple table view, just like the SPSS output window (though the SPSS output is not “beautiful”). That’s why I started writing functions that put the results of certain statistics in HTML tables. These tables can be saved to disk or, even better for quick inspection, shown in a web browser or viewer pane (such as the RStudio viewer pane).

All of the following functions are available in my sjPlot-package on CRAN.
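
To illustrate the general idea of putting results into an HTML table, here is a minimal base-R sketch. It is not how sjPlot works internally; the helper name freq_to_html and the sample values are made up for this example. It builds a small frequency table as HTML, writes it to a temporary file and, in an interactive session, shows it in the RStudio viewer pane if one is available (falling back to the browser otherwise).

```r
# Minimal sketch, NOT the sjPlot implementation.
# freq_to_html() is a hypothetical helper written for this example.
freq_to_html <- function(x) {
  counts <- table(x)                          # absolute frequencies
  perc <- 100 * as.numeric(prop.table(counts)) # relative frequencies in %
  rows <- sprintf("<tr><td>%s</td><td>%d</td><td>%.2f</td></tr>",
                  names(counts), as.integer(counts), perc)
  paste(c("<table>",
          "<tr><th>Value</th><th>N</th><th>%</th></tr>",
          rows,
          "</table>"),
        collapse = "\n")
}

# Made-up sample values
html <- freq_to_html(c("a", "b", "a", "c", "a"))

# Save the table to disk
out_file <- tempfile(fileext = ".html")
writeLines(html, out_file)

# Show the file in the RStudio viewer pane if available, else in the browser.
# Only attempted in interactive sessions.
if (interactive()) {
  viewer <- getOption("viewer")
  if (is.null(viewer)) utils::browseURL(out_file) else viewer(out_file)
}
```

The sjPlot functions described below do considerably more (labels, styling, grouping), but the basic flow of "compute, render as HTML, write to file, open in viewer" is the same kind of thing.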


(Generalized) Linear Models

The first two functions, which I already published last year, can be used to display (generalized) linear models and have been described here. Still, I want to give another short example for quickly viewing linear models:

require(sjPlot) # load package
# Fit "dummy" models. Note that both models share the same predictors
# and only differ in their dependent variable
data(efc)
# fit first model
fit1 <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data=efc)
# fit second model
fit2 <- lm(neg_c_7 ~ c160age + c12hour + c161sex + c172code, data=efc)
# Print HTML-table to viewer pane
sjt.lm(fit1, fit2,
       labelDependentVariables=c("Barthel-Index", "Negative Impact"),
       labelPredictors=c("Carer's Age", "Hours of Care", "Carer's Sex", "Educational Status"),
       showStdBeta=TRUE, pvaluesAsNumbers=TRUE, showAIC=TRUE)

This is the output in the RStudio viewer pane:
[Image: lm_test]

Frequency Tables

Another (new) function is sjt.frq which prints frequency tables (the next example uses value and variable labels, but the simplest function call is just sjt.frq(variable)).

require(sjPlot) # load package
# load sample data
data(efc)
# retrieve value and variable labels
variables <- sji.getVariableLabels(efc)
values <- sji.getValueLabels(efc)
# simple frequency table
sjt.frq(efc$e42dep,
        variableLabels=variables['e42dep'],
        valueLabels=values[['e42dep']])

And again, this is the output in the RStudio viewer pane:
[Image: freq_tab_1]

You can print frequency tables of several variables at once:

sjt.frq(as.data.frame(cbind(efc$e42dep, efc$e16sex, efc$c172code)),
        variableLabels=list(variables['e42dep'], variables['e16sex'], variables['c172code']),
        valueLabels=list(values[['e42dep']], values[['e16sex']], values[['c172code']]))

The output:
[Image: freq_tab_2]

SPSS frequency tables, especially for variables with many unique values (e.g. age or income), often end up very long and unreadable. The sjt.frq function, however, can automatically group variables with many unique values:

sjt.frq(efc$c160age,
        variableLabels=list("Carer's Age"),
        autoGroupAt=10)

This results in a frequency table with max. 10 groups:
[Image: freq_tab_3]

You can also specify whether the rows containing the median and the lower and upper quartiles are highlighted. Furthermore, the complete HTML code is returned for further use, separated into style sheet and table content. If you request multiple frequency tables, the function returns a list of HTML tables.
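
The idea behind automatic grouping can be sketched in base R. This is only an illustration under my own assumptions: the helper name group_values, the equal-width bins produced by cut() and the made-up age values are not taken from sjt.frq, which may group differently.

```r
# Sketch: collapse a numeric variable into at most `max_groups` groups.
# group_values() is a hypothetical helper; equal-width bins are an assumption.
group_values <- function(x, max_groups = 10) {
  n_unique <- length(unique(stats::na.omit(x)))
  # Few unique values: keep them as they are
  if (n_unique <= max_groups) return(factor(x))
  # Many unique values: cut the range into `max_groups` equal-width intervals
  cut(x, breaks = max_groups, include.lowest = TRUE, dig.lab = 4)
}

set.seed(1)
age <- sample(18:90, 200, replace = TRUE)  # made-up ages, not the efc data

grouped <- group_values(age, max_groups = 10)
table(grouped)  # frequency table with 10 groups instead of up to 73 rows
```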

Contingency Tables

The second new function in the sjPlot-package (while I’m writing this post, source code and Windows binaries of version 1.1 are available, Mac binaries will follow soon…) is sjt.xtab for printing contingency tables.

The simple function call prints observed values and cell percentages:

# prepare sample data set
data(efc)
efc.labels <- sji.getValueLabels(efc)
sjt.xtab(efc$e16sex, efc$e42dep,
         variableLabels=c("Elder's gender", "Elder's dependency"),
         valueLabels=list(efc.labels[['e16sex']], efc.labels[['e42dep']]))

[Image: xtab_1]

Observed values are obligatory, while cell, row and column percentages as well as expected values can be added via parameters. An example with all possible information:

sjt.xtab(efc$e16sex, efc$e42dep,
         variableLabels=c("Elder's gender", "Elder's dependency"),
         valueLabels=list(efc.labels[['e16sex']], efc.labels[['e42dep']]),
         showRowPerc=TRUE, showColPerc=TRUE, showExpected=TRUE)

[Image: xtab_2]

And a simple one, without horizontal lines:

sjt.xtab(efc$e16sex, efc$e42dep,
         variableLabels=c("Elder's gender", "Elder's dependency"),
         valueLabels=list(efc.labels[['e16sex']], efc.labels[['e42dep']]),
         showCellPerc=FALSE, showHorizontalLine=FALSE)

[Image: xtab_3]
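
For readers who want to check the numbers in such a table by hand, the quantities mentioned above (observed values, cell, row and column percentages, expected values) can be computed in base R. This is just a sketch with made-up data, not the efc data set and not the code used inside sjt.xtab.

```r
# Made-up data for illustration only
sex <- factor(c("male", "female", "female", "male",
                "female", "female", "male", "female"))
dep <- factor(c("low", "low", "high", "high",
                "high", "low", "low", "high"))

observed  <- table(sex, dep)                         # observed counts
cell_perc <- 100 * prop.table(observed)              # % of grand total
row_perc  <- 100 * prop.table(observed, margin = 1)  # % within each row
col_perc  <- 100 * prop.table(observed, margin = 2)  # % within each column

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

observed
round(row_perc, 1)
expected
```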

All colors can be specified via parameters, as well as the constant string values. See ?sjt.frq and ?sjt.xtab for detailed information.

If you have more ideas about which “quick” statistics are suitable for printing in the viewer pane, let me know. I will try to include them in my package…

46 thoughts on “No need for SPSS – beautiful output in R #rstats”

  1. Henry says:

    Very nice tables, I have been looking for something like this.
    I think there is a small error in the notes in the first table. “* p<0.005” should be “* p<0.05”, and actually there should not be a note in this table when the stars are not shown.

    1. Thanks for your feedback, I’m glad you like the functions… I think some more will follow soon (I’m thinking of correlation matrices or an equivalent to the sjp.stackfrq function).

  2. Dan says:

    Is there a good way to use sjPlot from within knitr? I’d definitely use this if there was an easy way to use the output within RMarkdown.

    1. I haven’t worked much with knitr yet, so I unfortunately can’t help you. If I understood right, it should be easy to use at least the plots made with sjPlot with knitr. But I’m not sure if and how html tables are correctly “knitted” into a PDF / Word / convertible file.

  3. Harold Baize says:

    Thanks so much for making this package!
    I also want to use sjPlot in knitr. I found a way. I use knitr with RMarkdown and send the output to HTML. To use Daniel’s package in knitr markdown, I made use of the structure feature to access the “invisibly” returned content and style HTML. I also entered the sjPlot function into an inline R expression in knitr markdown like this:

    `r structure(sjt.frq(efc$e42dep,variableLabels=variables['e42dep'],valueLabels=values[['e42dep']]))$style`
    `r structure(sjt.frq(efc$e42dep,variableLabels=variables['e42dep'],valueLabels=values[['e42dep']]))$content`

    Each on a single line with no break. The result is the HTML written directly to your HTML output from knit2html()

    Daniel, would you consider adding an option to make this easy? Maybe a simple HTML output to console that doesn’t launch a browser? ;-)

    1. Yes, of course that’s possible. I wonder if it would work returning the complete HTML output including html and head tags. Does it interfere with the “overall” knitr-html-output? (i.e. could it be that the html-file has two html or head sections?)
      Probably I simply add a third return-object to the returned structure of the functions and add an option that the output should neither be opened in a viewer nor a browser (nor saved as file)…

  4. Harold Baize says:

    It worked last night on my home computer, but isn’t working today at the office. Can’t figure out why. The structure()$content or $style just returns null. Odd. Maybe I’m forgetting something.
    What I do with the knitr2html output is open the file in MS-Word. Works pretty well. The output from your functions is not exactly as it appears in a browser window. Some of the double lines are missing and the columns aren’t as tight (close together) in Word as they are in the browser.

    Anyway, a third option that outputs just the HTML as text that can be embedded into knitr2html would be wonderful. :-)

    Thanks again. This is something R really needs and I’ve always wished I had the time and programming skill to do it!

    1. If you save the HTML output as a file (using the file parameter), you can directly open the HTML files with Word or LibreOffice etc. Unfortunately, the style sheets are not rendered completely correctly, so the tables do not have the same appearance as in the browser window…

  5. Harold Baize says:

    Figured out why it wasn’t working at the office. It should be structure()$page.content not structure()$content.
    Sorry for the syntax error.

    1. I’ve added the requested features and uploaded a current “developer snapshot” of my package here. It’s the source format, so use install.packages("filepath/sjPlot_1.2.tar.gz", type="source") to install the package.
      Use the no.output parameter set to TRUE to suppress output in browser window / viewer pane and access the complete HTML-output via structure$output.complete. See package NEWS section for changes and package-documentation / help for details…

      1. I’ve just tested knitting an HTML page with output from the sjt.frq function. Including the complete web page does not seem to be the best approach, since this produces an HTML page inside an HTML page.
        It is better to do it like you described:
        `r sjt.frq(efc$e42dep)$page.style`
        `r sjt.frq(efc$e42dep)$page.content`

        It would be even better to place the page.style stuff inside the HTML header, but I’m not sure how to do this (so the CSS stuff is in the head of the knitted HTML page, and only page.content appears in the document).

  6. Harold Baize says:

    Great Daniel! :-)

    I noticed that there were style conflicts when using your sjPlot functions in knitr markdown using the method I described before. Specifically, knitr would not process some markdown tags. The workaround was to just use HTML tags. Markdown doesn’t really make HTML that much easier. ;-) The real trick is knitting R code into it, not the keystrokes saved by markdown.

    I look forward to trying it some more after the changes you’ve made.

    I suppose the ideal would be a variety of knitr2html that shares style with your package. Maybe Yihui Xie can incorporate sjPlot into knitr!
    Thanks!

    I will try out you

    1. Ok, I found a solution. I added an additional return-object knitr where I replaced CSS class references with inline style definitions. This requires some rewriting of the functions. I’ve already done and tested this for sjt.xtab. I’ll work on the other functions over the next few days and probably supply a new developer snapshot (package updates on CRAN should not be too frequent, that’s why I would first release an “unofficial” snapshot of my package).

      1. Harold Baize says:

        Wow! Thanks Daniel. I’m grateful for your programming skills and generosity.

        Harold

  7. Hi Daniel, I really like the outputs so far. I’d love to see what the output of the ‘SciencesPo::detail’ function would look like after you brush it up. I wrote this function for summarizing the whole data frame at once. I really like it for getting to know the data at hand at the very beginning. So maybe you’ll feel motivated to replicate this function with the powerful details of HTML.

    1. Well, you can actually do that already:

      library(sjPlot)
      library(SciencesPo)
      data(efc)
      sjt.df(detail(efc))

      The function is a bit rudimentary, but I have already updated it so you can tweak various style sheets and also get a return value for knitr integration. A new developer snapshot for testing is coming soon… By using this function, you could also sort specific columns of the data frame etc.

  8. Excellent work! I’ve done the same journey from SPSS to R – it’s a one-way road. My package actually deals with similar issues – getting tables and stuff out of R and into publication format as painlessly as possible. I’ve written a blog series about it, you can find the table section here: http://gforge.se/2014/01/fast-track-publishing-using-knitr-part-iv/

    The problem with importing tables into Word is that it is really lousy at handling HTML, LibreOffice is slightly better at it and after some tweaking my htmlTable()-function actually manages to provide tables the way one would expect. I’ve found it very useful to try to have most of the CSS within each cell for optimal conversion.

    Cool thinking by the way with the rstudio::viewer() – I need to add that to my package.

    1. I’ve followed your series about fast-track-publishing and really like it, especially the “grouping” of factors in the tables (sex, ulceration). That’s something I’d like to see in my functions as well. ;-)
      I’m not working that much with knitr, but I’ve updated all functions, so now an HTML table with “inline CSS” is returned that can be used to include the table in knitr markdown documents.

    2. You’ve chosen opt-in, I have chosen opt-out. ;-) Via no.output parameter, the viewer is not called. In any case, the “normal” html-code (with separated stylesheet) as well as a knitr-friendly html-table with inline-css is returned.

      1. By the way, great advice to put CSS into cells. It seems that Word can’t render CSS inside tr-tags, so I put everything inside th and td, and it works perfectly! Now the import of HTML tables into Office looks very good!

  9. Excellent work. We are using the function at work. When I have finished some RPubs, I’ll share some functions with you for labelling a data frame so that your function can be used without importing a data frame from SPSS.

    I have a question: is it possible to include the labels when descriptive statistics are generated?

    1. You can use sji.setValueLabels or sji.setVariableLabels to manually assign labels to variables. Or you can pass string-vectors to the functions, e.g. valueLabels for the sjt.frq function or axisLabels.x for various sjp-functions. Refer to the help-documentation of the package for different examples.
