No need for SPSS – beautiful output in R #rstats

Note: There’s a second part of this series here.

About one year ago, I seriously started migrating from SPSS to R. Though I’m still using SPSS (because I have to in some situations), I’m quite comfortable and happy with R now and have learnt a lot in the past months. But since SPSS is still very widespread in the social sciences, I get asked every now and then whether I really needed to learn R, because SPSS meets all my needs…

Well, learning R had at least two major benefits for me: 1.) I could improve my statistical knowledge a lot, simply by using formulas, asking why certain R commands do not automatically give the same results as SPSS, reading R resources and papers etc., and 2.) the possibilities for data visualization are far better in R than in SPSS (though SPSS can do well, too…). Of course, there are many more reasons to use R.

Still, one thing I often miss in R is beautiful output of simple statistics, or maybe even advanced statistics: not always as a plot or graph, but not as "cryptic" console output either. I’d like to have a simple table view, just like the SPSS output window (though the SPSS output is not "beautiful"). That’s why I started writing functions that put the results of certain statistics into HTML tables. These tables can be saved to disk or, even better for quick inspection, shown in a web browser or viewer pane (like the RStudio viewer pane).

All of the following functions are available in my sjPlot-package on CRAN.
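To illustrate the general idea independently of sjPlot, here is a minimal base-R sketch of "statistics as an HTML table". The helper name freq_to_html is hypothetical and not part of the package, the built-in mtcars data stands in for real survey data, and the styling is only a placeholder. It builds the HTML as a string, saves it to a temporary file, and opens it in the RStudio viewer (or the system browser) when run interactively.

```r
# Hypothetical helper (not part of sjPlot): turn a vector into a small
# HTML frequency table with counts and percentages.
freq_to_html <- function(x, title = "Frequencies") {
  counts <- table(x, useNA = "no")
  perc <- 100 * as.numeric(prop.table(counts))
  rows <- sprintf("<tr><td>%s</td><td>%d</td><td>%.2f</td></tr>",
                  names(counts), as.integer(counts), perc)
  paste0(
    "<html><head><meta charset=\"utf-8\"><style>",
    "table { border-collapse: collapse; } ",
    "th, td { padding: 4px 12px; border-bottom: 1px solid #ccc; text-align: right; }",
    "</style></head><body>",
    "<table><caption>", title, "</caption>",
    "<tr><th>value</th><th>N</th><th>%</th></tr>",
    paste(rows, collapse = ""),
    "</table></body></html>"
  )
}

html <- freq_to_html(mtcars$cyl, title = "Number of cylinders")

# Save the table to disk (here: a temporary file) ...
out_file <- tempfile(fileext = ".html")
writeLines(html, out_file)

# ... and open it only in interactive sessions. RStudio sets the "viewer"
# option to a function; otherwise fall back to the system browser.
if (interactive()) {
  viewer <- getOption("viewer")
  if (is.function(viewer)) viewer(out_file) else utils::browseURL(out_file)
}
```

Note that the title and values are pasted into the markup without HTML escaping, which is fine for a quick look at your own data but not for untrusted input.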

(Generalized) Linear Models

The first two functions, which I already published last year, can be used to display (generalized) linear models and have been described here. Yet I want to give another short example for quickly viewing linear models:

library(sjPlot) # load package
data(efc)       # load sample data
# Fit "dummy" models. Note that both models share the same predictors
# and only differ in their dependent variable
# fit first model
fit1 <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data=efc)
# fit second model
fit2 <- lm(neg_c_7 ~ c160age + c12hour + c161sex + c172code, data=efc)
# Print HTML-table to viewer pane
sjt.lm(fit1, fit2,
       labelDependentVariables=c("Barthel-Index", "Negative Impact"),
       labelPredictors=c("Carer's Age", "Hours of Care", "Carer's Sex", "Educational Status"),
       showStdBeta=TRUE, pvaluesAsNumbers=TRUE, showAIC=TRUE)

This is the output in the RStudio viewer pane:

Frequency Tables

Another (new) function is sjt.frq, which prints frequency tables (the next example uses value and variable labels, but the simplest function call is just sjt.frq(variable)).

require(sjPlot) # load package
# load sample data
data(efc)
# retrieve value and variable labels
variables <- sji.getVariableLabels(efc)
values <- sji.getValueLabels(efc)
# simple frequency table
sjt.frq(efc$e42dep,
        variableLabels=variables['e42dep'],
        valueLabels=values[['e42dep']])

And again, this is the output in the RStudio viewer pane:

You can print frequency tables of several variables at once:

# several variables are passed together as one data frame
sjt.frq(as.data.frame(cbind(efc$e42dep, efc$e16sex, efc$c172code)),
        variableLabels=list(variables['e42dep'], variables['e16sex'], variables['c172code']),
        valueLabels=list(values[['e42dep']], values[['e16sex']], values[['c172code']]))

The output:

SPSS frequency tables, especially for variables with many unique values (e.g. age or income), often turn out very long and unreadable. The sjt.frq function, however, can automatically group variables with many unique values:

sjt.frq(efc$c160age,
        variableLabels=list("Carer's Age"),
        autoGroupAt=10)

This results in a frequency table with max. 10 groups:

You can also specify whether the rows containing the median and the lower and upper quartiles are highlighted. Furthermore, the complete HTML code is returned for further use, separated into style sheet and table content. If you request multiple frequency tables, the function returns a list of HTML tables.
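The automatic grouping described above can be approximated in base R. The following is only a sketch under stated assumptions: group_frq is a hypothetical helper, not the code sjPlot uses, it forms equal-width classes with cut(), and the ages are simulated rather than taken from the efc data.

```r
# Hypothetical helper (not sjPlot code): tabulate x, collapsing it into
# `max_groups` equal-width classes when it has more unique values than that.
group_frq <- function(x, max_groups = 10) {
  x <- x[!is.na(x)]
  if (length(unique(x)) > max_groups) {
    breaks <- seq(min(x), max(x), length.out = max_groups + 1)
    x <- cut(x, breaks = breaks, include.lowest = TRUE, dig.lab = 4)
  }
  counts <- table(x)
  share <- as.numeric(prop.table(counts))
  data.frame(
    group = names(counts),
    n = as.integer(counts),
    percent = round(100 * share, 1),
    cum_percent = round(100 * cumsum(share), 1)
  )
}

set.seed(1)
age <- round(rnorm(500, mean = 55, sd = 12))  # simulated ages, not the efc data
frq <- group_frq(age)
print(frq)

# First class whose cumulative share reaches 50 %: the row a table
# could highlight as containing the median.
median_group <- frq$group[which(frq$cum_percent >= 50)[1]]
print(median_group)
```

Equal-width classes are just one possible choice; quantile-based breaks would give groups of similar size instead.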

Contingency Tables

The second new function in the sjPlot-package (while I’m writing this post, source code and Windows binaries of version 1.1 are available; Mac binaries will follow soon…) is sjt.xtab for printing contingency tables.

The simple function call prints observed values and cell percentages:

# prepare sample data set
data(efc)
efc.labels <- sji.getValueLabels(efc)
sjt.xtab(efc$e16sex, efc$e42dep,
         variableLabels=c("Elder's gender", "Elder's dependency"),
         valueLabels=list(efc.labels[['e16sex']], efc.labels[['e42dep']]))


Observed values are obligatory, while cell, row and column percentages as well as expected values can be added via parameters. An example with all possible information:

sjt.xtab(efc$e16sex, efc$e42dep,
         variableLabels=c("Elder's gender", "Elder's dependency"),
         valueLabels=list(efc.labels[['e16sex']], efc.labels[['e42dep']]),
         showRowPerc=TRUE, showColPerc=TRUE, showExpected=TRUE)
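For readers who want to see where these numbers come from, the pieces of such a table can be computed in base R. This is a sketch of the underlying arithmetic, not of how sjt.xtab is implemented, and it uses the built-in mtcars data instead of efc so that it runs without the package.

```r
# Cross-tabulate two categorical variables from a built-in data set
tab <- table(gear = mtcars$gear, cyl = mtcars$cyl)

observed  <- addmargins(tab)                     # counts with row/column totals
cell_perc <- round(100 * prop.table(tab), 1)     # % of total
row_perc  <- round(100 * prop.table(tab, 1), 1)  # % within each row
col_perc  <- round(100 * prop.table(tab, 2), 1)  # % within each column

# Expected counts under independence: row total * column total / N
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

print(observed)
print(row_perc)
print(round(expected, 1))

# The same expected counts are reported by chisq.test(); the warning about
# small cell counts is suppressed here because only $expected is used.
chi_expected <- suppressWarnings(chisq.test(tab))$expected
```

Comparing observed and expected counts cell by cell is what the chi-squared statistic summarizes in a single number.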


And a simple one, without horizontal lines:

sjt.xtab(efc$e16sex, efc$e42dep,
         variableLabels=c("Elder's gender", "Elder's dependency"),
         valueLabels=list(efc.labels[['e16sex']], efc.labels[['e42dep']]),
         showCellPerc=FALSE, showHorizontalLine=FALSE)


All colors can be specified via parameters, as can the constant string values. See ?sjt.frq and ?sjt.xtab, respectively, for detailed information.

If you have more ideas on which "quick" statistics are suitable for printing the results in the viewer pane, let me know. I will try to include them in my package…

61 comments on "No need for SPSS – beautiful output in R #rstats"

  1. Very nice tables, I have been looking for something like this.
    I think there is a small error in the notes in the first table: "* p<0.005" should be "* p<0.05", and actually there should not be a note in this table when the stars are not shown.

  2. Excellent, thank you so much for sharing this code. I was actually staring down coding up something very similar just this week and you’ve done an excellent job with it so far!

    1. Thanks for your feedback, I’m glad you like the functions… I think, some more will follow soon (I think of correlation matrices or an equivalent to the sjp.stackfrq function).

  3. Is there a good way to use sjPlot from within knitr? I’d definitely use this if there was an easy way to use the output within RMarkdown.

    1. I haven’t worked much with knitr yet, so I unfortunately can’t help you. If I understood right, it should be easy to use at least the plots made with sjPlot with knitr. But I’m not sure if and how html tables are correctly „knitted“ into a PDF / Word / convertible file.

  4. If you’re using knitr and LaTeX, you might want to look at the booktabs package. It generates nice-looking tables for your PDFs.

  5. Thanks so much for making this package!
    I also want to use sjPlot in knitr. I found a way. I use knitr with RMarkdown and send the output to HTML. To use Daniel’s package in knitr markdown, I made use of the structure feature to access the „invisibly“ returned content and style HTML. I also entered the sjPlot function into an inline R expression in knitr markdown like this:

    `r structure(sjt.frq(efc$e42dep, variableLabels=variables['e42dep'], valueLabels=values[['e42dep']]))$style`
    `r structure(sjt.frq(efc$e42dep, variableLabels=variables['e42dep'], valueLabels=values[['e42dep']]))$content`

    Each on a single line with no break. The result is the HTML written directly to your HTML output from knit2html()

    Daniel, would you consider adding an option to make this easy? Maybe a simple HTML output to console that doesn’t launch a browser? 😉

    1. Yes, of course that’s possible. I wonder if it would work returning the complete HTML output including html and head tags. Does it interfere with the „overall“ knitr-html-output? (i.e. could it be that the html-file has two html or head sections?)
      I’ll probably just add a third return object to the returned structure of the functions and add an option so that the output is neither opened in a viewer or browser nor saved as a file…

  6. It worked last night on my home computer, but isn’t working today at the office. Can’t figure out why. The structure()$content or $style just returns null. Odd. Maybe I’m forgetting something.
    What I do with the knit2html output is open the file in MS Word. Works pretty well. The tables from your functions do not look exactly as they do in a browser window, though. Some of the double lines are missing and the columns aren’t as tight (close together) in Word as they are in the browser.

    Anyway, a third option that outputs just the HTML as text that can be embedded into knit2html output would be wonderful. 🙂

    Thanks again. This is something R really needs and I’ve always wished I had the time and programming skill to do it!

    1. If you save the HTML output as a file (using the file parameter), you can open the HTML files directly with Word or LibreOffice etc. Unfortunately, the style sheets are not rendered completely correctly, so the tables do not have the same appearance as in the browser window…

      1. Hi,
        I have gone through the page, it helped me a lot. Thank you for the wonderful package.
        I am trying to migrate from SPSS to R, but I am facing a problem generating the tables for all the variables at once.

        1. Could you please help me cross-tabulate all the variables by the YEAR variable (value labels for year are 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017)?
        2. We have weights in our data, so I want to run those tables both with and without applying weights. It should be possible to choose the number of decimal places.
        3. The HTML output is good, but I didn’t find any option to export to Excel or CSV. Could you please let me know if one is available?

        Thanks in advance,

  7. Figured out why it wasn’t working at the office. It should be structure()$page.content, not structure()$content.
    Sorry for the syntax error.

    1. I’ve added the requested features and uploaded a current "developer snapshot" of my package here. It’s in source format, so use install.packages("filepath/sjPlot_1.2.tar.gz", type="source") to install the package.
      Set the no.output parameter to TRUE to suppress output in the browser window / viewer pane, and access the complete HTML output via structure$output.complete. See the package NEWS section for changes and the package documentation / help for details…

      1. I’ve just tested knitting an HTML page with output from the sjt.frq function. Including the complete web page does not seem to be the best approach, since this produces an HTML page inside an HTML page.
        It is better to do it like you described:
        `r sjt.frq(efc$e42dep)$page.style`
        `r sjt.frq(efc$e42dep)$page.content`

        It would be even better to place the style sheet inside the HTML header, but I’m not sure how to do this (so that the CSS is in the head of the knitted HTML page, and only page.content appears in the document).

  8. Great Daniel! 🙂

    I noticed that there were style conflicts when using your sjPlot functions in knitr markdown with the method I described before. Specifically, knitr would not process some markdown tags. The workaround was to just use HTML tags. Markdown doesn’t really make HTML that much easier. 😉 The real trick is knitting R code into it, not the keystrokes saved by markdown.

    I look forward to trying it some more after the changes you’ve made.

    I suppose the ideal would be a variant of knit2html that shares its style with your package. Maybe Yihui Xie can incorporate sjPlot into knitr!

    I will try out you

    1. Ok, I found a solution. I added an additional return object, knitr, in which I replaced CSS class references with inline style definitions. This requires some rewriting of the functions. I’ve already done and tested this for sjt.xtab; I’ll work on the other functions over the next days and probably supply a new developer snapshot (package updates on CRAN should not be too frequent, which is why I would first release an "unofficial" snapshot of my package).

      1. Wow! Thanks Daniel. I’m grateful for your programming skills and generosity.


  9. Hi Daniel, I really like the outputs so far. I’d love to see what the output of the 'SciencesPo::detail' function would look like after you brush it up. I made this function for summarizing a whole data frame at once. I really like it for getting to know the data at hand at the very beginning. So maybe you can be motivated to replicate this function with the power of detailed HTML output.

    1. Well, you can actually do that already:


      The function is a bit rudimentary, but I have already updated it so you can tweak various style sheets and also get a return value for knitr integration. A new developer snapshot for testing is coming soon… By using this function, you could also sort specific columns of the data frame etc.

  10. Excellent work! I’ve made the same journey from SPSS to R, and it’s a one-way road. My package actually deals with similar issues: getting tables and other results out of R and into publication format as painlessly as possible. I’ve written a blog series about it, you can find the table section here:

    The problem with importing tables into Word is that it is really lousy at handling HTML. LibreOffice is slightly better at it, and after some tweaking my htmlTable() function actually manages to provide tables the way one would expect. I’ve found it very useful to put most of the CSS within each cell for optimal conversion.

    Cool thinking by the way with the rstudio::viewer() – I need to add that to my package.

    1. I’ve followed your series about fast-track-publishing and really like it, especially the „grouping“ of factors in the tables (sex, ulceration). That’s something what I’d like to see in my functions as well. 😉
      I’m not working that much with knitr, but I’ve updated all functions, so now an HTML table with "inline CSS" is returned that can be used to include the table in knitr markdown documents.

    2. You’ve chosen opt-in, I have chosen opt-out. 😉 Via no.output parameter, the viewer is not called. In any case, the „normal“ html-code (with separated stylesheet) as well as a knitr-friendly html-table with inline-css is returned.

      1. By the way, great advice to put CSS into cells. It seems that Word can’t render CSS inside tr tags, so I put everything inside th and td, and it works perfectly! Now the import of HTML tables into Office looks very good!

  11. Great work, although some of this functionality is already provided by the texreg package which generates LaTeX/HTML tables of regression results.

  12. Excellent work. We are using the function at work. When I have finished some RPubs, I’ll share some functions with you for labelling a data frame, so that your functions can be used without importing a data frame from SPSS.

    I have a question: is it possible to add the labels when descriptive statistics are generated?

    1. You can use sji.setValueLabels or sji.setVariableLabels to manually assign labels to variables. Or you can pass string-vectors to the functions, e.g. valueLabels for the sjt.frq function or axisLabels.x for various sjp-functions. Refer to the help-documentation of the package for different examples.

  13. And is it possible to print only some of the statistics, for example just the mean, min and max? Again, great job!

  14. The sjt.xtab function is exactly what I was looking for, thanks!
    One question: I prefer not to include observed values but only (column) percentages in my tables, since the latter are more easily interpreted. To give an idea of the sample size, I add a "total" row with the observed values (the N values). Is such a thing possible in the sjt.xtab function?

  15. Hi Daniel, thank you so much for this package. I can get very good-looking tables in RStudio presentations. However, I have a small comment. Under the table (after displaying the output of the sjt.xtab command), it shows "observed values · expected values · % within row name · % within column name · % of total". How can I display only the ones I have chosen? E.g., I want only "% within (say, column name)".

  16. It’s not clear to me whether R or any of its packages can replicate SPSS’s Custom Tables functionality. I read in a (locked) thread that it can’t (at least not easily and without essentially programming it). Custom Tables is my most used feature in SPSS and one that I simply cannot live without. It lets me design summary tables and reports in a simple, quick and intuitive (yet also powerful) fashion that lets me examine complicated datasets really quickly.

    For data that can sometimes be difficult to get your head around, Custom Tables makes it surprisingly easy to get an impression of what’s going on before performing specific tests. As a tool it’s hard to think of a way it can be implemented better. I wish there were free alternatives to a graphical table designer but I’ve yet to find any :(.

    I’m always up for supporting free and open-source software but I find too that the communities can often be very stuffy, suspicious, defensive or just plain rude and very difficult to have any influence in if you’re not a high level programmer with a lot of time to offer (and probably even if you are because there’s no doubt a lot of egos and a pecking order to be observed!).

    1. To be honest, I rarely use the custom table designer in SPSS. My goal was to produce complex tables with "one click" (or one line of code), and meanwhile I can summarize results of different analysis methods very quickly (see examples in the related online manual). In my experience, this is much easier than in SPSS, and in some cases it is not even possible in SPSS (e.g., a summary table of mixed effects models).

      Besides my solutions, there are also some other great packages for more generic table design, like htmlTable, ztable, xtable etc. In combination with the broom package, it’s easy to create table outputs of almost anything.

      What kind of custom tables do you create that currently can’t be replicated in R?

  17. Hi,

    Is there any way I can give this package a thumbs-up on any site? Seriously saved the day for me. Thank you!

  18. Just one q. – is there any way to change the order of the variable labels that get displayed in the contingency table output? Or do we need to order our data beforehand in the way we want, to be able to change the variable label order in the output? Thanks!

    1. Yes, variables or item categories are ordered according to their numeric value. If you have a labelled factor (without numeric values), consider converting it with sjmisc::to_value beforehand.

  19. Also, I am trying to create a table whose dimensions are greater than 2X2 (it’s actually a 6X4 table, including the NA values). From the documentation, I get that in such cases the function computes the Fisher’s exact test with Monte Carlo simulation. However, the function doesn’t display more than 5 out of the 6 values (rows) at a time. Is there some parameter I need to set that will allow me to show all 6 rows of my 6X4 contingency table?

  20. If you have two variables, the function should display all values / dimensions. To add NA, use the showNA argument (see here). If it doesn’t work, feel free to file an issue at GitHub, if possible with some example data to reproduce a bug.

  21. Many thanks, Daniel, for the package. It is so useful! But I have a question for you: does the package work with R Commander? I tried to use it and it did not work.

Comments are closed.