# No need for SPSS – beautiful output in R #rstats

Note: There’s a second part of this series here.

About one year ago, I seriously started migrating from SPSS to R. Though I’m still using SPSS (because I have to in some situations), I’m quite comfortable and happy with R now and have learnt a lot in the past months. But since SPSS is still very widespread in the social sciences, I get asked every now and then whether I really needed to learn R, because SPSS meets all my needs…

Well, learning R had at least two major benefits for me: 1.) I could improve my statistical knowledge a lot, simply by using formulas, asking why certain R commands do not automatically give the same results as SPSS, reading R resources and papers etc. and 2.) the possibilities for data visualization are far better in R than in SPSS (though SPSS can do well too…). Of course, there are many more reasons to use R.

Still, one thing I often miss in R is beautiful output of simple or even advanced statistics. Not always as a plot or graph, but not as “cryptic” console output either. I’d like to have a simple table view, just like the SPSS output window (though the SPSS output is not “beautiful”). That’s why I started writing functions that put the results of certain statistics into HTML tables. These tables can be saved to disk or, even better for quick inspection, shown in a web browser or viewer pane (like the RStudio viewer pane).
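To give an idea of the principle, here is a minimal sketch in base R. It is not how sjPlot works internally: the helper names `df_to_html()` and `show_html()` are made up for illustration, cell contents are not HTML-escaped, and the sketch assumes that RStudio exposes its viewer pane through `getOption("viewer")`, falling back to the system browser otherwise.

```r
# Hypothetical helpers for illustration; not the sjPlot implementation.

# Turn a data frame into HTML, keeping style sheet and table markup apart.
df_to_html <- function(df) {
  style <- paste0("<style>table{border-collapse:collapse} ",
                  "th,td{padding:4px 8px;border-bottom:1px solid #ccc}</style>")
  header <- paste0("<tr>", paste0("<th>", names(df), "</th>", collapse = ""), "</tr>")
  # one character vector of <td> cells per column, pasted together row-wise
  cells <- lapply(df, function(col) paste0("<td>", as.character(col), "</td>"))
  rows <- paste0("<tr>", do.call(paste0, unname(cells)), "</tr>")
  content <- paste0("<table>", header, paste(rows, collapse = ""), "</table>")
  page <- paste0("<html><head>", style, "</head><body>", content, "</body></html>")
  list(style = style, content = content, page = page)
}

# Save the page to disk and, in an interactive session, display it.
show_html <- function(page, file = tempfile(fileext = ".html")) {
  writeLines(page, file)
  if (interactive()) {
    viewer <- getOption("viewer")   # assumed to be set by RStudio
    if (!is.null(viewer)) viewer(file) else utils::browseURL(file)
  }
  invisible(file)
}

out <- df_to_html(head(mtcars[, c("mpg", "cyl", "hp")], 3))
path <- show_html(out$page)
```

Returning the style sheet and the table markup as separate list elements makes it possible to reuse the table inside another HTML document later on.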

All of the following functions are available in my sjPlot-package on CRAN.

## (Generalized) Linear Models

The first two functions, which I already published last year, can be used to display (generalized) linear models and have been described here. Still, I want to give another short example for quickly viewing linear models:

```r
library(sjPlot) # load package
# Fit "dummy" models. Note that both models share the same predictors
# and only differ in their dependent variable
data(efc)
# fit first model
fit1 <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc)
# fit second model
fit2 <- lm(neg_c_7 ~ c160age + c12hour + c161sex + c172code, data = efc)
# Print HTML-table to viewer pane
sjt.lm(fit1, fit2,
       labelDependentVariables = c("Barthel-Index", "Negative Impact"),
       labelPredictors = c("Carer's Age", "Hours of Care", "Carer's Sex", "Educational Status"),
       showStdBeta = TRUE, pvaluesAsNumbers = TRUE, showAIC = TRUE)
```

This is the output in the RStudio viewer pane:
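For readers without the package at hand, the numbers behind such a table can be approximated in base R. The following is only a sketch under stated assumptions: it uses the built-in `mtcars` data instead of `efc`, the helper `coef_table()` is a made-up name, and the standardized beta is computed as b * sd(x) / sd(y), which only works for plain numeric predictors.

```r
# Two models with the same predictors but different outcomes,
# fitted on the built-in mtcars data (stand-in for the efc sample data).
fit1 <- lm(mpg ~ wt + hp, data = mtcars)
fit2 <- lm(qsec ~ wt + hp, data = mtcars)

# Hypothetical helper: estimate, standardized beta and p-value per predictor.
coef_table <- function(fit) {
  est <- summary(fit)$coefficients
  mf <- model.frame(fit)
  predictors <- rownames(est)[-1]   # drop the intercept
  # standardized beta = b * sd(x) / sd(y); numeric predictors only
  std_beta <- est[predictors, "Estimate"] *
    sapply(mf[predictors], sd) / sd(mf[[1]])
  data.frame(
    predictor = predictors,
    b = round(est[predictors, "Estimate"], 3),
    std.beta = round(std_beta, 3),
    p = round(est[predictors, "Pr(>|t|)"], 3),
    row.names = NULL
  )
}

# Put both models side by side; suffixes mark the dependent variable.
both <- merge(coef_table(fit1), coef_table(fit2),
              by = "predictor", suffixes = c(".mpg", ".qsec"), sort = FALSE)
both
c(AIC.mpg = AIC(fit1), AIC.qsec = AIC(fit2))
```

Because both models share their predictors, a simple `merge()` on the predictor name is enough to line the columns up.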

## Frequency Tables

Another (new) function is `sjt.frq` which prints frequency tables (the next example uses value and variable labels, but the simplest function call is just `sjt.frq(variable)`).

```r
library(sjPlot) # load package
# load sample data
data(efc)
# retrieve value and variable labels
variables <- sji.getVariableLabels(efc)
values <- sji.getValueLabels(efc)
# simple frequency table
sjt.frq(efc$e42dep,
        variableLabels = variables['e42dep'],
        valueLabels = values[['e42dep']])
```

And again, this is the output in the RStudio viewer pane:
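The columns of such a frequency table (counts, raw percentages, valid percentages, cumulative percentages) can be sketched in base R as follows. The data and the labels are invented for illustration, and the column layout is my assumption of a typical SPSS-style table, not a copy of what `sjt.frq` computes.

```r
# Illustrative data: a labelled 1-4 scale with some missing values.
x <- c(1, 2, 2, 3, 4, 4, 4, NA, 2, 1, NA, 3)
labels <- c("independent", "slightly dependent",
            "moderately dependent", "severely dependent")

# Count every category, including missing values (always the last entry).
counts <- table(factor(x, levels = 1:4, labels = labels), useNA = "always")
n <- as.vector(counts)
is_na <- is.na(names(counts))
valid_n <- sum(n[!is_na])

frq <- data.frame(
  value = ifelse(is_na, "missing", names(counts)),
  N = n,
  raw.pct = round(100 * n / sum(n), 2),                       # base: all cases
  valid.pct = ifelse(is_na, NA_real_, round(100 * n / valid_n, 2)),
  cum.pct = ifelse(is_na, NA_real_, round(100 * cumsum(n) / valid_n, 2)),
  stringsAsFactors = FALSE
)
frq
```

Raw percentages use all cases as the base, while valid and cumulative percentages leave out the missing values.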

You can print frequency tables of several variables at once:

```r
sjt.frq(as.data.frame(cbind(efc$e42dep, efc$e16sex, efc$c172code)),
        variableLabels = list(variables['e42dep'], variables['e16sex'], variables['c172code']),
        valueLabels = list(values[['e42dep']], values[['e16sex']], values[['c172code']]))
```

The output:

SPSS frequency tables for variables with many unique values (e.g. age or income) often become very long and unreadable. The `sjt.frq` function, however, can automatically group variables with many unique values:

```r
sjt.frq(efc$c160age,
        variableLabels = list("Carer's Age"),
        autoGroupAt = 10)
```

This results in a frequency table with max. 10 groups:
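One simple way to get such a grouping in base R is `cut()` with a fixed number of equal-width intervals. This is a sketch only: the helper `auto_group()` and the simulated ages are made up, and the grouping rule used by `sjt.frq` may well differ.

```r
set.seed(1)
age <- sample(18:89, 200, replace = TRUE)   # made-up ages

# Hypothetical helper: if x has more unique values than `max_groups`,
# recode it into `max_groups` equal-width intervals; otherwise keep it.
auto_group <- function(x, max_groups = 10) {
  if (length(unique(x)) <= max_groups) return(factor(x))
  cut(x, breaks = max_groups, include.lowest = TRUE, dig.lab = 4)
}

grouped <- auto_group(age, max_groups = 10)
table(grouped)
```

Variables that already have few distinct values pass through unchanged, so the same call can be applied to every column of a data set.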

You can also specify whether the rows containing the median and the lower and upper quartiles are highlighted. Furthermore, the complete HTML code is returned for further use, separated into style sheet and table content. If you have multiple frequency tables, the function returns a list of HTML tables.
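How the highlighted rows could be found, and how a result split into style sheet and content might look, is sketched below. The rule used here (first row whose cumulative percentage reaches 25, 50 or 75) is an assumption for illustration; other quantile definitions can land on a neighbouring row, and the names `style` and `content` are not necessarily those returned by `sjt.frq`.

```r
x <- c(1, 2, 2, 3, 4, 4, 4, 2, 1, 3, 3)   # illustrative data
counts <- table(x)
cum_pct <- 100 * cumsum(counts) / sum(counts)

# The row containing a quantile is taken to be the first one whose
# cumulative percentage reaches that level.
row_of <- function(level) unname(which(cum_pct >= level)[1])
marks <- c(lower.quartile = row_of(25),
           median = row_of(50),
           upper.quartile = row_of(75))
marks

# Mark the median row with a CSS class and return style and content apart.
row_class <- ifelse(seq_along(counts) == marks["median"], " class=\"median\"", "")
rows <- paste0("<tr", row_class, "><td>", names(counts), "</td><td>",
               as.vector(counts), "</td></tr>")
result <- list(
  style = "<style>.median{font-weight:bold}</style>",
  content = paste0("<table>", paste(rows, collapse = ""), "</table>")
)
```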

## Contingency Tables

The second new function in the sjPlot package (as I write this post, source code and Windows binaries of version 1.1 are available; Mac binaries will follow soon…) is `sjt.xtab` for printing contingency tables.

The simple function call prints observed values and cell percentages:

```r
# prepare sample data set
data(efc)
efc.labels <- sji.getValueLabels(efc)
sjt.xtab(efc$e16sex, efc$e42dep,
         variableLabels = c("Elder's gender", "Elder's dependency"),
         valueLabels = list(efc.labels[['e16sex']], efc.labels[['e42dep']]))
```

Observed values are always shown, while cell, row and column percentages as well as expected values can be added via parameters. An example with all possible information:

```r
sjt.xtab(efc$e16sex, efc$e42dep,
         variableLabels = c("Elder's gender", "Elder's dependency"),
         valueLabels = list(efc.labels[['e16sex']], efc.labels[['e42dep']]),
         showRowPerc = TRUE, showColPerc = TRUE, showExpected = TRUE)
```

And a simple one, without horizontal lines:

```r
sjt.xtab(efc$e16sex, efc$e42dep,
         variableLabels = c("Elder's gender", "Elder's dependency"),
         valueLabels = list(efc.labels[['e16sex']], efc.labels[['e42dep']]),
         showCellPerc = FALSE, showHorizontalLine = FALSE)
```
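What the percentage and expected-value options report can be reproduced with base R. The data below are invented, and the expected values are computed by hand as row total times column total divided by the grand total, which is the usual definition under independence.

```r
# Invented example data: two categorical variables.
sex <- factor(c("male", "female", "female", "male",
                "female", "female", "male", "female"))
dep <- factor(c("low", "high", "low", "high",
                "high", "low", "low", "high"))

tab <- table(sex, dep)                        # observed values
addmargins(tab)                               # observed values with totals

round(100 * prop.table(tab), 1)               # cell percentages (% of total)
round(100 * prop.table(tab, margin = 1), 1)   # row percentages
round(100 * prop.table(tab, margin = 2), 1)   # column percentages

# Expected counts under independence: row total * column total / n
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
round(expected, 2)
```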

All colors can be specified via parameters, as can the constant string values. See `?sjt.frq` and `?sjt.xtab` for detailed information.

If you have more ideas about which “quick” statistics are suitable for printing in the viewer pane, let me know. I will try to include them in my package…

## 61 comments on “No need for SPSS – beautiful output in R #rstats”

1. Henry says:

Very nice tables, I have been looking for something like this.
I think there is a small error in the notes in the first table. “* p<0.005” should be “* p<0.05”, and actually there should not be a note in this table when the stars are not shown.

1. Thanks for your feedback Henry, I’ll fix these bugs!

2. Simon says:

Great Work!

3. sebastiankuhn says:

nice! good job!

4. Excellent, thank you so much for sharing this code. I was actually staring down coding up something very similar just this week and you’ve done an excellent job with it so far!

1. Thanks for your feedback, I’m glad you like the functions… I think some more will follow soon (I’m thinking of correlation matrices or an equivalent to the `sjp.stackfrq` function).

5. Dan says:

Is there a good way to use sjPlot from within knitr? I’d definitely use this if there was an easy way to use the output within RMarkdown.

1. I haven’t worked much with knitr yet, so I unfortunately can’t help you. If I understood right, it should be easy to use at least the plots made with sjPlot with knitr. But I’m not sure if and how HTML tables are correctly “knitted” into a PDF / Word / convertible file.

6. If you’re using knitr and LaTeX, you might want to look at the booktabs package. It generates nice-looking tables for your PDFs.

7. Harold Baize says:

Thanks so much for making this package!
I also want to use sjPlot in knitr. I found a way. I use knitr with RMarkdown and send the output to HTML. To use Daniel’s package in knitr markdown, I made use of the structure feature to access the “invisibly” returned content and style HTML. I also entered the sjPlot function into an inline R expression in knitr markdown like this:

`r structure(sjt.frq(efc$e42dep, variableLabels=variables['e42dep'], valueLabels=values[['e42dep']]))$style`
`r structure(sjt.frq(efc$e42dep, variableLabels=variables['e42dep'], valueLabels=values[['e42dep']]))$content`

Each on a single line with no break. The result is the HTML written directly to your HTML output from knit2html().

Daniel, would you consider adding an option to make this easy? Maybe a simple HTML output to console that doesn’t launch a browser? 😉

1. Yes, of course that’s possible. I wonder if it would work to return the complete HTML output including the `html` and `head` tags. Does it interfere with the “overall” knitr HTML output? (i.e. could it be that the HTML file has two html or head sections?)
I’ll probably just add a third return object to the returned structure of the functions and add an option so that the output is neither opened in a viewer or browser nor saved as a file…

1. You could take the same approach as rCharts, which supports rendering to an iframe or as inline HTML. The iframe method has the advantage of isolating your style sheets from the rest of the document but it is not supported in all browsers.

A simple rCharts example is here: http://bl.ocks.org/ramnathv/raw/8084330/

8. Harold Baize says:

It worked last night on my home computer, but isn’t working today at the office. Can’t figure out why. The structure()$content or $style just returns null. Odd. Maybe I’m forgetting something.
What I do with the knit2html output is open the file in MS Word. Works pretty well. The output from your functions is not exactly as it appears in a browser window. Some of the double lines are missing and the columns aren’t as tight (close together) in Word as they are in the browser.

Anyway, a third option that outputs just the HTML as text that can be embedded into knit2html output would be wonderful. 🙂

Thanks again. This is something R really needs and I’ve always wished I had the time and programming skill to do it!

1. If you save the HTML output as a file (using the `file` parameter), you can directly open the HTML files with Word or LibreOffice etc. Unfortunately, the style sheets are not rendered completely correctly, so the tables do not have the same appearance as in the browser window…

1. sakshi says:

Hi,
I have gone through the page and it helped me a lot. Thank you for the wonderful package.
I am trying to migrate from SPSS to R, but I am having trouble generating the tables for all variables at once.

1. Could you please help me to cross-tabulate all the variables by the YEAR variable (value labels for year are
– 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017)?
2. We have weights in our data, so I want to run those tables both with and without weights applied. It should be flexible enough to run with some decimal places.
3. The HTML output is good, but I didn’t find any option to export to Excel or CSV. Could you please let me know if one is available?

9. Harold Baize says:

Figured out why it wasn’t working at the office. It should be structure()$page.content, not structure()$content.
Sorry for the syntax error.

1. I’ve added the requested features and uploaded a current “developer snapshot” of my package here. It’s the source format, so use `install.packages("filepath/sjPlot_1.2.tar.gz", type="source")` to install the package.
Set the `no.output` parameter to TRUE to suppress output in the browser window / viewer pane, and access the complete HTML output via `structure$output.complete`. See the package NEWS section for changes and the package documentation / help for details…

1. I’ve just tested knitting an HTML page with output from the sjt.frq function. Including the complete web page does not seem to be the best approach, since this produces an HTML page inside an HTML page.
It is better to do it like you described:
```
`r sjt.frq(efc$e42dep)$page.style` `r sjt.frq(efc$e42dep)$page.content`
```
It would be even better to place the page.style stuff inside the HTML header, but I’m not sure how to do this (so that the CSS is in the head of the knitted HTML page, and only page.content appears in the document).

10. Harold Baize says:

Great Daniel! 🙂

I noticed that there were style conflicts when using your sjPlot functions in knitr markdown using the method I described before. Specifically, knitr would not process some markdown tags. The workaround was to just use HTML tags. Markdown doesn’t really make HTML that much easier. 😉 The real trick is knitting R code into it, not the keystrokes saved by markdown.

I look forward to trying it some more after the changes you’ve made.

I suppose the ideal would be a variety of knit2html that shares its style with your package. Maybe Yihui Xie can incorporate sjPlot into knitr!
Thanks!

I will try out you

1. Ok, I found a solution. I added an additional return object `knitr` in which I replaced CSS class references with inline style definitions. This requires some rewriting of the functions. I’ve already done and tested this for `sjt.xtab`; I’ll work on the other functions over the next days and probably supply a new developer snapshot (package updates on CRAN should not be too frequent, which is why I would first release an “unofficial” snapshot of my package).

1. Harold Baize says:

Wow! Thanks Daniel. I’m grateful for your programming skills and generosity.

Harolddd

11. Hi Daniel, I really like the outputs so far. I’d love to see what the output of the `SciencesPo::detail` function would look like after you brush it up. I made this function for summarizing a whole data frame at once. I really like it for getting to know the data at hand at the very beginning. So maybe you’ll be motivated to replicate this function with the power of HTML.

1. What is the `SciencesPo::detail` function about? Included in which package?

1. `detail` is a function for summarizing data from the SciencesPo package.

2. Well, you can actually do that already:

```r
library(sjPlot)
library(SciencesPo)
data(efc)
# summarize the sample data with detail() and show the result as an HTML table
sjt.df(detail(efc))
```

The function is a bit rudimentary, but I have already updated it so you can tweak various style sheets and also get a return value for knitr integration. A new developer snapshot for testing is coming soon… With this function, you can also sort by specific columns of the data frame etc.

12. Harold Baize says:

Sorry that was a typo sentence fragment on the end of my last comment. 😉

13. Excellent work! I’ve made the same journey from SPSS to R – it’s a one-way road. My package actually deals with similar issues – getting tables and stuff out of R and into publication format as painlessly as possible. I’ve written a blog series about it, you can find the table section here: http://gforge.se/2014/01/fast-track-publishing-using-knitr-part-iv/

The problem with importing tables into Word is that it is really lousy at handling HTML; LibreOffice is slightly better at it, and after some tweaking my htmlTable() function actually manages to provide tables the way one would expect. I’ve found it very useful to try to have most of the CSS within each cell for optimal conversion.

Cool thinking by the way with the rstudio::viewer() – I need to add that to my package.

1. I’ve followed your series about fast-track publishing and really like it, especially the “grouping” of factors in the tables (sex, ulceration). That’s something I’d like to see in my functions as well. 😉
I’m not working that much with knitr, but I’ve updated all functions, so now an HTML table with “inline CSS” is returned that can be used to include the table in knitr markdown documents.

2. You’ve chosen opt-in, I have chosen opt-out. 😉 Via the `no.output` parameter, the viewer is not called. In any case, the “normal” HTML code (with separate style sheet) as well as a knitr-friendly HTML table with inline CSS is returned.

1. By the way, great advice to put CSS into cells. It seems that Word can’t render CSS inside `tr` tags, so I put everything inside th and td, and it works perfectly! Now the import of HTML tables into Office looks very good!

14. Hi Daniel,
Knitr integration would be awesome.
Great idea and very good job!
Keep it up!
Cheers

15. Great work, although some of this functionality is already provided by the texreg package which generates LaTeX/HTML tables of regression results.

16. Excellent work. We are using the function at work. When I have finished some RPubs, I’ll share with you a function to label a data frame so that your function can be used without importing a data frame from SPSS.

I have a question: is it possible to show the labels when descriptive statistics are generated?

1. You can use `sji.setValueLabels` or `sji.setVariableLabels` to manually assign labels to variables. Or you can pass string vectors to the functions, e.g. `valueLabels` for the `sjt.frq` function or `axisLabels.x` for various `sjp` functions. Refer to the help documentation of the package for different examples.

17. And is it possible to print only some of the statistics, for example just the mean, min and max? Again, great job!

1. Do you mean the `sjt.df` function?

18. james says:

Thank you so much!

19. Johan Pauwels says:

The sjt.xtab function is exactly what I was looking for, thanks!
One question: I prefer not to include observed values but only (column) percentages in my tables, since the latter are more easily interpreted. To have an idea of sample size I add a “total” row with the observed values (the N-values). Is such a thing possible in the sjt.xtab function?

1. Ok, I included an option in `sjt.xtab`, see GitHub.

1. Johan Pauwels says:

That’s totally fantastic! Thank you so much!

20. Hi Daniel, thank you so much for this package. I can get very good-looking tables in RStudio presentations. However, I have a small comment. Under the table (after displaying the output of the sjt.xtab command), it shows “observed values · expected values · % within rowname · % within column name · % of total”. How can I display only the chosen ones, e.g. only “% within (say, column name)”?

21. Simon says:

It’s not clear to me whether R or any other package can replicate SPSS’s Custom Tables functionality. I read in a (locked) thread on stats.stackexchange.com that it can’t (at least not easily and without essentially programming it). Custom Tables is my most used feature in SPSS and one that I simply cannot live without. It lets me design summary tables and reports in a simple, quick and intuitive (yet also powerful) fashion that lets me examine complicated datasets really quickly.

For data that can sometimes be difficult to get your head around, Custom Tables makes it surprisingly easy to get an impression of what’s going on before performing specific tests. As a tool it’s hard to think of a way it can be implemented better. I wish there were free alternatives to a graphical table designer but I’ve yet to find any :(.

I’m always up for supporting free and open-source software but I find too that the communities can often be very stuffy, suspicious, defensive or just plain rude and very difficult to have any influence in if you’re not a high level programmer with a lot of time to offer (and probably even if you are because there’s no doubt a lot of egos and a pecking order to be observed!).

1. To be honest, I rarely use the custom table designer in SPSS. My goal was to produce complex tables with “one click” (or one line of code), and meanwhile I can summarize results of different analysis methods very quickly (see examples in the related online manual). This is – in my experience – much easier than in SPSS, and in some cases it is not even possible in SPSS (e.g., a summary table of mixed effects models).

Besides my solutions, there are also some other great packages for more generic table design, like htmlTable, ztable, xtable etc. In combination with the broom package, it’s easy to create table output of almost anything.

What kind of custom tables do you create that currently can’t be replicated in R?

22. apurva hegde says:

Hi,

Is there any way I can give this package a thumbs-up on any site? Seriously saved the day for me. Thank you!

23. apurva hegde says:

Just one q. – is there any way to change the order of the variable labels that get displayed in the contingency table output? Or do we need to order our data beforehand in the way we want, to be able to change the variable label order in the output? Thanks!

1. Yes, variables or item categories are ordered according to their numeric value. If you have a labelled factor (w/o numeric values), consider converting it with `sjmisc::to_value` before.

24. apurva hegde says:

Also, I am trying to create a table whose dimensions are greater than 2X2 (it’s actually a 6X4 table, including the NA values). From the documentation, I get that in such cases the function computes the Fisher’s exact test with Monte Carlo simulation. However, the function doesn’t display more than 5 out of the 6 values (rows) at a time. Is there some parameter I need to set that will allow me to show all 6 rows of my 6X4 contingency table?

25. If you have two variables, the function should display all values / dimensions. To add NA, use the `showNA` argument (see here). If it doesn’t work, feel free to file an issue at GitHub, if possible with some example data to reproduce a bug.

26. Laura Andrea De Gracia says:

Many thanks, Daniel, for the package. It is so useful! But I have a question for you: does the package work with R Commander? I tried to use it and it did not work.

Comments are closed.