Inspired by these two postings, I decided to add a function to my package for quickly creating scatter plots. The function is called sjp.scatter. To reproduce the examples below, first load the package and then attach the sample data set:
library(sjPlot)
data(efc)
The simplest call provides just two variables, one for the x-axis and one for the y-axis:
sjp.scatter(efc$c160age, efc$e17age)
With continuous variables that cover a wide range of values, overplotting is rarely a problem. It usually occurs when the variables have only a few categories (factor levels), so that many dots land on the same coordinates. The function estimates the amount of overlapping dots and jitters them automatically, as in the following example, which also includes a marginal rug plot:
sjp.scatter(efc$e16sex, efc$neg_c_7, efc$c172code, showRug=TRUE)
The same plot, when auto-jittering is turned off, would look like this:
sjp.scatter(efc$e16sex, efc$neg_c_7, efc$c172code, showRug=TRUE, autojitter=FALSE)
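To make the idea behind auto-jittering more concrete, here is a minimal base-R sketch. It is not the code used inside sjp.scatter: the helper names overlap_share and auto_jitter, the 0.15 threshold and the simulated data are all assumptions made up for this illustration. It measures how many points fall on an already occupied coordinate and jitters only if that share is large.

```r
# Hypothetical helper, not part of the package: share of points that sit
# on a coordinate already occupied by an earlier point.
overlap_share <- function(x, y) {
  mean(duplicated(data.frame(x, y)))
}

# Jitter both variables only when the share of overlapping points exceeds
# `threshold` (an arbitrary cut-off chosen for this sketch).
auto_jitter <- function(x, y, threshold = 0.15) {
  if (overlap_share(x, y) > threshold) {
    x <- jitter(x)
    y <- jitter(y)
  }
  data.frame(x = x, y = y)
}

set.seed(1)
# Simulated variables with few categories, so many points coincide
sex <- sample(1:2, 200, replace = TRUE)
score <- sample(7:28, 200, replace = TRUE)

overlap_share(sex, score)            # most points overlap before jittering
jittered <- auto_jitter(sex, score)
plot(jittered$x, jittered$y)
```

Variables without overlapping points pass through auto_jitter unchanged, which is why continuous variables are not affected.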
You can also add a grouping variable. The scatter plot is then "divided" into as many groups as the grouping variable has levels. In the next example, two variables (elder’s and carer’s age) are grouped by the dependency level of the elderly person. Additionally, a fitted line is plotted for each group:
sjp.scatter(efc$c160age, efc$e17age, efc$e42dep,
            title="Scatter Plot",
            legendTitle=sji.getVariableLabels(efc)['e42dep'],
            legendLabels=sji.getValueLabels(efc)[['e42dep']],
            axisTitle.x=sji.getVariableLabels(efc)['c160age'],
            axisTitle.y=sji.getVariableLabels(efc)['e17age'],
            showGroupFitLine=TRUE)
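For readers who want to see what such a grouped plot looks like in plain ggplot2, here is a rough sketch. It uses simulated data instead of the efc data set, and the column names are invented for the example. The linear fit (method = "lm") is also an assumption; the text above does not say which kind of fitted line the function draws.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in data: two age variables and a 4-level grouping factor
df <- data.frame(
  carer_age = round(runif(120, 30, 80)),
  dependency = factor(sample(1:4, 120, replace = TRUE))
)
df$elder_age <- round(60 + 0.3 * df$carer_age + rnorm(120, sd = 6))

p <- ggplot(df, aes(x = carer_age, y = elder_age, colour = dependency)) +
  geom_point() +
  # colour is mapped to the grouping factor, so one line is fitted per group
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "Scatter Plot", x = "Carer's age", y = "Elder's age",
       colour = "Dependency")
print(p)
```

Mapping the grouping variable to colour is what makes ggplot2 fit the lines separately; without it, a single line would be fitted to all points.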
If the groups are difficult to distinguish in a single plot area, the graph can be faceted by groups. This is shown in the last example, where the same scatter plot as above is plotted with facets for each group:
sjp.scatter(efc$c160age, efc$e17age, efc$e42dep,
            title="Scatter Plot",
            legendTitle=sji.getVariableLabels(efc)['e42dep'],
            legendLabels=sji.getValueLabels(efc)[['e42dep']],
            axisTitle.x=sji.getVariableLabels(efc)['c160age'],
            axisTitle.y=sji.getVariableLabels(efc)['e17age'],
            showGroupFitLine=TRUE,
            useFacetGrid=TRUE,
            showSE=TRUE)
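The faceted version can be approximated in plain ggplot2 as well. Again, this is only a sketch with simulated data and invented column names, not the code behind sjp.scatter. I use facet_wrap here, although the argument name useFacetGrid suggests the function may rely on facet_grid instead. The linear fit is likewise an assumption.

```r
library(ggplot2)

set.seed(7)
# Simulated data with three groups of 30 observations each
df <- data.frame(
  x = runif(90, 30, 80),
  group = factor(rep(c("low", "medium", "high"), each = 30),
                 levels = c("low", "medium", "high"))
)
df$y <- 50 + 0.4 * df$x + rnorm(90, sd = 5)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  # se = TRUE adds the confidence band around each fitted line
  geom_smooth(method = "lm", se = TRUE) +
  # one panel per group; the line is fitted separately within each panel
  facet_wrap(~ group)
print(p)
```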
A complete overview of the function’s options is available in the package help or at inside-r.