by Robert A. Muenchen
The R Commander is a free and open source user interface for the R software, one that focuses on helping users learn R commands by point-and-clicking their way through analyses. The R Commander is available on Windows, Mac, and Linux; there is no server version.
This is one of a series of reviews which aim to help non-programmers choose the user interface for R which is best for them. Each review also includes a cursory description of the programming support that each interface offers.
There are various definitions of user interface types, so here’s how I’ll be using these terms:
GUI = Graphical User Interface specifically using menus and dialog boxes to avoid having to type programming code. I do not include any assistance for programming in this definition. So GUI users are people who prefer using a GUI to perform their analyses. They don’t have the time or inclination to become good programmers.
IDE = Integrated Development Environment which helps programmers write code. I do not include point-and-click style menus and dialog boxes when using this term. IDE users are people who prefer to write R code to perform their analyses.
The various user interfaces available for R differ quite a lot in how they’re installed. Some, such as jamovi, BlueSky, or RKWard, install in a single step. Others, such as Deducer, install in multiple steps. Advanced computer users often don’t appreciate how lost beginners can become while attempting even a single-step installation. The HelpDesks at most universities are flooded with such calls at the beginning of each semester!
As described on the R Commander main web site, the installation basics are as follows:
- Download R from CRAN and install it in the manner appropriate to your operating system. If you have an old version of R — that is, older than the current version — then it’s generally a good idea to install the current version of R before installing the Rcmdr package. On Windows, opt for a customized startup and select the single-document interface (“SDI,” see the Windows notes below for details).
- On Mac OS X only, download and install XQuartz, and reboot your computer (see the Mac notes below for greater detail).
- Start R, and at the > command prompt, type the command install.packages("Rcmdr").
- Once it is installed, to load the Rcmdr package, just enter the command library("Rcmdr").
- Optionally install Pandoc and LaTeX to get publication-quality output (via “Tools> Install auxiliary software”).
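The install and load steps above can be sketched as a short script. This is only a sketch: the variable name rcmdr_installed is my own, and the interactive() guards are an assumption I added so that nothing is downloaded and no window opens when the lines are run in a batch job.

```r
# Check whether the Rcmdr package is already in the library (nothing is loaded here).
rcmdr_installed <- "Rcmdr" %in% rownames(installed.packages())

# Install from CRAN only when it is missing and someone is at the console.
if (!rcmdr_installed && interactive()) {
  install.packages("Rcmdr")
}

# Loading the package is the step that opens the R Commander window.
if (interactive() && "Rcmdr" %in% rownames(installed.packages())) {
  library("Rcmdr")
}
```

Typed at the > prompt, the two commands from the list do the same thing; the script form is just easier to hand to a beginner as a single file.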
Complete installation notes are here. They’re worth reading as they go on to point out several things that can go wrong. These include having an incompatible version of R (i.e. you skipped step 1), and R packages which fail to install.
While these multiple steps are more challenging than single-step installations, they are in line with the developer’s goal of helping people learn to program in R. That audience would have to learn to install R and R packages, then load packages anyway.
When choosing a GUI, one of the most fundamental questions is: what can it do for you? What the initial software installation of each GUI gets you is covered in the Graphics, Analysis, and Modeling section of this series of articles. Regardless of what comes built-in, it’s good to know how active the development community is. They contribute “plug-ins” which add new menus and dialog boxes to the GUI. This level of activity ranges from very low (RKWard, BlueSky, Deducer) through moderate (jamovi) to very active.
R Commander’s development community is by far the most active, with 42 add-ons available. The add-ons are stored on CRAN and installed like any other software. You install them using the install.packages function, then choose “Tools> Load Rcmdr plug-ins…", select the plug-in, and click OK. R Commander will then tell you that you need to restart R Commander to have it appear on the menus. The R Commander stores your list of plug-ins in your .Rprofile and will edit it for you. That’s important as editing it is a non-trivial task (see Installation, step 6).
You can find a comprehensive list of plug-ins here: https://cran.r-project.org/web/packages/index.html.
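Since the plug-ins are ordinary CRAN packages, R itself can pick them out of a list of package names. The sketch below assumes the usual naming convention in which plug-in names begin with "RcmdrPlugin."; the helper find_rcmdr_plugins and the sample vector are mine, made up for illustration.

```r
# Keep only the names that follow the "RcmdrPlugin." naming convention.
find_rcmdr_plugins <- function(package_names) {
  sort(grep("^RcmdrPlugin\\.", package_names, value = TRUE))
}

# A small made-up sample of package names to show the filter at work.
sample_names <- c("ggplot2", "Rcmdr", "RcmdrPlugin.KMggplot2",
                  "RcmdrMisc", "RcmdrPlugin.plotByGroup")
plugins <- find_rcmdr_plugins(sample_names)
print(plugins)

# With an internet connection, the same filter can be applied to the CRAN index:
# find_rcmdr_plugins(rownames(available.packages()))
```

Any name it returns can then be passed to install.packages like any other package.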
Some user interfaces for R, such as jamovi and BlueSky Statistics, start by double-clicking on a single icon, which is great for people who prefer to not write code. Others, such as Deducer, have you start R, then load a package from your library, then call a function. That’s better for people looking to learn R, as those are among the first tasks they’ll have to learn anyway.
You start R Commander by first starting the RGUI program that comes with the main R package. You can also start it from any program that offers an R console, such as the one that comes with the main R installation. Once R is started, you start the R Commander by loading it from your library by typing this command and pressing the Enter key: library("Rcmdr"). The main control screen will then appear (Figure 1, upper left) along with a graphics screen (Figure 1, right).
If you want to have the R Commander start automatically each time you start R, it’s possible to do so, but it is not a task for beginners. The R Commander creates an “.Rprofile" in your main directory and it includes instructions about how to “uncomment" a few lines by removing the leading “#" characters. However, files whose names consist only of an “.extension" are hidden by your file system, and if edited using the Notepad application, cannot be saved with that non-standard type of name. You can save it to a regular name like “Rprofile.txt". You then must use operating system commands to rename it, as the Windows file manager also won’t let you choose a file name that is only an extension.
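One way around the renaming problem is to let R do the rename, since R’s own file functions have no trouble with a name like “.Rprofile”. The sketch below works in a temporary folder so it cannot touch a real profile. The four profile lines are an assumption on my part: they show a commonly used way of adding Rcmdr to R’s default packages, and the lines the R Commander actually writes may differ.

```r
# Work in a throwaway folder so no real profile is modified.
work_dir <- file.path(tempdir(), "rprofile_demo")
dir.create(work_dir, showWarnings = FALSE)
draft  <- file.path(work_dir, "Rprofile.txt")  # the name Notepad will accept
target <- file.path(work_dir, ".Rprofile")     # the name R looks for
unlink(target)                                 # clear any copy from an earlier run

# Assumed profile contents: add Rcmdr to the packages R loads at startup.
writeLines(c(
  "local({",
  "  old <- getOption(\"defaultPackages\")",
  "  options(defaultPackages = c(old, \"Rcmdr\"))",
  "})"
), draft)

# The rename that the file manager refuses to do.
renamed <- file.rename(draft, target)
print(renamed)
```

For a real profile, the two paths would point at your home directory instead of a temporary folder.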
A data editor is a fundamental feature in data analysis software. It puts you in touch with your data, lets you get a feel for it, if only in a rough way. A data editor is such a simple concept that you might think there would be hardly any differences in how they work in different GUIs. While there are technical differences (single-click sorting, icons that show excluded observations, etc.), to a beginner what matters the most are the differences in simplicity. Some GUIs, including jamovi and BlueSky, let you create only what R calls a data frame. They use more common terminology and call it a data set: you create one, you save one, later you open one, then you use one. Others, such as RKWard, trade this simplicity for the full R language perspective: a data set is stored in a workspace. So the process goes: you create a data set, you save a workspace, you open a workspace, and choose a data set from within it.
You start the R Commander’s Data Editor by choosing “Data> New data set…” You can enter data immediately, though at first the variables are named simply V1, V2… and the rows are named 1,2,3…. You can click on the names to change them (see Figure 2). Clicking on the “Add row” or “Add column” button does just that, though the Enter key is a quicker way to get a new row. You can enter simple numeric data or character data; no scientific notation, no dates. Character data is converted to a factor, but there is no way to enter the underlying values such as 1, 2 and have the editor display Male, Female, for example. That slows down data entry.
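What the editor cannot do is a one-liner in R code. As a minimal sketch with made-up values, you can type the quick numeric codes and attach the labels afterwards:

```r
# Codes as they might be typed during fast data entry.
codes <- c(1, 2, 2, 1, 2)

# Attach labels: 1 is shown as Male, 2 as Female.
gender <- factor(codes, levels = c(1, 2), labels = c("Male", "Female"))

print(table(gender))        # counts by label
print(as.integer(gender))   # the underlying codes are still there
```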
There is no way to enter or change any metadata other than variable and row names.
Saving the data provides a lesson on R data structures. Since you started the process by creating a new “data set”, you might start looking on the menus for where to save such a thing. Instead, you have to know that in R, data sets reside in something called a “workspace”. So “Data: New data set…” is balanced by “File: Save R workspace”. It would be nice if there were some instruction explaining this situation.
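In code, the data set and workspace relationship looks like the sketch below. The data frame and file name are invented for the example, and I am not claiming these are the exact lines the menus generate.

```r
# A small data set, as if it had just been typed into the Data Editor.
mydata <- data.frame(id = 1:3, score = c(10, 12, 9))

# Save it into a workspace file. save() stores the objects you name;
# save.image() would store everything currently in the workspace.
ws_file <- file.path(tempdir(), "demo_workspace.RData")
save(mydata, file = ws_file)

# Remove the data set, then get it back by loading the workspace file.
rm(mydata)
restored <- load(ws_file)   # load() returns the names of the restored objects
print(restored)
print(mydata)
```

Note that load() brings the data set back under its original name, which is why you never tell it what to call the result.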
The R Commander can import the file formats: CSV, TXT, Excel, Minitab, SPSS, SAS, and Stata. It can even import data directly from a URL, which is a rare feature for a GUI. These are all located under “Data> Import Data". A particularly handy feature is the ability to explore and load data sets that are included with installed packages. That’s done via “Data> Data in packages…".
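For comparison, here is roughly what the simplest of these imports looks like in base R. This is a sketch under my own assumptions: the CSV file is created on the spot so the example is self-contained, and the code the R Commander itself writes for these menu items may look different.

```r
# Create a tiny CSV file so there is something to import.
csv_file <- file.path(tempdir(), "demo.csv")
writeLines(c("id,group,score", "1,a,3.5", "2,b,4.0", "3,a,"), csv_file)

# Import it; the empty score on the last row becomes a missing value (NA).
# read.csv() also accepts a URL in place of a file path.
imported <- read.csv(csv_file, stringsAsFactors = TRUE)
print(imported)

# Explore the data sets that ship with an installed package, then load one.
available <- data(package = "datasets")$results[, c("Item", "Title")]
print(head(available))
data("ToothGrowth", package = "datasets")
print(head(ToothGrowth))
```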
To get data from SQL database formats, you’ll have to use R code.
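As a rough idea of what that code involves, here is a sketch using the DBI and RSQLite packages. Both are add-on packages that are assumed to be installed, the in-memory database and the survey table are invented for the example, and a real database server would need its own driver and connection details.

```r
# Run only if both add-on packages are available.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {

  # An in-memory SQLite database stands in for a real database server.
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

  # Put a small made-up table into it.
  DBI::dbWriteTable(con, "survey",
                    data.frame(id = 1:4,
                               workshop = c("R", "SAS", "R", "SPSS")))

  # Pull back only the rows you want, as an ordinary data frame.
  r_rows <- DBI::dbGetQuery(
    con, "SELECT id, workshop FROM survey WHERE workshop = 'R'")
  DBI::dbDisconnect(con)

  print(r_rows)
}
```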
It’s often said that 80% of data analysis time is spent preparing the data. Variables need to be transformed, recoded, or created; missing values need to be handled; datasets need to be stacked or merged, aggregated, transposed, or reshaped (e.g. from wide to long and back). A critically important aspect of data management is the ability to transform many variables at once. For example, social scientists need to recode many survey items, biologists need to take the logarithms of many variables. Doing such tasks one variable at a time is tedious. Some GUIs, such as BlueSky, handle nearly all of these challenges. Others, such as RKWard, offer just a handful of data management functions.
The R Commander is able to recode many variables at once, adding an optional prefix such as “recoded_” to the name of each variable that you choose to recode. It can also standardize many variables at once, but can only overwrite the original values. Make a copy of your data set before doing that! Unfortunately, when it comes to other popular transformations such as the logarithm, you have to apply them one variable at a time.
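If you are willing to type a little code into the R Script window, both limitations go away. This is a minimal base R sketch with made-up variables; the log_ and z_ prefixes are my own choice, used so that the original values are kept rather than overwritten.

```r
# A small made-up data set with three variables to transform.
mydata <- data.frame(id = 1:4,
                     q1 = c(1, 10, 100, 1000),
                     q2 = c(2, 4, 8, 16),
                     q3 = c(5, 50, 500, 5000))
vars <- c("q1", "q2", "q3")

# Take the logarithm of all three at once, storing results in new columns.
mydata[paste0("log_", vars)] <- lapply(mydata[vars], log)

# Standardize all three at once, again leaving the originals untouched.
mydata[paste0("z_", vars)] <- lapply(mydata[vars],
                                     function(x) as.numeric(scale(x)))

print(names(mydata))
```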
For reshaping data sets, the R Commander can stack one set of variables into a single variable and create a factor to classify those values, but it can’t take along other relevant variables, nor can it do the reverse of this process by going from “long" to “wide" data structures.
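Both gaps can be filled with code. The sketch below uses the reshape function from base R on a tiny invented data set; it carries the gender variable along when going from wide to long, then goes back from long to wide. The data and variable names are assumptions for the example, and other packages offer friendlier ways to do the same thing.

```r
# Wide form: one row per person, one column per survey item.
wide <- data.frame(id = 1:3,
                   gender = c("f", "m", "f"),
                   q1 = c(1, 2, 3),
                   q2 = c(4, 5, 6))

# Wide to long: stack q1 and q2 into "score", with "item" saying which was which.
# The id and gender columns are carried along automatically.
long <- reshape(wide, direction = "long",
                varying = c("q1", "q2"), v.names = "score",
                timevar = "item", times = c("q1", "q2"),
                idvar = "id")
print(long)

# Long to wide: the reverse trip, giving columns score.q1 and score.q2.
wide_again <- reshape(long, direction = "wide",
                      idvar = "id", timevar = "item", v.names = "score")
print(wide_again)
```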
Overall, the R Commander offers a very useful set of data management tools:
For managing the active data set as a whole:
- View data
- Select active data set
- Refresh active data set
- Help on active data set
- Variables in active data set
- Set case names
- Subset active data set
- Sort active data set
- Aggregate variables in the active data set
- Remove row(s) from active data set
- Stack variables in active data set (half of reshaping discussed above)
- Remove cases with missing data
- Save active data set
- Export active data set
For managing variables in the active data set:
- Recode variables (able to do many variables)
- Compute new variables (can create only one new variable at a time)
- Add observation numbers to data set
- Standardize variables (able to do many variables at once)
- Convert numeric variables to factors
- Bin numeric variable
- Reorder factor levels
- Drop unused factor levels
- Define contrasts for a factor
- Rename variables
- Delete variables from data set
Menus & Dialog Boxes
The goal of pointing & clicking your way through an analysis is to save time by recognizing menu settings rather than spend it on the memorization and practice required by programming. Some GUIs, such as jamovi, make this easy by sticking to menu standards and using simpler dialog boxes; others, such as RKWard, use non-standard menus that are unique to it and hence require more learning.
Figure 1 shows a common screen layout. The main R Commander window is in the upper left. A typical dialog box is in the front, and the graph it created is on the right. The data editor is on the upper right.
The R Commander’s menu structure contains some unique choices. No operations on data files are located on the usual “File” menu. For example, existing data sets or files are not opened using the usual “File> Open…”, but instead using the “Data> Load data set…” menu. Also, everything on the Models menu applies not to data but to models that you’ve already created from data. The other menus follow Windows standards. When switching between software packages, I found myself usually looking for data under the File menu. The rationale behind the R Commander’s approach is that the R function that opens files is named “load”. So this structure will help people learn more about R code (whether they’re headed that way or not!).
The dialog boxes have their own style too, but one that is easy to learn. Rather than have empty role boxes that you drag or click variables into, the role boxes contain the full list of relevant variables, and you click on one (or more) to select them (see “X variable (pick one)” in Fig. 1). In the cases where there is an empty role box, you double-click a variable name to move it from the list to the box. The R Commander does a nice job of helping you avoid asking for absurdities, such as the mean of a factor, by not displaying them in certain dialog boxes.
The two objects you might be working on are shown on the toolbar right below the main menus. Clicking the “Data set:" tool will allow you to choose which data set most of the dialog boxes will refer to by default. That’s filled in automatically when you load, enter, or import a data set. Similarly, clicking the “Model:" tool will let you select a model which most of the choices on the Models menu will relate to. It too is filled in automatically each time you create a new model. See more on this in the Modeling section below.
Documentation & Training
There is excellent quality documentation available to help you learn the R Commander. The one to start with is Getting Started With the R Commander, by lead developer John Fox and Milan Bouchet-Valat. There is a complete book Using the R Commander, A Point-and-Click Interface for R, by John Fox.
On YouTube.com you’ll find thousands of videos on how to use the R Commander.
R GUIs provide simple task-by-task dialog boxes which generate much more complex code. Sometimes that code consists of custom functions that control R’s standard ones. So for a particular task, there is the potential for you to need help at three levels of complexity. Nearly all R GUIs provide that level of help when needed.
The R Commander provides help files for its general use, for the R functions its dialog boxes use, but oddly enough, not for the dialog boxes themselves.
Each dialog box has a help button in the lower left corner which opens a standard R help file in your browser. Unfortunately, that help has little to do with the dialog box. Instead, it describes the underlying R programming language that the dialog box calls. If you’re a devoted GUI user, you’ll be disappointed. But if your goal is to learn R programming, this will help get you used to help files that are rarely written for beginners. For example, the help file for “Statistics> Summaries> Tests of normality…" says:
“formula: one-sided formula of the form ~x or two-sided formula of the form x ~ groups, where x is a numeric variable and groups is a factor." If you were planning on learning to control that function using programming, that’s very useful information. However, there is no reference at all to “formula" in the dialog that called up that help!
The R Commander is the only GUI I’m aware of that lacks dialog-specific help. The others provide that (albeit rather sparse given the simplicity of dialog boxes), and then link to the more complex R help if you want to see more. The GUIs that have very tight ties between their dialog boxes and the custom functions they use, notably jamovi and BlueSky, provide R-style detailed help files that do go more deeply into detail than many GUI users would want to see, while avoiding the inclusion of steps that cannot be done using their dialog boxes.
The various GUIs available for R handle graphics in several ways. Some, such as RKWard, focus on R’s traditional graphics. Others, such as BlueSky Statistics focus on the popular ggplot2 package while banishing traditional graphics to a “Legacy" menu. Still others, such as jamovi, use their own functions to tie its graphs closely to the type of analysis you’re doing.
GUIs also differ quite a lot in how they control the style of the graphs they generate. Ideally, you would set the style and all graphs would follow it. That’s how jamovi works, but then jamovi is limited to its custom graph functions, as nice as they may be. BlueSky uses ggplot2 graphics almost exclusively, and its dialogs offer to apply “themes" from the ggthemes package.
The R Commander offers control over all three of R’s graphics types: traditional, lattice, and ggplot2. Built into it are traditional and some lattice graphics. For more extensive support for lattice graphics, try the plug-in: RcmdrPlugin.plotByGroup. Adding the KMggplot2 plug-in and you’ve got support for ggplot2 as well.
Regarding the standardization of style, given the breadth of graphics packages supported, standardizing styles across graphs is not a realistic expectation.
Here is the selection of plots the R Commander can create.
- Index plot…
- Dot plot…
- Plot discrete numeric variable…
- Density estimate…
- Stem-and-leaf display…
- Quantile-comparison plot…
- Scatterplot matrix…
- Line graph…
- XY conditioning plot…
- Plot of means…
- Strip chart…
- Bar graph…
- Pie chart…
- 3D graph…
To see how different graphs look, below are three, one from each of R’s main graphics systems. This one using traditional graphics from “Graphs> Scatterplot":
When doing plots that compare groups, the R Commander switches to graphs produced by the lattice package. Not all the same options are present, for example fitting a linear regression line isn’t included via the dialog box, so you’d have to modify the code to get such options added. This is from “Graphs> XY Conditioning plot…"
The R Commander’s KMggplot2 plug-in uses ggplot2 behind the scenes. It allows for more options such as linear regression fits. For the next plot, the dialog box required only the X variable, Y variable, X facet variable, Y facet variable, and smoothing fit. It makes the fairly complex ggplot language very easy to control. It’s not as flexible, but it does do the most popular types of plots:
Here is the ggplot code created by the dialog box shown in Figure 6:
load("C:/Users/muenchen/Documents/R4STATS/mydata100.RData")
# NOTE: The dataset mydata100 has 100 rows and 9 columns.
require("ggplot2")
.df <- data.frame(x = mydata100$pretest, y = mydata100$posttest,
  s = mydata100$workshop, t = mydata100$gender)
.plot <- ggplot(data = .df, aes(x = x, y = y)) +
  geom_point() +
  stat_smooth(method = "lm") +
  scale_y_continuous(expand = c(0.01, 0)) +
  facet_grid(s ~ t) +
  xlab("pretest") +
  ylab("posttest") +
  theme_bw(base_size = 14, base_family = "sans") +
  theme(panel.spacing = unit(0.3, "lines"))
The way statistical models (which R calls model objects) are created and used is an area in which R GUIs differ the most. The simplest, and least flexible, approach is taken by jamovi and RKWard. They try to do everything you might need in a single dialog box. To an R programmer, that sounds extreme, as R works with models one task at a time. However, neither SAS nor SPSS was able to save models for their first 35 years of existence! There are ways to work around that limitation. BlueSky’s modeling approach goes further by saving model objects, but then only offering a few things to do with them, such as making predictions. That is a task you could do instead by creating a new variable manually.
R Commander is “all in” on modeling. When it creates a model, it saves it automatically, giving it a useful name like RegModel.1. In the same step, it provides minimal summary information about the model. It then offers 25 other menu selections to do things with those models! Since one of the R Commander’s goals is to help you learn R programming, this makes perfect sense. However, occasional users may find this approach intimidating. When you have the dialog box open for a given model type, and it has a selection of options from which to choose, you know they are all relevant to that type of model. But when you create a model and then look at all the other menus and their dialog boxes, it’s not always clear which are relevant.
All of the R GUIs offer a decent set of statistical analysis methods. Some also offer machine learning methods. The R Commander’s selection is the most comprehensive. Since this topic is so complex, I’ll just provide links to let you decide if it has what you need. Here’s the R Commander’s list of built-in methods:
And here’s the list of plug-ins; just use CTRL-F to have your browser search for “Rcmdr":
Generated R Code
One of the aspects that most differentiates the various GUIs for R is the code they generate. If you decide you want to save code, what type of code is best for you? The concise functions that mimic the simplicity of one-step dialogs such as jamovi provides? The tidyverse-based code that BlueSky writes? The completely transparent (and complex) code provided by RKWard, which might be the best for budding R power users?
Below is an example of the R Commander’s code for a simple linear regression. It’s extremely concise, using far less code than the other GUIs (see the same example done in their reviews). However, part of the reason for that is that it provides very little output, and the output is not formatted as word processing tables!
RegModel.1 <- lm(posttest~pretest, data=mydata100)
Now that you’ve got your RegModel.1, it’s time to sift through the choices on the Models menu to see what you can do with it. Several of the other GUIs make that selection for you, putting you in a padded cell that’s oh so comfortable! But still a cell.
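To give a feel for what working with a saved model object means, here is a sketch of a few follow-up steps done in code. The data are simulated stand-ins for mydata100 (an assumption, since the real file is not included here), and these are standard R functions chosen for illustration rather than a list of what each Models menu item generates.

```r
# Simulated stand-in for mydata100, so the example runs on its own.
set.seed(1)
mydata100 <- data.frame(pretest = rnorm(100, mean = 75, sd = 5))
mydata100$posttest <- 10 + 0.95 * mydata100$pretest + rnorm(100, sd = 3)

# The model, saved under the name the R Commander would give it.
RegModel.1 <- lm(posttest ~ pretest, data = mydata100)

# Things you can do with the saved model object, one task at a time.
print(summary(RegModel.1))             # coefficients, R-squared, and so on
intervals <- confint(RegModel.1)       # confidence intervals for the parameters
print(intervals)
print(anova(RegModel.1))               # ANOVA table

# Observation-level output added back to the data set.
mydata100$fitted    <- fitted(RegModel.1)
mydata100$residuals <- residuals(RegModel.1)

# Predictions for new pretest scores.
print(predict(RegModel.1, newdata = data.frame(pretest = c(70, 80))))
```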
Support for Programmers
Some of the GUIs reviewed in this series of articles include extensive support for programming. For example, RKWard offers almost all the power of IDEs such as RStudio or Eclipse StatET. BlueSky gets perhaps half of the way there.
The R Commander offers a minimal “R Script" window that lets you edit and save code, but it lacks features that R power users look for such as color-coded syntax checking. That’s not really its reason to exist.
Reproducibility & Sharing
One of the biggest challenges that GUI users face is being able to reproduce what they did. Reproducibility is useful for re-running everything on the same dataset if you find a data entry error. It’s also useful for applying your work to new datasets so long as they use the same variable names (or the software can handle changes). Some scientific journals ask researchers to submit their files (usually code and data) along with their written report so that others can check their work.
As important a topic as it is, reproducibility is a problem for GUI users, a problem that has only recently been solved by some software developers. Most GUIs, such as BlueSky, save only code, but it’s not code the GUI users wrote, so they also can’t read it or change it! Others such as jamovi, RKWard, and the newest version of SPSS save the dialog box entries and allow GUI users to have reproducibility in the form they prefer.
The R Commander offers only code-based reproducibility. There’s no way to recreate a fully specified dialog box when starting from the saved code.
If you wish to share your work with colleagues, you would send them two files: the code and your data set. They would then have to install either the R Commander, or the RcmdrMisc package. The latter contains all the R Commander’s built-in functions. Your colleague might also have to install some plug-ins, if you used any (they’re standard R packages).
Output & Report Writing
Ideally, output should be clearly labeled, well organized, and of publication quality. It might also delve into the realm of word processing through Sweave/knitr and Rmarkdown documents. At the moment, none of the GUIs covered in this series of reviews meets all of these requirements. See the separate reviews to see how each of the other packages is doing on this topic.
The R Commander is currently the only R GUI that includes full support for R Markdown. This enables it to handle the labeling and organization of both code and its output with aplomb. In its own tabbed window, the R Commander provides a template showing you where to put your name (or a title) and stamps it with the current date. After that, it’s just text, so you can easily move or delete sections and add titles or comments to help you document your work. There’s a “Generate Report" button on the bottom right of the R Markdown window. Clicking it causes it to create an HTML report and display it in your browser. With embellishments, it can act as your final report, obviating the need for a word processor.
However, the output for the statistical tables is not high quality. It’s the same monospaced display from R itself. For example, here’s output to compare males and females on two variables from the R Commander’s formatted R Markdown report:
And below is approximately the same output directly from jamovi, with no modification. It’s a true word processing table that’s automatically in the form that most journals require for publication. Similar high-quality output is also available in BlueSky and RKWard.
Repeating an analysis on different groups of observations is a core task in data science. Software needs to provide the ability to select one group as a subset to analyze, then another subset to compare it to. All the GUIs reviewed provide that feature, including the R Commander.
Software also needs the ability to automate such selections so that you might generate dozens of analyses, one group at a time. While this has been available in commercial GUIs for decades, only one R GUI, BlueSky, includes that feature.
Early in the development of statistical software, developers tried to guess what output would be important to save to a new dataset (e.g. predicted values, factor scores), and the ability to save such output was built into the analysis procedures themselves. However, researchers were far more creative than the developers anticipated. To better meet their needs, output management systems were created and tacked on to existing tools (e.g. SAS’ Output Delivery System, SPSS’ Output Management System). One of R’s greatest strengths is that every bit of output can be readily used as input. However, for the simplification that GUIs provide, that’s a challenge.
Output data can be observation-level, such as predicted values for each observation or case. When group-by analyses are run, the output data can also be observation-level, but now the (e.g.) predicted values would be created by individual models for each group, rather than one model based on the entire original data set (perhaps with group included as a set of indicator variables).
Group-by analyses can also create model-level data sets, such as one R-squared value for each group’s model. They can also create parameter-level data sets, such as the p-value for each regression parameter for each group’s model. (Saving and using single models is covered under “Modeling" above.)
For example, in our organization, we have 250 departments and want to see if any of them have a gender bias on salary. We write all 250 regression models to a data set, and then search to find those whose gender parameter is significant (hoping to find none, of course!)
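In R code, that kind of model-level and parameter-level output takes only a few lines. The sketch below uses simulated data with five departments instead of 250; the data frame staff, its variables, and the helper fit_one are all invented for the example, and it is not how any of the GUIs implement the feature.

```r
# Simulated staff data: 5 departments, 40 people each, alternating gender.
set.seed(42)
n_dept <- 5
n_per  <- 40
staff <- data.frame(
  dept   = rep(paste0("D", 1:n_dept), each = n_per),
  gender = factor(rep(c("Female", "Male"), times = n_dept * n_per / 2))
)
staff$salary <- 50000 + rnorm(nrow(staff), sd = 5000)

# Build a salary gap into department D3 so there is something to find.
gap <- staff$dept == "D3" & staff$gender == "Male"
staff$salary[gap] <- staff$salary[gap] + 8000

# Fit one regression per department and keep one row of results for each.
fit_one <- function(d) {
  fit <- lm(salary ~ gender, data = d)
  est <- coef(summary(fit))
  data.frame(dept            = d$dept[1],
             r_squared       = summary(fit)$r.squared,        # model-level
             gender_estimate = est["genderMale", "Estimate"],  # parameter-level
             gender_p        = est["genderMale", "Pr(>|t|)"])
}
results <- do.call(rbind, lapply(split(staff, staff$dept), fit_one))
print(results)

# Search the resulting data set for departments with a significant gender term.
flagged <- results[results$gender_p < 0.05, ]
print(flagged)
```

With 250 real departments you would also want to think about multiple comparisons before acting on the flagged list.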
The R Commander creates only observation-level data, such as predicted values. That is a nearly universal limitation of the GUIs reviewed in this series. Only BlueSky Statistics offers all three types of output management, and it does so only for a limited array of models.
The R Commander can be extended through the use of plug-ins, which are a form of R package that include dialog box controls that integrate into the R Commander’s menu structure. A 100-page manual and other materials supporting plug-in developers is available here.
The R Commander’s many strengths have made it the most popular point-and-click GUI for R. Its built-in functionality is impressive, and its number of plug-ins is unsurpassed. Its closest competitor is the newcomer BlueSky Statistics, whose functionality is similar. The R Commander is run from within R and it has the advantage of full support for reports using R Markdown. On the other hand, BlueSky feels more like a stand-alone application (its use of R is initially hidden) and it focuses on publication-quality output destined for word processors. jamovi is another recent arrival. While its functionality is currently far behind the leaders, its developers are adding functionality at a rapid rate.
The R Commander’s extensive functionality, its availability across all three major operating systems, and its translation into all major world languages mean that if you’re looking for an R GUI, the R Commander might be just what you need!
For a summary of all my R GUI software reviews, see the article, R Graphical User Interface Comparison.
Thanks to John Fox, Milan Bouchet-Valat, and the many people who contributed to the creation of the R Commander. A special thanks to John Fox for his many suggestions that greatly improved this article.