A Comparative Review of the RKWard GUI for R

by Robert A. Muenchen

Introduction

RKWard is a free and open source Graphical User Interface for the R software, one that supports beginners looking to point-and-click their way through analyses, as well as advanced programmers. You can think of it as a blend of the menus and dialog boxes that R Commander offers combined with the programming support that RStudio provides. RKWard is available on Windows, Mac, and Linux.

This review is one of a series which aims to help non-programmers choose the Graphical User Interface (GUI) that is best for them. However, I do include a cursory overview of how RKWard helps you work with code. In most sections, I’ll begin with a brief description of the topic’s functionality and how GUIs differ in implementing it. Then I’ll cover how RKWard does it.

Figure 1. RKWard’s main control screen containing an open data editor window (big one), an open dialog box (right) and its output window (lower left).

 

Terminology

There are various definitions of user interface types, so here’s how I’ll be using these terms:

GUI = Graphical User Interface specifically using menus and dialog boxes to avoid having to type programming code. I do not include any assistance for programming in this definition. So GUI users are people who prefer using a GUI to perform their analyses. They often don’t have the time required to become good programmers.

IDE = Integrated Development Environment which helps programmers write code. I do not include point-and-click style menus and dialog boxes when using this term. IDE users are people who prefer to write R code to perform their analyses.

 

Installation

The various user interfaces available for R differ quite a lot in how they’re installed. Some, such as jamovi or BlueSky Statistics, install in a single step. Others install in multiple steps, such as R Commander and Deducer. Advanced computer users often don’t appreciate how lost beginners can become while attempting even a single-step installation. I work at the University of Tennessee, and our HelpDesk is flooded with such calls at the beginning of each semester!

Installing RKWard on Windows is done in a single step since its installation file contains both R and RKWard. Linux binaries do not contain a matching copy of R, but the package manager will obtain R (unless already installed). On Mac, the user is responsible for installing R manually. Regardless of their operating system, RKWard users never need to learn how to start R, then execute the install.packages function, and then load a library. Installers for all three operating systems are available here.
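For comparison, the manual route that RKWard spares its users looks roughly like the sketch below. The helper name ensure_package is hypothetical, and the example uses the "stats" package only because it ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is missing, then load it.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from the configured CRAN mirror
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" is part of every R installation, so this only loads it.
ensure_package("stats")
```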

The RKWard installer obtains the appropriate version of R, simplifying the installation and ensuring complete compatibility. However, if you already had a copy of R installed, depending on its version, you could end up with a second copy.

RKWard minimizes the size of its download by waiting to install some R packages until you actually try to use them for the first time. Then it prompts you, offering default settings that will get the package you need.

On Windows, the installation file is 136 megabytes in size.

 

Plug-ins

When choosing a GUI, one of the most fundamental questions is: what can it do for you? What the initial software installation of each GUI gets you is covered in the Graphics, Analysis, and Modeling section of this series of articles. Regardless of what comes built-in, it’s good to know how active the development community is. They contribute “plug-ins" which add new menus and dialog boxes to the GUI. This level of activity ranges from very low (RKWard, BlueSky, Deducer) through moderate (jamovi) to very active (R Commander).

Currently all plug-ins are included with the initial installation. You can see them using the menu selection Settings> Configure Packages> Manage RKWard Plugins. There are only brief descriptions of what they do, but once installed, you can access the help files with a single click.

RKWard add-on modules are part of standard R packages and are distributed on CRAN. Their package descriptions include a field labeled, “enhances: rkward". You can sort packages by that field in RKWard’s package installation dialog where they are displayed with the RKWard icon.

 

Startup

Some user interfaces for R, such as jamovi and BlueSky Statistics, start by double-clicking on a single icon, which is great for people who prefer to not write code. Others, such as R Commander, have you start R, then load a package from your library, then call a function. That’s not good for GUI users, but for people looking to learn the R language, it helps them on their way.

RKWard is started directly as a stand-alone application, not from within R. The next time you start it up, it offers to load your last open workspace & knows its location.

 

Data Editor

A data editor is a fundamental feature in data analysis software. It puts you in touch with your data and lets you get a feel for it, if only in a rough way. A data editor is such a simple concept that you might think there would be hardly any differences in how they work in different GUIs. While there are technical differences, to a beginner what matters the most are the differences in simplicity. Some GUIs, including jamovi and BlueSky, let you create only what R calls a data frame. They use more common terminology and call it a data set: you create one, you save one, later you open one, then you use one. Others, such as R Commander, trade this simplicity for the full R language perspective: a data set is stored in a workspace. So the process goes: you create a data set, you save a workspace, you open a workspace, and choose a data set from within it.

RKWard’s spreadsheet-style data editor is very easy to use. It puts its metadata – variable name, label, type, format, and levels – at the top of each variable (Figure 2). This makes it seem quite natural that you start at the top of the spreadsheet and work your way down until you’re entering the data values. Under “Type" you can double-click to reveal a dropdown menu that shows, 1:Numeric, 2:Factor, 3:String, 4:Logical. You can either click on one of those choices, or type its number to make the selection.

Figure 2. RKWard’s Data Editor showing metadata (top third), and regular data.

Double-clicking “Levels" opens a dialog that offers values of a factor, such as 1, 2 and prompts you to enter each label, such as Male, Female. When finished you can then continue with data entry, typing numbers and having RKWard convert their numbers to the labels. That makes data entry quick and accurate.
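In R code, the pairing of values and labels described above corresponds to a factor. A minimal sketch, using made-up data values and the Male/Female labels from the example:

```r
# Values as they would be typed during data entry (1 = Male, 2 = Female)
gender_codes <- c(1, 2, 2, 1, 2)

# Attach the labels to the codes
gender <- factor(gender_codes, levels = c(1, 2), labels = c("Male", "Female"))

levels(gender)      # the labels
as.integer(gender)  # the underlying codes
table(gender)       # counts per label
```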

The tab key takes you to the next cell. It also adds a new variable when you reach the end of the defined variables. That’s handy if it’s what you wanted to do, but it’s also easy to create a new variable by accident. If that happens, right-click on the variable name and choose “delete."

The Home and End keys take you to the beginning or end of an observation. So to begin entering a new observation, you press Home, then cursor down (or vice versa). I would prefer that the Enter key be used in place of that two-key sequence, but Excel users will probably like it as is.

To save your dataset, choose Workspace> Save Workspace. Recall that to start creating a dataset, you use File> New> Dataset. Since there’s no matching File> Save> Dataset, the beginner is left to make the mental leap that a workspace is the thing that needs saving! The developers are aware of this issue and are working on a solution.
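The distinction between a workspace and a data set is easier to see in code. The sketch below uses base R functions with made-up object and file names; it is not what RKWard's menus run internally.

```r
mydata <- data.frame(id = 1:3, score = c(10, 12, 9))
other  <- "some other object"

ws_file <- file.path(tempdir(), "example.RData")
ds_file <- file.path(tempdir(), "mydata.rds")

# A workspace file can hold many objects (save.image() would store them all)
save(mydata, other, file = ws_file)

# A single data set can also be saved on its own
saveRDS(mydata, ds_file)

restored     <- readRDS(ds_file)  # returns the one data frame
loaded_names <- load(ws_file)     # restores objects, returns their names
loaded_names
```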

When opening an existing data set, most programs will show you the data in spreadsheet form, but RKWard doesn’t. The file opens into a new tabbed window, but that window does not pop to the front, making you wonder if you succeeded in opening the file or not. Another way you’ll know it opened is that its name appears in the Workspace window in the upper left of the main control window.

 

Data Import

File formats that RKWard can import include R Workspaces, Text/CSV, Excel, SPSS, and Stata. Data in other formats, such as SAS or SQL databases, must be imported using code.
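As a rough idea of what "using code" means here, the sketch below wraps two common approaches: the haven package for SAS files and the DBI and RSQLite packages for an SQLite database. These packages are assumptions of this example and are not part of RKWard; the function names import_sas and import_sql_table are hypothetical.

```r
# Hypothetical helper: read a SAS data file, assuming 'haven' is installed
import_sas <- function(path) {
  if (!requireNamespace("haven", quietly = TRUE)) {
    stop("Package 'haven' is required to read SAS files.")
  }
  haven::read_sas(path)
}

# Hypothetical helper: read one table from an SQLite database file,
# assuming 'DBI' and 'RSQLite' are installed
import_sql_table <- function(db_path, table) {
  if (!requireNamespace("DBI", quietly = TRUE) ||
      !requireNamespace("RSQLite", quietly = TRUE)) {
    stop("Packages 'DBI' and 'RSQLite' are required.")
  }
  con <- DBI::dbConnect(RSQLite::SQLite(), db_path)
  on.exit(DBI::dbDisconnect(con))  # always close the connection
  DBI::dbReadTable(con, table)
}
```

Either function returns a data frame, which then shows up in the workspace like any other data set.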

 

Data Management

It’s often said that 80% of data analysis time is spent preparing the data. Variables need to be transformed, recoded, or created; missing values need to be handled; datasets need to be stacked or merged, aggregated, transposed, or reshaped (e.g. from wide to long and back). A critically important aspect of data management is the ability to transform many variables at once. For example, social scientists need to recode many survey items, biologists need to take the logarithms of many variables. Doing such tasks one variable at a time is tedious. Some GUIs, such as R Commander and BlueSky Statistics, handle nearly all of these challenges. Others, such as jamovi, offer just a handful of data management functions.

Unfortunately, this is RKWard’s weakest area. Its Data menu offers only four choices:

  • Generate random data
  • Recode categorical data
  • Sort data
  • Subset data

The Recode dialog box works well for a single variable, but it can’t do multiple variables at once.
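Until the dialogs handle several variables at once, the task is short in code. A minimal base R sketch, using a made-up survey data frame, that reverse-scores several items and takes the logarithm of several measures in one step each:

```r
survey <- data.frame(
  q1 = c(1, 5, 3), q2 = c(2, 4, 5), q3 = c(5, 1, 2),
  weight = c(60.2, 75.9, 82.4), height = c(1.62, 1.80, 1.75)
)

# Reverse-score several 1-5 survey items at once (6 - x maps 1<->5, 2<->4)
items <- c("q1", "q2", "q3")
survey[items] <- lapply(survey[items], function(x) 6 - x)

# Take logarithms of several variables, stored under new names
measures <- c("weight", "height")
survey[paste0("log_", measures)] <- lapply(survey[measures], log)

names(survey)
```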

 

Menus & Dialog Boxes

The goal of pointing & clicking your way through an analysis is to save time by recognizing menu settings rather than the more difficult task of recalling programming commands. Some GUIs, such as jamovi, make this easy by sticking to menu standards and using simple dialog boxes; others, such as R Commander, use sequences of dialog boxes and/or non-standard menus that are unique to it and hence require more learning.

Figure 1 shows a typical RKWard session. The main control panel is the big window in the back. It has tabbed windows for each open data set and each set of output. RKWard makes it very easy to right-click on the tabs to detach any tabbed window. That would make it quite easy to compare two data sets side-by-side, or to make full use of multiple displays.

At the top of the control panel you’ll find the usual “File, Edit…Help" menus. Immediately below that is a toolbar containing shortcut icons. These provide a commonly used subset of the main menus and the icons that appear there change slightly depending on what you’re doing. The icons include: Open, Create, Save, Save Script…. The Save icon does drop down a menu, offering to save your workspace (i.e. your dataset) or your script. The “Save Script" icon is handy for saving your most recent changes to the same filename with a single click. The usual CTRL-S shortcut does the same thing.

Running plots or analyses is done in the usual way by making menu selections. Dialog boxes appear and you select variables, then click an arrow icon to move them into the empty role boxes (also shown in Fig 1, right). The shortcut CTRL-click allows you to select a set of variables one at a time, as usual. Shift-click lets you select contiguous sets of variables, but a bug in the development tool RKWard uses (Qt) doesn’t always show you they’re selected until you wave your mouse pointer across the selected variables. Note that you can’t drag and drop the variables into their various roles.

When you’ve made your dialog box choices, you click “Submit" to run the step. If you want to see the R code that each dialog generates, click the “Code Preview" box in the bottom right corner of each dialog, and it will appear in the bottom of the dialog. While the code is displayed, any dialog changes you make will immediately be reflected in the code, which is very helpful when you’re learning to program. The reverse is not true since you cannot make changes to the code there. You would have to copy it and paste it into the program editor.

Most other GUIs maintain their dialog box settings within a work session, so if you wanted to do a variation of the previous step, you would simply choose it from the menus again. If you try that in RKWard, you’ll see your last round of settings have been cleared out. They are saved in the output file though. Each set of results in the output window contains a “Run again" link at the bottom of its section. Clicking that link will restore the dialog, complete with all the settings you used for that section of output.

While most dialog boxes are controlled by selecting variables from data frames, some require other types of data objects. For example, the factor analysis plug-in requires data stored in a correlation matrix. However, the correlation matrix dialog box doesn’t allow for saving the matrix, so it seems that you’d have to know how to do that using R code. Fixing this situation is on the developer’s “to do" list.
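Saving a correlation matrix in code takes one assignment. A minimal sketch with simulated data (the variable names are made up); once created, the object appears in the workspace and can be selected wherever a matrix is expected:

```r
set.seed(1)
mydata <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))

# Store the correlation matrix as its own object
cor_mat <- cor(mydata, use = "pairwise.complete.obs")

class(cor_mat)
round(cor_mat, 2)
```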

When exiting RKWard, it asks if you want to save your workspace and code (if you’ve entered any). It will automatically save your output and the dialog boxes required to make it in the file rk_out.html. In future sessions, this is loaded automatically and maintains its “Run Again" capability. To save that to a different location, you can use “Workspace> Save workspace" or on the lower set of menus, “Save: Save workspace".

The way statistical models are created and used is an area that differs most among R GUIs. The simplest, and least flexible approach, is taken by jamovi which tries to do everything you might need in a single dialog box. R Commander goes the opposite direction, saving models, and then offering users 25 different other menu selections to do things with those models. See their respective reviews on how well they succeed with each approach.

RKWard’s modeling takes the simpler approach, offering most of what people want in their models from a single dialog box. None of its modeling steps save the model object itself.

 

Documentation & Training

The user documentation for RKWard is located on the project’s web site.

YouTube.com also offers over 650 training videos that show how to use RKWard.

 

Help

R GUIs provide simple task-by-task dialog boxes which generate much more complex code. Sometimes that code consists of custom functions that control R’s standard ones. So for a particular task, there is the potential for you to need help at three levels of complexity: the dialog box, the custom functions, and R’s standard functions. Nearly all R GUIs provide help at each level when needed. The notable exception is R Commander, which lacks help on the dialog boxes themselves (see that review for details).

RKWard provides help files at all three levels. Each dialog box has a help button which provides a summary description, how to use the dialog box, all the GUI settings, what related functions are used, any dependencies involved, and an “About" section which provides the function’s version and its authors. Each help page also links to R’s built-in help on any functions used.

When you click on Help in a dialog, the help appears in RKWard’s main window. That comes to the front, which may cover up the dialog itself. That window contains Back and Forward buttons, which you might think would get you back to the dialog box, but they do not. So it’s best to move the dialog to an empty space on your screen to allow you to read the help and see the dialog at the same time.

The Help menu offers a search capability, but it searches only general R functions, not RKWard’s GUI-based capabilities.

 

Graphics

The various GUIs available for R handle graphics in several ways. Some, such as BlueSky and Deducer focus on using the popular ggplot2 package. Others, such as the R Commander, build in support for base graphics, lattice graphics, and use plug-ins for both lattice and ggplot2. Still others, such as jamovi, use their own functions so they can tie them closely with the type of analysis being done.

GUIs also differ quite a lot in how they control the style of the graphs they generate. Ideally, you would set the style and all graphs would follow it. That’s how jamovi works, but then jamovi is limited in the type of graphs that it does. BlueSky uses ggplot2 graphics almost exclusively, and its dialogs offer to apply “themes" from the ggthemes package.

RKWard plots are done by R’s built-in plots except for some specialty plots such as Pareto. None of them are done using lattice or ggplot2. As a result, plots of group comparisons are fairly limited.

While RKWard doesn’t let you set the style of graphics in advance, its use of R’s built-in plot functions guarantees that at least they all share one style.

Plots in RKWard can be a bit confusing at first as its default highlighted variable entry box is used for pre-tabulated data. For GUI users, that’s a pretty odd concept; they almost never have such data. When you enter standard un-tabulated data into that field, a blank plot window results with “Error in -0.01 * height : non-numeric argument to binary operator”. Checking the “Tabulate data before plotting” box gets things working in a more standard GUI way.
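The difference between raw and pre-tabulated data is easy to see in base R, where barplot() expects bar heights (counts) rather than the raw values. The sketch below uses made-up data and writes the plot to a temporary file so it runs the same way in any session; it illustrates the idea and is not the code RKWard generates.

```r
# Raw, un-tabulated data: one value per observation
gender <- c("Female", "Male", "Female", "Female", "Male")

# Tabulating first produces the counts that barplot() needs
counts <- table(gender)
counts

plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
barplot(counts, main = "Gender", ylab = "Count")
invisible(dev.off())
```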

RKWard’s Plot menu offers:

  • Barplot
  • Box Plot
  • Density Plot
  • Dotchart
  • ECDF Plot
  • Generic Plot
  • Histogram
  • Item Response Theory (6 plots)
  • Pareto Chart
  • Piechart
  • Scatterplot
  • Scatterplot Matrix
  • Stem-and-Leaf Plot
  • Stripchart

 

Modeling

The way models are created and managed has a tremendous impact on both the flexibility and complexity of a user interface. The various R GUIs differ quite a lot in this regard. The simplest, and least flexible approach, is taken by jamovi, which tries to do everything you might need in a single dialog box. R Commander goes the opposite direction, saving models, and then offering users 25 different other menu selections to do things with those models. See their respective reviews on how well they succeed with each approach.

RKWard takes the simplest approach to modeling, by creating the model and offering you most of what you might want from that model in a single dialog box. The regression and ANOVA dialog boxes allow you to save models. However, there are no other menu entries devoted to model management. That struck me as a surprising choice for this particular GUI, since in many other ways RKWard goes for the most powerful approach (e.g. code generation, programming support, object viewer, etc.).
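What "doing things with a saved model" means is clearer in code. A minimal sketch, assuming simulated data that borrows the mydata100, pretest and posttest names from the generated code shown later in this review:

```r
set.seed(42)
mydata100 <- data.frame(pretest = rnorm(100, mean = 75, sd = 5))
mydata100$posttest <- mydata100$pretest + rnorm(100, mean = 5, sd = 3)

# Save the model object so it can be reused later
fit <- lm(posttest ~ pretest, data = mydata100)

# Reuse it: coefficients, fit statistics, predictions for new cases
coef(fit)
summary(fit)$r.squared
new_scores <- data.frame(pretest = c(70, 80))
predicted  <- predict(fit, newdata = new_scores)
predicted
```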

 

Analysis Methods

All of the R GUIs offer a decent set of statistical analysis methods. Some also offer machine learning methods. Since this topic is so complex, I’ll simply list the methods RKWard comes with.

The first one on the Analysis menu is “Basic Statistics." Oddly enough, by default it offers none! I thought something had malfunctioned as I’ve never seen a stat package that didn’t offer a standard set of statistics at this stage. Just to test how it handles obvious errors, I included a factor and asked for the mean, sd, etc. This yielded the standard R messages, which beginners would find perplexing. It would have been more helpful to prevent the request by not displaying factors in that dialog box.

It turned out that the cause of my problem was that the Recode procedure had converted a numeric variable to a factor as it recoded it! The help file pointed out that there is a “Data type after recoding" setting that is set to “factor" by default. Here is a list of RKWard’s standard set of analyses:

  • Basic Statistics
  • Descriptive Statistics
  • ANOVA (between, within, mixed)
  • Classical test theory
  • Cluster Analysis
  • Cohen’s Kappa
  • Correlation
  • Crosstabs
  • Factor analysis
  • Item Response Theory
  • Means
  • Moments
  • Multidimensional scaling
  • Multiple choice
  • Outlier tests
  • Power analysis
  • Regression (linear only)
  • Text analysis
  • Time series
  • Variances / scale
  • Wilcoxon tests
  • Input matrix test
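The Recode surprise described before the list is easy to reproduce in code. A minimal sketch with made-up values, showing why the mean of a factor fails and one common way to get the numbers back:

```r
score   <- c(3, 1, 2, 3)
recoded <- factor(score)  # a numeric variable after conversion to a factor

is.numeric(recoded)       # FALSE, so numeric summaries do not apply

# mean() on a factor returns NA and warns that the argument is not numeric
m <- suppressWarnings(mean(recoded))
m

# Convert via character to recover the original values, not the level codes
restored <- as.numeric(as.character(recoded))
mean(restored)
```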

 

Generated R Code

One of the aspects that most differentiates the various GUIs for R is the code they generate. This code can help you learn to program in R. It is also helpful for documenting the analysis steps, and for reproducing and perhaps automating them. But what type of code is best for you? The base R code as provided by R Commander? Tidyverse-style code as provided by BlueSky Statistics? The concise functions that mimic the simplicity of one-step dialogs such as jamovi provides?

The RKWard developers chose to display base R code that will maximize what you will learn about R by not hiding any of the behind-the-scenes complexity involved. For example, to get the mean and standard deviation for two variables, RKWard generates this code:

 

local({
  ## Compute
  vars <- rk.list (mydata100[["pretest"]], mydata100[["posttest"]])
  results <- data.frame ("Variable Name"=I(names (vars)), check.names=FALSE)
  for (i in 1:length (vars)) {
    var <- vars[[i]]
    results[i, "Mean"] <- mean(var,na.rm=TRUE)
    results[i, "sd"] <- sd(var,na.rm=TRUE)
    # robust statistics
  }
  ## Print result
  rk.header ("Univariate statistics", parameters=list("Omit missing values"="yes"))
  rk.results (results)
})

This might seem daunting at first, leaving a beginner to sift through it to find that mean(var,na.rm = TRUE) is the code that calculated the mean. However, someone coming from another programming language will quickly see how to code a “for" loop in R. Keep in mind that people not wanting to learn to code in R will not even see this code unless they ask to.
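For contrast, a more compact way to get the same two statistics in base R is sketched below. It runs outside RKWard because it avoids the rk.* helper functions; the data values are simulated, with only the mydata100, pretest and posttest names taken from the code above.

```r
set.seed(1)
mydata100 <- data.frame(
  pretest  = rnorm(100, mean = 75, sd = 5),
  posttest = rnorm(100, mean = 80, sd = 5)
)
mydata100$pretest[3] <- NA  # a missing value, to show na.rm at work

# Apply the same two functions to each selected variable
stats <- sapply(mydata100[c("pretest", "posttest")], function(v) {
  c(Mean = mean(v, na.rm = TRUE), sd = sd(v, na.rm = TRUE))
})

t(stats)  # one row per variable, as in RKWard's output table
```

The loop in the generated code does the same work one variable at a time, which is longer but makes each step visible.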

 

Support for Programmers

While I’m focusing on interfaces that include menus and dialog boxes for non-programmers, people wishing to blend that style of work with programming should know that some GUIs, such as jamovi, offer little to no support for programmers; their simplicity forbids it. Others, such as BlueSky Statistics, offer modest support.

RKWard offers a powerful and comprehensive Integrated Development Environment (IDE) which lets programmers write and debug their code. The output from running code appears in the console window, while the output created by dialogs appears in the Output tab. The IDE features of RKWard are very similar to those of the popular RStudio IDE.

Its code editor, Kate, has its own open source project and it is jam-packed with advanced features: https://kate-editor.org/about-kate/ . It supports syntax highlighting, provides hints on function arguments, offers to complete object names, and more.

That’s great for people wanting to execute code, but for point-and-click users, it means that there’s a lot of added complexity. Point-and-clickers can ignore most of the features on RKWard’s menus: Edit, View, Workspace, and Run.

 

Reproducibility & Sharing

One of the biggest challenges that GUI users face is being able to reproduce what they did. Reproducibility is useful for re-running everything on the same dataset if you find a data entry error. It’s also useful for applying your work to new datasets so long as they use the same variable names (or the software can handle changes). Some scientific journals ask researchers to submit their files (usually code and data) along with their written report so that others can check their work.

As important a topic as it is, reproducibility is a problem for GUI users, a problem that has only recently been solved by some software developers. Most GUIs (e.g. R Commander, BlueSky) save only code, and it’s not code the GUI users wrote, so they also can’t read it! Others such as jamovi and the newest version of SPSS save the dialog box entries and allow GUI users to have reproducibility in the form they prefer.

RKWard does save dialog box settings for you to reuse. If you execute a plot or an analysis using a dialog, then decide to do a variation on that step, choosing the dialog box again will show you one devoid of your previous choices. You might think that it has “forgotten" them. However, in the output window, each step ends with the link “Run again". Clicking that link will make the dialog reappear, complete with your last settings filled in.

If you do “run again" the new output will appear immediately below the existing version. That’s the most convenient approach, as you’re likely to want them close for comparison purposes, but it would be nice to have the option to have it appear at the bottom of all output as the new SPSS offers.

If you wish to share your work with colleagues you would have two choices. If they’re GUI users, you would send them your data and your RKWard output/code file. They would install RKWard, open the RKWard file, change the pointer to the new data set location, and begin work. However, output files do not contain any graphs. A new output system is in development which should make work easier to share.

If your colleagues are R programmers, you could send them your data and send them your R code. They would need to install RKWard to execute the code.

 

Output & Report Writing

Ideally, output should be clearly labeled, well organized, and of publication quality. It might also delve into the realm of word processing through Sweave/knitr and R Markdown documents. At the moment, none of the GUIs covered in this series of reviews meets all of these requirements. See the separate reviews to see how each of the other packages is doing on this topic.

The labelling of RKWard’s output is done via default titles which reflect each step well, but which cannot be changed in the dialog boxes. So if you try five variations of a regression model, you’ll just see five sets of output labeled “Linear Regression."

The organization of the output is in time-order only, and you cannot delete any of the steps you take. This often results in an output file filled with unneeded results. The upper right side of the output window has a “Show TOC” link which displays a Table of Contents. Each entry is a link that jumps you directly to that part of the output, which is very convenient. Tables of contents are a common place for GUIs to let you re-order, rename, or delete bits of output, but none of that is possible here.

RKWard’s output quality from all GUI dialogs is very high, with nice fonts, and true rich text tables. That means you can paste them into any word processor and reformat them quickly. That really helps speed your work as R output defaults to mono-spaced fonts that require additional steps to get into publication form (e.g. using functions from packages such as xtable or texreg). Since the graphs for a given output file are stored in the “.rkward" folder, it would be

RKWard doesn’t offer support for Sweave/knitr or R Markdown documents. That is one of the very few things that RStudio offers that RKWard lacks. However, work has already begun on adding this capability.

 

Group-By Analyses

Repeating an analysis on different groups of observations is a core task in data science. Software needs to provide the ability to select one group to analyze, then another group to compare it to. All GUIs, including RKWard, perform that task. It also needs the ability to automate such selections so that you might generate dozens of analyses, one group at a time. While this has been available in commercial GUIs for decades, only one R GUI, BlueSky, includes that feature.
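In code, automating a group-by analysis usually means splitting the data and applying the same step to each piece. A minimal base R sketch with simulated data and made-up variable names:

```r
set.seed(7)
mydata <- data.frame(
  gender  = rep(c("Female", "Male"), each = 20),
  pretest = rnorm(40, mean = 75, sd = 5)
)
mydata$posttest <- mydata$pretest + rnorm(40, mean = 5, sd = 3)

# One data frame per group, then the same regression for each
groups <- split(mydata, mydata$gender)
fits   <- lapply(groups, function(d) lm(posttest ~ pretest, data = d))

# One R-squared per group
sapply(fits, function(f) summary(f)$r.squared)
```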

 

Output Management

Output management deals with the software’s ability to create new data sets from the output of an analysis. That output can then be used as input for further analysis. Such data can be observation-level, such as predicted values for each observation or case. When group-by analyses are run, the output can also be model-level, such as one R-squared value for each group’s model; or parameter-level, such as the p-value for each regression parameter for each group’s model. (Saving and using models themselves is covered under “Modeling" above.)

For example, in our organization, we have 250 departments and want to see if any of them have a gender bias on salary. We write all 250 regression models to a data set and then search to find those whose gender parameter is significant (hoping to find none, of course).

RKWard’s modeling dialogs have the ability to save observation-level information such as predicted values and residuals. Since RKWard has such a nice spreadsheet data editor, you might expect your new variables to be saved to your original data set where you could view them. However, by default they are instead saved as individual vectors.
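If you prefer the predicted values and residuals to sit next to the original variables, a few lines of code will do it. A minimal sketch with simulated data; using na.exclude is one way to keep the new columns aligned with the rows when some values are missing:

```r
set.seed(3)
mydata <- data.frame(pretest = rnorm(30, mean = 75, sd = 5))
mydata$posttest <- mydata$pretest + rnorm(30, mean = 5, sd = 3)
mydata$posttest[4] <- NA  # one missing outcome

# na.exclude pads fitted values and residuals with NA for dropped rows
fit <- lm(posttest ~ pretest, data = mydata, na.action = na.exclude)

# Store the results as new columns of the same data frame
mydata$predicted <- as.numeric(fitted(fit))
mydata$residual  <- as.numeric(residuals(fit))

head(mydata)
```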

Since RKWard lacks the ability to perform group-by processing, it has no ability to save model-level information, nor can it save parameter-level results for further analysis.
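For readers willing to write code, the departmental salary example above can be sketched as follows. The data are simulated with no built-in bias, only five departments are used to keep the output short, and all names are made up for illustration.

```r
set.seed(11)
n_dept <- 5
salaries <- data.frame(
  dept   = rep(paste0("D", 1:n_dept), each = 30),
  gender = rep(c("Female", "Male"), times = n_dept * 15)
)
salaries$salary <- 50000 + rnorm(nrow(salaries), sd = 4000)

# Fit one model per department and keep model- and parameter-level results
by_dept <- lapply(split(salaries, salaries$dept), function(d) {
  fit   <- lm(salary ~ gender, data = d)
  coefs <- summary(fit)$coefficients
  data.frame(
    dept      = d$dept[1],
    r_squared = summary(fit)$r.squared,          # model-level
    gender_p  = coefs["genderMale", "Pr(>|t|)"]  # parameter-level
  )
})
results <- do.call(rbind, by_dept)

# Departments whose gender parameter is significant at the 5% level
flagged <- results[results$gender_p < 0.05, ]
results
```

The results data frame is an ordinary data set, so it can be sorted, filtered, or analyzed further like any other.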

 

Developer Issues

The RKWard development team has created a set of tools to help R package developers to convert their work into RKWard plugins. The process consists of creating dialog boxes and determining their place on the menu structure, and adding formatting to output so that true tables appear along with quality fonts. Details are provided here: https://rkward.kde.org/Developer_Information .

 

Conclusion

RKWard is a powerful front-end to the R language, one that provides easy point-and-click control for GUI users. People interested in learning the base R language (as opposed to the tidyverse-style commands) will learn much from the R code that RKWard writes. It is extremely clear R code, hiding very little within custom functions (much less than other user interfaces).

For R power users, RKWard offers a complete integrated development environment that is the rough equivalent to RStudio or Eclipse StatET. The only major thing lacking there is support for Sweave or R Markdown/knitr, and the latter is in development.

The only other R GUI that attempts to provide tools for both beginners and advanced users is BlueSky Statistics, which has stronger support for GUI-style data management but offers less for R coders.

If you’re looking to expand your R user interface horizons, take RKWard out for a test drive and see how you like it!

 

Acknowledgments

Thanks to Thomas Friedrichsmeier, Meik Michalke, and the RKWard team for creating RKWard and giving it away for us all to use. A special thanks to Thomas Friedrichsmeier for his many suggestions that improved this article. Also thanks to Rachel Ladd for her editorial suggestions.