Graphical User Interfaces | r4stats.com

A Comparative Review of the R Commander GUI for R

Introduction

The R Commander is a free and open source user interface for the R software, one that focuses on helping users learn R commands by point-and-clicking their way through analyses. The R Commander is available on Windows, Mac, and Linux; there is no server version.

This is one of a series of reviews which aim to help non-programmers choose the user interface for R which is best for them. Each review also includes a cursory description of the programming support that each interface offers.

Figure 1. The main R Commander window is in the upper left. A typical dialog box is in the front, and the graph it created is on the right. The data editor is on the upper right.

Terminology

There are various definitions of user interface types, so here’s how I’ll be using these terms:

GUI = Graphical User Interface specifically using menus and dialog boxes to avoid having to type programming code. I do not include any assistance for programming in this definition. So GUI users are people who prefer using a GUI to perform their analyses. They don’t have the time or inclination to become good programmers.

IDE = Integrated Development Environment which helps programmers write code. I do not include point-and-click style menus and dialog boxes when using this term. IDE users are people who prefer to write R code to perform their analyses.

Installation

The various user interfaces available for R differ quite a lot in how they’re installed. Some, such as jamovi, BlueSky, or RKWard, install in a single step. Others, such as Deducer, install in multiple steps. Advanced computer users often don’t appreciate how lost beginners can become while attempting even a single-step installation. The HelpDesks at most universities are flooded with such calls at the beginning of each semester!

As described on the R Commander main web site, the installation basics are as follows:

1. Download R from CRAN and install it in the manner appropriate to your operating system. If you have an old version of R (that is, older than the current version), then it’s generally a good idea to install the current version of R before installing the Rcmdr package. On Windows, opt for a customized startup and select the single-document interface (“SDI,” see the Windows notes below for details).
2. On Mac OS X only, download and install XQuartz, and reboot your computer (see the Mac notes below for greater detail).
3. Start R, and at the > command prompt, type the command install.packages("Rcmdr").
4. Once it is installed, to load the Rcmdr package, just enter the command library("Rcmdr").
5. Optionally install Pandoc and LaTeX to get publication-quality output (via “Tools> Install auxiliary software”).
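The R-side steps above can be sketched as a short script. This is a minimal sketch rather than the official installer: it only reports the running R version and whether Rcmdr is already in the library, and it installs or loads the package only in an interactive session.

```r
# Report the running R version so it can be compared with the current release on CRAN.
r_version <- getRversion()
cat("Running R version:", as.character(r_version), "\n")

# Check whether the Rcmdr package is already in the library, without loading it.
rcmdr_installed <- nzchar(system.file(package = "Rcmdr"))
cat("Rcmdr installed:", rcmdr_installed, "\n")

# Install (if needed) and load the package only when a person is at the console.
if (interactive()) {
  if (!rcmdr_installed) install.packages("Rcmdr")
  library("Rcmdr")
}
```

Note that the commands use plain straight quotes; curly quotes pasted from a web page will cause a syntax error in R.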

Complete installation notes are here. They’re worth reading as they go on to point out several things that can go wrong. These include having an incompatible version of R (i.e. you skipped step 1), and R packages which fail to install.

While these multiple steps are more challenging than single-step installations, they are in line with the developer’s goal of helping people learn to program in R. That audience would have to learn to install R and R packages, then load packages anyway.

Plug-ins

When choosing a GUI, one of the most fundamental questions is: what can it do for you? What the initial software installation of each GUI gets you is covered in the Graphics, Analysis, and Modeling sections of this series of articles. Regardless of what comes built-in, it’s good to know how active the development community is. They contribute “plug-ins” which add new menus and dialog boxes to the GUI. This level of activity ranges from very low (RKWard, BlueSky, Deducer) through moderate (jamovi) to very active (R Commander).

R Commander’s development community is by far the most active, with 45 add-ons available. The add-ons are stored on CRAN and installed like any other software. You install them using the install.packages function, then choose “Tools> Load Rcmdr plug-ins…”, select the plug-in, and click OK. R Commander will then tell you that you need to restart R Commander to have it appear on the menus. The R Commander stores your list of plug-ins in your .Rprofile and will edit it for you. That’s important as editing it is a non-trivial task (see Installation, step 6).

You can find a comprehensive list of plug-ins here: https://cran.r-project.org/web/packages/index.html.
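Because the plug-ins live on CRAN alongside every other package, one way to pick them out of the full package list is by name. The sketch below assumes the plug-ins follow the "RcmdrPlugin." naming prefix; the helper function and the example names are for illustration only.

```r
# Hypothetical helper: keep only package names that start with "RcmdrPlugin.".
find_rcmdr_plugins <- function(package_names) {
  sort(grep("^RcmdrPlugin\\.", package_names, value = TRUE))
}

# A small example list of package names, used only for illustration.
example_names <- c("Rcmdr", "RcmdrPlugin.KMggplot2", "ggplot2", "RcmdrPlugin.EZR")
print(find_rcmdr_plugins(example_names))

# In an interactive session, the same filter can be applied to the live CRAN index.
if (interactive()) {
  cran_names <- rownames(available.packages())
  print(find_rcmdr_plugins(cran_names))
}
```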

Startup

Some user interfaces for R, such as jamovi and BlueSky Statistics, start by double-clicking on a single icon, which is great for people who prefer to not write code. Others, such as Deducer, have you start R, then load a package from your library, then call a function. That’s better for people looking to learn R, as those are among the first tasks they’ll have to learn anyway.

You start R Commander by first starting the RGUI program that comes with the main R package. You can also start it from any other program that offers an R console. Once R is started, you start the R Commander by loading it from your library by typing this command and pressing the Enter key: library("Rcmdr"). The main control screen will then appear (Figure 1, upper left) along with a graphics screen (Figure 1, right).

If you want to have the R Commander start automatically each time you start R, it’s possible to do so, but it is not a task for beginners. The R Commander creates an “.Rprofile” in your main directory and it includes instructions about how to “uncomment” a few lines by removing the leading “#” characters. However, files whose names consist only of an “.extension” are hidden by your file system, and if edited using the Notepad application, cannot be saved with that non-standard type of name. You can save it to a regular name like “Rprofile.txt”. You then must use operating system commands to rename it, as the Windows file manager also won’t let you choose a file name that is only an extension.
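One alternative to operating system commands is to do the rename from within R itself. The sketch below works in a temporary folder so that nothing in your real home directory is touched. The start-up lines written to the file are an assumption about what such a file might contain, not a copy of what the R Commander generates.

```r
# Work in a temporary folder so the real home directory is left alone.
demo_dir <- tempfile("rprofile_demo_")
dir.create(demo_dir)

# A file saved from a text editor under a regular name, as described above.
regular_name <- file.path(demo_dir, "Rprofile.txt")
writeLines(c(
  "# Assumed start-up lines; the ones the R Commander writes may differ.",
  "# local({",
  "#   old <- getOption('defaultPackages')",
  "#   options(defaultPackages = c(old, 'Rcmdr'))",
  "# })"
), regular_name)

# Rename it to the extension-only name that R looks for at start-up.
hidden_name <- file.path(demo_dir, ".Rprofile")
renamed <- file.rename(regular_name, hidden_name)

# Hidden files are listed only when all.files = TRUE.
print(list.files(demo_dir, all.files = TRUE, no.. = TRUE))
```

For real use, the folder would be your home directory (for example, the result of path.expand("~")) rather than the temporary folder used here.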

Data Editor

A data editor is a fundamental feature in data analysis software. It puts you in touch with your data and lets you get a feel for it, if only in a rough way. A data editor is such a simple concept that you might think there would be hardly any differences in how they work in different GUIs. While there are technical differences (single-click sorting, icons that show excluded observations, etc.), to a beginner what matters the most are the differences in simplicity. Some GUIs, including jamovi and BlueSky, let you create only what R calls a data frame. They use more common terminology and call it a data set: you create one, you save one, later you open one, then you use one. Others, such as RKWard, trade this simplicity for the full R language perspective: a data set is stored in a workspace. So the process goes: you create a data set, you save a workspace, you open a workspace, and choose a data set from within it.

You start the R Commander’s Data Editor by choosing “Data> New data set…” You can enter data immediately, though at first the variables are named simply V1, V2… and the rows are named 1, 2, 3…. You can click on the names to change them (see Figure 2). Clicking on the “Add row” or “Add column” button does just that, though the Enter key is a quicker way to get a new row. You can enter simple numeric data or character data; no scientific notation, no dates. Character data is converted to a factor, but there is no way to enter the underlying values such as 1, 2 and have the editor display Male, Female, for example. That slows down data entry.
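In R code, by contrast, quick numeric entry with labeled display is straightforward. The following is a minimal sketch using the Male/Female example from the text; the variable names are made up for illustration.

```r
# Numeric codes as they might be typed during data entry.
sex_code <- c(1, 2, 2, 1)

# Attach labels to the codes: 1 is shown as Male, 2 as Female.
sex <- factor(sex_code, levels = c(1, 2), labels = c("Male", "Female"))

print(sex)               # displays the labels
print(as.integer(sex))   # the integer codes stored underneath
```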

There is no way to enter or change any metadata other than variable and row names.

Saving the data provides a lesson on R data structures. Since you started the process by creating a new “data set”, you might start looking on the menus for where to save such a thing. Instead, you have to know that in R, data sets reside in something called a “workspace”. So “Data> New data set…” is balanced by “File> Save R workspace”. It would be nice if there were some instruction explaining this situation.
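The relationship between data sets and workspace files can be seen in a few lines of R. This is a minimal sketch with made-up object names: two objects are saved to one workspace file, removed, and then restored by name.

```r
# A data set plus a second, unrelated object (both hypothetical).
survey <- data.frame(id = 1:3, score = c(4, 5, 3))
notes <- "collected in spring"

# Save both objects into a single workspace file.
workspace_file <- tempfile(fileext = ".RData")
save(survey, notes, file = workspace_file)

# Remove them, then restore them; load() returns the names it restored.
rm(survey, notes)
restored <- load(workspace_file)
print(restored)
print(survey)
```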

Figure 2. The R Commander’s data editor.

Data Import

The R Commander can import the file formats: CSV, TXT, Excel, Minitab, SPSS, SAS, and Stata. It can even import data directly from a URL, which is a rare feature for a GUI. These are all located under “Data> Import Data”. A particularly handy feature is the ability to explore and load data sets that are included with installed packages. That’s done via “Data> Data in packages…”.

To get data from SQL database formats, you’ll have to use R code.
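As a rough idea of what that R code might look like, here is a minimal sketch that assumes the DBI and RSQLite packages are installed (neither is mentioned in the review) and uses an in-memory SQLite database with a made-up table. If the packages are missing, the sketch simply skips the database step.

```r
imported <- NULL

# Run the database steps only if both assumed packages are available.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # Connect to a temporary in-memory database and add a small example table.
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  DBI::dbWriteTable(con, "patients",
                    data.frame(id = 1:3, age = c(34, 51, 29)))

  # Pull the rows of interest into an ordinary data frame.
  imported <- DBI::dbGetQuery(con, "SELECT id, age FROM patients WHERE age > 30")
  DBI::dbDisconnect(con)
}

print(imported)
```

Once the data frame exists in the workspace, it can be chosen as the active data set and analyzed through the menus.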

Data Management

It’s often said that 80% of data analysis time is spent preparing the data. Variables need to be transformed, recoded, or created; missing values need to be handled; datasets need to be stacked or merged, aggregated, transposed, or reshaped (e.g. from wide to long and back). A critically important aspect of data management is the ability to transform many variables at once. For example, social scientists need to recode many survey items, and biologists need to take the logarithms of many variables. Doing such tasks one variable at a time is tedious. Some GUIs, such as BlueSky, handle nearly all of these challenges. Others, such as RKWard, offer just a handful of data management functions.

The R Commander is able to recode many variables at once, adding an optional prefix such as “recoded_” to the name of each variable that you choose to recode. It can also standardize many variables at once, but can only over-write the original values. Make a copy of your data set before doing that! Unfortunately, when it comes to other popular transformations such as the logarithm, you have to apply them one variable at a time.
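For comparison, taking the logarithm of many variables at once takes only a line or two of R code. This is a minimal sketch with made-up data; the "log10_" prefix is a hypothetical naming choice, and the original variables are left unchanged.

```r
measurements <- data.frame(
  site   = c("A", "B", "C"),
  mass   = c(1, 10, 100),
  length = c(2, 20, 200)
)

# The variables to transform, and the new names to store the results under.
to_log <- c("mass", "length")
new_names <- paste0("log10_", to_log)

# Apply log10 to every chosen column and add the results as new columns.
measurements[new_names] <- lapply(measurements[to_log], log10)
print(measurements)
```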

For reshaping data sets, the R Commander can stack one set of variables into a single variable and create a factor to classify those values, but it can’t take along other relevant variables, nor can it do the reverse of this process by going from “long” to “wide” data structures.
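Both directions of reshaping, carrying other variables along, are possible in R code. The sketch below uses base R’s reshape function on a small made-up data set; it is one of several ways to do this and is not what the R Commander runs behind its menus.

```r
# Wide data: one row per person, one column per measurement occasion.
wide <- data.frame(
  id      = 1:3,
  sex     = c("F", "M", "F"),
  score_1 = c(10, 20, 30),
  score_2 = c(11, 21, 31)
)

# Wide to long: stack the score columns, keeping id and sex with each value.
long <- reshape(wide, direction = "long",
                varying = c("score_1", "score_2"), v.names = "score",
                timevar = "time", times = 1:2, idvar = "id")
print(long)

# Long back to wide: one score column per value of time.
wide_again <- reshape(long, direction = "wide",
                      idvar = "id", timevar = "time",
                      v.names = "score", sep = "_")
print(wide_again)
```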

Overall, the R Commander offers a very useful set of data management tools:

For managing the active data set as a whole:

View data
Select active data set
Refresh active data set
Help on active data set
Variables in active data set
Set case names
Subset active data set
Sort active data set
Aggregate variables in the active data set
Remove row(s) from active data set
Stack variables in active data set (half of reshaping discussed above)
Remove cases with missing data
Save active data set
Export active data set

For managing variables in the active data set:

Recode variables (able to do many variables)
Compute new variables (can create only one new variable at a time)
Add observation numbers to data set
Standardize variables (able to do many variables at once)
Convert numeric variables to factors
Bin numeric variable
Reorder factor levels
Drop unused factor levels
Define contrasts for a factor
Rename variables
Delete variables from data set
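To give a feel for the kind of R code that sits behind a few of these menu items, here is a minimal sketch using base R functions on made-up data. The code that the R Commander itself generates may look different.

```r
patients <- data.frame(
  clinic = c("North", "South", "North", "South", "North"),
  age    = c(34, 51, NA, 45, 29),
  visits = c(2, 5, 1, 3, 4)
)

# Remove cases with missing data.
complete <- na.omit(patients)

# Subset the data set: keep people aged 40 or older.
older <- subset(complete, age >= 40)

# Sort the data set by age.
sorted <- complete[order(complete$age), ]

# Aggregate: mean number of visits within each clinic.
by_clinic <- aggregate(visits ~ clinic, data = complete, FUN = mean)
print(by_clinic)
```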

Menus & Dialog Boxes

The goal of pointing & clicking your way through an analysis is to save time by recognizing menu settings rather than spend it on the memorization and practice required by programming. Some GUIs, such as jamovi, make this easy by sticking to menu standards and using simpler dialog boxes; others, such as RKWard, use non-standard menus of their own and hence require more learning.

Figure 1 shows a common screen layout. The main R Commander window is in the upper left. A typical dialog box is in the front, and the graph it created is on the right. The data editor is on the upper right.

The R Commander’s menu structure contains some unique choices. No operations on data files are located on the usual “File” menu. For example, existing data sets or files are not opened using the usual “File> Open…”, but instead using the “Data> Load data set…” menu. Also, everything on the Models menu applies not to data but to models that you’ve already created from data. The other menus follow Windows standards. When switching between software packages, I found myself usually looking for data under the File menu. The rationale behind the R Commander’s approach is that the R function that opens files is named “load”. So this structure will help people learn more about R code (whether they’re headed that way or not!).

The dialog boxes have their own style too, but one that is easy to learn. Rather than have empty role boxes that you drag or click variables into, the role boxes contain the full list of relevant variables, and you click on one (or more) to select them (see “X variable (pick one)” in Fig. 1). In the cases where there is an empty role box, you double-click a variable name to move it from a list to the box. The R Commander does a nice job of helping you avoid asking for absurdities, such as the mean of a factor, by not displaying them in certain dialog boxes.

The two objects you might be working on are shown on the toolbar right below the main menus. Clicking the “Data set:” tool will allow you to choose which data set most of the dialog boxes will refer to by default. That’s filled in automatically when you load, enter, or import a data set. Similarly, clicking the “Model:” tool will let you select a model which most of the choices on the Models menu will relate to. It too is filled in automatically each time you create a new model. See more on this in the Modeling section below.


A Comparative Review of the Rattle GUI for R

Introduction

Rattle is a popular free and open source Graphical User Interface (GUI) for the R software, one that focuses on beginners looking to point-and-click their way through data mining tasks. Such tasks are also referred to as machine learning or predictive analytics. Rattle’s name is an acronym for “R Analytical Tool To Learn Easily.” Rattle is available on Windows, Mac, and Linux systems.

This post is one of a series of reviews which aim to help non-programmers choose the GUI that is best for them. Additionally, these reviews include a cursory description of the programming support that each GUI offers.

Figure 1. The Rattle interface with the “Data” tab chosen, showing which file I’m reading and the roles the variables will play in analyses. The role assigned to each variable is critically important. Note the all-important “Execute” button in the upper left of the screen. Nothing happens until it’s clicked.

Terminology

There are various definitions of user interface types, so here’s how I’ll be using these terms:

GUI = Graphical User Interface using menus and dialog boxes to avoid having to type programming code. I do not include any assistance for programming in this definition. So, GUI users are people who prefer using a GUI to perform their analyses. They don’t have the time or inclination to become good programmers.

IDE = Integrated Development Environment which helps programmers write code. I do not include point-and-click style menus and dialog boxes when using this term. IDE users are people who prefer to write R code to perform their analyses.

Installation

The various user interfaces available for R differ quite a lot in how they’re installed. Some, such as jamovi or RKWard, install in a single step. Others install in multiple steps, such as R Commander (two steps) and Deducer (up to seven steps). Advanced computer users often don’t appreciate how lost beginners can become while attempting even a simple installation. The Help Desks at most universities are flooded with such calls at the beginning of each semester!

The steps to install Rattle are:

1. Install R.
2. In R, install the toolkit that Rattle is written in by executing the command: install.packages("RGtk2")
3. Also in R, install Rattle itself by executing the command: install.packages("rattle", dependencies=TRUE)
The very latest development version is available here.
Note that while Rattle’s name is capitalized, the name of the rattle package is spelled in all lower-case letters!
4. If you wish to take advantage of interactive visualization (highly recommended), then install the GGobi software from: http://www.ggobi.org/downloads/.

Plug-in Modules

When choosing a GUI, one of the most fundamental questions is: what can it do for you? What the initial software installation of each GUI gets you is covered in the Graphics, Analysis, and Modeling sections of this series of articles. Regardless of what comes built-in, it’s good to know how active the development community is. They contribute “plug-ins” which add new menus and dialog boxes to the GUI. This level of activity ranges from very low (RKWard, Deducer) through moderate (jamovi) to very active (R Commander).

Rattle’s complete capability was designed and programmed by Graham Williams of Togaware. As a result, it doesn’t have plug-ins, but it does include a comprehensive set of data mining tools.

Startup

Some user interfaces for R, such as jamovi, start by double-clicking on a single icon, which is great for people who prefer to not write code. Others, such as R Commander and JGR, have you start R, then load a package from your library, and call a function. That’s better for people looking to learn R, as those are among the first tasks they’ll have to learn anyway.

Rattle is run as a part of R itself, so the steps to start it begin with starting R:

1. Start R.
2. Load Rattle from your library by executing the command: library("rattle")
3. Start Rattle by executing the command: rattle()

Data Editor

A data editor is a fundamental feature in data analysis software. It puts you in touch with your data and lets you get a feel for it, if only in a rough way. A data editor is such a simple concept that you might think there would be hardly any differences in how they work in different GUIs. While there are technical differences, to a beginner what matters the most are the differences in simplicity. Some GUIs, including jamovi, let you create only what R calls a data frame. They use more common terminology and call it a data set: you create one, you save one, later you open one, then you use one. Others, such as RKWard, trade this simplicity for the full R language perspective: a data set is stored in a workspace. So the process goes: you create a data set, you save a workspace, you open a workspace, and choose a data set from within it.

Rattle’s data editor is unique for a GUI in that it does not offer a way to create a data set. It lets you edit any data set you open using R’s built-in edit function, but that function offers very few features. Clicking on a variable name will cause a dialog to open, offering to change the variable’s name or type as numeric or character (see Figure 2). Rattle automatically converts variables that have fewer than 10 values into “categorical” ones. R would call these factors. You can always recode variables from numeric to categorical (or vice versa) in the “Transform” tab (see Data Management section).
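The fewer-than-10-values rule can be illustrated in a few lines of R. This is only a sketch of the idea with a hypothetical helper function and made-up data; it is not Rattle’s actual code.

```r
# Hypothetical helper: turn a variable into a factor when it has few distinct values.
# The threshold of 10 comes from the description above.
as_categorical_if_few_values <- function(x, threshold = 10) {
  if (length(unique(x)) < threshold) factor(x) else x
}

weather <- data.frame(
  cloud    = c(1, 3, 8, 8, 1, 3, 5, 5, 7, 7, 1, 8),
  rainfall = c(0.0, 1.2, 3.4, 0.6, 12.8, 7.1, 0.2, 2.2, 5.5, 9.9, 4.1, 6.3)
)

# Apply the rule to every column while keeping the data frame structure.
weather[] <- lapply(weather, as_categorical_if_few_values)
print(sapply(weather, class))
```

Here cloud has only five distinct values and becomes a factor, while rainfall has twelve and stays numeric.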

Figure 2. Rattle uses R’s built-in edit function as its data editor. Here I clicked on the name of the variable “Rainfall” to show how you might rename it or change its data type.

Data Import

Since R GUIs are using R to do the work behind the scenes, they often include the ability to read a wide range of files, including SAS, SPSS, and Stata. Some, like BlueSky Statistics, also include the ability to read directly from SQL databases. Of course you can always use R code to import data from any source and then continue to analyze it using any GUI, but the point of GUIs is to avoid programming.

Rattle skips many common statistical data formats, but it includes a couple of exclusive ones, such as the Attribute-Relation File Format used by other data mining tools. It also includes “corpus”, which reads in text documents and then performs the popular tf-idf calculation to prepare them for analysis using the other numerically-based analysis methods.
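To show what a tf-idf calculation does, here is a minimal base R sketch on three tiny made-up documents. It uses one common variant: term frequency as a share of the document’s length, multiplied by the log of (number of documents / number of documents containing the term). Whether Rattle uses exactly this weighting is not something this review states.

```r
# Three tiny made-up documents.
docs <- c(d1 = "rain rain sun", d2 = "sun wind", d3 = "rain wind wind")

# Split each document into words and build the vocabulary.
tokens <- strsplit(docs, " ", fixed = TRUE)
vocab <- sort(unique(unlist(tokens)))

# Count how often each vocabulary term occurs in each document (documents in rows).
counts <- t(vapply(tokens,
                   function(words) as.numeric(table(factor(words, levels = vocab))),
                   numeric(length(vocab))))
colnames(counts) <- vocab

# Term frequency: each count divided by the length of its document.
tf <- counts / rowSums(counts)

# Inverse document frequency: log of (documents / documents containing the term).
idf <- log(nrow(counts) / colSums(counts > 0))

# tf-idf: weight each term-frequency column by that term's idf.
tfidf <- sweep(tf, 2, idf, `*`)
print(round(tfidf, 3))
```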

On its “Data” tab, Rattle offers several formats:

File: CSV
File: TXT
File: Excel
Attribute-Relation File Format (ARFF)
Open Database Connectivity (ODBC)
R Dataset
RData File
Library
Corpus (for text analysis)
Script

Data Management

It’s often said that 80% of data analysis time is spent preparing the data. Variables need to be transformed, recoded, or created; strings and dates need to be manipulated; missing values need to be handled; datasets need to be stacked or merged, aggregated, transposed, or reshaped (e.g. from wide to long and back). A critically important aspect of data management is the ability to transform many variables at once. For example, social scientists need to recode many survey items, and biologists need to take the logarithms of many variables. Doing these types of tasks one variable at a time can be tedious. Some GUIs, such as jamovi and RKWard, handle only a few of these functions. Others, such as BlueSky Statistics or the R Commander, can handle all, or nearly all, of these tasks.

Rattle provides minimal data management tools. Its designer chose to focus on reading a single data set, and making transformations that are common in data mining projects quick and easy. More complex data management tasks are left to other tools such as SQL in a database before the data set is read in, or using R programming.

Rattle’s “Transform” tab cycles through various data management “types.” The way it works is unique. As you can see in Figure 3, I have selected the Transform tab by clicking on it. I then held the CTRL key down to select several variables, which are highlighted in blue. If the variables had been next to one another, I could have clicked on the first one, then shift-clicked on the last to select them all. Next, I chose my transformation by selecting “Recode” and then “Recenter.” Finally, I clicked the “Execute” button (or F2) to complete the process by adding three new recoded variables to the data set. Original variables are never changed, and you never have the ability to choose the name of the new variable(s). A prefix is added to the variable name(s) automatically to speed the process. In this case, my Rainfall variable was transformed into “RRC_Rainfall”. The RRC prefix stands for “Recoded, Re-Centered.”

Whenever a variable is transformed, its status in the “Data” tab switches from “Input” to “Ignore”, while the transformed version of the variable enters the data set with an “Input” role.

Figure 3. Rattle’s “Transform” tab with three variables selected. The “Recode” sub-tab is also selected and the “Recenter” transformation is chosen. When the “Execute” button is clicked, the newly transformed variables will be appended to the data set with a prefix indicating the type of transformation performed.

As easy as some transformations are, other transformations are impossible. For example, if you had a formula to calculate recommended daily allowances of vitamins, there’s no way to apply it. Conditional transformations, those which have different formulas for different subsets of the observations (e.g. daily allowances of vitamins calculated differently for men and women), are also not possible. Here are the available transformations:
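Transformations like these are left to R programming, where a conditional formula is a one-liner. The following is a minimal sketch on made-up data; the multipliers are hypothetical numbers chosen only to illustrate the technique and are not dietary guidance.

```r
people <- data.frame(
  sex       = c("Male", "Female", "Female", "Male"),
  weight_kg = c(80, 60, 70, 90)
)

# Hypothetical formulas: a different multiplier for men and for women.
people$allowance <- ifelse(people$sex == "Male",
                           people$weight_kg * 1.0,
                           people$weight_kg * 0.8)
print(people)
```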

Transform> Rescale> Normalize

Recenter (Z-score)
Scale 0 to 1
(Var – Median)/Mean Absolute Deviation (MAD)
Natural Log
Log 10
Matrix (divide all by a constant)
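Two of these rescaling options can be written out in base R as follows. This is a minimal sketch on made-up data, not Rattle’s own code. The RRC_ prefix follows the example in the text, while the scaled01_ name is a hypothetical choice. As in Rattle, the original variable is left unchanged.

```r
weather <- data.frame(Rainfall = c(0, 2, 4, 6, 8))

# Recenter (Z-score): subtract the mean, then divide by the standard deviation.
weather$RRC_Rainfall <- as.numeric(scale(weather$Rainfall))

# Scale 0 to 1: subtract the minimum, then divide by the range.
limits <- range(weather$Rainfall)
weather$scaled01_Rainfall <- (weather$Rainfall - limits[1]) / (limits[2] - limits[1])

print(weather)
```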

Transform> Impute

Replace missing with zeros (e.g. requesting nothing gets you nothing)
Mean
Median
Mode
Constant
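The idea behind these imputation choices is simple enough to sketch in base R. The helper function and data below are made up for illustration and are not Rattle’s own code.

```r
humidity <- c(50, NA, 60, 70, NA, 100)

# Hypothetical helper: replace every missing value with the supplied value.
impute <- function(x, value) {
  x[is.na(x)] <- value
  x
}

with_zero   <- impute(humidity, 0)
with_mean   <- impute(humidity, mean(humidity, na.rm = TRUE))
with_median <- impute(humidity, median(humidity, na.rm = TRUE))

print(rbind(with_zero, with_mean, with_median))
```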

Transform> Recode

Binning> Quantiles
Binning> KMeans clusters
Binning> Equal width intervals
Binning> N Equally spaced intervals
Indicator variables
Join Categorics
As Categoric
As Numeric
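Several of the recoding options have close equivalents in base R. The sketch below, on made-up data, shows equal-width binning, quantile binning, and indicator variables; it illustrates the techniques rather than reproducing Rattle’s code.

```r
rainfall <- c(0.0, 1.2, 3.4, 0.6, 12.8, 7.1, 0.2, 2.2)

# Equal width intervals: four bins that each span the same range of values.
equal_width <- cut(rainfall, breaks = 4)
print(table(equal_width))

# Quantiles: four bins that each hold about the same number of observations.
quartile_breaks <- quantile(rainfall, probs = seq(0, 1, by = 0.25))
by_quantile <- cut(rainfall, breaks = quartile_breaks, include.lowest = TRUE)
print(table(by_quantile))

# Indicator variables: one 0/1 column for each level of a categorical variable.
wind <- factor(c("N", "S", "E", "N", "S", "E", "N", "S"))
indicators <- model.matrix(~ wind - 1)
print(indicators)
```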


A Comparative Review of the BlueSky Statistics GUI for R

Introduction

BlueSky Statistics’ desktop version is a free and open source graphical user interface for the R software that focuses on beginners looking to point-and-click their way through analyses. A commercial version is also available which includes technical support and a version for Windows Terminal Servers such as Remote Desktop, or Citrix. Mac, Linux, or tablet users could run it via a terminal server.

This post is one of a series of reviews which aim to help non-programmers choose the Graphical User Interface (GUI) that is best for them. Additionally, these reviews include a cursory description of the programming support that each GUI offers.

Terminology

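There are various definitions of user interface types, so here’s how I’ll be using these terms:

GUI = Graphical User Interface using menus and dialog boxes to avoid having to type programming code. I do not include any assistance for programming in this definition. So, GUI users are people who prefer using a GUI to perform their analyses. They don’t have the time or inclination to become good programmers.

IDE = Integrated Development Environment which helps programmers write code. I do not include point-and-click style menus and dialog boxes when using this term. IDE users are people who prefer to write R code to perform their analyses.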

Installation

The various user interfaces available for R differ quite a lot in how they’re installed. Some, such as jamovi or RKWard, install in a single step. Others install in multiple steps, such as the R Commander (two steps) and Deducer (up to seven steps). Advanced computer users often don’t appreciate how lost beginners can become while attempting even a simple installation. The HelpDesks at most universities are flooded with such calls at the beginning of each semester!

The main BlueSky installation is easily performed in a single step. The installer provides its own embedded copy of R, simplifying the installation and ensuring complete compatibility between BlueSky and the version of R it’s using. However, it also means if you already have R installed, you’ll end up with a second copy. You can have BlueSky control any version of R you choose, but if the version differs too much, you may run into occasional problems.

Plug-in Modules

BlueSky is a fairly new open source project, and at the moment all the add-on modules are provided by the company. However, BlueSky’s capabilities approach the comprehensiveness of R Commander, which currently has the most add-ons available. The BlueSky developers are working to create an Internet repository for module distribution.

Startup

You start BlueSky directly by double-clicking its icon from your desktop, or choosing it from your Start Menu (i.e. not from within R itself). It interacts with R in the background; you never need to be aware that R is running.

Data Editor

BlueSky starts up by showing you its main Application screen (Figure 1) and prompts you to enter data with an empty spreadsheet-style data editor. You can start entering data immediately, though at first, the variables are simply named var1, var2…. You might think you can rename them by clicking on their names, but such changes are done in a different manner, one that will be very familiar to SPSS users. There are two tabs at the bottom left of the data editor screen, which are labeled “Data” and “Variables.” The “Data” tab is shown by default, but clicking on the “Variables” tab takes you to a screen (Figure 2) which displays the metadata: variable names, labels, types, classes, values, and measurement scale.

Figure 1. The main BlueSky Application screen.

The big advantage that SPSS offers is that you can change the settings of many variables at once. So if you had, say, 20 variables for which you needed to set the same factor labels (e.g. 1=strongly disagree…5=Strongly Agree) you could do it once and then paste them into the other 19 with just a click or two. Unfortunately, that’s not yet fully implemented in BlueSky. Some of the metadata fields can be edited directly. For the rest, you must instead follow the directions at the top of that screen and right click on each variable, one at a time, to make the changes. Complete copy and paste of metadata is planned for a future version.

Figure 2. The Variables screen in the data editor. The “Variables” tab in the lower left is selected, letting us see the metadata for the same variables as shown in Figure 1.

You can enter numeric or character data in the editor right after starting BlueSky. The first time you enter character data, it will offer to convert the variable from numeric to character and wait for you to approve the change. This is very helpful as it’s all too easy to type the letter “O” when meaning to type a zero “0”, or the letter “I” instead of number one “1”.

To add rows, the Data tab is clearly labeled, “Click here to add a new row”. It would be much faster if the Enter key did that automatically.

To add variables you have to go to the Variables tab and right-click on the row of any variable (variable names are in rows on that screen), then choose “Insert new variable at end.”

To enter factor data, it’s best to leave it numeric such as 1 or 2, for male and female, then set the labels (which are called values using SPSS terminology) afterwards. The reason for this is that once labels are set, you must enter them from drop-down menus. While that ensures no invalid values are entered, it slows down data entry. The developer’s future plans include automatic display of labels upon entry of numeric values.

If you instead decide to make the variable a factor before entering numeric data, it’s best to enter the numbers as labels as well. It’s an oddity of R that factors are numeric inside, while displaying labels that may or may not be the same as the numbers they represent.

To enter dates, enter them as character data and use the “Data> Compute” menu to convert the character data to a date. When I reported this problem to the developers, they said they would add this to the “Variables” metadata tab so you could set it to be a date variable before entering the data.
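The character-to-date conversion itself is a single function call in R. This is a minimal sketch with made-up values, assuming the dates were typed in year-month-day order; the code that BlueSky’s “Compute” dialog generates may differ.

```r
# Dates typed in as character data.
visits <- data.frame(
  visit_text = c("2019-03-15", "2019-04-02", "2019-12-31"),
  stringsAsFactors = FALSE
)

# Convert the text to a true date variable, stating the format explicitly.
visits$visit_date <- as.Date(visits$visit_text, format = "%Y-%m-%d")
print(class(visits$visit_date))

# Once converted, date arithmetic works: days between consecutive visits.
days_between <- as.numeric(diff(visits$visit_date))
print(days_between)
```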

If you have another data set to enter, you can start the process again by clicking “File> New”, and a new editor window will appear in a new tab. You can switch between data sets simply by clicking on a tab, and its window will pop to the front for you to see. When doing analyses, or saving data, the data set that’s displayed in the editor is the one that will be used. That approach feels very natural; what you see is what you get.

Saving the data is done with the standard “File > Save As” menu. You must save each one to its own file. While R allows multiple data sets (and other objects such as models) to be saved to a single file, BlueSky does not. Its developers chose to simplify what their users have to learn by limiting each file to a single data set. That is a useful simplification for GUI users. If a more advanced R user sends a compound file containing many objects, BlueSky will detect it and offer to open one data set (data frame) at a time.
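The idea of picking one data frame out of a compound file can be sketched in R as follows. This is an illustration with made-up objects, not BlueSky’s own code: the file is loaded into a separate environment, the data frames are identified, and one is chosen.

```r
# Build a compound file holding two data frames and a model (all hypothetical).
compound_file <- tempfile(fileext = ".RData")
local({
  trial <- data.frame(id = 1:4, dose = c(5, 10, 14, 21))
  sites <- data.frame(site = c("A", "B"))
  fit <- lm(dose ~ id, data = trial)
  save(trial, sites, fit, file = compound_file)
})

# Load the file into its own environment so the workspace is not cluttered.
contents <- new.env()
load(compound_file, envir = contents)

# Find which of the stored objects are data frames.
is_data_frame <- vapply(ls(contents),
                        function(name) is.data.frame(get(name, envir = contents)),
                        logical(1))
data_frame_names <- names(is_data_frame)[is_data_frame]
print(data_frame_names)

# Open just one of them.
chosen <- get(data_frame_names[1], envir = contents)
print(chosen)
```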

Figure 3. Output window showing standard journal-style tables. Syntax editor has been opened and is shown on right side.

Data Import

The open source version of BlueSky supports the following file formats, all located under “File> Open”:

Comma Separated Values (.csv)
Plain text files (.txt)
Excel (old and new xls file types)
Dbase’s DBF
SPSS (.sav)
SAS binary files (sas7bdat)
Standard R workspace files (RData) with individual data frame selection

The SQL database formats are found under the “File> Import Data” menu. The supported formats include:

Microsoft Access
Microsoft SQL Server
MySQL
PostgreSQL
SQLite

Data Management

It’s often said that 80% of data analysis time is spent preparing the data. Variables need to be transformed, recoded, or created; strings and dates need to be manipulated; missing values need to be handled; datasets need to be stacked or merged, aggregated, transposed, or reshaped (e.g. from wide to long and back). A critically important aspect of data management is the ability to transform many variables at once. For example, social scientists need to recode many survey items, and biologists need to take the logarithms of many variables. Doing these types of tasks one variable at a time can be tedious. Some GUIs, such as jamovi and RKWard, handle only a few of these functions. Others, such as the R Commander, can handle many, but not all, of them.

BlueSky offers one of the most comprehensive sets of data management tools of any R GUI. The “Data” menu offers the following set of tools. Not shown is an extensive set of character and date/time functions which appear under “Compute.”

Missing Values
Compute
Bin Numeric Variables
Recode (able to recode many at once)
Make Factor Variable (able to convert many at once)
Transpose
Transform (able to transform many at once)
Sample Dataset
Delete Variables
Standardize Variables (able to standardize many at once)
Aggregate (outputs results to a new dataset)
Aggregate (outputs results to a printed table)
Subset (outputs to a new dataset)
Subset (outputs results to a printed table)
Merge Datasets
Sort (outputs results to a new dataset)
Sort (outputs results to a printed table)
Reload Dataset from File
Refresh Grid
Concatenate Multiple Variables (handling missing values)
Legacy (does same things but using base R code)
Reshape (long to wide)
Reshape (wide to long)

Continued here…

A Comparative Review of the Deducer GUI for R

Introduction

Deducer is a free and open source Graphical User Interface for the R software, one that provides beginners a way to point-and-click their way through analyses. It also integrates into an environment designed to help programmers be more productive. Deducer is available on Windows, Mac, and Linux; there is no server version.

This post is one of a series of reviews which aim to help non-programmers choose the Graphical User Interface (GUI) that is best for them. However, the reviews will include a cursory description of the programming support that each GUI offers.

Figure 1. JGR console with Deducer menus (left) and Deducer data viewer (right).

Terminology

There are various definitions of user interface types, so here’s how I’ll be using these terms:

Installation

The various user interfaces available for R differ quite a lot in how they’re installed. Some, such as jamovi, BlueSky, or RKWard, install in a single step. Others, such as the R Commander and Rattle, install in multiple steps. Advanced computer users often don’t appreciate how lost beginners can become while attempting even a simple installation. The HelpDesks at most universities are flooded with such calls at the beginning of each semester!

Deducer’s installation is quite complex:

If you haven’t already done so, install the Java JRE. If you’re on Windows, I recommend the Windows x64 64-bit version.
Download and install R. There too, you should need only the 64-bit version.
Start R as an administrator, and from within it install Deducer and its companion IDE, the Java GUI for R (JGR, pronounced “jaguar”) using:
install.packages(c("JGR", "Deducer", "DeducerExtras"))
Start JGR by submitting the commands:
library("JGR")
JGR()
Within the JGR Console, start Deducer by choosing “Packages & Data> Package Manager” and clicking the checkboxes labeled “loaded” and “default” in front of both “Deducer” and “Deducer Extras”, then close the box.
If you wish to get publication-quality output, download and install DeducerRichOutput from here.
Finally, if you wish to start Deducer by clicking an icon (instead of typing two R commands), download the JGR launcher from here. If you have problems getting this to work, start over, paying particular attention to where the instructions say, “as administrator.”

If your goal is to point-and-click your way through analyses, you probably won’t care for that much complexity. However, if your goal is to learn how to program in R, following those steps will help you on your way. Some of those steps are tasks you must learn when programming R.

Plug-in Modules

When choosing a GUI, one of the most fundamental questions is: what can it do for you? What the initial software installation of each GUI gets you is covered in the Graphics, Analysis, and Modeling sections of this series of articles. Regardless of what comes built-in, it’s good to know how active the development community is. They contribute “plug-ins” which add new menus and dialog boxes to the GUI. This level of activity ranges from very low (e.g. RKWard) through moderate (e.g. jamovi) to very active (e.g. R Commander).

Deducer has been in existence since 2009, and during that time nine plug-ins have been developed. Unfortunately there is no single place to go to find them. On the GUI’s “Packages & Data> GUI Add-ons” menu you’ll find four of them. Others are available here. The complete list of plug-ins that I could find is here:

DeducerExtras: An add-on package containing a variety of additional analysis dialogs. These include: Distribution quantiles, single/multiple sample proportion tests, paired t-test, Wilcoxon signed rank test, Levene’s test, Bartlett’s test, k-means clustering, Hierarchical clustering, factor analysis, and multi-dimensional scaling
DeducerPlugInScaling: Reliability and factor analysis
DeducerMMR: Moderated multiple regression and simple slopes analysis
DeducerRichOutput: writes results into true word processing tables with fonts and formatting
DeducerSpatial: A GUI for Spatial Data Analysis and Visualization
RDSAnalyst: Respondent Driven Sampling
gMCP: (Experimental) A graphical approach to sequentially rejective multiple test procedures
RGG: (Experimental) A GUI Generator
DeducerText: (Experimental) Text Mining
DeducerHansel: (Experimental) An add-on package which covers many methods common in econometrics, including binary logit, binary probit, and tobit estimates, and various time-series, panel, and spatial data methods. The time-series methods include cointegration analysis.

Startup

Some user interfaces for R, such as jamovi, start by double-clicking on a single icon, which is great for people who prefer to not write code. Others, such as the R Commander and Rattle, have you start R, then load a package from your library, then call a function. That’s better for people looking to learn R, as those are among the first tasks they’ll have to learn anyway.

On Deducer’s main web site, it recommends the following steps:

Start R.
Load the JGR package from your library by executing the command: library("JGR")
Start JGR by executing the command: “JGR()” and, if you followed the installation instructions above, JGR will start Deducer automatically. Both of the screens shown in Figure 1 will appear.

However, if you make it successfully through all seven installation steps described above, you can also start Deducer by double-clicking on the JGR Launcher icon.

Data Editor / Viewer

Deducer’s data editor is named Data Viewer. That can be confusing since many well-known software packages – including RStudio, the R Commander, and SAS Studio – use the term “viewer” for tools that let you see but not edit the data. The first time I used Deducer, I spent an embarrassing amount of time trying to find the “data editor” when it was right under my nose!

Figure 2. Deducer’s Data Viewer with the “Data View” tab selected (upper left). I have right-clicked on the variable name of “q2” and it displayed a menu of tasks to perform.

You can start Deducer’s Data Viewer by choosing “File> New Data”. You then provide a name, and click OK. You’ll see it execute a command like, “mydata <- data.frame()” but the Data Viewer may not show you an empty spreadsheet. It tends to lock onto your last data set, but you can choose the drop-down menu labeled “Data Set” to get to the name of the one you just started to create. An empty version of the screen shown in Figure 2 will appear.

You can start entering data immediately, though the variables will be named V1, V2,… at first. Numeric and character data will be fine, but don’t enter any other type of variables yet, such as dates. Before you go very far, it’s important to click on the “Variable View” tab and fill in your metadata, such as variable names, Type and Factor Level (see Figure 3). When the metadata are filled in, the data editor may wipe out any existing data! For example, if you enter some dates like “8/31/2018”, they will be stored as character. If you then switch to the Variable View, click on Type for that variable, and choose “Date” from the drop-down menu, the editor will delete the existing dates.
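If dates have already been typed in as character strings, one way to avoid losing them is to convert them in code rather than through the Variable View. The sketch below uses base R's as.Date() on a hypothetical data frame named visit; it is a workaround under that assumption, not a description of what Deducer does internally.

```r
# Dates typed into a grid typically arrive as character strings.
visit <- data.frame(id   = 1:3,
                    when = c("8/31/2018", "9/1/2018", "12/15/2018"),
                    stringsAsFactors = FALSE)

# as.Date() needs the format spelled out: month/day/four-digit year.
visit$when_date <- as.Date(visit$when, format = "%m/%d/%Y")

# Converting into a new column leaves the original strings untouched,
# so nothing is lost if the format was wrong (which would yield NA values).
print(class(visit$when_date))
print(sum(is.na(visit$when_date)))
```

Once the new column checks out (no unexpected missing values), the original character column can be dropped.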

This combination of Data View/Variable View is a common one which was made popular by SPSS. In that software it offers great power by letting you copy metadata from one variable to dozens of others. So you might have survey data where 1=“Strongly Disagree”, 2=“Disagree”, …, 5=“Strongly Agree”. SPSS would allow you to define this for one variable, then copy it and paste it into many others. Deducer’s Variable View does not allow that. You must work one variable at a time, which gets quite tedious.
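For comparison, applying one set of value labels to many survey items takes only a few lines of R code. The following is a minimal sketch, assuming a hypothetical data frame with items q1 to q3 scored 1 to 5. The labels for 3 and 4 ("Neutral" and "Agree") are my assumption, since only the labels for 1, 2, and 5 are given above.

```r
# Hypothetical survey items scored 1 through 5.
survey <- data.frame(q1 = c(1, 5, 3),
                     q2 = c(2, 2, 4),
                     q3 = c(5, 1, 3))

items <- c("q1", "q2", "q3")

# Labels for 3 and 4 are assumed for illustration.
agree_labels <- c("Strongly Disagree", "Disagree", "Neutral",
                  "Agree", "Strongly Agree")

# Define the labels once, then apply them to every item in one step.
survey[items] <- lapply(survey[items], factor,
                        levels = 1:5, labels = agree_labels)

str(survey)
```

Passing levels = 1:5 explicitly means every item gets all five levels, even if some responses never occur in the data.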

To open an existing data set, choose “File> Open Data”. If it doesn’t appear in the Data Viewer window, choose it from the Data Set drop-down menu.

Figure 3. Deducer’s Data Viewer with the “Variable View” tab selected (upper left). This displays and lets us edit the metadata for the same data as shown in Figure 2.

Saving the data is done with the standard “File> Save As” menu. You must save each one to its own file. While R allows multiple data sets (and other objects such as models) to be saved to a single file, Deducer does not. Its developers chose to simplify what their users have to learn by limiting each file to a single data set. However, you can also save or load multiple data sets by using JGR’s workspace save and open menu items. This strikes a good balance as beginners will relate to the simplicity of one-data-set-per-file, while advanced users will like the option to deal with more complex multi-object workspaces.

[Continued here…]

The Popularity of Point-and-Click GUIs for R

Point-and-click graphical user interfaces (GUIs) for R allow people to analyze data using the R software, without having to learn how to program in the R language. This is a brief look at how popular each one is. Knowing that a GUI is popular doesn’t mean it will meet your needs, but it does mean that it’s meeting the needs of many others. This may be helpful information when selecting the appropriate GUI for you, if programming is not your primary interest. For detailed information regarding what each GUI can do for you, and how it works, see my series of comparative reviews, which is currently in progress.

There are many ways to estimate the popularity of data science software, but one of the most accurate is by counting the number of downloads (see appendix for details). Figure 1 shows the monthly downloads of four of the six R GUIs that I’m reviewing (i.e. all that exist as far as I know). We can see that the R Commander (Rcmdr) is the most popular GUI, and it has had steady growth since its introduction. Next comes Rattle, which is more oriented towards machine learning tasks. It, too, has shown high popularity and steady growth.

The three lines at the bottom could use more “breathing room” so let’s look at them in their own plot.

Figure 1. Number of times each software was downloaded by month.

Figure 2 shows the same data as Figure 1, but with the two most popular GUIs removed to make room to study the remaining data. From it we can see that Deducer has been around for many more years than the other two. Downloads for Deducer grew steadily for a couple of years, then they leveled off. Its downloads appear to be declining slightly in recent years. jamovi (its name is not capitalized) has only been around for a brief period, and its growth has been very rapid. As you can see from my recent review, jamovi has many useful features.

Figure 2. Number of times the less popular GUIs were downloaded. (Same as Fig. 1, with the R Commander and Rattle removed).

The lowest (blue) line shows downloads for the jmv package, which contains all the functions used by the jamovi GUI. It allows programmers to write code instead of using the jamovi GUI. People who point-and-click their way through an analysis in jamovi can send their code to any R user, who would then use the jmv package to run it. Since most jamovi users would prefer to point-and-click their way through analyses, it makes sense that the jmv package has been downloaded far fewer times than jamovi itself.

Two GUIs are missing from this plot: RKWard and BlueSky Statistics. Neither of those is downloaded from CRAN, and I was unable to obtain data from the developers of those GUIs. However, knowing that RKWard has a similar number of point-and-click features as Deducer, one can deduce (heh!) that it might have a similar level of popularity. The BlueSky software has only recently appeared on the scene, especially with its current level of features, so I expect it too will be towards the bottom, but growing rapidly.

I’m nearly done with all my reviews, so stay tuned to see what the other GUIs offer.

Acknowledgements

Thanks to Guangchuang Yu for making the dlstats package which allowed me to collect data so easily. Thanks also to Jonathon Love, who provided the download data for jamovi, and to Josh Price for his helpful editorial advice.

Appendix: Where the Data Came From

I used R’s dlstats package, which makes quick work of gathering counts of monthly downloads of R packages from the Comprehensive R Archive Network (CRAN). CRAN consists of sites around the world called “mirrors” from which people can download R packages. When starting the download process, R asks you to choose a mirror that is close to your location. In the popular RStudio development environment for R, the default mirror is set to their own server, which is actually a worldwide network of mirrors. Since it’s the default download location in a very popular tool for R, its download data will give us a good idea of the relative popularity of each GUI. The absolute popularity will be greater, but to get that data I would have to gather data from all the other servers around the world. If you have time to do that, please send me the results!

A Comparative Review of the RKWard GUI for R

Introduction

RKWard is a free and open source Graphical User Interface for the R software, one that supports beginners looking to point-and-click their way through analyses, as well as advanced programmers. You can think of it as a blend of the menus and dialog boxes that R Commander offers combined with the programming support that RStudio provides. RKWard is available on Windows, Mac, and Linux.

This review is one of a series which aims to help non-programmers choose the Graphical User Interface (GUI) that is best for them. However, I do include a cursory overview of how RKWard helps you work with code. In most sections, I’ll begin with a brief description of the topic’s functionality and how GUIs differ in implementing it. Then I’ll cover how RKWard does it.

Figure 1. RKWard’s main control screen containing an open data editor window (big one), an open dialog box (right) and its output window (lower left).

Terminology

There are various definitions of user interface types, so here’s how I’ll be using these terms:

GUI = Graphical User Interface specifically using menus and dialog boxes to avoid having to type programming code. I do not include any assistance for programming in this definition. So GUI users are people who prefer using a GUI to perform their analyses. They often don’t have the time required to become good programmers.

Installation

The various user interfaces available for R differ quite a lot in how they’re installed. Some, such as jamovi or BlueSky Statistics, install in a single step. Others install in multiple steps, such as R Commander and Deducer. Advanced computer users often don’t appreciate how lost beginners can become while attempting even a single-step installation. I work at the University of Tennessee, and our HelpDesk is flooded with such calls at the beginning of each semester!

Installing RKWard on Windows is done in a single step since its installation file contains both R and RKWard. However, Mac and Linux users have a two-step process: installing R first, then downloading RKWard, which links up to the most recent version of R that it finds. Regardless of their operating system, RKWard users never need to learn how to start R, then execute the install.packages function, and then load a library. Installers for all three operating systems are available here.

The RKWard installer obtains the appropriate version of R, simplifying the installation and ensuring complete compatibility. However, if you already had a copy of R installed, depending on its version, you could end up with a second copy.

RKWard minimizes the size of its download by waiting to install some R packages until you actually try to use them for the first time. Then it prompts you, offering default settings that will get the package you need.

On Windows, the installation file is 136 megabytes in size.

Plug-ins

When choosing a GUI, one of the most fundamental questions is: what can it do for you? What the initial software installation of each GUI gets you is covered in the Graphics, Analysis, and Modeling sections of this series of articles. Regardless of what comes built-in, it’s good to know how active the development community is. They contribute “plug-ins” which add new menus and dialog boxes to the GUI. This level of activity ranges from very low (RKWard, BlueSky, Deducer) through moderate (jamovi) to very active (R Commander).

Currently all plug-ins are included with the initial installation. You can see them using the menu selection Settings> Configure Packages> Manage RKWard Plugins. There are only brief descriptions of what they do, but once they are installed, you can access the help files with a single click.

RKWard add-on modules are part of standard R packages and are distributed on CRAN. Their package descriptions include a field labeled, “enhances: rkward”. You can sort packages by that field in RKWard’s package installation dialog where they are displayed with the RKWard icon.

Continued here…