R is free and powerful software for data analysis and graphics. However, its flexible approach is so different from other software that it can be frustrating to learn. This live-via-Internet 8-hour workshop introduces R in a way that takes advantage of what you already know. For many topics we will begin with add-on commands that work similarly to your current software. Then we will cover R’s built-in commands that provide simpler but more flexible output. We will also discuss aspects of R that are likely to trip you up. For example, many R functions let you specify which data set to use with syntax that looks identical to SAS but behaves differently, which is likely to lead to perplexing error messages.
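One common instance of this kind of pitfall can be sketched in R as follows. It is an illustrative assumption, not necessarily the example used in the workshop: the data frame `mydata` and its variables `group` and `score` are hypothetical, and the sketch assumes no object named `score` already exists in your workspace.

```r
# Hypothetical data frame, made up for illustration only
mydata <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  score = c(1, 2, 3, 4, 5, 6)
)

# Formula-based functions such as lm() accept a data argument,
# which looks much like DATA= on a SAS procedure
fit <- lm(score ~ group, data = mydata)

# mean() has no data argument, so R searches the workspace for an
# object named score and fails with an "object not found" error
result <- try(mean(score, data = mydata), silent = TRUE)
failed <- inherits(result, "try-error")

# Two ways that do work: name the data frame explicitly, or use with()
m1 <- mean(mydata$score)
m2 <- with(mydata, mean(score))
```

Here `failed` is `TRUE`, while `m1` and `m2` both hold the mean of the six scores.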
Most of our time will be spent working through examples that you may run simultaneously on your computer. You will see both the instructor’s screen and yours, side-by-side, as we run the examples and discuss the output. However, the handouts include each step and its output, so feel free to skip the computing; it’s easy to just relax and take notes. The slides and programming steps are numbered so you can easily switch from computing to slides and back again.
Most of the examples come from R for SAS and SPSS Users and R for Stata Users. That makes it easy to go back later and review what we did with full explanations, or to learn more about a particular subject by extending an example that you have already seen.
The workshop is presented live-via-Internet on Monday and Wednesday afternoons in 4-hour sessions. Each session is divided into sections of around 20 minutes per topic and there are breaks every 75 minutes. There is ample time to ask questions verbally or by typing them into a Q&A window. After each session, you will receive a set of practice exercises for you to do on your own time, as well as solutions to the problems.
The entire session is recorded and available for study for two weeks afterwards. The instructor is available to handle workshop-related questions both during the workshop and at any time in the future.
Attendees should know how to program in SAS, SPSS or Stata and be familiar with basic statistical methods including linear regression and one-way analysis of variance.
When finished, you will be able to use R to import data, transform it, create publication quality graphics, perform commonly used statistical analyses and know how to generalize that knowledge to more advanced methods. You will also have an especially thorough understanding of how R compares to SAS, SPSS and Stata.
Robert A. Muenchen is the author of R for SAS and SPSS Users and, with Joseph M. Hilbe, R for Stata Users. He is also the creator of r4stats.com, a popular web site devoted to analyzing trends in analytics software and helping people learn the R language. Bob is an ASA Accredited Professional Statistician™ with 30 years of experience and is currently the manager of OIT Research Computing Support (formerly the Statistical Consulting Center) at the University of Tennessee. He has taught workshops on research computing topics for more than 500 organizations. Bob has written or coauthored over 70 articles published in scientific journals and conference proceedings, and has provided guidance on more than 1,000 graduate theses and dissertations.
Bob has served on the advisory boards of SAS Institute, SPSS Inc., StatAce OOD, the Statistical Graphics Corporation and PC Week Magazine. His suggested improvements have been incorporated into SAS, SPSS, StatAce, JMP, STATGRAPHICS and several R packages. His research interests include statistical computing, data graphics and visualization, text analytics, and data mining.
The workshop is delivered to your computer using Cisco WebEx. You can join a test meeting and see computer system requirements here. It’s important to test your computer since many organizations require special privileges to modify your computer to accept the browser plug-in that makes it work.
Course programs, data, and exercises will be sent to you a week before the workshop. The instructions include installing R, which you can download for free here: http://www.r-project.org/
We will also use RStudio, which you can download for free here: http://RStudio.com. If you already know a different R editor, that’s fine too.
(In-depth data management topics are covered in a separate workshop that follows on the Friday afternoon immediately after this one.)
- Introduction and statement of goals
- Overview of R
- Installing and maintaining R
- Programming Language Basics – including creating, subsetting and analyzing vectors (variables), factors (categorical variables), data frames (data sets), matrices, arrays and lists.
- Managing your files and workspace – R provides a complete environment that includes many commands for listing, printing, saving, deleting data as well as examining object structure.
- Controlling functions (procedures or commands) using arguments (options or parameters) or an object’s class; how to change class
- Data Acquisition – reading comma- and tab-delimited files, Excel, SAS, SPSS & Stata
- Data Transformations – modifying existing variables and creating new ones
- Selecting variables and observations – R offers many more ways to do selection
- Writing functions (macros)
- Traditional graphics (similar to old SAS and SPSS graphics) including bar, scatter, strip, box plots, histograms, plotting groups, adding embellishments and regression fits.
- Lattice graphics (similar to new SAS SG* and Stata graphics) – a brief overview
- The Grammar of Graphics approach using the ggplot2 package (similar to SPSS GPL) including: qplot vs. ggplot; bar charts, histograms, scatter, strip, multi-layered plots; group plots, adding embellishments and fit lines.
- Interactive graphics – a brief overview (similar to JMP, SAS/INSIGHT, SAS/IML Studio)
- Graphics resources
- Descriptive statistics done both the SAS/SPSS/Stata way and the R way
- Crosstabulation done both the SAS/SPSS/Stata way and the R way
- “By” or “split file” processing of groups
- Correcting p-values for the effects of multiple testing
- Correlation: Pearson, Spearman, both standard and Bayesian p-values
- Linear regression
- Extractor functions (like Stata’s postestimation commands)
- t-test, including standard and Bayesian p-values, Wilcoxon Mann-Whitney rank sum test
- Paired t-test including standard and Bayesian p-values, Wilcoxon signed-rank test
- Analysis of variance, Kruskal-Wallis & post hoc tests
- Getting publication-quality output into Word, LibreOffice, HTML and LaTeX
- Ways to run R
- Programs that include other programs
- Running R from within SAS and SPSS
- Running R as an adjunct to Stata
- Graphical User Interfaces: R Commander, JGR, Rattle, Excel
- Summary of topics learned
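To give a flavor of the outline item on selecting variables and observations, here is a minimal sketch of a few of the selection styles base R offers. The data frame `mydata` and its variables are hypothetical, and the workshop's own examples may differ.

```r
# Hypothetical data frame, made up for illustration only
mydata <- data.frame(
  id     = 1:4,
  gender = c("f", "m", "f", "m"),
  q1     = c(5, 3, 4, 2)
)

# Selecting a variable: by column position, by name, or with $
by_index  <- mydata[, 3]
by_name   <- mydata[, "q1"]
by_dollar <- mydata$q1

# Selecting observations: by row position, by logical condition,
# or with the subset() function
by_position <- mydata[c(1, 3), ]
by_logic    <- mydata[mydata$gender == "f", ]
by_subset   <- subset(mydata, gender == "f")
```

All three variable selections return the same vector, and all three observation selections return the same two rows.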
Here is a slide show of previous workshops.