Workshop: Introduction to Modern R

The R language is often taught by focusing first on its built-in functions. Later, beginners discover add-on packages that make R much easier to use. This workshop does just the opposite: it uses the easiest and fastest R commands right from the start. For each topic, it then covers the built-in functions briefly, pointing out why we focus on an alternative. There are, of course, situations in which the built-in functions are the best choice, and we’ll go over those as well.

Most of our time will be spent working through examples that you may run simultaneously on your computer. You will see both the instructor’s screen and yours, as we run the examples and discuss the output. However, the handouts include each step and its output, so feel free to skip the computing; it’s easy to just relax and take notes. The slides and programming steps are numbered so you can easily switch from computing to slides and back again.

This workshop is available at your organization’s site, or via webinars.

The on-site version is the most engaging by far, generating much discussion and occasionally veering off briefly to cover topics specific to a particular organization. The instructor presents a topic for around twenty minutes. Then we switch to exercises, which are already open in another tabbed window. The exercises contain hints that show the general structure of the solution; you adapt those hints to get the final solution. The complete solutions are in a third tabbed window, so if you get stuck the answers are a click away. The typical schedule for on-site training is located here.

A webinar version is also available. This approach saves travel expenses and is especially useful for organizations with branch offices. It’s offered as two half-day sessions, often with a day or two skipped in between to give participants a chance to do the exercises and catch up on other work. There is time for questions on the lecture topics (live) and the exercises (via email). However, webinar participants are typically much less engaged, and far less discussion takes place.

For further details or to arrange a webinar or site visit, contact the instructor, Bob Muenchen, at muenchen.bob@gmail.com.

Prerequisites

This workshop assumes no prior knowledge of R. Some knowledge of statistics is helpful, but not required. The instructor is well aware that knowledge of statistics fades rapidly when not used!

Learning Outcomes

When finished, participants will be able to use R to import data, transform it, create publication-quality graphics, perform commonly used statistical analyses, and generalize that knowledge to more advanced methods.

Presenter

Robert A. Muenchen is the author of R for SAS and SPSS Users and, with Joseph M. Hilbe, R for Stata Users. He is also the creator of r4stats.com, a popular web site devoted to analyzing trends in analytics software and helping people learn the R language. Bob is an ASA Accredited Professional Statistician™ with 35 years of experience and is currently the manager of OIT Research Computing Support (formerly the Statistical Consulting Center) at the University of Tennessee. He has taught workshops on research computing topics for more than 500 organizations and has offered training in partnership with the American Statistical Association, DataCamp.com, New Horizons Computer Learning Centers, Revolution Analytics, RStudio and Xerox Learning Services. Bob has written or coauthored over 70 articles published in scientific journals and conference proceedings, and has provided guidance on more than 1,000 graduate theses and dissertations.

Bob has served on the advisory boards of SAS Institute, SPSS Inc., StatAce OOD, Intuitics, the Statistical Graphics Corporation and PC Week Magazine (now eWeek). His suggested improvements have been incorporated into SAS, SPSS, JMP, STATGRAPHICS and several R packages. His research interests include statistical computing, data graphics and visualization, text analytics, and data mining.

Computer Requirements

On-site training is best done in a computer lab with a projector and, for large rooms, a PA system. The webinar version is delivered to your computer using Zoom (or a similar webinar system if your organization has a preference).

Course programs, data, and exercises will be sent to you a week before the workshop. The instructions include installing R, which you can download for free here: http://www.r-project.org/. We will also use RStudio, which you can download for free here: http://RStudio.com. If you already know a different R editor, that’s fine too.

Course Outline
(In-depth data management topics are covered in an optional separate workshop that usually follows immediately after this one.)

  1. Introduction and statement of goals
    1. Overview of R
    2. Installing and maintaining R
    3. Getting help
  2. Programming Language Basics – including creating, subsetting and analyzing:
    1. Vectors (variables)
    2. Factors (categorical variables)
    3. Data frames (data sets)
    4. “Tibbles” (dplyr’s tbl_df data frames)
    5. Matrices
    6. Arrays
    7. Lists
  3. Managing your files and workspace
    1. Listing their names
    2. Printing
    3. Deleting
    4. Saving
    5. Examining structure of data sets, etc.
  4. Controlling functions (procedures or commands) using
    1. Arguments (options or parameters)
    2. An object’s class
    3. How to change class
    4. Model formulas
  5. Data Acquisition – Reading files (includes whichever formats your organization needs)
    1. Comma separated value files
    2. Tab-delimited files
    3. Excel files
    4. Minitab data sets
    5. SAS data sets
    6. SPSS save files
    7. Stata data sets
  6. Data Transformations using
    1. Math formulas
    2. Recoding
    3. Conditional (logical) formulas
  7. Selecting variables and observations using:
    1. Dollar format
    2. The “attach” function
    3. The “with” function
    4. Subscripting (a.k.a. indexing)
    5. dplyr’s select and filter functions
    6. Model formulas and the “data=” argument
  8. Writing functions (macros)
    1. Why they’re more important in R than most languages
    2. How to create functions
    3. How to apply functions to data frames
    4. Applying functions by group
  9. Graphics
    1. Traditional graphics including:
      1. Bar charts
      2. Scatter plots
      3. Strip plots
      4. Box plots
      5. Histograms
      6. Repeating above plots by groups
      7. Adding titles, etc.
      8. Adding regression lines
    2. Lattice graphics – a brief overview
    3. The Grammar of Graphics approach using the ggplot2 package
      1. qplot vs. ggplot
      2. Bar charts
      3. Histograms
      4. Scatter plots
      5. Strip plots
      6. Multi-layered plots
      7. Group plots
      8. Adding titles, etc.
      9. Adding regression lines
    4. Interactive graphics – a brief overview
    5. Graphics resources
  10. Statistics – many analyses are shown with both R’s sparse default output and the richer output that most people prefer.
    1. Descriptive statistics
    2. Crosstabulation with chi-squared test
    3. Repeating an analysis by groups or departments (a.k.a. “By” or “split file”)
    4. Correcting p-values for the effects of multiple testing
    5. Correlation: Pearson, Spearman
    6. Linear regression
    7. Extractor functions (a.k.a. postestimation commands)
    8. T-tests
    9. Wilcoxon Mann-Whitney rank sum test
    10. Paired t-test
    11. Wilcoxon signed-rank test
    12. Analysis of variance
    13. Kruskal-Wallis
    14. Post hoc tests
  11. Getting publication-quality output into
    1. Word
    2. HTML
    3. LaTeX (optional)
  12. Ways to run R (includes only those of interest to your organization)
    1. Interactively
    2. Programs that include other programs
    3. Running R from within SAS
    4. Running R from within SPSS
    5. Running R as an adjunct to Stata
    6. Graphical User Interfaces:
      1. R Commander
      2. Rattle data mining interface
      3. Excel integration
      4. Alteryx/KNIME/RapidMiner
  13. Summary of topics learned

Here is a slide show of previous workshops.