Learn R the easy way, by focusing on modern “tidyverse” functions. This 2-day workshop starts from scratch and shows you how to import data, then transform, visualize, and analyze it. You’ll have hands-on experience every step of the way. The slides, examples, and output are all integrated into a single document. You can add your own notes as you go, and when finished, you “knit” it all together into a single 156-page book.

This workshop is available at your organization’s site, or via webinars.

The on-site version is the most engaging, generating much discussion and occasionally veering off briefly to cover topics specific to a particular organization. The instructor and participants work through a topic hands-on for around twenty minutes. Then we switch to exercises, which are already open in another tabbed window. The exercises contain hints that show the general structure of the solution; you adapt those hints to get the final solution. The complete solutions are in a third tabbed window, so if you get stuck the answers are a click away. The typical schedule for training on site is located here.

A webinar version is also available. This approach saves travel expenses and is especially useful for organizations with branch offices. It’s offered as two half-day sessions, often with a day or two skipped in between to give participants a chance to do the exercises and catch up on other work. There is time for questions on the lecture topics (live) and the exercises (via email). However, webinar participants are typically much less engaged, and far less discussion takes place.

For further details or to arrange a webinar or site visit, contact the instructor, Bob Muenchen, at muenchen.bob@gmail.com.

**Prerequisites**

This workshop assumes no prior knowledge of R. Some knowledge of statistics is helpful, but not required. The instructor is well aware that knowledge of statistics fades rapidly when not used.

**Learning Outcomes**

When finished, participants will be able to use R to import data, transform it, create publication-quality graphics, and perform commonly used statistical analyses, and will know how to generalize that knowledge to more advanced methods.

**Presenter**

Robert A. Muenchen is the author of *R for SAS and SPSS Users* and, with Joseph M. Hilbe, *R for Stata Users*. He is also the creator of r4stats.com, a popular website devoted to analyzing trends in analytics software and helping people learn the R language. Bob is an ASA Accredited Professional Statistician™ with 35 years of experience and is currently the manager of OIT Research Computing Support (formerly the Statistical Consulting Center) at the University of Tennessee. He has taught workshops on research computing topics for more than 500 organizations and has offered training in partnership with the American Statistical Association, DataCamp.com, New Horizons Computer Learning Centers, Revolution Analytics, RStudio, and Xerox Learning Services. Bob has written or coauthored over 70 articles published in scientific journals and conference proceedings and has provided guidance on more than 1,000 graduate theses and dissertations.

Bob has served on the advisory boards of SAS Institute, SPSS Inc., StatAce OOD, Intuitics, the Statistical Graphics Corporation and PC Week Magazine (now eWeek). His suggested improvements have been incorporated into SAS, SPSS, JMP, STATGRAPHICS, and many R packages. His research interests include statistical computing, data graphics and visualization, text analytics, machine learning, and data mining.

**Computer Requirements**

We will use the free and open-source version of R, which you can download here: http://www.r-project.org/. We will also use RStudio, which you can download for free here: http://RStudio.com. If you already know a different R editor, that’s fine too.

On-site training is best done in a computer lab with a projector and, for large rooms, a PA system. The webinar version is delivered to your computer using Zoom (or a similar webinar system if your organization has a preference).

**Course Materials**

Course notes, programs, data sets, practice exercises, and solutions will be sent to you in electronic form a week before the workshop. For ease of searching, the course notes are indexed by keywords from Excel, SQL, SAS, SPSS, and Stata. Searching on any fundamental topic from those languages is likely to take you directly to the R equivalent.

Other searchable keywords include alerts on topics that people often err on, as well as common R warning and error messages along with their meanings and solutions. The notes, code, and output are summarized in the 156-page book, *Introduction to Modern R*, which has an interactive table of contents, allowing you to jump quickly to any topic.

**Course Outline**

(In-depth data management topics are covered in an optional separate workshop that usually follows immediately after this one.)

1. INTRODUCTION

1.1 Topics

1.2 Preparing Your Computer

1.3 Note to System Administrators

2. OVERVIEW OF R

2.1 What is R?

2.2 R’s Advantages

2.3 R’s Disadvantages

2.4 Is R Accurate?

2.5 The Five Main Parts of SAS / SPSS / Stata

2.6 Workshops vs. Books

3. INSTALLING & MAINTAINING R

3.1 Package Installation & Loading

3.2 Choosing a “Mirror”

3.3 Finding Packages

3.4 What if Packages Change?

4. RSTUDIO BASICS & WORKSHOP FILES

4.1 Starting an R Script File

4.2 RStudio Tips

4.3 Workshop Files

4.4 Keywords, Alerts, Warnings, & Errors

5. R MARKDOWN

5.1 Starting an R Markdown File

5.2 R Markdown Language

5.3 R Markdown Knitting Options

5.4 R Markdown Chunk Options

6. R LANGUAGE BASICS

6.1 Objects & Their Names

6.2 Console Prompts

6.3 R Comments

6.4 Expressions

6.5 Assignments

6.6 Commands

6.7 Spacing Example

6.8 Impact of (Parentheses)

6.9 Impact of {Braces}

6.10 Getting Package Info

6.11 Package Conflicts

6.12 Resolving Package Conflicts

7. HELP & DOCUMENTATION

7.1 Help Files

7.2 Help Details

7.3 More Specific Help

7.4 Help for a Whole Package

7.5 Documentation

7.6 Free Internet Support

7.7 Practice Time

8. DATA STRUCTURES

8.1 A Quick Poll

8.2 R vs. Other Software

8.3 Numeric Vectors

8.4 Printing Vectors (or Any Object)

8.5 Vector Operations

8.6 Example Operations

8.7 Vector Attributes

8.8 Character Vectors

8.9 More Numeric Vectors

8.10 Example Function Calls

8.11 Selecting Vector Elements

8.12 Factors

8.13 Creating a Factor

8.14 Value Labels

8.15 Factor Arguments

8.16 Selecting by Factor Label

8.17 Factor from Character Vector

8.18 Adding Value Labels

8.19 Our Data So Far

8.20 Data Frame Creation

8.21 Why Use Data Frames?

8.22 Data Frame Details

8.23 More Data Frame Details

8.24 Tibble Creation

8.25 How Tibbles Improve Printing

8.26 Other Tibble Advantages

8.27 Matrices

8.28 Matrix Creation via the matrix Function

8.29 Matrix Printing

8.30 Matrix Creation via Column Binding

8.31 Matrix Function Use

8.32 Arrays

8.33 List Creation

8.34 List Details

8.35 Naming Components

8.36 Getting Component Names

8.37 Table of Data Structures, Modes & Classes

8.38 Practice Time

9. MANAGING FILES & WORKSPACE

9.1 Preparing the Workspace

9.2 Introduction

9.3 Listing Objects

9.4 ls Examples

9.5 Printing Objects

9.6 Displaying Attributes

9.7 Examining Object Structure

9.8 Deleting Objects

9.9 rm Examples

9.10 Working Directory

9.11 Saving Your Work

9.12 Quitting & Restarting

9.13 R “Helps” Automate Saving

9.14 Blocking Automatic Saving/Loading

9.15 Special R Files

9.16 Practice Time

10. CONTROLLING FUNCTIONS

10.1 Preparing the Workspace

10.2 R Functions

10.3 Function Output

10.4 Argument Name vs. Position

10.5 A Common Error

10.6 The Triple-dot Argument

10.7 Controlling Functions with Class

10.8 Seeing What Methods Exist

10.9 Changing Class Changes Output

10.10 Combining Function Calls

10.11 Practice Time

11. DATA ACQUISITION

11.1 Preparing the Workspace

11.2 A Quick Poll

11.3 Comma Separated Values

11.4 CSV File Details

11.5 Resulting Tibble

11.6 Data Within a Program Using read_csv

11.7 Data Within a Program Using tribble

11.8 Tab Delimited mydata.tab

11.9 Tab File Details

11.10 Reading Tab File

11.11 Excel Files

11.12 SAS, SPSS, Stata Files

11.13 Database via ODBC

11.14 Database Directly

11.15 Other Databases

11.16 Other Data Sources

11.17 Practice Time

12. CHOOSING VARIABLES FROM DATA FRAMES

12.1 Preparing the Workspace

12.2 The Way You Choose Variables Matters!

12.3 Which Data Frame(s)?

12.4 Choosing Vars Using Dollar Notation

12.5 dplyr’s select Function

12.6 select Variable Options

12.7 Subscripting or Indexing

12.8 Column Position Can Contain…

12.9 Leaving Out the Comma

12.10 Comma Impact on Tibbles

12.11 Choose Vars Using [[ ]] Notation

12.12 Choosing Vars Using Formulas

12.13 The attach and with Functions

12.14 Recommendations

12.15 A Common Selection Error

12.16 Practice Time

13. CHOOSING OBSERVATIONS FROM DATA FRAMES

13.1 Preparing the Workspace

13.2 Using dplyr’s filter Function

13.3 Using Subscripting [ ]

13.4 Logic Rules

13.5 Impact of the “which” Function

13.6 Effect of “which” on Logical Vectors

13.7 Using Selections in Analyses

13.8 Table of Logical Comparisons

13.9 Practice Time

14. CHOOSING BOTH VARS & OBS FROM DATA FRAMES

14.1 Preparing the Workspace

14.2 Combining select and filter

14.3 select & filter Details

14.4 Using Both Subscripts

14.5 Saving Subsets

14.6 Practice Time

15. TRANSFORMATIONS

15.1 Preparing the Workspace

15.2 Using Dollar Notation

15.3 Using mutate

15.4 Resulting Data

15.5 mutate Details

15.6 Table of Transformations

15.7 Practice Time

16. MISSING VALUES

16.1 Preparing the Workspace

16.2 Reading Blanks as Missing

16.3 Reading Other Values as Missing

16.4 Missing Value Codes

16.5 How Missing Values Sort

16.6 Logic for Missing Values

16.7 Using naniar to Count Missing & Valid

16.8 Using Logic to Count Missing & Valid

16.9 Action on Missing Values

16.10 Manual Listwise Deletion

16.11 Mean/Median Substitution

16.12 Advanced Imputation Methods

16.13 The “simputation” Package

16.14 Practice Time

17. GRAPHICS: BASE

17.1 Preparing the Workspace

17.2 Importance of Graphing

17.3 Base Graphics Overview

17.4 Barplot

17.5 Barplot Stacked

17.6 Boxplot

17.7 Scatterplot

17.8 Histogram

17.9 Adding Embellishments

17.10 Graphics Parameters

17.11 What Objects Can plot Handle?

17.12 Plotting Groups

17.13 Practice Time

18. GRAPHICS: ggplot2 PACKAGE

18.1 Preparing the Workspace

18.2 The ggplot2 Package

18.3 ggplot vs. qplot

18.4 The Grammar Components

18.5 Barplot

18.6 ggplot Syntax

18.7 Barplot Stacked

18.8 Barplot Dodged

18.9 Barplot with Facets

18.10 Boxplot with Overlay of Points

18.11 Boxplot with Facets

18.12 Simple Scatterplot

18.13 Scatterplot with Points & Shapes Set by a Factor

18.14 Regression Lines Set by a Factor

18.15 Scatterplot with Facets

18.16 Changing Colors & Styles

18.17 Grey Scale

18.18 Black and White Background with Grid

18.19 Color Palettes

18.20 Applying a Color Palette

18.21 Example Theme: Wall Street Journal

18.22 Color Blind Correction

18.23 Interactive Graphics

18.24 Graphics Resources

18.25 Practice Time

19. WRITING & APPLYING FUNCTIONS

19.1 Preparing the Workspace

19.2 Applying Functions to Data Frames

19.3 A map Example

19.4 The Family of Map Functions

19.5 Using map_dbl

19.6 An Example Function

19.7 Rules for Writing Functions

19.8 Applying mystats

19.9 Anonymous Functions

19.10 Including Functions from Files

19.11 Practice Time

20. STATISTICS REVIEW

20.1 Goals of Statistical Analysis

20.2 Meaning of Significance

20.3 Impact of Data Size

20.4 Impact of Multiple Testing

21. BASIC STATISTICS

21.1 Preparing the Workspace

21.2 R’s summary Function

21.3 The skim Function

21.4 jmv Package’s descriptives Function

21.5 Frequency & Percent Tables

21.6 Cross-tabulation & Chi-Square

21.7 R’s Built-in table Function

21.8 Table-Related Functions

21.9 Cross-tabulations Using the jmv Package

21.10 Other Categorical Functions

22. CORRELATION & REGRESSION

22.1 Preparing the Workspace

22.2 Correlation

22.3 R’s Built-in cor Function

22.4 Built-in Test of Significance

22.5 R Commander’s rcorr.adjust Does More

22.6 Multiple Regression

22.7 Modeling Functions

22.8 Linear Models Using lm

22.9 Model Contents

22.10 Printing Entire Model Contents

22.11 Finding Relevant Functions

22.12 Table of Regression Formulas

23. COMPARING GROUPS

23.1 Preparing the Workspace

23.2 Two Independent Groups

23.3 Independent Samples t.test

23.4 Independent Samples Non-parametric wilcox.test

23.5 Paired Samples

23.6 Paired Samples t.test

23.7 Paired Samples Non-parametric wilcox.test

23.8 Comparing More Than 2 Groups

23.9 Getting Means & Variances

23.10 Test for Equality of Variances

23.11 Create AOV Model

23.12 Plot Diagnostics

23.13 Types of ANOVA Tests

23.14 Get ANOVA Table

23.15 Types of Means

23.16 Estimated Marginal Means

23.17 Comparing All Means

23.18 Comparing Means to Control

23.19 Methods of Comparison

23.20 Adjustment Types

23.21 Compact Letter Display (CLD)

23.22 Plotting All Comparisons

23.23 EM Means Interaction Plot (MIP)

23.24 Table of ANOVA / ANCOVA Formulas

23.25 Practice Time

24. HIGH QUALITY OUTPUT

24.1 Preparing the Workspace

24.2 Output Formatting Options

24.3 The kable Function

24.4 Create Models to Display

24.5 xtable: One Model

24.6 texreg: One Model

24.7 texreg: Two Models

24.8 texreg Reference

24.9 apaTables

24.10 Practice Time

25. DEBUGGING CODE

25.1 Preparing the Workspace

25.2 Debugging Steps

25.3 Objects Not Loaded

25.4 Formulas and Data

25.5 Misspelled Object Names

25.6 Misspelled Package Names

25.7 Package Not Installed

25.8 Missing Quotes for Character Variables

25.9 Forgetting to Reference a Data Frame

25.10 Simple Functions Need Vectors

25.11 Messages About Arguments

25.12 Messages About Arguments II

25.13 Too Many Functions

25.14 Too Many Results for map_dbl

25.15 Missing Pipe Operator

25.16 Missing Quotes in Subscripts I

25.17 Quotes for File Names

25.18 Quotes for Package Names

25.19 Missing Commas, Parentheses, etc.

25.20 Missing Functions

25.21 “Error in Select”

25.22 Use Two Colons in Function Calls

25.23 Use One Colon When Detaching

26. GRAPHICAL USER INTERFACES TO R

26.1 The R Commander

26.2 BlueSky Statistics

26.3 jamovi

27. CONCLUSION

27.1 Brief Review

27.2 Providing Feedback

27.3 Future Support

27.4 Question Time
