R has a wide variety of machine learning (ML) models. While the many ML functions solve similar problems by predicting various outcomes, they use a confusing array of different command styles, making them hard to learn. Fortunately, the caret package provides a standard approach to dozens of ML functions, greatly speeding learning and use. This full-day hands-on workshop starts with ML basics and takes you step-by-step through increasingly complex modeling styles.

Most of our time will be spent working through examples that you may run simultaneously on your computer. You will see both the instructor’s screen and yours, side-by-side, as we run the examples and discuss the output. However, the handouts include each step and its output, so feel free to skip the computing; it’s easy to just relax and take notes.

This workshop is available at your organization’s site, or via webinars.

The on-site version is the most engaging by far, generating much discussion and occasionally veering off briefly to cover topics specific to a particular organization. The instructor presents a topic for around twenty minutes. Then we switch to exercises, which are already open in another tabbed window. The exercises contain hints that show the general structure of the solution; you adapt those hints to get the final solution. The complete solutions are in a third tabbed window, so if you get stuck the answers are a click away. The typical schedule for training on site is located here.

A webinar version is also available. This approach saves travel expenses and is especially useful for organizations with branch offices. It’s offered as two half-day sessions, often with a day or two skipped in between to give participants a chance to do the exercises and catch up on other work. There is time for questions on the lecture topics (live) and the exercises (via email). However, webinar participants are typically much less engaged, and far less discussion takes place.

For further details or to arrange a webinar or site visit, contact the instructor, Bob Muenchen, at muenchen.bob@gmail.com.

**Prerequisites**

This workshop assumes a basic knowledge of R. Introductory knowledge of statistics is helpful, but not required.

**Learning Outcomes**

When finished, participants will be able to use the caret package in R to pre-process data, train and tune a variety of machine learning models, and compare the results to choose among them.

**Presenter**

Robert A. Muenchen is the author of *R for SAS and SPSS Users* and, with Joseph M. Hilbe, *R for Stata Users*. He is also the creator of r4stats.com, a popular website devoted to analyzing trends in data science software, reviewing such software, and helping people learn the R language. Of the over 750 R blogs on the Internet, Feedspot rates r4stats.com the eleventh most influential.

Bob is an ASA Accredited Professional Statistician™ with 35 years of experience and is currently the manager of OIT Research Computing Support (formerly the Statistical Consulting Center) at the University of Tennessee. He has taught workshops on research computing topics for more than 500 organizations and has presented workshops in partnership with the American Statistical Association, RStudio, DataCamp.com, New Horizons Computer Learning Centers, Revolution Analytics (acquired by Microsoft), and Xerox Learning Services. Bob has written or coauthored over 70 articles published in scientific journals and conference proceedings and has provided guidance on more than 1,000 graduate theses and dissertations.

Bob has served on the advisory boards of SAS Institute, SPSS Inc., StatAce OOD, the Statistical Graphics Corporation, and PC Week Magazine. His suggested improvements have been incorporated into SAS, SPSS, StatAce, JMP, jamovi, BlueSky Statistics, STATGRAPHICS, and numerous R packages. His research interests include statistical computing, data graphics and visualization, text analytics, and data mining.

**Computer Requirements**

On-site training is best done in a computer lab with a projector and, for large rooms, a PA system. The webinar version is delivered to your computer using Zoom (or a similar webinar system if your organization has a preference).

Course programs, data, and exercises will be sent to you a week before the workshop. The instructions cover installing R, which you can download for free here: http://www.r-project.org/. We will also use RStudio, which you can download for free here: http://RStudio.com. If you already know a different R editor, that’s fine too.

**Course Outline**

1. Introduction

2. Overview of Machine Learning

3. Intro to the caret Package

4. Data Pre-processing

5. Principal Components

6. Dummy Variables

7. Partitioning Data Sets

8. Feature Selection

9. Controlling Model Training

10. Classification and Regression Trees

11. Random Forests

12. Gradient Boosting via xgboost

13. Neural Networks

14. glmnet

15. ROC Curves

16. Model Tuning Grids

17. Choosing a Model

Here is a slideshow of previous workshops.