Workshop: Managing Data with R

Before you can analyze data, it must be in the right form. Getting it into that form is often where we spend most of our time. This live-via-Internet 4-hour workshop shows how to perform the most commonly used data management tasks in R. We will cover how to use R’s popular add-on packages and compare them to R’s older built-in functions.

Most of our time will be spent working through examples that you may run simultaneously on your computer. You will see both the instructor’s screen and yours, side-by-side, as we run the examples and discuss the output. However, the handouts include each step and its output, so feel free to skip the computing; it’s easy to just relax and take notes. The slides and programming steps are numbered so you can easily switch from computing to slides and back again.

Many of the examples come from the extensive data management examples in the books R for SAS and SPSS Users and R for Stata Users. That makes it easy to review later what we did, with full explanations, or to learn more about a particular subject by extending an example you have already seen.

The workshop is presented live-via-Internet on Friday afternoons in one 4-hour session. The session is divided into sections of around 20 minutes per topic and there are breaks every 75 minutes. There is ample time to ask questions verbally or by typing them into a Q&A window. At the end of the workshop, you will receive a set of practice exercises for you to do on your own time, as well as solutions to the problems.

The entire session is recorded and available for study for two weeks afterwards. The instructor is available to handle workshop-related questions both during the workshop and at any time in the future.

Prerequisites

Attendees should know basic R programming, including how to read data files and call functions.

Learning Outcomes

When finished, you will be able to prepare most data sets for analysis.

Presenter

Robert A. Muenchen is the author of R for SAS and SPSS Users and, with Joseph M. Hilbe, R for Stata Users. He is also the creator of r4stats.com, a popular web site devoted to analyzing trends in analytics software and helping people learn the R language. Bob is an ASA Accredited Professional Statistician™ with 30 years of experience and is currently the manager of OIT Research Computing Support (formerly the Statistical Consulting Center) at the University of Tennessee. He has taught workshops on research computing topics for more than 500 organizations. Bob has written or coauthored over 70 articles published in scientific journals and conference proceedings, and has provided guidance on more than 1,000 graduate theses and dissertations.

Bob has served on the advisory boards of SAS Institute, SPSS Inc., StatAce OOD, the Statistical Graphics Corporation and PC Week Magazine. His suggested improvements have been incorporated into SAS, SPSS, StatAce, JMP, STATGRAPHICS and several R packages. His research interests include statistical computing, data graphics and visualization, text analytics, and data mining.

Computer Requirements

The workshop is delivered to your computer using Cisco WebEx. You can join a test meeting and see computer system requirements here. It’s important to test your computer since many organizations require special privileges to modify your computer to accept the browser plug-in that makes it work.

Course programs, data, and exercises will be sent to you a week before the workshop. The instructions include installing R, which you can download for free here: http://www.r-project.org/

We will also use RStudio, which you can download for free here: http://RStudio.com. If you already know a different R editor, that’s fine too.

Course Outline 

  1. Transformation basics
  2. Conditional transformations
  3. Summarization of columns and rows
  4. Summarization by group
  5. Analysis by group
  6. Sorting data
  7. Selecting first or last observation per group
  8. Miscellaneous variable tools (rename, keep, drop)
  9. Stacking data frames
  10. Finding and removing duplicate observations
  11. Merging data frames
  12. Reshaping data frames
  13. Comparing variables and data frames
  14. Character string manipulations
  15. Date / time manipulations (not in shorter useR! presentation)
  16. Using SQL within R (not in shorter useR! presentation)
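
To give a flavor of the tasks in the outline, here is a minimal sketch that uses only R's built-in functions. It is not taken from the workshop materials: the data frames and the column names (id, group, score, label) are hypothetical and chosen only for illustration. It touches on removing duplicates, a conditional transformation, summarization by group, sorting, and merging.

```r
# Hypothetical toy data; the names id, group, and score are illustrative only.
scores <- data.frame(
  id    = c(1, 2, 3, 4, 4),
  group = c("a", "b", "a", "b", "b"),
  score = c(10, 20, 30, 40, 40),
  stringsAsFactors = FALSE
)

# Finding and removing duplicate observations (the last row repeats id 4).
scores <- scores[!duplicated(scores), ]

# Conditional transformation: flag scores of 25 or more as "high".
scores$level <- ifelse(scores$score >= 25, "high", "low")

# Summarization by group: mean score within each group.
group_means <- aggregate(score ~ group, data = scores, FUN = mean)

# Sorting data: highest score first.
sorted <- scores[order(-scores$score), ]

# Merging data frames on a shared key (the labels are made up).
labels <- data.frame(
  group = c("a", "b"),
  label = c("Control", "Treatment"),
  stringsAsFactors = FALSE
)
merged <- merge(scores, labels, by = "group")

print(group_means)
print(merged)
```

Add-on packages offer alternative ways to do each of these steps; the built-in versions are shown here only because they run without installing anything.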
