That’s a good example of a formula that an author may use in the text because it’s easy for readers to understand. But I would hope that a professor who knew enough to develop a new statistical method and program it into an R package would also know that the formula is a terrible choice for programming. I assumed that SAS or SPSS programmers would study the code in the R package rather than blindly follow formulas that are likely to lead to trouble. The most recent reference of which I am aware for the accuracy of R, SAS, etc. is this article, which builds on your earlier work: A comparative study of the reliability of nine statistical software packages, Keeling & Pavur, Computational Statistics & Data Analysis, Volume 51, Issue 8, 1 May 2007, Pages 3811–3831.

Best regards,

Bob

SAS programmers (I know this for a fact), and I suspect SPSS programmers as well, do NOT program from books and articles. For example, books typically present the least squares estimator as b = (X’X)^-1 X’y, but it should not be programmed this way. For further details see McCullough and Vinod, “The Numerical Reliability of Econometric Software,” Journal of Economic Literature, 1999.
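
To make the point about the textbook formula concrete, here is a minimal Python sketch, not anyone's production code. It fits a simple regression to made-up data in two ways: by solving the normal equations directly, and by centering the data first. The function names and the data are hypothetical, and centering is only a simple stand-in for the orthogonal decompositions (such as QR) that statistical software typically relies on. With x values near 1e8 and a true slope of 3, the textbook route should lose the slope to rounding in double precision, while the centered route recovers it.

```python
# Illustrative sketch only: simple regression y = a + b*x on made-up data.
# Function names and data are hypothetical. Centering stands in for the
# orthogonal decompositions (e.g. QR) used by production software.

def fit_normal_equations(x, y):
    """Textbook route: solve the 2x2 system (X'X) b = X'y directly."""
    n = len(x)
    sx = sy = sxx = sxy = 0.0
    for xi, yi in zip(x, y):
        sx += xi
        sy += yi
        sxx += xi * xi  # entries of X'X grow like x**2 and drop low-order digits
        sxy += xi * yi
    det = n * sxx - sx * sx  # difference of two huge, nearly equal numbers
    slope = (n * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / n
    return intercept, slope


def fit_centered(x, y):
    """More stable route: subtract the means before forming any products."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


# Made-up data: ten x values with a large offset, exact line with slope 3.
x = [1e8 + i for i in range(10)]
y = [2.0 + 3.0 * i for i in range(10)]

naive = fit_normal_equations(x, y)
stable = fit_centered(x, y)

print("normal equations slope:", naive[1])
print("centered slope:        ", stable[1])
```

Both functions implement the same estimator on paper; only the order of the floating-point operations differs, which is why a formula that reads well in a book can still be a poor guide for programming.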

Peer review is no assurance of accuracy or correctness, and peer review does not end debate. That an article is peer-reviewed only means that someone (the referees) thinks the debate can begin.

Regards,

Bruce

You raise a critically important point. The main R download is thoroughly validated as described here: http://www.r-project.org/doc/R-SDLC.pdf. That document lists the packages, and R’s help() function will tell you which package any particular function is in. The validation process covers just over 8,000 functions that are roughly equivalent to:

Base SAS, GRAPH, STAT, ETS, IML, and some of Enterprise Miner. From that set it’s missing Structural Equation Modeling, Multiple Imputation, and its various graphical user interfaces such as Enterprise Guide and SAS Studio.

For SPSS users, it’s roughly the equivalent to IBM SPSS Base, Statistics, Advanced Statistics, Regression, Forecasting, Decision Trees, Neural Networks and Bootstrapping. Missing from that set is the SPSS graphical user interface.

So those commands you can count on for accuracy. Many other packages are based on books or journal articles that have passed the peer review process. When that’s the case, the functions are likely reliable. In fact, SAS and SPSS programmers probably followed those same books and journal articles aiming to get the same answers. However, many more come from sources of unknown accuracy and I recommend investigating them carefully before using them, just as you would a SAS or SPSS macro that you found on someone’s web site.

Cheers,

Bob

That’s an excellent point. Certainly no single person could ever master the vast array of functions that R offers, nor would one person ever need to. The same is true of SAS. SAS offers enough capability to meet the needs of the vast majority of researchers. In addition, when SAS Institute bothers to include a method of analysis, you know it has gone through a vetting process that indicates that it is a method that is important to a wide audience. New R functions come out at such an amazing pace that it’s hard to know which ones to learn and which are unlikely to become widely used. But if you happen to need one of those niche functions, R is the tool that’s much more likely to have it.

Cheers,

Bob

I agree that SAS is wonderfully versatile and powerful software. I especially like the latest release of SAS Studio. I wish RStudio offered as many features! SAS Institute’s own data show that overall SAS revenue is growing. However, in the main Popularity article I provide data that show usage of R is growing rapidly *in academia* (Figure 2e) while the academic use of SAS is declining. I make it clear how I collect all my data to enable people to disagree using facts rather than opinion. Where is your data?

I hope you enjoy the holiday weekend too; three days!

Cheers,

Bob

The functional comparison is misleading. Many R functions are redundant; where SAS has a single function for mixed linear models, R has many. There are virtues in that — if you are really into the details of mixed linear models, the differences may be important — but for many users it’s just noise.

Regards,

Thomas

SAS is far more versatile, powerful and especially creative.

That is why it is on the increase in both the academic and commercial worlds, while R has plateaued or even decreased due to competition from other products such as Perl and Python.

Deny it, if you can!

Enjoy the Holiday weekend.

Mark Ezzo

Columbus Consulting Corporation

O: 610-666-1492

C: 267-261-5560
