The Statistical Package for the Social Sciences (SPSS) is an analytical tool widely used in the social sciences, including educational psychology, and it is also popular in other fields such as the health sciences and marketing (Field, 2013). It is well suited to non-statisticians because of its easy-to-navigate graphical user interface: most basic data analyses can be completed through menus and dialog boxes without learning the SPSS syntax language (Green & Salkind, 2010).
The Statistical Analysis System (SAS) combines a powerful suite of procedures with a well-known, well-documented, and flexible programming language used to control them (Dimaggio, 2013). Since its mass distribution in the 1970s, SAS has become the statistics industry standard and has accumulated a large body of high-quality production code for many purposes. For example, many large companies, as well as federal and local government public health agencies, use the program (Dimaggio, 2013). It has strong data handling capabilities and automatically produces diagnostics and easy-to-interpret plots and output.
In the field of advanced and predictive analytics, SAS holds a 35.4% market share with total revenue of $768.3 million, and has been used at more than 70,000 sites in 135 countries (IDC, 2015). Although SPSS ranks second in the advanced analytics industry, its market share of 17.1% is only about half that of SAS (IDC, 2015).
Data analysis tools operate in one of two modes: programming operations and windows operations (Minelli, Chambers, & Dhiraj, 2012). SPSS executes users’ commands through interactive windows, with most features accessible via pull-down menus (Arbuckle, 2010). This is particularly useful because the menus and dialog boxes offer visual reminders of most of the options available at each step of the analysis. SPSS also provides a programming platform based on a proprietary fourth-generation (4GL) command syntax language. Some complex applications can only be programmed in syntax and are not accessible through the menus, while others are simply carried out more quickly by typing a few keywords (Levesque, 2005).
In comparison, although SAS offers both a graphical point-and-click interface for non-technical users and more advanced options through the SAS language, it is significantly more programming-oriented than SPSS (Salkind, 2010). A SAS program consists of DATA steps, which retrieve and manipulate data and usually create a SAS data set, and PROC steps, which analyze the data. Each step uses a series of statements to carry out its functions (Delwiche & Slaughter, 2012). Compared with other statistical packages, SAS requires relatively little coding knowledge, provided that the only analyses being run are those for which it already has PROCs. Beyond these, coding can be extremely difficult, and any error in the code will prevent the desired command from executing (Delwiche & Slaughter, 2012).
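The two-phase structure described above can be sketched in a few lines. The example below is written in Python rather than SAS, so it is only an analogy under stated assumptions: the function names `data_step` and `proc_step` and the sample records are hypothetical and are not part of any SAS interface. It simply separates a phase that builds a data set from a phase that analyzes it.

```python
from statistics import mean, stdev

# Hypothetical raw records (student, pretest score, posttest score), invented for illustration.
RAW_ROWS = [
    ("Ann", 62, 75),
    ("Ben", 70, 74),
    ("Cara", 55, 68),
]


def data_step(rows):
    """Analogue of a DATA step: read raw records and derive a new variable."""
    dataset = []
    for student, pretest, posttest in rows:
        dataset.append({
            "student": student,
            "pretest": pretest,
            "posttest": posttest,
            "gain": posttest - pretest,  # derived variable
        })
    return dataset


def proc_step(dataset, variable):
    """Analogue of a PROC step: summarize one variable of a prepared data set."""
    values = [row[variable] for row in dataset]
    return {"n": len(values), "mean": mean(values), "std": stdev(values)}


scores = data_step(RAW_ROWS)        # phase 1: build the data set
summary = proc_step(scores, "gain")  # phase 2: analyze it
print(summary)
```

Keeping preparation and analysis in separate steps means the same prepared data set can be passed to several analyses without being rebuilt, which is the practical benefit of the division the paragraph describes.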
Charts & Graphics
SPSS’s Chart Builder is designed to produce various types of graphs. SPSS cannot create a chart from the spreadsheet data automatically; users must open the chart builder from the pull-down menu and specify the chart style (e.g., horizontal or vertical bar), the variables, and the x-axis and y-axis. SPSS provides standard statistical graphs but little else, and the charts it produces are usually not visually appealing (Field, 2013).
In comparison, the graphics procedures in SAS are much more complex (Wicklin, 2013). Graphics are created largely through syntax: users write a chart procedure followed by a series of statements specifying the group variables, subgroup variables, visual style, and all other elements shown in the chart (Wicklin, 2013). Although SAS graphics procedures are significantly more complicated than those of SPSS, they have their own advantages: by weaving SAS macros into the program code, users can regenerate graphs on a routine basis, which gives them more flexibility and significantly reduces their workload (Zhu, Zeng, & Wang, 2010).
Flexibility in Model Building
The SPSS interface makes it extremely convenient to enter data and manipulate rows and columns. The 4GL command syntax language makes the software more flexible and allows users to customize their models through programming. However, SPSS syntax is weak and has a complex command structure, and the menu-style interface is frequently reported as a major obstacle (Levesque, 2005). For advanced statistical analysis, SAS, although more difficult to use, can provide more control than SPSS through its command-line interface and advanced editor (O’Rourke & Hatcher, 2013). Additionally, its output is well organized, and there is a clear record of how an analysis was accomplished.
Information Management Capacity
Information management has three levels: descriptive analytics (i.e., quantitatively describing the characteristics of a data sample), predictive analytics (i.e., using known information to predict future outcomes), and prescriptive analytics (i.e., identifying the correct decisions based on these predictions, and the effects of those decisions; De Bakker et al., 2005). SPSS can be used to build predictive models and to carry out other analytics tasks. Its visual interface allows users to apply statistical and data mining algorithms without programming. However, the menu-driven design of SPSS is a major obstacle in handling more complex predictive analysis, and it becomes significantly more cumbersome for prescriptive analysis (Van Barneveld, Arnold, & Campbell, 2012). SAS, on the other hand, is considerably stronger in information management. For example, SAS has a specific module for predictive and prescriptive analytics that uses sophisticated and powerful data analysis techniques to extract significant information from large databases (Power, 2014).
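The difference between the three levels can be made concrete with a toy example. The sketch below is in Python rather than SPSS or SAS, and everything in it is an assumption made for illustration: the data (weekly study hours and exam scores) are invented, the prediction uses an ordinary least-squares line, and the helper names `predict` and `recommend` are hypothetical. It is not how either package implements these analyses.

```python
# Hypothetical data, invented for illustration: weekly study hours and exam scores.
hours = [2, 4, 6, 8, 10]
scores = [55, 61, 70, 74, 85]


def mean(values):
    return sum(values) / len(values)


# Descriptive analytics: summarize the sample as it is.
average_score = mean(scores)

# Predictive analytics: fit a least-squares line, score = intercept + slope * hours.
mean_h, mean_s = mean(hours), mean(scores)
slope = (sum((h - mean_h) * (s - mean_s) for h, s in zip(hours, scores))
         / sum((h - mean_h) ** 2 for h in hours))
intercept = mean_s - slope * mean_h


def predict(h):
    """Predicted exam score for h weekly study hours."""
    return intercept + slope * h


# Prescriptive analytics: among the available options, recommend the smallest
# number of hours whose predicted score reaches the target (None if no option does).
def recommend(options, target):
    feasible = [h for h in options if predict(h) >= target]
    return min(feasible) if feasible else None


print(average_score, round(predict(7), 2), recommend([5, 7, 9, 12], target=75))
```

Each level builds on the previous one: the description summarizes the sample, the prediction extends it to unseen values, and the prescription turns predictions into a choice among options.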
To conclude, each statistical package has its own strengths and weaknesses. SPSS is easy to learn and use. It includes a full range of data management and editing tools, provides in-depth statistical capabilities, and offers complete plotting, reporting, and presentation features. SAS can manage, alter, mine, and retrieve data from a variety of sources and perform complex statistical analyses on them. It provides both a graphical point-and-click user interface for non-technical users and more advanced options through the SAS language. Although it is not as easy to use or as quick to learn as SPSS, SAS offers considerably greater possibilities with regard to analysis. For those who use it frequently, the commitment required to learn SAS is worthwhile.
References
Arbuckle, J. L. (2010). IBM SPSS Amos 19 user’s guide. Crawfordville, FL: Amos Development Corporation, 635.
De Bakker, F. G., Groenewegen, P., & Den Hond, F. (2005). A bibliometric analysis of 30 years of research and theory on corporate social responsibility and corporate social performance. Business & Society, 44(3), 283-317.
Delwiche, L. D., & Slaughter, S. J. (2012). The Little SAS Book: A Primer: A Programming Approach. SAS Institute. p. 6. ISBN 978-1-61290-400-9.
Dimaggio, C. (2013). Introduction. In SAS for Epidemiologists (pp. 1-5). Springer New York.
Field, A. (2013). Discovering statistics using IBM SPSS statistics. Sage.
Green, S. B., & Salkind, N. J. (2010). Using SPSS for Windows and Macintosh: Analyzing and understanding data. Prentice Hall Press.
International Data Corporation. (2015). Worldwide Business Analytics Software Market Shares, 2015: Healthy Demand Despite Currency Exchange Rate Headwinds, IDC says [Press release]. https://www.sas.com/content/dam/SAS/en_us/doc/analystreport/idcbusiness-analytics-software-market-shares-108014.pdf. Accessed 24 April 2017.
Levesque, R. (2005). SPSS® Programming and Data Management: A Guide for SPSS® and SAS® Users (2nd ed.). SPSS Inc.
Minelli, M., Chambers, M., & Dhiraj, A. (2012). Big data, big analytics: Emerging business intelligence and analytic trends for today’s businesses. John Wiley & Sons.
O’Rourke, N., & Hatcher, L. (2013). A step-by-step approach to using SAS for factor analysis and structural equation modeling. SAS Institute.
Power, D. J. (2014). Using ‘Big Data’ for analytics and decision support. Journal of Decision Systems, 23(2), 222-228.
Salkind, N. J. (Ed.). (2010). Encyclopedia of research design (Vol. 1). Sage.
Van Barneveld, A., Arnold, K. E., & Campbell, J. P. (2012). Analytics in higher education: Establishing a common language. EDUCAUSE Learning Initiative, 1(1), 1-11.
Wicklin, R. (2013). Simulating data with SAS. SAS Institute.
Zhu, W., Zeng, N., & Wang, N. (2010). Sensitivity, specificity, accuracy, associated confidence interval and ROC analysis with practical SAS implementations. NESUG Proceedings: Health Care and Life Sciences, Baltimore, Maryland, 1-9.