CRAN Task View: Statistics for the Social Sciences

Maintainer: John Fox
Contact: jfox at

Social scientists use a wide range of statistical methods, most of which are not unique to the social sciences. Indeed, most statistical data analysis in the social sciences is covered by the facilities in the base and recommended packages, which are part of the standard R distribution. In the package descriptions below, I identify base and recommended packages on first mention; packages that are not specifically identified as "R-base" or "recommended" are contributed packages.

Other Relevant Task Views:

Beyond the base and contributed packages, many of the methods commonly employed in the social sciences are covered extensively in other CRAN task views, including the following. I will try to minimize duplicating information present in these other task views, given here in alphabetical order.

It is noteworthy that this enumeration includes about a third of the CRAN task views. Moreover, there are other task views of potential interest to social scientists (such as the Graphics task view on statistical graphics); I suggest that you look at the list of all task views on CRAN.

Linear and Generalized Linear Models:

Univariate and multivariate linear models are fit by the lm function, generalized linear models by the glm function, both in the R-base stats package. Beyond summary and plot methods for lm and glm objects, there is a wide array of functions that support these objects.
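
As a minimal sketch of the calls described above, the following fits a univariate linear model, a multivariate linear model, and a generalized linear model, and then applies a few of the supporting functions. The mtcars data set and the particular formulas are assumptions chosen only for illustration; they are not part of this task view.

```r
# Illustrative data: the built-in mtcars data set (an assumption, not from the task view).
data(mtcars)

# Univariate linear model: a single response variable.
fit_lm <- lm(mpg ~ wt + hp, data = mtcars)

# Multivariate linear model: cbind() on the left-hand side yields an "mlm" object.
fit_mlm <- lm(cbind(mpg, qsec) ~ wt + hp, data = mtcars)

# Generalized linear model: binary response with the default logit link.
fit_glm <- glm(am ~ wt, family = binomial, data = mtcars)

# A few of the functions that support lm and glm objects.
summary(fit_lm)                    # coefficient table and fit statistics
coef(fit_mlm)                      # one column of coefficients per response
anova(fit_glm, test = "Chisq")     # analysis-of-deviance table
predict(fit_glm, newdata = data.frame(wt = 3), type = "response")
```

Other generic functions such as residuals, fitted, and update work on these objects in the same way.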

Analysis of Categorical and Count Data:

Binomial logit and probit models, as well as Poisson-regression and loglinear models for contingency tables (including models for "over-dispersed" binomial and Poisson data), can be fit with the glm function in the stats package. For over-dispersed data, see also the aod package, the dispmod package, and the glm.nb function in the recommended MASS package (associated with Venables and Ripley, Modern Applied Statistics with S, Fourth Ed., Springer, 2002), which fits negative-binomial GLMs. The pscl package includes functions for fitting zero-inflated and hurdle regression models to count data. The multinomial logit model is fit by the multinom function in the recommended nnet package, and ordered logit and probit models by the polr function in the MASS package. Also see the mlogit package for the multinomial logit model, the MNP package for the multinomial probit model, and the multinomRob package for the analysis of over-dispersed multinomial data. The VGAM package is capable of fitting a very wide variety of fixed-effect regression models within a unified framework, including models for ordered and unordered categorical responses and for count data.
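
The base and recommended functions named above can be sketched as follows. The data sets (mtcars and warpbreaks from the datasets package, housing from MASS) and the formulas are assumptions chosen for illustration only; the contributed packages mentioned in this section are not used here.

```r
library(MASS)   # recommended package: glm.nb(), polr(), housing data
library(nnet)   # recommended package: multinom()

# Binomial logit and probit models via glm() (illustrative data: mtcars).
fit_logit  <- glm(am ~ wt, family = binomial(link = "logit"),  data = mtcars)
fit_probit <- glm(am ~ wt, family = binomial(link = "probit"), data = mtcars)

# Poisson regression for counts, and a quasi-Poisson fit that estimates
# a dispersion parameter to allow for over-dispersion (illustrative data: warpbreaks).
fit_pois  <- glm(breaks ~ wool + tension, family = poisson,      data = warpbreaks)
fit_quasi <- glm(breaks ~ wool + tension, family = quasipoisson, data = warpbreaks)

# Negative-binomial GLM as an alternative for over-dispersed counts.
fit_nb <- glm.nb(breaks ~ wool + tension, data = warpbreaks)

# Ordered logit model; Sat is an ordered factor and Freq holds cell counts.
fit_polr <- polr(Sat ~ Infl + Type + Cont, weights = Freq,
                 data = housing, Hess = TRUE)

# Multinomial logit model on the same data, ignoring the ordering of Sat.
fit_mnom <- multinom(Sat ~ Infl + Type + Cont, weights = Freq,
                     data = housing, trace = FALSE)

summary(fit_quasi)$dispersion   # estimated dispersion; values above 1 suggest over-dispersion
fit_nb$theta                    # estimated negative-binomial shape parameter
```

The Poisson and quasi-Poisson fits share the same coefficient estimates; only the standard errors differ, because the quasi-Poisson fit scales them by the estimated dispersion.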

There are other noteworthy facilities for analyzing categorical and count data.

Other Regression Models:

It is possible to fit a very wide variety of regression models with the facilities provided by the base and recommended packages, and an even wider variety of models with contributed packages, in addition to those covered extensively in other task views.

Other Statistical Methods:

Here is a brief survey of implementations in R of other statistical methods commonly used by social scientists.

Collections of Functions:

There are some packages that are so heterogeneous that they are difficult to classify, yet contain functions (typically in multiple domains) that are of interest to social scientists:


Jangman Hong contributed to the general revision of this task view, as did other individuals who made a variety of specific suggestions.

If I have omitted something of importance not covered in one of the other task views cited, or if a new package or function should be mentioned here, please let me know.

Compilation of this task view was partly supported by grants from the Social Sciences and Humanities Research Council of Canada.

CRAN packages:

Related links: