PREFACE: "The statistics" vs "Statistics"
Statisticians should fit the needs of the users, not the reverse! - J.W. Tukey

From the statistics to Statistics

There are statistics and statistics. "The statistics" - familiarly the "stats" - are the statistical data (averages, percentages, numbers of all sorts) that are ubiquitous in the media and found in every imaginable area: official statistics, surveys, etc. By "Statistics" - often written with a capital S (the "science of Statistics") - is meant the scientific discipline dealing with the methods for analyzing statistical data. My work has been concerned with "Statistics".
  Before we proceed, two massive facts that overhang any development must be mentioned. Firstly, there is the overwhelming dominance (from 1945 to the present day) of Anglo-Saxon statistics; see "A quasi-monopoly". Secondly, there is the phenomenon of hyperspecialization, which fragments a single topic (such as statistics) into isolated subspecialties.


Academic Statistics and Statistics for researchers

The statistical discipline is a "metadiscipline", whose raw material lies outside the discipline. By its very nature, it lies at the junction of two lines of thinking, namely mathematics and the empirical sciences. Among the founding fathers of statistics, the two lines were always present, whereas nowadays they are well separated, with academic statistics on the one hand and statistics for researchers on the other.
One finds academic statistics in the mathematics departments of universities and in the "theoretical" teaching of institutions such as (in France) INSEE (economic research) or INSERM (medical research). This discipline calls itself "mathematical statistics" and purports to be a deductive theory, like mathematical physics.
One finds statistics for researchers in laboratories and empirical studies, from the natural sciences to the social sciences. It is an essentially normative discipline, which aims at providing legitimate "scientific proof", controlled by the referees of scientific journals. Let us be clear: we do say "statistics for researchers", not "applied statistics", because even though the canons of academic statistics are recognized by statistics for researchers in principle, they are hardly applied in practice.
  My conviction is that, even while the two lines are kept separate, it is vital to maintain the unity of Statistics (on the unfortunate consequences of the current division, see "Medical statistics on the carpet"). What justifies Statistics is its role as an auxiliary science ("Hilfswissenschaft") to the empirical disciplines. Statistics for researchers should guide theoretical statistics. The ideal situation is one where the statistician takes part in a large-scale empirical research project, with scientists specifying the questions, and works out the statistical procedures that help answer these questions. The key ideas that give sense to my work have all emerged in interaction with research problems, and my contributions have tended toward constructing an autonomous statistics for researchers.



The foundations of statistics; the history of statistics

The statistical discipline is a recent one, highly dependent on computational tools. Not surprisingly, it has faced persistent identity problems. Originally a branch of probability theory, it was then, in the heyday of Operations Research, nearly absorbed into the "science of decisions". Nowadays, it tends rather to become a part of algorithmics (a field that is surely more creative).
Our key ideas do refer to the fundamentals of statistics. But speaking of the "foundations" of a discipline suggests a specialized area, on the margins of a discipline whose content is "well established." The status of the key ideas, in contrast, is to call for a restructuring of the traditional chapters of statistics.
The same goes for the history of statistics, to which I was introduced by G.Th. Guilbaud and B. Bru: cf. Rouanet & Bru (1994b). In the age of the Internet, browsing the "Electronic Journal of the history of probability and statistics" is a real pleasure for me. However, I must confess that epistemology is not my strong point. If history fascinates me, it is (to paraphrase Marc Ferro on history in general) "provided that its study provides an understanding of the problems of our time." Rather than scrutinize the forerunners of the presently dominant trends, I try to (re)discover neglected paths that the tools of our time can make practicable.
  It is clear that many theoretical constructions of the past were built in order to bypass the obstacle of computation: for example, the normal model. Other theories remained in sketch form: for example, classification procedures, or permutation modeling. Now that the computational obstacle is virtually removed, and that the era of statistical tables is (or should be) over, one can and should prefer, I believe, a direct approach to tackling the real issues that justify the use of statistical methods. In fact, what were the problems that Binet, or Durkheim, were attempting to solve? What if they had had computers at their disposal, with their colossal databases and their fabulous means of calculation?



Statistics in Human Sciences

My work has focused on statistics in the human sciences, mainly psychology and social science, in other words the behavioral sciences, bordered by bio-medical statistics on one side and econometrics on the other. As far as statistics is concerned, this constitutes a quite homogeneous field: there is "statistics for human sciences", not really "statistics for psychologists", "statistics for sociologists", and so on.
In my view, the role of statistics in a research paper should always conform to the following pattern:

Research problem --> Relevant data --> Statistical analysis --> Statistical results --> Research conclusions.

Relevant data must constitute a representative inventory of the area under study. This is the "completeness requirement" of Benzécri, close to the notion of "field" of Bourdieu. The statistical analysis should either bring an answer to the research questions, or else show that the available data are insufficient to answer them. Enforcing this scheme should facilitate the critical examination of a research report and enable one to pinpoint at which stage(s) errors may have been committed: 1) relevant data have been omitted; 2) the statistical analysis carried out is inadequate; 3) the conclusions drawn exceed those authorized by the statistical results (over-interpretation).

In academic statistics, "real-life data" are often invoked merely to illustrate techniques, while research problems are ignored. Blatant violations of the completeness requirement abound. Suffice it to mention an article by Goodman (1991), which purports to discuss seriously the comparative merits of methods on a simple 4x5 social mobility table, disconnected from any context. In his reply, D.R. Cox notes shrewdly: "A key question concerns how the models are to be adapted to address detailed substantive questions (etc.); for example, there may be further dimensions or concurrent comment on the individuals concerned."




Two crucial distinctions

Beyond the diversity of disciplines, two distinctions are essential:
  1) Between experimental data (the factors of interest are controlled) and observational data (the factors of interest are only observed).
  2) Between descriptive procedures (the findings relate to the data) and inductive ones, that is, statistical inference (the conclusions go beyond the data), with, in the background, the perennial problem of the role of probabilities in statistics.


Texts and publications

  The references to my texts and publications are given on the one hand in chronological order, and on the other hand by themes (domains). Some texts are mathematically oriented and may appeal to mathematicians interested in applications. Other texts are case studies, where the statistical approach is presented "in situ", and are directly readable by researchers (not necessarily versed in mathematics).


Organization of the heading "Statistical Work" (travaux statistiques)

. Key ideas: Formalization, geometric, descriptive-inductive, specific, probability.

. Domains: Stochastic models, Analysis of variance and structured data, Combinatorial inference, Bayesian inference, Geometric Data Analysis, Regression.

. Software, teaching, etc.

. Reading Notes.


Heading "Personalia"

. CV and Scientific trajectory.

Please note. The headings "Loisirs" and "Feuilles et Bons Mots" lie outside my work.

Hyperspecialisation.

A quasi-monopoly.

Medical statistics on the carpet
