“It is rumored that in the outside world there is a war
and a shortage of Coca-Cola”

Quoted by Wallis (1980).

Arguably, one of the most glorious moments in the history of statistics (and econometrics) is the creation of the Statistical Research Group (SRG) in the spring of 1942. The U.S. entered World War II in December 1941, following the Japanese attack on Pearl Harbor (December 7, 1941) and the subsequent declarations of war by Italy and Germany (December 11, 1941). On April 1, 1942, Stanford statistician Allen Wallis¹ wrote to his friend W. Edwards Deming of the Census Bureau, offering to “[…] work out a curriculum adapted to the immediate statistical requirements of war.”

Deming’s reply on April 24 outlined a plan: statistical efforts should prioritize materials and time, given the wartime scarcity of both. This correspondence set in motion a research program that not only contributed to the war effort but also laid foundations for post-war economic advances and statistical practice.

Sequential Analysis

The history of the SRG features some of the most renowned statisticians, including Harold Hotelling, Jacob Wolfowitz, Milton Friedman, Frederick Mosteller, Abraham Wald, and Wallis himself. The SRG’s contributions were diverse, evolving as “…one problem led to another, sometimes applying a technique to an unrelated issue.” However, the SRG is best remembered for its work on sequential analysis².

The origins of sequential analysis trace back to the gambler’s ruin problem, but the idea re-emerged independently during Wallis’s work on estimating the number of trials required for ordnance experiments. The sample sizes that theory called for were impractical for the U.S. Navy, given resource constraints and the risk of misfires. Wallis and Milton Friedman proposed a revolutionary idea: sequential tests, which allow an experiment to terminate early once the interim results are conclusive. This innovation cut the number of trials needed on average, and with it the demand on resources.

Friedman and Wallis tasked Abraham Wald with developing the concept further. Initially reluctant due to concerns about test power, Wald quickly reversed his stance, producing groundbreaking work summarized in “Sequential Analysis of Statistical Data: Applications”³.
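To make the early-stopping idea concrete, here is a minimal sketch of Wald's sequential probability ratio test (SPRT) for pass/fail trials. It is an illustration under stated assumptions, not a reconstruction of the SRG's ordnance procedures: the function name `sprt_bernoulli`, the hypothesized rates `p0` and `p1`, and the error targets `alpha` and `beta` are all chosen for this example, and the stopping thresholds use Wald's standard approximations.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: p = p0 versus H1: p = p1 on 0/1 outcomes.

    Returns (decision, n_used), where decision is "accept H0",
    "accept H1", or "undecided" if the data run out first.
    All parameter names are illustrative assumptions.
    """
    if not (0 < p0 < 1 and 0 < p1 < 1) or p0 == p1:
        raise ValueError("p0 and p1 must be distinct and strictly between 0 and 1")

    # Wald's approximate thresholds on the log-likelihood ratio.
    upper = math.log((1 - beta) / alpha)   # crossing above: accept H1
    lower = math.log(beta / (1 - alpha))   # crossing below: accept H0

    llr = 0.0  # cumulative log-likelihood ratio of H1 against H0
    n = 0
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        # Check the stopping rule after every single observation.
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "undecided", n


# Five failures in a row when testing a 10% against a 30% failure rate.
print(sprt_bernoulli([1, 1, 1, 1, 1], p0=0.1, p1=0.3))
```

With `p0=0.1` and `p1=0.3`, three failures in a row already push the log-likelihood ratio past the upper threshold, so the test stops after three trials rather than consuming a sample size fixed in advance.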

Wald, Rubin, and Manski

One of the SRG’s iconic projects involved estimating warplane vulnerability using data from aircraft returning from combat. Sequential analysis was crucial here, as one key parameter was the probability of a plane going down after sustaining damage⁴. Though complex, Wald’s insights were later reformulated by Mangel and Samaniego (1984) for improved readability⁵.

Wald’s memoranda were extraordinary for two reasons. First, they highlighted selection bias, a concept later formalized in Donald Rubin’s Potential Outcomes Framework. Second, they anticipated Charles Manski’s work on Partial Identification (e.g., Manski, 2008⁶). Wald emphasized the limits of point estimates, advocating for identification regions to bound elusive parameters and stressing the role of assumptions in guiding statistical conclusions.

The central insight of Wald’s model was that data from returning planes revealed little about vulnerabilities. Planes that didn’t return were likely hit in critical areas, while returning planes survived despite damage. This is selection bias at its purest.
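A small simulation can illustrate the mechanism. This is only a sketch with invented numbers: the three areas, the loss probabilities in `P_DOWN`, and the assumption that every plane takes exactly one hit, spread uniformly across areas, are hypothetical and are not taken from Wald's memoranda.

```python
import random


def simulate_missions(n_planes, p_down_if_hit, seed=0):
    """Count hits per area for all planes and for returning planes only.

    p_down_if_hit maps an area name to the (hypothetical) probability
    that a plane hit there is lost.
    """
    rng = random.Random(seed)
    areas = list(p_down_if_hit)
    hits_all = {area: 0 for area in areas}
    hits_returned = {area: 0 for area in areas}
    for _ in range(n_planes):
        area = rng.choice(areas)  # assumption: one hit, uniform across areas
        hits_all[area] += 1
        # The plane returns with probability 1 - p_down_if_hit[area].
        if rng.random() >= p_down_if_hit[area]:
            hits_returned[area] += 1
    return hits_all, hits_returned


# Hypothetical loss probabilities, chosen only for illustration.
P_DOWN = {"engine": 0.6, "fuselage": 0.1, "tail": 0.2}
N_PLANES = 100_000

hits_all, hits_returned = simulate_missions(N_PLANES, P_DOWN)
n_returned = sum(hits_returned.values())

for area in P_DOWN:
    share_all = hits_all[area] / N_PLANES
    share_returned = hits_returned[area] / n_returned
    print(f"{area:9s} share of hits, all planes: {share_all:.3f}   "
          f"returning planes: {share_returned:.3f}")
```

Under these assumptions every area is hit equally often, yet engine hits are under-represented among returning planes precisely because engine hits are the most lethal. An analyst who sees only the returning planes would conclude the opposite.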

Applying Rubin’s Potential Outcomes Framework

To formalize this problem, let \(Y\) denote the survival outcome and let the treatment \(D\) indicate whether a plane was hit in a specific area (e.g., the tail). The potential outcomes are \(Y(d)\) for \(d \in \{0, 1\}\). Additionally, let \(Z\) indicate whether a plane returned (\(Z=1\)) or not (\(Z=0\)).

The quantity of interest is the Average Treatment Effect (ATE):

\[ ATE = E[Y(1) - Y(0)]. \]

However, \(ATE\) is not identified because each plane reveals either \(Y(1)\) or \(Y(0)\), never both. This is the essence of the identification problem. (Detailed derivations follow in the body of the text.)
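One way to see where identification fails, assuming the usual consistency (switching) equation, which is not stated explicitly above, is to write the observed outcome as

\[ Y = D \, Y(1) + (1 - D) \, Y(0), \]

and to split the mean of one potential outcome by treatment status using the law of total expectation:

\[ E[Y(1)] = E[Y \mid D = 1] \, P(D = 1) + E[Y(1) \mid D = 0] \, P(D = 0). \]

The first term involves only observable quantities, while \(E[Y(1) \mid D = 0]\) is counterfactual: it asks how planes that were not hit in the area would have fared had they been hit, and no data can reveal it. The same decomposition applies to \(E[Y(0)]\). In the warplane setting there is a further layer, since damage is recorded only for planes with \(Z = 1\).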

Bounding the ATE with Manski’s Methods

Using Manski’s partial identification methodology, we derive bounds for the ATE. These bounds depend on observable quantities like the proportion of returning planes (\(P(Z=1)\)) and assumptions about unobservable survival rates. The results show that selection bias inflates the apparent importance of damage patterns in returning planes.
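The worst-case logic can be sketched on a simpler quantity than the ATE: the probability that a plane returns given that it was hit in a given area, \(P(Z=1 \mid D=1)\). The sketch below assumes that only two numbers are observed, the share of returning planes hit in the area, \(P(D=1 \mid Z=1)\), and the return rate, \(P(Z=1)\), and it assumes nothing at all about the lost planes. The function and argument names are hypothetical, and this is an illustration of the bounding idea rather than the derivation in the body of the text.

```python
def worst_case_bounds(hit_share_returned, p_returned):
    """Worst-case bounds when damage is recorded only for returning planes.

    hit_share_returned: P(D=1 | Z=1), observed among returning planes.
    p_returned:         P(Z=1), the observed return rate.

    Returns ((hit_lo, hit_hi), (surv_lo, surv_hi)), bounding P(D=1)
    and P(Z=1 | D=1) respectively.
    """
    if not (0 < hit_share_returned <= 1 and 0 < p_returned <= 1):
        raise ValueError("both inputs must lie in (0, 1]")

    # P(D=1, Z=1) is fully determined by the observed data.
    observed_part = hit_share_returned * p_returned

    # P(D=1) = P(D=1, Z=1) + P(D=1 | Z=0) * P(Z=0), where the unobserved
    # P(D=1 | Z=0) can be anywhere between 0 and 1.
    hit_lo = observed_part
    hit_hi = observed_part + (1 - p_returned)

    # Bayes' rule: P(Z=1 | D=1) = P(D=1, Z=1) / P(D=1).
    # The largest P(D=1) gives the lowest survival rate, and vice versa.
    surv_lo = observed_part / hit_hi
    surv_hi = observed_part / hit_lo  # equals 1: no lost plane was hit there

    return (hit_lo, hit_hi), (surv_lo, surv_hi)


# Hypothetical inputs: 20% of returning planes were hit in the area,
# and 80% of all planes returned.
hit_bounds, survival_bounds = worst_case_bounds(0.2, 0.8)
print("P(D=1) lies in", hit_bounds)
print("P(Z=1 | D=1) lies in", survival_bounds)
```

The width of both intervals is driven by the share of lost planes, \(1 - P(Z=1)\). If every plane returns the bounds collapse to a point, and any narrowing beyond the worst case has to come from assumptions about the planes that did not return.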

Through Wald’s work, we see how statistics illuminated critical wartime problems. His legacy extends far beyond the SRG, influencing how statisticians approach incomplete data and uncertainty. Indeed, Wald’s insights remain a testament to the enduring power of applied statistical thinking.

¹ Wallis is perhaps best known today for the Kruskal-Wallis test, a nonparametric method for comparing distributions.
² For an entertaining historical account, see Wallis, W. A. (1980). The Statistical Research Group, 1942-1945. Journal of the American Statistical Association, 75(370), 320-330.
³ Sequential Analysis of Statistical Data: Applications. (Prepared by the Statistical Research Group, Columbia University, for Applied Mathematics Panel, NDRC.) New York: Columbia University Press, 1945.
⁴ Wald’s work, “Methods of Estimating Plane Vulnerability Based on Damage of Survivors,” was initially classified. A reprint was later issued by the Center for Naval Analyses.
⁵ Mangel, M., & Samaniego, F. J. (1984). Abraham Wald’s Work on Aircraft Survivability. Journal of the American Statistical Association, 79(386), 259-267. DOI: 10.1080/01621459.1984.10478038.
⁶ Manski, C. F. (2008). Identification for Prediction and Decision. Harvard University Press.

Gone with the Planes: Selection Bias, Sequential Analysis, and Partial Identification in WWII
