Non-Euclidean statistics beyond linear regression

Editor's note:

This is a Brookings Center on Regulation and Markets working paper.


Academics, researchers, industry, and advocates who attempt to influence public policy use data analysis. In economics, linear regression is the workhorse for quantitative research. Linear regression requires a series of assumptions and restricts both the types of data that can be used and the relationships that can exist among them. What if there were other methods to analyze relationships between variables that did not require those assumptions or restrictions? That would open up a new set of data and potentially lead to new findings that could result in better policy.

This paper considers the application of one type of advanced statistical model that does not use linear regression or even assume a specific functional form for the relationships between variables. The paper details three examples to demonstrate the model's ability to detect relationships in data that do not initially come with either a numerical or an ordinal scale. The three examples (socioeconomic status, the movement of stock prices over time, and voting patterns within Congress) have no inherent relationship to each other. They are studied and analyzed together to show the ability of this model to detect and quantify interesting relationships.

Specifically, this AI model uses non-Euclidean statistical techniques to infer quantitative scales for variables and estimate metric distances among attributes. The model evaluates the fit by chi-square, a different statistical model for error than the normal distribution that is commonly used in linear regression. There is no linear regression at any point in the model. By avoiding the use of regression techniques and statistics, this model has the potential to uncover new insights. It also opens up a broader question of where future analysis can migrate as mathematical and computational abilities create opportunities for more sophisticated analytical tools than regression analysis.
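
To give a sense of what evaluating fit by chi-square means in practice, the sketch below computes the standard Pearson chi-square statistic for a two-way table of counts against the hypothesis that rows and columns are independent. It is not the working paper's model: the function name `chi_square_statistic` and the counts in `table` are hypothetical and chosen only for illustration.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic and degrees of freedom for a
    two-way table of counts, tested against row/column independence.

    Illustrative only; this is not the working paper's model.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in cell (i, j) if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (count - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts: two categorical attributes with two levels each.
table = [[30, 10],
         [10, 30]]
stat, dof = chi_square_statistic(table)
print(stat, dof)
```

A larger statistic, relative to the degrees of freedom, indicates a greater departure of the observed counts from what independence would predict. Note that the calculation uses only category counts, so no numerical scale is assumed for either attribute.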

Download the full working paper here.

The Brookings Institution is financed through the support of a diverse array of foundations, corporations, governments, individuals, as well as an endowment. A list of donors can be found in our annual reports published online here. The findings, interpretations, and conclusions in this report are solely those of its author(s) and are not influenced by any donation.