# Freedman's paradox

Recently I came across the classical 1983 paper *A note on screening regression equations* by David Freedman. Freedman shows in an impressive way the dangers of data reuse in statistical analyses. The potentially dangerous scenarios include those where the results of one statistical procedure performed on the data are fed into another procedure performed on the same data. As a concrete example Freedman considers the practice of performing variable selection first, and then fitting another model using only the identified variables on the same data that was used to identify them in the first place. Because of the unexpectedly high severity of the problem this phenomenon became known as “Freedman’s paradox”. Moreover, in his paper Freedman derives asymptotic estimates for the resulting errors.