Big Data

Nassim says:

This tutorial presents the intuition behind the randomness of sample correlation (spurious correlation) and the methodology used in the derivations.
Some later sections are somewhat technical, as I rederived an old equation with more precise functions (in order to apply it to fat tails) and showed the distribution of the maximum of d variables with n points per variable.
This paves the way to the real scientific work on random matrix theory under fat tails and the failure of Marchenko-Pastur.

We’re more fooled by noise than ever before, and it’s because of a nasty phenomenon called “big data.” With big data, researchers have brought cherry-picking to an industrial level.

Modernity provides too many variables, but too little data per variable. So the spurious relationships grow much, much faster than real information.

In other words: Big data may mean more information, but it also means more false information.
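
To get a feel for how fast spurious relationships pile up, here is a minimal sketch in Python with NumPy. It is not taken from the tutorial: the function names, the choice of Gaussian (thin-tailed) noise, and the values of n and d are all assumptions made for illustration. It draws d mutually independent variables with n points each, so every true correlation is zero, and reports the largest absolute sample correlation found among all pairs.

```python
import numpy as np


def max_abs_correlation(data):
    """Largest absolute off-diagonal sample correlation among the columns."""
    corr = np.corrcoef(data, rowvar=False)  # columns are the variables
    np.fill_diagonal(corr, 0.0)             # ignore each variable's correlation with itself
    return float(np.abs(corr).max())


def spurious_scan(n_points=50, dims=(5, 20, 100, 400), seed=0):
    """Max spurious correlation among the first d columns of pure noise.

    Assumption: independent standard normal noise, so any nonzero
    sample correlation is spurious by construction.
    """
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n_points, max(dims)))
    return {d: max_abs_correlation(data[:, :d]) for d in dims}


if __name__ == "__main__":
    for d, r in spurious_scan().items():
        print(f"d = {d:4d} variables, n = 50 points each: max |correlation| = {r:.3f}")
```

Because each larger set of variables contains the smaller ones, the maximum can only go up as d grows while n stays fixed. The number of pairs to cherry-pick from is d(d-1)/2, which is the sense in which more variables with the same data per variable mean more false information. Under fat tails the picture should differ from this Gaussian toy, which is the point of the more technical sections of the tutorial.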

Link: Beware the Big Errors of ‘Big Data’