Alright. First some essential background:
1:
You cannot measure randomness.
2:
Random does not mean uniform.
By tossing a coin and converting the heads/tails to 1s/0s you might get these sequences:
1010011010
1111100000
Which one is more likely? The answer is, of course, that they are exactly equally likely.
Now someone might point to the apparent pattern in the 2nd sequence. Don't be fooled by that. If we discarded those sequences with "patterns"
then the data would no longer be random because we made a conscious selection.
Depending on how hard we look and how the data is presented, finding some pattern may actually be more likely than finding none. (See: Clustering illusion)
One last thing to bear in mind is the importance of how the data is interpreted. Let's say you treat those binary sequences as numbers and convert them to decimal:
666
992
Suddenly the innocent first sequence turns into the ominous number of the beast while the 2nd sequence becomes inconspicuous.
Obviously this would change yet again if we were to reverse the conversion of heads/tails into 0s/1s, and so on...
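For anyone who wants to check the conversion themselves, here is a small Python sketch. The function names are just mine for illustration; the only thing it relies on is the two example sequences above.

```python
def to_decimal(bits: str) -> int:
    """Read a string of 0s and 1s as a binary number."""
    return int(bits, 2)


def swap_convention(bits: str) -> str:
    """Switch the coding: what was written as 1 becomes 0 and vice versa."""
    return bits.translate(str.maketrans("01", "10"))


first, second = "1010011010", "1111100000"

# Heads = 1, tails = 0 (the coding used above)
print(to_decimal(first), to_decimal(second))  # 666 992

# Heads = 0, tails = 1 (same coin tosses, different coding)
print(to_decimal(swap_convention(first)), to_decimal(swap_convention(second)))  # 357 31
```

Same tosses, different bookkeeping, and the "ominous" number is gone.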
Let's turn to the GCP now:
Their Random Number Generators (RNGs) produce 1s and 0s. Like tossing a lot of coins very fast and converting to 1s and 0s.
Then they count how many 1s they got.
Each of their RNGs does 200 "tosses" per second. Mathematically one expects 100 1s. Mind: "Expects" is mathematical jargon!
It means that if you generate a few million (or more, the more the better) seconds of data then you will find that the average number of 1s per second approaches 100.
However, in less than 6% of all seconds will the number actually be exactly 100.
I.e., in any ordinary sense we don't expect exactly 100.
It's still the case that any specific combination of 1s and 0s is equally likely. There simply are more combinations with 100 1s than, for example, 200 (for which there is only 1 combination). Basically when we analyse the data like that we lump together a lot of combinations simply based on how many 1s they have and that is why some outcomes end up more likely than others.
In the above examples we have an equal number of 1s which means that this way of analyzing would not distinguish between the two.
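The counting argument can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming fair and independent tosses; the 200 tosses per second is the figure given above, and the function names are mine.

```python
from math import comb

TOSSES = 200  # tosses per second, as described above


def combinations_with(ones: int, tosses: int = TOSSES) -> int:
    """How many distinct sequences of `tosses` bits contain exactly `ones` 1s."""
    return comb(tosses, ones)


def probability_of(ones: int, tosses: int = TOSSES) -> float:
    """Chance of exactly `ones` 1s, assuming every sequence is equally likely."""
    return comb(tosses, ones) / 2 ** tosses


print(probability_of(100))     # a bit under 0.06
print(combinations_with(200))  # 1, only the all-1s sequence
print(combinations_with(100))  # an enormous number by comparison

# Each specific 10-toss sequence has the same chance, 1 in 1024,
# and both examples contain five 1s, so counting 1s cannot tell them apart.
print(1 / 2 ** 10)
print("1010011010".count("1"), "1111100000".count("1"))  # 5 5
```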
The more 1s there are in a set period of time, the fewer combinations there are that have still more 1s. I.e., the less often you will get more 1s, and this then is taken (by the GCP) to indicate that something is going on.
Before we go on I must explain why multiple analyses are a problem:
To do this I am going to talk about throwing dice instead of coins. We can easily translate the results of any sufficiently large number of coin throws into a throw of a die. We just make a table which maps 1/6th of all combinations to each side of the die. This will only be an approximation but it can be made arbitrarily close.
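Here is one way such a table could look, as a rough Python sketch. The particular mapping (number the coin sequences from 0 upwards and cut the range into six nearly equal slices) is my own choice for illustration; any table that gives each face about 1/6th of the combinations would do.

```python
from collections import Counter


def coin_sequence_to_die_face(index: int, n_tosses: int) -> int:
    """Map coin sequence number `index` (0 .. 2**n_tosses - 1) to a face 1..6."""
    total = 2 ** n_tosses
    return index * 6 // total + 1


def face_probabilities(n_tosses: int) -> dict:
    """Share of all coin sequences that the table assigns to each face."""
    total = 2 ** n_tosses
    counts = Counter(coin_sequence_to_die_face(i, n_tosses) for i in range(total))
    return {face: counts[face] / total for face in range(1, 7)}


probs = face_probabilities(16)  # 16 tosses = 65536 possible sequences
max_error = max(abs(p - 1 / 6) for p in probs.values())
print(probs)
print(max_error)  # shrinks as the number of tosses grows
```

Since 2 to any power is never divisible by 6, the faces can never come out exactly equal, but each extra toss halves the largest possible error.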
Throw the die. Let's say you get a 3. What are the chances? 1/6th obviously. How amazing that we should have gotten this number instead of any other... But wait, we could have said the same about any other number, so this is obviously fallacious.
It's surely not noteworthy that a die should land at all. Only if the number is somehow called in advance is there any reason to take note.
And what if we call 1 number but throw more than 1 die? That certainly puts a hit in perspective. And more importantly: If we were using coins instead and translating the results with a table, we could get 2 throws in 1 simply by changing the table.
When you're only looking at how many 1s you get you can also use the same data for multiple "throws" for example by looking at different time periods.
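To put a number on how much extra "throws" help, here is a small Python sketch. It assumes the attempts are independent of each other, which is true for separate dice but only roughly true for overlapping time periods, so take the second part as an illustration rather than an exact figure. The 5% threshold is just a common convention I picked, not something taken from the GCP.

```python
def chance_of_at_least_one_hit(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1 - (1 - p_single) ** attempts


# Calling one number in advance, then throwing more and more dice
for dice in (1, 2, 6, 12):
    print(dice, round(chance_of_at_least_one_hit(1 / 6, dice), 3))

# Looking for a "1 in 20" result in more and more different ways
for looks in (1, 5, 20, 100):
    print(looks, round(chance_of_at_least_one_hit(0.05, looks), 3))
```

With 20 independent looks at the 5% level, you are more likely than not to get at least one "significant" result from pure noise.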
I hope that's enough background to understand the basic problems with the GCP.
Let's get to some concrete instances now:
911
What Brian Dunning said:
Basically, when something happens, like 911 happened, they then went back to their data, they looked at the data from that period, and they tried very hard to find a pattern. By using the right statistical controls, carefully chosen statistical controls, it is easy to find just about any kind of pattern you want. That’s basically what they’re doing.
I hope that my introduction gives enough background so that one may realize why this is bad.
On his show's site Dunning gives this quote:
We show that the choice was fortuitous in that had the analysis window been a few minutes shorter or 30 minutes longer, the formal test would not have achieved significance... We differ markedly with regard to the posted conclusions. Using Radin’s analysis, we do not find significant evidence that the GCP network’s EGG’s responded to the New York City attacks in real time. Radin’s computation of 6000:1 odds against chance during the events are accounted for by a not-unexpected local deviation that occurred approximately 3 hours before the attacks. We conclude that the network random number generators produced data consistent with mean chance.
From:
http://www.lfr.org/LFR/csl/library/Sep1101.pdf
Here's a quote from Radin that Tsakiris gives:
In any case, several of us have independently re-analyzed certain events like 911 and we get the exact same result.
I think Americans call that being economical with the truth.
How much trust should we have in their analyses of other events that did not receive the same degree of attention?
Furthermore: It simply is not enough that an event was known in advance to happen. The precise "recipe" for the analysis has to be pre-specified.
You can find a pattern in virtually any random data, you just (probably) won't find the same pattern if you look at more of that data (a so-called out-of-sample test).
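Here is a toy simulation of that point in Python. To be clear, this is not the GCP's analysis or Radin's. The window lengths, the step size and the z-score measure are all made up by me for illustration; the only thing borrowed from above is the 200 fair tosses per second.

```python
import random
from math import sqrt

TOSSES = 200  # tosses per second, as described earlier


def simulate_seconds(n_seconds, rng):
    """Number of 1s in each simulated second of 200 fair coin tosses."""
    return [bin(rng.getrandbits(TOSSES)).count("1") for _ in range(n_seconds)]


def window_z(counts, start, length):
    """How many standard deviations the 1s in a window are above chance."""
    total = sum(counts[start:start + length])
    expected = length * TOSSES / 2
    std_dev = sqrt(length * TOSSES / 4)
    return (total - expected) / std_dev


def candidate_windows(n_seconds, lengths=(60, 300, 900), step=60):
    """Hypothetical set of analysis windows to try: (start, length) pairs."""
    return [(s, n) for n in lengths for s in range(0, n_seconds - n + 1, step)]


rng = random.Random(2001)  # fixed seed so the run is repeatable
first_hour = simulate_seconds(3600, rng)   # data we go looking in
second_hour = simulate_seconds(3600, rng)  # fresh data, not looked at before

windows = candidate_windows(3600)
best = max(windows, key=lambda w: window_z(first_hour, *w))

print("windows tried:", len(windows))
print("best window (start, length):", best)
print("z in the data we searched:", round(window_z(first_hour, *best), 2))
print("z for the same window in fresh data:", round(window_z(second_hour, *best), 2))
```

The window that looks best in the searched data was picked because it looks best, so its score there is inflated by the search. The same window applied to fresh data is just one ordinary draw from chance.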
New year
New year's eve is in some ways ideal to look at. It is a recurring predictable event. There's no question of maybe simply not reporting the analysis if it turns out to be unfavorable.
However, the question is obviously if they always used the same recipe.
The answer is no. They did it right in the last 3 years ('07-'09) but without any remarkable results.
In 2006 they used a different recipe that they developed by 'data mining'. In 2005 they had looked through previous New Year's data for a pattern and then tried to find the same pattern in '06. As one would expect from random data, it was not there.
There's more to be said but I don't want to bore people with irrelevant detail. Suffice to say:
It’s almost silly to have to go over this stuff repeatedly, because it’s all so spelled-out so darn clearly on the Web site.
IMHO the GCP is fatally flawed as a scientific experiment. An experiment is like asking Mother Nature a Yes/No question.
Practically, there is no way in which the GCP could get a "No" for an answer, given the way it is set up.