| Scientific debates: Discussions on the scientific side of psi research, including publications, news, books, experiments, podcasts etc. Skeptics and supporters. |
| |||
| Here is the long-anticipated review of Rupert Sheldrake's telepathic parrot study. Due to length, this review will be two posts. I apologize for being so wordy. As with all my reviews, I highly recommend you get the paper and read it for yourself. (http://www.scientificexploration.org...ke_morgana.pdf) If you don't care to read it in its entirety, you can at least have it handy to refer to for greater detail or to see if I have made an error in my comments.

First, a personal aside, and a little name-dropping. This paper refers to the world's most famous parrot, Alex (may peace be upon him). I had the distinct pleasure of meeting Alex while I was at the University of Arizona. I was hanging out in a hallway, waiting for a seminar to begin. From the other end of the hall I heard a loud "cat call" whistle (the kind of whistle you would stereotypically hear when an attractive woman walks by a construction site). I was intrigued, so I wandered down the hall to investigate. I arrived at the entrance to a laboratory and saw a research assistant holding a grey parrot. He looked at me and whistled again. Then something clicked in my head. I looked at the girl holding the parrot, with my eyes probably bulging out by now, and said, "Is this Alex?" She said it was. I was so excited, having read about him and seen him in several videos. He said "hello" to me, but I don't remember him saying anything else. I had to leave to get to my seminar, and they probably don't like gawkers hanging around much anyway, so I left after a couple minutes. It was a big thrill for me. This might seem odd to a lot of people, but it was like a couple minutes in nerd heaven for me.

Anyway, on with the review.

Synopsis

This is a report of an experiment with an African Grey parrot, N’kisi (pronounced ‘‘in-key-see’’) that is reportedly capable of using more than 700 words. A 30-word subset of N'kisi's vocabulary was chosen to be key words for the experiment.
Pictures were selected that contained images representing some of these words. In the end, a total of 20 key words were used in the experiment. These words corresponded to images in the pictures, with some pictures containing images for multiple key words. N'kisi's owner, Aimee Morgana, and N'kisi were in separate rooms. A wireless baby monitor was used to allow Morgana to hear N'kisi, but there was no allowance for N'kisi to hear or see Morgana. Each "trial" consisted of Morgana opening an envelope containing one of the photos and concentrating on the image for two minutes. N'kisi was recorded, with his comments transcribed by multiple independent individuals. For more detail on the experimental design and transcription process, see the paper. Only the "key words" said by N'kisi were transcribed. The key words said by N'kisi during each two-minute trial were compared with the pre-determined set of key words represented in the picture for that trial. A statistical analysis was performed to assess the significance of the key word matches between N'kisi and the images. The significance is reported at a level of p = 0.0002.

Analysis

This is an interesting study, perhaps providing a basis for further investigation. However, the problems in this study are sufficient to call the reported results into question.

Among the most blatant problems is the lack of a negative control. If we are to determine that there is an information transfer between Morgana and N'kisi, and that this information transfer is impacting the words N'kisi says, we must have data with and without Morgana 'sending' the information. We only have one side of this equation. I don't know if I can stress enough the importance of having negative controls in order to have any confidence in experimental results. The consistent failure to include negative controls in experimental designs is inexcusable.

There are multiple decisions that have the potential to tilt the outcome in a particular direction.
A small subset of key words is selected, and a set of photos in which these key words are unevenly represented is chosen as targets. The frequency with which N'kisi says the various words is not reported, so we have no way of determining if the word being represented in the target photo has altered this frequency.

Of 147 trials conducted, only 71 were included in the analysis; this will lead to an overestimate of the significance of the results. Any trial in which N'kisi did not say any key word is discarded. This appears to be a formalization of confirmation bias. That is, trials that represent unequivocal failure are ignored. I will touch on this a bit more below.

Even in those trials reported, we are not presented with a complete list of words said; only the 'key words' are presented. It would seem to matter if "flower" is one of fifty words said in a trial in which a flower is in the photo, or if "flower" is the only word said during that trial. But again, confirmation bias appears built in. Only those words that could possibly count as hits are reported, with no indication of the total verbal output of N'kisi during the trials.

When reviewing the results that are presented, it is striking how many times certain words are represented in the photos. "Medicine" and "Flower" both appear frequently. It's almost as though the photos were selected for their correspondence with N'kisi's favorite words.

To be continued...

Last edited by Im a Hedge; 10-10-2008 at 10:26 PM.. Reason: correcting spelling errors |
| |||
| The Review process and the Editorial policy of The Journal of Scientific Exploration.

At the end of this paper, there are comments by two of the reviewers. This is uncommon in scientific journals, but it is a practice that is gaining popularity. I was intrigued by this, so I looked into the review policy of the journal. The website reports that individual reviewers may choose to remain anonymous. If they so desire, they may include a statement to be published alongside the paper. It does not appear that this statement is the same as the comments that the reviewer provides to the authors and to the editor. This paper is accompanied by comments by two reviewers, who are identified as being the two original reviewers.

It is common to find reviewers who are in the same field as the study being evaluated, or in a closely related one. While that is understandably difficult in some of the fields covered by the JSE, one of the reviewers for this paper was from the Space Science Division at the NASA Ames Research Center, which seems quite far removed from telepathic parrots. This particular reviewer (Jeffrey D. Scargle) presents some criticisms similar to some of mine given above. He states: Quote:
The other reviewer (Mikel Aicken) also mentions the omission of data: Quote:
"In passing, I mention that the permutation test done in the article is incorrect". The permutation test is their key method of statistical analysis, and the basis for their conclusions. He then dismisses his own concerns once more. Surprisingly, this reviewer seems to have recommended publication.

Let's go over this again. The authors did not present all of their data. Their omission of data was found to have the effect of making their conclusions appear more strongly supported than they are. The statistical test which allows the authors to claim a significant effect is cited by a reviewer as an incorrect test. Yet the paper was published in this state. The Editor comments that the 'split decision' of the initial two reviewers caused the paper to be sent to two additional reviewers, with the majority decision being 'publication'.

There is a saying about seeing sausage and legislation in the process of being made. If one sees how these are made, one tends to lose confidence in the result. The process of peer review in scientific publication is not like this. First-hand experience of the review process increases confidence in the result. Borderline manuscripts get the benefit of expert analysis and suggestions. They may be improved and resubmitted for publication. Poor quality manuscripts are rejected. The process actually works quite well.

A reviewer is often presented a series of options, such as:
Accept as is.
Accept with minor revisions.
Accept with major revisions.
Reject.

The first and the last choices are reserved for the extreme cases. A paper may be so superb that it clearly should be published, or it may be so horrible that it must clearly be rejected. Typically, the choice comes down to "Minor revisions" and "Major revisions". The main difference between these two options is that "Minor revisions" may be reviewed and accepted by the editor once resubmitted by the authors.
"Major revisions" usually requires that the reviewers see the revised manuscript and make a new decision.

From the reviewers' comments that were published alongside this paper, I would have expected "Accept with major revisions". That would be my decision on this paper. My approach is to give the authors a chance to address my concerns. It would have to be absolutely ghastly for me to say "Reject" upon initial review. (I typically say "Major revisions", providing a set of criticisms and suggestions. For the ones that have come back to me after that, I have ended up at "Minor revisions", usually with minor grammatical, spelling, or style comments.)

The fact that this paper made it through and was published in this form, without incorporating criticisms and suggestions by the reviewers, does not reflect well upon the quality of the Journal of Scientific Exploration.

To sum up, I agree with the statement of the first reviewer, Jeffrey D. Scargle: "I do not believe that this experiment provides any evidence supporting the claim of telepathy."

I am a Hedge

Last edited by Im a Hedge; 10-10-2008 at 10:27 PM.. Reason: typo |
| |||
| Quote:
Quote:
The rest of your critique seems based upon whether it should have been published, and you conclude in agreement with Jeffrey D. Scargle, who in the past has also suspected psi is due to file drawer publication bias... I think Radin's Entangled Minds book shows this to be very unlikely.

But there is another type of publication bias, where peer-reviewed scientific journals prefer to publish psi studies that fail rather than psi studies that succeed. So if one only includes publications in the most esteemed peer-reviewed journals, there could indeed be bias against psi. For example, Sheldrake's telepathic dog experiments were rejected for publication in a surely appropriate journal, 'Animal Behaviour', on the a priori grounds that psi would not be taken seriously... whilst the very prestigious British Journal of Psychology published the Wiseman/Smith dog trial that apparently failed... and which you agree was flawed.

Last edited by Open Mind; 10-12-2008 at 11:27 AM.. |
| |||
| Yes, it appears the authors did what was requested. But we still don't have the data. I think this represents failure at all points of the system. The authors failed, the reviewers failed (at least three of them), and the editor failed. Quote:
My other major criticism, even worse than the missing data, is the absence of negative controls. In my opinion, this would also render the paper unpublishable, as there is no way to rule out non-telepathic explanations. Other concerns I have are the selection of the key word list, the uneven representation of the key words in the images, and the lack of information about N'kisi's background frequency of different words.

I don't know anything about Jeffrey D. Scargle, other than that he was a reviewer for this paper and he works at NASA. If he has prior experience in psi research, that would help explain why he was chosen as a reviewer.

I don't know about all the publication biases, although it's easy to believe they exist. As I said in my review of the Wiseman and Smith paper, I am not impressed with the British Journal of Psychology's decision to publish it. I don't know if this is a single case of poor work slipping through, if this is representative of the general quality of the journal, or if they had some ulterior motive in publishing this particular poor piece of work. One journal publishing bad papers doesn't excuse other journals from criticism.

I think the file drawer effect is constantly present. This is in all fields, not just psi. Some types of results are harder to get published. You have to do a lot of additional work if you have a negative result that you think is worth publishing, because the journals aren't usually interested. I don't remember seeing many papers with titles like, "Multiple tests show that a novel extraction method does not work". Papers like this could be very useful in helping other researchers avoid repeating mistakes, but they just don't get much attention from the journals.

As you say, there may be a sort of reverse-file-drawer effect resulting from journals refusing to publish positive psi findings. It would be interesting to learn if this paper (or others in the field) were submitted to more mainstream journals initially.
If so, it might be enlightening to learn which journals, and to see the reviewers' comments (if reviewed) or the editor's comments.

Thanks for taking the time to read and comment on my review.

I am a Hedge |
| |||
| Hedge, I really want to recommend that you read Entangled Minds by Dean Radin. Not because I think the book will convince you of the reality of psi, but because you get to hear the view of one of the major experimentalists in the field, get a tour of the history of psi research, responses to common criticism and references to many, many studies. Many of the best psi studies are listed there. This book was really helpful to me when I read it. I did go through many of the original studies mentioned in the book also. Some I found convincing. Some I found to have weaknesses. Of course, Radin has his opinions. But this book is not about cherry picking. My point here is that it will make your journey through this territory much easier.

Tor |
| |||||||||
| This is a response to comments by David Bailey regarding this paper, made in a different thread. I'm trying to consolidate the comments on this paper into a single thread, so I'm posting my reply here. To avoid losing context, here's a link to David's post in the other thread (http://forum.mind-energy.net/skeptik....html#post7576). I also quote the relevant parts of David's post in my response below. Quote:
Quote:
Quote:
Quote:
I'm not sure what scoring method makes the most sense. They score each key word said, and count it as a hit or a miss, depending on whether or not it was pre-determined to be represented in the picture for that trial. I need to think about it more, but I'm not sure that's the best system. You could score each trial by whether or not a correct key word was said. You could score the percentage of represented key words that were said during the trial. For example, if the picture shows "flower" and "chair", and "chair" is said, but not "flower", you score that 50%. You could score the percentage of words said during the trial that are represented in the picture. For example, with the same picture as before, if "flower", "car", "water", and "keys" are said, this would score 25%. There are many variations that could be used, and I don't know which would be the most appropriate. Then there's the question of how to use these scores to distinguish between the telepathy hypothesis and the null hypothesis. So, this is a long-winded way of saying that I don't know what the optimal design would be, but I'm pretty sure it's not the one that was used. Quote:
Quote:
Quote:
Quote:
In this case, I don't think we even know that the result is non-chance. We don't have the data, and the legitimacy of the statistical test has been called into question. I don't know how to be sure the results are not chance without having the data. Quote:
Thanks again both of you (David and Open Mind) for taking the time to read and critique my review, and to discuss this research with me. I am a Hedge |
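A minimal Python sketch of the three alternative scoring schemes floated in the post above (any hit per trial, fraction of pictured key words that were said, fraction of said key words that were pictured). The function name `score_trial` and the labels `any_hit`, `recall` and `precision` are illustrative choices, not anything from the paper, and this is not the scoring the authors used.

```python
def score_trial(said, pictured):
    """Score one trial three ways.

    `said` and `pictured` are collections of key words (hypothetical inputs):
    the key words the parrot said, and the key words shown in the picture.
    """
    said, pictured = set(said), set(pictured)
    matches = said & pictured
    return {
        # Was at least one pictured key word said during the trial?
        "any_hit": bool(matches),
        # Fraction of pictured key words that were said.
        "recall": len(matches) / len(pictured) if pictured else 0.0,
        # Fraction of said key words that were pictured.
        "precision": len(matches) / len(said) if said else 0.0,
    }


# Picture shows "flower" and "chair"; only "chair" is said.
print(score_trial({"chair"}, {"flower", "chair"}))

# Same picture; "flower", "car", "water" and "keys" are said.
print(score_trial({"flower", "car", "water", "keys"}, {"flower", "chair"}))
```

The two calls reproduce the worked examples from the post: 50% of the pictured key words said in the first case, and 25% of the said key words pictured in the second. Which of these scores (if any) best separates the telepathy hypothesis from the null hypothesis is left open, as in the post.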
| |||
| Quote:
I am a Hedge |
| |||
| Quote:
No, the researcher might be completely convinced of the existence of the phenomenon in question, so would be happy to implement the reviewers' recommendations, confident that he'll still get statistically significant results. But then what happens is other skeptics complain about this modified design. The point here being that it is quite impossible to satisfy all skeptics. Parapsychologists can only ever satisfy some, depending on their experimental design. But it's the skeptics that are not satisfied who are vocal and the ones who get listened to. So parapsychologists can't win here. Quote:
|
| |||
| I must amend my review. I was reading the paper again, and it appears that I was overly critical of the review process. There is a paragraph in the paper specifically addressing a criticism by one of the reviewers. From the paper: Quote:
I am a Hedge |
| |||
| Wow man, that was a great review. I can't tell if you're a skeptic, but if you are, you have just replaced Ersby as my favorite one. When I first heard the results I was really impressed, but when I saw the data I really had some doubts. But the biggest thing, for me at least, was that this experiment was 3 years ago and we have heard nothing since. It would have been replicated by now if it were for real. As for the lack of a negative control, I thought that was the whole point of the bootstrap analysis, which produced the 5 in 20,000 odds of success at the rate N'kisi got. |
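On the question of whether a resampling analysis can stand in for a negative control, the general idea can be sketched as follows: keep what the parrot said in each trial fixed, shuffle which picture is paired with which trial, and see how often the shuffled pairings score at least as many hits as the real ones. This is only an illustration of the general technique on made-up data. It is not the test used in the paper (which one reviewer described as incorrect), and `count_hits` and `permutation_p_value` are hypothetical names.

```python
import random


def count_hits(said_per_trial, pictured_per_trial):
    """Total number of said key words that appear in the paired picture."""
    return sum(len(said & pictured)
               for said, pictured in zip(said_per_trial, pictured_per_trial))


def permutation_p_value(said_per_trial, pictured_per_trial,
                        n_shuffles=2000, seed=0):
    """Estimate how often shuffled pairings match the observed hit count."""
    rng = random.Random(seed)
    observed = count_hits(said_per_trial, pictured_per_trial)
    shuffled = list(pictured_per_trial)
    as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # re-pair pictures with trials at random
        if count_hits(said_per_trial, shuffled) >= observed:
            as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (as_extreme + 1) / (n_shuffles + 1)


# Made-up data, for illustration only.
words = ["flower", "chair", "keys", "water", "car", "phone"]
pictured = [{w} for w in words]
matched = [{w} for w in words]          # parrot names each picture correctly
print(permutation_p_value(matched, pictured))

favorite = [{"flower"}] * 4             # parrot says "flower" in every trial
pictures = [{"flower"}, {"flower"}, {"chair"}, {"keys"}]
print(permutation_p_value(favorite, pictures))
```

In the second example the parrot says only "flower" in every trial, so every shuffle scores exactly as many hits as the real pairing and the estimated p-value is 1: hits produced purely by a favorite word do not count as evidence under this kind of test. A test like this addresses word-frequency effects, but it would not by itself rule out other non-telepathic explanations, which is what trials without a 'sender' are meant to probe.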