A discussion of the Beischel experiment and an attempt to leap the "Hurdle"

  • A discussion of the Beischel experiment and an attempt to leap the "Hurdle"

    I’m posting this in a new thread as I think that makes sense.

    Some time ago Alex asked for criticism of the Beischel study. I said I’d do so. The “challenge” changed somewhat and morphed into the “first hurdle” thread. Along that path there was also a call to speak to the Skepdude post (I haven't done that here; I may do so later).

    This is my effort. The path has been an interesting one, involving much introspection, and not a little analysis and research. It’s been personally thought provoking.

    Firstly, a disclaimer and a call for reasoned replies.

    Disclaimer: This is my own work, and it is all my own opinion. I am not a social scientist: I have a background in Physics and Maths and have spent most of my career marketing relatively complex financial information and analysis. I have spent the last few years working with statistical software.

    I appreciate that I am open to criticism of suffering from the “Engineer’s Problem”. I have tried to mitigate that as much as possible.

    I will also add that I found this an incredibly difficult exercise. (Not just due to lack of time). The paper is somewhat confusing and it really doesn’t help when the authors keep making wild leaps into conjecture or asserting their beliefs as fact.

    Rules of engagement:
    I have tried to look at this from a critical point of view, but with an open mind. If you want to engage in this discussion, in addition to the normal and expected requirements for civilised discussion I ask that you frame your reply as one of three forms of argument.

    1. Technical errors. If you are going to call me on that you are required to back up your assertions and probably cite references.

    2. Errors of opinion.
    Again, you are welcome to point this out, but you must supply a rational and reasonable counter-argument.

    3. Errors of interpretation. If you think I’ve misunderstood something, let me know and point out where I’m wrong.

    If you think I’ve got something else wrong, do speak up and tell me what and how something’s wrong and what the correct answer is.

    Structure: Fossil has done an exemplary analysis in general, so I’m going to limit myself to areas where I have problems.

    1. Objective

    I will admit that originally I got the objective of the exercise wrong (and I think others have too). I thought they were looking for evidence of anomalistic communication. It seems that they were actually testing whether they could blind out successful results.

    They are making the massive conjecture that any anomalistic communication is due to unexplained communication from dead folks to mediums.

    Realising this almost made me give up this whole idea. To me, attempting an experiment that has its root plausibility based on wild conjecture seems doomed to failure. It’s not even as though this is controversial science, it’s just not science at all from a mainstream point of view.

    2. Blinding, taken to excess, without controlling

    To begin with, blinding is perhaps a misnomer. The implication is that by “blinding” the protocol in this way, there is somehow a removal of information leakage.

    Giving the “mediums” names removes a degree of blinding. By definition there is some information leakage. I am going to say that I found the researchers’ understanding of blinding somewhat odd.

    As mentioned elsewhere, if e.g. “Kate” can be identified with only that information, then surely it’s little stretch to identify her based on no information. I am afraid this gave me a huge mental image of a heavenly telephone exchange, “Kate, is Kate there? No, not you, the right Kate…”

    Anyway, I’m afraid that we go right back to the beginning, if there’s no anomalous transmission, it doesn’t matter how much you blind the experiment, it still gives no positive results.

    3. Sitter’s judgement

    I find it interesting that the sitters have to judge individual statements, yet then make a forced summary judgement as well.

    I see that the researchers believe that scoring statements makes the whole thing more robust. Unfortunately, we don’t have the statements at hand and we don’t know the facts about the supposed discarnates.

    I would prefer to see the statements inspected by an independent judge, who can then assess accuracy. Actually, I’d just like to see them published alongside the facts.

    I am afraid, as a hard scientist, that this weakens the conclusions. I have no data on which to independently verify whether there is any anomalous information flow, or not. Problem.

    My suggested interpretation is that what we’re seeing is a degree of apophenia, specifically the Forer effect – especially as there’s a forced choice.

    Sitters given readings were motivated internally and externally to recognise one as their own, mediums were potentially able to tilt the readings slightly based on the “discarnates” names.

    4. Controls

    Although I take on board the “single subject study” idea, where one can self control, we’re not discussing a drug test here.

    In my mind this is a physical science experiment not a social science one.

    So we need to control for confounding factors. A couple of questions I would like to see answered include:

    What’s the control result if the “mediums” are talented cold readers? Note well, here, cold reading includes a great deal of exploitation of apophenia, Forer effect, cognitive biases in general, etc. There doesn’t need to be a feedback loop between sitter and reader. The sitter is quite capable of doing the heavy lifting in isolation. Especially when forced.

    Why not simply do the readings without names? As above, it’s somewhat questionable that any posited “discarnates” are working off the names anyway.

    5. Statistics

    As stated, I’m not a statistician.

    I am far from convinced that it’s appropriate to use exact binomial stats. The main assumption (of randomness) underlying this seems to be violated. If we use Forer’s data, the probability that a given person will agree that a set of suitably vague statements agrees with their personality is 85.2%. I’m not saying these are suitably vague statements, but we don’t know what they are, so we have no baseline.

    So we’re not looking at random events. This makes me question the p values quoted. I am quite prepared to be told that this may make the p values smaller than quoted if appropriate analysis is applied!
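
    A minimal sketch of why the baseline matters, in Python. It assumes the 13-of-16 forced-choice figure discussed under Results and treats each sitter's choice as an independent trial, which is itself the assumption being questioned here. The function name `binom_tail` and the list of baseline rates are illustrative only: apart from 0.5 (pure chance) and the 85.2% Forer figure quoted above, the rates are hypothetical placeholders, not values from the paper.

```python
from math import comb


def binom_tail(hits: int, trials: int, p: float) -> float:
    """Return P(X >= hits) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Hypothetical baseline hit rates; only 0.5 and 0.852 come from the discussion.
for baseline in (0.5, 0.6, 0.7, 0.852):
    tail = binom_tail(13, 16, baseline)
    print(f"baseline {baseline:.3f}: P(at least 13 of 16) = {tail:.4f}")
```

    With a baseline of 0.5 the tail probability is roughly 0.011. The same 13 hits become steadily less surprising as the assumed baseline rises, so any quoted p value is only as good as the baseline behind it.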

    And, sorry, here it comes, sample size. Sure, a small sample can give you useful evidence, but you can only really draw hard conclusions in the presence of prior plausibility or a mechanism.

    Here, well it appears superficially interesting, and maybe worthy of further study, but that’s as far as we can go.

    It is certainly not evidence that consciousness survives death, that discarnate consciousnesses can communicate with incarnate ones, or that blinding the experiment a wee bit makes life any harder for those involved.

    6. Results

    OK, here’s where I really have an issue.

    I’m going to skip the summary scoring as I suspect self confirmation bias at play.

    Leaping to the “which reading is more applicable” section, I have my biggest problem here.

    Problem One. They’re forced choices. Leave the choice free.

    Problem Two. Of the 13 positive results, only 10 were “clearly” or “moderately” more applicable. I’ll allow the “slightly” more applicable to pass.

    Given that we know that people will self identify (say 85% of the time) with entirely alien descriptions – this seems entirely reasonable and supports the hypothesis that some mediums are better at writing statements that are identified as describing the sitter.

    Of the negative results, there are 1 “clearly” and 1 “moderately”.

    I’m going to suggest a different judging system. Given that the experimenters’ schema is entirely arbitrary, how about this?

    To be fair, we only count “more” applicable results as a “hit”. After all, surely we have controlled for “discarnates” influencing our sitters’ decisions?

    This, at best, gives us a headline result of 11 out of 16.

    If I was being harder, that would be 10, and if I felt like (and there’s no statistical reason not to, ask any medical student) scoring negatively and counting the “clearly” and “moderately” incorrect scores against the results we’d have 9 out of 16.

    Which even under approximate binomial probability seems pretty close to chance.
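
    To put rough numbers on that, here is a small Python sketch of the exact one-sided binomial tail under pure chance (p = 0.5) for the four headline counts mentioned above: the original 13 of 16 and the re-scored 11, 10 and 9. It assumes independent, equally likely choices, and treating the negatively scored 9 as a plain hit count is a simplification. The function name `chance_tail` is hypothetical.

```python
from math import comb


def chance_tail(hits: int, trials: int = 16) -> float:
    """Return P(X >= hits) when every trial is a fair coin flip."""
    favourable = sum(comb(trials, k) for k in range(hits, trials + 1))
    return favourable / 2**trials


for hits in (13, 11, 10, 9):
    print(f"{hits} of 16: P(at least this many by chance) = {chance_tail(hits):.4f}")
# 13 of 16: P(at least this many by chance) = 0.0106
# 11 of 16: P(at least this many by chance) = 0.1051
# 10 of 16: P(at least this many by chance) = 0.2272
# 9 of 16: P(at least this many by chance) = 0.4018
```

    On these assumptions only the original 13 of 16 falls below the conventional 0.05 threshold; each of the re-scored counts is well above it.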

  • #2
    I'm interested in the way you use the Forer effect in your counter-argument. I think you agree with me that the effect would be equally applicable to both readings if the data were truly random.

    From my point of view, if you are correct, then the results of the study are not what is expected from a random data set explained with the Forer effect.

    For example, if the data were random then, with this small sample size, I think it would be fair to expect at least one truly negative result (a "reversal"), with the control reading scoring higher than the intended reading in the average summary rating.

    So then it looks like the mediums were able to "tilt the readings slightly", as you mention yourself. You argue that a medium would be able to do that based on only a name. That would seem very reasonable if they hadn't paired each deceased person with a same-gender peer. It is mentioned that they did, so it seems less likely (although I can think of some obvious examples of course: foreign names, old-fashioned names).

    The Forer effect could well have played a role in the actual choice the sitters made as to which reading was intended for them. One sitter even gave the highest rating to a reading not intended for them, even though the summary score for the other reading was higher (or equal). The choices could well show that effect because the result is so close to your 85%.

    I think your 'apophenia point' applies well to the forced choice. In my opinion, though, you should look for a different, stronger explanation to account for the "significant summary score".
    Last edited by IscopeU; November 16th, 2010, 05:16 PM.


    • #3
      Thank you for reading and responding.

      Yes, I (think) I agree completely with you.

      I do note that they run 4 pairs with intended first vs "control" first - I haven't had time to really run through the stats - nor am I particularly capable of doing so, but while I think that their protocol naively controls (!) for the Forer effect, I'm not convinced it works completely.


      • #4
        Umm, I think my summary score point is simply that interpreted differently, it's no result.

        But I do see what you mean.

        I have to say, this was all bl00dy difficult. I'm not complaining about it as it was interesting and rewarding. I have a far better understanding of many parts of the whole topic.

        Doesn't make me not be a sceptic, mind you.


        • #5
          I think it might help to consider an analogous conventional experiment.

          Suppose you took 16 samples of underarm sweat from women - 8 while they were ovulating, and 8 at some other time (selected randomly), and you asked a man to judge the samples ovulating/not-ovulating using a forced choice. Suppose the result came out with 13 right out of 16. Would we be justified in concluding that at least the man under test, could discriminate on this basis?

          Alternatively, pick another conventional experiment - maybe an actual experiment - I mean from my perspective, forced choice experiments are made all the time - they seem to be an acceptable way to do an experiment.

          I am not sure how to classify this comment.

          David


          • #6
            No no, for me this is not about turning you into a believer.
            It is very difficult and I respect you for doing this. I saw your percentages for apophenia and it struck me how well they fit the choices the people made when you pointed it out. It was kind of refreshing. I expected a sample-size rant because that seems the more obvious route to take. Instead you put some real thought into this and it showed.
            And I think I agree that all this blinding didn't (perhaps can't) fully exclude the Forer effect. And since we know nothing about the actual readings you should use it in your scepticism, of course!


            • #7
              Originally posted by David Bailey
              I think it might help to consider an analogous conventional experiment.

              Suppose you took 16 samples of underarm sweat from women - 8 while they were ovulating, and 8 at some other time (selected randomly), and you asked a man to judge the samples ovulating/not-ovulating using a forced choice. Suppose the result came out with 13 right out of 16. Would we be justified in concluding that at least the man under test, could discriminate on this basis?

              Alternatively, pick another conventional experiment - maybe an actual experiment - I mean from my perspective, forced choice experiments are made all the time - they seem to be an acceptable way to do an experiment.

              I am not sure how to classify this comment.

              David

              Well, the point is not that you can't use forced decision.

              But if the mediums gave the data a slight 'tilt based on name', then the actual choice can, based on the Forer effect, account for 80% correct choices.

              Correct me if I'm wrong.


              • #8
                David, I see your analogy. Yes, and this is why the results are, in their own way, interesting.

                What is being suggested to me is that yes, there is information flow, but the experiment as conducted doesn't do anything to suggest there's anything anomalous going on.

                All good food for thought.

                Not being hubristic or anything, but I'm a little surprised (possibly relieved?) that there haven't been more replies!


                • #9
                  Originally posted by IscopeU
                  Well, the point is not that you can't use forced decision.

                  But if the mediums gave a slight 'tilt based on name' data, then the actual choice can, based on the forer effect, account for 80% correct choices.

                  Correct me if i'm wrong.

                  The whole idea of a "name tilt" bothers me quite a bit. Unless these readings are significantly different from others I have read about or actually experienced personally, this just will not happen. Any variety of circumstances or characteristics may apply to any person with any name. Even people with the exact same first and last name are quite different in even the broadest characteristics. My name, for instance, belongs to:

                  1) A 60+ yr old Canadian businessman
                  2) A 17-year old high school student
                  3) A 20-something race car driver
                  4) A semi-pro comedian in his 30's
                  5) An American drunk driver/criminal
                  6) A vegan artist in his 40's who lives in Holland

                  And this is just the 5 examples besides myself that I see most often when my name pops up on the Internet - and that is my full name. If you only look for "Andrew" or "Andy", the range extends even more, to the point where no meaningful distinction can be made. I once worked at an office that had three men named Caleb. It is still the only time I've met anyone with that name, and each of these men was totally different from the others:

                  1) Caleb 1: 30 year old Canadian Mathematician and Programmer
                  2) Caleb 2: 40 year old Native American (Indian) artist/animator
                  3) Caleb 3: 50 year old American bureaucrat

                  And these are three guys who all worked for the same company in the same discipline. It is so easy to think of examples like this that I really doubt the credibility of the idea that a single first name is enough to provide a medium with a meaningful clue of any kind. The only way a name would have an effect similar to what has been proposed is if: 1) the medium is fraudulent, or 2) the results of the readings are so exceptionally broad that no one would have an effective means of distinguishing one reading from another.

                  AP


                  • #10
                    Andy, which is why we really need to see the results before we can judge better which is going on.


                    • #11
                      Originally posted by porker
                      David, I see your analogy. Yes, and this is why the results are, in their own way, interesting.

                      What is being suggested to me is that yes, there is information flow, but the experiment as conducted doesn't do anything to suggest there's anything anomalous going on.

                      All good food for thought.

                      Not being hubristic or anything, but I'm a little surprised (possibly relieved?) that there haven't been more replies!

                      Well, I have never felt mediums were the best place to start to study psi - just too messy to cut off all conventional information flow, but I think that is where the effort needs to go. Whatever scoring scheme you use, you have to put the results into bins somehow - and choosing which reading fits best seems the best to me.

                      However, it is important to realise what this line of reasoning is really saying. In effect, you are saying that no experiment involving mediums will ever supply evidence of psi of any sort.

                      Also, I'll bet that actually faking this result using the names would be damn hard. I'd be more convinced if someone could demonstrate cold reading under these conditions.

                      David


                      • #12
                        No, not saying that nothing with mediums would possibly yield results - just not like this.

                        I'd still want to set a baseline using talented cold readers.


                        • #13
                          Why not use the remote viewing way of 'connecting'? I don't think a name is necessary. "I get a.. a.. G". They often don't know who they are talking to in the first place; it's not important. You can use shoe size; I can use any random number for remote viewing. A medium might object to this, but I find it perfectly reasonable.

                          And I don't find a 'slight tilt based on name' the best explanation ever. But for the sake of argument, let's assume that experienced cold readers were able to do that. It's a sceptical standpoint. (That's why they want a cold-reader baseline.)

                          The summary scores in themselves are very interesting to me. But with this sample size, they are not going to be convincing to a critic, who will indeed say that the Forer effect can explain every experiment. That's the way it is. It is for us to find an experiment that shows it doesn't. This is not that experiment.

                          I can understand how you could stretch this apophenia idea to the summary score results, although it doesn't really explain the scores in my opinion. But without knowing the actual readings, or seeing the scoring per item, I can see this critique as justified for this paper.

                          The apophenia line of reasoning is fine; that effect is real, even if it implies that "no experiment involving mediums will ever supply evidence of psi of any sort", because that's ultimately the point. Every critic will be happy to shove that in your face.

                          A greater sample size, or a meta-analysis of summary results, gets us past that effect. The summary results did show significance. Like I said, explaining that away with the Forer effect is a lot harder. That needs a stronger argument from the sceptics. And I'm not going to ask Porker to come up with that argument now; I don't think that would be fair.


                          • #14
                            Originally posted by porker
                            No, not saying that nothing with mediums would possibly yield results - just not like this.

                            I'd still want to set a baseline using talented cold readers.

                            Can you outline how you would design an experiment - and then perhaps imagine how others would attack that experiment if it were performed with positive results?

                            I mean, perhaps you could replace the names with code numbers, but would a medium feel they could operate under such conditions?

                            In a way, it has to be sceptics that provide a baseline of cold reading - if the experimenter does that, they could be accused of not picking a good cold reader.

                            I like ganzfeld/presentiment/animal experiments best - they can be much more controlled.

                            David


                            • #15
                              I've got to admit I find the whole mediumship thing a bit of a stretch too, especially as all the media popular ones are frauds and charlatans.

                              I think all you'd need to do is have a nice skeptical panel select a similar number of cold readers who would also read the sitters, with the proxy sitters blind to who's who, add up the number of readings and see what the results looked like.

                              I'd negatively score as well, if I was writing the protocol...
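
                              As a rough sketch of how such a cold-reader baseline could be compared with the mediums' results, here is a one-sided Fisher exact test written in plain Python. Everything about it is an assumption for illustration: the function name `fisher_one_sided` is made up, the choice of test is not from the paper, and the cold-reader count of 10 of 16 is a hypothetical placeholder since no such baseline has been run. Only the 13 of 16 for the mediums comes from the discussion.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b = hits and misses for group 1; c, d = hits and misses for group 2.
    Returns the probability that group 1 gets at least `a` hits if both
    groups share the same underlying hit rate (hypergeometric tail).
    """
    total = a + b + c + d
    hits = a + c        # total hits across both groups
    group1 = a + b      # number of readings in group 1
    upper = min(hits, group1)
    favourable = sum(
        comb(hits, x) * comb(total - hits, group1 - x)
        for x in range(a, upper + 1)
    )
    return favourable / comb(total, group1)


# Mediums: 13 hits, 3 misses (from the thread).
# Cold readers: 10 hits, 6 misses (HYPOTHETICAL placeholder, not real data).
print(f"one-sided p = {fisher_one_sided(13, 3, 10, 6):.4f}")
```

                              The point of the design is that the comparison is against what skilled cold readers achieve under the same blinding, rather than against a coin flip.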
