Category Archives: Scientific Method

Problems and Beliefs

St. Brandon writes about problems in philosophy, but a lot of what he says seems applicable to physics as well.  In my field of quantum gravity, there’s no experimental evidence, so a lot of the back-and-forth has to do with identifying conceptual problems with different ideas.  It’s not always easy to know which problems are fatal to a theory, and which should be viewed as an impetus to more research.

On a side note, we love to talk about the beliefs of scientific researchers (do you believe in string theory or loop quantum gravity or something else?) but in fact beliefs don’t always directly affect how one does research.  The most important thing is what questions one thinks are worthy of further investigation.  Two researchers may be developing the exact same argument A, even though one person is trying to work out the consequences of idea X, while the other is trying to refute X by a reductio argument.  However, it is important to have enough flexibility of mind to realize when you have accidentally constructed an argument for the other side.

On the other hand, beliefs do matter indirectly for structuring research, because they help determine which problems you think are worthy of study, and what factors you take into consideration.  Also, they obviously help determine what conclusions you draw when you’re done.  Beliefs about how one should structure an inquiry may be more important than beliefs about what the final conclusion should be.

Can Religion be Based on Evidence?

So I’d like to get cracking soon on the project of actually presenting the positive evidence for Christianity.  In my view the best evidence is the historical testimony of the apostles to Jesus’ Resurrection (along with other ancient and modern miracle claims).  However, some people have problems with this because it isn’t scientific, and they think that only a “scientific” proof of miracles should qualify as evidence.

The idea that Science is the only very reliable way to gather empirical data is called (usually pejoratively) Scientism.  It is closely related to Naturalism, the belief that the world consists entirely of a certain class of physical things, of a sort which can be scientifically analyzed.  However, the two are not the same, since Scientism is a claim about there being only one good methodology for learning about the world.  A Naturalist is free to believe that there are valid nonscientific methods for learning about the world, as long as they also think those methods don’t reveal the existence of any entities they’d consider supernatural.  (There’s a bit of a problem in defining exactly what “natural” vs. “supernatural” mean, but we more or less know what kinds of things this sort of person doesn’t believe in: gods, miracles, spirits or ghosts of any kind, psychic powers, destiny, reincarnation, etc.)

Well, Scientism in its strongest form is obviously stupid, since as I pointed out here, there exist several other kinds of evidence-based inquiry that involve different methodologies.  Here’s another rebuttal by the atheistic philosopher Richard Chappell, who points out that Scientism isn’t even logically consistent with itself.  So, there may or may not be good reasons to believe in religious claims, but “Science” taken by itself is not one of them.

Well, that was easy.  Maybe too easy.  Because, after all, someone could say this:  Even if there are nonscientific methods of inquiry, hasn’t Science at least taught us something about the way the world is?  And hasn’t it taught us something about what kinds of evidence are reliable?  Maybe there isn’t a sharp contradiction between Science and Religion, but maybe there are things that make it more difficult for a scientifically-minded person to accept religious claims.  I think a lot of people have this idea at the back of their heads, and I’m going to try to address it in my future posts.

For further reflections on the relationship between Science, History, Philosophy, and the various arguments for and against Christianity, see here:

Can Religion be Based on Evidence?

It also explains briefly why I think the Historical Argument for Christianity is quite strong, although I plan to go into that in considerably more detail here.

(Erratum: there are a couple of things I’d phrase differently if I were re-writing this essay now.  First of all, my parenthetical statement about “overcredulous Catholics, Pentecostals, and missionaries to Third World nations” was intended as a statement of a skeptical point of view rather than my own view, although there certainly are some overcredulous people in the groups named.  And this book has convinced me that modern-day miracles are more frequent than I had previously thought.  Also, the phrase “tortured to death” should really be replaced with “tortured or killed”—in fact the whole sentence is too strongly written, and should make clearer who exactly it refers to.  For now, read it as referring to “several of the key eyewitnesses”, I guess.)

Bayes’ Theorem

Today I’d like to talk about Bayes’ Theorem, especially since it’s come up in the comments section several times.  It’s named after St. Thomas Bayes (rhymes with “phase”).  It can be used as a general framework for evaluating the probability of some hypothesis, given some evidence and your background assumptions about the world.

Let me illustrate it with a specific and very non-original example.  The police find the body of someone who was murdered!  They find DNA evidence on the murder weapon.  So they analyze the DNA and compare it to their list of suspects.  They have a huge computer database containing 100,000 people who have previously had run-ins with the law.  They find a match!  Let’s say that the DNA test only gives a false positive one out of every million (1,000,000) times.

So the prosecutor hauls the suspect into court.  He stands up in front of the jury.  “There’s only a one in a million chance that the test is wrong!” he thunders, “so he’s guilty beyond a reasonable doubt; you must convict.”

The problem here—colloquially known as the prosecutor’s fallacy—is a misuse of the concept of conditional probability, that is, the probability that something is true given something else.  We write the conditional probability as \(P(A\,|\,B)\), the probability that \(A\) is true given that \(B\) is true.  It turns out that \(P(A\,|\,B)\) is not in general the same thing as \(P(B\,|\,A)\).

When we say that the rate of false positives is 1 in a million, we mean that $$P(\mathrm{DNA\,match}\,|\,\mathrm{innocent}) = .000001$$ (note that I’m writing probabilities as numbers between 0 and 1, rather than as percentages between 0 and 100).  However, the probability of innocence given a match is not the same concept: $$P(\mathrm{innocent}\,|\,\mathrm{DNA\,match}) \neq .000001.$$ The reason for this error is easy to see.  The police database contains 100,000 names, which is 10% of a million.  That means that even if all 100,000 people are innocent, the probability is still nearly .1 that some poor sucker on the list is going to have a false positive.  (It’s slightly less than .1 actually, because sometimes there would be multiple false positives, but I’m going to ignore this since it’s a small correction.)
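If you want to check that “slightly less than .1” figure, here is a minimal Python sketch of the arithmetic (my own illustration, not part of the original example, assuming the 100,000 tests are independent):

```python
# Chance that at least one of 100,000 innocent people triggers a false
# positive, assuming each DNA test is independent.
false_positive_rate = 1e-6      # P(DNA match | innocent) for one person
database_size = 100_000

p_some_match = 1 - (1 - false_positive_rate) ** database_size
print(p_some_match)             # ~0.0952, slightly less than the naive 0.1
```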

Suppose that there’s a .5 chance that the guilty person is on the list, and a .5 chance that he isn’t.  Then prior to doing the DNA test, the odds of any given person on the list being guilty are only 1 : 200,000.  The positive DNA test makes that person’s guilt a million times more likely, but this only increases the odds to 1,000,000 : 200,000, or 5 : 1.  So the suspect is guilty with only 5/6 probability.  That’s not beyond a reasonable doubt.  (And that’s before considering the possibility of identical twins and other close relatives…)

Things would have been quite different if the police had any other specific evidence that the suspect is guilty.  For example, suppose that the suspect was seen near the scene of the crime 45 minutes before it was committed.  Or suppose that the suspect was the murder victim’s boyfriend.  Suppose that the prior odds of such a person having done the murder rise to 1 : 100.  That’s weak circumstantial evidence.  But in conjunction with the DNA test, the ratio becomes 1,000,000 : 100, which corresponds to a .9999 probability of guilt.  Intuitively, we think that the circumstantial evidence is weak because it could easily be compatible with innocence.  But if it has the effect of putting the person into a much smaller pool of potential suspects, then in fact it raises the probability of guilt by many orders of magnitude.  Then the DNA evidence clinches the case.
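Here is a short Python sketch of both calculations in odds form (again my own illustration; the posterior_probability helper is just something I made up, and the numbers are the hypothetical ones from the example above).  The Bayes factor of a million is the ratio of \(P(\mathrm{match}\,|\,\mathrm{guilty})\) to \(P(\mathrm{match}\,|\,\mathrm{innocent})\):

```python
def posterior_probability(prior_odds, bayes_factor):
    """Posterior probability of guilt, from prior odds and a Bayes factor."""
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

bayes_factor = 1_000_000   # the match is a million times likelier if guilty

# Scenario 1: suspect found only by trawling the 100,000-name database.
print(posterior_probability(1 / 200_000, bayes_factor))   # 5/6 ~ 0.833

# Scenario 2: circumstantial evidence already gives prior odds of 1 : 100.
print(posterior_probability(1 / 100, bayes_factor))       # ~0.9999
```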

So you have to be careful when using conditional probabilities.  Fortunately, there’s a general rule for how to do it.  It’s called Bayes’ Theorem, and I’ve already used it implicitly in the example above.  It’s a basic result of probability theory which goes like this: $$P(H\,|\,E) = \frac{P(H)P(E\,|\,H)}{P(E)}.$$ The way we read this is that if we want to know the probability of some hypothesis \(H\) given some evidence \(E\) which we just observed, we start by asking what was the prior probability \(P(H)\) of the hypothesis before taking data.  Then we ask for the likelihood \(P(E\,|\,H)\): the probability that, if the hypothesis \(H\) were true, we’d see the evidence \(E\) that we did.  We multiply these two numbers together.

Finally, we divide by the probability \(P(E)\) of observing that evidence \(E\).  This just ensures that the probabilities all add up to 1.  The rule may seem a little simpler if you think in terms of probability ratios for a complete set of mutually exclusive rival hypotheses \((H_1,\,H_2,\,H_3,\ldots)\) for explaining the same evidence \(E\).  The prior probabilities \(P(H_1) + P(H_2) + P(H_3) + \ldots\) all add up to 1.  Each likelihood \(P(E\,|\,H_n)\) is a number between 0 and 1; multiplying by it lowers the probability of \(H_n\), depending on how strongly \(H_n\) predicted \(E\).  If \(H_n\) says that \(E\) is certain, its probability remains the same; if \(H_n\) says that \(E\) is impossible, its probability drops to 0; otherwise it lands somewhere in between.  The resulting numbers add up to less than 1, and \(P(E)\) is just the number you have to divide by to make everything add up to 1 again.
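For concreteness, here is a minimal sketch of this multiply-then-renormalize procedure (my own illustration; the function name bayes_update is hypothetical, not standard terminology):

```python
def bayes_update(priors, likelihoods):
    """priors: dict {hypothesis: P(H)}, summing to 1.
    likelihoods: dict {hypothesis: P(E | H)}.
    Returns the posterior probabilities P(H | E)."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    p_evidence = sum(unnormalized.values())   # this is P(E), the divisor
    return {h: p / p_evidence for h, p in unnormalized.items()}
```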

If you’re comparing two rival hypotheses, \(P(E)\) doesn’t matter for calculating their relative odds, since it’s the same for both of them.  It’s easiest to just compare the probability ratios of the rival hypotheses, because then you don’t have to figure out what \(P(E)\) is.  You can always figure it out at the end by requiring everything to add up to 1.

For example, let’s say that you have a coin, and you know it’s either fair (\(H_1\)) or a double-header (\(H_2\)).  Double-headed coins are a lot rarer than regular coins, so maybe you’ll start out thinking that the odds are 1000 : 1 that it’s fair (i.e. \(P(H_2) = 1/1,001\)).  You flip it and get heads.  This is twice as likely if it’s a double-header, so the odds ratio drops down to 500 : 1 (i.e. \(P(H_2) = 1/501\)).  A second heads will make it 250 : 1, and a third will make it 125 : 1 (i.e. \(P(H_2) = 1/126\)).  But then you flip a tails, and the odds become 1 : 0; the coin is certainly fair, since a double-header can never land tails.
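Run through the bayes_update sketch from above, the coin example looks like this (again, just an illustration):

```python
beliefs = {"fair": 1000 / 1001, "double-header": 1 / 1001}

heads = {"fair": 0.5, "double-header": 1.0}   # P(heads | H)
tails = {"fair": 0.5, "double-header": 0.0}   # P(tails | H)

# Three heads in a row, then one tails:
for flip in (heads, heads, heads, tails):
    beliefs = bayes_update(beliefs, flip)
    print(beliefs["double-header"])   # 1/501, 1/251, 1/126, then 0.0
```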

If that’s still too complicated, here’s an even easier way to think about Bayes’ Theorem.  Suppose we imagine making a list of every way that the universe could possibly be.  (Obviously we could never really do this, but at least in some cases we can list every possibility we actually care about, for some particular purpose.)  Each of us has a prior, which tells us how unlikely each possibility is (essentially, this is a measure of how surprised you’d be if that possibility turned out to be true).  Now we learn the results of some measurement \(E\).  Since a complete description of the universe should include what \(E\) is, the likelihood of measuring \(E\) has to be either 0 or 1.  Now we simply eliminate all of the possibilities that we’ve ruled out, and rescale the probabilities of all the other possibilities so that they add up to 1.  That’s equivalent to Bayes’ Theorem.
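In code, this picture amounts to throwing away the ruled-out possibilities and rescaling what’s left; here’s a toy sketch with made-up worlds and probabilities:

```python
worlds = {"w1": 0.5, "w2": 0.3, "w3": 0.2}    # hypothetical prior
consistent_with_E = {"w1", "w3"}              # worlds in which E happens

# Eliminate the worlds inconsistent with E, then renormalize.
surviving = {w: p for w, p in worlds.items() if w in consistent_with_E}
total = sum(surviving.values())               # plays the role of P(E)
posterior = {w: p / total for w, p in surviving.items()}
print(posterior)                              # {'w1': 0.714..., 'w3': 0.285...}
```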

I would have liked to discuss the philosophical aspects of the Bayesian attitude towards probability theory, but this post is already too long without it!  Some other time, maybe.  In the meantime, try this old discussion here.

What is NOT Science?

In my Pillars of Science series, I enumerated six aspects of Science that help explain why it works so well.

It should be clear from my analysis that the characteristics of Science are quite flexible.  All of the criteria are matters of degree, so that they are met more strongly by some fields of study than by others.  Because of this fuzziness, we should expect to find borderline sciences, such as Economics, Anthropology, Psychology, and other social sciences.  It is both futile and unnecessary to try to come up with a criterion to draw an exact line between science and non-science.  In other words, the question of what counts as Science cannot itself be resolved with scientific precision, and is therefore not a scientific question.

This doesn’t bother me too much because my parents are linguists.  So when I was growing up, they made sure I was aware that concepts are defined by their centers, not their boundaries.  For example, if I say the word “chair”, then what pops into your mind is a thing with four legs at the dinner table.  You might admit under interrogation that a “beanbag chair” is also a chair, but it’s hardly the first thing you’ll think of.  Concepts can be useful even when they’re a bit fuzzy at their boundaries.

Despite their flexibility, the criteria are sufficiently strict that many things don’t qualify.  I don’t just mean pseudo-sciences such as astrology or homeopathic medicine, but genuine evidence-based fields of knowledge (“sciences” in the archaic sense of the word) which aren’t scientific in the modern sense, because they only satisfy some of the criteria.

For example, History and Courts of Law, despite their empirical character, deal mostly with unique and unrepeatable events.  So they fail the repeatability prong of Pillar I.  Both of these fields are based primarily on the testimony of witnesses, although Law Court fact-finding has much stricter rules about admissibility of evidence.  Since much of their subject matter can’t be defined with quantitative precision, they don’t do terribly well on Pillar IV either.  Academics in History do have a truth-seeking community similar in kind to the Sciences.  But in Law Courts, the role of ethics, community, and authority is completely different.

This does not mean that these fields should be held in contempt; their methods are sometimes capable of establishing specific facts with a very high degree of certainty, “beyond a reasonable doubt” as the saying goes.  They simply lack the particular methodology of science, which has a proven track record of almost routinely proving astonishing facts about the world, to a degree that ends rational opposition.  If you try to increase certainty by imposing a “scientific” approach on a subject that isn’t suited for it, you risk generating a pseudo-science which jingles the jargon of science while missing its core value: self-correction through rigorous testing of ideas.

Philosophy is nonscientific for a different reason than the empirical humanities.  While many philosophers strongly value elegance and precision of ideas, typical disputes between philosophers are not very amenable to empirical testing.  That doesn’t mean that observation plays no role.  But the way philosophers typically make arguments, they also rely on controversial background assumptions, which can’t be definitively settled just by looking at the world.

If, despite the potential for controversy, the argument for a position is sufficiently convincing, this can still establish the philosophical position with great certainty.  In fact, if the skeptical thesis that no knowledge is reliable could not be refuted with near certainty, then no field of inquiry could produce near certainty.  This potential for certainty does not change the fact that Philosophy operates by a different methodology, which on average does not resolve controversies as easily as the methods of Science or even History do.

For this reason a philosophical thesis based on Science will usually have the degree of certainty associated with Philosophy, not that associated with Science.  A chain of reasoning is only as strong as its weakest link.  So a philosophical argument based on Science should not necessarily trump, e.g., a strong historical argument, simply because Science is normally more reliable than History.

So how do we fit ideas from different fields together?  In a future post, I’ll discuss Bayes’ Theorem, which is a flexible way to think about all different kinds of evidence-based reasoning, without making specific assumptions about the sorts of evidence we can include.

Pillars of Science: Summary and Questions

I’ve now completed my Pillars of Science series.  My goal was to analyze why Science is such an amazingly effective method for discovering new truths about the world.  Here are the 6 “Pillars” I identified.  Of course, Science is a multifaceted word: it can refer to a method, a set of theories, or a community.  Understanding how Science works really requires thinking about all 3 together.

Intro:

A. How do we test scientific ideas?

B. What kinds of ideas can be tested scientifically?

C. Who can test them effectively?

Having laid this preparatory groundwork, in the next few weeks I’d like to get to a more exciting and controversial topic: I plan to discuss Christianity specifically in the context of each of these 6 Pillars to see how well it holds up.  (But before I get to that, I plan to post a bit about whether there are any other evidence-based ways of looking at the world, besides Science.)

You see, in this blog I am taking seriously the “What about Science?” objection to Christianity.  Many people think that the basic principles of Science somehow refute or undercut religious views.  The latter are supposedly based on something called “faith” which is diametrically opposed to “evidence”.  While everyone knows that some scientists are religious, many people think this is only possible because of “compartmentalized thinking”, in which the two different approaches to life are somehow sealed off in different compartments so that the “evidence” compartment isn’t allowed to explode the “faith” compartment.

Now those of us who practice the spiritual discipline of Undivided Looking obviously approve of UN-compartmentalized thinking, in which we think of reality as a whole, without making special exemptions for parts of life we don’t want to subject to critical scrutiny.  Somewhat paradoxically, this does not require us to disapprove of compartmentalized thinking.  In certain respects Science itself is based on compartmentalized thinking (see Pillar III).

And we couldn’t stop doing it even if we tried, because our brains are wired for compartmentalized thinking.  (Especially the male brain, which is more likely to delegate tasks to particular regions of the brain, whereas the female brain is more likely to think using connections between different parts of the brain.  See e.g. this study.)  But what we can and should do sometimes, is make a conscious effort to look at things together, rather than separately.

Since I’m going to be referring back to these six Pillars of Science, I’d like to ask for some reader feedback.  Do you think my discussion of these Pillars could be improved?  I’d like to solicit criticisms on any of the following issues, or anything else you can think of:

  • Is there any practice which is important to Science which I have not included in the Pillars?  Or which I should have emphasized more?
  • Is there anything which I’ve said is important for Science, which actually isn’t?  Are there branches of Science which do without any of these things?
  • My perspective is that of a physicist who works on fundamental issues.  But there are lots of other scientific fields: Biology, Geology, Chemistry, etc.  Do you think someone from these fields might have prioritized different aspects of scientific practice than I did?