A**N
Explains why Bayesian reasoning works best for everyday statistical reasoning
In his new book, Bernoulli's Fallacy, author Aubrey Clayton explains clearly and in great detail why the so-called 'frequentist school' of statistical research and accounting embodies a fatal omission, a flaw in reasoning that obfuscates and confuses both scientific researchers and ordinary people in matters involving social science, economics, law, or psychology, in part because researchers often have hidden agendas that bias data selection, characterization, and analysis. It's a bit complicated to describe in a short review, but the bottom line concerns one of the cardinal tenets of scientific research: that experiments must be falsifiable and that their products and conclusions must be replicable, meaning that two or more different researchers can develop data sets from a specified population, apply the same criteria for statistical analysis, and generally come up with the same answers and conclusions. Clayton cites empirical studies of research papers published within the past twenty years showing that the claims and results of more than fifty percent of certain categories of statistical research, much of it involving human behavior studies, cannot be replicated when the experiment is repeated by another researcher attempting to achieve the same result. This has serious implications for follow-on research studies, because the data and the methodology become unreliable. More importantly, one of the fundamental assumptions of the so-called 'frequentists' is that outliers should be excluded from the populations of entities being studied. In plain language, this offers a perfect opportunity to 'cook the books', biasing a study that has been advertised as scrupulously neutral in its data selection.

Evidence of this invidious bias is found in the career histories of the founders of modern statistics, Francis Galton, Karl Pearson, and Ronald Fisher, all Britons, who individually and collectively codified and professionalized the modern discipline of statistics. These men, along with many others, were avid proponents of a pseudoscience known as eugenics, the idea that certain desirable social qualities (generally found in well-off, well-educated, middle-class people) were heritable, meaning that those traits could be passed on to their offspring. This was in the century that preceded the study of human cell biology and the double helix. What the world got was bigotry dressed up as hard science; and Galton, Pearson, and Fisher worked their will to ensure that their views of human heredity were integral to their conception of statistical science and Gaussian probability. Distasteful as these views are, at one time they were embedded in law and public policy both in Great Britain and the Commonwealth countries, and in the United States. Garden-variety racism gained a whole new aura of academic respectability from the 1890s through the 1930s, until 'scientific racism' in Germany saw its ultimate denouement in the death camps. In the United States, many states had laws permitting forced sterilization of people who were believed to be feebleminded. This is what happens when a system of accounting is allowed to exist with little consideration given to the collateral consequences of what amounts to circular reasoning and self-fulfilling prophecy.
A test for statistical significance, specifically the eponymous 'null hypothesis significance testing', appears to have been one of those things that Pearson and Fisher cooked up to burnish their reputations for mathematical innovation; most practitioners, knowing little about it and not thinking through its implications in any detail, simply accepted the procedure as long as it did not impede getting their research published. Problems arose in the form of uncontrollable false positives, largely attributable to base rate neglect, a cognitive bias that comes from failing to compare the outcome of a test or experiment with the relative frequency of such results in the outside world. Daniel Kahneman, in his magisterial book Thinking, Fast and Slow, referred to base rate neglect in his description of what he called the 'Linda problem' (too much information about a hypothetical person he referred to as Linda, biasing the reader's answer to a question about Linda's current employment status). It's also at the root of the famously controversial 'Monty Hall problem' from Let's Make A Deal, another example used by Kahneman to illustrate that people fail to think through the hypothetical situations they are presented with, because their assumptions are anchored so deeply in the story line and not in the real world.

The one thing that Galton, Pearson, and Fisher insisted upon was that the prior (a priori) probability of any theory being tested be isolated from prior human experience, and that researchers not use their own common sense and prior knowledge to frame the subject of inquiry. The idea was that the researcher proceed from a posture of either complete ignorance or indifference to extrinsic knowledge about the proposition being tested. The term 'statistical significance' arose and came into wide use without anyone ever questioning whether any measure of statistical coherence actually measured anything meaningful.

Clayton advocates the notion that everything has a history, and not everything is immediately countable. The notion that social scientists are at an implicit disadvantage in academic research compared with those who do physics and chemistry gained traction in academic circles, leading to absurd efforts to mathematize research in political science, sociology, economics, and so on, because that was the only way to get one's work published in a reputable academic journal and thus ascend the rungs of the academic career ladder. Learned journals and textbooks now find themselves burdened with published articles and papers whose provenance is suspect, either because the research findings they report (more than 50 percent, in some categories) cannot be replicated in subsequently conducted experiments, or because the studies themselves were, as Clayton describes them, Type III errors, where an observed phenomenon was real in a statistical sense but did not actually support the scientific theory it was supposed to, or was something so idiosyncratic to the experiment that it yielded data of no use to anyone else. The problem with base rate neglect is that it tends to prompt people to jump to conclusions (sometimes referred to as the availability bias), and that can lead to catastrophic consequences when it happens in the context of a criminal prosecution.
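To make the base-rate point concrete, here is a minimal sketch of how 'statistically significant' results can still be false positives far more often than the 5% threshold suggests. The numbers (a 10% base rate of true hypotheses, 80% power) are illustrative assumptions, not figures from Clayton's book:

```python
# Illustrative assumptions, not figures from the book:
base_rate = 0.10   # fraction of tested hypotheses that are actually true
power     = 0.80   # P(significant result | hypothesis is true)
alpha     = 0.05   # P(significant result | hypothesis is false)

# Bayes' rule: P(hypothesis true | significant result)
p_significant    = power * base_rate + alpha * (1 - base_rate)
p_true_given_sig = (power * base_rate) / p_significant

print(f"P(significant result)              = {p_significant:.3f}")
print(f"P(hypothesis true | significant)   = {p_true_given_sig:.2f}")
# Roughly 0.64: about a third of 'significant' findings are false positives
# under these assumptions, despite the nominal 5% error rate.
```

The point of the sketch is only that the answer depends on the base rate, which is exactly the quantity null hypothesis significance testing leaves out.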
The prosecutor's fallacy occurs when law enforcement uses statistical infrequency (rarity) to argue that, because one or more infrequent observations coincide in the accused's behavior or in the circumstances of a death, the logical conclusion had to be robbery or murder; the confluence of the accused's characteristics is said to compel no other conclusion. A retrospective look at the facts of the cases Clayton describes indicates that the prosecutors took everything at face value, without probing deeper into whether significant facts had been overlooked, because a statistician told them the confluence of observations was too rare to be ignored; yet when matched against the relative likelihoods established by the base rate, those observations turn out to be far less conclusive (a rough numeric sketch follows below). Sticking with just the data you have generates false positives with marked frequency. In other words, investigators need to do a better job, which is hard in a world influenced by the ease with which 'Law and Order' closes out its cases within the space of an hour, commercial breaks included.

This is an important book. Statistics is not an easy subject to learn, and making sense of what comes out can be even more difficult. I like the idea of teaching Bayesian reasoning as a fundamental attribute of someone who has learned something in school that is intrinsically valuable. The world of Gaussian statistics and bell curves has much less applicability in real life. Learning to think probabilistically goes against the grain of wanting to know something with absolute accuracy; but that is a comfortable illusion fostered by our general unwillingness to go beneath the surface of what we're seeing out in the world. Thinking is hard work, and it's hard to get out of our comfort zone.

I commend Aubrey Clayton for writing a highly readable, technically accurate book about an important skill that I am still developing.
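The sketch referred to above: a hypothetical prosecutor's-fallacy calculation. The figures (a 1-in-a-million profile match, a candidate population of 10 million) are assumptions for illustration, not taken from the book or from any case Clayton discusses:

```python
# Hypothetical figures for illustration only (not from the book or any real case):
match_rate = 1 / 1_000_000       # frequency of the "rare" combination of traits
population = 10_000_000          # people who could plausibly have committed the act

expected_innocent_matches = match_rate * (population - 1)   # ~10 innocent look-alikes
p_guilt_given_match = 1 / (1 + expected_innocent_matches)   # the culprit plus the look-alikes

print(f"Expected innocent people who also match: {expected_innocent_matches:.0f}")
print(f"P(defendant is the culprit | match alone) = {p_guilt_given_match:.2f}")
# The 1-in-a-million rarity is not the probability of innocence; conditioned on
# the base rate, the match alone singles out roughly one person in eleven.
```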
J**N
A dissent from rote statistical methodology
I learned probability and statistics during my engineering education using the frequentist techniques discussed in this book, but I was somewhat uncomfortable with how the procedures were applied.

This book addressed many of the sources of my discomfort. I found the background history of Galton, Pearson, Neyman, and Fisher especially interesting.

I wish there had been more on topics related to computer-based techniques such as re-sampling, bootstrapping, importance sampling, etc.

Overall, this was a very worthwhile read.
M**M
Well-written argument against classical frequentist statistics
The "Fallacy" in the title is this: observed data can be used to judge a hypothesis based solely on how likely or unlikely the data would be if the hypothesis were true. The author, Aubrey Clayton, calls it Bernoulli's Fallacy because Jacob Bernoulli's Ars Conjectandi is devoted to determining how likely or unlikely an observation is given that a hypothesis is true. What we need is, not the probability of the data given the hypothesis, but the probability of the hypothesis given the data.In the preface, Clayton describes the Bayesian vs Frequentist schism as a "dispute about the nature and origins of probability: whether it comes from 'outside us' in the form of uncontrollable random noise in observations, or 'inside us' as our uncertainty given limited information on the state of the world." Like Clayton, I am a fan of E.T. Jaynes's "Probability Theory: The Logic of Science", which presents the argument (proof really) that probability is a number representing a proposition's plausibility based on background information -- a number which can be updated based on new observations. So, I am a member of the choir to which Clayton is preaching.And he is preaching. This is one long argument against classical frequentist statistics. But Clayton never implies that frequentists dispute the validity of the formula universally known as "Bayes's Rule". (By the way, Bayes never wrote the actual formula.) Disputing the validity of Bayes's Rule would be like disputing the quadratic formula or the Pythagorean Theorem. Some of the objections to Bayes/Price/Laplace are focused on "equal priors", a term which Clayton never uses. Instead, he says "uniform priors", "principle of insufficient reason", or (from J.M.Keynes) "principle of indifference".Clayton is not writing for readers like me. If he were, he would have included more equations and might have left out the tried, true, but familiar Monty Hall Problem, Boy or Girl Paradox, and standard examples of the prosecutor's fallacy (Sally Clark and People v. Collins). But even with these familiar examples, he provides a more nuanced presentation. For example, under certain assumptions about the Monty Hall problem, it does you no good to switch doors. Clayton also provides notes and references, so I can follow threads for more detail.I appreciate that the book is also available in audio. The narrator is fine, but I find that I need the print version too.As someone already interested in probability theory and statistics, I highly recommend this book. I can't say how individuals less into the topic would like it.
W**H
deep but enjoyable
Very much enjoyed this tour de force. I did not try to follow all of the math (I had neither the time nor the energy), but found the overarching message fascinating and ultimately highly persuasive. In my own career I've determined that 89.7% of all statistics are simply made up. This book shows a good part of the reason why.