Different levels of proof
There are many different levels of proof. Some people tend to regard everything as a proof, others nothing. Having a good ability to decide what constitutes a good proof is a clear sign of high intelligence. A person who doesn't know anything about scientific proofs is not able to see through pseudo-scientific proofs and might therefore walk around believing things which are not true. One measure of intelligence is the correspondence of a person's models of reality with reality itself.
Elements of a scientific method
A scientific method depends upon a careful characterization of the subject of the investigation. (The subject can also be called the problem or the unknown.) For example, Benjamin Franklin correctly characterized St. Elmo's fire as electrical in nature, but it took a long series of experiments and theoretical work to establish this. While seeking the pertinent properties of the subject, this careful thought may also entail some definitions and observations; the observation often demands careful measurement and/or counting.
The systematic, careful collection of measurements or counts of relevant quantities is often the critical difference between pseudo-sciences, such as alchemy, and a science, such as chemistry. Scientific measurements taken are usually tabulated, graphed, or mapped, and statistical manipulations, such as correlation and regression, performed on them. The measurements might be made in a controlled setting, such as a laboratory, or made on more or less inaccessible or unmanipulatable objects such as stars or human populations. The measurements often require specialized scientific instruments such as thermometers, spectroscopes, or voltmeters, and the progress of a scientific field is usually intimately tied to their invention and development.
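As a rough illustration of the statistical manipulations mentioned above, the following Python sketch computes a Pearson correlation coefficient and a least-squares regression line for a small table of paired measurements. The function names and the sample readings are invented for this example and are not taken from any real experiment.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between paired measurements."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sum((x - mx) ** 2 for x in xs)
    spread_y = sum((y - my) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def least_squares_line(xs, ys):
    """Slope and intercept of the line minimizing squared vertical error."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical tabulated readings: temperature (deg C) and rod length (mm).
temperatures = [10.0, 20.0, 30.0, 40.0]
lengths = [100.11, 100.19, 100.32, 100.40]

r = pearson_correlation(temperatures, lengths)
slope, intercept = least_squares_line(temperatures, lengths)
print(f"r = {r:.4f}, length = {slope:.4f} * T + {intercept:.2f}")
```

A correlation close to 1 only says that the tabulated quantities move together; it does not by itself establish a causal explanation.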
Measurements demand the use of operational definitions of relevant quantities. That is, a scientific quantity is described or defined by how it is measured, as opposed to some more vague, inexact or "idealized" definition. For example, electrical current, measured in amperes, may be operationally defined in terms of the mass of silver deposited in a certain time on an electrode in an electrochemical device that is described in some detail. The operational definition of a thing often relies on comparisons with standards: the operational definition of "mass" ultimately relies on the use of an artifact, such as a certain kilogram of platinum kept in a laboratory in France.
The scientific definition of a term sometimes differs substantially from its natural language usage. For example, mass and weight are often used interchangeably in common discourse, but have distinct meanings in physics. Scientific quantities are often characterized by their units of measure, which can later be described in terms of conventional physical units when communicating the work.
Measurements in scientific work are also usually accompanied by estimates of their uncertainty. The uncertainty is often estimated by making repeated measurements of the desired quantity. Uncertainties may also be calculated by consideration of the uncertainties of the individual underlying quantities that are used. Counts of things, such as the number of people in a nation at a particular time, may also have an uncertainty due to limitations of the method used. Counts may only represent a sample of desired quantities, with an uncertainty that depends upon the sampling method used and the number of samples taken.
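The two ways of estimating uncertainty described above can be sketched in Python. The first function estimates the uncertainty of a mean from repeated readings (the standard error); the second combines the uncertainties of two underlying quantities into the uncertainty of their product. The second assumes the two quantities are independent and their relative uncertainties are small, which is the usual first-order approximation. The function names are invented for this sketch.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_standard_error(readings):
    """Best estimate and its uncertainty from repeated measurements."""
    estimate = mean(readings)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    uncertainty = stdev(readings) / sqrt(len(readings))
    return estimate, uncertainty


def product_with_uncertainty(a, sigma_a, b, sigma_b):
    """Value and uncertainty of a * b, assuming a and b are independent.

    Relative uncertainties are added in quadrature (first-order propagation).
    """
    value = a * b
    relative = sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return value, abs(value) * relative


# Hypothetical repeated readings of a length in metres.
length, length_err = mean_with_standard_error([9.8, 10.0, 10.2])
print(f"length = {length:.2f} +/- {length_err:.2f} m")

# Area of a rectangle from two measured sides with known uncertainties.
area, area_err = product_with_uncertainty(2.0, 0.1, 3.0, 0.3)
print(f"area = {area:.2f} +/- {area_err:.2f} m^2")
```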
Observations have the general form of existential statements, stating that some particular instance of the phenomenon being studied has some characteristic. Causal explanations have the general form of universal statements, stating that every instance of the phenomenon has a particular characteristic. It is not deductively valid to infer a universal statement from any series of particular observations. This is the problem of induction. Many solutions to this problem have been suggested, including falsifiability and Bayesian inference.
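To make the Bayesian suggestion a little more concrete, here is a minimal sketch of Bayes' rule applied to a universal statement. It assumes only two rival hypotheses, "every instance has the characteristic" and "90% of instances have it"; both the rival hypothesis and the starting probability of 0.5 are assumptions chosen for illustration.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability of a hypothesis after one observation.

    prior               -- probability of the hypothesis before the observation
    likelihood_if_true  -- probability of the observation if it is true
    likelihood_if_false -- probability of the observation if it is false
    """
    evidence = (likelihood_if_true * prior
                + likelihood_if_false * (1 - prior))
    return likelihood_if_true * prior / evidence


# Start undecided between "all instances have the characteristic"
# and the assumed rival "90% of instances have it".
belief = 0.5
for _ in range(20):
    # Each confirming instance is certain under the universal statement
    # and has probability 0.9 under the rival.
    belief = bayes_update(belief, 1.0, 0.9)
print(f"after 20 confirming observations: {belief:.3f}")

# A single counter-instance is impossible under the universal statement.
belief = bayes_update(belief, 0.0, 0.1)
print(f"after one counter-instance: {belief:.3f}")
```

Confirming observations raise the probability of the universal statement without ever making it certain, while one counter-instance reduces it to zero. This mirrors the asymmetry that falsifiability emphasizes.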
Scientists use whatever they can — their own creativity, ideas from other fields, induction, systematic guessing, etc. — to imagine possible explanations for a phenomenon under study. There are no definitive guidelines for the production of new hypotheses. The history of science is filled with stories of scientists claiming a "flash of inspiration", or a hunch, which then motivated them to look for evidence to support or refute their idea. Michael Polanyi made such creativity the centrepiece of his discussion of methodology.
Prediction from the hypothesis
A useful hypothesis will enable predictions, by deductive reasoning, that can be experimentally assessed. If results contradict the predictions, then the hypothesis under test is incorrect or incomplete and requires either revision or abandonment. If results confirm the predictions, then the hypothesis might be correct but is still subject to further testing.
Einstein's theory of General Relativity makes several specific predictions about the observable structure of space-time, such as a prediction that light bends in a gravitational field and that the amount of bending depends in a precise way on the strength of that gravitational field. Arthur Eddington's observations made during a 1919 solar eclipse supported General Relativity rather than Newtonian gravitation.
Predictions refer to experiment designs with a currently unknown outcome; the classic example was Edmund Halley's prediction of the year of return of Halley's comet which returned after his death. A prediction (of an unknown) differs from a consequence (which can already be known).
Once a prediction is made, an experiment is designed to test it. The experiment may seek either confirmation or falsification of the hypothesis. Yet an experiment is not an absolute requirement. In observation-based fields of science, actual experiments must be designed differently than for the classical laboratory-based sciences; for example, the observations of the Chaldeans were utilized in the work of al-Battani, when he determined a value for the precession of the Earth, in work that spanned thousands of years.
Scientists assume an attitude of openness and accountability on the part of those conducting an experiment. Detailed recordkeeping is essential, to aid in recording and reporting on the experimental results, and to provide evidence of the effectiveness and integrity of the procedure. Such records also assist in reproducing the experimental results. This tradition can be seen in the work of Hipparchus (190 BCE - 120 BCE), when determining a value for the precession of the Earth over 2100 years ago, and 1000 years before al-Battani.
The experiment's integrity should be ascertained by the introduction of a control. Two virtually identical experiments are run, in only one of which the factor being tested is varied. This serves to further isolate any causal phenomena. For example, in testing a drug it is important to carefully test that the supposed effect of the drug is produced only by the drug. Doctors may do this with a double-blind study: two virtually identical groups of patients are compared, one of which receives the drug and one of which receives a placebo. Neither the patients nor the doctor knows who is getting the real drug, isolating its effects.
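One way the blinding described above might be organized is sketched below in Python. Patients are randomly split into two groups known only by the neutral labels "A" and "B"; which label means "drug" is kept in a separate key that would stay sealed until the data have been collected. The function name, the labels, and the fixed seed are assumptions made for this illustration, not a description of any real trial protocol.

```python
import random


def blind_assignment(patient_ids, seed=None):
    """Randomly split patients into two coded groups.

    Returns (visible, sealed_key):
      visible    -- patient id -> "A" or "B" (what doctors and patients see)
      sealed_key -- "A"/"B" -> "drug"/"placebo" (opened only after the study)
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2

    # Randomize which label stands for the drug as well.
    labels = ["A", "B"]
    rng.shuffle(labels)
    sealed_key = {labels[0]: "drug", labels[1]: "placebo"}

    visible = {pid: labels[0] for pid in shuffled[:half]}
    visible.update({pid: labels[1] for pid in shuffled[half:]})
    return visible, sealed_key


visible, sealed_key = blind_assignment(range(1, 11), seed=2024)
print(visible)  # the sealed key is deliberately not printed here
```

Random assignment makes the two groups "virtually identical" on average, so that a difference in outcomes can be attributed to the drug rather than to how the groups were chosen.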
Once an experiment is complete, a researcher determines whether the results (or data) gathered are what was predicted. If the experimental conclusions fail to match the predictions or hypothesis, then one returns to the failed hypothesis and repeats the process. If the experiment appears "successful", i.e. fits the hypothesis, then its details are published so that others (in theory) may reproduce the same experimental results.
Evaluation and iteration
Testing and improvement
The scientific process is iterative. At any stage it is possible that some consideration will lead the scientist to repeat an earlier part of the process. Failure to develop an interesting hypothesis may lead a scientist to re-define the subject they are considering. Failure of a hypothesis to produce interesting and testable predictions may lead to reconsideration of the hypothesis or of the definition of the subject. Failure of the experiment to produce interesting results may lead the scientist to reconsider the experimental method, the hypothesis or the definition of the subject.
Science is a social enterprise, and scientific work will become accepted by the community only if it can be verified. Crucially, experimental and theoretical results must be reproduced by others within the science community. Researchers have given their lives for this vision; Georg Wilhelm Richmann was killed by ball lightning in 1753 when attempting to replicate the 1752 kite experiment of Benjamin Franklin.
All scientific knowledge is in a state of flux, for at any time new evidence could be presented that contradicts a long-held hypothesis. A particularly luminous example is the theory of light. Light had long been supposed to be made of particles. Isaac Newton, and before him many of the Classical Greeks, was convinced it was so, but his light-is-particles account was overturned by evidence in favor of a wave theory of light suggested most notably in the early 1800s by Thomas Young, an English physician. Light as waves neatly explained the observed diffraction and interference of light, which the light-as-a-particle theory did not. The wave interpretation of light was widely held to be unassailably correct for most of the 19th century. Around the turn of the century, however, observations were made that a wave theory of light could not explain. This new set of observations could be accounted for by the quantum theory that grew out of Max Planck's work (including Albert Einstein's explanation of the photoelectric effect), but not by a wave theory of light. Nor, for that matter, by the particle theory.
Peer review evaluation
Scientific journals use a process of peer review, in which scientists' manuscripts are submitted by editors of scientific journals to fellow scientists familiar with the field (usually one to three, and usually anonymous) for evaluation. The referees may or may not recommend publication, publication with suggested modifications, or, sometimes, publication in another journal. This serves to keep the scientific literature free of unscientific or crackpot work, helps to cut down on obvious errors, and generally improves the quality of the scientific literature. Work announced in the popular press before going through this process is generally frowned upon. Sometimes peer review inhibits the circulation of unorthodox work, and at other times may be too permissive. The peer review process is not always successful, but has been very widely adopted by the scientific community.
The reproducibility or replication of scientific observations, while usually described as being very important in a scientific method, is in fact seldom reported, and is often not done. Referees and editors generally, and rightfully, reject papers purporting only to reproduce some observations as being unoriginal and not containing anything new. Occasionally reports of a failure to reproduce results are published, mostly in cases where controversy exists or a suspicion of fraud develops. The threat of failure to replicate by others, however, serves as a very effective deterrent for most scientists, who will usually replicate their own data several times before attempting to publish.
Evidence and assumptions
Evidence comes in different forms and quality, mostly due to underlying assumptions. An underlying assumption that 'objects heavier than air fall to the ground when dropped' is not likely to incite much disagreement. An underlying assumption like 'aliens abduct humans', however, is an extraordinary claim which requires solid proof. Many extraordinary claims also do not survive Occam's razor.
Elegance of hypothesis
In evaluating a hypothesis, scientists tend to look for theories that are "elegant" or "beautiful". In contrast to the usual English use of these terms, scientists have more specific meanings in mind. "Elegance" (or "beauty") refers to the ability of a theory to neatly explain as many of the known facts as possible, as simply as possible, or at least in a manner consistent with Occam's Razor while at the same time being aesthetically pleasing.
Everyone has reason to learn what constitutes a scientific proof. Even if you never do scientific work, it will help you to evaluate others' work, and to protect yourself against quackery. Maybe even more importantly, it will enable you to think more clearly in general.
Whenever you hear an advertisement saying a new soap or lotion is scientifically proven to have a positive effect in some sense, statistics have been used (or the advertiser lied about how scientific the proof was). The philosophical ideas behind statistical proofs are these:
- Formulate a hypothesis which can be falsified by experiments (measurement).
- Decide what level of certainty you want. 95% and 99% are common choices.
- Perform experiments that might falsify the hypothesis.
Suppose, for instance, that you want to see if there is any connection between drinking alcohol during pregnancy and the intelligence of the child. Then you might start with the following:
Hypothesis: A mother drinking alcohol during pregnancy does lower the intelligence of her child.
This gives rise to the following anti-hypothesis or null hypothesis:
Null hypothesis: A mother drinking alcohol during pregnancy does not lower the intelligence of her child.
Now we want to be 99% sure of our result. That means the risk of error is 1%. After doing a lot of measurements and putting the measurements through the machinery of statistics, we will be able to conclude either:
- with 1% risk of error we cannot reject the null hypothesis, or
- with 1% risk of error the null hypothesis is rejected in favour of the hypothesis.
If the second is the case, we have 'proven' statistically that drinking alcohol during pregnancy lowers the intelligence of the child. Of course this example is stylized. What do we mean by drinking alcohol? What amount, and how regularly? How do we measure intelligence? Those must also be specified.
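The "machinery of statistics" can take many forms. As one hedged illustration, the Python sketch below uses an exact permutation test, which is only one of several possible tests and is not necessarily what a real study would use. It asks: if the null hypothesis were true and the group labels were meaningless, how often would a random relabelling of the scores give a difference at least as large as the one observed? The scores are invented purely for illustration, and `permutation_p_value` is a name made up for this sketch.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(control, exposed):
    """One-sided exact permutation test.

    Returns the fraction of all possible regroupings of the pooled scores
    in which mean(first group) - mean(second group) is at least as large
    as the observed mean(control) - mean(exposed).
    """
    pooled = list(control) + list(exposed)
    observed = mean(control) - mean(exposed)
    size = len(control)
    at_least_as_extreme = 0
    total = 0
    for chosen in combinations(range(len(pooled)), size):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled))
                   if i not in chosen_set]
        if mean(group_a) - mean(group_b) >= observed:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


# Invented test scores, NOT real data.
control_scores = [105, 110, 108, 112, 107]   # no alcohol during pregnancy
exposed_scores = [95, 99, 101, 97, 100]      # alcohol during pregnancy

alpha = 0.01  # the chosen 1% risk of error
p = permutation_p_value(control_scores, exposed_scores)
if p < alpha:
    print(f"p = {p:.4f}: reject the null hypothesis at the 1% level")
else:
    print(f"p = {p:.4f}: cannot reject the null hypothesis at the 1% level")
```

Even when the null hypothesis is rejected, a test like this says nothing about how the groups were selected; without a controlled design, other differences between the groups could explain the result.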
The Axiomatic Method
The axiomatic method is fundamental in every mathematical theory. A complete theory is built of axioms and implications.* Other names for "axiom" are "premise", "postulate", and "assumption". An axiom is always assumed to be true, without discussion, for the sake of argument. Each time we say 'suppose', we describe an axiom. When a statement undoubtedly (logically) follows from another statement, we have an implication.
Suppose all human beings have pink eyes. Let that be an axiom. Now suppose that Melinda is a human being. (That's another axiom.) Then Melinda has pink eyes. Any other conclusion about Melinda's eye color would be wrong, because the axioms are defined as true (unless one could prove that the two axioms contradict one another, in which case one would have to be discarded, but that's another story).
People have different axioms. Have a look at this belief: "If we don't throw a pancake in the dragon's cave each morning, the sun will not rise." You might say that this isn't very logical. But it indeed is, with the right axioms.
- There is a dragon in the cave.
- The dragon dies if it doesn't eat pancakes.
- The dragon and only the dragon can make the sun rise.
Now, if nobody gives pancakes to the dragon, then the dragon will die (suppose also that the dragon cannot make his own pancakes). But if the dragon is dead, then nobody makes the sun rise! This conclusion is logically derived from the axioms. The point is that even hard-core mystics may use logic in their thinking; only their assumptions are strange.
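The dragon argument can be checked mechanically. The Python sketch below encodes the axioms as simple "if these statements hold, then this statement holds" rules and repeatedly applies them until nothing new follows (a procedure often called forward chaining). The wording of the statements and the names `forward_chain` and `DRAGON_RULES` are choices made for this illustration.

```python
def forward_chain(facts, rules):
    """Derive everything that follows from the facts under the rules.

    rules is a list of (premises, conclusion) pairs; a conclusion is added
    once all of its premises are known.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# The axioms of the pancake belief, written as rules.
DRAGON_RULES = [
    (("there is a dragon in the cave", "nobody gives the dragon pancakes"),
     "the dragon dies"),
    (("the dragon dies",), "nobody makes the sun rise"),
    (("nobody makes the sun rise",), "the sun does not rise"),
]

starting_facts = {
    "there is a dragon in the cave",
    "nobody gives the dragon pancakes",
}
derived = forward_chain(starting_facts, DRAGON_RULES)
print("the sun does not rise" in derived)
```

The program has no opinion on whether the axioms are true; it only confirms that the conclusion follows from them, which is exactly the sense in which the belief is logical.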
Most people don't consciously think about what are axioms and what are implications when they argue with each other. Also, most people would rather die than change any of their axioms of life. People have the strangest axioms like "Different configurations of the stars have different easily detectable effects on human beings". Maybe they have good reasons to have these axioms. However, having these axioms, they think they are thinking rationally, and they are! As long as the implications follow logically from their axioms, they are thinking rationally! At least, according to one definition of "rational". Another definition might be "a theory is rational if it has a good correlation with physical reality". But then many mathematical theories are not rational; for instance, most non-Euclidean geometries (and ordinary Euclidean geometry too, according to Minkowski-Einstein theory!). And we want mathematical theories to be "rational", so the latter was not a good definition.
An intelligent being should be aware of the fact that different people and different cultures have different axioms. It might be a good idea to practice believing in strange things. Be aware of your axioms! Don't believe in them, just regard them as axioms! Change axioms each time you change underwear, if you change your underwear reasonably often.
If you have never heard of Occam's Razor, this is the perfect time to learn what it is. It's a principle which roughly says: If you have to choose between two equally good theories for explaining a phenomenon, choose the one with the smallest number of axioms.
A short repetition:
- Try not to believe in the axioms you use in daily life, just regard them as axioms which could very well be changed. This will help you to understand other people.
- Superstitious people can very well be rational, and they often are. They just have some strange axioms.
- Try to assume as little as possible (that is, use Occam's razor). And be careful not to draw too-bold conclusions. If you pray to god, and immediately your prayers are answered, does this imply that the Koran or the Bible is true? Does this imply that what's preached in your local church is true? Or does it imply that the Hindus are right?
* In mathematical theories, one introduces something called a formal system, which is a little bit more rigorous (for instance the logical rules are not taken for granted). In daily life, however, the axiomatic method is good enough.
Induction or inductive reasoning, sometimes called inductive logic, is the process of reasoning in which the conclusion of an argument is very likely to be true, but not certain, given the premises. It is to ascribe properties or relations to types based on limited observations of particular tokens; or to formulate laws based on limited observations of recurring phenomenal patterns. Induction is used, for example, in using specific propositions such as:
- The ice is cold.
- A billiard ball moves when struck with a cue.
to infer general propositions such as:
- All ice is cold.
- For every action, there is an equal and opposite reaction.
Formal logic as most people learn it is deductive rather than inductive. Some philosophers claim to have created systems of inductive logic, but it is controversial whether a logic of induction is even possible. In contrast to deductive reasoning, conclusions arrived at by inductive reasoning do not necessarily have the same degree of certainty as the initial assumptions. For example, a conclusion that all swans are white is obviously wrong, but may have been thought correct in Europe until the settlement of Australia. Inductive arguments are never binding but they may be cogent. Inductive reasoning is deductively invalid. (An argument in formal logic is valid if and only if it is not possible for the premises of the argument to be true whilst the conclusion is false.)
The classic philosophical treatment of the problem of induction, meaning the search for a justification for inductive reasoning, was by the Scotsman David Hume. Hume highlighted the fact that our everyday reasoning depends on patterns of repeated experience rather than deductively valid arguments. For example we believe that bread will nourish us because it has in the past, but it is at least conceivable that bread in the future will poison us.
Someone who insisted on sound deductive justifications for everything would starve to death, said Hume. Instead of unproductive radical skepticism about everything, he advocated a practical skepticism based on common-sense, where the inevitability of induction is accepted.
Twentieth-century developments have framed the problem of induction very differently. Rather than a choice about what predictions to make about the future, it can be seen as a choice of what concepts to fit to observation (see the entry for grue) or of what graphs to fit to a set of observed data points.
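The curve-fitting version of the problem is easy to demonstrate. In the Python sketch below, two invented functions agree exactly on every observed data point yet disagree about the next one, so the observations alone cannot decide between them. The particular functions and points are arbitrary choices for illustration.

```python
def straight_line(x):
    """A simple candidate law: y = 2x + 1."""
    return 2 * x + 1


def wiggly_curve(x):
    """A rival law that matches the line wherever we have looked so far.

    The extra term (x - 1)(x - 2)(x - 3) is zero at x = 1, 2 and 3,
    so both candidates fit those observations perfectly.
    """
    return 2 * x + 1 + (x - 1) * (x - 2) * (x - 3)


observed_x = [1, 2, 3]
for x in observed_x:
    print(x, straight_line(x), wiggly_curve(x))  # identical on observed data

# The two candidates part ways at the first unobserved point.
print(4, straight_line(4), wiggly_curve(4))
```

Preferring the straight line is a choice that goes beyond the data, much like the preference for simpler theories expressed by Occam's razor.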
Induction is sometimes framed as reasoning about the future from the past, but in its broadest sense it involves reaching conclusions about unobserved things on the basis of what is observed. Inferences about the past from present evidence (e.g. archaeology) count as induction. Induction could also be across space rather than time, e.g. conclusions about the whole universe from what we observe in our galaxy, or decisions about national economic policy based on past economic performance.