Philosophy of Science and the Scientific Method

“Science is a way of life. Science is a perspective. Science is the process that takes us from confusion to understanding in a manner that’s precise, predictive and reliable - a transformation, for those lucky enough to experience it, that is empowering and emotional.” – Brian Greene

Ancient science

Modern empirical science is founded on epistemology, the branch of philosophy concerned with the nature and justification of knowledge, the rationality of belief, and scientific theories and prediction. Epistemology asks the question: how do we know what we know?

A helpful illustration of epistemology in context with different areas of philosophy comes from an example of a computer, given by Carneades.

Epistemology: What can we know about the computer? How can we justify our beliefs about this computer? Can the computer have knowledge?
Metaphysics: What kind of thing is a computer? Does it have free will? How does it relate to its parts? Is it necessary or contingent?
Ethics: Is the computer good? How can it be used for good or bad?
Philosophy of Mind: Does the computer have consciousness?
Philosophy of Art: Is the computer beautiful? Is it sublime?

Like any foundational account of complex concepts like knowledge, belief, rationality, and their appropriate justifications, and like any of the fair criticisms of philosophy about terminologies, we ground our approach and understanding with otherwise seemingly mundane concepts, or concepts we take at face value. Much of the information here follows from class notes, Michael Kohlhase’s lecture notes on Logic-Based Natural Language Processing (see here), the ever-verbose Stanford Encyclopdia of Philosophy (see here), and the Integration and Implementation Insights post on epistemoglogy (see here). This post builds to a formalism of scientific theory and the scientific method.

Propositions, Observations, and Experiment

We express belief, knowledge about the world, attitudes, and phenomena using language. That language, however, is otherwise agnostic to how the world, the universe, and everything else operates; the universe does not much care what we have to say about it. Truth, then, is only accessible by observation, which itself can be fallible. Yet, these are the tools with which we have to work, so we begin with a series of working definitions.

A proposition φ is a sentence about the actual world or a possible world; the meaning of a proposition φ can have a truth value (true or false) in the actual world or in a possible world.
A belief is a proposition φ that an agent a holds true about a possible world, regardless of its truth value in the actual world.
Knowledge is justified, true belief; that is, knowledge is a proposition φ that an agent a holds true about a possible world and is true in the actual world.

Suppose we have an agent a that believes φ is true in a possible world w. The truth value of a proposition φ in w can be determined by observation. Our agent a experiences that φ is true in w or conducts a deliberate, systematic experiment that determines φ to be true in w. By this observation or experiment, a no longer just believes that φ is true, but also has knowledge that φ is true.

We enumerate this as the following:

An agent a belives φ to be true in w
An agent a determines the truth value of φ by observation or experiment
An agent a has knowledge that φ is true if the truth value of φ by observation or experiment is true

Reproducibility, Phenomena, and Counterexamples

This is all well and great, except for what was warned above: observation can (and often is) fallible. One observation by one observer is hardly evidence, and credibility of such an observation is often taken skeptically. Indeed, our agent may well observe some φ to be true, when it actuially is false, or false when it is actually true!

The solution invokes the principles of probability: repeat the observations enough times such that we increase the probability of true observation. In other words, collect data. This gives us two new definitions:

An observation of a proposition φ is reproducible iff φ can be observed by different observers in different situations or contexts.
A phenomenon is a proposition φ that is reproducibly observable to be true in a class of worlds.

We add the following to our enumeration:

An agent b, an agent c, and agent d belives φ to be true in w
An agent b, an agent c, and agent d determine the truth value of φ by observation or experiment
An agent b, an agent c, and agent d have knowledge that φ is true if the truth value of φ by observation or experiment is true
Because agent a, agent b, agent c, and agent d have observed φ in different situations or contexts, φ is reproducible

Our agent a, agent b, agent c, and agent d would like to verify whether the φ is a phenomenon by observing it in a class of worlds, or even all worlds. However, there may be too many such worlds or the worlds may be too large to make this a practical exercise. Our agents have a secret weapon: a counterexample.

A counterexample to a proposition φ is a particular world or worlds w such that φ is observably false in that world or worlds w.

A counterexample alleviates the labor for observing a proposition φ in all and every world w. Instead, the absence of counterexamples is the best our agents can hope for in general for accepting phenomena. We continue our enumeration.

agent a, agent b, agent c, and agent d observe φ to be true in a class of worlds
φ is a phenomenon because it is reproducibly observable to be true in a class of worlds, and no counterexamples are found; if a counter example is found in w, then φ is not a phenomenon

It is impossible for our agents, however, to know all phenomena; our agents can only retain a (small) subset of known propositions from which all phenomena can be derived. The (small) subset of propositions from which the phenomena that are relevant to an agent can be derived will become the beliefs of the agent. An agent will attempt to justify these beliefs, and hence, knowledge.

Explanations and Hypotheses

Suppose now that our agents not only observe that φ is a phenomenon, but also there is another observation ψ which seems to be true whenever φ is true.

A proposition ψ follows from a proposition φ, iff ψ is true in any world where φ is true.

Now that out agents have this observation, we want them related.

An explanation E of a phenomenon φ is a set of propositions, such that φ follows from E.

Adding to our enumeration:

agent a, agent b, agent c, and agent d observe that whenever φ is true, ψ is true; ψ follows from φ
φ is a phenomenon following from a set of propositions, for which our agents now have an explanation

We prefer explanations Φ that explain more than just φ (that is, we prefer to have more evidence). Recall, however, that it is not a practical exercise to exhaust all possible propostions or worlds. We extend this notion to explanations in the following way:

A proposition is falsifiable iff counterexamples are possible and the observation of a reproducible series of counterexamples is practically feasible (to give a fair chance at falsifiability).
A hypothesis h is a proposed explanation of a phenomenon that is falsifiable.

The absence of counterexampels is the best we have without needing to explore every world and state. Indeed, we insist falfsifiability because we cannot hope to otherwise verify the hypothesis (this get even more outrageous when we have a set of hyopethesis, the building blocks of a scientific theory). This introduces a useful limiter: if finding counterexamples is impossible, then the hypothesis is not worth pursuing. The strategy follows: simply accumulate propositions to represent knowledge abouyt the world.

Because our agents have an explanation E of a phenomenon φ such that each element of E is falsifiable, our agents have a hypothesis

Scientific Theories

Falsifiability is the key here, not least of all because of the availability of resources, but also to work towards a coherent and verifiable system of explanations and predictions. Indeed, the strategy follows: collect hypotheses, drop any with counterexamples, and drop any explained themselves (trivial hypothesis are not useful tools for explanation or prediction). These culminate into a scientific theory:

A hypothesis h can be tested in a world w by observing the value of h in w.
If the value of h in w is true, then the observation supports h; if it is false, then the observation falsifies h.
A scientific theory for a set Φ of phenomena is a set H of hypotheses that:
- has been tested extensively and rigorously without finding counterexamples.
- is minimal: no subset of H explains Φ.
Any proposition φ that follows from a theory Φ is a prediction of Φ.

It is remarkably efficient to falsify a theory Φ; to do so, it is sufficient to falsify any prediction because any observation of a prediction φ of Φ supports Φ. We may now conclude our enumeration, then, as the following:

Our agents repeatedly observe the values of h in w to be true, so those values support h
Armed with multiple such hypotheses, our agents continue to test and do not find counterexamples
Our agents define their set of phenomena to be a scientific theory, from which they can make predictions

Indeed, the epistemological approach described here is the predominant approach and practice in modern science.

The Scientific Method

Now armed with the formal empirical process, we may outline the approach that our agents take into a coherent and consistent script: the scientific method. The scientific method is a deliberately structured approach used to investigate and acquire new knowledge or refine existing understanding in the scientific fields. Crucially, it is designed to be a clear pathway for error checking and ensuring the validity and reliability of results.

Observation and Research
Hypothesis
Experimentation
Analysis
Conclusion
Publication
Replication
Formalization

1. Observation and Research

The world happens; an agent observes the world and investigates. Observation and research are elemental in initiating any scientific inquiry; they set the stage for the entire method. By (relatively) simple observation, our agent can identify specific problems, gaps in the literature, and recognize patterns (or lack of) in data.

This initial phase of the method assists in identifying variables that might influence the outcome of the study. Simply, data must be collected, which serves in turn as a reference point for identifying patterns or discrepancies. Science is not isolation (indeed, collect any number of scientists in the same field and observe what follows!). The existing literature and studies are vast; scientists build upon the current state of knowledge in the field such that they avoid duplication of efforts and invite a cumulative approach to knowledge acquisition.

Inductive reasoning, at this juncture, smooths the transition from observation and research to hypothesis formulation; namely, it operates on the principle of making generalizations based on specific, observed instances. Indeed, our agent uses specific observations to make broader generalizations. As a treat, sometimes inductive reasoning at this stage can help in developing preliminary theories based on the observed patterns, which are to be tested in the later stages of the method.

Essentially, the observation and research phase of the method is a confluence of meticulous data collection, critical analysis, and the application of inductive reasoning.

2. Hypothesis

Recall: a hypothesis is a proposed explanation of a phenomenon that is falsifiable; formulating such hypotheses is a critical cornerstone of the method, linking preliminary work (our agent’s observations and research) with experimental verification. The hypothesis is the predictive tool, the chief ingredient of theory, anticipating potential outcomes based on the observations and research, and sets the boundaries for what to expect.

The hypothesis gives direction to the study. It: serves as a guide; determines the variables to be measured; determines the kind of experiments to be conducted; mitigates biases that might otherwise influence the outcomes. Recall that the real world is limited by practical limitations; the focused approach of the hypothesis saves time and resources, directing our agent towards obtaining specific, relevant data.

Scientific communication is critical, both within the scientific community and outside of it; a clearly formulated (and mostly minimal) hypothesis is the vehicle for communication, and (succinctly) conveys the focus and anticipated outcomes of the study. Furthermore, a well-formulated hypothesis presents testable claims that can be empirically tested, which is essentially acting as a built-in rigorous evaluation procedure while sensitive to falsifiability.

3. Experimentation

Now, our agent primes their laboratory, installs their favorite Linux distro, and proceeds to test their hypothesis. Experimentation is the very platform and playground for testing the hypotheses, to be either verified for falsified.

The world is a messy place, so a deliberate, controlled environment in which to experiment essentially isolates variablesuch that our agent can focus on specific phenomona in question and mitigate the interference and influence of external variables. Additionally, the data collection from the first step of the method continues here! A wealth of data follows from systematic experiments. This last point, systematic, encourages replicability necessary for validation (and encourages collaboration!).

Post-experimentation, our agent intends to draw conclusions based on the data collected. Armed with such data, ourt agent can determine whether the hypotheses hold true or need to be revised. Our agent should not miss the forest for the trees; in the long term, the intent of the experimentation phase is to contribute to the development of theories. Consistent experimental results can lead to the formulation of such theories.

4. Analysis

Raw information is marginally useful. Interpretation of data, of the results of the experiments, of the changing hypotheses constitute analysis. Analysis is the linchpin in the transition from data collection to drawing informed conclusions. Simply put, the intention of analysis is interpretation, to critically evaluate and scrutinize the data collected during the experimentation phase, identifying meaningful insights from noise or anomalies, and extrapolating relationships between variables.

Here, our agent studies the results of the experimentation, where the hypotheses are either validated or refuted based on empirical evidence by delineating the accuracy of the hypotheses. Pattern recognition is exceptionally useful here (by human observation, mathematical induction/deduction, computer-assisted analysis, etc.), tempered by rigorous analysis to identify and correct any potential errors and biases that might have infected the experimentation. Corrections of these kinds are vital to maintain the healthy integrity and reliability of the study. Analysis is the informed part of informed conclusions, to which our agent is building.

Often, the analysis phase can lead to the formulation of new questions or hypotheses due to unexpected patterns or trends, which naturally invites further inquiry and research, and thus perpetuates scientific inquiry.

5. Conclusion

Drawing a conclusion is a climcactic stage in the method; it synthesizes all previous steps into a coherent and rational finding, a culmination fo the reseach process. Indeed, the final statement(s) establish(es) whether the initial hypothesis stands validated or refuted based on the empirical data and analysis.

Conclusions often serve as a starting point for future research (in the mythical land of the future where all our work we saved is done). Based on the findings, other agents can propose new approaches for exploration, and continue the cycle of inquiry and knowledge expansion. To that end, ehe conclusion facilitates communication with the broader scientific community (which, in turn, is to be transalted and shared with the wider community as well). Here, our agent succinctly conveys their findings, promoting discussion, critique, and collaboration.

The conclusion is intended to contribute to the existing body of knowledge in the field, integrates the new findings with existing theories.

6. Publication

Here begins the denouement.

Science, as stated above, is not an isolated endeavor. Good science is open, accessible, and relies on communication for progress. Publication (and conference appearance) is the real setting for scientific progression, and does so with transparency, collaboration, and public scrutiny; indeed, the primary intent is to disseminate the research findings, which invites a wide range of experts and scholars to access and evaluate the research.

The publication is a peer-reviewed process, which is vital for maintaining the quality and integrity of scientific research and promotes transparency and good practice. Peer reviews help to identify potential flaws or biases, and creates a formal record chronicling the progress and developments in the field. By sharing their findings, our agent contributes to a collective of knowledge, and so promotes a collaborative approach where ideas can be exchanged, critiqued, and appropriately adapted.

It should not go unmentioned that publishing is also necessary for the reputation of our agent, establishing the credentials of the agent(s) involved by allowing them to showcase their contributions to the field, which in turn fosters a culture of recognition and academic progression.

7. Replication

Replication fortifies the veracity and reliability of scientific findings; it is here that that research undergoes further testing to affirm its credibility and verify the robustness of the original findings.

Through replication, the reliability of research findings is strengthened; indeed, by demonstrating that the results can be consistently reproduced under the same conditions contributes to the credibility of the concluisions. As an interative process, replication helps in identifying and rectifying errors or anomalies that might have been present in the original study that otherwise were missed or were unable to effectively be removed.

Abstrasction and generalizablity are important goals of the method, not least of all to discover broader applicability and relevance; by replicating studies under varied conditions or with different samples, our agent might well establish the generalizability of the results. Finally, replication promotes scientific rigor and collaboration by encouraging our agent’s colleagues to conduct similar studies that withstand repeated scrutiny.

8. Formalization

Recall: a scientific theory for a set of phenomena is itself a set of hypotheses that have been tested extensively and rigorously without finding counterexamples and is minimal. Scientific theory formation stands as a pinnacle in the realm of scientific discovery, encapsulating substantial verified knowledge into a cohesive explanation of some set of phenomena. The chief ingredients are those hypotheses that have been tested extensively and rigorously without finding counterexamples and is minimal.

A theory is highly collaborative (though singular person or multiple persons might lend their repsective names to it). Theory formation is the consolidation of verified knowledge that has withstood repeated testing and scrutiny, and represents the culmination of numerous research efforts as a comprehensive framework that explains some aspect of the natural observable world.

Theories hold predictive and explanatory power; they not only account for known phenomena but also enable predictions about outcomes under different circumstances, which again naturally invites further research. Excitedly, once established, theories can often intregrate across different fields of science, where insights from one field can do likewise in others.

Theory is instrumental in promoting (and inspiring) intellectual progress and innovation.

Thus concludes the sceintific method; now begin again!

“The theory of relativity is a beautiful example of the basic character of the modern development of theory. That is to say, the hypotheses from which one starts become ever more abstract and more remote from experience. But in return one comes closer to the preeminent goal of science, that of encompassing a maximum of empirical contents through logical deduction with a minimum of hypotheses or axioms. The intellectual path from the axioms to the empirical contents or to the testable consequences becomes, thereby, ever longer and more subtle. The theoretician is forced, ever more, to allow himself to be directed by purely mathematical, formal points of view in the search for theories, because the physical experience of the experimenter is not capable of leading us up to the regions of the highest abstraction. Tentative deduction takes the place of the predominantly inductive methods appropriate to the youthful state of science. Such a theoretical structure must be quite thoroughly elaborated in order for it to lead to consequences that can be compared with experience. It is certainly the case that here, as well, the empirical fact is the all-powerful judge. But its judgment can be handed down only on the basis of great and difficult intellectual effort that first bridges the wide space between the axioms and the testable consequences. The theorist must accomplish this Herculean task with the clear understanding that this effort may only be destined to prepare the way for a death sentence for his theory. One should not reproach the theorist who undertakes such a task by calling him a fantast; instead, one must allow him his fantasizing, since for him there is no other way to his goal whatsoever. Indeed, it is no planless fantasizing, but rather a search for the logically simplest possibilities and their consequences.” – Albert Einstein

Written on August 31, 2023