When I was a small child I got lost at a store with lots of aisles. I remember being very scared that my mom would never find me. I walked from aisle to aisle on the verge of tears hoping to run into her. Eventually, to my great fortune, I did.
However, it occurs to me that had my mom and I started looking for each other at opposite ends of the store, had our paths inside the store been constantly reflected over its center, and had it been a store with an odd number of evenly-spaced aisles, I would have still been wandering those aisles, just barely missing my mom, for the rest of time.
As eerie as this thought experiment may be, it has some interesting mathematical properties that are fun to consider.
A Simple Model
Imagine two points on a plane that can travel anywhere on the plane at the same speed. Between them lies an obstacle. Perhaps the obstacle is just a static point.
The red points travel. The blue point is a static obstacle.
The motion of the red points is restricted by the condition that at any given moment, the red points are reflections of each other over the blue obstacle point.
If the first red point approaches the obstacle, so does the second one. If the first red point moves away from it, then so does the second one.
We’ll say that the red points can ‘see’ each other if one can draw a line that goes through both red points without being blocked by the obstacle point.
The question is: can the two red points find a position where they can ‘see’ each other?
The answer is predictably no: the points will never be able to ‘see’ each other. The reflected motion condition implies that no matter where the two red points move, the line going through them will also go through the obstacle point. Therefore in every possible configuration, the obstacle will block the red points from directly ‘seeing’ each other.
In every configuration, the line going through the red points will also go through the blue point.
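To see why, put the obstacle at a point C. The reflection condition says the red points P₁ and P₂ always satisfy

$$ P_2 = 2C - P_1 \quad\Longrightarrow\quad C = \frac{P_1 + P_2}{2}, $$

so C is always the midpoint of the segment joining the red points, and any line through both of them must therefore pass through C.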
This principle can be extended to various obstacles and to situations where this reflected motion comes about naturally. Here is a real-world thought experiment with some surprising ramifications:
Imagine two genetically identical twins that have been conditioned in an absolutely identical way. Ask them to close their eyes and lead them to stand on opposite ends of a perfectly symmetrical rotunda with a pillar in the center just big enough to obstruct the twins’ view of each other at every distance.
The twins start off with the same exact view of the pillar and rotunda.
For the sake of this example, the twins cannot communicate in any way. They cannot speak to the other twin or hear the other twin’s footsteps. As the twins open their eyes and are faced with the same exact angular view of the rotunda, they will begin to move around trying to find their twin. But because they are the same in every conceivable way, their motion around the rotunda will be identical, and therefore mutually reflected over the center. If either of the twins decides to speed up, or to abruptly run in the direction opposite to the one they appeared to be heading in, in the hopes of ‘tricking the system’, then as long as the twins are absolutely identical in genetics and conditioning, the other twin will decide to do the same thing. This locks the twins in an endless search for one another in a room that doesn’t have to exceed a few meters in area.
In general, the room does not need to be circular and the obstacle does not need to be a pillar. The two conditions that must be satisfied for this experiment to work are:
Whatever obstacle you choose must cover the center-point of the room and obstruct the twins’ view of each other.
The twins must begin at opposite ends of the room with respect to the obstacle point as mentioned, and their fields of vision must be indistinguishable. Otherwise, at least one of their conditions is not truly identical.
Getting back to the topic of stores and aisles: a store with an even number of evenly-spaced aisles won’t have a center aisle, so the center-point isn’t covered and the twins can find each other:
With no center aisle, the twins walk to a position where they can see each other.
A store with an odd number of aisles will likely have a middle aisle, covering the center-point, and could satisfy the above two conditions, meaning the twins will never find each other:
The middle aisle blocks the center point meaning there is no position where the twins can see each other.
While this situation might seem unlikely and impractical as an experiment, if done correctly, it would allow us to test whether human beings are the ‘sums of our parts’, namely our genes and conditioning, as the determinists claim, or whether there is some third, unpredictable component to our nature that will eventually cause the twins’ motions to diverge.
“How is it possible that mathematics, a product of human thought that is independent of experience, fits so excellently the objects of reality?” — Albert Einstein
In the garden, through a microscope, or deep in the depths of the ocean, patterns are all around us if we know how to look for them. Whether made by humans or by nature, these patterns are orchestrated by mathematics into fluent and alluring results. To look for these patterns you must have a basic knowledge of the fundamentals of mathematics, which is what this list will help you gain:
The Fibonacci Sequence
Mathematicians have been enthralled by the Fibonacci sequence for centuries. Its ubiquity in both nature and man-made structures hints at its significance in both our lives and in the universe.
Here’s a quick refresher on what the Fibonacci sequence is:
The Fibonacci sequence is a series of numbers in which each subsequent number can be found by adding up the two numbers that precede it. It starts like this:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, … and continues on forever.
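If you like to see rules as code, here is a minimal Python sketch of that “add the two numbers that precede it” rule (the function name is my own):

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers: each term is the sum of the two before it."""
    terms = [0, 1]
    while len(terms) < n:
        terms.append(terms[-1] + terms[-2])
    return terms[:n]

print(fibonacci(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```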
The Fibonacci sequence is named after Leonardo Fibonacci, who introduced it to Western European mathematics, and the way he arrived at it was quite peculiar: he calculated the idealized reproduction pattern of a population of rabbits over the course of a year. The chart and diagram below provide an explanation and visual of how the Fibonacci sequence was determined:
Chart and Diagram from Maths Careers
Although its discovery dates back to 1202, the Fibonacci sequence can be found from the microscale to the macroscale today. Not every object in the universe possesses it, but here are some prime examples:
(Left) Spiral Galaxy from itl.cat. (Right) Spiral Shell from Insteading.
Both spirals can be found in nature, and both follow the Fibonacci sequence. The logarithmic spiral, the famous spiral that the Fibonacci sequence emulates, can be seen in these natural spirals that theoretically continue to infinity. An interesting feature of spiral galaxies is that they defy Newtonian physics. According to Newtonian physics, the arms of a rotating galaxy should wind up ever more tightly until the spiral structure disappears (the winding problem). Interestingly, this does not occur: the spiral shape of the galaxy, and the logarithmic spiral it mimics, are preserved.
Fractal Branching
Although fractal branching is not as well known as the Fibonacci sequence, it rivals it in importance and presence in the universe. Fractal branching is a detailed pattern that is repeated over and over again, resulting in a shape that appears rather complex. These patterns look exactly as you would expect, and the most popular examples are tree branches and snowflakes:
(Left) Fractal branching in a snowflake from SnowCrystals. (Right) Fractal branching in tree branches from MSI Chicago.
The basic process of fractal branching can be described as a beginning prototype that is copied and repeated. In the example of a tree, one branch splits into further branches, each of which splits into more branches, as if two new, almost identical, smaller branches emerge from each origin. And so a tree with many branches can also be seen as an assembly of many smaller, similar-looking trees.
Voronoi Pattern
You may be unfamiliar with what a Voronoi pattern is, but I can assure you you’ve seen them everywhere. A Voronoi pattern divides space into regions, each grown outward from a seed point. The expansion of a region ends where it meets its neighbouring expanding regions, so every location within a region is closer to that region’s seed point than to any other seed point.
A Voronoi pattern diagram from Quora.
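If you want to generate one yourself, here is a short Python sketch (my own illustrative construction, using NumPy; the seed count and grid size are arbitrary choices) that builds a Voronoi partition by labelling every pixel with its nearest seed point:

```python
import numpy as np

rng = np.random.default_rng(0)
seeds = rng.random((10, 2))           # ten random seed points in the unit square

xs, ys = np.meshgrid(np.linspace(0, 1, 200), np.linspace(0, 1, 200))
pixels = np.stack([xs, ys], axis=-1)  # shape (200, 200, 2)
dists = np.linalg.norm(pixels[:, :, None, :] - seeds, axis=-1)  # distance to every seed
regions = np.argmin(dists, axis=-1)   # index of the nearest seed for each pixel

# `regions` now partitions the square into Voronoi cells; plotting it with
# matplotlib's imshow reproduces the familiar cell pattern.
```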
Voronoi patterns can be found in everything from giraffe skin to cracked mud to bubbles:
(Left) A Voronoi pattern in giraffe skin from Scientific American. (Right) A Voronoi pattern in cracked mud from MSI Chicago.
In these examples, the Voronoi patterns form tessellations, appearing as connecting pieces of a puzzle. Although they are not labelled, each region has a seed point from which it expands, and borders the regions of neighbouring points.
Patterns are truly found all around us in nature, all attributable to mathematics. One with a limited mathematical background may not think to look for them, but once a fundamental knowledge is gained, it’s hard to not see that they are ubiquitous in the universe.
As Vincent van Gogh once said, “if you truly love nature, you will find beauty everywhere”. If you understand mathematical patterns, this beauty is not hard to find.
On August 2, 2020, Cantor’s Paradise published Bruno Campello’s brief critique of Cantor’s approach to transfinite mathematics. The author raises some doubts about Cantor’s debunking of Euclid’s 5th principle, which states that the whole is greater than the part. Campello gives an interesting argument against Cantor’s reasoning, but it raises a number of question marks of its own, to say the least. Below I comment on his discussion of Cantor’s alleged fallacy.
Georg Cantor Refuted: The celebrated Georg Cantor believed he could refute Euclid’s 5th principle (that the whole is greater than the part)… (medium.com)
To begin with, Campello says that “Cantor and his epigones believed that, along with a principle of ancient geometry, he was also breaking down an established belief of common sense and one of the pillars of classical logic”, but it seems that not only Cantor and his mob believed it — the article comes to a close with these words:
That such crude sophisms could pass as serious threats to the foundations of classical geometry and even to the principles of civilization that we inherited from the Greco-Roman tradition, is only the sign of the impotent revolt of the mathematical imagination exacerbated against the real order of things
Here the author acknowledges the reality of this change. However, regardless of his opinion about this shift, it must be clarified that it was not Cantor who originated the disbelief in common sense in scientific reasoning that “demolished our civilisation”.
In fact, the revolution Campello despises so much took place long before Cantor. It was Newton and Galileo who first decided to entrust scientific explanation to abstract mathematical models, at the price of the intuitive intelligibility that comes from common sense and observation. After them, science strives for a consistent theory of a mere fragment of phenomena, even if the theory contradicts some other aspects of reality (Chomsky, Belletti, Rizzi, 1999, p. 3). Steven Weinberg called this way of doing science the “Galilean style” (Weinberg, 1976, pp. 28–29). It stems from the “recognition that it is the abstract systems that you are constructing that are really the truth; the array of phenomena are some distortion of the truth” (Chomsky, Belletti, Rizzi, 1999). And indeed, today’s science rests securely on the mathematical imagination, rather than on common sense. The latter, it seems, died with the Middle Ages.
So it is not Cantor who is responsible for “crude sophisms” in science, but the very fathers of the modern science. That’s for starters. But let us turn to the heart of Campello’s critique now, and see where he believes Cantor went wrong.
The entire issue here is Cantor’s striking conclusion that if we compare two infinities, namely the infinite set of integers and the infinite set of even numbers, we can put their elements in one-to-one correspondence, hence both sets are equinumerous. This breaks Euclid’s 5th principle, as the evens are a subset of the integers. Such a result was indeed staggering and raised the eyebrows of many leading mathematicians of the era, including Kronecker, Weierstraß, and the proto-intuitionist Poincaré. But now Campello has a decisive argument against it, so we can relax. What is this argument?
Firstly, Campello explains that one can treat infinity as an object subject to arithmetical operations and comparisons only if one takes it to be the actual infinity, which can be thought of as a unity. And this of course depends on one’s philosophical standpoint. If one does not allow such treatment and only understands infinity as potential, that is, something which is never a finished whole but which could always (potentially) be increased, Cantor’s approach will be unjustified from the beginning.
The discussion about the nature of infinity is maybe one of the most absorbing in the philosophy of mathematics, and that is likely because of the lack of decisive results. Any argument against the notion of the actual infinity I have heard of could be parried by the Platonist Cantor. For instance, the Kantian claim that the infinite is never experienced in nature, and hence can only be conceived as a potential entity, meets the answer that the nature of mathematical endeavour exceeds the matter of experience and sensual observation, so its principles go beyond the physical and there is no immediate contradiction in the notion of actual infinity. In fact, the most convincing reservation I have encountered is simply that the concept of the actual infinite goes directly against common sense and is thus unintelligible. But we have seen that arguments “from common sense” do not work here.
Yet Campello does not engage with the fact that the actual/potential issue is a matter of the philosophy of mathematics, and merely refers to Aristotle’s authority. He knows that this argumentum ab auctoritate would not suffice to knock Cantor down, so he goes on with his argument.
It starts with a scrupulous distinction between a sign standing for a number and the number itself. Campello believes that Cantor’s mistake is caused by confusion of these two:
If we represent the whole numbers each by a sign (or cipher), we will have there an (infinite) set of signs or ciphers; and if, in this set, we want to highlight the numbers that represent pairs by special signs or numbers, then we will have a “second” set that will be part of the first; and, both being infinite, the two sets will have the same number of elements, confirming Cantor’s argument. But that is to confuse numbers with their mere signs […].
He continues: “it is not the sign ‘4’ which is twice as much as 2, but the quantity 4, whether it is represented by this sign or by four little balls”. This latter sentence I take to express the essence of Campello’s demonstration. It sounds reasonable. But there is a lot more behind this approach than mere common sense. As a matter of fact, Campello’s reasoning presupposes a particular notion of number, and one not free of philosophical shortcomings.
The concept of number Campello sympathizes with originates from Husserl’s early work in the philosophy of mathematics, Philosophie der Arithmetik. “In the first part of the work, Husserl developed a psychological analysis that started from the everyday concept of a number” (Bedürftig & Murawski, 2018, p. 36) — something Campello is surely fond of for its commonsensicality. In his oeuvre, Husserl builds the general concept of number in a strictly epistemological way: he takes the notion of number to be an abstraction, made by the intellect, of the quantitative features of perceptible phenomena. Firstly, he notes that while perceiving the world, we encounter various appearances. To distinguish them from each other, we use the relations of “difference” and “identity” (Husserl, 2003, p. 52). Our mental faculty allows us to spot that the things we see are not all one, but are different. They differ in terms of their identity: A differs from B, B differs from C, etc. (Husserl, 2003, p. 53). In other words, we have one thing (different from any other), then another, and so on. Then we abstract the one-ness of things into the concept of unity, i.e., the number one. And in order to arrive at bigger numbers, we simply add them together:
“One and one, … , and one — a form with which a definite number term is associated” (Husserl, 2003, p.86).
Edmund Husserl published his “Philosophie der Arithmetik” in 1891. Frege’s review of the book was scornful and forced Husserl to revise his account.
“Unity” is here understood simply as the number one, and this is the “unity” which makes 2 twice as much as 1, as Campello would put it.
And so, he says that the “set of whole numbers may contain more numerical signs than the set of even numbers — since it includes both odd and even signs — but not a greater number of units than that contained in the even series.” His entire analysis depends heavily on this “quantitative” understanding of a number and recognizing units as bricks making up the structure of numbers. Now it is obvious why he accepts only the potential infinite.
Later on in the article, Campello accuses Cantor of metaphysical speculation, but now it seems that it is not only Cantor who is eager to assert positive claims about mathematical reality: the author himself, consciously or not, relies heavily on a particular philosophical interpretation of numbers and infinity. Moreover, his accusation against the refutation of the 5th principle, that “[Cantor’s reasoning] is obviously metaphysical, since it aims not only to imagine a possible space but to describe a property of real space”, makes the whole of mathematical practice a metaphysical endeavor.
But it is even worse than that. Campello openly imposes his philosophical standpoint on Cantor, and then criticizes him for not playing by the rules thus imposed. However, it is the difference in philosophical accounts between Cantor and the author that causes the issue, not the confusion of sign and number. To see that, let us take a brief look at Cantor’s conception of number, one that lets him not only talk about the actual infinite, but also compare it with other objects of the system.
His approach is more abstract from the beginning. The key term in Cantor’s ontology is the set, and his account hinges strongly on it. But it does not need as refined an explanation as the Husserlian stance: a number is defined on the basis of equinumerosity of sets (that idea is actually due to Frege). The general concept of a number is understood as the class of all sets with the same number of elements. What makes this class possible is the property of “having n elements”. This approach does not focus on the quantitative feature; its idea is that as long as some elements can compose a set, it does not matter how many of them there are, even if infinitely many.
What follows from this is that a number for Cantor is not a structure of units, but a complete unity of a set. Crazy as it sounds, being a set is more important than the number of elements a set has. Hence, what is crucial in the number “2” is that it is a class of sets; what is crucial in “4” is also that it is a class of sets. There is nothing twice as numerous in 4 as in 2, because they are simply two different classes.
Ciphers in the black circles stand for the numbers of elements. It does not matter how many one-membered sets are in the class composing the number “1”; what counts is that it’s a complete class of objects.
Understood in this way, numbers do not “differ in size”, as Campello would probably like to put it. They are entities of the same kind, perfectly comparable to one another. We find that on this account, if we look for sets equinumerous with the set of integers, we will come upon the set of even numbers, since it is possible to establish a bijection between them. Thus they are the same kind of infinity. So Cantor’s argument stands (when judged by his standards, as he should have been).
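Explicitly, that bijection can be written out:

$$ f \colon \mathbb{Z} \to 2\mathbb{Z}, \qquad f(n) = 2n, $$

which pairs every integer with exactly one even number and vice versa, leaving nothing unmatched on either side.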
If Campello wanted to attack Cantor’s idea effectively, he should have gone for its root, the meta-mathematical assumption Cantor makes for his transfinite arithmetic. Instead, he chose to aim at a problem that arises only in his own defective interpretation of Cantor, an interpretation unaware of its own philosophical weight. Campello’s reformulation only brought unclarity. Add to that vague accusations about the metaphysical nature of Cantor’s business and you don’t know which confusion should be rectified first.
I believe that we can learn two things from this account. First, that if geniuses of the first order like Poincaré or Brouwer were not able to effectively demolish Cantor’s set-theoretical approach, there probably is some point to it. Second, that when entering a discussion in the philosophy of mathematics, one should know the root of the issue under discussion, since a prima facie diagnosis may not suffice. As Sir Arthur Eddington said:
We used to think that if we knew one, we knew two, because one and one are two. We are finding that we must learn a great deal more about ‘and.’
Chomsky, N., Belletti, A., & Rizzi, L. (1999). “An Interview on Minimalism”, University of Siena, Nov 8–9 (rev. March 16, 2000).
Weinberg, S. (1976). “The Forces of Nature”, Bulletin of the American Academy of Arts and Sciences, 29(4).
Bedürftig, T., & Murawski, R. (2018). “Phenomenological Ideas in the Philosophy of Mathematics from Husserl to Gödel”, Studia Semiotyczne, XXXII(2), 33–50.
Back in June, I wrote this article about Noether’s Theorem, the idea that continuous symmetries and conservation laws are intrinsically linked. But, while Noether’s Theorem is one of the most beautiful concepts in physics, symmetries themselves are equally as fundamental, and often similarly beautiful. And further, much of the coal-face of theoretical physics is a consequence of breaking such symmetries. So, what are symmetries? And what happens when we break them?
In the simplest terms, a symmetry is any action which leaves a system unchanged. These can be physical actions, such as a rotation or a translation, or they can be more abstract, such as time-reversal symmetry or spin space rotation symmetry (more on these later). As I detail further in my Noether’s theorem article, these symmetries can be discrete, or they can be continuous.
For example, a cube is rotationally symmetric for rotations of 90 degrees, but this symmetry does not hold for an angle of, say 45 degrees. Thus, a cube has a discrete rotational symmetry of 90 degrees (and multiples thereof). Conversely, a sphere has a continuous rotational symmetry, since it can be rotated by any angle — even those which are extremely tiny — and still look the same.
More generally, when we are talking about a physical system, we define a symmetry of a system as a transformation which leaves the Hamiltonian unchanged. In the previous article, we defined a system in terms of its Lagrangian, but rest assured, the Lagrangian and Hamiltonian are intrinsically linked by what is known as a Legendre transform, the details of which are for another time. I choose to use the Hamiltonian here since it is the object of choice for quantum mechanics, which will be the main focus of what follows.
To cut a long story short, the Hamiltonian is a description of the behaviour of a physical system. It takes in the position and momentum of a state and gives out the time evolution of that state under a given potential. States which are constant in time under said potential are the eigenstates of the Hamiltonian. Just as eigenstates of the position operator have a well-defined position, the eigenstates of the Hamiltonian have a well-defined total energy. Thus, these eigenstates are often called energy eigenstates.
An example of a quantum mechanical Hamiltonian. Note how the kinetic energy depends only on the momentum, while the potential energy depends only on the position. The potential energy can also depend on time, but we ignore this here for simplicity.
An important property of a Hamiltonian is that it is Hermitian. This can be defined in a variety of ways involving matrices and complex conjugation, but for our purposes it means that the eigenvalues of the Hamiltonian — the energies of the eigenstates — must be real, and not complex. All operators which measure “classical” quantities, such as position and momentum, are Hermitian, which makes sense since we shouldn’t be measuring values of energy, position or momentum which aren’t real.
While this may seem like an innocuous property, its consequences are far-reaching. One of the most powerful is that the eigenstates of a Hermitian operator are orthogonal. This means that the eigenstates are independent — no eigenstate depends on any other — and it has the consequence that we can build any state as a superposition of eigenstates, as I detail for position and momentum eigenstates in this article.
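As a quick numerical illustration (my own sketch, using NumPy), both properties can be checked directly: the eigenvalues of a Hermitian matrix come out real, and its eigenvectors come out orthonormal:

```python
import numpy as np

# Build a random Hermitian matrix: H equals its own conjugate transpose.
rng = np.random.default_rng(42)
a = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
h = (a + a.conj().T) / 2

energies, states = np.linalg.eigh(h)  # eigh is the solver for Hermitian matrices

print("eigenvalues are real:", np.all(np.isreal(energies)))   # True
print("eigenvectors orthonormal:",
      np.allclose(states.conj().T @ states, np.eye(4)))       # True
```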
Ok, so how do symmetries come into this picture? We said that a symmetry is a transformation that leaves the Hamiltonian unchanged. The eigenstates of a Hamiltonian for a given potential are uniquely determined, so if a transformation of the Hamiltonian leaves it unchanged, then the eigenstates must also be unchanged. If the Hamiltonian gives the time evolution of a state, and that state is a superposition of energy eigenstates which are unchanged by the transformation, then the time evolution of the state must also be unchanged. A symmetry is a transformation which does not change the behaviour of the system!
Phew ok, I think we need an example. Imagine a single electron in completely empty space. Such an empty space has no potentials present, and with no potentials, there can be no forces on the electron. Then, by Newton’s first law, the electron will continue to move with whatever velocity it had when we first put it in the space.
Now, let’s perform a transformation. We move the electron, say, two metres to the right, but we leave its velocity unchanged. How does the electron behave now? Well, there are still no potentials around since we are still in the same completely empty space, so we would expect the electron to carry on moving in the same way as before the translation. So, the time evolution of the system — the movement of the electron within the space — is unchanged by the transformation, and we can say that translation is a symmetry of the system.
What does this look like in terms of the Hamiltonian? Well, with no potentials present, the energy of the electron depends only on its velocity, and thus its momentum, since the electron only has kinetic energy. The Hamiltonian gives the total energy of the system, so the Hamiltonian can also only depend on momentum, and importantly not on position. So, if we perform our translation, the Hamiltonian is unaffected since we only change the position of the electron. Translation thus fulfils our definition of a symmetry!
That we perceive an “arrow of time” is a consequence of the lack of time reversal symmetry in most thermodynamic processes. Image from: http://www.smonad.com/problems/arrow_of_time.php
So, what are some examples of symmetries? Translational symmetry is actually one of the more common, particularly when working with crystals, which are defined by their periodicity and therefore spatial invariance over particular intervals. Closely associated is rotational symmetry, which we often encounter, perhaps unsurprisingly, in systems when rotation is involved. Notably, rotational symmetry is important in astrophysics, where spherical bodies lead to rotationally symmetric gravitational fields and the elliptical orbits we all know and love.
Alongside these continuous symmetries, we have several discrete symmetries. Among these is time reversal symmetry, which is where a system behaves the same if we run it “forwards” or “backwards” in time. Time reversal symmetry is conspicuous in its absence in the world around us, since our very experience of time having a “direction” is a consequence of the lack of this symmetry in many processes, most notably in thermodynamics.
Another, possibly less profound, symmetry is lattice inversion symmetry. This applies for crystal lattices, and is invariance under taking the “mirror image” of the lattice by setting all spatial coordinates to their negative.
If a system exhibits time reversal symmetry, each energy eigenstate has a time-reversed partner with the same energy but opposite momentum. Similarly, in a system with inversion symmetry, each energy eigenstate has a partner with opposite momentum whose energy is of the same magnitude, but negative.
Tracks of charged particles moving in a magnetic field, visualised using a bubble chamber. Notice how pairs of tracks spiral in opposite directions. This is a result of the presence of positrons, the antimatter counterpart to the electron, which in some senses is a “time reversed” electron. This means that a positron feels the force from the magnetic field in the opposite direction to the electron, and thus spirals in the opposite direction. Image from: https://cds.cern.ch/record/39474
What happens when we start to break these symmetries? First, we remove time reversal symmetry. Imagine that we confine an electron to a 2D plane, and that we apply a uniform magnetic field that’s perpendicular to that plane. An electron is a charged particle, and a charged particle moving in a magnetic field feels a force perpendicular to its velocity, the so-called Lorentz force, and we see that the path of the electron curves in the plane.
If we were to run this experiment with time reversed, we would see that the magnetic field was unchanged, but the initial velocity of the electron is reversed. The Lorentz force the electron feels is then the negative of the force experienced by the electron in the first example, and the path the electron follows is curved in the opposite direction. So, if the electron curved up in the first example, it would curve down in the second example. Thus, the electron behaves differently under the action of the magnetic field when time runs backwards compared to when time runs forwards. This means that a magnetic field breaks time reversal symmetry!
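In symbols, for charge q, velocity v, and magnetic field B, the Lorentz force is

$$ \mathbf{F} = q\,\mathbf{v} \times \mathbf{B}, $$

and reversing time sends v to −v while the externally applied B is held fixed, so F flips sign and the path curves the other way.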
This idea can be applied to electrons moving in a crystal, a system which exhibits time reversal symmetry. In the simplest case, the time-reversed partner of an eigenstate is just the state with opposite spin. For every up-spin electron in the crystal there exists a down-spin electron of the same energy. We call this property spin degeneracy.
We know that if we apply a magnetic field, the time reversal symmetry of the system is broken, so we should see this spin degeneracy lifted. Or, in other words, under the influence of a magnetic field, the energy of up-spin electrons should be different to that of down-spin electrons.
And, as if by magic, this is what we see! The magnetic field reduces the energy of electrons with spins parallel to the field and increases the energy of electrons with spins antiparallel to the field. The partners now have different energies, and the degeneracy is lifted. This process, known as Zeeman splitting, can be seen experimentally in the splitting of absorption lines in elemental spectra in the presence of a magnetic field.
What about if we break inversion symmetry? A good example of this can be seen in physics on a honeycomb lattice. As mentioned in this article, a honeycomb lattice has two distinct types of sites, the A sites and the B sites. If we take the mirror image of a honeycomb lattice, we find that the A sites map onto the B sites. Inversion symmetry is maintained if the A sites and the B sites are the same, such as in graphene with its carbon-only honeycomb lattice. But if the A sites and the B sites are different atoms, such as in hexagonal Boron Nitride (hBN) where the lattice consists of alternating boron and nitrogen atoms, then inversion symmetry is broken.
Inversion symmetry in graphene and hBN. On the left, we have a section of a graphene lattice with carbon atoms in green, and on the right we see a section of a hBN lattice with nitrogen in blue and boron in pink. If we reflect in the dotted line, we see that all carbon atoms end up on another carbon atom in the graphene, while, for hBN, all nitrogen atoms end up on a boron atom, and vice versa. We thus see how inversion symmetry is maintained on a graphene lattice, and broken on a hBN lattice. Left image from: https://doi.org/10.1063/1.4951692. Right image from: http://exciting-code.org/nitrogen-physical-properties-of-boron-nitride
We generally describe the behaviour of electrons on a crystal lattice by what is known as a band structure. This shows how the energy of the eigenstates of the Hamiltonian varies with the momentum of the state. For simple lattices such as graphene and hBN, we generally consider two bands, a lower energy valence band and a higher energy conduction band.
Now, we know that time reversal symmetry requires that energy eigenstates have a partner with opposite momentum and the same energy, while inversion symmetry requires that energy eigenstates have a partner with opposite momentum and negative energy of the same magnitude. Generally, these two conditions can happily coexist independently for most momenta. But for some momenta, they collide.
For any crystal lattice, there are certain momenta which are equivalent to their negatives. This is a direct consequence of the periodic nature of the lattice, and such momenta are generally referred to as high symmetry points.
The logic which leads to the existence of protected band crossings at high symmetry points in systems with both inversion and time reversal symmetries
If an eigenstate and its time reversed and inverted partners have the same momentum, such as at the high-symmetry points, then we require that the energy of the state equal its own negative. This is only possible if that energy is then zero. This means that we must have a band crossing at the high symmetry point, since a state in the valence band and its partner in the conduction band have the same energy of 0. Such a band crossing is called symmetry protected, since it arises directly from the symmetries of the system.
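Written out, with E(k) the energy of an eigenstate at momentum k, the two partner conditions read

$$ E(\mathbf{k}) = E(-\mathbf{k}) \quad \text{(time reversal)}, \qquad E(\mathbf{k}) = -E(-\mathbf{k}) \quad \text{(inversion)}, $$

so at a high symmetry point, where k and −k are equivalent, E(k) = −E(k), which forces E(k) = 0.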
In graphene, we have both time reversal and inversion symmetry, which is why we have the famous Dirac points at the K and K’ high symmetry points. But, in hBN, we break the inversion symmetry. This removes the necessity for the energy eigenstates to have an energy of zero, and we expect that the band structure of hBN should have a bandgap at the K and K’ points. And, once again, this is what we observe.
The gap results physically from the difference in energy that the electron experiences when on a Boron atom compared to that when on a Nitrogen atom. This means that extra energy is required for an electron to move between sites, or in other words, extra energy is required for the electron to be able to conduct electricity. Thus, in hBN there exists an energy gap between the valence and conduction band at the K and K’ high symmetry points.
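To make this concrete, here is a hedged Python sketch of the textbook two-band tight-binding model on a honeycomb lattice (the parameter values are illustrative choices of mine, not taken from the articles cited): the onsite energy difference Δ vanishes for graphene and is finite for an hBN-like lattice, and the gap at the K point comes out as 2Δ:

```python
import numpy as np

t = 2.7  # nearest-neighbour hopping in eV (a typical graphene value)

def f(kx, ky, a=1.0):
    """Nearest-neighbour structure factor of the honeycomb lattice."""
    return t * (np.exp(1j * kx * a)
                + 2 * np.exp(-1j * kx * a / 2) * np.cos(np.sqrt(3) * ky * a / 2))

def bands(kx, ky, delta):
    """Eigenvalues of H(k) = [[delta, f(k)], [conj(f(k)), -delta]]."""
    e = np.sqrt(delta**2 + np.abs(f(kx, ky))**2)
    return -e, +e

# A K point of the Brillouin zone, where f(k) vanishes:
K = (2 * np.pi / 3, 2 * np.pi / (3 * np.sqrt(3)))

print("graphene gap at K:", np.diff(bands(*K, delta=0.0)))  # ~0 eV: the Dirac point
print("hBN-like gap at K:", np.diff(bands(*K, delta=2.3)))  # ~2*delta: gapped
```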
This difference in symmetry dramatically changes the electronic properties of the two materials. The symmetry protected Dirac point in graphene leads to one of the best conductivities of any known material, while hBN is an insulator with a resistivity comparable to diamond.
Examples of Ferromagnetic (top) and Anti-Ferromagnetic (bottom) ordering in a 1D chain. Image from: http://www.nanoscience.de/HTML/research/noncollinear_spins.html
So far, we have considered scenarios where symmetries of a system are broken by external factors, such as magnetic fields, or where they were never present in the first place, as in hBN. But sometimes, a system conspires to break its own symmetries. This is the idea of spontaneous symmetry breaking.
When we examine a system in quantum mechanics, we often first consider its ground state. This is the behaviour of the system at zero temperature, and it is the state of the system with the minimum possible energy. We obviously cannot observe zero-temperature behaviour directly, but for many systems the ground-state behaviour persists up to temperatures we can observe, or at least changes in a predictable way.
Generally, we expect the ground state of a system to exhibit the same symmetries as its Hamiltonian. The Hamiltonian which describes electrons on a graphene lattice exhibits both time reversal and inversion symmetry, and the ground state of electrons chilling on carbon atoms also has these symmetries. But, for some systems, the ground state breaks one or more of the symmetries present in the Hamiltonian. In other words, the natural state of the system does not have all the symmetries available to it! The ground state is the state of minimum energy, so we don’t need to add energy to the system to break these symmetries. We call such a process spontaneous, hence spontaneous symmetry breaking!
One classic example of spontaneous symmetry breaking is magnetism. Magnetism can arise in a number of ways, but all rely on the alignment of electron spins to give a material an overall magnetic moment. In most cases, this is in response to an external magnetic field, but in a few materials, interactions between the electrons give rise to a permanent alignment of their spins, leading to a permanent magnetic ordering. The spins can be parallel, in which case we have a ferromagnet such as elemental iron, or they can point in alternating directions, in which case we have an antiferromagnet.
In general, such systems exhibit spin space rotational symmetry. In other words, there should be no “preferred direction” for electron spin to point which reduces the energy of the system. All magnetic orderings of a material break this symmetry, but only in the ferromagnet and the antiferromagnet does this happen without any external influence. Both states are therefore examples of spontaneous symmetry breaking.
Why does this occur? One way we describe systems with interacting spins is the Heisenberg model. This is a Hamiltonian where the energy depends only on the angle between two neighbouring spins and a scalar interaction energy. By convention, we give the Hamiltonian an overall negative sign. Parallel spins give a maximal positive contribution, while antiparallel spins give a maximal negative contribution. We can see the spin space rotational symmetry in the Hamiltonian, since the energy does not depend on the direction of any individual spin, only on its angle in relation to its neighbours.
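In symbols, with J the interaction energy and the sum running over neighbouring pairs of spins:

$$ H = -J \sum_{\langle i,j \rangle} \mathbf{S}_i \cdot \mathbf{S}_j. $$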
We know that the ground state is the state which minimises the energy of the system. So, in the case of the Heisenberg model where we only consider energies from the spin interaction, the energy is minimised when the energy is as negative as possible. If the interaction energy is positive, then parallel spins will minimise the overall energy since the overall negative sign of the Hamiltonian is maintained. The ground state is then the state in which all electron spins point in the same direction.
If the interaction energy is negative, we minimise the energy by having all spins antiparallel to their neighbours, since the negative interaction energy and negative overall sign cancel to give a positive sign, meaning that antiparallel spins now give an overall negative sign. Thus, the ground state[1] consists of alternating up and down spins on neighbouring sites. So, the only difference between a ferromagnet and an antiferromagnet is the sign of the interaction energy!
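A tiny numerical sketch (my own, with illustrative values, treating the spins classically) makes the point: flipping the sign of J swaps which configuration, ferromagnetic or antiferromagnetic, has the lower energy:

```python
import numpy as np

# Classical 1D Heisenberg chain: E = -J * (sum over neighbours of S_i . S_j)
def chain_energy(spins, J):
    return -J * np.sum(np.einsum('id,id->i', spins[:-1], spins[1:]))

n = 10
fm = np.tile([0.0, 0.0, 1.0], (n, 1))                  # all spins up
afm = fm * np.where(np.arange(n)[:, None] % 2, -1, 1)  # alternating up/down

for J in (+1.0, -1.0):
    print(f"J={J:+.0f}: FM energy={chain_energy(fm, J):+.0f}, "
          f"AFM energy={chain_energy(afm, J):+.0f}")
# J=+1: FM energy=-9, AFM energy=+9  -> ferromagnetic ground state
# J=-1: FM energy=+9, AFM energy=-9  -> antiferromagnetic ground state
```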
Now, Coulomb interactions, or the potential energy associated with the electric repulsion of two like charges such as negatively charged electrons, give a positive energy contribution in the Hamiltonian. Conversely, the kinetic energy of the electrons gives a negative contribution. The interaction energy, and thus the magnetic ordering, is often dependent on which of these factors “win”, since the spin properties of electrons come into play to ensure that the energy is minimised.
A cartoon of the Coulomb forces between charged bodies. The Coulomb repulsion between like-charged electrons plays an important role in magnetic ordering. Image from: https://commons.wikimedia.org/wiki/File:CoulombsLaw.svg
For example, if the Coulomb repulsion between electrons on the same atom, known as the Hubbard interaction, is particularly strong, then the energy is minimised by preventing electrons from occupying the same sites. Recalling the Pauli exclusion principle, which forbids two electrons of the same spin from occupying the same site, double occupation can be prevented by ensuring that the electrons all have the same spin, which produces a ferromagnetic system. We can predict this from the sign of the interaction energy, since the positive Coulomb energy dominates the interaction, so we obtain a positive interaction energy.
There are many other fascinating examples of spontaneous symmetry breaking, from the Mott insulator transition in condensed matter physics to the spontaneous breaking of gauge symmetries in the early universe through the Higgs mechanism. But aside from wanton technobabble, what do we gain from a study in symmetries?
Symmetries provide a powerful tool in theoretical physics. As we have seen, we can make predictions about the behaviour of a system just from the symmetries it possesses. And, perhaps even more importantly, they allow a check on the validity of our models. If you produce a Hamiltonian which does not preserve a symmetry of the system in question, then chances are that you’ve screwed up somewhere along the line.
But also, symmetries are beautiful. Even ignoring Noether’s theorem, the physics which emerges just from the symmetries of a system is incredible. Look at graphene: the very property of the system which leads to its technological potential is intrinsically and fundamentally linked to the inversion symmetry of its crystal lattice. The simplicity of this connection is, in my opinion at least, fantastically beautiful.
[1] Being technical, the AFM state is not the ground state of the Heisenberg model with a negative interaction energy. It’s in fact not even an eigenstate of the system! But it’s close to being the ground state and, since it provides an intuitive picture, we will ignore this distinction here.
Mathematical modelling of uncertainty stands or falls on our ability to assess probabilities, but how do you know if your assessments are any good and what does that even mean? This article takes a pragmatic approach to answering these questions. We’ll start in the shadows of the valley of frequentism, step out on to the Bayesian foothills of subjectivism and stride on to the summits of pragmatism and an objective form of the Bayesian view.
From this standpoint, we will see what can go wrong with probabilistic assessments and discuss systematic biases. Finally, the pragmatist interpretation will bring us to a methodology that not only reveals bias, but also makes it clear how well supported the inference of bias is in the data, by showing an uncertainty range that reflects both consistency of bias and the number of data used to reveal it.
To give a sense of the problem, say we assessed at 20% the probability that Stochastic Steel will win the contract they’re negotiating with Bigger Ball Better Ball Ball-Bearings. They win the contract. Was that assessment correct? What does that even mean?
Frequentism
Frequentists believe in objectively correct probabilities, but their only access to these objectively correct probabilities is as frequencies in large (infinite) numbers of repeated identical experiments. They say things like: if Stochastic Steel negotiate that contract an infinite number of times, then the fraction of times they win the contract is the correct probability. But we all know that however many times we negotiate the same contract under the same conditions, we either win every time or never at all. If we ever repeat negotiations, it certainly isn’t under identical circumstances. There are workarounds (involving ensembles of possible worlds fixed with respect to the information we have, but variable with respect to everything else), but there’s something very unsatisfactory about an utterly inaccessible notion of probability.
Bayesianism
Bayesianism is the natural foil to frequentism. Bayesians start with the notion that probability is simply quantified belief and then work out coherent ways to modify beliefs in the light of data. This is Bayes’ theorem, preached in its purest form by the high priest of Bayesianism, E.T. Jaynes. The problem for many Bayesians (though not Jaynes) is that while we know how to update beliefs, it’s not clear where our belief should start before we bring any data to bear. One way out is to give up on objectively correct probabilities and claim it is legitimate to assess that “prior” belief subjectively.
While it’s certainly true that we can elicit subjective degrees of belief, as a normative theory, I find a subjective notion of probability almost as unsettling as an inaccessible one. It’s a terrible licence for all sorts of silliness. Jaynes seeks (and finds) answers in a coherent, objective description of pure ignorance. I will here come at it from the other end and argue that a pragmatic approach to our final assessments of probability brings us, like Jaynes, to a working notion of objective Bayesianism.
Pragmatism
Pragmatism is a philosophical tradition that started in the late 19th century and continues to this day. It is summed up beautifully in this quote from Charles Peirce, one of its founding members:
Consider the practical effects of the objects of your conception.
Then your conception of those effects is the whole of your conception of the object
What Peirce is saying here is don’t get too hung up on what objective probability is, think about how you might use it. Think about its practical effects.
The problem with my 20% chance of success is that the outcome is too uncertain: 20%, 4:1 against. Not great odds, but still more likely than getting out of jail in one throw in Monopoly. So even if the 20% is correct, it doesn’t really have any practical effect. To be able to say something a bit more practical, I need to reduce this uncertainty.
Aggregation to reduce uncertainty
If I have a list of probabilities and outcomes — all the contracts Stochastic Steel negotiated in the last quarter, for example — I can look at the total number of wins, which will reduce the uncertainty a bit.
Assuming a notion of correct probability, I can ask myself if my probabilities were all “correct” what would the probability distribution for the total number of wins look like?
If the probabilities are all the same, this is just a binomial distribution; if not, then I explain how to do the calculation (and provide an Excel spreadsheet with the methods implemented) in my article on auditing probability sequences. The result looks something like the figure here, where I have assessed twenty contract negotiations.
On the horizontal axis here, we have the number of successful wins and vertically we have the probability of getting that number, calculated directly and with an approximation.
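For anyone who wants to reproduce this kind of plot, here is a minimal Python sketch (not the spreadsheet from the article linked above) that computes the exact distribution of total wins by convolving in one Bernoulli outcome at a time; it handles unequal probabilities just as easily as the binomial case:

```python
import numpy as np

def total_wins_distribution(probs):
    """dist[k] = probability of exactly k wins, given per-negotiation win probabilities."""
    dist = np.array([1.0])                    # before any negotiation: P(0 wins) = 1
    for p in probs:
        dist = np.convolve(dist, [1 - p, p])  # fold in one more win/lose outcome
    return dist

probs = [0.2] * 20                            # twenty negotiations, all assessed at 20%
dist = total_wins_distribution(probs)
print("P(exactly 3 wins) =", round(dist[3], 3))    # ~0.205
print("P(2 wins or fewer) =", round(dist[:3].sum(), 3))
```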
Interpretations
Let’s say we came in on the low side of our prediction with three wins in the quarter.
A frequentist would say, “the hypothesis that the probabilities are correct is accepted at 90% confidence, but not at 95%”. This is bonkers. No one ever imagined for a second that every single assessment in that list was perfectly accurate, so the null hypothesis whose truth we are trying to establish at some level of confidence starts with a probability that is both practically and theoretically zero. But it’s nonetheless “not refuted” with a particular level of confidence. Happy days. As you were. All is well.
Lucky that there were three and not just two wins, because then we’d have been refuted at 90%. Sad feelings. Refuted. Switch the lights out on the way out.
Still, at least frequentists are trying. The subjective Bayesians just quit on objectively correct. They’ve gone fishing.
Can pragmatism do better? Can a pragmatic interpretation of probability help us understand whether that low result was bad judgement or just bad luck?
Bias
Assume, as we are doing, that there is such a thing as an objectively “correct” probability, but that for whatever reason — congenital optimism, human heuristics, mathematical ineptitude — I consistently and systematically get it wrong.
Interestingly, such systematic biases are the only things we can hope to capture in this kind of analysis. Random errors — going a bit high here, a bit low there — will tend to cancel out in the aggregation we do to reduce the uncertainty. This is the price we pay for aggregation: we reduce uncertainty, but we give up being able to say anything about individual assessments.
A taxonomy of systematic bias
It turns out that because of the ways random variables behave when you add them up, there are only four kinds of systematic bias we can catch.
Optimism: Consistently pitching probabilities too high
Pessimism: Consistently pitching probabilities too low
Polarization: setting probabilities that are a little higher than average much higher than average, and probabilities that are a little lower than average much lower than average — if it’s good it’s very very good, if it’s bad it’s horrid.
Vagueness: Moving probabilities that are removed from average back towards the average — fence-sitting.
Polarization is a reflection of over-confidence in your ability to assess, or of reading too much into the data. Geologists are terrible at this. Vagueness is rarer, but you do see it, though often as an overcorrection to polarization or because people are gaming their probabilities to get good results on average.
Modelling bias
The pragmatic notion of at least the theoretical existence of a correct probability allows us to build a model for what goes wrong with assessments of probabilities. The basic idea is to use Bayes’ theorem, which as I mentioned tells you how to change probabilities in the light of data or evidence, to alter the true probability on the basis of spurious data. In the case of optimism and pessimism, these spurious data are data supporting or undermining the positive outcome. In the case of polarization and vagueness, they are additional data magnifying the available data that have moved the probability away from its starting point.
It turns out that optimism and pessimism can be captured in a single number, which is positive for optimism and negative for pessimism; and polarization and vagueness can be captured in a single number, which is positive for polarization and negative for vagueness. For the mathematically initiated, bias is modelled simply as a linear transformation of evidence, as I discussed in my article last week. See the endnote below.
The figures above show how true probabilities (red line) are lifted by optimism (left figure, solid lines) and depressed by pessimism (left figure, dashed lines). Polarization (right figure, solid lines) pushes low probabilities lower and high probabilities higher. Vagueness (right figure, dashed lines) brings probabilities back towards an uninformative 50:50.
So now we have a concept of objective, correct probabilities and we have an elegant little two-parameter model for the effects of bias. How do we put these things together to get something useful that also explains what we might mean by objective probability?
The pragmatist’s “practical effects” of precise probability
The crucial step is to realize that the concept of objectively correct probabilities allows us to treat our outcomes as samples of the probability distributions described by those correct probabilities. That, in turn, allows us to set up what is essentially a regression problem to work out the bias parameters — those two numbers that describe all the possible forms of bias.
Here’s how it works. We start with our list of biased assessments and outcomes, and we wash the biased assessments backwards through our bias model for some choice of bias parameters to get our objective, putatively correct probabilities. We then compare the outcomes predicted by these with the actual outcomes and measure how well that choice of bias parameters fits the data.
There are two ways of closing the loop. The first just says: OK, choose the bias parameters that give the best fit between prediction and outcome. So here I’ve plotted the mean squared deviation between predictions and outcomes (sometimes known as a Brier score). The point where it’s lowest is my regression-fitted bias parameter.
The horizontal axis is the optimism / pessimism parameter and the vertical axis is the vagueness / polarization parameter. So we can see our data are best explained by a modest optimism, flavoured with a squeeze of polarization.
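Here is a hedged Python sketch of that first approach (the bias parameterization matches the endnote below, but its exact form, and all names, are my assumptions): de-bias the assessments for each candidate pair of parameters, score against the outcomes with the Brier score, and keep the best pair:

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def sigmoid(e):
    return 1 / (1 + np.exp(-e))

def remove_bias(p, a, b):
    # Invert the assumed bias model e' = a + (1 + b) * e in evidence space.
    return sigmoid((logit(p) - a) / (1 + b))

def brier(probs, outcomes):
    return np.mean((probs - outcomes) ** 2)

def fit_bias(assessed, outcomes, grid=np.linspace(-0.9, 0.9, 37)):
    assessed, outcomes = np.asarray(assessed), np.asarray(outcomes, dtype=float)
    scores = [(brier(remove_bias(assessed, a, b), outcomes), a, b)
              for a in grid for b in grid]
    return min(scores)[1:]  # the (a, b) pair with the lowest Brier score

# Illustrative data: assessed probabilities and 0/1 outcomes.
a_hat, b_hat = fit_bias([0.2, 0.6, 0.8, 0.3, 0.5], [0, 1, 1, 0, 1])
print(f"optimism/pessimism a = {a_hat:.2f}, polarization/vagueness b = {b_hat:.2f}")
```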
Another, I think, better way of looking at this is to ask, what is the probability of seeing the data we see for a range of choices of bias parameter. This is a likelihood function — it gives the maximum likelihood estimator, which isn’t massively different from the least squares estimate above, but which comes equipped with an uncertainty distribution, which gives a very real sense of how well we have nailed that parameter.
Perspectives
Defining probability in terms of what you do with it and leaping lightly over the ontological bog has proven remarkably fertile. It led us first to the recognition that there are only four systematic biases we can hope to extract from an analysis of probabilistic prediction, and we were able to develop a forward model for these. This in turn allowed us to invert back to the parameters in the model and answer the question we asked ourselves in the first place: are these probabilities any good?
Of course the answer to that is never better than “probably”, at least for any reasonably sized set of outcomes, but at least now we have a coherent sense of the uncertainty that circumscribes that “probably”.
Mathematical endnote
We first transform the true probability to an evidence value, as discussed in my earlier article
Then the bias transform just utilizes that the impact of data (spurious or otherwise) in evidence space is linear. So it’s simply a linear transform
Systematic optimism is a constant addition of spurious supporting data (a positive), pessimism a constant addition of spurious detracting data (a negative). When b is positive, events that are more likely than not are increased in their relative likelihood, and the converse is true for b negative. Thus b captures, respectively for positive and negative values, polarization and vagueness.
Finally, we transform back to get back to probability
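Putting the three steps together, one natural reconstruction (the base of the logarithm and the exact placement of the two parameters are my assumptions) is:

$$ e = \ln\frac{p}{1-p}, \qquad e' = a + (1+b)\,e, \qquad p' = \frac{1}{1+e^{-e'}}. $$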
Quantum field theory, or QFT, is the framework in which the Standard Model of particle physics (which includes the theories of the electroweak and strong nuclear interactions) is formulated. In particular, quantum electrodynamics, or QED, has predicted the values of physical quantities with unprecedented precision. For example, the magnetic dipole moment of the muon has a measured value of 233 184 600 (1680) × 10⁻¹¹, while the theory predicts the value 233 183 478 (308) × 10⁻¹¹. This agreement is nothing short of astounding (see this link).
However, quantum field theory is also famous for its divergences. Perhaps the most significant one concerns the vacuum energy density (or vacuum expectation value, usually referred to as the VEV), discussed in a previous article (see link below). Every quantum field has a corresponding divergent zero-point energy. Summing over all modes (up to some physically reasonable energy or frequency cutoff) leads to a gigantic value for the vacuum energy density.
Figure 1: Examples of Feynman diagrams representing divergences in quantum field theory. In the first, a photon creates a virtual electron-positron pair, which then annihilates (vacuum polarization). In the second, an electron emits and reabsorbs a virtual photon (self-energy) (source).
However, the vacuum energy predicted by general relativity and observed experimentally is extremely small. The difference between the two estimates can be as high as 120 orders of magnitude (check this link).
The cosmological constant problem, also known as the vacuum catastrophe, is one of the most important unsolved problems in modern physics: it is precisely the existence of such a discrepancy between the experimentally measured value of the vacuum energy density and the much larger theoretical zero-point energy obtained using quantum field theory (see the link below). Hobson, Efstathiou & Lasenby (HEL) refer to it as “the worst theoretical prediction in the history of physics.”
When gravity is considered, such an energy density leads to severe difficulties: in general relativity, every form of matter or energy gravitates, so the vacuum energy cannot simply be discarded (in non-gravitational physics, by contrast, only energy differences matter and the vacuum energy can be subtracted away, as will be discussed later).
The Energy of the Vacuum: Quantum Vacuum Fluctuations and the Casimir Effect (towardsdatascience.com)
A Very Quick Overview of Einstein’s Gravity Theory
Between 1915 and 1916, Einstein completed the formulation of his general theory of relativity. His gravity field equations showed that the warping of a spacetime region is generated by the nearby energy and momentum of matter and radiation.
Figure 2: An illustration of the warping of spacetime caused by the presence of the Sun (source).
The following diagram illustrates the mutual correspondence:
Written in terms of tensors G, R, and T the equation reads:
Equation 1: Einstein field equations with Einstein’s tensor G, the Ricci curvature tensor R, and the scalar curvature R (the trace of R).
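For reference (the equation itself appeared as an image in the original), the standard form of the field equations is

$$ G_{\mu\nu} \equiv R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu} $$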
For a much more detailed treatment of the gravity field equations, check the following article:
Quantum Gravity, Timelessness and Complex Numbers: Is the Wave Equation of the Universe Timeless and Real? (towardsdatascience.com)
The tensor R, the Ricci curvature tensor, measures the extent to which the geometrical properties of spacetime differ (locally) from those of the usual Euclidean space.
Figure 3: How the geometrical properties (in this figure, the distance between two points) of a manifold (in this case, with positive curvature) differ from that of the usual Euclidean space.
The tensor g in Eq. 1 is the metric tensor, and it has the form:
Equation 2: The metric tensor g.
The corresponding line element (representing the infinitesimal displacement) can be written as follows:
Equation 3: The line element corresponding to the tensor g.
Figure 4: The vector line element dr (in green) in three-dimensional Euclidean space. The square of the magnitude of dr is equal to the line element (source).
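In components, the line element of Eq. 3 presumably takes the standard form

$$ ds^2 = g_{\mu\nu}\, dx^{\mu} dx^{\nu} $$

with summation over repeated indices.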
The tensor T on the right-hand side is the energy-momentum tensor. It contains information about the matter and energy which deforms spacetime.
Equation 4: The components of the energy-momentum tensor T (source).
A Bird’s-eye View of Cosmology
In 1917, one year after publishing his general theory of relativity, Einstein applied it to the entire universe (which at the time was thought to consist of just the Milky Way galaxy) in his seminal paper “Cosmological Considerations on the General Theory of Relativity.” This paper marked the birth of modern cosmology.
In that paper, Einstein assumed the universe to be static and to have a closed spatial geometry (a three-dimensional sphere, which is finite yet unbounded). However, his theory did not admit static solutions (only contracting ones), and he had to introduce a new term (together with an appropriate mix of matter and vacuum energy), the cosmological constant Λ term, into Eq. 1:
Equation 5: Einstein field equations with the new term containing the cosmological constant Λ.
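In the same convention as before, the modified field equations read (reconstructed here; the original equation was an image):

$$ G_{\mu\nu} + \Lambda\, g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu} $$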
This universe is known as the Einstein static universe, and it is not of empirical interest (see Carroll).
Figure 5: Einstein (source) and his 1917 paper where he applied his general theory of relativity to the whole universe (source).
He was able to include this term dependent on g in Eq. 1 for the following reason. The covariant derivative ∇ of both G and T is zero:
Equation 6: The covariant derivative ∇ of both G and T is zero.
Since the metric tensor also has this property
Equation 7: The covariant derivative of the metric tensor is zero.
the consistency of the equation is not spoiled by introducing the term. In the weak-field (or Newtonian) limit, we obtain:
Equation 8: Going to the Newtonian limit we see that Λ acts as a gravitational repulsion, linearly dependent on the distance r.
We see that Λ acts as a gravitational repulsion, linearly dependent on the distance r.
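As a sketch of what Eq. 8 expresses, in one common convention the Newtonian limit of Eq. 5 gives a radial acceleration around a mass M of the form

$$ \ddot{r} = -\frac{GM}{r^2} + \frac{\Lambda c^2}{3}\, r $$

so a positive Λ contributes a repulsive term that grows linearly with r.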
The Expanding Universe
However, in 1929, the American astronomer Edwin Hubble found that:
Several objects thought to be clouds of dust and gas were in fact galaxies beyond the Milky Way (this wasn’t known when Einstein introduced Λ into his equations).
The recessional velocity of galaxies increases depending on their distance from the Earth (the so-called “Hubble’s law”)
His findings, together with previous work by the Belgian Catholic priest, mathematician, and astronomer Georges Lemaître, established that the universe is expanding.
Figure 6: How to calculate Hubble’s constant (source).
The expansion of the universe meant, therefore, that introducing the cosmological constant had not been necessary after all. In a conversation with the Ukrainian-born physicist and cosmologist George Gamow, Einstein famously remarked that the introduction of the cosmological term was the biggest blunder he ever made in his life.
Figure 7: Edwin Hubble (source) and Georges Lemaître (source), whose work demonstrated that the universe is expanding.
A Different View of the Cosmological Constant
Today, however, the cosmological constant is viewed in a completely new way. In fact, its presence is responsible for the following spectacular experimental result: our universe is not just expanding, it is expanding at an accelerated rate. The phases of the expansion are illustrated in the figure below. Accelerating expansion occurs when the velocity at which distant galaxies are receding from observers is increasing with time.
Figure 8: Accelerated expansion of the universe (source).
The Friedmann–Lemaître–Robertson–Walker (FLRW) Metric
If one considers very large regions (such as galaxy clusters, on scales of the order of 100 Mpc, where a parsec (pc) is equal to about 31 trillion kilometers), the geometry of the universe (the spatial part of the metric) is approximately homogeneous (it is the same in all locations) and isotropic (it is the same in all directions). See Figure 10 below.
Figure 9: The galaxy cluster IDCS J1426 has a mass of ~500 trillion suns (source).
The figure below illustrates the concepts of isotropy and homogeneity:
Figure 10: The concepts of isotropy and homogeneity (source).
Based on these assumptions, one obtains, as solutions of the Einstein field equations, universes with constant curvature: the so-called Friedmann-Lemaître-Robertson-Walker (FLRW) universes. Their spatial part can be written in different ways.
One possible choice is:
Equation 9: One possible way to write the spatial part of the FLRW metric.
where the function a(t) is the cosmic scale factor, which is related to the size of the universe. The parameter k specifies the shape of the FLRW metric. The three possible values of k are +1, 0, or −1, and they are associated with universes with positive, zero, and negative curvatures, respectively.
Figure 11: The three possible values of k in Eq. 9 are +1, 0, or −1, associated with universes with positive, zero, and negative curvatures, respectively.
Introducing spherical coordinates:
Equation 10: Spherical coordinates with dimensionless radial curvature.
Figure 12: Spherical coordinates, where the r in the figure is the r with the tilde in Eq. 10.
and defining the coordinate:
Equation 11: The parameter r in terms of the usual r which is identified by the ~ on top of it.
where the ~ identifies the usual radial variable. Including the temporal part, the FLRW line element then becomes:
Equation 12: A more convenient way to write the FLRW line element.
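For reference, the FLRW line element of Eq. 12 presumably takes the standard form

$$ ds^2 = -c^2 dt^2 + a^2(t)\left[\frac{dr^2}{1 - k r^2} + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right] $$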
It is important to understand that points in spacetime and spacetime coordinates are two different concepts. Coordinates are labels assigned to the points, and therefore their choice shouldn’t change the laws of physics. The coordinates r, θ, and ϕ are referred to as comoving coordinates. As the cosmic scale factor a(t) increases, the distance between points also increases, but the distance in the comoving coordinate system does not.
Figure 13: The cosmic scale factor, denoted by R(t) in the figure, corresponds to the function a(t) (source).
Assuming isotropy and homogeneity at large scales the energy-momentum tensor T becomes the energy-momentum tensor of a “perfect fluid”:
Equation 13: The energy-momentum tensor of a perfect fluid.
where ρ is the mass-energy density and p is the hydrostatic pressure.
A perfect fluid by definition:
Is completely characterized by its rest frame mass density ρ and its isotropic pressure p.
Has no shear stresses, viscosity, or heat conduction (see this link).
Figure 14: A perfect fluid flowing past an infinitely long cylinder (source).
For example, consider T in its rest frame. It becomes simply:
Equation 14: The energy-momentum tensor of a perfect fluid in its rest frame.
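For reference, with metric signature (−, +, +, +) the perfect-fluid tensor of Eq. 13 can be written as

$$ T^{\mu\nu} = \left(\rho + \frac{p}{c^2}\right) u^{\mu} u^{\nu} + p\, g^{\mu\nu}, $$

which in the fluid’s rest frame reduces to the diagonal form of Eq. 14, $T^{\mu\nu} = \mathrm{diag}(\rho c^2,\, p,\, p,\, p)$.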
With this simpler T we can describe matter using only two quantities: its density ρ and pressure p:
Note that both depend only on the cosmic scale factor a(t). The Einstein field equations then become the well-known Friedmann equations for the scale factor:
Equation 15: The Friedmann equations, which are the EFE when the energy-momentum tensor is isotropic and homogeneous.
Figure 15: The Russian physicist Alexander Friedmann circa 1922 (source).
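In a common convention (reconstructed here, since the original equations were images), the Friedmann equations read

$$ \left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{k c^2}{a^2} + \frac{\Lambda c^2}{3}, \qquad \frac{\ddot a}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda c^2}{3} $$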
The third important equation is the equation of state which is just the equation on the right of Eq. 6 for FLRW universes:
Equation 16: Equation of state for FLRW universes.
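Presumably Eq. 16 is the fluid (continuity) equation that follows from Eq. 6 in an FLRW background:

$$ \dot{\rho} + 3\,\frac{\dot a}{a}\left(\rho + \frac{p}{c^2}\right) = 0 $$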
Now consider, following HEL, some unknown substance with a particularly unusual equation of state, with negative pressure:
Equation 17: A positive vacuum energy resulting from Λ implies a negative pressure (p < 0).
The latter is equivalent to the presence of a form of energy, dubbed dark energy: a positive form of vacuum energy. The description mostly used today, in the current standard model of cosmology, includes both dark energy and the postulated dark matter.
Figure 17: Three possible endings of the universe considered by physicists, depending on the nature of the dark energy (source).
However, as noted in Carroll (and mentioned in the introduction of this article), general relativity has the following peculiarity: while in non-gravitational physics (electromagnetism, for example) only differences in energy are relevant to describe the motion of bodies (the zero of energy is arbitrary), in general relativity the value of the energy itself must be known. This leads us immediately to the question: if the zero of the energy is the energy of the empty state, what is the vacuum energy? One of the most important unsolved problems in physics is how to answer that question.
Calculating the Vacuum Energy in Quantum Field Theory
Using quantum field theory one can calculate the quantum mechanical vacuum energy (or zero-point energy) for any quantum field. The result of this calculation can be as high as 120 orders of magnitude larger than the upper limits obtained via cosmological observations. It is believed that there exists some mechanism that makes Λ small but non-zero.
Let us calculate the so-called quantum energy of the vacuum which exists throughout the whole Universe.
Figure 18: Fluctuating virtual particles coming in and out of existence and therefore violating energy conservation for short periods according to the Heisenberg uncertainty principle (source).
To avoid unnecessary complications I will consider a real massless scalar field φ (instead of the more complicated electromagnetic field), which is described by a real function φ(x,t). The classical Hamiltonian, in this case, is:
Equation 24: The free Hamiltonian of the classical real massless scalar field.
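In units with ℏ = c = 1, this Hamiltonian presumably takes the standard form

$$ H = \frac{1}{2}\int d^3x\,\left[\dot{\varphi}^2 + (\nabla \varphi)^2\right] $$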
We now canonically quantize the classical field φ. The vacuum energy is obtained by taking the expectation value of the Hamiltonian with respect to the quantum vacuum state:
Equation 25: The energy of the vacuum.
Expressing the field in terms of creation and annihilation operators and performing simple algebraic manipulations, we arrive at the following expression for the vacuum expectation value (taken in the state with no particles).
Equation 26: The energy of the vacuum.
The second term in Eq. 26 implies that the vacuum expectation value (VEV) is infinite. This infinite contribution to the VEV is what shows up as the cosmological constant. As noted by Carroll, the infinite value is not a consequence of a possible infinitely big space: it is a consequence of the modes with high frequency over which we integrate. If we limit the integration using some cutoff we obtain:
Equation 27: The energy of the vacuum discarding modes with very high frequency.
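As a sketch of Eq. 27, summing the zero-point energies up to a wavenumber cutoff $k_{\max}$ gives (with ℏ and c restored)

$$ \frac{\langle 0 | H | 0\rangle}{V} = \int_0^{k_{\max}} \frac{d^3k}{(2\pi)^3}\,\frac{\hbar c\, k}{2} = \frac{\hbar c\, k_{\max}^4}{16\pi^2} $$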
If QFT is valid for energies as high as the Planck energy we obtain:
Equation 28: Order of magnitude of the VEV if QFT is valid for energies as high as the Planck energy.
Dividing Eq. 28 by Eq. 23 we obtain the (in)famous discrepancy of roughly 120 orders of magnitude.
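A back-of-the-envelope version of this comparison can be coded in a few lines. This is only an order-of-magnitude sketch (not the calculation in the article): it takes the naive QFT estimate to be one Planck energy per Planck volume, and the observed vacuum energy density to be an assumed round number of about 6 × 10⁻¹⁰ J/m³.

```python
# Order-of-magnitude sketch: Planck-scale vacuum energy density vs. an assumed
# observed dark-energy density. Not the article's calculation, just an illustration.
from scipy.constants import hbar, c, G
import math

E_planck = math.sqrt(hbar * c**5 / G)   # Planck energy [J]
l_planck = math.sqrt(hbar * G / c**3)   # Planck length [m]
rho_qft = E_planck / l_planck**3        # naive QFT vacuum energy density [J/m^3]

rho_obs = 6e-10                         # assumed observed vacuum energy density [J/m^3]

print(f"QFT estimate : {rho_qft:.2e} J/m^3")
print(f"Observed     : {rho_obs:.2e} J/m^3")
print(f"Discrepancy  : ~10^{math.log10(rho_qft / rho_obs):.0f}")
```

Depending on the cutoff convention, this crude estimate lands within a few orders of the famous 120.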
Thanks for reading and see you soon! As always, constructive criticism and feedback are always welcome!
My Linkedin, personal website www.marcotavora.me, and Github have some other interesting content about physics and other topics such as mathematics, machine learning, deep learning, finance, and much more! Check them out!
In Albert Einstein’s original formulation of general relativity (his theory of gravity), the fundamental field is the metric tensor g and the theory is covariant in spacetime. Covariance (more precisely general covariance) “consists of the invariance of the form of physical laws under arbitrary coordinate transformations.” The idea is that since coordinates are only human-made labels, physical laws shouldn’t depend on the way they are chosen.
In general relativity, the action (“an attribute of the dynamics of a system from which its equations of motion can be derived”) is called Einstein–Hilbert action:
Equation 1: The Einstein–Hilbert action.
where g = det(g), R(g) is the Ricci scalar or scalar curvature, which is roughly a measure of how much the volume of a small ball in a curved manifold differs from the volume of a ball in Euclidean space, G is Newton’s constant, and Λ is the cosmological constant or vacuum energy (which I will omit from now on). (Note that S should include a boundary term if the spacetime in question has a boundary.) For simplicity, let us assume the absence of matter.
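For reference, in units with c = 1 the Einstein–Hilbert action (with the cosmological term written explicitly before it is omitted) presumably reads

$$ S = \frac{1}{16\pi G}\int d^4x\, \sqrt{-g}\,\left(R - 2\Lambda\right) $$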
Einstein and Hilbert (source).
Demanding that the variation of this action with respect to the (inverse) metric be zero we obtain Einstein’s source-free field equations:
Equation 2: Einstein’s field equations in the absence of matter.
where R is the Ricci curvature tensor.
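For reference, the source-free equations of Eq. 2 are

$$ R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} = 0, $$

and taking the trace gives R = 0, so they are equivalent to $R_{\mu\nu} = 0$.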
An analogy with particle mechanics may help clarify these ideas. In Lagrangian mechanics, we are given a Lagrangian L, we build an action functional S and obtain the equations of motion by extremizing S:
Equation 3: Given a Lagrangian L, and extremizing the action functional S, we obtain the equations of motion.
Figure 1: The particle chooses the path that extremizes the action.
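For reference, the content of Eq. 3 is presumably the standard variational statement

$$ S[q] = \int_{t_1}^{t_2} L(q, \dot q, t)\, dt, \qquad \delta S = 0 \;\Longrightarrow\; \frac{d}{dt}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} = 0 $$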
The Goal of This Article
The goal of this article is to show that time is encoded in the geometry of space. The procedure follows the 1962 paper by Baierlein, Sharp, and Wheeler (which I will henceforth call BSW). In the words of the famous book by Misner, Thorne, and Wheeler, Gravitation, “three-geometry is a carrier of information about time”.
This occurs for the following reason. In the extension of the Lagrangian approach in Eq. 3 to general relativity, we will have (see Eq. 4):
An initial three-dimensional surface with an intrinsic geometric structure (first term of Eq. 4 below).
A second three-dimensional surface also with an associated intrinsic geometry (second term of Eq. 4).
Equation 4: The intrinsic geometries of the two three-dimensional surfaces described above and the geometry of the four-dimensional spacetime.
The goal is to find a four-geometry (the third term of Eq. 4) that satisfies Einstein’s equations (Eq. 2) and reduces to the three-geometries on three-surfaces σ and σ’. In other words, quoting the book Gravitation, “Given the 3-geometries of the two faces of a “sandwich of spacetime”, adjust the 4-geometry in between to extremize the action.” For that, they develop a variational principle (the so-called “condensed thin-sandwich variation principle”) that depends only on the intrinsic properties of the two three-surfaces. Following their procedure we:
Find the timelike separation between two three-dimensional surfaces (quoting Barbour “how far apart in time” the three-spaces are)
Find the location of the surfaces in spacetime.
In such four-dimensional spacetime, each three-geometry is well specified based only on the two intrinsic geometries.
Before going any further, it is crucial for the reader to have a notion of the so-called ADM formalism of general relativity. This will be the topic of the next section.
Bird’s-Eye View of the ADM Formalism
In the ADM formalism, named for its authors Richard Arnowitt, Stanley Deser, and Charles Misner, general relativity is formulated as a dynamical theory (an initial value problem such as described in the two sections above).
Figure 2: Richard Arnowitt, Stanley Deser, and Charles Misner (source).
The dynamics of general relativity is referred to as Geometrodynamics. As will be shown in this section, the “things that change” in general relativity are distances within three-dimensional surfaces embedded in four-dimensional spacetime (and not the four-dimensional spacetime distances).
Figure 3: The superspace of geometrodynamics (source).
In this dynamical version of general relativity, the configuration space is called superspace.
The ADM construction is illustrated below:
Figure 4: The ADM construction: two three-dimensional surfaces (three-geometries) embedded in four-dimensional spacetime. The normal vector to P₁ on Σ(t) will cross Σ(t+dt) in a point with different coordinates. The shift vector measures this distortion (source).
The shift vector N, shown in the figure, is a measure of the “distortion of the surface as it evolves with time.” The proper distance between the two surfaces is dτ = N₀ dt, where N₀ (or N) is the lapse function. The lapse function measures the rate of change of proper time with respect to the time label of the surfaces Σ(t).
Using Fig. 4, we can rewrite ds² using the shift and the lapse (for details see Poisson):
Equation 5: Distance between P₁ and P₄.
where the tensor g after the second equality is the metric of the three-surface. The extrinsic curvature of the three-dimensional hypersurfaces embedded in the four-dimensional spacetime has the following form
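In the usual notation (writing the three-metric as h, and the lapse and shift as N₀ and Nⁱ), Eq. 5 presumably takes the standard ADM form

$$ ds^2 = -N_0^2\, dt^2 + h_{ij}\left(dx^i + N^i dt\right)\left(dx^j + N^j dt\right) $$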
Equation 6: Extrinsic curvature of the hypersurfaces.
where the symbol “|” denotes the covariant differentiation with respect to the intrinsic spatial metric within the surfaces.
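Up to an overall sign convention, the extrinsic curvature of Eq. 6 presumably reads

$$ K_{ij} = \frac{1}{2 N_0}\left(\partial_t h_{ij} - N_{i|j} - N_{j|i}\right) $$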
Figure 5: The extrinsic curvature (based on source).
The Ricci scalar R can be written in terms of the extrinsic curvature K, its trace K, and the intrinsic 3-curvature ³R (the 3-dimensional version of the Ricci scalar). Denoting the intrinsic curvature by:
Equation 7: Renaming the intrinsic curvature.
the Ricci scalar can be written as (see Poisson):
Equation 8: The Ricci scalar written in terms of the extrinsic curvature, its trace, and the three curvature.
The Lagrangian density is then:
Equation 9: The gravitational Lagrangian density expressed in terms of the three-curvature, the extrinsic curvature, and its trace.
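For reference, up to total-derivative (boundary) terms the decomposition behind Eqs. 8–9 presumably reads

$$ \sqrt{-g}\, R = N_0 \sqrt{h}\left({}^{3}R + K_{ij}K^{ij} - K^2\right), $$

so that (omitting the 1/16πG prefactor) the Lagrangian density is $\mathcal{L} = N_0 \sqrt{h}\left({}^{3}R + K_{ij}K^{ij} - K^2\right)$.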
The Lagrangian density does not depend on the time derivatives of the shift and the lapse. Denoting by π⁰ and πⁱ the momenta conjugate to N₀ and Nᵢ, respectively, we find:
Equation 10: The momenta conjugate to N₀ and Nᵢ are identically null for all times.
Hence π⁰ and πⁱ vanish for all times, and the lapse and shift are therefore not dynamical variables (they are only measurements of deformations of the surfaces Σ). Consequently, we have:
Equation 11: Conditions on the Hamiltonian H.
From the H expression:
Equation 12: The Hamiltonian using ADM variables.
where we defined:
Equation 13: Definitions of the objects after the second equality in Eq. 12.
The tensor G in the first equation is given by:
and is called the Wheeler–DeWitt metric.
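Presumably the tensor referred to here is the DeWitt supermetric, which in a common convention (the original equation was an image) reads

$$ G_{ijkl} = \frac{1}{2\sqrt{h}}\left(h_{ik}h_{jl} + h_{il}h_{jk} - h_{ij}h_{kl}\right) $$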
Finally, using Eq. 11 we obtain the equations of a Hamiltonian dynamical system with constraints, which is a reformulation of the Einstein field equations and describes the evolution of three-metrics:
Equation 14: Equations of a Hamiltonian dynamical system with constraints that describes the evolution of three-metrics.
Paul Dirac, one of the most important physicists of the 20th century, and the American theoretical physicist John Wheeler, who was largely responsible for reviving interest in general relativity after World War II, were so impressed by the simplicity of the Hamiltonian formulation that they questioned the status of spacetime.
Dirac declared:
“This result has led me to doubt how fundamental the four-dimensional requirement in physics is. A few decades ago it seemed quite certain that one had to express the whole of physics in four dimensional form. But now it seems that four-dimensional symmetry is not of such overriding importance, since the description of nature sometimes gets simplified when one departs from it.”
— Paul Dirac (1963)
Wheeler wrote:
“Here the dynamic object is not space-time. It is space. The geometrical configuration of space changes with time. But it is space, three-dimensional space, that does the changing. No surprise! In particle dynamics, the dynamical object is not x and t, but only x… [H]ow can physicists change their minds and “take back” one dimension? The answer is simple. A decade and more of work… has taught us through many a hard knock that Einstein’s geometrodynamics deals with the dynamics of geometry: of 3-geometry, not 4-geometry.”
— John Wheeler (1967)
Figure 6: Paul Dirac (source) and John Wheeler (source), who were so impressed by the simplicity of the Hamiltonian formulation that they questioned the status of spacetime.
Using geometrodynamics with the action obtained from Eq. 9, the time evolution of a three-geometry of curved empty space can be determined by giving:
The geometry of the initial surface (the induced three-dimensional metric h, or intrinsic curvature)
The extrinsic curvature of the initial surface K which describes its embedding in the spacetime to be constructed using Einstein’s equations (see Fig. 3).
However, h and K cannot be specified independently. They must obey the initial value equations of Foures and Lichnerowicz.
The Baierlein, Sharp, and Wheeler Procedure
I will now describe the steps in BSW (following BSW, the prefactor 1/16πG will be omitted).
Step 1
First, choose two very similar (almost identical) three-dimensional metrics:
Equation 15: Two almost identical three-dimensional metrics.
The time separation between the two surfaces is finite and it is chosen for convenience to be Δx⁰=1.
Figure 7: Illustration of the two three-dimensional surfaces.
Step 2
The next step is to fill the region between the surfaces with a yet undetermined four-geometry. The separation between two points with coordinates
Equation 16: The coordinates of two points, one on each surface.
is given by:
Equation 17: The line element between the two points in Eq. 16
Here η₀ is the lapse function, ηᵢ the shift vector and the first factor of the first term is an average between the metrics of the two three-surfaces.
Step 3
A four-geometry that is an extremum of the BSW action below and has as boundary conditions the geometries of the two three-surfaces will satisfy Einstein’s field equations (Eq. 2).
The action integral is:
Equation 18: The action integral.
In the ADM language it becomes:
Equation 19: The action integral expressed in the ADM formalism.
The quantity π is the geometrodynamic field momentum conjugate to the geometrodynamic field coordinate ³g defined by:
Equation 20: The ADM field momentum π.
The vertical bar represents covariant derivatives within the 3-surfaces.
Step 4
Now suppose that the three-geometries are nearly identical. We then have:
Equation 21: The volume element when both three-geometries are nearly identical.
Replacing the extrinsic curvature by the following unnormalized time derivative of the three-metric in the direction normal to the surface:
Equation 22: The unnormalized time derivative of the three-metric in the direction normal to the surface.
and using Eq. 21, the action becomes:
Equation 23: The action when the three-geometries are nearly identical.
Step 5
One now extremizes with respect to η₀ and finds:
Equation 24: The proper time separation between the two three-surfaces.
This is the proper time separation between the two three-surfaces.
Step 6
Note that η₀ depends on κ, which depends on the shift vector ηᵢ(x¹, x², x³). To obtain ηᵢ(x¹, x², x³), one substitutes Eq. 22 into the action Eq. 23. The result is:
Equation 25: The action in which only the components of the shift vector are to be varied.
where η₀ was eliminated from the action. One then varies the action I only with respect to the components ηᵢ of the shift vector and obtains three second-order equations for the ηᵢ.
Step 7
The next step is to solve the equations in Step 6 for the ηᵢ (with appropriate boundary conditions) and substitute them into Eq. 22 and Eq. 24. One, therefore, finds the time separation η₀ in terms of the intrinsic geometries of two three-surfaces.
Step 8
The extrinsic curvature K is obtained from Eq. 22. It can be shown (see references in BSW) that, using Einstein’s field equations, the initial three-metric together with the extrinsic curvature K determines (up to a coordinate transformation) the four-metric of the spacetime in which the surfaces are embedded.
BSW, therefore, showed how to find the time-like separation between the two surfaces and where they are located in spacetime given the two three-geometries.
Thanks for reading and see you soon! As always, constructive criticism and feedback are always welcome!
My Linkedin, personal website www.marcotavora.me, and Github have some other interesting content about physics and other topics such as mathematics, machine learning, deep learning, finance, and much more! Check them out!
In 1982, Kahneman, Slovic and Tversky published “Judgment under Uncertainty: Heuristics and Biases” and shattered humanity’s collective self-delusion that we had any functional intuition for even the most rudimentary problems in probability theory. This work has seen a renaissance in popularity since the publication of Kahneman’s rather more accessible “Thinking, Fast and Slow”.
Kahneman is broadly sympathetic to our struggles, but much of the follow-up literature and course material has a slightly disparaging, not to say patronizing, odour, as if reliable probabilistic intuition were just a question of a little hard work and application.
I have taught probability theory to people of all ages, backgrounds, levels of motivation and levels of mathematical enthusiasm; from talented school kids and struggling school kids to undergraduates who take it because they have to and postgraduates who study it out of love. A large part of my consultancy involves teaching probability to people who want to use data to make more informed decisions.
I’m here to tell you that probability is hard and why, but that a little goes a long way and it’s entirely worth the struggle.
Probability theory is not intuitive
When we learn to drive, we turn the steering wheel, the car responds. It responds differently at different speeds. When we start, this takes us by surprise. We first over-turn, then we under-turn because we just over-turned, then we begin to narrow in on an appropriate response and we learn to do this across a range of speeds. We can do this because, largely, our car responds in the same way for a given speed every time. We subsume and integrate those responses and we absorb cues that allow us to extend that learnt intuition to other, similar mechanical systems and from there to mechanical systems in general.
Photo by Keenan Constance on Unsplash
Uncertain systems, by definition, respond differently every time, to the same input. The wiring we possess to subsume, integrate and automate responses simply can not work. Best case, it just can not engage. Worst case, we try to hypothesize response protocols— inferring patterns where there are none — and we become angry and frustrated when they don’t work.
Given the immense challenge of developing an intuition about just one stochastic system, we can hardly be surprised that it is virtually impossible to generalize to uncertain systems in general.
Probability theory is all “Slow”
Readers of Kahneman’s “Thinking, Fast and Slow” will recognize the distinction between System I “Fast” (intuitive, instinctive, often emotional) and System II “Slow” (deliberate, methodical, rational) thinking. System II is slow, but it’s also hard; it demands energy, will-power and — certain mind states aside — it is a limited resource. Because Probability Theory is non-intuitive, it is perpetually doomed to languish in System II thought paradigms.
Photo by Ray Hennessy on Unsplash
We might hope that notwithstanding the additional costs of developing stochastic intuitions, sufficient exposure over a long enough period of time may bring us to an instinctive apprehension of uncertain systems. That may be; however, despite a maths PhD, and two decades using probability theory professionally, I have not experienced this myself. It is, though, undoubtedly possible to develop an effective intuition for how to System-II-solve probability problems. So while we can develop an intuition to speed up our “Slow” thinking, it’s still “Slow” (and hard).
Probability is conceptually confusing
Students (in the broadest sense) who look to learn the “Slow” logic of probability are immediately faced with considerable conceptual challenges.
Photo by Parsa khass on Unsplash
First, probability theorists don’t even agree what probability is or how to think about it. While there is broad consensus about certain classes of problems involving coins, dice, coloured balls in perfectly mixed bags and lottery tickets, as soon as we move into practical probability problems with more vaguely defined spaces of outcome, we are served with an ontological omelette of frequentism, Bayesianism, Kolmogorov axioms, Cox’s theory, subjective, objective, outcome spaces and propositional credences.
Even if the probationary probability theorist is eventually indoctrinated (by choice or by accident of course instructor) into one or other school, none of these frameworks is conceptually easy to access. Small wonder that so much probabilistic pedagogy is boiled down to methodological rote learning and rules of thumb.
Conclusion
There’s more. Probability theory is often not taught very well. The notation can be confusing; and don’t get me started on measure theory.
The good news is that in terms of practical applications, very little can get you a very long way. The alternative to the basic level of understanding that allows a quantitative analysis of uncertainty is, frankly, crystal balls and tea leaves. Even simple models, based on the most rudimentary probabilistic paradigms, will clarify outcomes, furnish a framework and provide insight into the data required to make informed decisions.
And though it’s hard, it’s without a shred of doubt entirely worth the effort. Probability theory, despite its ongoing turf war, is mathematically mature enough that whatever framework you adopt actually represents pretty much the minimal conceptual machinery you need rationally to navigate uncertainty.
So go for it, but be ready and be kind to yourself. It’s hard.
Humans have been trying to predict the weather for as long as we have existed. Early methods looked to astrology and the lunar phases. Even the Bible contains references to Jesus deciphering local weather patterns! And it makes sense: understanding the weather would offer immense advantages both on the battlefield and in agriculture.
However, like many achievements, it took a major conflict to really stir us into action. World War 1 highlighted the need for accurate weather predictions. Both sides depended on knowing wind patterns for bombing raids and the drift of poison gas. This led to immense efforts to improve our understanding of these complex predictions. Attempts during this era were largely unsuccessful, as we simply lacked the computational power: forecasts would take days to compute and were not accurate enough. Instead, we relied on heuristic predictions that were far from reliable.
That all changed as computers slowly became more advanced. John von Neumann led the charge toward numerical forecasting. He envisioned future humans with the ability to completely control the weather, thanks both to the accuracy of predictions and to the ability of computers to know exactly what would happen given any perturbation.
Oh how wrong he was.
Edward Lorenz was a freshly minted mathematics graduate from Dartmouth (class of 1938) when he was called to war. There, he worked as a weather forecaster. Forecasting was still primitive, but it was gradually improving thanks to the telegraph and the ever-growing amount of data available. This sparked an interest in meteorology, in which Lorenz later received his doctorate from MIT, but he never really lost his mathematical mind.
Lorenz, along with Ellen Fetter, worked in the ’60s to develop a set of differential equations that could model the weather. For one of his studies, he examined a “cell” of atmospheric convection (think of this as air circulating) between a hot plate and a cold plate and watched how it changed over time. This model, seen below, consists of three equations.
Lorenz’s Convection Model
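For reference (the original equations appeared as an image), the standard form of the Lorenz system is

$$ \frac{dx}{dt} = \sigma (y - x), \qquad \frac{dy}{dt} = x(\rho - z) - y, \qquad \frac{dz}{dt} = xy - \beta z $$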
Lorenz’s first equation describes the convection itself. The variable x gives the rate of convection, and this equation describes how that rate evolves with time. His other two equations deal with the temperature gradient, with y in the horizontal and z in the vertical. σ, ρ, and β are empirical parameters that Lorenz varied to get different results. For a more detailed explanation of the math, check out this fantastic piece. The Lorenz Equations show up in a slightly different form at the bottom of page 3.
This set of equations is deceptively complex and can quickly get out of hand. The symbol ρ is referred to as the Rayleigh Number, which gives insight into how heat is transferred due to convection. When ρ < 1, this system will eventually settle into an equilibrium with x, y, z all being 0. However, this is not what we observe in nature. Lorenz used the values σ = 10, ρ = 28, and β = 8/3 based on observation. See Figure 1 for an example of this system using these parameters and initial conditions (0, 1, 0).
Figure 1: Chaos in the Lorenz System, taken from this page
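If you want to reproduce a trajectory like the one in Figure 1 yourself, here is a minimal sketch (not Lorenz’s original code) that integrates the system with the parameters and initial conditions quoted above:

```python
# Integrate the Lorenz convection model with sigma=10, rho=28, beta=8/3,
# starting from (0, 1, 0), using SciPy's general-purpose ODE solver.
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return [sigma * (y - x),        # rate of convection
            x * (rho - z) - y,      # horizontal temperature gradient
            x * y - beta * z]       # vertical temperature gradient

sol = solve_ivp(lorenz, (0, 50), [0.0, 1.0, 0.0], t_eval=np.linspace(0, 50, 10000))
print(sol.y[:, -1])   # the state (x, y, z) at t = 50
```

Plotting sol.y[0] against sol.y[2] gives the butterfly shape of Figure 2.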
If we plot z and x on the same axis, we get the famous Butterfly Plot as seen in Figure 2 below.
Figure 2: The Butterfly Plot, taken from this page
But what do these pretty pictures tell us about the weather? Scientists knew it would be complicated, but won’t we know exactly what will happen if we just get enough data? Like many things in science, the discovery was made completely by accident. Lorenz wanted to redo an earlier model run, but didn’t want to wait for the entire process (computers were slow back then). Instead, he input the x, y, and z values from halfway through and let it work from there. What he found shocked him. The solution began to move further and further away from his previous run, eventually becoming unrecognizable. It turns out his input values were slightly less precise than the values the computer had been using. This very slight change in input drastically changed the output. You may know this as the Butterfly Effect.
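A quick way to see this sensitivity numerically is to repeat the integration with an initial condition perturbed by one part in a million; the two runs soon become unrecognizably different. A minimal sketch:

```python
# Two Lorenz runs whose initial conditions differ by 1e-6 in y: the separation
# between them grows until the trajectories are effectively unrelated.
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

t_eval = np.linspace(0, 40, 8000)
run_a = solve_ivp(lorenz, (0, 40), [0.0, 1.0, 0.0], t_eval=t_eval)
run_b = solve_ivp(lorenz, (0, 40), [0.0, 1.000001, 0.0], t_eval=t_eval)

separation = np.linalg.norm(run_a.y - run_b.y, axis=0)
print(f"separation at t=0:  {separation[0]:.1e}")
print(f"separation at t=40: {separation[-1]:.1e}")
```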
This observation went against all previous intuition related to the weather. Yes, we knew it would be complex, but we had no idea how even the slightest imprecision would create massive inconsistencies. This also disproved John von Neumann’s idea about weather prediction; it was just too chaotic!
Modern weather forecasting only tries to limit the chaos; it is now accepted that we will never be able to eliminate it completely. This is done through ensemble forecasting, which was suggested in the ’70s and began being implemented in the ’90s. In this method, many simulations are run with varying initial conditions and the results are averaged to produce the most likely outcome. Newer versions also utilize different mathematical models and average those outcomes together.
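The idea can be caricatured in a few lines: run the same model from many slightly perturbed starting points and report the ensemble mean and spread. This toy sketch (not an operational forecasting system) reuses the Lorenz system above as a stand-in “weather model”:

```python
# Toy ensemble forecast: integrate the Lorenz system from 100 perturbed initial
# conditions and summarize the resulting states with a mean and a spread.
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

rng = np.random.default_rng(0)
finals = []
for _ in range(100):
    y0 = np.array([0.0, 1.0, 0.0]) + rng.normal(scale=1e-3, size=3)  # perturbed start
    finals.append(solve_ivp(lorenz, (0, 5), y0).y[:, -1])

finals = np.array(finals)
print("ensemble mean at t=5 :", finals.mean(axis=0))
print("ensemble spread (std):", finals.std(axis=0))
```

The spread is as informative as the mean: when it blows up, the forecast itself is telling you how far ahead it can be trusted.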
Anyone who’s looked at the forecast knows that we have a long way to go with weather prediction. Chaos Theory is still a young field with so much to offer! For an informal introduction, I highly recommend Chaos by James Gleick. It does a lot to put the theory in historical context and gives you the basics. For more details and problems to work through, try The Nonlinear Workbook. This book is full of examples and problems (some are coding based!) that help give an idea of just how varied the field is!
Thanks for reading! Leave a comment if you have any thoughts or questions about this article.