
Curvature II: Spacetime

By Scott Church – Guest Blogger

In the first installment of this series, we explored the nature of curved spaces and introduced ourselves to some of the mathematical tools needed to describe how length, breadth, and height can be curved without higher dimensions to “curve into.” In the interest of keeping our exploration as intuitive as possible, we began with the Euclidean geometry we learned in high school and explored curvature from the vantage point of time as we experience it—a universal history that is the same for all of us and independent of the spatial stage on which our lives unfold. Today we will explore the nature of time and its relationship to space and discover (spoiler alert!) that in fact, it is neither separate from space nor absolute—not only can length, breadth, and height be curved, duration can be as well. The universe we inhabit is one of curved spacetime.

Special Relativity

The Newtonian physics we learned in high school presumes absolute three-dimensional space and time. In the low gravity and velocity world we live in, that is how we experience them. But intuitive as this may seem to us, there are hints that something is amiss. That physics also taught us that the speed of light \(c\) is a universal constant that can be derived from Maxwell’s equations. And as we saw in Part I, the laws of physics, including \(c\), must be invariant for all observers stationary or moving. Pause for a moment and reflect on what this implies. If I am standing beside a highway and you drive by at 50 mph, that is the speed I will observe. In the car, you will see yourself as stationary and the world passing you at 50 mph in the opposite direction, including me. Another driver doing 70 mph in the fast lane will pass me at that speed and you at 20 mph. But Maxwell’s equations will remain true and invariant for all observers, so if a beam of light is shined in the same direction, it will pass all three of us at the same speed. How is this possible?

Imagine that you are now the one who is stationary, and I fly past you in a fighter jet at a speed \(v\) of 3600 mph (or one mile/sec for round numbers), carrying a clock that is in sync with an identical clock of yours. As I pass you, it emits a pulse of light in your direction at time \(t_1\), which reaches your eye after travelling a distance \(d_{t1}\) (Figure 1). One second later, at \(t_2\), a second pulse is emitted, but I will have flown one mile further, so that pulse must travel a distance \(d_{t2}\) before it reaches your eye. My clock will be ticking at the same rate in my reference frame as yours is for you, but the seconds you observe on my clock will be longer because the second pulse you receive from it must travel further at the same speed \(c\) to reach your eye than the first one did. Your experience will be that my time runs slower for you than it does for me. And for the same reason, your clock will be running slower for me than it is for you.

 

Figure 1

As for distances, the length of my jet will be measured by the time it takes a pulse of light to travel from the nose (\(A\)) to the tail (\(B\)) at speed \(c\) (Figure 2). In my reference frame that will be given by,

\(L = c\Delta t_{ba}\)                    [Eqn. 1]

where \(\Delta t_{ba}\) is my proper time (that is, the time measured by a clock at rest in my reference frame).

Figure 2

In your reference frame, the pulse of light will take a time \(\Delta t'_{ba}\) to travel the length of my jet. However, while the pulse is in transit, point \(B\) will have moved forward a distance \(v \Delta t'_{ba}\) so the pulse will arrive at point \(B_2\) instead (Figure 3),

Figure 3

And you will observe the length of my jet to be the distance between \(A\) and \(B_2\), or,

\(L' = (c - v)\Delta t'_{ba}\)                    [Eqn. 2]

Not only will you see the pulse travelling a shorter distance than I do, the time \(\Delta t'_{ba}\) will also be less than the \(\Delta t_{ba}\) I observe because time is running slower for you than for me. The length \(L'\) you observe for my jet will be smaller than the length \(L\) I observe, and you will see me and my jet as though we were compressed in the direction of travel.
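To get a feel for how large these effects are at jet speeds, here is a minimal Python sketch. It assumes the standard Lorentz factor of special relativity, \(\gamma = 1/\sqrt{1 - v^2/c^2}\), which the discussion above motivates but does not derive. The function name `lorentz_factor` and the 50 ft jet length are illustrative assumptions, not values from this article.

```python
import math

C = 299_792_458.0   # speed of light in m/s (exact, by definition)
MILE = 1609.344     # metres per statute mile

def lorentz_factor(v):
    """Standard Lorentz factor for a speed v in m/s (assumed, not derived above)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

v_jet = 1.0 * MILE              # the jet's one mile per second
gamma = lorentz_factor(v_jet)

# One second of the pilot's proper time, as measured from the ground.
dilated_second = gamma * 1.0
# A hypothetical 50 ft jet (illustrative length), as measured from the ground.
contracted_length_ft = 50.0 / gamma

print(f"gamma - 1          = {gamma - 1.0:.3e}")
print(f"extra time per sec = {dilated_second - 1.0:.3e} s")
print(f"length shortfall   = {50.0 - contracted_length_ft:.3e} ft")
```

At one mile per second, \(\gamma\) differs from 1 by something on the order of one part in \(10^{11}\), which is why none of this shows up in everyday experience.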

Thus, we arrive at one of the foundational principles of special relativity: space and time are neither absolute nor independent of each other. They’re united in a single spacetime manifold whose metric contains an underlying symmetry that preserves Maxwell’s equations and \(c\) for all observers. And this manifold is not simply a map of locations and distances; it’s a frame-independent history of events for every location within it.

In Part I we saw that in the flat Newtonian universe of our experience, time is absolute and independent of space. All observers experience it the same, and spatial geometry is Euclidean, with the interval between any two points given by the Pythagorean Theorem,

 \(ds^{2} = dx^{2} + dy^{2} + dz^{2}\)                    [Eqn. 3]

In spacetime, however, this is no longer the case. Now we have a collection not of points, but of events that reflect the histories of each spatial point within it. The interval no longer defines the distance from here to there; it defines the separation from here and now to there and then. Accounting for this in our metric tensor won’t be as simple as it may sound. As we’ve seen, the speed of light must remain the same for all observers whether stationary or moving in any reference frame. And relative motion slows time down and compresses space until both reach zero at the speed of light. From our vantage point, a photon’s reference frame is a single event with a zero-length interval, so our interval must include time with a sign opposite to that of space. After multiplying time by \(c\) to convert it to equivalent distance units, this gives,

\(ds^{2} = dx^{2} + dy^{2} + dz^{2} - (c\,dt)^{2}\)                    [Eqn. 4]

Which adopting the usual (though not strictly necessary) convention of making time the first, or zero component, results in the spacetime metric tensor,

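\(g_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\)                    [Eqn. 5]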

The diagonal terms, expressed as a tuple [-1, 1, 1, 1], are known as the metric’s signature. In differential geometry (the branch of mathematics that generalizes the geometry we learned in high school to all types of curved spaces), a continuous N-dimensional manifold that has a well-defined and positive-definite metric tensor at all points (not all mathematically possible ones do) is referred to as Riemannian. That is, a flat 4-D Riemannian manifold is one for which, at every point, infinitesimal displacements in the locally flat tangent plane have a metric signature of [1, 1, 1, 1]. A universe with Euclidean geometry and absolute time would have this metric everywhere. But in a universe constrained by special relativity the interval can be zero or negative as well as positive, so the metric is non-degenerate rather than positive-definite. Manifolds of this type are referred to as pseudo-Riemannian.
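As a rough illustration of what the [-1, 1, 1, 1] signature does in practice, the sketch below evaluates the interval of equation 4 for three pairs of events. The names `METRIC_DIAGONAL` and `interval_squared` are illustrative, and finite separations are used in place of infinitesimal ones, which is fine in flat spacetime.

```python
C = 299_792_458.0  # speed of light in m/s

# Diagonal of the flat spacetime metric, time component first.
METRIC_DIAGONAL = (-1.0, 1.0, 1.0, 1.0)

def interval_squared(dt, dx, dy, dz):
    """ds^2 between two events separated by dt seconds and (dx, dy, dz) metres."""
    displacement = (C * dt, dx, dy, dz)  # time converted to distance units
    return sum(g * d * d for g, d in zip(METRIC_DIAGONAL, displacement))

# A light pulse covers C metres in one second: the interval is zero.
print(interval_squared(1.0, C, 0.0, 0.0))        # 0.0
# Two events at the same place, one second apart: negative interval.
print(interval_squared(1.0, 0.0, 0.0, 0.0) < 0)  # True
# Two simultaneous events one metre apart: positive interval.
print(interval_squared(0.0, 1.0, 0.0, 0.0))      # 1.0
```

The three cases show why the metric cannot be positive-definite: the same formula returns zero, negative, and positive values depending on how the events are separated.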

In Part I, we conducted a geometric thought experiment in which we traversed a closed triangular path through the flat space of an observer named Freddy, and another through the curved space of an observer named Cathy, along geodesics (paths that reflect the shortest distance between any two points). In each, we carried one vector with us while leaving an identical parallel copy of it behind at point A, and compared the two upon returning. When we did this in Freddy’s flat space we found, not surprisingly, that after completing the journey the two vectors were still parallel to each other. But after the same journey through Cathy’s curved space, we discovered that the vector we carried with us was no longer parallel to the one we left behind even though both were still pointing in the same direction (globally south), and we had travelled a shorter distance that still encompassed a larger area. We introduced some mathematical concepts that allowed us to define a covariant derivative 1 to describe the rate of change of the vector we carried with us along our path \(s^{\sigma}\),

\(\nabla_{\mu}s^{\sigma} = \partial_{\mu}s^{\sigma} + \Gamma^{\sigma}_{\mu\nu}s^{\nu}\)                    [Eqn. 6]

The first term on the right is the usual vector calculus gradient along the direction of travel. The second term, however, introduced a new object, the Christoffel symbol, that allowed us to map changes in the underlying tangent plane containing \(s^{\sigma}\) itself onto local coordinate systems within it as we traversed the path. Integrating this derivative along our path would then fully capture the changes in our mobile vector with respect to its stationary twin we left behind.
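The integration described above can be tried numerically. The sketch below is a simplified stand-in rather than the triangular journey of Part I: it assumes Cathy's space is a unit sphere in \((\theta, \phi)\) coordinates, carries the vector once around a circle of constant colatitude instead of around a geodesic triangle, and uses the sphere's standard Christoffel symbols, which this article does not derive. The function name and the Runge-Kutta integrator are illustrative choices.

```python
import math

def transport_around_latitude(theta0, v_theta, v_phi, steps=2000):
    """Parallel-transport (V^theta, V^phi) once around the circle of
    colatitude theta0 on a unit sphere, by setting Eqn. 6 to zero."""
    sin_t, cos_t = math.sin(theta0), math.cos(theta0)

    def rate(vt, vp):
        # Nonzero Christoffel symbols of the unit sphere (standard results):
        #   Gamma^theta_{phi phi} = -sin(theta) cos(theta)
        #   Gamma^phi_{phi theta} =  cos(theta) / sin(theta)
        # Transport condition: dV^sigma/dphi = -Gamma^sigma_{phi nu} V^nu
        return sin_t * cos_t * vp, -(cos_t / sin_t) * vt

    h = 2.0 * math.pi / steps
    vt, vp = v_theta, v_phi
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = rate(vt, vp)
        k2 = rate(vt + 0.5 * h * k1[0], vp + 0.5 * h * k1[1])
        k3 = rate(vt + 0.5 * h * k2[0], vp + 0.5 * h * k2[1])
        k4 = rate(vt + h * k3[0], vp + h * k3[1])
        vt += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        vp += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return vt, vp

# Start with a vector pointing due south (along increasing theta).
print(transport_around_latitude(math.pi / 3, 1.0, 0.0))  # colatitude 60 degrees
print(transport_around_latitude(math.pi / 2, 1.0, 0.0))  # the equator
```

Around the 60-degree circle the vector comes home pointing the opposite way (to within numerical error), while around the equator, which is itself a geodesic, it returns unchanged. The mismatch in the first case is the curvature of the sphere making itself known without any reference to a third dimension.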

That exercise, however, traversed a path through Cathy’s curved two-dimensional space, so equation 6 described distances and directions only. Had we included time in her curved universe, the path we walked would have been a trajectory of motion with history, and upon arriving back at A we would have found that our mobile vector was now older or younger than its stationary copy as well. In curved space, geodesics are the shortest distance between points—here and there. But in spacetime they are histories that reflect the shortest path, stationary or moving, between here and now, and there and then. As such, they define an equation of motion for the trajectory an object will follow when no forces are acting on it.

In a flat spacetime like Freddy’s, an object left to itself will remain stationary or move at constant velocity, so its geodesic will be a straight line whose slope will be the constant speed it is moving at. If one or more forces act on the object it will accelerate, and its history will follow a curved path whose velocity changes from moment to moment. We can derive the equation of motion for this by using equation 6 to derive the second order time derivative along \(ds^{\sigma}\), and equating the acceleration produced by a force to its strength divided by the object’s mass. In his flat spacetime, a single unvarying tangent plane spans the entire universe, so the Christoffel term will vanish, leaving us with,

\(\frac{\partial^2 s^{\sigma}}{\partial t^2} = \frac{F}{m} = 0\)                    [Eqn. 7]

Which we will recognize as a geodesic equation of motion for Newton’s second law that we learned in high school.

In Cathy’s universe things are different. There, geodesics are curved so the Christoffel term will generally be non-zero, and her equation of motion will be given by,

\(\frac{\partial^2 s^{\sigma}}{\partial t^2} + \Gamma^{\sigma}_{\mu\nu}\frac{\partial s^{\mu}}{\partial t}\frac{\partial s^{\nu}}{\partial t} = 0\)                    [Eqn. 8]

Notice that in curved spacetimes like hers, the second term on the left will be non-zero even in the absence of forces, so the first term will be as well. Left to themselves, objects in a curved spacetime will experience freefall along accelerating trajectories.

Which brings us to the next topic…

General Relativity

The other hallmark of our high-school physics lessons was Newtonian gravity. In a universe of flat space and absolute time like Freddy’s, gravity is an attractive force between objects whose strength is a function of their masses and the distance separating them. Specifically, the gravitational force \(F_g\) between two objects with masses \(m_1\) and \(m_2\) is given by,

\(F_g = g_c\frac{m_1m_2}{r^2}\)                    [Eqn. 9]

Where \(r\) is the distance between their centers of mass and \(g_c\) is the universal gravitational constant we also learned in our high-school physics classes.

For centuries this understanding of gravity has served us well in regions of low mass, velocity, and distance, and still does. I spent twenty years as an aerospace engineer designing commercial jet aircraft structures, and the aircraft my colleagues and I applied these principles to still have exemplary safety and performance records. But even so, physicists have long been troubled by the idea of “spooky action at a distance” forces. How can objects interact with each other invisibly over large distances? On the other hand, we can put it differently by saying that gravity causes objects with mass to accelerate toward each other at a rate given by their masses and the distance separating them, and as we saw above, freefall acceleration is a consequence of spacetime curvature. Jumping the gun, we also know that mass and energy are equivalent (hence Einstein’s celebrated \(E = mc^2\)) and moving objects with mass have a kinetic energy that is a function of their momentum and mass (\(K = p^2/2m\)). This raises an interesting question…

What if gravity isn’t a force at all, but simply a local manifestation of spacetime curvature due to mass, energy, and momentum?

If this is true, then we would expect that two objects of differing mass in the field of a third object of much larger mass (like the earth, for instance) would experience the same freefall acceleration toward it—essentially, that the “force” \(F_g\) the gravitational field exerts on their differing small masses would result in the same acceleration for both,

\(\frac{F_{g1}}{m_1} = \frac{F_{g2}}{m_2}\)                    [Eqn. 10]

And this would be the same acceleration that would result from an equal but non-gravitational force (e.g., the thrust produced by a rocket engine). As you’ve probably guessed by now, this is the case. Gravitational mass and inertial mass are indistinguishable from each other, and freefall accelerations induced by the former are a consequence not of any “spooky action at a distance” force, but of the local spacetime curvature created by its presence. This identity, known as the equivalence principle, is the heart and soul of general relativity. Throw a pebble into a pond and watch it arc through the summer sky before splashing down, and you are literally seeing the curvature of length, height, breadth, and duration where you’re standing because of the mass of the earth beneath your feet! 2
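Equations 9 and 10 are easy to check with numbers. The sketch below uses rounded textbook values for \(g_c\) and for the earth's mass and mean radius; the two test masses (a 10-gram pebble and a one-tonne boulder) are illustrative choices.

```python
G_C = 6.674e-11      # universal gravitational constant g_c, N m^2 / kg^2
M_EARTH = 5.972e24   # mass of the earth, kg
R_EARTH = 6.371e6    # mean radius of the earth, m

def gravitational_force(m1, m2, r):
    """Newtonian force between two masses a distance r apart (Eqn. 9)."""
    return G_C * m1 * m2 / r ** 2

# Two illustrative test masses at the earth's surface: a pebble and a boulder.
for mass in (0.01, 1000.0):
    force = gravitational_force(M_EARTH, mass, R_EARTH)
    print(f"m = {mass:8.2f} kg  F_g = {force:12.4f} N  F_g/m = {force / mass:.4f} m/s^2")
```

The forces differ by a factor of a hundred thousand, but the ratio \(F_g/m\) comes out the same for both, about 9.82 m/s², which is what equation 10 says it should.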

And once again, if spacetime curvature is caused by mass, energy, and momentum, we can ask ourselves how this could be captured mathematically. As in Part I, a formal derivation of the relationship between the two is beyond the scope of an introduction to the topic, but we can introduce the types of mathematical objects needed and how they relate to each other. The first thing we need is an object that describes curvature. Like the terms introduced so far, it will need to capture the change in angles over infinitesimal displacements from any reference frame we view it from, so it will need to be a covariant or contravariant tensor. And since we want it to describe curvature specifically rather than displacements, it will be a function of the Christoffel symbols that describe how they change when we walk a parallel transport path (or more properly, a function of their first derivatives, or rates of change). To unambiguously capture this, we will have to carry a four-vector \(ds\) (that is, a vector in three spatial dimensions plus time) around an enclosed path for which all the interior angles are orthogonal to each other (locally 90 degrees). Previously, we were able to do this with a triangular path in Cathy’s space because for clarity of the underlying principles we presumed it to be spherically curved, but that won’t be true of curved spacetime in general. So, now we must carry our four-vector along a four-legged parallel transport path (presumed to be infinitesimally small for a local curvature description), again preserving its local orientation at every point, as shown in Figure 4 (Wikimedia, 2015).

Figure 4

Upon returning to our starting point, we will have a function that describes how each of the four components of \(ds\) changed with respect to the others for each of the four legs of the journey. As such it will be a tensor with four indices (rank 4) each of which covers four dimensions, so it will have \(4^4\), or 256 components. This tensor, known as the Riemann curvature tensor \(R^{\mu}_{\nu\rho\sigma}\), fully describes the actual curvature of spacetime at every point on the manifold. It can be specified in covariant or contravariant terms, but since it captures how a contravariant vector is affected by local covariant curvature, it’s customary to express it with one “upstairs” index and three “downstairs” ones, as shown here.

Before going any further, there are two related tensors we’re going to need (why will become apparent shortly). In Part I we discussed how a tensor object defined by N indices can be “contracted” to fewer indices by projecting one or more of the index’s components onto the others—in essence, “averaging” it into the remaining ones. For a tensor expressed in covariant form for all indices, we do this by multiplying it by the contravariant metric tensor in one or more of its indices. Contracting the Riemann tensor in this manner for two of its four indices gives,

\(g^{\rho\sigma}R_{\rho\mu\sigma\nu} = R_{\mu\nu}\)                    [Eqn. 11]

The resulting tensor, \(R_{\mu\nu}\), is known as the Ricci tensor. Contracting it again on both of its indices yields the Ricci scalar, \(R\). These have different physical interpretations. The Ricci tensor describes the rate of change of an infinitesimal element of spacetime volume along \(ds\) due to tidal forces. That is, as we move through spacetime along a group of infinitesimally separated parallel geodesics, it describes how an element of volume between them changes in each direction. The Ricci scalar, on the other hand, gives a non-dimensional measure of how the overall enclosed volume itself changes.
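For a concrete instance of the final contraction, consider a unit sphere like the one we imagined for Cathy. The sketch below takes as given the standard results that in \((\theta, \phi)\) coordinates both the metric and the Ricci tensor of a unit sphere are \(\mathrm{diag}(1, \sin^2\theta)\); neither is derived in this article, and the function name is illustrative.

```python
import math

def ricci_scalar_unit_sphere(theta):
    """Contract the Ricci tensor of a unit sphere with the inverse metric.

    Assumed standard results in (theta, phi) coordinates:
    g_{mu nu} = diag(1, sin^2 theta) and R_{mu nu} = diag(1, sin^2 theta).
    """
    s2 = math.sin(theta) ** 2
    inverse_metric = [[1.0, 0.0], [0.0, 1.0 / s2]]  # g^{mu nu}
    ricci_tensor = [[1.0, 0.0], [0.0, s2]]          # R_{mu nu}
    # R = g^{mu nu} R_{mu nu}: sum over both repeated indices.
    return sum(inverse_metric[m][n] * ricci_tensor[m][n]
               for m in range(2) for n in range(2))

for theta in (0.3, 1.0, 2.5):
    print(theta, ricci_scalar_unit_sphere(theta))
```

Although the individual tensor components depend on where we are, the contracted scalar comes out as 2 (up to rounding) at every colatitude, reflecting the fact that a sphere is curved by the same amount everywhere.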

Next, we need a tensor object that describes the mass, energy, and momentum we suspect to be curvature’s source. That tensor (which we won’t make any attempt to formally derive here) is known as the stress energy momentum tensor, \(T^{\mu\nu}\). Its components are defined in a manner similar to those of the metric tensor, \(g_{\mu\nu}\), but using momentum density four-vectors (momentum density in three spatial dimensions plus energy density, which can be thought of as “momentum” in time for a stationary object). Because its momentum density components are vectors, it is customary to express it in contravariant form (indices “upstairs”). The first index (\(\mu\)) gives the four-momentum components being considered, and the second (\(\nu\)) gives the direction it is being compared to. The physical significance of its components is as shown in Figure 5 (Wikimedia, 2013).

 

Figure 5 – The Stress Energy Momentum Tensor

With these tools in hand, we can proceed with our investigation of how mass, energy, and momentum curve space and time, but there are still a few constraints we need to account for.

First, the stress energy momentum tensor is rank 2 but the Riemann curvature tensor is rank 4 (that is, the former has two indices with 16 components, whereas the latter has 4 indices and 256 components), so we can’t just equate them to each other. Whatever effect \(T^{\mu\nu}\) has on curvature will have to manifest itself as a rank 2 curvature object as well—that is, it will have to be a contraction of the Riemann tensor that reflects the behavior we observe in gravity, so we want to know what sort of contraction will give us that.

We saw earlier that in the absence of forces, spacetime curvature manifests as acceleration. Strictly speaking, this applies only to point masses in the gravitational field of a much larger mass. For objects that have size and shape, the story changes. In Newtonian physics, the gravitational force between two masses varies inversely as the square of the distance between them (equation 9). So, if you are falling toward the earth feet first, your feet are being pulled harder than your head because they are closer to the earth’s center of gravity. Inasmuch as this is the low mass/energy/momentum limit of GR, the same will be true in curved spacetime as well. Likewise, your freefall into the earth’s gravitational well will be along a geodesic, and the deeper you go, the closer adjacent geodesics to your sides will be. Figure 6 (Wikimedia, 2008) shows what a gravitational well created by a mass at the bottom of the “pocket” looks like.3 The longitudinal lines are freefall geodesics with their steepness at each node being the strength of gravity there, and the squares enclosed by the grid can be thought of as shapes.

 

Figure 6 – Gravitational Well

Notice how falling into the well squeezes the latitudinal rectangles into increasingly longitudinal ones. In the earth’s relatively weak gravitational field compared to your size, the effect is too small to notice. But as you fall toward it, feet-first, you are being stretched and squeezed. This stretching and squeezing of large objects are tidal forces, and in the limit of a point mass, they reduce to simple freefall acceleration. Since in the most general terms, tidal forces are how curvature manifests, we would expect the stress energy momentum tensor to equate to a rank 2 tensor that describes them. And as we’ve seen, we have one… the Ricci tensor!

But we’re not out of the woods yet. There is one more constraint we need to honor, another of the fundamental ones we learned in our high school physics: conservation of energy and momentum. Although neither is well-defined nor self-evidently conserved for the whole universe (or large regions of it), for locally flat inertial reference frames in the tangent planes of every point in it, both need to be conserved. This means that for every point on the manifold the divergence of the stress energy momentum tensor must be zero. That is,

\(\nabla_{\mu}T^{\mu\nu} = 0\)                   [Eqn. 12]

And here we have a problem… Tidal forces do not vanish in locally flat regions, and neither does the divergence of the Ricci tensor. If they did, falling through a black hole event horizon would be a lot less traumatic! So, our contracted curvature tensor object is going to need some tweaking.

Fortunately, the full Riemann curvature tensor itself gives us a way out. As it happens, its own internal consistency does require it to vanish locally; when curvature vanishes (as it must in local tangent planes) so does the curvature tensor. One consequence of this is that its covariant derivatives, cycled through the derivative index and the tensor’s last two indices, must add to zero. That is,

\(\nabla_{\lambda}R^{\mu}_{\ \nu\rho\sigma} + \nabla_{\rho}R^{\mu}_{\ \nu\sigma\lambda} + \nabla_{\sigma}R^{\mu}_{\ \nu\lambda\rho} = 0\)                   [Eqn. 13]

This relationship is known as the second Bianchi identity (of which there are several). Again, we needn’t worry about its formal derivation here. But for our purposes, what matters is that with some mathematical gymnastics we can derive from it the contracted Bianchi identity,

\(\nabla_{\mu}R^{\mu\nu} = \frac{1}{2}\nabla_{\mu}g^{\mu\nu}R\)                   [Eqn. 14]

 Gathering terms gives,

\(\nabla_{\mu}\left ( R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R \right ) = 0\)                   [Eqn. 15]
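For readers who want to see the “mathematical gymnastics” behind equation 14, here is a sketch. It assumes three standard properties that this article does not derive: the metric is unchanged by covariant differentiation (\(\nabla_{\lambda}g^{\mu\nu} = 0\)), the fully covariant Riemann tensor changes sign when the two indices of its first pair or of its last pair are swapped, and the Ricci tensor is the contraction \(R_{\nu\sigma} = g^{\mu\rho}R_{\mu\nu\rho\sigma}\). Start from the second Bianchi identity with all indices lowered,

\(\nabla_{\lambda}R_{\mu\nu\rho\sigma} + \nabla_{\rho}R_{\mu\nu\sigma\lambda} + \nabla_{\sigma}R_{\mu\nu\lambda\rho} = 0\)

and multiply through by \(g^{\mu\rho}g^{\nu\sigma}\). The first term contracts directly to \(\nabla_{\lambda}R\). In the second term, swapping the first pair of indices gives \(g^{\nu\sigma}R_{\mu\nu\sigma\lambda} = -R_{\mu\lambda}\), and in the third, swapping the last pair gives \(g^{\mu\rho}R_{\mu\nu\lambda\rho} = -R_{\nu\lambda}\). The two remaining terms are therefore equal, and the identity becomes

\(\nabla_{\lambda}R - 2\nabla^{\mu}R_{\mu\lambda} = 0 \quad\Rightarrow\quad \nabla_{\mu}R^{\mu\nu} = \frac{1}{2}g^{\mu\nu}\nabla_{\mu}R\)

Because the metric passes freely through the covariant derivative, the right-hand side can equally be written \(\frac{1}{2}\nabla_{\mu}\left(g^{\mu\nu}R\right)\), which is what allows both terms to be gathered under a single divergence.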

And finally, by combining the Ricci tensor for tidal forces and the Ricci scalar for volumetric curvature, we have a tensor object we can equate to the stress energy momentum tensor that captures the spacetime curvature it induces while sharing with it a zero divergence that locally preserves conservation of energy and momentum. It’s customary to refer to the term in brackets as the Einstein tensor \(G^{\mu\nu}\), from which we have,

\(G^{\mu\nu} = R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R = \kappa T^{\mu\nu}\)                   [Eqn. 16]

Where \(\kappa\) is a proportionality constant which again, we won’t derive here, but turns out to be,

\(\kappa = \frac{8\pi g_c}{c^4}\)                   [Eqn. 17]
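To see how stiff spacetime is, equation 17 can be evaluated in SI units. The sketch below uses the standard published values of \(g_c\) and \(c\); the function name `einstein_constant` is illustrative.

```python
import math

G_C = 6.67430e-11   # universal gravitational constant g_c, m^3 / (kg s^2)
C = 299_792_458.0   # speed of light, m/s

def einstein_constant():
    """Proportionality constant kappa = 8 pi g_c / c^4 (Eqn. 17), in s^2 / (kg m)."""
    return 8.0 * math.pi * G_C / C ** 4

kappa = einstein_constant()
print(f"kappa     = {kappa:.4e} s^2/(kg m)")
print(f"1 / kappa = {1.0 / kappa:.4e} N")
```

The result is roughly \(2 \times 10^{-43}\) in SI units. A constant that small means it takes an enormous concentration of mass and energy to produce even a modest amount of curvature, which is why Newtonian gravity works so well in everyday settings.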

And there you have it, Ladies and Gentlemen… an equation that relates mass, energy, and momentum to spacetime curvature, and therefore gravitation!

One final question remains. Technically, equation 16 is arbitrary to within an additive constant as well. When Einstein first derived this relationship, he realized that it predicted a universe that was necessarily expanding or contracting, and thus impermanent. The idea of a universe that wasn’t eternal was philosophically abhorrent to him, so he included a constant term on the left (typically denoted with the Greek letter \(\Lambda\)), multiplied by the metric tensor for consistency and sized to offset the expansion, thereby preserving a curved, but static and eternal universe. Later, when it was independently confirmed that the universe is, in fact, expanding (a fascinating story in its own right!), Einstein retracted the constant, calling it “the greatest mistake of my life.” But as it turns out, it wasn’t. It has since been discovered that the cosmological constant is not only real, but positive and causing the expansion of the universe to accelerate! The discovery was so striking that the leaders of the team who discovered it, Saul Perlmutter, Brian Paul Schmidt, and Adam Guy Riess, were jointly awarded the 2011 Nobel Prize in physics.

So… combining equations 16 and 17 with all terms expressed as covariant (which is customary), and restoring the cosmological constant to its rightful place we have,

\(G_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi g_c}{c^4} T_{\mu\nu}\)                   [Eqn. 18]

These are the celebrated Einstein Field equations that are the hallmark of general relativity. The terms on the left fully describe the geometry of spacetime for all observers at every point in the universe, and the term on the right describes the mass, energy, and momentum that produces that geometry.

This was meant to be an introduction to spacetime curvature, so we’ve arrived at them with some big leaps and little in the way of formality. Though at first blush they may seem daunting and difficult to wrap your mind around, the important thing for today is an understanding of what the terms in these equations mean, and why they must have the general forms they do to describe how length, height, breadth, and duration can be curved. For those who want to explore further, there are any number of good introductions to general relativity for the layperson. One that I found particularly readable and informative was Clifford Will’s book Was Einstein Right? – Putting General Relativity to the Test (1993), first published in 1986 when I was in grad school. If you feel ready to make the deep dive into the full formalism of general relativity, there are many textbooks on the subject. But if there is one that has stood for many years as the Bible of general relativity, it’s Misner, Thorne, and Wheeler’s Gravitation (2017). It’s rigorous and will take some time to wade through, but it’s the best and most thorough general relativity course I am personally aware of and has been since it was first published in 1973.

The psalmist tells us,

“The heavens are telling the glory of God; and the firmament proclaims his handiwork. Day to day pours forth speech, and night to night declares knowledge. There is no speech, nor are there words; their voice is not heard; yet their voice goes out through all the earth, and their words to the end of the world.” – Psalm 19:1-4

When I gaze up at the nighttime sky, I see stars that are hundreds of light years away, many of which are surrounded by worlds, possibly even worlds not unlike my home. And I realize that I’m gazing upon those stars and worlds not as they are now in my reference frame, but as they were centuries ago. If I were to turn a large enough telescope on that sky I would see galaxies, quasars, nebulae, and a bewildering spectacle of other wonders, some of which are billions of years old and revealing themselves to me from a time long before humans or even our solar system existed. And if I filter their light through a spectrometer, I will see the fingerprints of their chemical constituents shifted increasingly toward the red the more distant they were, and I would realize that I was watching the universe grow—not as an expansion of matter into a pre-existing void, but literally the expansion of space and time themselves from a cataclysmic birth 13.73 billion years ago. I would see in that the glory of God and his handiwork…

And I would suspect, as J.B.S. Haldane did a century ago, the handiwork of God, where length, breadth, height, and duration are themselves clay in His artistic hands, is not only queerer than I suppose, but queerer than I can suppose.

Footnotes

1)  In Part I we introduced the nabla symbol on the left (\(\nabla_{\mu}\)), which in mathematics is known as the del operator (the Laplace operator is its square, \(\nabla^{2}\)). Here it is a shorthand reference for the gradient (first derivative) along the direction of the \(\mu\) coordinate. That is, in flat space \(\nabla_{\mu}\) reduces to the partial derivative \(\partial_{\mu} = \frac{\partial }{\partial x^{\mu}}\), where the index \(\mu\) = 0, 1, 2, 3. When that index is summed against an index of the object being differentiated, as in equation 12, the result is referred to as the divergence.

2)  Interestingly, this isn’t just theoretical. Google and Apple map apps leverage first-order corrections for spacetime curvature near the earth’s surface to refine the accuracy of your location from raw GPS triangulated signals. General relativity is literally why your phone knows your location to within a couple hundred feet or so rather than one or two city blocks!

3)  Strictly speaking, this is a 2-D gravitational well with absolute time rather than a true 4-D gravitational well, which would include time. But for the current purpose, it suffices to illustrate the point.

References

Misner, C.W., Thorne, K.S. & J.A. Wheeler. 2017. Gravitation. Princeton University Press (Oct. 24, 2017). ISBN-10: 0691177791; ISBN-13: 978-0691177793. Online at https://www.amazon.com/Gravitation-Charles-W-Misner/dp/0691177791/ref=sr_1_1?crid=1OKXLNQA5YVAR&keywords=gravitation&qid=1694219167&sprefix=gravitation%2Caps%2C253&sr=8-1&ufe=app_do%3Aamzn1.fos.18630bbb-fcbb-42f8-9767-857e17e03685. Accessed Oct. 9, 2023.

Wikimedia. 2008. Image courtesy of AllenMcC. Based on the work of Bamse, and Melchoir, CC BY-SA 4.0, Mar. 2, 2013. Online at https://commons.wikimedia.org/wiki/File:GravityPotential.jpg. Accessed Oct. 9, 2023.

Wikimedia. 2013. Image courtesy of Maschen. Based on the work of Bamse, and Melchoir, CC BY-SA 4.0, Mar. 2, 2013. Online at https://commons.wikimedia.org/w/index.php?curid=24940142. Accessed Oct. 9, 2023.

Wikimedia. 2015. Image courtesy of IkamusumeFan, CC BY-SA 4.0, Jan. 1, 2015. Online at https://commons.wikimedia.org/w/index.php?curid=2615879. Accessed Oct. 9, 2023.

Will, C.M. 1993. Was Einstein Right? – Putting General Relativity to the Test. Basic Books; 2nd edition (June 2, 1993). ISBN-10: 0465090869; ISBN-13: 978-0465090860. Online at https://www.amazon.com/Was-Einstein-Right-Putting-Relativity/dp/0465090869/ref=sr_1_1?crid=TOG1ZAWGPF20&keywords=was+einstein+right&qid=1696883510&sprefix=was+einstein+right%2Caps%2C171&sr=8-1. Accessed Oct. 9, 2023.

Curvature I: Space

My own suspicion is that the Universe is not only queerer than we suppose, but queerer than we can suppose. – J.B.S. Haldane (Possible Worlds and Other Papers, 1927)

I was born hopelessly curious and under the tutelage of a nurturing teacher and parents who surrounded me with books, I fell in love with physics in the 2nd grade, when all my friends were enthralled with Batman, jets, and G.I. Joe. What drew me to it was the wonder of mysteries I couldn’t wrap my budding mind around, and chief among these was the notion that space and time could be curved. I remember poring over my parents’ Time-Life encyclopedia set which, among other things, contained a full-color plate titled “Three kinds of space” featuring gridded surfaces shaped like a sphere, a pancake, and a saddle labeled +1, 0, and -1 respectively (the Friedmann constants, although of course, I didn’t know that then). I remember gazing at them struggling to understand… How can length, breadth, height, and duration be bent…? What does that even mean…? The question became even more mind-numbing when I later discovered that there can be spaces with more than three or four dimensions (indeed, an infinite number of dimensions) and these can all be curved as well. It wasn’t until well into graduate school that I started to get a shaky footing in that recondite landscape.

As three-dimensional beings, most of us grasp curvature visually. We can see curved lines and sheets against the backdrop of three dimensions because they bend into the other dimension/s. But how can three-dimensional space (or more properly four-dimensional space-time) bend when there are no other dimensions to bend into? The key to understanding this is to approach the question not by trying to visualize higher-dimensional spaces, but by exploring them with a mathematically based thought experiment physicists and mathematicians refer to as parallel transport. Let’s introduce two explorers: Flat Freddy who lives in a two-dimensional flat universe, and his sister Curved Cathy who lives in a curved one. For them, there is no third dimension much less any higher ones.

Parallel Transport

Let’s start with Freddy, placing him at the vertex of a triangle with two parallel vectors oriented along his direction of travel, one red and one green (Figure 1).

Figure 1

Now, let him go for a walk around the triangle’s perimeter in the direction the vectors are pointing, leaving the green vector behind, and taking the red one with him while ensuring that for the entire journey it remains oriented in the same direction (as we will soon see, this matters). Completing the first leg of the journey, he arrives at point B (Figure 2) with his red vector still parallel to the green one, and unchanged from its original orientation (light red).

 

Figure 2

Then, let’s have him journey an equal distance to the right at a 90-degree angle. When he arrives at point C, his red vector is still parallel to the green one and its previous orientations (Figure 3).

 

Figure 3

Finally, let’s take Freddy back home and reunite his two vectors. When Freddy checks his compass, he sees that point A is to his left and back at a 45-degree angle to the BC leg he just covered. When he arrives home again, he finds that his red and green vectors are still parallel to each other, exactly as they were when he began, and as they remained throughout his trek (Figure 4).

 

Figure 4

Getting his map out, Freddy sees that his journey traversed a right triangle with two 45-degree angles, the final leg of which covered a distance given by the Pythagorean Theorem,

\(\overline{AC} = \sqrt{(\overline{AB})^{2} + (\overline{BC})^{2}}\)            [Eqn. 1]

And enclosed an area given by,

\(A = \frac{X^{2}}{2}\)           [Eqn. 2]

Where \(X\) is the length of \(\overline{AB}\) (or \(\overline{BC}\)). No surprises here. This is exactly what earth-bound three-dimensional creatures like us would expect.

Parallel Transport in Curved Space

Now, let’s have Cathy take the same journey in her universe. For clarity’s sake, let’s assume her universe is spherical with a “radius” that will better illustrate the outcome (more on why that word is in quotes soon). Like Freddy, we’re going to have her walk a triangular path beginning at point A with parallel red and green vectors, both tangent to the straightest path from point A to point B (Figure 5). As before, she will leave the green vector behind while carrying the red one with her, keeping it oriented in the same direction throughout. This time however, things are going to be a little more subtle. In Freddy’s universe the meaning of “straight” is clear enough. But as we will soon see, in Cathy’s this term will require a more precise definition.

 

Figure 5

When she completes the first leg of her journey at point B, her red vector hasn’t changed orientation. It is still pointing straight ahead, tangent to her path of travel (Figure 6).

Figure 6

Following in Freddy’s footsteps, she then journeys an equal distance to the right at a 90-degree angle, arriving at point C with her red vector still unchanged in direction (Figure 7).

 

Figure 7

Cathy has now travelled the same route from point A to point C that Freddy did in his universe and covered the same distance getting there. But now, something is amiss. When she checks her compass, she finds that point A isn’t to the left of her BC leg and 45 degrees back. Home is now 90 degrees to her left. Even more strangely, upon arriving home (Figure 8) she sees that her red vector is no longer parallel to the green one as it was when she started (light red). Now it is oriented at 90 degrees to it, even though it remained pointed in the same direction for the entire trip!

Figure 8
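Cathy's surprise can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: it borrows the same 3-D embedding used to draw Figures 5-8 (which Cathy herself has no access to), places A, B, and C on the x, y, and z axes of a unit sphere, and relies on the standard fact that carrying a tangent vector along a great-circle arc is equivalent to rotating it about that arc's axis. The helper names `rotate` and `angle_between` are hypothetical.

```python
import math

def rotate(v, axis, angle):
    """Rotate 3-vector v about a unit-length axis by angle (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * b for a, b in zip(axis, v))
    cross = (axis[1] * v[2] - axis[2] * v[1],
             axis[2] * v[0] - axis[0] * v[2],
             axis[0] * v[1] - axis[1] * v[0])
    return tuple(v[i] * c + cross[i] * s + axis[i] * dot * (1 - c) for i in range(3))

def angle_between(u, v):
    """Angle between two 3-vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v)))))

X, Y, Z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
quarter_turn = math.pi / 2

# Assumed layout: A = X, B = Y, C = Z, so each leg is a quarter of a great circle.
green = (0.0, 1.0, 0.0)   # tangent vector at A, pointing toward B; stays behind
red = green               # identical copy that Cathy carries with her

red = rotate(red, Z, quarter_turn)   # leg A -> B: rotation about z carries x to y
red = rotate(red, X, quarter_turn)   # leg B -> C: rotation about x carries y to z
red = rotate(red, Y, quarter_turn)   # leg C -> A: rotation about y carries z to x

holonomy_deg = angle_between(green, red)
print(f"angle between green and red vectors back at A: {holonomy_deg:.1f} degrees")
```

The red vector is never turned relative to the surface along the way, yet it comes home perpendicular to the green one, which is the 90-degree mismatch Cathy finds.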

Furthermore, when she gets her map out, she sees that unlike her brother, she has traversed an equilateral triangle whose inner angles add up to 270 degrees rather than 180 degrees. And even though the final leg of her journey was noticeably shorter than Freddy’s, she traversed a larger region. Having studied higher mathematics at Flatland University, she is familiar with higher-dimensional spaces than the two-dimensional one she knows, and an equilateral triangle with three 90-degree interior angles sounds suspiciously like a region on the surface of a higher-dimensional sphere. Sure enough, when she measures the area enclosed by her journey, she finds that it is given by,

\(A = \frac{\pi R^{2}}{2}\)           [Eqn. 3]

Where \(R\) is a parameter that behaves mathematically like the radius of a three-dimensional sphere even though in her universe, there is no third dimension to contain one.
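Equation 3 can be cross-checked against a standard result of spherical geometry, Girard's theorem, which this post does not derive: the area of a triangle on a sphere is \(R^{2}\) times the amount by which its interior angles exceed 180 degrees (measured in radians). The sketch below, with a hypothetical function name, applies it to Cathy's triangle and compares her journey with Freddy's for legs of equal length.

```python
import math

def spherical_triangle_area(angles_deg, R):
    """Girard's theorem: area = R^2 * (sum of interior angles - pi), angles in radians."""
    excess = math.radians(sum(angles_deg)) - math.pi
    return R ** 2 * excess

R = 1.0
leg = math.pi * R / 2                 # each side of Cathy's triangle is a quarter great circle

cathy_area = spherical_triangle_area([90, 90, 90], R)
cathy_final_leg = leg                 # her third side is the same length as the first two

freddy_area = leg ** 2 / 2            # Eqn. 2 with X = leg
freddy_final_leg = math.sqrt(2) * leg # Eqn. 1 with AB = BC = leg

print(f"Cathy:  area = {cathy_area:.4f}, final leg = {cathy_final_leg:.4f}")
print(f"Freddy: area = {freddy_area:.4f}, final leg = {freddy_final_leg:.4f}")
```

With these assumptions, Cathy's final leg comes out shorter than Freddy's while her enclosed area comes out larger, as described above. The angular excess, 90 degrees, is also the angle by which her red vector ends up rotated.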

Note that Cathy’s conclusions were based only on measurements of distance and area, and the orientation of a vector she carried with her around a closed two-dimensional path. At no time did she step “outside” of her space into a third dimension from which the radius of a 3-D sphere could be observed. What she measured is simply a parameter that behaves like one in area calculations. Of course, Figures 5-8 are shown in 3-D perspective for heuristic purposes, but beyond that, there is no need for Cathy to postulate any higher dimensions to explain what she sees. As far as she knows, in her universe only two dimensions exist. How could Cathy’s two-dimensional universe be “spherical” when the sphere of our experience is a three-dimensional shape?

Straight vs. Geodesic

To answer this question, let’s go back to the turn of the 3rd Century B.C. when the Greek Mathematician Euclid published his Elements. In it, he laid the foundation of geometry in our three-dimensional space and Freddy’s two-dimensional one with five axioms, or postulates. Of these, four are interdependent in that each one can be formally derived from the remaining three. The remaining one, his fifth postulate, he stated as follows,

If a line segment intersects two straight lines forming two interior angles on the same side that are less than two right angles, then the two lines, if extended indefinitely, meet on that side on which the angles sum to less than two right angles. – (Heath, 1956)

It follows from this that if the two interior angles formed are equal to two right angles, those lines will never meet. Though Euclid doesn’t specifically say so, this would make the two lines parallel, which led 18th Century Scottish mathematician John Playfair to restate it in what today is perhaps its most popular version,

There is at most one line that can be drawn parallel to another given one through an external point.

For centuries, the fifth postulate troubled mathematicians because reasonable as it may seem, it’s entirely ad hoc. It has no interdependence with the other four and is superfluous to a complete formalism of Euclidean geometry. It was only a matter of time until people began to wonder what geometric doors would be opened if it were discarded.

The first step in that direction is a reexamination of what we mean by straight and parallel. Like Freddy, most of us think of a line as straight if it is one-dimensional in the sense of having no curvature, or more formally perhaps, if all points on it share a common tangent vector in one direction. Likewise, we think of two lines as parallel if they lie within a common two-dimensional plane and are aligned in the same direction with no intersection point. Indeed, this is how mathematicians defined both terms for many centuries, and to this day Euclid’s fifth postulate is often referred to as his parallel postulate. But if it proves to be superfluous to the formalism of Euclidean geometry then non-Euclidean geometry becomes possible and these definitions will need to be revisited.

Without the fifth postulate, on an N-dimensional manifold (or space), a curve connecting any two points \(A(x_1, x_2, \ldots, x_N)\) and \(B(x_1, x_2, \ldots, x_N)\) is said to be straight if, and only if, it is the shortest path between them on the manifold. Such a curve is called a geodesic. In a flat space like Freddy’s (or ours), this reduces to our intuitive definition above, but that definition alone does not constrain manifolds to be flat. This suggests that if we want to quantify how paths between events are traversed in universes like Cathy’s, our mathematical descriptions need to be revised, and our parallel transport thought experiment gives us a clue as to how.
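To see how the shortest-path definition of “straight” plays out in a space like Cathy’s, here is a minimal sketch comparing two routes between the same pair of points on a unit sphere. The latitude and longitude coordinates, the choice of points, and the function names are assumptions made for illustration. One route follows a circle of constant latitude, which looks “straight” on a flat map. The other follows a great circle, computed with the standard haversine formula.

```python
import math

def great_circle_distance(lat1, lon1, lat2, lon2, R=1.0):
    """Shortest surface distance between two points (haversine formula, radians)."""
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(h))

def along_parallel_distance(lat, lon1, lon2, R=1.0):
    """Length of the path that stays on the circle of constant latitude."""
    return R * math.cos(lat) * abs(lon2 - lon1)

lat = math.radians(60)                      # both points sit at 60 degrees latitude
lon_a, lon_b = 0.0, math.radians(180)       # on opposite sides of the sphere

geodesic = great_circle_distance(lat, lon_a, lat, lon_b)
parallel = along_parallel_distance(lat, lon_a, lon_b)

print(f"great-circle route: {geodesic:.4f}")
print(f"constant-latitude route: {parallel:.4f}")
```

For these two points the great-circle route passes over the pole and is the shorter of the two, so it is the one that counts as straight in the sense defined above.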

Modeling Curved Geometry

A full mathematical treatment of general relativity is beyond our scope today, but we can get our feet wet with an overview of the tools it will require. To model any N-dimensional space, be it flat or curved, there are two fundamental requirements we must meet.

First, we need a way to describe not only distances, but angles. To do that we will need to define at least two vectors at every point on it, \(r^{\mu}\) and \(r^{\nu}\), where the indices \(\mu\) and \(\nu\) denote the N coordinates of each. Strictly speaking, they can be specified in any coordinate system of our choosing, and oriented in any non-parallel direction we like, but ideally, we want them to be orthogonal to each other (as shown in Figure 9) so that they define a coordinate system themselves. With these, we can then use a vector inner product, or dot product of them to define a matrix function \(g_{\mu\nu}\) whose squared diagonal terms can be summed to give the squared distance along any interval, and whose off-diagonal terms are the dot product projections of each vector’s components onto those of the other. This function, which is referred to as the metric tensor,1 contains within its \(N^{2}\) components a description of all lengths and trigonometric relationships between the two vectors.2  [Aron discusses this at length in his 2012 post All points look the same.]

Figure 9

Neglecting time for simplicity (we’ll get to this later), in a flat 2-D space like Freddy’s, the two vectors will not have components that lie along each other so the off-diagonal terms will be zero. And if the vectors are chosen so that their lengths define units in our chosen coordinate system,3  the diagonal terms will be 1 and the sum of their squares defines the Pythagorean theorem. Thus,

\(g_{\mu\nu} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\)           [Eqn. 4]
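As a rough illustration of how a metric turns small coordinate steps into distances, the sketch below evaluates the squared interval \(ds^{2} = g_{\mu\nu}\,dx^{\mu}dx^{\nu}\) for two cases. The first is Freddy's flat metric, with zero off-diagonal terms and diagonal terms of 1. The second is the standard metric of a sphere in colatitude and longitude coordinates, which is an assumption here since this post does not derive it. The function names are hypothetical.

```python
import math

def interval_squared(g, dx):
    """ds^2 = sum over mu and nu of g[mu][nu] * dx[mu] * dx[nu]."""
    n = len(dx)
    return sum(g[m][k] * dx[m] * dx[k] for m in range(n) for k in range(n))

def flat_metric():
    """Freddy's metric in an orthonormal basis: the 2x2 identity matrix."""
    return [[1.0, 0.0], [0.0, 1.0]]

def sphere_metric(theta, R=1.0):
    """Assumed standard sphere metric in (theta, phi), with theta the colatitude."""
    return [[R ** 2, 0.0], [0.0, (R * math.sin(theta)) ** 2]]

step = [3e-3, 4e-3]   # one small coordinate displacement, used in both spaces

flat_ds = math.sqrt(interval_squared(flat_metric(), step))
sphere_ds = math.sqrt(interval_squared(sphere_metric(math.radians(30)), step))

print(f"flat space:   ds = {flat_ds:.6f}")    # the 3-4-5 Pythagorean result
print(f"on a sphere:  ds = {sphere_ds:.6f}")  # same coordinate step, different length
```

In the flat case the metric reproduces the Pythagorean theorem. On the sphere the same coordinate step covers a different distance, and how much it covers depends on where the step is taken.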

Second, we need to ensure that our models preserve one of the most sacred principles in physics: namely, that the universe exists independent of us, so its behavior should be independent of how we choose to describe it. If the most fundamental laws of physics are different here and now in this coordinate system and units than they are there and then in those coordinates and units, that would imply that we have an unreasonably unique status in it. In our hearts, we know that isn’t the case, so our descriptions of it should look the same in all frames of reference and units. In physics this is referred to as the principle of general covariance.

To do this we need to account for the fact that some quantities behave differently under a change of scale in coordinate system units. For instance, if the vector \(r^{\mu}\) is one meter long, it will have a length of 1 in a coordinate system specified in meters. But if the scale is changed to centimeters, its length will be 100. The same will apply to angles. The vector itself remains the same—what has changed is its representation in a rescaled coordinate system. Quantities that behave this way are said to be contravariant because their size will vary counter to variations in the scale of units they’re represented with.

On the other hand, there are quantities such as gradients for which this isn’t the case. A 6% grade is a 6% grade whether we specify it in meters/meter or cm’s/cm, so rescaling coordinate systems will vary length specifications along any coordinate axis, but not the gradient in that direction. The metric tensor \(g_{\mu\nu}\) is such an object. As we’ve seen, it’s effectively a generalized dot product between local coordinate system axes. Since its components give their projections onto each other, it behaves like a gradient under coordinate system transformations. Quantities like this are said to be covariant because they retain their values regardless of how their coordinate system scale is varied. The difference is shown in Figure 10 (Wikimedia, 2018).

Figure 10

This may seem like hair-splitting, but when we move from the realm of absolute flat spaces to that of curved geometries, the difference matters. Some quantities, like vectors, lend themselves to a contravariant description whereas others, like gradients, lend themselves to a covariant one. In the parlance of general relativity, it’s customary to specify the indices of the former with superscripts (“upstairs”) and the latter with subscripts (“downstairs”). Each type of tensor can be converted into the other by contracting it with the metric tensor or its inverse (which is referred to as “raising or lowering indices”), but things are a lot clearer when we stick to representing each in the form that is most natural to them. As such, objects like vectors whose specifications vary under a rescaling of coordinate axes are typically specified with “upstairs” indices and those like the metric that behave more like gradients use “downstairs” ones.
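In standard tensor calculus, lowering an index is done with the metric tensor, \(v_{\mu} = g_{\mu\nu}v^{\nu}\), and raising one with its inverse. The sketch below shows both on a sphere of radius 2 at a colatitude of 30 degrees. The sphere metric, the sample vector, and the function names are assumptions chosen for illustration.

```python
import math

def lower_index(g, v_up):
    """v_mu = g_{mu nu} v^nu (sum over nu)."""
    n = len(v_up)
    return [sum(g[m][k] * v_up[k] for k in range(n)) for m in range(n)]

def raise_index(g_inv, v_down):
    """v^mu = g^{mu nu} v_nu (sum over nu)."""
    n = len(v_down)
    return [sum(g_inv[m][k] * v_down[k] for k in range(n)) for m in range(n)]

R, theta = 2.0, math.radians(30)
g = [[R ** 2, 0.0], [0.0, (R * math.sin(theta)) ** 2]]                # assumed sphere metric
g_inv = [[1 / R ** 2, 0.0], [0.0, 1 / (R * math.sin(theta)) ** 2]]   # its matrix inverse

v_up = [0.5, 1.5]                       # "upstairs" (contravariant) components
v_down = lower_index(g, v_up)           # "downstairs" (covariant) components
round_trip = raise_index(g_inv, v_down) # raising again recovers the original

# Contracting upstairs with downstairs gives the vector's squared length.
length_squared = sum(a * b for a, b in zip(v_up, v_down))

print("v_down =", v_down)
print("length squared =", length_squared)
```

The two sets of components differ, but they describe one and the same vector. Contracting one against the other gives a squared length that no longer depends on which representation we started from.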

With these qualifications, let’s revisit our earlier parallel transport experiments and put some flesh on the bones. In Figure 9 we saw that in Freddy’s universe, \(r^{\mu}\) and \(r^{\nu}\) will be the same everywhere and so will \(g_{\mu\nu}\). It makes no difference where (or when) we place any coordinate system. But what about Cathy’s universe? At point A, a small surrounding region will be approximately flat and represented by a tangent plane containing \(r^{\mu}\), \(r^{\nu}\) centered on it (Figure 11). Now, let’s define a third tangent vector \(s^{\sigma}\) along our parallel transport path from A to B.

Figure 11

Once again, we walk the path from A to B in the direction \(ds^{\sigma}\) as in Figures 5 and 6, carrying the tangent plane and \(r^{\mu}\) and \(r^{\nu}\) with us (Figure 12).

 

Figure 12

At each point in the path, \(r^{\mu}\), \(r^{\nu}\), and \(s^{\sigma}\) are still oriented in the same directions with respect to any local coordinate system, and the latter remains parallel to the path we’re travelling. When we arrive at B, we see that things still look the same to us as they did when we started. But this time the local tangent plane and coordinate systems we carried with us have twisted with respect to where they were at A and no longer look the same to an observer who stayed behind.

In Freddy’s universe, one tangent plane uniquely spans the entire space. All distances and angles look the same from any reference frame within it, and carrying vectors such as \(r^{\mu}\) and \(r^{\nu}\) from one point to another is just a matter of summing displacements along any given path between them. But in a curved space like Cathy’s, we need a mathematical object that not only describes displacements along a path, but also one that maps that path onto the local tangent plane as it rolls across the curved surface as shown in Figure 13 (Wikimedia, 2023).

Figure 13

This object, which mathematicians refer to as an affine connection, allows us to describe vectors along any path through a larger curved space in terms of a fixed coordinate system within the local tangent plane at any point. An infinite number of such connections are possible but there is one, known as the Levi-Civita connection, that is a natural choice for spaces that have a well-defined metric tensor at every point because it allows us to define a derivative (or rate of change) along a curved space path that generalizes the usual mathematical rules of vector calculus in locally flat tangent plane regions to the larger curved space. This covariant derivative (which we denote with the nabla symbol 4) will need to have two parts and is given by,

\(\nabla_{\mu} = \partial_{\mu} + \Gamma^{\sigma}_{\mu\nu}\)           [Eqn. 5]

For an infinitesimal displacement along any path, the first term on the right is the gradient with respect to the local tangent plane as defined in the usual flat space manner. The second term is the rate at which the tangent plane itself (and the covariant metric tensor embedded in it) is changing in the direction of a contravariant displacement \(ds^{\sigma}\) in the direction of a tangent vector to the path. As such, it will be a matrix function with three indices, two of which are best represented as covariant and a third contravariant one which we will denote with the index \(\sigma\). This function, which per convention we designate with a capital Greek Gamma, is known as a Christoffel symbol. Since it requires three indices to fully capture the evolution of the metric tensor, in Cathy’s space it will have \(2^{3}\), or 8 components to her metric tensor’s 4. We refer to Christoffels as “symbols” because they aren’t true tensors in that they aren’t globally frame-independent until multiplied by an infinitesimal displacement in at least one direction. And as shown, Equation 5 doesn’t make sense because the indices on the right and left sides don’t agree with each other. More properly, it defines a mathematical operator that must act on something to produce a meaningful equation. Applying it to \(s^{\sigma}\) gives,

\(\nabla_{\mu}s^{\sigma} = \partial_{\mu}s^{\sigma} + \Gamma^{\sigma}_{\mu\nu}s^{\nu}\)           [Eqn. 6]

With the upstairs and downstairs \(\nu\) in the second term cancelling, this equation is now consistent across indices and the Christoffel term behaves like a tensor. This path derivative will look the same from every coordinate system in Cathy’s curved space. In flat spaces like Freddy’s, the tangent plane is the same everywhere and unchanging so the Christoffel term will vanish leaving us with the usual Euclidean directional derivative we learned in first-year vector calculus.
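To give Equation 6 something concrete to do, the sketch below sets the covariant derivative of a vector to zero along a path (the condition for parallel transport) and integrates the result numerically around a circle of constant colatitude on a unit sphere. Several things here are assumptions not taken from this post: the standard Christoffel symbols of the sphere in colatitude and longitude coordinates, the choice of path (a latitude circle rather than Cathy's triangle), the Runge-Kutta integrator, and all function names. The textbook expectation is that the vector returns rotated by \(2\pi\cos\theta_{0}\), which is 180 degrees for a colatitude of 60 degrees.

```python
import math

def christoffel_sphere(theta):
    """All 8 components gamma[sigma][mu][nu] on a unit sphere (0 = theta, 1 = phi).
    Only three are nonzero; these are the standard values, assumed here."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)
    return gamma

def transport_rhs(v, theta):
    """Parallel transport along phi: dv^sigma/dphi = -Gamma^sigma_{phi nu} v^nu."""
    gamma = christoffel_sphere(theta)
    return [-sum(gamma[s][1][n] * v[n] for n in range(2)) for s in range(2)]

def transport_around_latitude(v0, theta, steps=2000):
    """Integrate the transport equation from phi = 0 to 2*pi with classic RK4."""
    h = 2 * math.pi / steps
    v = list(v0)
    for _ in range(steps):
        k1 = transport_rhs(v, theta)
        k2 = transport_rhs([v[i] + 0.5 * h * k1[i] for i in range(2)], theta)
        k3 = transport_rhs([v[i] + 0.5 * h * k2[i] for i in range(2)], theta)
        k4 = transport_rhs([v[i] + h * k3[i] for i in range(2)], theta)
        v = [v[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(2)]
    return v

theta0 = math.radians(60)      # colatitude of the circle being walked
v_start = [1.0, 0.0]           # unit vector pointing along increasing theta
v_end = transport_around_latitude(v_start, theta0)

# Inner product with the sphere metric diag(1, sin^2 theta) gives the turning angle.
cos_turn = v_end[0] * v_start[0] + math.sin(theta0) ** 2 * v_end[1] * v_start[1]
turn_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_turn))))

print(f"rotation after one full circuit: {turn_deg:.3f} degrees")
```

If every entry of the Christoffel array is set to zero, as in Freddy's flat space with Cartesian coordinates, the right-hand side vanishes and the vector comes back unchanged.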

 

For today’s purposes we needn’t worry about how these equations were derived. The important thing is to understand why curved spaces require these kinds of mathematical tools rather than the familiar ones of Euclidean geometry, and how they reflect curvature in multiple dimensions without additional dimensions to “curve into.” If you’re like me, the latter point is the biggest stumbling block. It’s one thing to know that curved spaces are mathematically possible without additional background dimensions. But it’s another thing altogether for three-dimensional Euclidean space beings to visualize them. Space (or spacetime) can be curved in one of two ways: positive, or negative.5 Positively curved space is spherical and, if extended far enough, finite and closed. In our previous example, Cathy’s universe is a spherical one. And as we saw, the interior angles of a triangle in such a space add to greater than 180 degrees. Her space is finite in size, and travelling in a straight line in any direction will eventually return you to where you started from. Negatively curved space is saddle-shaped and has hyperbolic geometry. The interior angles of a triangle in it would add to less than 180 degrees, and like flat Euclidean space, it extends to infinity in all directions. Figure 14 shows both as compared to flat space.

Figure 14

It’s easy to visualize two-dimensional curved spaces like these in isometric views that show their contours in an additional dimension. But what would they look like where there was none?

In the case of a positively curved space, we can’t do this because there is no way to represent a path that returns to where it started in the same number of dimensions.6 But for negatively curved spaces that extend to infinity, we have a visual example in the art of 20th Century Dutch graphic artist M.C. Escher. Among other things, Escher was known for artistic renderings of mathematical concepts including symmetries and tessellation. His Circle Limit collection of woodcuts depicts repeating image patterns whose changing shapes from the center outward are a tessellation of hyperbolic geometry on a disc into right triangles. His 1959 work Circle Limit III (Figure 15), widely regarded as the best in the series, does this with patterns of fish.

 

Figure 15

There are many ways to tessellate geometric spaces and none are perfect, including this one. But if Cathy’s two-dimensional space were negatively rather than positively curved, this would be a reasonable representation of how it would look to her. If she walked a parallel transport path through it as in Figures 5-8 taking the size and orientation of the fish as indicative of distances and angles, upon returning to where she started, she would find that the distances and interior angles she traced would be like those in the negatively curved saddle in Figure 14. And if she travelled a straight geodesic path in any direction indefinitely, she would asymptotically reach infinity as she approached the rim. The disc is two-dimensional, but the geometry embedded in it behaves as though it were a saddle-shaped sheet in three dimensions even though the third dimension isn’t there. The underlying mathematics of its hyperbolic (saddle) geometry are embodied in Equation 6. And while we have until now restricted ourselves to two-dimensional spaces for ease of illustration, notice that the indices in its terms can assume any number of values, not just two. As such, it generalizes to any number of curved dimensions, none of which need any “higher” dimension/s to curve into.
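The claim that the rim lies infinitely far away can be made quantitative. Circle Limit III is commonly described in terms of the Poincaré disc model of hyperbolic geometry. Assuming that model with curvature -1 (an assumption, since this post does not name the model), the distance from the centre to a point at drawn radius \(r < 1\) is \(2\,\mathrm{artanh}(r)\). The function name below is hypothetical.

```python
import math

def hyperbolic_distance_from_center(r):
    """Distance from the disc's centre to a point at drawn radius r (0 <= r < 1),
    in the Poincare disc model with curvature -1."""
    if not 0 <= r < 1:
        raise ValueError("r must lie inside the unit disc")
    return 2 * math.atanh(r)

radii = [0.5, 0.9, 0.99, 0.999, 0.999999]
distances = [hyperbolic_distance_from_center(r) for r in radii]

for r, d in zip(radii, distances):
    print(f"drawn radius {r:<9} -> distance walked {d:.3f}")
```

Each step toward the rim on the page costs Cathy more and more walking, and the total grows without bound as \(r\) approaches 1. On the page this shows up as fish that shrink toward the edge, even though in her geometry they are all the same size.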

There is, however, one dimension that we’ve conspicuously ignored until now… time. We live in a universe where not only length, breadth, and height can be curved, but duration can be as well, and curved spacetime ups the ante in several important respects that we’ll dive into in Part II. So, stay tuned!
 
Curvature II: Spacetime
 

Footnotes

1)   In mathematics, tensors are matrix functions that define a multilinear relationship between sets of objects in a vector space that preserve their identity in any coordinate system or transformation. Vectors can be thought of as a one-dimensional tensor (that is, a tensor with only one column or row). The dimensionality of a tensor’s matrix array (as specified in the number of indices it requires) is referred to as its rank \(R\), and the number of components it will have in an N-dimensional space is given by \(N^{R}\). Thus, \(g_{\mu\nu}\) is a rank 2 tensor that in Freddy’s 2-D space will have four components, and in our 4-dimensional spacetime has 16.

2)   Strictly speaking, the metric tensor isn’t really a true dot product. Rather, it is a generalization of the familiar dot product of Euclidean geometry to the pseudo-Riemannian geometry constrained by special relativity, where time behaves differently than space (more on this in Part II). But for our current exploration of 2-D spatial curvature, this needn’t concern us.

3)   Mathematicians refer to this as an orthonormal basis that spans the space.

4)   In mathematics, the nabla symbol (\(\nabla\)) denotes the del operator (the Laplace operator is its square, \(\nabla^{2}\)). With an index, \(\nabla_{\mu}\) is shorthand for the derivative along the \(\mu\)-th coordinate direction. In flat space it reduces to the partial derivative \(\partial_{\mu} = \frac{\partial }{\partial x^{\mu}}\), so its components are \(\left(\frac{\partial }{\partial x^0}, \frac{\partial }{\partial x^1}, \frac{\partial }{\partial x^2}, \frac{\partial }{\partial x^3}\right)\) where the index \(\mu\) = 0, 1, 2, 3. Acting on a scalar function, these components make up its gradient.

5)   The reasons for this are mathematical and beyond the scope of this discussion.

6)   This is because spherically curved space has a different topology than flat and negatively curved spaces. In mathematics, topology is the study of a manifold’s geometric properties that are preserved when it is stretched or deformed without cutting or sewing, opening or closing holes, or passing it through itself. Negatively curved space has the same topology as flat space because a flat rubber sheet can be stretched to form a saddle. By contrast, a positively curved space cannot be flattened or deformed into a saddle without cutting and forming edges (e.g. – a Mercator projection). There is no way to create a flat representation of it that preserves great circle paths that end where they began without encountering an edge. Likewise, a toroid (donut) cannot be deformed into a sphere or a saddle without cutting and sewing edges, so it has a higher-level topology than negatively or positively curved spaces.

 

References

Heath, T.L. ed., 1956. The thirteen books of Euclid’s Elements. Courier Corporation. Online at https://books.google.com/books?hl=en&lr=&id=mvBIAwAAQBAJ&oi=fnd&pg=PP1&dq=euclid+elements&ots=ed2L7zetPz&sig=wPKfMQ22SZvf4gF_83USfDwb0oY#v=onepage&q=euclid%20elements&f=false. Accessed Sept. 28, 2023.

Wikimedia. 2018. Image courtesy of Jacob Bertolotti. Online at https://commons.wikimedia.org/wiki/File:Covariantcomponents.gif. Accessed Sept. 28, 2023.

Wikimedia. 2023. Image courtesy of Silly rabbit, CC BY-SA 3.0. Online at https://commons.wikimedia.org/w/index.php?curid=2615879. Accessed Sept. 28, 2023.

Breakthrough Panel Discussion on Time Travel

Everyone seemed to love the panel discussion on “Is time travel possible?”, featuring Veritasium’s Derek Muller as host; and Nima Arkani-Hamed, Daniel Harlow, Daniel Jafferis, and myself as panelists. We had a lot of fun with it, but also there’s some profound physics involved, in what one might have thought was a pretty flippant choice of topic. So without further ado, here it is:

(Back up to 7:55 if you want to hear all the introductions to the event at the beginning.)

After our panel there were two others on “What are the limits of Science” and “Is there life in the Universe”, recorded in the same video. There were a lot of interesting people on these panels, although I don’t think the conversations cohered quite as well as ours did, perhaps because they involved people from different disciplines.

In the second panel, Andrei Linde is a fun speaker, but I think he overplayed how much we currently know for sure about the early universe after inflation happened. There are a lot of mysteries between the time inflation ended (about \(10^{-35}\) seconds after the Big Bang by his reckoning) and the time of the Higgs Phase Transition (about \(10^{-12}\) seconds, which corresponds to the highest energy scale we can measure at the LHC), like what process produced more matter than antimatter, as needed for any matter to exist today. I also wish he’d mentioned Cosmic Variance, a pretty obvious Limit on Science in his field.

Gary Ruvkun, the guy who thinks life on earth came from outer space, was also kind of interesting. Apparently after about a billion years of nothing, life shows up on Earth and it’s already pretty complicated. So maybe it came from elsewhere? The downside of this hypothesis, he said wittily, is that it “only buys you another 10 billion years” to evolve life (going back to the Big Bang). Since this is a physics and theology blog, I’ll mention that even though I generally think that Darwinian evolution suffices to explain the evolution of complex life from simpler life, it does seem bewildering how something as complicated as the first cell might have arisen naturally, without a miracle. But just because I can’t imagine it doesn’t necessarily mean it couldn’t have happened by some natural process. As a theist I am philosophically open to both supernatural and natural explanations, both of which are ultimately due to the Creator of all things.

That third panel should really have been: “Is there other life in the Universe”, otherwise I think the question is pretty easy. This panel includes Jocelyn Bell Burnell who received a special Breakthrough prize this year for her revolutionary discovery of pulsars. Scandalously, the Nobel was given to her advisor but not to her; either because of sexism, or because of a bias against graduate students, or some combination thereof.

This seems like a good time to mention that, if I understand the history correctly, it was Daniel Jafferis’ grad student Ping Gao who had the original idea to try to make a traversable wormhole in AdS/CFT. (Although as the 3rd person to join the collaboration, I can’t speak to the exact division of labor between Ping and Dan.) Now the New Horizons prize was awarded for our lifetime of work so far, and not just for this one article, so I’m not saying that Ping should have been eligible for this particular prize. But I do think it’s important for people to acknowledge junior collaborators, and not just assume the senior people did all the best work. So thanks Ping!

[Apparently I misunderstood the history, and it was Daniel who had the original idea and assigned it to Ping as a project. I apologize for the mistake, but of course I’m still grateful to Ping for his hard work and insights!]

Prizes

I’ve recently won a pretty big prize in theoretical physics, called the New Horizons Prize.  This is a smaller version of the Breakthrough Prize which is awarded to more junior researchers.

My prize is shared with MIT’s Daniel Harlow and Harvard’s Daniel Jafferis, both of them excellent physicists.   Amusingly, each pair of us have written exactly 1 article together (but we have never collaborated as a trio).

I hope it is not too vain to share some news articles about the prize, in case people want to know more:

There is also going to be a prize ceremony today [i.e. the day I am writing this post], Sunday Nov 4th, in Mountain View.  You can find more information about the broadcasting of the event, and the other prize winners, here.  There will also be an all-day symposium at UC Berkeley this Monday, at which I will be getting the actual trophy and also we will be speaking at a panel on whether time travel is possible.  You can watch it live streamed here.

I’ve also recently received the 2018 Philippe Meyer Prize and the IUPAP Young Scientist Prize [alt link], both of whose award ceremonies will be in the future.

After all this shameless self-promotion, I have some even better news that makes me even prouder: in January Nicole and I are expecting our first child, a son!  We are so pleased by this, and hope that he will find this world hospitable as God’s will is accomplished in his life.  I don’t feel prepared yet to be a father, but then again no one ever is.  I understand that children are very good at training up their parents, so hopefully it will turn out all right!

Interpreting the Quantum World II: What Does It Mean?

In the first installment of this series, we immersed ourselves in the quantum realm that lies beneath our everyday experience and discovered a universe that bears little resemblance to it. Instead of the solid, unambiguously well-behaved objects we’re familiar with, we encountered a unitary framework (\(\hat U\)) in which everything (including our own bodies!) is ultimately made of ethereal “waves of probability” wandering through immense configuration spaces along paths deterministically guided by well-formed differential equations and boundary conditions, and acquiring the properties we find in them as they rattle through a random pinball machine of collisions with “measurement” events (\(\hat M\)). This is all very elegant, even beautiful… but what does it mean? When my fiancée falls asleep in my arms, her tender touch, the warmth of her breath on my neck, and the fragrance of her hair hardly seem like mere probabilities being kicked around by dice-playing measurements. The refreshing drink of sparkling citrus water I just took doesn’t taste like one either. What is it that gives fire to this ethereal quantum realm? How does the Lord God breathe life into our probabilistic dust and bring about the classical universe of our daily lives (Gen. 2:7)? We finished by distilling our search for answers down to three fundamental dilemmas:

1)  What is this thing we call a wave function? Is it ontologically real, or just mathematical scaffolding we use to make sense of things we don’t yet understand?

2)  What really happens when a deterministic, well-behaved \(\hat U\) evolution of the universe runs headlong into a seemingly abrupt, non-deterministic \(\hat M\) event? How do we get them to share their toys and play nicely with each other?

3)  If counterfactual definiteness is an ill-formed concept, why are we always left with only one experienced outcome? Why don’t we experience entangled realities?

Physicists, philosophers, and theologians have been tearing their hair out over these questions for almost a century, and numerous interpretations have been suggested (more than you might imagine!). Most attempt to deal with 2), and from there, back out answers to 1) and 3). All deserve their own series of posts, so let me apologize in advance for only having time to do a fly-by of the more important ones here. In what follows I’ll give an overview of the most viable, and well-received interpretations to date, and finish with my own take on all of it. So, without further ado, here are our final contestants…

Copenhagen

This is the traditionally accepted answer given by the founding fathers of QM. According to Copenhagen, the cutting edge of reality is in \(\hat M\). The world we exist in is contained entirely in our observations. Per the Born Rule, these are irreducibly probabilistic and non-local, and result in classically describable measurements. The wave function and its unitary history \(\hat U\) are mere mathematical artifices we use to describe the conditions under which such observations are made, and have no ontic reality of their own. In this sense, Copenhagen has been called a subjective, or epistemic interpretation because it makes our observations the measure of all things (pun intended :-) ). Although few physicists and philosophers would agree, some of the more radical takes on it have gone as far as to suggest that consciousness is the ultimate source of the reality we observe. Even so, few Copenhagen advocates believe the world doesn’t exist apart from us. The tree that falls in the woods does exist whether we’re there to see and hear it or not. What they would argue is that counterfactuals regarding the tree’s properties and those of whatever caused it to fall don’t instantiate if we don’t observe them. If no one sees the tree fall or experiences any downstream consequence of its having done so, then the question of whether it has or not is irreducibly ambiguous and we’re free to make assumptions about it.
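To make the Born Rule a little more concrete, here is a minimal Python sketch that treats the electron singlet from Part I as a table of amplitudes and “measures” it by drawing outcomes with probability equal to the squared amplitude. It is only an illustration: the names `amplitudes`, `born_probabilities`, and `measure` are hypothetical, both spins are assumed to be measured along the same axis, and a seeded pseudo-random generator merely stands in for whatever irreducible randomness nature actually uses.

```python
import math
import random

# Singlet state (|up,down> - |down,up>)/sqrt(2), written as amplitudes over
# the four joint spin outcomes along a common axis (an assumed setup).
amplitudes = {
    ("up", "up"): 0.0,
    ("up", "down"): 1 / math.sqrt(2),
    ("down", "up"): -1 / math.sqrt(2),
    ("down", "down"): 0.0,
}

def born_probabilities(amps):
    """Born Rule: the probability of each outcome is |amplitude|^2."""
    return {outcome: abs(a) ** 2 for outcome, a in amps.items()}

def measure(amps, rng):
    """Draw a single outcome at random, weighted by the Born Rule."""
    probs = born_probabilities(amps)
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded only so the run is repeatable
probs = born_probabilities(amplitudes)
samples = [measure(amplitudes, rng) for _ in range(1000)]
counts = {o: samples.count(o) for o in probs}

print(probs)   # up/down and down/up each carry probability of about 1/2
print(counts)  # the like-spin outcomes never occur
```

Each individual draw is unpredictable, but the long-run frequencies settle toward the squared amplitudes, which is all the formalism promises.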

Several objections to Copenhagen have been raised. The idea that ontic reality resides entirely in non-local, phenomenologically discrete “collapse” events that are immune to further unpacking is unsatisfying. Science is supposed to explain things, not explain them away. It’s also difficult to see how irreducibly random \(\hat M\) events could be prepared by a rational, deterministic \(\hat U\) evolution if the wave function has no ontic existence of its own. To many physicists, philosophers, and theologians, this is less a statement about the nature of reality than the universe’s way of telling us that we haven’t turned over enough stones yet, and may not even be on the right path.

For their part, Copenhagen advocates rightly point out that this is precisely what our experiments tell us—no more, no less. If the formalism correctly predicts experimental outcomes, they say, metaphysical questions like these are beside the point, if not flat-out ill-formed, and our physics and philosophy should be strictly instrumentalist—a stance for which physicist David Mermin coined the phrase “shut up and calculate”.

Many Worlds

One response to Copenhagen is that if \(\hat U\) seems to be as rational and deterministic as the very real classical physics of our experience, perhaps that’s because it is. But that raises another set of questions. As we’ve seen, nothing about \(\hat U\) allows us to grant special status to any of the eigenstates associated with observable operators. If not, then we’re left with no reason other than statistical probability to consider any one outcome of an \(\hat M\) event to be any more privileged than another. Counterfactuals to what we don’t observe should have the same ontic status as those we do. If so, then why do our experiments seem to result in discrete irreducibly random and non-local “collapse” events with only one outcome?

According to the Many Worlds Interpretation (MWI), they don’t. The universe consists of one ontically real, deterministic wave function described by \(\hat U\) that’s local (in the sense of being free of “spooky-action-at-a-distance”) and there’s no need for hidden variables to explain \(\hat M\) events. What we experience as wave function “collapse” is a result of various parts of this universal wave function separating from each other as they evolve. Entangled states within it will be entangled while their superposed components remain in phase with each other. If/when they interact with some larger environment within it, they eventually lose their coherence with respect to each other and evolve to a state where they can be described by the wave functions of the individual states. When this happens, the entanglement has (for lack of a better term) “bled out” to a larger portion of the wave function containing the previous entanglement and the environment it interacted with, and the states are said to have decohered. Thus, the wave function of the universe never actually collapses anywhere; it just continues to decohere into the separate histories of previously entangled states that continue with their own \(\hat U\) histories, never interacting with each other again. As parts of the same universal wave function, all are equally real, and questions of counterfactual definiteness are ill-formed.
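The way coherence “bleeds out” into an environment can be sketched with a standard toy model. Assume a single two-state system starts in an equal superposition and that each environment particle it meets is nudged into one of two slightly different states, depending on which branch the system is in. If the overlap between those two environment states is \(\cos\theta\), the interference term of the system picks up one such factor per particle. The function name `coherence` and the coupling value `theta` below are hypothetical choices for illustration only.

```python
import math

def coherence(n_env, theta):
    """Off-diagonal (interference) term of the system's reduced density matrix.

    The system starts in (|0> + |1>)/sqrt(2). Each environment particle ends up
    in |e0> or |e1> depending on the system's branch, with <e0|e1> = cos(theta).
    Tracing out n_env such particles multiplies the initial coherence of 1/2
    by cos(theta) ** n_env.
    """
    return 0.5 * math.cos(theta) ** n_env

theta = 0.3  # assumed (hypothetical) coupling strength per environment particle
for n in (0, 1, 10, 100):
    print(n, coherence(n, theta))
```

Nothing collapses here. The interference term simply shrinks toward zero as more of the environment gets involved, and in this toy model it never reaches zero exactly, which is one reason critics ask whether decoherence alone can carry the whole load.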

The advantages of MWI speak for themselves. From a formal standpoint, a universe grounded on \(\hat U\) and decoherence that’s every bit as rational and well-behaved as the classical mechanics it replaced, certainly has advantages over one based on subjective hand grenade \(\hat M\) events. It deals nicely with the relativity-violating non-locality and irreducible indeterminacy that plague Copenhagen as well. And for reasons I won’t get into here, it also lends itself nicely to quantum field theory, and Feynman path integral (“sum over histories”) methods that have proven to be very powerful.

But its disadvantages speak just as loudly. For starters, it’s not at all clear that decoherence can fully account for what we directly experience as wave function collapse. Nor is it clear how MWI can make sense of the extremely well-established Born Rule. Does decoherence always lead to separate well-defined histories for every eigenstate associated with every observable that in one way or another participates in the evolution of \(\hat U\)? If not, then what meaning can be assigned to probabilities when some states decohere and others don’t? Even if it does, what reasons do we have for expecting that it should obey probabilistic constraints?

And of course, we haven’t even gotten to the real elephant in the room yet: the fact that we’re also being asked to believe in the existence of an infinite number of entirely separate universes that we can neither observe, nor verify, even though the strict formalism of QM doesn’t require us to. Physics aside, for those of us who are theists this raises a veritable hornet’s nest of theological issues. As a Christian, what am I to make of the cross and God’s redemptive plan for us in a sandstorm of universes where literally everything happens somewhere to infinite copies of us all? It’s worth noting that some prominent Christian physicists like Don Page embrace MWI, and see in it God’s plan to ultimately gather all of us to Him via one history or another, so that eventually “every knee shall bow, and every tongue confess, and give praise to God” (Rom. 14:11). While I understand where they’re coming from, and the belief that God will gather us all to Himself some day is certainly appealing, this strikes me as contrived and poised for Occam’s razor.

In the end, despite its advantages, and with all due respect to Hawking and its other proponents, I don’t accept MWI because, to put it bluntly, it’s more than merely unnecessary—it’s bat-shit crazy. According to MWI there is, quite literally, a world out there somewhere in which I, Scott Church (peace be upon me), am a cross-dressing, goat worshipping, tantric massage therapist, with 12” Frederick’s of Hollywood stiletto heels (none of that uppity Victoria’s Secret stuff for me!), and D-cup breast implants…

Folks, I am here to tell you… there isn’t enough vodka or LSD anywhere on this lush, verdant earth to make that believable! Whatever else may be said about this vale of tears we call Life, rest assured that indeterministic hand grenade \(\hat M\) events and “spooky action at a distance” are infinitely easier to take seriously. :D

De Broglie–Bohm

Bat-shit crazy aside, another approach would be to try separating \(\hat U\) and \(\hat M\) from each other completely. If they aren’t playing together at all, we don’t have to worry about whether they’ll share their toys. Without pressing that analogy too far, this is the basic idea behind the De Broglie-Bohm interpretation (DBB).

According to DBB, particles do have definite locations and momentums, and these are subject to hidden variables. \(\hat U\) is real and deterministic, and per the Schrödinger equation governs the evolution of a guiding, or pilot wave function that exists separate from particles themselves. This wave function is non-local and does not collapse. For lack of a better word, particles “surf” on it, and \(\hat M\) events acting on them are governed by the local hidden variables. In our non-local singlet example from Part I, the two electrons were sent off with spin-state box lunches. All of this results in a formalism like that of classical thermodynamics, but with predictions that look much like the Copenhagen interpretation. In DBB the Born Rule is an added hypothesis rather than a consequence of the inherent wave nature of particles. There is no particle/wave duality issue of course because particles and the wave function remain separate, and Bell’s inequalities are accounted for by the non-locality of the latter.
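For readers who want to see what “surfing” looks like on paper, the textbook form of the pilot wave dynamics can be written as below. This is a sketch in my own choice of symbols, none of which appear earlier in this series: \(\psi\) is the guiding wave function, \(\mathbf{Q}_k\) and \(m_k\) are the actual position and mass of particle \(k\), and \(\hat H\) is the usual Hamiltonian.

\[
i\hbar\,\frac{\partial \psi}{\partial t} = \hat H\,\psi,
\qquad
\frac{d\mathbf{Q}_k}{dt} = \frac{\hbar}{m_k}\,\mathrm{Im}\!\left[\frac{\nabla_k \psi}{\psi}\right]\!(\mathbf{Q}_1,\ldots,\mathbf{Q}_N,t)
\]

The first equation is just the Schrödinger equation. The second says that the velocity of particle \(k\) is read off from \(\psi\) evaluated at the current positions of all \(N\) particles at once, however far apart they are, which is where the non-locality enters.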

There’s a naturalness to DBB that resolves much of the “weirdness” that has plagued other interpretations of QM. But it hasn’t been well-received. The non-locality of its pilot wave \(\hat U\) still raises the whole “spooky action at a distance” issue that physicists and philosophers alike are fundamentally averse to. Separating \(\hat U\) from \(\hat M\) and duct-taping them together with hidden variables adds layers of complexity not present in other interpretations, and runs afoul of all the issues raised by the Kochen-Specker Theorem. We have to wonder whether our good friend Occam and his trusty razor shouldn’t be invited to this party. And like MWI, it’s brutally deterministic, and as such, subject to all the philosophical and theological nightmares that go along with that, not to mention our direct existential experience as freely choosing people. Even so, for a variety of reasons (including theories of a “sub-quantum realm” where hidden variables can also hide from Kochen-Specker) it’s enjoying a bit of a revival and does have its rightful place among the contenders.

Consistent Histories

As we’ve seen, the biggest challenge QM presents is getting \(\hat U\) and \(\hat M\) to play together nicely. Most interpretations try to achieve this by denying the ontological reality of one, and somehow rolling it up into the other. What if we denied the individual reality of both, and rolled them up into a larger ontic reality described by an expanded QM formalism? Loosely speaking, Consistent Histories (or Decoherent Histories) attempts to do this by generalizing Copenhagen to a quantum cosmology framework in which the universe evolves along the most internally consistent and probable histories available to it.

Like Copenhagen, CH asserts that the wave function is just a mathematical construct that has no ontic reality of its own. Where it parts company is in its assertion that \(\hat U\) represents the wave function of the entire universe, and it never collapses. What we refer to as “collapse” occurs when some parts of it decohere with respect to larger parts leading, it is said, to macroscopically irreversible outcomes that are subject to the ordinary additive rules of classical probability. In CH, the potential outcomes of any observation (and thus, the possible histories the universe might follow) are classified by how homogeneous and consistent they are. This, it’s said, is what makes some of them more probable than others. A homogeneous history is one that can be described by a unique temporal sequence of single-outcome propositions, such as, “I woke up” > “I got out of bed” > “I showered” … Those that cannot be, such as ones that include statements like “I walked to the grocery store or drove there” are not. These events can be represented by a projection operator \(\hat P\) from which histories can be built, and the more internally consistent they are (per criteria contained in a class operator \(\hat C\)), the more probable they are.
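In the usual presentation of the formalism, as I understand it, this machinery can be sketched as follows. The notation is my own assumption rather than anything fixed above: a history \(\alpha\) is a time-ordered string of projectors, its class operator is written \(\hat C_\alpha\) to keep it distinct from the individual projectors, \(\hat\rho\) is the initial state of the universe, and \(D(\alpha,\beta)\) is the so-called decoherence functional.

\[
\hat C_\alpha = \hat P_{\alpha_n}(t_n)\cdots\hat P_{\alpha_2}(t_2)\,\hat P_{\alpha_1}(t_1),
\qquad
D(\alpha,\beta) = \mathrm{Tr}\!\left(\hat C_\alpha\,\hat\rho\,\hat C_\beta^{\dagger}\right)
\]

\[
D(\alpha,\beta) = 0 \ \text{ for all } \alpha \neq \beta
\quad\Longrightarrow\quad
p(\alpha) = D(\alpha,\alpha)
\]

A family of histories counts as consistent when the interference terms between different histories vanish. Only then do the diagonal terms behave like ordinary classical probabilities that add the way we expect them to.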

Thus, in CH \(\hat M\) is not a fundamental QM concept. The evolution of the universe is described by a mathematical construct, \(\hat U\), that can be interpreted as decohering into the most internally consistent (and therefore probable) homogeneous histories available to it. The paths these histories take give us a framework in which some sets of classical questions can be meaningfully asked, and others can’t. Returning to our electron singlet example, CH advocates would maintain that the wave function wasn’t entangled in any real physical sense. Rather, there are two internally consistent histories for the prepared electrons that could have emerged from a spin measurement: Down/Up, and Up/Down. Down/Up/Up/Down isn’t a meaningful state, so it’s meaningless to say that the universe was “in” it. Rather, when the entire state of us/laboratory/observation is accounted for, we will find that the universe followed the history that was most consistent for that. There is no need to discriminate between observer and observed. Decoherence is enough to account for the whole history, so \(\hat M\) is a superfluous construct.

CH advocates claim that it offers a cleaner, and less paradoxical interpretation of QM and classical effects than its competitors, and a logical framework for discriminating boundaries between classical and quantum phenomena. But it too has its issues. It’s not at all clear that decoherence is as macroscopically irreversible as it’s claimed to be, or that by itself it can fully account for our experience of \(\hat M\). It also requires additional projection and class operator constructs not required by other interpretations, and these cannot be formulated to any degree practical enough to yield a complete theory.

Objective Collapse Theories

Of course, we could just make our peace with \(\hat U\) and \(\hat M\). Objective collapse, or quantum mechanical spontaneous localization (QMSL) models maintain that the universe reflects both because the wave function is ontologically real, and “measurements” (perhaps interactions is a better term here) really do collapse it. According to QMSL theories, the wave function is non-local, but collapses locally in a random manner (hence, the “spontaneous localization”), or when some physical threshold is crossed. Either way, observers play no special role in the collapse itself. There are several variations on this theme. The Ghirardi–Rimini–Weber theory, for instance, emphasizes random collapse of the wave function to highly probable stable states. Roger Penrose has proposed another theory based on energy thresholds. Particles have mass-energy that, per general relativity, will make tiny “dents” in the fabric of space-time. According to Penrose, in the entangled states of their wave function these will superpose as well, and there will be an associated energy difference that entangled states can only sustain up to a critical threshold energy difference (which he theorizes to be on the order of one Planck mass). When they decohere to a point where this threshold is exceeded, the wave function collapses per the Born Rule in the usual manner (Penrose, 2016).
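To get a feel for the scales involved in a gravitational threshold of this kind, here is a rough back-of-the-envelope sketch in Python. It leans on two assumptions that are not spelled out above: that the lifetime of a superposition goes roughly as \(\tau \approx \hbar / E_G\), where \(E_G\) is the gravitational self-energy of the difference between the two superposed mass distributions (the form in which Penrose’s criterion is usually quoted), and that \(E_G\) can be crudely estimated as \(G m^2 / R\) for a lump of mass \(m\) and size \(R\). The function names and the two example objects are hypothetical, and the numbers are order-of-magnitude at best.

```python
HBAR = 1.054571817e-34  # reduced Planck constant, J*s
G = 6.67430e-11         # Newton's gravitational constant, m^3 kg^-1 s^-2

def gravitational_self_energy(mass_kg, radius_m):
    """Crude estimate E_G ~ G m^2 / R (an assumption, not an exact result)."""
    return G * mass_kg ** 2 / radius_m

def collapse_time(mass_kg, radius_m):
    """Assumed Penrose-style lifetime of a superposition, tau ~ hbar / E_G."""
    return HBAR / gravitational_self_energy(mass_kg, radius_m)

# Hypothetical examples: (mass in kg, size in m)
examples = {
    "proton-sized lump": (1.67e-27, 1e-15),
    "dust-grain-sized lump": (1e-12, 1e-6),
}
for label, (mass, radius) in examples.items():
    print(f"{label}: tau ~ {collapse_time(mass, radius):.1e} s")
```

On these assumptions a single subatomic particle could stay superposed for far longer than any experiment runs, while something the size of a dust grain would not last a blink, which is at least the right qualitative behavior for a theory that is supposed to leave atoms quantum and cats classical.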

For our purposes, this interpretation pretty much speaks for itself and so do its advantages. Its disadvantages lie chiefly in how we understand and formally handle the collapse itself. For instance, it’s not clear this can be done mathematically without violating conservation of energy or bringing new, as-yet undiscovered physics to the game. In the QMSL theories that have been presented to date, if energy is conserved the collapse doesn’t happen completely, and we end up with left-over “tails” in the final wave function state that are difficult to make sense of with respect to the Born Rule. It has also proven difficult to render the collapse compliant with special relativity without creating divergences in probability densities (in other words, blowing up the wave function). Various QMSL theories have handled issues like this in differing ways, some more successfully than others, and research in this area continues. But to date, none of the theories on the table offers a slam-dunk.

The other problem QMSL theories face is a lack of experimental verification. Random collapse theories like Ghirardi–Rimini–Weber could be verified if the spontaneous collapse of a single particle could be detected. But these are thought to be extremely rare, and to date, none have been observed. However, several tests for QMSL theories have been proposed (e.g. Marshall et al., 2003; Pepper et al., 2012; or Weaver et al., 2016 to name a few), and with luck, we’ll know more about them in the next decade or so (Penrose, 2016).
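Just how rare is “extremely rare”? A quick sketch helps. It assumes the collapse rate usually quoted for the original Ghirardi–Rimini–Weber model, about \(10^{-16}\) spontaneous collapses per second per particle, and treats particles as independent so that their rates simply add. Both the constant and the function name `mean_wait_seconds` are illustrative assumptions.

```python
GRW_RATE_PER_PARTICLE = 1e-16  # collapses per second per particle (assumed value)
SECONDS_PER_YEAR = 3.156e7

def mean_wait_seconds(n_particles, rate=GRW_RATE_PER_PARTICLE):
    """Mean time until the first spontaneous collapse among n_particles.

    Independent particles are assumed, so the total collapse rate is
    n_particles * rate and the mean waiting time is its reciprocal.
    """
    return 1.0 / (n_particles * rate)

single = mean_wait_seconds(1)
macroscopic = mean_wait_seconds(1e23)  # roughly a gram-scale lump of matter

print(f"one particle: ~{single / SECONDS_PER_YEAR:.1e} years")
print(f"1e23 particles: ~{macroscopic:.1e} seconds")
```

A lone particle would be expected to wait hundreds of millions of years, which is why nobody has caught one in the act, while a macroscopic object full of entangled particles would localize almost instantly.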

Conclusion

There are many other interpretations of QM, some of which are more far-fetched than others. But the ones we’ve covered today are arguably the most viable, and as such, the most researched. As we’ve seen, all have their strengths and weaknesses. Personally, I lean toward Objective Collapse scenarios. It’s hard to believe that something as well-constrained and mathematically coherent as \(\hat U\) isn’t ontologically real. Especially when the alternative bedrock reality being offered is \(\hat M\), which is haphazard and difficult to separate from our own subjective consciousness (the latter in particular smacks of solipsism, which has never been a very compelling, or widely-accepted point of view). Of the competing alternatives that would agree about \(\hat U\), MWI is probably the strongest contender. But for reasons that by now should be disturbingly clear, it’s far easier for me to accept a non-local wave function collapse than its take on \(\hat M\). Call me unscientific if you will, but ivory towers alone will never be enough to convince me that I have a cross-dressing, goat-worshipping, voluptuous doppelganger somewhere that no one can ever observe. Other interpretations don’t fare much better. Most complicate matters unnecessarily and/or deal with the collapse in ways that render \(\hat M\) deterministic.

It’s been said that if your only tool is a hammer, eventually everything is going to look like a nail. It seems to me that such interpretations are compelling to many because they’re tidy. Physicists and philosophers adore tidy! Simple, deterministic models with well-defined differential equations and boundary conditions give them a fulcrum point where they feel safe, and from which they think they can move the world. This is fine for what it’s worth of course. Few would dispute the successes our tidy, well-formed theories have given us. But if the history of science has taught us anything, it’s that nature isn’t as enamored with tidiness as we are. Virtually all our investigations of QM tell us that indeterminism cannot be fully exorcized from \(\hat M\), and the term “collapse” fits it perfectly. Outside the laboratory, everything we know about the world tells us we are conscious beings made in the image of our Creator. We are self-aware, intentional, and capable of making free choices—none of which is consistent with tidy determinism. Anyone who disputes that is welcome to come up with a differential equation and a self-contained set of data and boundary conditions that required me to decide on a breakfast sandwich rather than oatmeal this morning… and then collect their Nobel and Templeton prizes and retire to the lecture circuit.

The bottom line is that we live in a universe that presents us with \(\hat U\) and \(\hat M\). As far as I’m concerned, if the shoe fits I see no reason not to wear it. Yes, QMSL theories have their issues. But compared to other interpretations, their problems are formalistic ones of the sort I suspect will be dealt with when we’re closer to a viable theory of quantum gravity. When we as students are ready, our teacher will come. Until then, as Einstein once said, the world should be made as simple as possible, but no simpler.

When I was in graduate school my thesis advisor used to say that when people can’t agree on the answer to some question one of two things is always true: Either there isn’t enough evidence to answer the question definitively, or we’re asking the wrong question. Perhaps many of our QM headaches have proven as stubborn as they are because we’re doing exactly that… asking the wrong questions. One possible case in point… physicists have traditionally considered \(\hat U\) to be sacrosanct: the one thing that above all others, only the worst apostates would ever dare to question. Atheist physicist Sean Carroll has gone so far as to claim that it proves the universe is past-eternal, and God couldn’t have created it! [There are numerous problems with that of course, but they’re beyond the scope of this discussion.] However, Roger Penrose is now arguing that we need to do exactly that (fortunately, he’s respected enough in the physics community that he can get away with such challenges to orthodoxy without being dismissed as a crank or heretic). He suggests that if we started with the equivalence principle of general relativity instead, we could formulate a QMSL theory of \(\hat U\) and \(\hat M\) that would resolve many, if not most QM paradoxes, and this is the basis for his gravitationally-based QMSL theory discussed above. Like its competitors, Penrose’s proposal has challenges of its own, not the least of which are the difficulties that have been encountered in producing a rigorous formulation of \(\hat M\) along these lines. But of everything I’ve seen so far, I find it to be particularly promising!

But then again, maybe the deepest secrets of the universe are beyond us. Isaac Newton once said,

“I do not know what I may appear to the world, but to myself I seem to have been only like a boy playing on the seashore, and diverting myself in now and then finding a smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undiscovered before me.”

As scientists, we press on, collecting our shiny pebbles and shells on the shore of the great ocean with humility and reverence as he did. But it would be the height of hubris for us to presume that there’s no limit to how much of it we can wrap our minds around before we have any idea what’s beyond the horizon. As J. B. S. Haldane once said,

“My own suspicion is that the Universe is not only queerer than we suppose, but queerer than we can suppose.” (Haldane, 1928)

Who knows? Perhaps he was right. God has chosen to reveal many of His thoughts to us. In His infinite grace, I imagine He’ll open our eyes to many more. But He certainly isn’t under any obligation to reveal them all, nor do we have any reason to presume that we could handle it if He did. But of course, only time will tell.

One final thing… Astute readers may have noticed one big elephant in the room that I’ve danced around, but not really addressed yet… relativity. Position, momentum, energy, and time have been a big part of our discussion today… and they’re all inertial frame dependent, and our formal treatment of \(\hat U\) and \(\hat M\) must account for that. There are versions of the Schrödinger equation that do this, most notably the Dirac and Klein–Gordon equations. Both however are semi-classical equations. That is, they dress up the traditional Schrödinger equation in a relativistic evening gown and matching handbag, but without an invitation to the relativity ball. For a ticket to the ball, we need to take QM to the next level… quantum field theory.

But these are topics for another day, and I’ve rambled enough already… so once again, stay tuned! 

 

References

Haldane, J. B. S. (1928). Possible worlds: And other papers. Harper & Bros.; 1st edition (1928). Available online at www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=Possible+worlds%3A+And+other+papers. Accessed May 17, 2017.

Marshall, W., Simon, C., Penrose, R., & Bouwmeester, D. (2003). Towards quantum superpositions of a mirror. Physical Review Letters, 91 (13). Available online at journals.aps.org/prl/abstract/10.1103/PhysRevLett.91.130401. Accessed June 9, 2017.

Pepper, B., Ghobadi, R., Jeffrey, E., Simon, C., & Bouwmeester, D. (2012). Optomechanical superpositions via nested interferometry. Physical Review Letters, 109 (2). Available online at journals.aps.org/prl/abstract/10.1103/PhysRevLett.109.023601. Accessed June 9, 2017.

Penrose, R. (2016). Fashion, faith, and fantasy in the new physics of the universe. Princeton University Press, Sept. 13, 2016. ISBN: 0691178534; ASIN: B01AMPQTRU. Available online at www.amazon.com/Fashion-Faith-Fantasy-Physics-Universe-ebook/dp/B01AMPQTRU/ref=sr_1_1?ie=UTF8&qid=1495054176&sr=8-1&keywords=penrose. Accessed May 16, 2017.

Weaver, M. J., Pepper, B., Luna, F., Buters, F. M., Eerkens, H. J., Welker, G., … & Bouwmeester, D. (2016). Nested trampoline resonators for optomechanics. Applied Physics Letters, 108 (3). Available online at aip.scitation.org/doi/abs/10.1063/1.4939828. Accessed June 9, 2017.