In modern times, computing trillions of digits $\pi$ has become a rite of passage for the latest parallel computer clusters. This obviously frivolous feat obscures the difficulty and importance of computing the first few digits of $\pi,$ the ratio of the distance around a circle to the distance across it. Engineers need six or eight digits to make precision machinery work properly, while the standard double precision floating point accuracy of about sixteen digits is adequate for even high precision scientific experiments or interplanetary navigation. But if you had only a pencil and paper, how could you work out the value to even a few digits? When I read that Leonhard Euler calculated the first twenty digits of $\pi$ in one hour, I decided to see how long it would take me.
Twenty two centuries ago, Archimedes proved that $\pi$ is between $3\tfrac{1}{7}$ an $3\tfrac{10}{71}$ by estimating the perimeters of 96-sided polygons circumscribed and inscribed around a circle. Careful measurement produces about this accuracy, but Archimedes's method is purely abstract reasoning, and his accuracy was limited only by his patience in selecting how many sides he was willing to choose for his polygons. In decimal fractions, Archimedes put $\pi$ between $3.1429$ and $3.1408,$ which is accurate to two places for either bound, or three places if you average the two. Archimedes was working under the extreme handicap that the only way he had to write numbers was Roman numerals or the equivalent, so I am completely unable to judge the human effort required for his calculation in his own time.
For me or a proficient high school student today, repeating the Archimedes calculation requires about an hour's effort (I did it many years ago) - something like what Euler needed to get twenty places. But the technique of computing perimeters of polygons has a very deep flaw: You need a fabulous number of sides to get even a few more digits of $\pi$. That is, the perimeter of a many-sided polygon converges very slowly to the perimeter of a true circle. All techniques for computing $\pi$ involve infinite series (because $\pi$ is transcendental). What distinguishes them is how rapidly they converge, that is, how many iterations until further corrections are smaller than the accuracy you want. This is a trade-off engineers and scientists, as well as mathematicians, face daily: How much time should I spend improving the technique before I dive in and do the work? For computing $\pi$ to twenty places, the answer is a lot. Archimedes's direct method is hopeless.
The advent of calculus in the late 1600s provided a very different way to approach calculating $\pi$. In simple terms, calculus allows you to begin with an infinite-sided polygon, whose sides exactly match an infinitesimal section of circular arc, and work backwards to a finite angle. The idea is to develop a power series for a function with values related to $\pi$. By far the simplest such series is for the inverse tangent function $\tan^{-1}t,$ which is the angle measured in radians of a right triangle with ratio of opposite to adjacent sides equal to $t$: $$\tan^{-1}t = t - t^3/3 + t^5/5 - t^7/7 + ...$$ Both sides of the equation are obviously $0$ when $t=0$. Furthermore, $t$ is the length of the side of a polygon circumscribed around the unit circle, so for small $t,$ $\tan^{-1}t \approx t$ simply says the angle of an arc of the unit circle in radians is equal to the length of the side of the polygon. If you know how to take derivatives, you can check the inverse tangent series by differentiating both sides: $$1/(1+t^2) = 1 - t^2 + t^4 - t^6 + ...$$ This is perhaps the best known infinite series, which you can easily verify by multiplying both sides by the denominator on the left side.
Plugging $t=1$ into the inverse tangent series produces the celebrated formula $$\pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...$$ For actually calculating $\pi,$ this is a huge step backwards from Archimedes. It is hard to find any series which converges more slowly than this one; no one would be foolhardy enough to attempt to use it to compute $\pi$.
Nevertheless, the series would converge very quickly and become a useful calculation tool if we could somehow relate $\tan^{-1}t$ to $\tan^{-1}1=\pi/4$ for some $t$ substantially smaller than 1. John Machin's claim to fame is that he found a way to do that. His idea is to use the tangent sum formula from trigonometry: $$\tan(\alpha+\beta) = (\tan\alpha+\tan\beta)/(1-\tan\alpha\tan\beta)$$ Recast in terms of inverse tangents, this is $$\tan^{-1}t + \tan^{-1}u = \tan^{-1}\frac{t + u}{1 - tu}.$$ Thus, for example, if we can find fractions $t$ and $u$ for which $t+u=1-tu$ (for example $t=1/2$ and $u=1/3$) we can realistically use the inverse tangent series to compute the angles $\tan^{-1}t$ and $\tan^{-1}u$ in radians, which we know sum to $\tan^{-1}1=\pi/4.$
But Machin realized he could do substantially better, both by finding smaller $t,$ and by finding values of $t$ which are easy to multiply by hand in base ten. Once you see the potential of the tangent sum formula for finding angles whose sum is $\pi/4,$ and you realize you can use them to efficiently compute $\pi,$ you naturally begin to try some combinations of simple fractions. You are rewarded quickly when you try multiples of the angle whose tangent is $t=1/5.$ By accident, four times that angle is exceptionally close to $45^\circ$: $$\begin{align*} \tan^{-1}\frac{1}{5}+\tan^{-1}\frac{1}{5} &= \tan^{-1}\frac{5+5}{25-1} = \tan^{-1}\frac{5}{12} \\ \tan^{-1}\frac{5}{12}+\tan^{-1}\frac{5}{12} &= \tan^{-1}\frac{60+60}{144-25} = \tan^{-1}\frac{120}{119} \\ \tan^{-1}\frac{120}{119}-\tan^{-1}1 &= \tan^{-1}\frac{120-119}{120+119} = \tan^{-1}\frac{1}{239} \\ \frac{\pi}{4} = \tan^{-1}1 &= 4\tan^{-1}\frac{1}{5}-\tan^{-1}\frac{1}{239} \end{align*}$$ This is an exceedingly clever formula, and Machin deserves his fame for its discovery. However, it's in the category of results that I probably could have found myself, given the idea to use the tangent sum formula to look for small $t$ that are easy to multiply by hand and use in the inverse tangent series.
Machin did, on the other hand, demonstrate far more resolve than I could possibly have mustered, by deciding to use his inverse tangent decomposition to compute $\pi$ to one hundred digits, which he did in 1706. I have been unable to find out how long this calculation took him, but anything less than a couple of months (or more than a year) would be surprising. On the other hand, my guess would be that he would have needed at most a day or two to discover his $1/5$ plus $1/239$ tangent decomposition -- the calculation for that specific case is the work of a few minutes as we have seen. The really interesting question, then, is how long did he spend trying to find an even better decomposition before he decided to bury himself in the hundred digit marathon calculation?
It's pretty easy to estimate how much work you'll need to do for each of the two series, and Machin must have worked that out in advance: How else did he choose a hundred digits as a reasonable goal? For the $1/5$ series, each term is a factor of a bit more than $5^2=25$ less than the previous. Now $5^{-10}=2^{10}/10^{10}$ is just a bit more than $10^{-7},$ which means that each $5$ (that is, $10/2$) terms of the series adds another $7$ digits. To get a hundred digits, therefore, Machin needs about $75$ ($100/7\times 5$) terms of the $1/5$ series (up to about $t^{150}$). For the $1/239$ series, $1/239\sim 4/1000$ which means $239^{-5}\sim 10^3/10^{15} = 10^{-12},$ so every $5$ terms of that series adds another $24$ digits. A hundred digits therefore requires about $21$ terms of the $1/239$ series, in addition to about $75$ terms of the $1/5$ series.
Each term requires one multiply by $t^2$ (which is dividing by $239^2$ for that series), plus one small integer divide (for the coefficient of the inverse tangent series), and one add to accumulate the final result. All up, we're talking about $96$ hundred digit adds, $75$ hundred digit multiplies, and $117$ hundred digit divides ($21$ of which are by the daunting $239^2$). I imagine each of those divides might take a couple of hours, so the whole process ought to be possible with several hundred hours work. That's why I'm guessing it took Machin a few months to perform and check his calculation. It also seems to me that he would have spent the bulk of his time on the $21$ terms of the $1/239$ series. Even though it has three and a half times fewer terms, the arithmetic work is considerably more than three and a half times as great per term.
We can all appreciate the level of effort Machin expended to complete this calculation, although the sustained level of concentration required for the task is something most of us will never experience.
That brings us to Euler, one of the two or three greatest mathematicians of all time. His one hour calculation of twenty digits of $\pi$ probably took place in or near 1766, sixty years after Machin's hundred digit calculation. Euler's method is based on what Machin did - he certainly would have known of the hundred digit calculation. His major improvement was to find an inverse tangent decomposition into two parts, for which both series involve small terminating base ten decimals. That is, Euler eliminates the nasty $1/239$ series that Machin probably spent most of his time on.
Unlike Machin, Euler's hour long $\pi$ calculation hardly deserves a footnote in the catalog of his achievements - he spent an hour on it compared to the staggering 92 volumes of his collected works. Euler didn't set out to compute $\pi$ at all. Instead, he was working on a general technique for what we now call accelerating the convergence of any alternating series. His technique is now called the Euler transform, which he published in 1766. It is a way to convert any infinite sum whose terms alternate in sign into a new infinite sum which converges faster than the original. The Euler transform is far too complicated to explain here, but it is a very important result, unlike a few digits of $\pi.$
Since the inverse tangent series is one of the most famous alternating infinite sums, Euler certainly would have used it as an example of his general method. Euler's transformed series for the inverse tangent function is: $$\tan^{-1}t = \frac{x}{t}\left(1 + \frac{2}{3}x + \frac{2\cdot 4}{3\cdot 5}x^2 + \frac{2\cdot 4\cdot 6}{3\cdot 5\cdot 7}x^3 + \frac{2\cdot 4\cdot 6\cdot 8}{3\cdot 5\cdot 7\cdot 9}x^3 +...\right),$$ where Euler defines $$x = t^2/(1+t^2).$$ Plugging in $t=1$ and $x=1/2$ yields a usable series for computing a few digits of $\pi$ immediately, unlike the original inverse tangent series. However, Machin's pair of series ($1/5$ and $1/239$) both converge far more quickly.
But Euler also discovered a two term inverse tangent decomposition which is tailor-made for his different inverse tangent series: $$\frac{\pi}{4} = \tan^{-1}1 = 5\tan^{-1}\frac{1}{7} + 2\tan^{-1}\frac{3}{79}$$ Now for base ten calculations using the original inverse tangent series, $1/7$ and $3/79$ would be terribly ugly choices, easily as bad as $1/239.$ But look at their values of $x$! $t=1/7$ becomes $x=1/50=0.02$ and $t=3/79$ becomes $x=9/6250=0.00144.$ Not one, but both of the two series involve powers which are very easy to calculate by hand in base ten!
Like Machin's decomposition, the fact that any decomposition as nice as Euler's exists is wildly improbable serendipity. In the case of Machin's decomposition, even a hack mathematician like me could easily have stumbled across it and recognized its significance. Euler's decomposition, on the other hand, is far beyond anything I can imagine discovering myself.
I haven't been able to find any description of how people hunt for such decompositions. At least several thousand such inverse tangent decompositions are known today (perhaps many more), but they are rare enough that probably only a few had been discovered in Euler's time, and as the greatest mathematician of his age, he probably knew of all of them. An important historical note I found dates Euler's inverse tangent decomposition to 1755 - about ten years before the work on Euler transforms which led to his inverse tangent series. My guess, therefore, is that Euler became interested in inverse tangent decompositions, which belong to an intriguing branch of number theory involving complex numbers with integer real and imaginary parts. While studying them, he added his own $1/7$ plus $3/79$ decomposition (and perhaps others) before he ever began to think about accelerating series convergence.
If Euler had the decomposition first, and had studied them enough to know they weren't all that common, then it is easy to understand how he would have simply checked all the known decompositions to see if any happened to have nice values of $x,$ like Machin's nice value of $t=1/5.$ If so, he would have been just a shocked as we are to find that both $x$ values were nice numbers. Even better, it is a decomposition that he himself discovered a decade before. If this is an accurate reconstruction of the history, it really brings home Euler's dominance of eighteenth century mathematics.
That still doesn't explain how Euler discovered his decomposition. Unlike the Machin decomposition, it isn't just a matter of noticing that some multiple of a small fraction is extremely close to $45^\circ,$ and subtracting to find the remainder. The factor of $2$ in front of the $3/79$ term breaks that simple approach. And unlike four times $1/5,$ five times $1/7$ is not unusually close to $45^\circ.$ I wouldn't have found this decomposition in a century. Euler too must have had some method he discovered to narrow his search, but I've been unable to find out what it was.
To give you an idea how much harder Euler's decomposition is than Machin's, here is my work just to prove that it is true: $$\begin{align*} \tan^{-1}\frac{1}{7}+\tan^{-1}\frac{1}{7} &= \tan^{-1}\frac{7+7}{49-1} = \tan^{-1}\frac{7}{24} \\ \tan^{-1}\frac{7}{24}+\tan^{-1}\frac{7}{24} &= \tan^{-1}\frac{2\cdot 7 \cdot 24}{24^2-7^2} = \tan^{-1}\frac{14\cdot 24}{17\cdot 31} \\ \tan^{-1}\frac{14\cdot 24}{17\cdot 31} + \tan^{-1}\frac{1}{7} &= ... = \tan^{-1}\frac{2879}{7\cdot 479} \\ \tan^{-1}\frac{3}{79}+\tan^{-1}\frac{3}{79} &= \tan^{-1}\frac{6\cdot 79}{79^2-3^2} = \tan^{-1}\frac{3\cdot 79}{41\cdot 76} \\ \tan^{-1}\frac{2879}{7\cdot 479} + \tan^{-1}\frac{3\cdot 79}{41\cdot 76} &= \tan^{-1}\frac{3\cdot 7\cdot 79\cdot 479 + 41\cdot 76\cdot 2879} {7\cdot 41\cdot 76\cdot 479 - 3\cdot 79\cdot 2879} \\ 3\cdot 7\cdot 79\cdot 479 + 41\cdot 76\cdot 2879 &\stackrel{?}{=} 7\cdot 41\cdot 76\cdot 479 - 3\cdot 79\cdot 2879 \\ (41\cdot 76 + 3\cdot 79)\cdot 2879 &\stackrel{?}{=}\ (41\cdot 76 - 3\cdot 79)\cdot 7\cdot 479 \\ 3353\cdot 2879 &\stackrel{?}{=}\ 2879\cdot 7\cdot 479 \\ 3353 &\stackrel{?}{=}\ 7\cdot 479 \end{align*}$$ Since the final equality is indeed true, $\pi/4 = 5\tan^{-1}1/7 + 2\tan^{-1}3/79,$ as Euler says. If you search for "Machin-like formulas" you'll find a lot of research on these decompositions, continuing to the present day.
It would be pointless for me to carry out a twenty digit calculation of $\pi$ by hand if I had not also proven for myself that the series I was using does in fact converge to $\pi.$ So here is a specific derivation of Euler's series for inverse tangent based on some ideas I found in my web research. Remember that Euler would have applied his much more powerful and general Euler transform; like the above proof of his $45^\circ$ decomposition, this proof is constructed backwards from the series itself.
First, note that if $t=\tan\alpha,$ then $x=\sin^2\alpha.$ Hence, $\alpha=\tan^{-1}t=\sin^{-1}\sqrt{x}.$ Consider the function $$f(x) = \frac{\sin^{-1}\sqrt{x}}{\sqrt{x(1-x)}}.$$ Unlike its numerator, $f$ has no $\sqrt{x}$ singularity at $x=0,$ so it has a power series expansion $$f(x) = \sum_{n=0}^{\infty}b_n x^n$$ We can find the $b_n$ by the following steps: $$\begin{align*} (x(1-x))^{1/2} f &= \sin^{-1}x^{1/2} \\ (x(1-x))^{1/2} f' + \tfrac{1}{2}(x(1-x))^{-1/2}(1-2x)f &= \tfrac{1}{2}(x(1-x))^{-1/2} \\ 2x(1-x) f' + (1-2x) f &= 1 \\ (2xf' + f) - 2(x^2 f' + x f) &= 1 \\ \sum (2n+1)b_n x^n - \sum 2(n+1) b_n x^{n+1} &= 1 \\ \sum ((2n+1)b_n - 2n b_{n-1}) x^n &= 1 \\ b_0 &= 1 \\ b_n &= \frac{2n}{2n+1} b_{n-1} \end{align*}$$ This recurrence for $b_n,$ together with the fact that $\sqrt{x(1-x)}=x/t,$ produces the Euler series for inverse tangent.
However much serendipity was involved, Euler's method is arguably the easiest way to compute an accurate value of $\pi$ with pencil and paper which has ever been discovered. If Euler could get twenty digits in an hour as an afterthought, I decided I would try my hand at it.
The $t=1/7$ series has $x=0.02$ and we need to multiply it by $5x/t=0.7.$ The $t=3/79$ series has $x=0.00144$ and we need to multiply it by $2x/t=0.07584$: $$\frac{\pi}{4} = 0.7\sum b_n (0.02)^n + 0.07584\sum b_n (0.00144)^n$$ I decided to use the recurrence on $b_n$ to compute successive terms in the series: $$\begin{align*} b_0 &= 1 \\ b_n x^n &= \frac{2n}{2n+1} x b_{n-1}x^{n-1} \end{align*}$$ Using the recurrence, computing each term requires a simple multiply $2nx$ followed by the full precision decimal multiply by $b_{n-1}x^{n-1},$ followed by a small integer divide by $2n+1.$
Before starting, we need to decide how many decimal points we want in our final result, because we need to cut off the long division and multiplication operations for each term somewhere. That also determines how many terms of each series we need to compute. With the alternating Taylor series, it is obvious that each successive term alternately overshoots or undershoots the exact answer, so we simply need to go until the next term is smaller than the accuracy we want. However, the Euler series does not alternate, so we need a way to bound the sum of all remaining terms. Fortunately, because $b_n/b_{n-1}$ approaches $1$ from below as $n$ increases, we know that: $$\begin{array}{l} \sum_{n=N}^{\infty} b_n x^n \lt b_N x^N (1 + x + x^2 + x^3 + ...) \\ \sum_{n=N}^{\infty} b_n x^n \lt b_N x^N / (1 - x) \\ \sum_{n=0}^{N} b_n x^n \lt \sum_{n=0}^{\infty} b_n x^n \lt \sum_{n=0}^{N} b_n x^n + (b_N x^N) x / (1 - x) \end{array}$$ Hence, after computing the $N$-th term, we can multiply it by $x/(1-x)$ to put a limit on the sum of all subsequent terms.
We can also estimate the magnitude of the $n$-th term of the sequence without computing it exactly. I estimated an upper bound for $b_6 = 2^{10}/(3\cdot 7\cdot 11\cdot 13) \sim 0.4$ while an upper bound for $b_{10} = 2^{18}/(11\cdot 13\cdot 17\cdot 19\cdot 21) \sim 0.3.$ In other words, assuming $b_n \sim 0.4$ is not going to be a huge overestimate for any $n\ge6$ we are likely to want to compute. To get 20 digits of $\pi$ after the decimal point, we need the sum of all omitted terms to be less than about $3\times 10^{-21}.$ But to get $\pi$ we need to multiply the first series by $2.8$ and the second series by roughly $0.32.$ (Note the sum $3.12$ is just a bit less than $\pi$!) Thus the tolerable errors in the series themselves are about $10^{-21}$ for the $t=1/7$ series, and $t=10^{-20}$ for the $t=3/79$ series.
So how many terms of each will we need? The sixth term of the $t=3/79$ series is roughly $0.4\cdot 0.00144^6,$ which is roughly $3\times 10^{-18}.$ The sum of all remaining terms is thus roughly $4.5\times 10^{-21},$ which is about half our acceptable limit. The tenth term of the $t=1/7$ series is roughly $0.3\cdot 0.02^{10},$ also about $3\times 10^{-18}.$ The sum of all the remaining terms is thus about $6\times 10^{-20},$ which is too big. Even after the eleventh term, the sum of what remains would be about $x$ times smaller, or $10^{-21},$ which is right on the borderline of our target. Hence if we want to be absolutely sure of 20 places after the decimal point of $\pi,$ we need 6 terms of the $t=3/79$ series, and 11 or possibly 12 terms of the $t=1/7$ series. Since we only need to keep a few significant digits of the last couple of terms, it doesn't cost a lot to tack on one more.
So here are the results I got by hand (on the second attempt in a few cases) for the two series. I show only the results; the actual calculations filled about seven sheets of binder paper. (For comparison, the algebra to derive all the formulas and error estimates filled five sheets.) If you give it a try, you can check your results against these to spot and correct your errors. We begin with the $t=3/79$ series: $$\begin{align*} b_1 x = 2/3 x &= 0.00096 \\ b_2 x^2 = 4/5 x b_1 x &= 0.00000110592 \\ b_3 x^3 = 6/7 x b_2 x^2 &= 0.0000000013650212571428\ 57 \\ b_4 x^4 = 8/9 x b_3 x^3 &= 0.0000000000017472272091\ 42 \\ b_5 x^5 = 10/11 x b_4 x^4 &= 0.0000000000000022872792\ 56 \\ b_6 x^6 = 12/13 x b_5 x^5 &= 0.0000000000000000030403\ 22 \\ \sum b_n x^n &= 1.0009611072867707746717 \\ b_6 x^6 x/(1-x) &= 0.0000000000000000000044\ \text{error} \end{align*}$$ Note that some of the $2nx/(2n+1)$ factors have cancellations which simplify the calculation, because $x=0.00144=16\cdot9\cdot10^{-5}.$ For example, $8x/9=128\cdot 10^{-5}.$
And here is the $t=1/7$ series: $$\begin{align*} b_1 x = 2/3 x &= 0.0133333333333333333333\ 33 \\ b_2 x^2 = 4/5 x b_1 x &= 0.0002133333333333333333\ 33 \\ b_3 x^3 = 6/7 x b_2 x^2 &= 0.0000036571428571428571\ 43 \\ b_4 x^4 = 8/9 x b_3 x^3 &= 0.0000000650158730158730\ 16 \\ b_5 x^5 = 10/11 x b_4 x^4 &= 0.0000000011821067821067\ 82 \\ b_6 x^6 = 12/13 x b_5 x^5 &= 0.0000000000218235098235\ 10 \\ b_7 x^7 = 14/15 x b_6 x^6 &= 0.0000000000004073721833\ 72 \\ b_8 x^8 = 16/17 x b_7 x^7 &= 0.0000000000000076681822\ 75 \\ b_9 x^9 = 18/19 x b_8 x^8 &= 0.0000000000000001452918\ 75 \\ b_{10}x^{10}= 20/21 x b_9 x^9 &= 0.0000000000000000027674\ 64 \\ b_{11}x^{11}= 22/23 x b_{10}x^{10} &= 0.0000000000000000000529\ 42 \\ \sum b_n x^n &= 1.0135503900297423058050 \\ b_{11} x^{11} x/(1-x) &= 0.0000000000000000000011\ \text{error} \end{align*}$$ The only cancellation here is in the $14/15$ term.
All that remains is to multiply the $t=3/79$ series by $0.07584$ and the $t=1/7$ series by $0.7$ and add them, showing 22 decimal places: $$\begin{array}{rl} 3/79\ \text{part}& 0.0759128903766286955511 \\ 1/7\ \text{part}& 0.7094852730208196140635 \\ \pi/4= & 0.7853981633974483096146 \\ \pi = & 3.1415926535897932384584 \end{array}$$ The upper bounds on the contributions of the remaining terms are $1.3\times 10^{-21}$ for the $t=3/79$ part and $3.1\times 10^{-21}$ for the $t=1/7$ part, for a total of $4.4\times 10^{-21},$ putting an upper bound of $3.1415926535897932384638.$ The correct value to 22 places is $3.1415926535897932384626,$ which is indeed in our calculated range. Rounding our result to 20 places gives 20 correct decimal places $\pi = 3.14159265358979323846,$ using 11 $t=1/7$ terms and 6 $t=3/79$ terms, although we need to round up the final digit.
The diameter of the solar system is about $1.5\times 10^{10}$ km, so our error of $10^{-21}/\pi$ would produce an error in its perimeter of $0.5\times 10^{-11}$ km, or $5$ nm, which is about 50 atoms. So I've verified the value of $\pi$ myself, by hand, to an accuracy far greater than any practical need, courtesy of the genius of Euler as inspired by Machin. I've also established that my arithmetic skills are within two (but not one) orders of magnitude of the great Euler, no doubt closer than my batting average would be to Ted Williams, which is something. But seriously, a day or two spent this way has taught me things about the history of mathematics and the development of algorithms which would be hard to appreciate otherwise.