Monte Carlo simulation is the use of random numbers to study scientific problems. It is a methodology that’s used to explore non-random behaviour in everything from mathematics to physics to chemistry to biology. Deterministic, symbolic analysis and reasoning is generally preferable because of better accuracy and understanding (e.g. explicit cause and effect). There are times, however, when deterministic methods are not applicable, usually due to lack of understanding of underlying principles or simply too many ‘dimensions’. If there are many confounding variables, it often leads to a ‘combinatorial explosion’, where formulaic methods are quickly overwhelmed. In plain English, estimates that may take days of computation using classical methods can sometimes be done in seconds using Monte Carlo methods (although accuracy and error range may suffer). In a sense, Monte Carlo methods replace formulae with algorithms.
There are two parts to this. First is the simulation, which is the generation of random numbers, a subject that can be fairly deep on its own, especially where computers are involved. In any case, the random generation of multi-dimensional points is generally a far simpler and quicker task than is classical multi-dimensional analysis. Second is the selection or design of algorithms. These are not the result of years of incremental academic scientific study, but rather something that is more like artistic insight. While the solution is doctrinal and scientific, the development of the path is not. Monte Carlo simulation can open up the fun of such investigations to ‘the rest of us’.
As an example, we’ll take a problem from mathematics, namely the calculation of the number π, and apply a very simple Monte Carlo simulation. This example uses Forth, and integer arithmetic only (not floating point). Forth is a language that runs beautifully on the Raspberry Pi (see “Forth on the Pi”), which makes it even more fun.
Consider a circle (radius r) which fits exactly into a square. The area of the circle is πr² and the area of the square is (2r)² so the ratio of the areas is just π/4
This enables us to calculate π = 4 (Acircle/Asquare). To simplify the problem, we consider only the upper right quadrant, and by symmetry the ratio remains the same. To estimate this ratio, we’ll generate lots of random points (x,y), all of which lie within the square. Those points which satisfy the condition x² + y² < r² will also lie within the circle (because the equation of a circle in Cartesian coordinates is x² + y² = r²).
Forth is best used as an integer-only language, with floating point being a fancy addition that old-school Forth purists try to stay away from. This is not as big a limitation as one might think, especially in a 32-bit Forth, which can directly handle very large integers. We’ll use the integers 0-999 to represent the real numbers 0.000-0.999 (an arbitrary but convenient choice). Also, r² = 1000² = 1000000 in this integer-only form. Similarly, the estimated value of π will need to be scaled by the number of points generated by the simulation. For example, if we use 10,000 points (105), we’ll need to read a number like 314159 as 3.14159 by mentally shifting the decimal point left 5 places. It’s a small price to pay for avoiding all that floating point overhead, with the benefit of greatly increased computation speed. It should be noted that Forth is not primarily used for raw number crunching, but its logical and concatenative nature make it an excellent choice for inferencing and artificial intelligence.
A good platform to try this out on is the Raspberry Pi. Although most tutorials and examples are written in Python, installing Forth on the Pi is easy and worthwhile.
Here is the code, with comments in green. The code can be directly entered in the Forth interpreter (don’t type in the comments).
[This code was lost in a site crash earlier this year. I’ll post it when I find or rewrite it]
Here are some test runs:
1000 pi 3232 10000 pi 31472 100000 pi 314096
- the generated numbers are only ‘pseudo-random’ (the same ‘seed’ will generate identical runs)
- the choice of 3 significant digits for the point x & y values is very low precision
- increasing this will also require changing the r² constant in the
in-circle?
word - the thickness of the circle’s boundary is theoretically (mathematically) infinitesimally small, but this not so in our little 1000×1000 model
- even with a 32-bit Forth, there is a limit to point precision and simulation size
- the estimate for π doesn’t converge uniformly, in fact it will get very
close to the proper value at certain points (see if you can find some), but then diverge away from it again for a while - other sources of error are left to you as an exercise 🙂