Saturday, October 1, 2011

Building Numerical Approximations In Erlang

This post is partially complete (I don't have a lot of time to work on it)...

I'm writing this post for a few reasons:
1) I keep telling people that I'm actually going to redo projects or homework that I've done for class, but in Erlang (because Erlang is cool)
2) I might have volunteered to give a short talk on Erlang, so I need to have something prepared in just in case


Erlang documentation is limited due to its low number of users.  There are two excellent books and a great language walk-through at http://learnyousomeerlang.com/contents.


Root finding (a.k.a., where does a function cross zero) is common in scientific computing: it is concernted with finding a number r such that f(r) = 0.  There are several common root finding algorithms: Newton's Method, the Finite-Distance Newton's Method, the Secant Method, and the Bisection Method.  Let's start by considering Newton's Method; it is defined by the sequence xn+1 = xn - f(xn)/f'(xn).  When, you might ask, should we stop getting new xn+1's?  When we are "close" to zero (the root of the equation); "close" is not really codable though.  Or is it?  On a floating-point system (i.e., any computer ever built), there are numbers so small (i.e., close to 0) that, when added to another number, the equal that other number.  By convention, we say that a number n is less than the machine epsilon when (1 + n) = 1.  The machine epsilon is the smallest number emsuch that (1 + em) != 1.

Ok, let's code finding the machine epsilon and determining if a number is smaller than it.

Erlang is simple and elegant.  Let's walk through the lines of code to see what they do.

Line 1: Declare this file as a module (so that we can reuse it later)

Lines 3 & 4: Export three functions with 0, 1, and 1 argument each, respectively

Line 7: If no argument is passed to calculate_machine_epsilon, pass 2 in (base 2 is a simple algorithm)

Line 8 & 9: Make an initial guess about the machine epsilon; we use 1 because base^0 = 1 for any base.  The no indicates to our later code that we are not done guessing machine epsilons.  The no is called an atom in the Erlang language.  Any word starting with with a lower-case letter is an atom.  Note that function names are atoms; this is on purpose.


Lines 11 & 12: An Erlang function doesn't take in variables in the way you'd see in most languages.  Instead, functions are pattern matchers.  At runtime, when an Erlang function is called, the virtual machine looks for any matching functions, if none are found, it throws a bad_match error.  So when you we see

we can guess that the developer indended this function to only have two valid values for the final argument of this function: yes and no (this is a good bet since I wrote the code).  Thus, if you called calculate_machine_epsilon(_Base, LastGuess, 100.5) your code would crash.  In practice, this is helpful because you code for specific patterns and testing will indicate bugs pretty quickly.  You'll note that the first definition of this function  has a _Base argument for the first variable; whereas, the second has Base.  Erlang's compiler warns if you do not use every variable that you've set; the _ prefix to a variable name tells the compiler that you don't intend to use the variable (so it doesn't warn).  Keeping the name the same (except for the _) is useful when somebody tries to figure out what code does months later.   Note that, in this case, the function has the same number of arguments and they appear to be (from the definition) the same "type."  Functions do not need to have the same number of arguments, and the types of each function with the same number of arguments don't need to be the same (think patterns instead of traditional c-style functions).

Line 11 (specific): The second variable is the last guess; the third variable should be yes if the last guess is less than the machine epsilon.  Thus, if it is yes, we're done, so return the last guess.

Line 12 (specific):  The last guess of machine epsilon isn't small enough, so do more.  Note that line 9 passed no to this function with the initial guess of 1.

Line 13: The next guess of the machine epsilon is the current guess divided by the number base.

Line 14: Call this same function again, passing in the guess we just calculated and the value of is_less_than_machine_epsilon.

Lines 16 - 19: A number n is less than the machine epsilon if 1 + n = 1.  These three lines of code show Erlang's pattern matching at its strongest.  I call machine_epsilon_test(number + 1) and pattern match it against machine_epsilon_test(1.0) and machine_epsilon_test(_Other).  In another language, this would be done with an if statement.  Erlang, however, allows you to do it via pattern matching.  This is both elegant and reusable.


You may have noticed that the algorithm is correct for base 2 but not for any other meaningful base.  It doesn't find the true machine epsilon.  Let's update the algorithm to do so:


Now that we have a mechanism for determining if a number is "close" to zero (it is less than the machine epsilon), let's build Newton's Method:


Now, an Emakefile:


And an ant build file (because Ant is great):



It's time for me to run some errands with my wife and daughter, so I'm going to finish the commentary later.
calculate_machine_epsilon(_Base, LastGuess, yes) -> ...
calculate_machine_epsilon(Base, LastGuess, no) -> ...
calculate_machine_epsilon(_Base, LastGuess, yes) ->...
calculate_machine_epsilon(Base, LastGuess, no) -> ...