Table of Contents
Algorithms Redux
The Example: Euclid's Algorithm
Using a Loop to Code Euclid's Algorithm
Using Recursion to Code Euclid's Algorithm
General Recursive Structure
Advantages of Recursion
Once again, here's an official definition of an algorithm: "a finite sequence of unambiguous, executable steps that ultimately terminate if followed" (J. Glenn Brookshire, Computer Science: An Overview, 4th edition, p. 132).
That means that the instructions on your shampoo bottle aren't an algorithm, are they? Although there's a finite number of steps there (wash, rinse, repeat), it's the last one that goofs us up. If you follow the directions literally, you'll be washing your hair forever.
But once you do manage to boil your programming (or hair-washing) problem down to an algorithm that terminates successfully, you still have some choices to make about how to write your Perl code. It's possible to write the same algorithm several different ways.
The goal of this page is to help you decide whether you want to use a loop or recursion.
Let's work through an example, figuring out how to do it both ways.
Sometimes scienticians (that's their technical title, as The Simpsons has taught us) have to do math stuff. Here's one example. Take two numbers (say 10 and 15) and find the largest number that divides both of them. Well, 10 is divisible by 1, 2, 5, and 10. And 15 is divisible by 1, 3, 5, and 15. So I guess the answer to this problem is 5. (Homer: "Whoo hoo!") A scientician would say that "5 is the greatest common divisor of 10 and 15."
We solved that problem the hard way by doing all the math. We factored both numbers and then did a direct comparison, looking for numbers from each list that match. But an ancient Greek mathematician whose name shall remain anonymous devised an algorithm so unambiguous and executable that it would terminate if even a dolt followed it. (Homer: "Whoo hoo!") Instead of 10 and 15, let's call our numbers X and Y. Here's the process:
As long as the value of neither X nor Y is zero, continue dividing the larger of the values by the smaller and assigning X and Y the values of the divisor and the remainder, respectively. The final value of X is the greatest common divisor.
Let's try the process with 15 and 10:
Step 1. Set X = 15 and Y = 10.
Step 2. Is X or Y equal to 0? No. Whew. Keep going.
Step 3. Divide the larger by the smaller: 15 / 10 = 1 with 5 remaining.
Step 4. Set X = 10 (divisor) and Y = 5 (remainder). (Sounds like Step 1 again, doesn't it?)
Step 5. Is X or Y equal to 0? No. Whew. Keep going. (Hmmm. Step 2 again.)
Step 6. Divide the larger by the smaller: 10 / 5 = 2 with 0 remaining. (Step 3.... I sense a pattern.)
Step 7. Set X = 5 (divisor) and Y = 0 (remainder). (Sounds suspiciously like Step 4.)
Step 8. Is X or Y equal to 0? YES! STOP! X marks the spot... er, answer.
And the answer is in variable X, which now equals 5. (Homer: "Whoo hoo!")
Did you notice that, beyond finding the remainder, we don't actually have to do any division at all? Even though 10 goes into 15 one time (Step 3) and 5 goes into 10 two times (Step 6), we just throw away those answers. That's a good clue towards deciding how many variables we'll need to declare in our program. And to make things even easier, there's a single operator that we can put into our program precisely to find only the remainder left over after two numbers are divided, so we probably won't ever need to divide anything at all.
And did you notice that letting X equal the divisor simply means setting X equal to the smaller of the two numbers? Again, no division necessary. Neat, eh? Can you also see how we worked through the steps of the algorithm a little more than two times in order to find our answer? Kinda makes you think we could write the thing in Perl using a loop, doesn't it?
And we could. In a kind of pidgin Perl, it might look like the following. Notice that I've added the modulo command here, by the way, which is computer-speak for "the remainder left over when the first number is divided by the second number."
get X and Y while ( X != 0 ) and (Y != 0) do: divisor = the smaller of X and Y remainder = X modulo Y (if X is bigger), otherwise Y modulo X Set X = divisor and Y = remainder Do the test for zero again; if we fail, it means the loop is done. Now here, outside the loop, the answer magically is X
I know that's not very good Perl OK, it's not Perl at all but I think you get the point. Here's the real Perl (and notice the syntax of the modulo command, which uses the % symbol):
while ( $x != 0 and $y != 0) { # neither is zero if ( $x > $y) { $divisor = $y; # do this if $y is the smallest number $remainder = $x % $y; } else { $divisor = $x; # otherwise $x is the smallest number $remainder = $y % $x; } $x = $divisor; $y = $remainder } # loop is finished; jump up to do the test again print "The greatest common divisor is $x\n";
That's it. It works. Notice, though, that we could streamline it a bit. For example, when the if test fails, we set $divisor = $x and then turn around and set $x = $divisor. Harumph. Well, when we write our final version of the code, we can fix that.
Euclid's Algorithm is a good candidate for recursion because it's so repetitive. We just took advantage of that quality using a while loop. Now we'll write the algorithm again but use a recursive subroutine instead and remember, a recursive subroutine is just a subroutine that calls itself.
Remember the steps we went through to find the greatest common divisor of 15 and 10? We stepped through the algorithm a few times: in the first loop, X = 15 and Y = 10. The second time through, X = 10 and Y = 5. And finally, when X = 5 and Y = 0, we've met the terminating criteria and the algorithm can finish. The answer was magically stored in the X variable.
In effect, we could say, then, that finding the greatest common divisor of 15 and 10 is somehow equivalent to finding the greatest common divisor of 10 and 5, which is also somehow equivalent to finding the greatest common divisor of 5 and 0 (kinda we never finished all the steps in that one, did we?).
This is the essence of recursion: reduce a big problem down into smaller parts that are, in effect, equivalent to the bigger problem. Each time we simplify the problem a little bit, we send the new, simplified data to the subroutine again. Eventually, the problem can't get any simpler, and we've magically arrived at the answer.
Let's see if doing the problem in pidgin Perl will help. Remember, same algorithm, just different Perl strategy.
get X and Y; send them to the subroutine
subroutine takes X and Y if ( X != 0 ) and (Y != 0) do: divisor = the smaller of X and Y remainder = X modulo Y (if X is bigger), otherwise Y modulo X Set X = divisor and Y = remainder Send X and Y to the subroutine again.
Otherwise, when the "if" fails, we have arrived at the answer.
Here, outside the subroutine, the answer is magically "returned" to us.Can you see the similarity between this version and the earlier version? The algorithm looks very similar. Notice, though, that the while loop is gone and now we have an if statement. That's the gist of writing a recursive subroutine: the subroutine is basically a big if-else statement, and instead of writing any kind of control loop, we use a subroutine that keeps calling itself.
Here's how it looks in Perl. Notice that I've included two lines prior to the subroutine the subroutine call itself and the print line.
$GCD = ÷ ($x, $y); # $GCD will contain the answer; & marks the subroutine call print "The greatest common divisor is $GCD\n"; sub divide { my ($x, $y, $divisor, $remainder); # these variables are scoped to the subroutine ONLY ($x, $y) = @_; # assignment in list context from the default @_ array if ($x != 0 and $y != 0) { # Big Loop: if successful, do some processing... if ( $x > $y ) { $divisor = $y; $remainder = $x % $y } else { $divisor = $x; $remainder = $y % $x } ÷ ($divisor, $remainder) # ...and call the subroutine again with new values } else { # Otherwise, the "if" fails and we... return $x # ...return with the solution } }Notice three things:
- I shifted $x and $y off the default @_ array using an assignment in list context. Remember, all the arguments passed to a subroutine get "flattened" and lined up sequentially in the @_ array. As a programmer, I have to remember to take them off in the same order that I put them on way back in the main part of the program.
- This version of the algorithm has two nested if statements. The if ($x > $y) is the same as the one in the earlier version of the program, the one that used the while loop. But the old while loop has now turned into if ($x != 0 and $y != 0), and it now has an else clause. That's new. This is the machine that makes the recursive subroutine work. It controls how many times the subroutine calls itself, and it also controls what to do when the subroutine stops. When the test fails, the subroutine stops calling itself and it...:
- ...proceeds to the else clause where it executes the return $x command. This command tells the subroutine to "return" from whence it came, and to take the current value of $x with it. In short, the recursive subroutine will keep calling itself again and again, falling into a black hole as it were, until the if ($x != 0 and $y != 0) test fails. The subroutine then does a U-turn and bootstraps itself back out of the bottomless pit, dragging the value of $x with it, kicking and screaming all the way.
So in general, we could say that the structure of your recursive subroutine will look something like this:
if ( successful test) do: process stuff call the subroutine again with new values else return with the answerNot so difficult, is it?
But notice: it's the if-else statement that transforms our subroutine from an infinite hair-washing session into an algorithm that actually terminates. This structure is crucial because if we omit it, the computer will simply recurse until it eats up all available memory, at which point it will die with an "out of memory" error.
You might be asking why anyone would want to use recursion. After all, it's a much more abstract concept than a simple loop and so it requires a little extra brainpower to figure out the Perl coding. We also had to use a subroutine (declaring some new variables along the way as well as pretty much being forced to use an if-else statement in it), and the code (at least in our example here) turned out to be a bit longer than the same code in a while loop. As if that's not enough, recursion takes some extra computer memory each time the subroutine calls itself, Perl must store away the current status of the subroutine (current line being executed, values of all the variables, etc.) in a special internal and invisible data structure called a "stack." (If you "recurse" a long time, the stack will get very big and Perl will give you a warning message specifying that you're doing some "deep recursion.") So what gives?
A couple of things. Recursive subroutines generally solve problems in the most general way possible, which often means that recursive subroutines can be more readable than other coded versions of the same algorithm. Remember how we were tempted to "streamline" the while loop version of Euclid's algorithm because we set $divisor = $x and then turned around and set $x = $divisor? We were starting to tweak the code to accommodate certain special cases. Notice that we can't really streamline the recursive subroutine in the same way because we're not trying to economize the manipulations of $x and $y; we're using $divisor and $remainder instead. Recursive routines shun tweaking in favor of the most general answer possible. In fact, I think that the recursive version is a bit clearer to read even with its nested if statements because it's obvious that we're sending the $divisor and the $remainder to the subroutine each time and it's equally obvious that the returned answer will be stored in $x. The version that used a while loop gave us no such clues about its behavior.
Another advantage to recursive subroutines is that they can be used to solve very complex problems that have repetitive structures everything from finding paths through mazes to solving complex equations to playing games like chess to sorting massive lists of data to deciphering HTML files.
Yep, you read that right. An HTML file can be represented in a structure that computer scientists call a tree. It might look like this but remember, this is just a "for instance." Not all web pages follow this exact structure. (And yes, computer scientists always draw trees upside down.)

Ever wonder why you had to close the </head> tag before you could open the <body> tag? Because if you didn't, the HTML file would violate the tree structure. You can't put information from two different tree branches together on the same branch. Tags that are on the same level horizontally are discrete and separate from one another. Tags lower vertically are completely enveloped by tags higher up. That's why you have to close the </title> tag before you can close the </head> tag: <title> is lower on the tree and has to be wholly contained inside everything that's above it. And you can't close the </html> tag until the end because it is the "root" of the whole tree and so must envelope everything below it.Well, it turns out that the most efficient, straightforward way to "parse" HTML (that is, to" read" the document so that the computer can represent it internally as a tree structure) is to use a recursive subroutine. And then the newly formed HTML tree can be "walked" or "traversed" by you guessed it another recursive subroutine. The basic algorithm says something like "Look at everything on the left branches first, and then look at everything on the right branches." Since the left branches might have their own subsets of left and right branches, recursion is the perfect tool for the job. Write the "look left, look right" code once, and presto! the code will traverse the whole tree flawlessly and almost magically.