For many people learning to program, the first stumbling block is variables: they find them very confusing. Why is this?
It doesn't seem quite so surprising that novice programmers find functions and objects hard to understand. (These topics seem to form the next two major stumbling blocks for novice programmers.) Functions and objects both draw on an understanding of abstraction, and the idea of abstraction seems to be hard for some people to grasp; they seem to find it difficult, when presented with particular examples of abstraction, to understand that these are all illustrations of the same general concept. It's as though you need to already understand abstraction before you can see it in the examples.
So, functions and objects seem like major ideas, but ... variables? Why is it so difficult to use variables? To expert programmers, variables seem entirely obvious; it's genuinely hard to see what the problem is. I'd like to suggest a possible explanation: novices confuse one variable with another because they are suffering from short-term memory overload. Perhaps novices can hold only one or two variables in mind as they think about their program.
At first this seems unlikely. We all know about George Miller's seminal paper The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information (Psychological Review, 1956, 63, pp. 81-97). We expect our short-term memory to hold around five to nine "things" before we start confusing or forgetting them. But that number depends on how easily we recognise the things, which in turn rests on "categorical" knowledge in our long-term memory. Without that knowledge, the things all tend to look alike, and the number we can hold in short-term memory can be far smaller. It can be as low as one thing!
For example, Henrik Olsson and Leo Poom, in their paper Visual memory needs categories (PNAS, 2005, 102(24), pp. 8776-8780), describe experiments in which they presented subjects with visual patterns that were easy to distinguish when viewed side by side, but confusing when viewed one after another, because they were drawn from the same categories (for example, differently oriented ovals within ovals). They found that for these kinds of patterns, the subjects' short-term visual memory could hold only one thing: the last thing they saw. This came as quite a surprise to the participants, who had expected to do much better.
This kind of confusion is not restricted to visual memory. A similar effect is described by Malcolm Gladwell in his book Blink:
"Have a friend pour Pepsi into one glass and Coke into another and try
to tell them apart. Let's say you succeed. Congratulations. Now, let's
try the test again, in a slightly different form. This time have your
tester give you three glasses, two of which are filled with one
of the Colas and the third with the other. In the beverage business,
this is called a triangle test. This time around, I don't want you to
identify which is Coke and which is Pepsi. All I want you to say is
which of the three drinks is not like the other two. Believe it or
not, you will find this task incredibly hard. If a thousand people
were to try this test, just over one-third would guess right —
which is not much better than chance; we might as well guess."
(Blink, p185.)
Unless you are an expert taster, with the practiced categorical knowledge in long-term memory that lets you recognise and distinguish these tastes, your short-term memory holds only one taste: the last thing you tasted. And we don't expect that.
So what I'm suggesting is that a similar effect might happen with variables in programs. When experts look at variables, they apply categorical knowledge from their long-term memory: they recognise patterns of variable use, and they recognise what each variable is for in the program. This is also why experts choose much better variable names than novices do: an expert tends to choose a name that reflects the variable's purpose. Given a poorly written program, once an expert has understood it, one thing they will certainly do is rename the variables.
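To make that concrete, here is an illustration of my own (the post gives none; the example is in Python, my choice since the post names no language): the same function written twice, once with novice-style names and once with purpose-revealing ones.

    # The behaviour is identical; only the names change.
    def f(a, b):                  # novice names reveal nothing about purpose
        c = 0
        for d in a:
            if d > b:
                c += 1
        return c

    def count_above(values, threshold):   # names chosen to reflect purpose
        matches = 0
        for value in values:
            if value > threshold:
                matches += 1
        return matches

    print(f([1, 5, 9], 4))            # 2
    print(count_above([1, 5, 9], 4))  # 2: same result, far less memory load

The second version asks the reader to remember almost nothing; each name carries its own category.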
Novices, on the other hand, don't do any of this. They can't: they don't yet have the categorical knowledge in long-term memory about patterns of variable use. Of course, they can tell one variable from another, just as people can tell Pepsi from Coke. But if I'm right, they will find it extremely hard to understand programs that require them to hold more than a single previous variable in mind. We might therefore expect x = x + y to be just at the limit of a novice's understanding. (There are two variables, and x actually means two different things on the two sides of the assignment, which might push it over the edge.)
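A minimal trace (my own sketch, again in Python) of what x means on each side of that assignment:

    x = 3        # x currently holds 3
    y = 4
    x = x + y    # right-hand x is the old value, 3; left-hand x receives 3 + 4
    print(x)     # 7 -- the reader must keep "old x" and "new x" distinct in mind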
If novice programmers cannot understand their programs because they cannot intuitively distinguish variables, this might explain some otherwise mysterious features of their code. For example, I'm sure I'm not the first programming teacher to notice the remarkable enthusiasm of novices for inventing new variables and scattering their code with redundant assignments, as if they hope that one more might do the trick and make their program work. (And by accident, it sometimes does.) It might also partly explain a frequent misunderstanding when we come to functions and objects: the idea that every variable with the same name is actually the same variable. (As experts we have no problem with the idea that we can use the same name for local variables in different functions, even ones that call one another, or that this or self in a method is different when invoked on different objects. But if novices have to fall back on merely comparing the textual names of variables, they are bound to be confused when they reason about different variables that happen to share a name.)
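Here is a small sketch of that confusion (my example, not the post's): three same-named variables that are nevertheless entirely distinct.

    def shout(text):
        result = text.upper() + "!"      # a "result" local to shout
        return result

    def greet(name):
        result = shout("hello " + name)  # a different "result"; shout had its own
        return result

    class Counter:
        def __init__(self):
            self.count = 0

        def bump(self):
            self.count += 1   # "self" is whichever Counter the method was called on

    a = Counter()
    b = Counter()
    a.bump()
    print(greet("world"))      # HELLO WORLD!
    print(a.count, b.count)    # 1 0 -- same name "count", two separate variables

An expert reads the two "result" variables and the two "count" attributes as obviously unrelated; a novice comparing names alone cannot.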
Of course, without intuitive understanding, many novice programmers would just give up, and many do. Those who persist, however, will after a while start to pick up the long-term categorical knowledge they need unconsciously. (Or at least, those who are ultimately successful must do this.) But rather than leaving it to chance, there are things we can do to help them on their way: we can at least try to teach some categorical knowledge about variables explicitly, through worked examples. I have found that explaining the roles of variables seems to help a lot. (Just talking about roles is not quite enough, though: you have to get students to engage, and to practice describing the roles of variables both in example programs and in their own.)
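As a sketch of what such a worked example might look like (the role labels follow Jorma Sajaniemi's "roles of variables" classification, which I am assuming is the one meant; the code itself is mine):

    def average_above(values, threshold):
        # threshold is a fixed value: set once by the caller, never changed
        total = 0                  # gatherer: accumulates a running sum
        count = 0                  # stepper: advances 0, 1, 2, ... per match
        for value in values:       # most-recent holder: the latest item examined
            if value > threshold:
                total += value
                count += 1
        return total / count if count else 0.0

    print(average_above([2, 8, 6, 1], 3))   # 7.0: the mean of 8 and 6

Naming the role gives students a ready-made category for each variable, which is exactly the long-term knowledge the argument above says they lack.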
You mention that functions are hard because of abstraction, but variables are also an abstraction: instead of concrete values you are using something which (in general) has a concrete value only at runtime. At program-writing time you have to think about a set of possible values, together with its relationships to other such sets -- not a simple thing to do. The main complication added by functions is parameters (and, in imperative programming, also the fact that evaluation of the body is postponed).
It seems that for some students abstract thinking is so hard that you don't have to wait until functions for them to hit the wall. Papert seems to believe there is another way to reach them in a programming class: http://www.papert.org/articles/EpistemologicalPluralism.html