Sunday, 21 April 2013

Three Good Questions

Good questions are important. It's often fairly straightforward to find a good answer, if you're given a good question. It's a lot harder to find a good question, given nothing. Over years of doing system design, I found three "meta" questions particularly useful. These have sometimes helped me to imagine things that might go wrong during the construction or operation of a system. Maybe they could help you too.

I was reminded of these "meta" questions by an email from Aaron Sloman following my post on Learning to Program. He said:

"I could have added that I noticed something different when teaching philosophy. Doing conceptual analysis well requires the ability to dredge out of your vast store of knowledge examples that illustrate, test or refute, some concept, distinction, or thesis.

"I found over many years of teaching that I could teach some students to produce examples merely by doing it myself in discussions: they somehow got the idea and were able to do something similar. With other students they seemed to lack the ability to access their long-term memory in an appropriate way, and I was not able to think up a way of teaching them that helped them to develop that ability. I suspect this is related to the ability to do well in the Jeopardy game, but is more complicated, and less a matter of speed.

"More importantly it may also be related to some high level requirements for being a good system designer: too often systems are designed with flaws that the designer should have noticed without waiting for users to discover them. That requires the ability to trawl through what you already know, and I fear some designers are very bad at that, and design bad systems."

So maybe my Three Good Questions help me to "trawl through what I already know", and to imagine what might be. Anyway, here are the questions:
  • Does it scale? Usually it doesn't. When you take a prototype system that works fine in the lab and try to scale it up, you almost always run into problems. Sometimes these are latency or bandwidth problems. (A system which works fine in the lab on a lightly loaded local-area network is going to feel very different to users on a congested wide-area network.) Sometimes there are problems with algorithms that have unfortunate growth characteristics. There might even be an accidental reliance on particular people who are available to make the prototype system work, but who will not be available in a larger deployed system. (For example, I am suspicious that the currently over-hyped "Massively Open Online Courseware" will fail to scale in exactly this way.) There are lots of ways that a small system can fail to scale and only one of them has to be true for it to be a fatal problem for your system.

  • Does it have a start-of-day problem? A "start-of-day problem" is a familiar concept to system designers, but it is not a term in everyday use. A system has a start-of-day problem if it can reproduce itself once it exists, or keep itself going once it is going, but needs an entirely different mechanism to start it going from nothing. For example, think of the problem of powering-up a ship from nothing. When its main engines are running, electricity comes from generators driven by those engines. (In some ships like the QM2, that's actually all that the main engines do: power transmission to the propellers is entirely electric.) However, the control systems and starter motors for the main engines need electricity to work. What provides that electricity if the main engines are not already going? This is a start-of-day problem, and it is solved in this case by having auxiliary generators to provide that electric power. (But they themselves have their own start-of-day problem, and they need to be provided with further mechanisms to start them from cold ...) We see this problem over and over in system design. For example, it is traditional to write compilers in their own language. But in that case how do you compile the first compiler for that language? (By having a bootstrap compiler written in something else, that you run once and then never need again.)

  • What are the security implications? If we are doing system design properly, then we think about security all the way through, not just at the end. We think about exactly what security guarantees we are going to provide and to whom. We think about the traditional security properties of Confidentiality, Integrity and Availability (and maybe a few others). But this question is asking you to go a step further, and to consider the unintended consequences which might result from correctly implementing the security guarantees that you think you want. For example, to prevent an online password-guessing attack, it's usual to lock out an account after, say, three wrong attempts. That's great for confidentiality, because a locked-out attacker can't now get access to your data. However, the unintended consequence is that neither can you! The lock-out mechanism which delivers better confidentiality has also delivered worse availability: if all the attacker in fact wanted was a denial-of-service attack, you have handed it to him on a plate.
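That trade-off is easy to see in code. Here is a minimal sketch of such a three-strikes policy (the class and method names are my own invention for illustration, not any particular system's API):

```python
class Account:
    """Toy account with a three-strikes lock-out policy."""
    MAX_ATTEMPTS = 3

    def __init__(self, password):
        self._password = password
        self._failures = 0
        self.locked = False

    def log_in(self, guess):
        if self.locked:
            return False
        if guess == self._password:
            self._failures = 0
            return True
        self._failures += 1
        if self._failures >= self.MAX_ATTEMPTS:
            self.locked = True  # good for confidentiality...
        return False

victim = Account("correct horse")

# An attacker who only wants denial of service needs no skill at all:
for guess in ["a", "b", "c"]:
    victim.log_in(guess)

# ...bad for availability: the legitimate owner is now shut out too.
print(victim.log_in("correct horse"))  # False
```

Real systems usually soften this with time-limited lock-outs or exponential back-off, which slow down guessing without handing a permanent denial of service to the attacker.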
As I was writing this, it struck me that the idea of trying to produce a counter-example is also a very strong technique in system design. To phrase this as a "meta" question, we might say: If you think it can't work, can you prove it can't work? Sometimes by trying (and failing) to prove that it can never work, you can actually solve the original problem. For example, I was once working on a project with an apparently insurmountable architectural problem. If we couldn't solve it, the project we were working on would almost certainly be closed down. We'd been trying to solve it for a couple of months, but with no success. So one Friday afternoon, I decided to finally put the issue to rest and I tried to prove to a couple of colleagues that a solution was impossible. I didn't manage to do that, because over the space of about an hour, by trying to construct that proof I was finally able to find a solution to our problem.

System design is always an exercise in imagination. We often know the answer, but we don't know that we know it because we need someone to ask the right question, a useful and productive question which calls forth the answer in a flash of insight. Good system designers distinguish themselves by being able to ask exactly those fruitful and productive questions. Can you teach that? I'm not sure.

Sunday, 14 April 2013

Some Useful Things

Little things can sometimes make a big difference. Here are a few things that I've found useful.

Food: Kuhn Rikon Epicurean Garlic Press

If you are going to use a garlic press, then this is the one to pick. It's heavy duty, made from stainless steel. It's probably the best garlic press in the world. To see a demo, have a look at America's Test Kitchen Equipment Review: Garlic Press where they come to the same conclusion and show off this little device. (Kuhn Rikon have another garlic press called the Easy Squeeze, which is a lot cheaper. It has a slightly different action and plastic handles. It's not nearly as good.)

Of course, if you are being macho you can just crush garlic with a kitchen knife. For example see this YouTube video. (Though the cook in this video makes a bit of a performance of removing the skin. The trick is to just trim off the ends of the garlic first with a knife, like you would while topping-and-tailing mange-tout peas. Then use the flat of your knife and gently lean your whole weight on the garlic until it just goes "click!". You can then just peel off the whole of the skin in one go rather than tediously picking pieces of skin out of the mashed up garlic.)

Coding: redemo.py

This little Python GUI program is endlessly useful when you find yourself tweaking a regular expression to make it parse your data correctly. In fact I would nowadays usually start by copying some of my example data into redemo.py's input box and then start writing my regular expression directly into its regexp box. That way I can check as I go along that the regexp is doing what I want, and when I make a mistake I'll get immediate feedback. (Redemo.py highlights the data that currently matches, reacting character-by-character as you change the data or the regexp.) When you get your regexp to work, you can just copy it straight into the program that's going to use it. And you're finished.

(On Windows, you can find redemo.py in the standard Python distribution, under Tools/scripts or Tools/demo depending on the version you are using. However, on Ubuntu Linux you have to sudo apt-get install python-examples or python3-examples, then find the mysterious place where it's been hidden, using say find /usr -type f -name redemo.py. Worth the effort though.)
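If you want the flavour of that feedback loop without the GUI, the core is easy to sketch with the standard re module. (This is my own illustration of the idea, not code taken from redemo.py.)

```python
import re

def show_matches(pattern, data):
    """Return every match of pattern in data, roughly what
    redemo.py highlights as you type."""
    try:
        regexp = re.compile(pattern)
    except re.error as e:
        return "regexp error: %s" % e
    return [m.group(0) for m in regexp.finditer(data)]

data = "ERROR 2013-04-14 disk full; WARN 2013-04-15 retried"

# Tweak the pattern, re-run, and check the matches as you go:
print(show_matches(r"\d{4}-\d{2}-\d{2}", data))
# ['2013-04-14', '2013-04-15']
```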

Food: Star Anise

This is a useful ingredient that you might not think of adding to meat dishes. (I use it in my chilli recipe.) Obviously star anise plays a big part in Chinese cooking — it's one of the five spices — but it's less common in Western cooking. But as Harold McGee says in McGee on Food and Cooking, when you cook star anise with onions "the result is the production of sulfur-phenolic aromatics that intensify the meatiness of the dish." (p430)

Coding: Inconsolata font

Earlier this year I switched to Inconsolata for all my program editing, command shells and so on. It's not a huge change, but I think it's just a bit nicer than the alternatives. For example you get a zero with a slash through it, easily distinguishable from an uppercase letter "O", and you can quite easily tell the difference between numeral one, lowercase "L" and uppercase "I".

Sunday, 7 April 2013

Why are variables so confusing?

(This post is a follow-up to Learning to Program.)

For many people, the first stumbling block when they are learning to program is that they find variables very confusing. Why is this?

It doesn't seem quite so surprising that novice programmers find functions and objects hard to understand. (These topics seem to form the next two major stumbling blocks for novice programmers.) Functions and objects both draw on an understanding of abstraction, and the idea of abstraction seems to be hard for some people to grasp; they seem to find it difficult, when presented with particular examples of abstraction, to understand that these are all illustrations of the same general concept. It's as though you need to already understand abstraction before you can see it in the examples.

So, functions and objects seem like major ideas, but ... variables? Why is it so difficult to use variables? To expert programmers, variables seem entirely obvious. It's really hard to see what the problem is. I'd like to suggest a possible explanation, which is that the novices tend to confuse one variable with another because they are suffering from a short-term memory overload. Maybe novices can only hold one or two variables in mind as they think about their program.

At first it seems unlikely that this could be the case. We all know about George Miller's seminal paper The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information (Psychological Review, 1956, vol 63, pp81-97). We expect our short-term memory to be able to hold around five to nine "things" before we start confusing or forgetting them. But that number depends upon how easily we recognise the things, based on "categorical" knowledge from our long-term memory. If we don't have that knowledge, then all the things tend to look similar, and the number we can hold in short-term memory can be a lot less. It can be as low as only one thing!

For example, Henrik Olsson and Leo Poom, in their paper Visual memory needs categories (PNAS, 2005, 102(24), pp8776-8780), describe experiments in which they presented their subjects with visual patterns which were easy to distinguish when viewed side-by-side, but which were confusing when viewed one after another because they were drawn from the same categories. (For example, differently oriented ovals within ovals.) They found that for these kinds of patterns, the subjects' short-term visual memory could only hold one thing: the last thing they saw. This was quite a surprise to the participants in the experiment, who expected to be able to do a lot better than that.

This kind of confusion is not restricted to visual memory. A similar effect is described by Malcolm Gladwell in his book Blink:

"Have a friend pour Pepsi into one glass and Coke into another and try to tell them apart. Let's say you succeed. Congratulations. Now, let's try the test again, in a slightly different form. This time have your tester give you three glasses, two of which are filled with one of the Colas and the third with the other. In the beverage business, this is called a triangle test. This time around, I don't want you to identify which is Coke and which is Pepsi. All I want you to say is which of the three drinks is not like the other two. Believe it or not, you will find this task incredibly hard. If a thousand people were to try this test, just over one-third would guess right — which is not much better than chance; we might as well guess." (Blink, p185.)

Unless you are an expert taster, unless you have practiced and have the categorical knowledge in your long-term memory which lets you recognise and distinguish these different tastes, your short-term memory only holds one thing, only one taste, the last thing you tasted. And we don't expect that.

So what I'm suggesting is that a similar effect might happen with variables in programs. When experts look at variables, they are applying categorical knowledge from their long-term memory. They recognise how variables are used and what their purpose is in the program. They recognise what each variable is for. Experts choose much better names for variables than novices. When an expert chooses a name for a variable, they tend to choose a name which reflects the purpose of the variable. Given a poorly written program, when an expert has understood it, one thing they will certainly do is to choose better names for the variables.

Novices, on the other hand, don't do any of this. They can't. They don't have the categorical knowledge in their long-term memory about patterns of variable use. Of course, they can tell one variable from another, just like people can tell Pepsi from Coke. But if I'm right, they will find it extremely hard to understand programs which require them to hold in their mind more than just a single previous variable. We might therefore expect that x = x + y would be just on the limit of understanding for a novice. (There are two variables, but x actually means two different things on each side of the assignment, which might push it over the edge.)
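To see why, here is the little trace that an expert performs without even noticing, written out step by step:

```python
x = 2
y = 3

# The right-hand side is evaluated first, using the OLD value of x:
#   x + y  ->  2 + 3  ->  5
# Only then is the name x re-bound, so x now means the NEW value.
x = x + y

print(x)  # 5
print(y)  # 3 (unchanged)
```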

If novice programmers cannot understand their programs because they cannot intuitively distinguish variables, this might explain some otherwise mysterious features of their code. For example, I'm sure I'm not the first programming teacher to notice the remarkable enthusiasm of novices for inventing new variables and scattering their code with redundant assignments, as if they hope that one more might do the trick and make their program work. (And by accident, it sometimes does.) It might also partly explain a frequent misunderstanding when we come to functions and objects: the idea that every variable with the same name is actually the same variable. (As experts we have no problem with the idea that we can use the same name for local variables in different functions, even ones that call one another. Or that this or self in a method is different when invoked on different objects. But if novices have to fall back on merely comparing the textual names of variables, they are bound to be confused when they try to reason about different variables that happen to have the same name.)
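A short example shows what the expert takes for granted (this snippet is just my illustration): the two locals called total below are different variables from each other, and from the global of the same name.

```python
total = 0  # a global variable, untouched by the functions below

def count_up(n):
    total = 0            # local to count_up
    for i in range(1, n + 1):
        total += i
    return total

def count_down(n):
    total = 0            # a DIFFERENT variable, local to count_down
    for i in range(n, 0, -1):
        total += i
    return total

print(count_up(3))    # 6
print(count_down(3))  # 6
print(total)          # 0: the same name, but three distinct variables
```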

Of course, without intuitive understanding, many novice programmers would just give up, and many do. However, if they persist, after a while novices will start to unconsciously pick up the long-term categorical knowledge that they need. (Or at least those who are ultimately successful must do this.) But rather than leaving this to chance, there are some things we can do to help them on their way: we can at least try to explicitly teach some categorical knowledge about variables through worked examples. I have found that explaining the roles of variables seems to help a lot. (But just talking about roles is not quite enough: you have to get students to engage more and to practice describing the roles of variables in example programs and in their own programs.)

Sunday, 31 March 2013

Recipe: Aubergines stuffed with chickpeas

Only after I wrote this up did I notice a very weak and accidental pun: this is a recipe for Egg Plant, and today is Easter.

Ingredients

3 Aubergines, about 20cm long
Lots of salt
5t Extra virgin olive oil
1 Onion, chopped (about 150g)
4 Cloves garlic, crushed
400g Tinned tomatoes
400g Tinned chickpeas
(preferably in plain water, drained weight = 250g)
60g Fresh coriander leaf, chopped fine
2T Tomato purée
3T Lemon juice
1t Salt
Ground black pepper, to taste
(1T = one tablespoon = 15ml; 1t = one teaspoon = 5ml)

Method

Cut the aubergines in half and slice out the insides. Leave as little flesh as you can manage on the skins, but make sure the skins are still intact. Put the skins into a colander and liberally sprinkle with salt. Then chop up the inside flesh into pieces less than 1cm and put them in the colander too, sprinkling them with salt as you go. The salt draws out the bitter taste from the aubergines by osmosis, so leave the colander in the sink or on the draining board to drip while this happens. It takes about half-an-hour.

Thoroughly rinse the salt off the aubergine, then dry it off. (I would usually pat the skins dry with kitchen paper and centrifuge the inside flesh in a salad-spinner.) Fry the skins in the olive oil for a minute or two on each side so that they soften, then lay them in a baking tin to fill with the rest of the mixture later.

Preheat oven to 200°C (= gas mark 6 = 400°F).

Peel and chop the onion. Crush the garlic. Fry the onion and garlic for 5 minutes or so, until brown. (You may need to add more olive oil to the pan: the aubergine skins tend to suck it up.)

Add the aubergine flesh to the pan and cook for another 5 minutes.

Drain the tomatoes and slice them up (about 5mm thick slices). Add them to the pan.

Drain the chickpeas and rinse them off in the colander. Add them to the pan.

Add the chopped fresh coriander, the tomato purée, lemon juice, salt and pepper to the pan. Heat the mixture until it is bubbling.

Pile the mixture into the aubergine skins and cook in the oven for 30 minutes at 200°C (= gas mark 6 = 400°F).

Discussion

This recipe is improved by using fresh tomatoes rather than tinned: remove their skin the usual way using boiling water, then slice them into 5mm slices sideways. However, that takes the overall effort just over the edge of what I can be bothered to do, so I tend to just use tinned tomatoes instead.

Sunday, 24 March 2013

Write-me-a-program

(This post is a follow-up to Learning to Program.)

I'd like to describe a particular teaching technique that has worked well for me, which I'll call "write-me-a-program". The first time I tried this, I guess the planets were aligned and it worked brilliantly. I was so bowled over that I had to call in a colleague, to show them how much fun we were having. (It's not always worked out quite so well, but it's worked pretty well and I've used it for both Python and SQL teaching.) It's a small-class technique: good for 15-20 students, and I think it would work as well for up to 30. Like a magic trick, I think that the precise details are important, so I'll try to give the recipe in sufficient detail that you can replicate it yourself. But first I'll give some background, which might be more interesting than the recipe itself, especially if you don't intend to use the recipe.

When I came up with the "write-me-a-program" idea, I wanted something that used the "assertive questioning" classroom technique along with a sequence of questions that had something of the flavour of Michel Thomas' language courses, one question building on another, using spacing for reinforcement. If you don't know about spacing, it's a very good way to reinforce learning: rather than repeating the same material immediately, or after a long time, it works best to come back to it at short intervals, with other material in between. Spacing has been known about for well over 100 years, but it's been used very patchily. See for example: The spacing effect: A case study in the failure to apply the results of psychological research by Frank Dempster (American Psychologist, Vol 43(8), August 1988, pp627-634).

Michel Thomas' language courses are the best examples of spacing I know, and they take a distinctly different approach to most language courses. The worst of those make their students remember disconnected phrases which, if they manage to recall them at all, are not very useful in everyday life. See for example Eddie Izzard's comments on learning French from his show Dress to Kill. (He complains that his school textbook contained things which were "difficult to get into conversation, things like: 'The mouse is underneath the table' ".)

In contrast, Michel Thomas doesn't give you phrases to memorise. In fact he insists that you don't try to memorise the material. Instead he builds tiny piece on tiny piece and asks you to make, in real-time, translations of increasingly complicated phrases built from these simple pieces. Worked examples are extended and extended, earlier material is incorporated with the new, folded together and repeated with variations. As David Sedaris notes in Easy Tiger: Language Lessons on the Road (New Yorker, 11 July 2011), after only a few minutes you graduate from "Give it to me", via "I would like for you to give it to me" to "I can't give it to you today, because I cannot find it". This is a long way from "The mouse is underneath the table". And much better practice for saying genuinely useful things in real life. As Sedaris says, "You're engaging with another language, not just parroting it."

So I wanted questions that built in the same way, and placed similar demands on the students. What I describe here is only part way to that objective, but the feedback that I've had from students has been good and we've mostly all had fun. Hopefully you can improve on what I've got here. Let me know! (Oh, and you might also try out a Michel Thomas language course, if only to experience first-hand the feel of his method.) Here's my recipe for my write-me-a-program technique:

Ingredients

A class of students, not too big.
A computer with a projector.
A set of slides (in e.g. Powerpoint) with the questions, one per slide.
     (See end of this post for an example set of questions on Python.)
Printed handouts with all questions + answers.
     (N.B. Don't hand these out until the END.)
A whiteboard with pens.
Jotting paper/pens for the students.

Method

First explain to the class that the lesson is going to be a quiz, with students divided into teams. The best size for a team is three people (four is okay) so get the students to divide themselves up. However, you need to avoid having all the slowest students on one team. It's also better if the fastest students are spread across different teams. If you feel that you can simply order the students to join different teams, then you can sort out any problems afterwards. Otherwise, it might be better to use some kind of lottery to divide the students into groups, and rely on randomness to even things out. Students never object to that.

Explain to the students that they will each need paper and a pen. (And wait until they have sorted this out.)

Next explain to the students what is going to happen: "I am going to show you a question on the computer projector. The question will ask you to write a few lines of program. Each of you will do this on your own. Do your best and don't copy from your team-mates. I'll give you a minute or two for this, then I'll ask you to compare your answer with the ones that your team-mates have written. Try between you to make the best answer you can. I'll give you a couple of minutes for that, then I will pick one person to come to the front and write on the whiteboard their team's best answer."

Put up the slide with the first question. I think it's best if this first question is really trivial, so no-one will have any problem answering it. That way you can at first concentrate on getting the process right. For example, an opening Python programming question might be:

Make variable a have value 5.

Remind the students to write their own answer and not to copy from their team-mates. Also tell the students not to bother writing the question down. Show them your stack of handouts with questions and answers. "Just concentrate on thinking of your answer and writing it down. I'm going to hand these out at the end." (I find that some students seem to use copying down the question as a displacement activity, so they can look busy when they can't think what to do. Some of the later questions are much longer and copying is a waste of their time. They'll still do it, though, if you don't explicitly tell them not to.)

With this simple first question, writing down the answer should take about 10 seconds, but for some of the later questions people will need a minute or two. It's worth reminding the students that you want them to practice answering the questions out of their own head, not by searching through notes or books. "This is practice at doing programming, and you practice doing by trying. Don't worry if you can't quite remember something. Try to work it out, and do what you can. But don't go looking things up."

There's a point when most of the students are clearly finished, but not all. Leave them a bit longer, then tell the students to now compare their answers with their team-mates and together to work out their best answer. The precise point to do this is important, but difficult to describe. It's a bit like in a recipe where it says "cook until it's done". You don't want to wait for absolutely everyone to finish, because the faster students will be bored and frustrated, and the slower ones just embarrassed. But equally, you don't want to cut students off who are in the middle of writing something, because this is one point where they are certainly learning, as they practice pulling information out of their own brains. (After a few questions, you'll probably find that the faster groups start to quietly confer when they have all finished, and I think this is ok, but it's one reason for not wanting all the faster students on one team.)

You then need to give the teams of students time to discuss their answers. For the simple first question this is perhaps only 10 seconds, but for the later questions it will be at least a couple of minutes, maybe longer. Again, there's a point where most teams seem to have reached a common verdict, maybe signalled by a lull in the chatter. Leave it just a moment longer, then select one student from the class to come to the front and write their team's answer on the whiteboard. (I would usually walk to them and hand them the whiteboard marker, while they are in their seat.)

They might want to bring their notes or just write from memory; either way is fine. While they are writing on the whiteboard, stand to one side, and when they have finished take back the whiteboard marker and say "Thank you". (I think that's important.) It's possible that as they are writing, one of their team-mates will call out a correction. This is fine. Don't correct them yourself, though. If they have a minor query, you can answer that, but if they try to recruit you into co-writing the answer, just say "Write down what you've got and we'll see if we can improve it."

The student returns to their seat, and you now ask the class "Can we improve this in any way?" (Again, I think the precise choice of words is important. You need to show that you are respecting the effort that's gone into this attempt, but you want the whole class, if they can, to make it better.) Exactly how to play this next depends on the question and the answer. Maybe it's perfect, but even so you want to give the class a chance to come to that conclusion for themselves. This part of the process becomes a sort of communal code-inspection.

Usually, there are some "typos". In Python there's often an opportunity for the "colon police" to call out the location of missing colons (a frequent mistake). If a student has a simple suggestion, you can write it on the board. If it's complex (maybe a whole different way to answer the question) you can get them to come up and write it themselves. Perhaps a student can see that the answer isn't going to work, but they are not sure how to mend it themselves. That's still a valuable contribution.

If there are no more suggestions, you might go on to ask "Does our answer do what the question asks?" If you see that there's a problem, this phrasing deflects the blame from the original student's attempt to "our answer". But you can still ask this even if the answer is right. (Make the students think!) When you see that something has been misunderstood, take the opportunity to explain how it actually works. (But if you find you are doing that a lot, probably the questions are too hard for this class.) Note that in all this we are using a computer only as a slide-projector, to show the question, which is in view the whole time. Everything else is hand-written.

When you've got an answer that works ok, leave it on the whiteboard and go on to the next question, with the same drill. (Since the questions build on one another, it's often the case that the answer gives a strong hint on how to do the next question. The problem for the students becomes how to change that answer.) Select students from around the class to write at the whiteboard in a way which appears fair and which is sufficiently random. (In order to maintain the feeling in students that "It could be me".)

See the end of this post for a collection of Python questions that have worked well for me.

Discussion

The students must already have been taken through worked examples of the material covered in the questions, and sufficiently recently that they have adequate recall. It's vital to choose questions at just the right level. Too easy is better than too hard, because there's a certain pace that has to be maintained, building from one question on to the next.

It's ok to introduce a slight twist on material that students already know, but it doesn't work if you have to break off and introduce a lot of new material, particularly if it's actually only new to some of the students. For me, write-me-a-program has worked best as a way to confirm and reinforce learning in combination with lab-sessions (where students work through examples and problems in pairs) and small programming assignments (where students feel there is some pay-off to knowing how to write code, so they have made some effort out of class).

Where this technique has worked less well for me, I think that it was because there was too much disparity between the fastest and slowest students, or because I over-estimated what the students as a whole were capable of doing.

Example set of questions

Here's a set of questions which I've used when teaching Python. I've found that going through this set takes about 3 hours in total. You might reasonably say that "If your students can do these questions, then you have no problems". Quite! But see how the questions build upon each other, and how they repeatedly call upon earlier material in the answers to later questions.
  1. Make variable a have value 5.
  2. Write an if statement that prints "too big" if variable a has a value more than 99, and otherwise does nothing.
  3. Increment variable i by 1.
  4. Use a while loop to print all the integers from 1 to 99.
  5. Print the odd numbers from 1 to 101.
  6. Define a function odds() which prints the odd numbers from 1 to 101.
  7. Define a function odds1(x) which prints the odd numbers from 1 to x.
  8. Define a function sum_odds(x) which prints the sum of the odd numbers from 1 to x.
  9. Define a function sum_odds1(x) which returns the sum of the odd numbers from 1 to x. (And does not print anything.)
  10. Make variable q have as its value the empty list.
  11. Append the integer 5 to the list q. Print q. (What exactly is printed?)
  12. Define a function odds_list(x) which returns a list of the odd numbers from 1 to x. (And does not print anything.)
  13. You are given a function prime which takes an integer and returns True if that integer is a prime number and False otherwise.
    Define a function prime_list(x) which returns a list of all the prime numbers from 2 to x. (And does not print anything.)
  14. Use the range function to write an expression whose value is the list [0, 1, 2, 3, 4, 5, 6, 7].
  15. Use the range function to write an expression whose value is the list [1, 2, 3, 4, 5, 6, 7, 8].
  16. Use the range function and a for loop to print the integers from 1 to 99.
  17. Use your earlier prime_list function and a for loop to print all the primes from 1 to 999.
  18. The built-in sum function takes a list and returns the sum of its elements.
    Use the built-in sum function to print the sum of the integers from 1 to 99.
  19. Print the sum of the primes from 1 to 999.
  20. Define a function vertical(s) which uses a for loop to print the characters of the string s, one per line.
  21. The built-in ord function takes a single-character string and returns the integer which is its Unicode code.
    Define a function codes(s) which uses a for loop to print the Unicode codes of the characters in the string s, one per line.
  22. Define a function code_list(s) which returns a list containing the Unicode codes for each character in the string s.
  23. Print the sum of the Unicode codes of the characters in the string "hello world".
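For reference, here is how answers to a few of the later questions might look. These are my sketches, written for Python 3 (where range is lazy, so questions 14 and 15 need list() wrapped around the range); the prime function shown is just one possible version of the function that question 13 assumes you are given.

```python
# Sketched answers to questions 12, 13, 14, 15 and 23 (Python 3).

def odds_list(x):
    """Q12: return a list of the odd numbers from 1 to x."""
    result = []
    i = 1
    while i <= x:
        result.append(i)
        i = i + 2
    return result

def prime(n):
    """One possible version of the function that Q13 assumes is given."""
    if n < 2:
        return False
    for d in range(2, n):
        if n % d == 0:
            return False
    return True

def prime_list(x):
    """Q13: return a list of all the prime numbers from 2 to x."""
    result = []
    for n in range(2, x + 1):
        if prime(n):
            result.append(n)
    return result

# Q14 and Q15: in Python 3, range is lazy, so wrap it in list().
print(list(range(8)))     # prints [0, 1, 2, 3, 4, 5, 6, 7]
print(list(range(1, 9)))  # prints [1, 2, 3, 4, 5, 6, 7, 8]

# Q23: the sum of the Unicode codes of the characters in "hello world".
total = 0
for c in "hello world":
    total = total + ord(c)
print(total)              # prints 1116
```

Notice that each answer deliberately re-uses only constructs from the earlier questions, which is the whole point of the drill.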

Monday, 18 March 2013

Recipe: Fish Chowder

This is a very chunky soup, a meal in itself.

Ingredients

500g White fish fillet, in 2cm cubes
8-10 Celery stalks, diced
8-10 Spring onions, chopped
400ml Brown chicken stock (see here for this component)
250ml Double cream
1t Salt
Ground black pepper (to taste)
pinch Grated nutmeg (optional)
(1t = one teaspoon = 5ml)

Method

Put the stock and the diced celery into a saucepan. Add the salt and pepper. (I like about 10 turns from our pepper-mill, which makes about 1/10 of a teaspoon of ground pepper.) Cover and simmer gently for 50 minutes.

Add the double cream and chopped spring onion to the pan, turn the heat to high and bring to a rolling boil. Now add the fish and cook for a further 2 minutes, stirring very gently. (Stop when the fish is cooked: you can tell because it changes colour to an opaque white.)

If you want to add the pinch of grated nutmeg, stir that in just before you serve.

Discussion

A teaspoon of salt may seem like a lot, but salt probably isn't as bad for you as you have been led to believe. The evidence is distinctly mixed. Eating more salt may or may not increase your blood pressure. We really don't know. But there is evidence that eating more salt makes you less likely to die. (Yes, that's right: less likely! See Open Season on Salt: What the Science on Hypertension Really Shows.)

Cod is really the fish of choice for this recipe. With other white fish the recipe comes out nearly, but not quite, as good. (Cod used to be so plentiful, and now it's sadly nearly wiped out. There's a lesson for us here about pretend science and real science: listen to what Dean Bavington has to say about the Newfoundland fishery.)

Sunday, 10 March 2013

Learning to Program

The time has come, I think, to talk about learning to write programs, and about how we can help people to learn. If you have ever tried to teach people how to program, you will know that it comes very easily to some people, and ... I want to say "slowly to others", but really I should tell the truth and say "not at all to others". Or at least that's often how it seems.

Do you remember when you first learned to program? If you are like me, it came very easily. I remember sitting at home one weekend reading in fascination the ICL BASIC Pocket Edition, then rearranging my A-level physics practicals so that I could be there at the booked times and use the school's dial-up terminal to run my code on some distant mainframe. I learned Fortran from McCracken's A Guide to Fortran IV Programming, and APL from an APL\1130 Primer that I begged from IBM. (Though it was a couple of years before I actually came in contact with a machine where I could use that arcane knowledge.) In my first year at university, before I officially started studying computing, I learned BCPL, Z80 assembler and Algol68.

Now for you, the precise list of technologies is probably different, but I bet the outline is the same. It came very easily. You mostly taught yourself. It was fascinating, addictive even. Like building a machine out of Meccano parts, all perfect. Like giving instructions to a magic genie, or writing the rules for a game that played itself. Surely anyone could do this and have just as much fun? But consider: could we be making the "false consensus" error? Could we be wrong when we imagine that what we find easy is typical?

The assumption that it's going to be easy underpins a lot of current efforts. For example, if you haven't seen it already, it's worth watching the promotional video for the code.org website. This is a very dense video — in just over 5 minutes it makes just about every case for learning to program. It opens with a lofty quote from Steve Jobs: "Everybody in this country should learn how to program a computer ... because it teaches you how to think." (Which, if we take it seriously, gets into dangerously political territory. See Not a Tool but a Philosophy of Knowledge.) The video then wheels out a sequence of substantial computer geeks, including Bill Gates and Mark Zuckerberg. They enthuse about how great it is to be able to program; how wonderful and empowering it was when they first learned. The video wraps up with a starkly utilitarian message, aimed at those who are still not convinced: people who can program get better jobs. They get ping-pong tables and free food. At work! The lesson is clear. For all these reasons, everyone should learn to program. But "should" depends on "can". Can everybody learn how to program a computer?

The evidence is not very encouraging. For example, in their 2006 paper The camel has two humps Saeed Dehnadi and Richard Bornat noted that:

"Despite the enormous changes which have taken place since electronic computing was invented in the 1950s, some things remain stubbornly the same. In particular, most people can't learn to program: between 30% and 60% of every university computer science department's intake fail the first programming course. Experienced teachers are weary but never oblivious of this fact; bright-eyed beginners who believe that the old ones must have been doing it wrong learn the truth from bitter experience; and so it has been for almost two generations, ever since the subject began in the 1960s."

This paper caused considerable controversy at the time. Not, however, because of its thesis that most people found programming too hard. Few people disagreed with that assertion, because everyone who has tried to teach programming has been confronted, in their direct experience, with the same evidence. The controversy concerned whether a test which the authors proposed could really do what they claimed, and sort the programming sheep from the non-programming goats. (Such "programming aptitude tests" have a poor track-record and their results usually correlate very weakly with subsequent direct measures of programming ability. The results from Dehnadi and Bornat's test were mixed, and perhaps all we can say for sure at the moment is that if the test subject has had any previous exposure to an attempt to teach programming, this appears to make their test ineffective. Correction: it doesn't make the test ineffective, but you need to take previous experience into account. See comment below from Richard Bornat, and also see Meta-analysis of the effect of consistency on success in early learning of programming.)

However, there's a deeper issue here, which has so-far hardly been noticed. Even if we had a pre-course test for programming ability, where would that leave us? Sure, it would permit university computing departments to filter their prospective students and to reduce a lot of unnecessary suffering by turning away those who would find programming too hard. But if we really believe everyone should learn to program, if that's more than just a slogan, then this approach doesn't help at all, does it?

The challenge here is to work out techniques for teaching programming to those people who do not find that it comes easily. A test that showed ahead of time who was a "natural" and who wasn't would be helpful, but not to filter out and discard those who have difficulty. It would be more akin to a diagnosis of dyslexia. We don't say to a child "You have dyslexia. You will never be able to read." Instead, when we find that a child has difficulties with reading we put extra effort into helping them, and to a large extent we now know how to be successful. With dedication, it's possible to get literacy rates in excess of 98%. (Although governments seldom consider it worth trying that hard, it is possible.) Personally, I believe that if we wanted, the same could be true of programming.

But how? Surely, as Richard Bornat said to me last autumn, "We have tried everything." What can we do that's different? Now, I'm no "bright-eyed beginner" — I've been teaching programming classes for several years — but I think there are things we could try, and don't, because of who we are. Mostly, the people who teach programming are first of all expert programmers, not expert programming teachers, and they mostly aim their teaching at students who are fundamentally the same as themselves. The teachers are almost always the people for whom it came easily. The people who have difficulty — most people — need a very different approach, and not just an approach which is louder and slower. It is, in fact, our experience that is not typical.

Now, I'm sure you would like to see some concrete examples, and I'd like to give them, but I think this post is long enough already. In a future post I will certainly take the opportunity to talk about what's worked and what's failed for me, and to put some teaching materials online. I certainly don't have all the answers but maybe I can help us get closer.

Sunday, 3 March 2013

Recipe: Yet Another Chilli

Over the years, I've gone through several chilli recipes. This is the latest, and I think the best.

Ingredients

400g Tinned chopped tomatoes
2T Tomato purée
4T Extra virgin olive oil
450g Beef mince
2 Chipotle dried chillies, de-seeded and finely chopped
3 Star anise
2 Onions, chopped
2 Red peppers, de-seeded and chopped into 2cm pieces
100ml Brown chicken stock (see here for this component)
75ml Red wine
200ml Water
400g Tinned kidney beans, rinsed
0.5t Salt
pinch Ground black pepper
0.5t Chilli powder (to taste)
(1T = one tablespoon = 15ml; 1t = one teaspoon = 5ml)

Method

Put the tinned chopped tomatoes and the tomato purée in a saucepan and reduce on a gentle heat.

While that's happening, heat the extra virgin olive oil in a heavy frying pan. When it is hot, add half the mince and brown it. (This may take 3 to 5 minutes depending on how wet it is. It has to fry, not just stew in its juices.) Add that browned mince to the tomatoes in the saucepan, leaving as much oil as you can behind in your frying pan. Brown the other half of the mince in the same way and add that to the saucepan too, but again leave the oil behind.

Turn the heat on your frying pan to low and add the star anise and the finely chopped chipotle chillies to the oil. Cook gently for 2 minutes, then turn the heat up to high and add the chopped onions. Cook for 3 to 5 minutes. (The onion wants to be just brown around the edges.)

Turn off the heat on your frying pan and go look for the star anise. Remember, there were three of them? Go find them and throw them away. Then tip all the contents of the frying pan (including the oil this time) into the sauce pan and mix with the tomatoes and beef.

Also add the red wine, brown chicken stock and chopped red pepper to the saucepan.

Now put the frying pan back on a high heat and pour in the water to deglaze the pan. Bring this to an enthusiastic boil and scrape all the brown bits in the pan so that they dissolve. Pour this solution into the saucepan with everything else.

Add the final ingredients to the saucepan: the rinsed tinned kidney beans, salt and pepper. I usually add half a teaspoon of chilli powder at this point too, because although the chipotle chillies have a lovely smoky taste, they aren't particularly hot.

Stir well and simmer gently for 1 hour with the lid on. Remove the lid, stir and simmer without the lid for another 30 minutes.

Discussion

You can use white wine rather than red; this works well too. (You could also deglaze the pan with the wine rather than water, and just add the water straight to the saucepan. I tend not to do that because I often use wine from the freezer, prepared in appropriately sized portions.)

You might wonder: why de-seed the chipotle chillies (reducing hotness), then add chilli powder later (increasing hotness)? Why not just leave the seeds in? Indeed. That might work out fine, I just haven't done the experiment.

Obviously for a more homespun feel you could use fresh tomatoes, but peeling and chopping those is too much trouble for me.

However, it is worth taking the trouble to use a proper stock. It's really not at all the same if you use those hopeless stock-cubes. So, at the risk of making this chilli look more frightening than really it is, here also is my recipe for a suitable stock. The stock gets made ahead of time and used in many other dishes. So in practice, when making the chilli, you just reach into the freezer and toss a few cubes of stock into the saucepan.

Component: Brown Chicken Stock

This is a relatively rustic stock, with little effort devoted to making it clear. It takes a long time, but it's not much work. Using a slow-cooker helps a lot, because it maintains a very gentle simmer by itself and you don't end up with the smell of chicken stock penetrating to every corner of the house. (Which is what seems to happen when I make stock in a stock-pot on the top of the stove.)

Ingredients

1.2kg Chicken carcases, bones & scraps
2 Onions, quartered
1.6 litre Water

Method

Preheat oven to 200°C (= gas mark 6 = 400°F).

Spread the chicken parts out in a roasting tray with the quartered onions. Place in the oven for 30 minutes. Turn the pieces over and place in the oven for a further 30 minutes. They should be a nice golden brown.

Tip the contents of the roasting tray into a pre-heated slow cooker along with 1.3 litre of boiling water.

Deglaze the roasting tray by pouring on 300ml boiling water, heating on a hob and scraping to dissolve the brown stuff. Pour this into the slow cooker. Put the lid on and set so that it will maintain a very gentle simmer. Leave to simmer for around 4 to 6 hours (it's not critical).

Strain the contents through a fine sieve into a bowl or pan. You can strain this again through muslin if you are keen, or if you are lazy like me, you can just leave it to settle for a while and simply decant it into another container, leaving most of the fine sediment behind. (If you want to skim the fat off the top you can do it at this stage or you can put the stock in the fridge overnight and just peel it off the top tomorrow.)

You end up with about 1.2 litres of stock which you can freeze into useful sizes: I usually make some ice cubes and fill some small containers.

Sunday, 24 February 2013

Meta-II: An Early Meta-Compiler

What is a meta-compiler? Before answering that question we need to take two steps back. First, what's a compiler? A compiler reads a program written in some source language and transforms it into a (more-or-less) equivalent version expressed in a target language. For example, a C compiler takes the usual textual representation of a C program and generates object code in some target language such as assembler code or virtual machine instructions. The resulting object code is equivalent (in some sense) to the source code, but it can be run directly on a physical machine or in an interpreter.

A compiler-compiler reads a "program" which specifies the grammar of a source language and some transformation rules. The object code it generates is a compiler for that source language. (So a compiler-compiler is also a compiler; it's just that its source code is your grammar-plus-transformation-rules and its output is object code which directly executes those rules and transformations.) The idea is that writing a grammar and some transformation rules is a lot less work than writing a whole compiler. Instead you write a more declarative description of what you want to happen and the compiler-compiler generates the code that deals with the fiddly details of pattern-matching and code-generation.

So now we can answer the question: "What's a meta-compiler?" A meta-compiler is a compiler-compiler written in its own source language. So when the meta-compiler reads its own description, it generates exactly its own object code, a fixed-point if you want to think of it that way. It's a neat trick, and clearly an easier way to build a compiler-compiler — provided you already have a compiler-compiler to start with.
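To make the idea concrete, here is a toy sketch of my own (nothing like real Meta-II, whose rule language is much richer): the grammar is plain data, and a small generic engine interprets it to parse input and emit lines of "assembler", which is essentially the job of the code a compiler-compiler generates for you.

```python
import re

# A toy rule-driven translator. Each rule is a list of alternatives;
# each alternative is a list of items: ('tok', regex) matches a token,
# ('rule', name) invokes another rule, ('out', text) emits an output
# instruction. 'PUSH' also emits the last token matched.
GRAMMAR = {
    "expr": [[("rule", "num"), ("tok", r"\+"), ("rule", "num"), ("out", "ADD")],
             [("rule", "num")]],
    "num":  [[("tok", r"[0-9]+"), ("out", "PUSH")]],
}

def parse(rule, text, pos, out):
    """Try each alternative of rule at pos; on success, append the
    emitted instructions to out and return the new position, else None."""
    for alt in GRAMMAR[rule]:
        p, emitted, ok, last = pos, [], True, ""
        for kind, arg in alt:
            if kind == "tok":
                m = re.match(r"\s*(" + arg + ")", text[p:])
                if not m:
                    ok = False
                    break
                last, p = m.group(1), p + m.end()
            elif kind == "rule":
                r = parse(arg, text, p, emitted)
                if r is None:
                    ok = False
                    break
                p = r
            else:  # 'out'
                emitted.append(arg + (" " + last if arg == "PUSH" else ""))
        if ok:
            out.extend(emitted)
            return p
    return None

code = []
parse("expr", "2 + 3", 0, code)
print(code)  # prints ['PUSH 2', 'PUSH 3', 'ADD']
```

Real Meta-II adds labels, error handling and a proper rule syntax, but the shape is the same: declarative rules in, token-by-token translation out.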

I first came across Meta-II when I read Experimenting with Programming Languages by Alessandro Warth. In this PhD thesis Warth explains some interesting implementation details of the meta-compiler OMeta. (This meta-compiler has been used to implement many different languages at the Viewpoints Research Institute and elsewhere. These languages and the systems written with them are key evidence to support Alan Kay's assertion that we can build systems with orders of magnitude less source code than is customary today.)

Now, OMeta started out as an experiment based on the much earlier Meta-II, so when I wanted to play with a meta-compiler, this earlier, simpler program seemed like a good place to start out. (I figured that by playing with this program I would come to understand what was wrong with it, what needed changing to make the more advanced OMeta. That's certainly been the case, but I'm not entirely sure it was worth the effort.)

Meta-II is described in Meta II: A Syntax-Oriented Compiler Writing Language by Val Schorre (1964), and although this paper is a nice read, I found that the best place to start playing is James Neighbors' Tutorial: Metacompilers Part 1 and the accompanying Metacompiler Workshop. This "workshop" is a web-page with an extended version of Meta-II implemented in Javascript, so you can put your grammar description in one window, and get the code for a compiler in another window, and then use that compiler in turn to compile a program. If you want to do what I did, you can work your way through Neighbors' tutorial and then port a version of Meta-II off the web-page and into a self-contained program. (You can find the results of my slightly-too-clever-for-its-own-good port to Python here on github.)

I had decided that I wanted a version very close to Schorre's original Meta-II, rather than the extended version constructed by Neighbors. This choice in turn forced me to implement a virtual machine in my Python runtime, rather than generating native Python control constructs. That's because the original Meta-II's output operations are tuned to generating assembler code line-by-line, with only two levels of indentation. Obviously that's not going to fit well with Python, where indentation indicates block structure. So instead my code reads the generated "assembler code" and effectively assembles, loads and links it into a form where it can be interpreted directly. Quite a lot of extra work to stay with the original version of Meta-II, and in the end not worth the effort, other than for the learning experience.

So, what did I learn? And what would I change about Meta-II so that it was more useful but still simple, without going all the way to the complexity of OMeta? (Which adds a full programming language to do the output transformations.)
  • It would be better to have a somewhat more expressive language in which to write the transformation rules, and that way avoid having to put so much work into a virtual machine to interpret the meta-compiler's output code. (Since Neighbors and Warth both came to the same conclusion, this does seem blindingly obvious in retrospect.) Probably the first step would be to follow Neighbors' example and introduce a mechanism for more general indentation of output, so it would then be possible to output nicely formatted representations of a program — for example S-expressions.

  • Both the original Meta-II and Neighbors' extension only have access, in their output operations, to the last token matched. OMeta is a lot more flexible here, since you can give names to the values of any matching rules and then use the values later in a different order. This facility would also remove the need for a special-purpose token-recognition sub-language which Neighbors introduces in his implementation.

  • OMeta uses backtracking when a rule doesn't match, but Meta-II has to make its decisions based only on the next token. In practice, you can still parse lots of languages, but it's an annoying restriction, best removed. Fortunately this is relatively straightforward. Neighbors explains how to do it, though he doesn't give an implementation. OMeta's implementation also improves backtracking performance using a caching technique: it's a "packrat parser". (See for example Packrat Parsing: a Practical Linear-Time Algorithm with Backtracking by Bryan Ford.)

  • Lastly, Meta-II cannot handle left-recursive rule definitions, because they would not terminate. If you have a grammar that is naturally left-recursive, you therefore have to rewrite it without the left-recursion, which you can always do, but this isn't really very satisfactory. How confident can you be that the subsequent parse tree and your interpretation of it really is equivalent to the intent of the original grammar? It's a problem waiting to bite you later. So a further useful improvement would be to use Warth's neat technique that makes left-recursion work in a packrat parser.
You can only really understand things properly by making them yourself, so my next step will be to implement some of these ideas, and I'll report on how that went in some future post. In the meantime I would encourage you to look at the original Meta II paper and to have a play with Neighbors' Metacompiler Workshop. It's great fun!

Sunday, 17 February 2013

Recipe: Flemish Beef

This recipe is based on a dish cooked for me a long time ago by Mary Sheeran. It is supposed to be Irish — hence the Guinness — but that doesn't really explain why it is called "Flemish" Beef.

Ingredients

1 Large onion
4T Oil
225g Unsmoked bacon
450g Beef
1T Black treacle
3T Vinegar (preferably cider vinegar)
300ml Guinness
(1T = one tablespoon = 15ml)

Method

Cut the onion into slices about 5mm thick and fry in the oil for 5 minutes on a medium heat.

Slice the bacon rashers cross-wise into pieces about 1cm wide. Add to the pan and fry for another 3 minutes.

Slice the beef into pieces about the same size as the bacon. Fry for another 3 minutes or so. (The onion should be slightly browned and the bacon and beef should have a cooked-colour.)

Add the black treacle, vinegar and Guinness. Bring to a boil then turn the heat right down so that the sauce is just simmering. You want the sauce to slowly boil down and thicken. This should take about 30 minutes.

And also ...

You might also enjoy reading Functional and dynamic programming in the design of parallel prefix networks by Mary Sheeran (Journal of Functional Programming 21(1), 2011, pp. 59-114).

Sunday, 10 February 2013

The Clean Slate Movement

It would be nice, wouldn't it, if our programs were shorter. They would be quicker to write, easier to understand, easier to maintain. The way this is supposed to work is that we take code made by other people and build on it, so that the small part we write draws together and coordinates the code from libraries and other self-contained programs. But in practice this has not worked out quite as well as we hoped. As James Hague points out in We Who Value Simplicity Have Built Incomprehensible Machines, the code that we try to re-use has become encrusted with layer after layer of cruft. How did that happen?

"... all the little bits of complexity, all those cases where indecision caused one option that probably wasn't even needed in the first place to be replaced by two options, all those bad choices that were never remedied for fear of someone somewhere having to change a line of code...they slowly accreted until it all got out of control, and we got comfortable with systems that were impossible to understand."

Impossible? We all know what it's like. Maybe the power of search-engines and communal help over the Internet has mitigated the problem, but it's still a problem. We can get stuff done, but it gets more and more like black magic. (In another post, A Complete Understanding is No Longer Possible, Hague points out that for a reasonably thorough understanding of a MacBook Air, you would have to digest around 11,000 pages of documentation. It's not going to happen is it?)

So what can we do? The short answer to that question is: "Start again, with a clean slate". It's a daunting prospect, and one that until recently no-one would have taken seriously. But over the last few years something that we might call "The Clean Slate Movement" has started to gain credibility.

For some in this movement, like Peter Neumann or Robert Watson, the central issue is security. Our systems have become so complex that it's impossible to make credible guarantees about security. Only by building new systems, re-thought from the hardware up, can we really make any progress. (If we try to do the right thing on current hardware, our systems usually end up being far too slow. If you want to understand how processor designs will be different in 10 years time, you might take a look at The Cambridge CAP Computer. It's 30 years old, but it's actually from the future.)

Other people in the movement, notably Alan Kay and his colleagues at Viewpoints Research Institute have a slightly different aim. They envision building systems which are dramatically simpler because they are dramatically smaller — with maybe 100 times or even 1000 times less code for the same functionality. They talk about writing code for a whole system, from applications down to the "bare metal" in perhaps 20,000 lines. Now if that were possible, we could certainly imagine understanding that whole system. If we printed it out 20 lines to a page, surrounded by commentary like the Talmud, it would only be 1000 pages long. Something like the size of the Oxford Users' Guide to Mathematics. Something you would expect a professional to be intimately familiar with, though they might have to go look up the odd detail.

You might reasonably ask "Is all that really possible?" It does seem rather ambitious, even unlikely. But Alan Kay obviously has a formidable reputation and his associates have already had considerable success. A central part of their strategy is to use meta-compilers to define different, powerful languages in which to implement the different parts of their system. This does seem to lead to a substantial reduction in code size. (Meta-compilers are another old technology brought up-to-date, and one that I'll return to in a future post.)

Overall, can we really afford to throw away old systems and rebuild from scratch? The marginal cost of patching-up a system is always less than the cost of starting from scratch. But if we only pay attention to short-term costs, then each repair will only barely do what's needed today, and afterwards the system as a whole will be a little worse. Unless we spend extra effort today and keep refactoring a system so that it continues to be maintainable, over time it tends to degenerate into a ghastly mess: fragile and dangerous.

To see where this ends, you only have to look at the notoriously crufty and fragile systems employed by the financial services industry. Take for example the market maker Knight Capital, which on one day in August 2012 suffered a 440 million dollar loss due to a "technology issue". (Their software offered to trade shares at stupid prices, an offer which other people were only too happy to accept.) In the long run, it's worth paying for craftsmanship. In the long run, crufty software is never the cheapest option.

Sunday, 3 February 2013

Recipe: Red Braised Chinese Leaf

This is a variation on a Kenneth Lo recipe "Hung Shao Pai Ts'ai" that we have been cooking for decades. As he says, "Unlike a meat dish, which one tires of continual eating, this vegetable dish can be eaten day after day." It's really nice.

Ingredients

1 Chinese leaf (1 to 1.5 kg)
2T Soy sauce
1T Dry sherry / rice wine
3T Chicken stock
4 Chinese dried mushrooms, soaked (at least 1 hour)
5T Oil (preferably groundnut)
2 Dried chillies
2 Spring onions, coarsely chopped
(1T = one tablespoon = 15ml)

Method

(Note that you need to cook this in a wok, and it needs to have a tight fitting lid. It doesn't really work otherwise.)

Cut the Chinese leaf in half length-ways and then cut the halves into 1-cm slices. Discard the base. Wash, drain and dry.

Combine the soy sauce, sherry and stock in a bowl.

Drain the soaked Chinese dried mushrooms and slice finely, around 1-2mm thick. (The dried mushrooms are best soaked for several hours in cold water; if you're in a hurry you can get away with soaking them for less time in hot water, but 1 hour is about the minimum.)

Heat the oil in a wok over medium heat and cook the chillies and spring onions for between 1 and 2 minutes. (The longer, the more spicy the resulting dish. You know you have gone too far when the air is so full of chilli vapour that you stagger out into the garden like the victim of a First World War gas attack ...) Discard the chilli and spring onion. Keep the flavoured oil in the pan.

Add the Chinese leaf and stir fry over a high heat for 4 minutes. (The Chinese leaf has to wilt before the next step. It's ok if it scorches slightly in places, but as with all stir-frying, you have to keep it moving all the time.)

Turn the heat down to medium and add the mushrooms and the soy sauce mixture. Mix well.

Turn the heat down to low and put a tight fitting lid on the wok. Leave to cook for 10 minutes.

(The original recipe thickens the sauce with cornflour at the end, but I think it's better without.)

And also ...

Those of you who remember Giles Murchiston will probably be as surprised as I was when I recently came across a submission by him, buried in the massive collection of written evidence gathered last year by the Parliamentary Joint Committee on the Draft Communications Data Bill. See pages 418-422 (or 420-424 of the PDF).

Sunday, 27 January 2013

Rapid Feedback

In his talk Inventing on Principle, Bret Victor sets the scene by explaining his own guiding principle, that "creators need an immediate connection with what they create." The interactive demonstrations in his talk are very impressive. You look at them and think "Why isn't my interaction with the computer like that? Why am I still struggling, still working by remote control?" The difference between his demos and the way we usually work is as dramatic as the difference between batch-processing and time-sharing.

"When you make a decision," says Victor, "you need to see the effect of that immediately — there can't be any delay, can't be anything hidden." The demos are lovely, and it's certainly nice to get rapid feedback, but "need"? Do we really need it? We can find a pretty clear answer to that question if we turn around and consult the paper Conditions for Intuitive Expertise: A Failure to Disagree by psychologists Daniel Kahneman and Gary Klein (American Psychologist, Vol 64(6), Sep 2009, pp515-526).

For many years, Kahneman and Klein have been the central figures in two opposed camps of psychologists studying human expertise. Kahneman, in the "heuristics and biases" camp, enjoyed demonstrating that human intuition is often flawed and that experts can often be overconfident about their decisions. Klein, in the "naturalistic decision making" camp, enjoyed celebrating the uncanny ability of experts to make good decisions in complex situations and under time pressure.

For example, Klein found from many interviews with firefighters that rather than thinking of a number of options (or even just two) and deciding between them, they usually only thought of one option — the right one — and then quickly checked that it was appropriate for the current situation. Their intuition really works. On the other hand, "experts" in other fields don't seem so capable. Unfortunately, their intuitions are still compelling and feel to them just as correct as the firefighters' — but this feeling is deceptive. For example, in Moneyball, Michael Lewis describes how baseball scouts fail time and again to choose the most effective players because they fall for the "representativeness" heuristic: they pick good lookers, rather than good performers.

Was it possible to reconcile these two conflicting points of view? Kahneman and Klein decided to try to settle their differences by writing a paper together. Both camps were clearly right some of the time — experts are of course sometimes accurate and sometimes wide of the mark. But was there a pattern? "What are the activities," they asked, "in which skilled intuitive judgement develops with experience? And what are the activities in which experience is more likely to produce overconfidence than genuine skill?"

In the end, the two psychologists demonstrated "a failure to disagree". Reconciling the evidence from each camp, they describe in their paper the precise conditions for skilled intuition to develop. Firstly, there must be a "high-validity" environment, or in other words there must be cues in the observable environment which indicate the true nature of the current situation. For example, firefighters and paediatric nurses inhabit high-validity environments, because they can perceive early indications that a building will collapse or that a baby will suffer from an infection. They will often be unaware of the precise cues which guide their intuitive decisions, but the cues are there all the same.

Secondly, people must have an opportunity to learn the cues and to get feedback when they are right and when they are wrong. But humans are pretty good at this, provided they are in a high-validity environment. As Kahneman and Klein note: "Where simple and valid cues exist, humans will find them if they are given sufficient experience and enough rapid feedback to do so". But the key for us here is the feedback: to build skilled intuition this must be "both rapid and unequivocal".

So, let's now come back to Bret Victor and his principle of rapid feedback. Do we need it? Computer programming certainly forms a high-validity environment. We should be able to become intuitive experts, but as you can now see, that depends critically on whether the feedback we get is "both rapid and unequivocal". In most programming environments nowadays the feedback is neither.

(I leave it as an exercise for the reader to work out whether a kitchen is a high-validity environment. The point Victor was making about rapid feedback seems to have been missed by so many people that he felt it necessary to construct a web-page explaining it all over again. That account is certainly worth reading too, and you can find it here: Learnable Programming.)

Sunday, 20 January 2013

A Rather Strange Cookbook

Almost a review of The 4-Hour Chef by Tim Ferriss

The ideas I want to explore in this blog started to form while I was reading this book, so perhaps I should say a little bit about it. Tim Ferriss — how can I put it delicately? — is something of a self-publicist. The book has many five-star reviews on Amazon. One of the few one-star reviews asks: "Why does he feel the need to fake the ratings for his book? Over 50 five star reviews pop up the same day the book is published, almost at the same time, by reviewers who didn't review any other book." It certainly looks as though the techniques described in Trust Me, I'm Lying by Ryan Holiday have been used to promote this particular cookbook. So is there anything there beyond the hype? Well, actually, yes there is.

Ferriss is a hacker. Not a computer hacker, of course, but a hacker nevertheless. He's obsessed with how to become an expert in something with much less than the normal effort. The conventional wisdom is that to become an expert in something takes around 10,000 hours of effortful practice: working at something to the limit of your current ability, not just idling at a level you find easy. (See, for example, Outliers by Malcolm Gladwell.) Ferriss claims that it's possible to hack expertise, to go from zero to the top 5% in much less time than this — maybe only 10% or 20% of the time. How?

The first 70 pages of the book are devoted to "meta-learning": principles and examples of how to make a programme for learning anything. (The examples include things like how to shoot basketball hoops.) The next section of the book can be read as a 120 page basic cookery course, but it can also be deconstructed and used as a detailed example of how to apply the meta-learning principles in practice.

Ferriss's meta-learning principles are for the most part not particularly novel — if you are familiar with educational theory you will recognise many of them as descriptions of best practice. His description of how to learn a foreign language will be very familiar to anyone who has used the Michel Thomas language courses. But the idea that top performers might not be good examples of that best practice is nowadays a bit heretical. Ferriss suggests that rather than look to superstars for tips on how to practise, we would be better off finding the outliers who have achieved some success despite not being well endowed by nature.

Something else that Ferriss emphasises, over and above the meta-learning principles, is seeking help from expert tutors. (And this is a place where, because of his celebrity, he might be better placed to find help than you and me.) This is an interesting angle, because most educational theory looks at the problem the other way around, from the perspective of an expert wanting to teach novices. As a novice wanting to learn something, it might appear that the principles are all you need, but that's not the case: to build your learning programme most effectively you also need some hands-on expert advice. (This is because you need to learn the expert's tacit knowledge, which the expert might not even know how to teach. Ferriss gives some ideas on how you might approach this.)

It's intuitively obvious that this would be a good idea, but perhaps not so obvious just how powerful it is. We know from educational research that personal tutors are unreasonably effective — this is the so-called "two-sigma problem". Nearly 30 years ago, Benjamin Bloom compared group teaching with one-to-one tutoring, and found that an average student with one-to-one tutoring performed at the same level as a top 2% student with group teaching. That's really quite astonishing, isn't it? See The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring in Educational Researcher, 13(6), pp4-16 (1984).
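Bloom's "two sigma" label is easy to misread, so it's worth checking the arithmetic: two standard deviations above the mean on a normal curve puts a student at roughly the 98th percentile. A few lines of Python (using only the standard library's error function, which gives the normal CDF directly) confirm this:

```python
from math import erf, sqrt

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# A student performing two standard deviations above the mean sits
# at about the 97.7th percentile -- i.e. in roughly the top 2%.
print(f"{normal_cdf(2.0):.4f}")
```

In other words, Bloom's claim is that one-to-one tutoring moves the median student to where only the top couple of percent were under group teaching.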

But how is all this relevant to you and me? Well, Ferriss's ideas on cooking are quite interesting in themselves, and I'll come back to them another time. But as programmers we have our own expertise problem. As part of my "day job" I teach small classes of students how to program in Python, as part of an electrical engineering degree. Well, I try to. I am no longer surprised at how difficult this is. A few people just get it straight away, like I did when I originally learned to program. Most struggle. I've found a few things that seem to help (and I'll come back to these another time too), but surely we can do much better. If there really is a way to gain expertise 10 times faster, we could certainly use it.

On the other hand, experienced programmers could always learn to be better too. I'd certainly like to learn more and be a better programmer. It's a generally accepted idea that programming teams often have some programmers who are 10-times more productive than others, but there's been relatively little interest in systematically finding what the difference is and learning to be that 10-times programmer yourself.

If you're an expert and you want to improve, you will certainly have to make your own learning programme, and Ferriss's ideas are a fairly accessible place to start. So: can we apply Ferriss's principles? What 20% of expert programming knowledge will give 80% of the results? What things are common amongst best performers, but infrequently taught? What do you know that I need to know too? I think it's not just novices but also experts who could do a lot better.

So overall, yes, despite the hype, I do think it's worth reading The 4-Hour Chef, if only to help imagine what might be possible.

Sunday, 13 January 2013

Start of Day

Over the Christmas holidays I was reading a rather strange cookbook by Tim Ferriss when some vague ideas about programming and cooking started to crystallise in my mind. Too vague, really, to turn into a paper or an essay. Perhaps a blog might be a better medium to explore these ideas? Let's see.

One of the vague ideas is that the way we do programming today, the way we build systems, is wrong. Surely we can do a lot better? Think for example of Bret Victor's talk Inventing on Principle or Alan Kay's talk Programming and Scaling. (If you haven't already watched these, you need to stop reading and watch them right now. Go on. Do it!) Is it possible that we could really develop all our programs like this? Is it possible that our programs could be 100 or even 1000 times shorter? Even if the answer to these questions is only "maybe", then surely we should still put a lot of effort into trying, rather than condemn future programmers to suffer as we have suffered?

So part of what I'd like to do here is to review these and other related ideas; to look at "the road not taken"; to understand the ideas better by trying to apply them in practice.

Another vague idea is that programming and cooking have more in common than programming and constructing buildings. Developers sometimes like to call themselves "software engineers" or even "software architects", but the analogy is a bit thin. As Glenn Vanderburg points out, "software engineering" is quite unusual: the term "engineering" is generally reserved for those techniques which actually work in practice. Not so with software. And, despite the popularity of "patterns" in the software world, anyone who has actually read Christopher Alexander should understand that trying to copy the current practice of architecture is not such a clever idea.

Think instead of the organisation of a commercial kitchen, run by a "chef" — literally the "chief" cook — and not by a "culinary architect". Think of the respect for technique and the role of apprenticeship in learning those craft skills. Think of a more domestic scene, maybe on a Sunday morning. You break off from hacking some code and you go to the kitchen to put together a chilli for lunch. Are these really such different activities? Can we learn something from the comparison? Whatever their medium, hackers like to understand how things really work, and relish the opportunity to practise their skills. (Perhaps the main difference between food and coding is that it takes a hacker to appreciate someone else's code, but anyone can appreciate food.)

So another part of what I'd like to do here is to see what we can learn from the comparison. And Nathan Myhrvold is not the only hacker to find food interesting in its own right. So I'll also take the opportunity to look at tools, techniques, ingredients and of course recipes.

Let's see how this goes.