Saturday, February 9, 2008

Getting started

The mantra of this blog is, "computational systems are representational systems".


The aim is to provide a view of what computer science is all about. We computer scientists haven't been very good at explaining the intellectual basis of our field, and this may be one of the reasons we are struggling to interest students, and potential collaborators, in our field, even while the importance of computing in all aspects of life is growing by leaps and bounds.

I've been developing the representational view of computing for a while now, giving talks and having encouraging conversations with folks at conferences. The blog is a way to spread the ideas around and get feedback, pending the Big Paper, which of course will be available "real soon now." If you want to read more while the blog takes shape, you can find some notes, and a link to a talk kindly recorded by the folks at the University of Toronto Faculty of Information Studies, on my website, http://www.cs.colorado.edu/~clayton (what's there is an earlier version of the ideas).

Why do we need to work on what computer science is about? Jeannette Wing's influential paper on "computational thinking", and follow-on activities (see http://www.cs.cmu.edu/~CompThink/index.html), set out the challenges. After a stimulating conference session on enrollment problems in CS at Microsoft Research a couple of years ago, one of the participants stood up at the closing to express enthusiasm, saying something like, "It's been great talking about how we can communicate better... we all need to get the word out about the key ideas in the field...", and then caught himself in midstream, "... but I'm not sure we know what they are!" In one of my favorite books, "Python Programming: An Introduction to Computer Science", John Zelle says, "[T]he fundamental question of computer science is 'What can be computed?'" I don't agree! This question gets at little or nothing of why computer science is so important to all kinds of people doing all kinds of things. The importance of computing is what we need to explain.

Computational systems are representational systems.

Computational systems are important because they are used to represent all kinds of things people care about, and representations are really useful. Further, computational representations have a number of properties that make them especially useful.

So, what is a representational system? There's some theory of representation that we can draw on, best developed for measurement systems, which are a special case. For the theory of measurement, see the three-volume Foundations of Measurement, by Krantz et al., recently reissued in paperback; for the broader theory of representations, see papers by Mackinlay and Genesereth, including Mackinlay, J. D. (1986) Automating the Design of Graphical Presentations of Relational Information. ACM Transactions on Graphics, 5(2, April), 110-141.

A representational system consists of a target domain, in which there is something we want to accomplish, and a representation domain, with mappings connecting them. The point of representation is that we map work in the target domain into work in the representation domain, where it can be done more easily, or faster, or better in some other way. Then the results are mapped back to the target domain, where we need them.

The best-understood kind of representational system is the measurement system. You’ve been using measurement systems all your life, but if you are like most people you’ve never thought about how, let alone why, they work. Let’s look at an example.

Suppose we have some logs, and a chasm we want to bridge. We need to know whether one of the logs is long enough to bridge the chasm. We could answer this question by picking up the log, moving it over the chasm, and seeing whether it reaches the other side. This involves actual work: picking up the log, and moving it. The work could be hard, if the log is heavy, and it could be dangerous, if the chasm is deep. We might need to rig up some kind of derrick to swing the log over the chasm. It would be bad to go to all this trouble and then find that the log is too short.

We all know that there’s a way to avoid this possibility. We measure the log, and we measure the chasm, and we compare the measurements. If the number we get when we measure the log is bigger than the number we get when we measure the chasm, then the log will reach.

In this example, the target domain contains the logs and the chasm. The representation domain is numbers: when we measure the length of something we get a number.

Our question in the target domain is “Will this log reach across the chasm?” We map it onto a question in the representation domain: “Is this number bigger than that one?” Finally, we map the answer we get in the representation domain, yes or no, onto an answer in the target domain, also yes or no (that mapping is easy). Here’s a diagram showing all this:


The gain from doing all this (which we would normally do almost without thinking) is big. We don’t have to move the log!

For this or any other measurement system to work, the answer we get when we map our question into the representation domain, and then map the answer back to the target domain, has to be the same as the answer we would have gotten if we had done the work to answer the original question in the target domain. After all, how would we feel if we measured the log and the chasm, decided that the log was longer than the chasm, did all the work to swing the log across, and found out that it didn’t reach? The measurement systems we all rely on always do work, and if they appear not to, we assume we made a mistake of some kind, like not pulling the tape measure tight when we used it to measure something.
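For readers who like to see things in code, here is a minimal Python sketch of the log-and-chasm system. It is an illustration under assumptions, not a definitive model: the names (PhysicalObject, measure, reaches_by_trial, reaches_by_measurement) and the lengths are made up for the example, the tape measure is assumed to be ideal, and true_length_m is only a stand-in for physical reality, which a program of course can't reach directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalObject:
    """A thing in the target domain (hypothetical stand-in for a real log or chasm)."""
    name: str
    true_length_m: float  # assumption: stands in for physical reality


def reaches_by_trial(log: PhysicalObject, chasm: PhysicalObject) -> bool:
    """Answer the question in the target domain: swing the log across and look."""
    return log.true_length_m > chasm.true_length_m


def measure(obj: PhysicalObject) -> float:
    """Map a target-domain object to a number (assumes an ideal tape measure)."""
    return obj.true_length_m


def reaches_by_measurement(log: PhysicalObject, chasm: PhysicalObject) -> bool:
    """Answer the mapped question in the representation domain: is this number bigger?"""
    return measure(log) > measure(chasm)


# Made-up example values, in meters.
chasm = PhysicalObject("chasm", 4.0)
logs = [PhysicalObject("short log", 3.5), PhysicalObject("long log", 5.2)]

for log in logs:
    # The system works only if both routes give the same answer.
    assert reaches_by_measurement(log, chasm) == reaches_by_trial(log, chasm)
    verdict = "reaches" if reaches_by_measurement(log, chasm) else "does not reach"
    print(log.name, verdict)
```

The assert inside the loop is the requirement described above: going through the representation domain and back has to give the same answer as doing the work in the target domain.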

Measurement is interesting, and exceedingly important, but we are after even bigger game. Measurement systems are representational systems in which the representation domain is a fairly simple mathematical structure, like numbers. But in many representational systems the representation domain isn’t a simple mathematical structure, or a mathematical structure of any kind. For example, consider a simple bar chart, used to represent the fuel economy of cars. Here’s a diagram of this representational system:

Here the representation domain isn’t a mathematical structure; it’s marks on a piece of paper, or a pattern of colored dots on a computer screen.
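To make the bar chart case concrete, here is a small Python sketch that uses rows of "#" characters as the marks. Everything specific in it is an assumption for illustration: the car names and fuel economy figures are invented, and draw_bar_chart and longest_bar are hypothetical helper names. The question "which car has the best fuel economy?" gets answered by looking only at the marks.

```python
def draw_bar_chart(mpg_by_car: dict[str, float], mpg_per_mark: float = 5) -> dict[str, str]:
    """Map each car's fuel economy onto a bar: one '#' per `mpg_per_mark` miles per gallon."""
    return {car: "#" * round(mpg / mpg_per_mark) for car, mpg in mpg_by_car.items()}


def longest_bar(chart: dict[str, str]) -> str:
    """Answer the question in the representation domain: which bar is longest?"""
    return max(chart, key=lambda car: len(chart[car]))


# Invented figures, in miles per gallon (not data about real cars).
fuel_economy = {"Car A": 20, "Car B": 35, "Car C": 50}

chart = draw_bar_chart(fuel_economy)
for car, bar in chart.items():
    print(f"{car} | {bar}")

# Map the answer back to the target domain: the car with the longest bar
# should be the car with the best fuel economy.
assert longest_bar(chart) == max(fuel_economy, key=fuel_economy.get)
```

Note that longest_bar never sees the numbers. It works on the marks alone, just as a reader of the chart does.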

In the representational systems we'll mainly be concerned with, the representation domain consists neither of simple mathematical structures nor of marks on paper (though patterns of dots on a computer screen come in around the edges). Rather, our representational domains will contain computational stuff, whose nature will become clearer as we discuss it. Computational stuff has wonderful properties, so wonderful that computational representations are in use in virtually every sphere of human endeavor. As stressed earlier, the whole point of having a representation is to be able to do something more easily, or more cheaply, or faster, or better in some way, and computational representations often have huge advantages over others.

Like the advantages of measurement, the advantages of computational representations often go unnoticed, even when we use them. Here are a few examples.

• If I create a computational representation of something, someone can access that representation, and use it to do work, on the other side of the world, incredibly quickly, and at incredibly little cost.

• Computational representations can easily be accessed in other times, as well as in other places. That is, they are easy and cheap to store and retrieve. The contents of all the books in all the libraries in the world can be stored in computers that fit in two standard shipping containers, each 20x8x8 feet; see http://www.archive.org/web/petabox.php.

• Many operations on computational representations can be automated, meaning that they can be carried out by a machine, rather than a person. This often offers huge advantages in speed, accuracy, and cost.
