Thursday, March 20, 2008

What difference does it make?

Does it make any difference to view computational systems as representational systems? Yes. Here is a comparison of two programs I wrote as instructor's solutions for the same homework exercise in an introductory course using Python. The problem is to read a list of sound samples from a WAV file, and record a new sound consisting of the original plus an echo. The echo is delayed by .25 sec and attenuated by a factor of .5. This is the same example discussed in the post, Computational Stuff, a Worked Example.

Program 1 is the solution discussed in the earlier post, with the portion of the solution that records the song with echo slightly modified for comparability with the solution presented as Program 2.

#Program 1
sampleRate=44100
originalSong=readwave("song.wav").getLeft()
softerSong=[.5*e for e in originalSong] #line 1: attenuate
intervalOfSilence=int(.25*sampleRate)*[0]
delayedSofterSong=intervalOfSilence+softerSong #line 2: delay
#line 3: mix the original with the delayed, softer copy
songWithEcho=[originalSong[i]+delayedSofterSong[i] for i in range(len(originalSong))]
outputWAV=makewave("modifiedsong.wav",sampleRate)
for sample in songWithEcho:
    outputWAV.record(sample,0)
outputWAV.close()

Program 2 is my solution to the same problem, written earlier without considering the representational perspective:

#Program 2
sampleRate = 44100
originalSong = readwave("song.wav").getLeft()
echospread = int( sampleRate * .25 )  #number of samples in the .25 sec delay
outputWAV = makewave("modifiedsong.wav", sampleRate)
#Before the echo begins, the output samples are just the original song.
for i in range(0, echospread):
    outputWAV.record(originalSong[i], 0)
#After that, mix in the attenuated sample from echospread samples back.
for i in range(echospread, len(originalSong)):
    outputWAV.record( originalSong[i] + originalSong[i - echospread] * .5, 0)
outputWAV.close()


Both programs produce correct results. Indeed, they produce the same results. But they work quite differently.

It is difficult to identify any computational operation in Program 2 with an operation in the target domain of sounds, or to identify any data structure in Program 2 with a sound, except for originalSong. The thrust of Program 2 is to record certain numbers in a file, not to construct a representation of a sound that is produced in a certain way.

In contrast, in Program 1, softerSong, delayedSofterSong, and songWithEcho all correspond to sounds, and they are created using operations on sequences of samples that correspond to operations on sounds: attenuation (in line 1), delaying (in line 2), and mixing the original song with the echo (in line 3).

Here are diagrams showing these relationships in the two programs.


The diagram for Program 1 shows the original song bouncing off a reflector (shown at right) in attenuated form, being delayed, and being mixed with the original song. Color coding and dashed lines show which things in the representation domain correspond to which things in the target domain. Solid lines show which operations in the representation domain correspond to mappings in the target domain.

Notice that the computational representation produces only an approximation to the actual echo effect. As shown, when the recorded sound is played back the echo is truncated, because when mixing is done in the program the resulting sound is only as long as the shorter of the two sounds being mixed.
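If the full echo tail were wanted, the mixing step could pad the original song with silence of the echo's length before adding. Here is a sketch using the names from Program 1 (paddedSong and songWithFullEcho are new names, introduced here for illustration):

#Pad the original with silence as long as the delay, so the padded
#song and the delayed, softer song have exactly equal lengths.
paddedSong=originalSong+intervalOfSilence
songWithFullEcho=[paddedSong[i]+delayedSofterSong[i] for i in range(len(paddedSong))]

Now nothing is truncated: the last .25 sec of the result is the tail of the echo, heard over silence.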
The similar diagram for Program 2 has many fewer correspondences to point out. The value of originalSong corresponds to the original song in the target domain, but there are no other simple correspondences. Because Program 2 builds up the recording sample by sample, the operations of attenuation, delay, and mixing are hidden in the arithmetic of the construction of the samples. The prominent decomposition of the program into two loops makes sense in view of the change in arithmetic needed to produce the two groups of samples, but this split doesn't reflect anything about how echoes are produced physically.

Notice also that the mapping "record samples" is shown as an arrow for Program 1 but appears only as a floating label in Program 2. In Program 1 there is a list of samples created that corresponds to the song with echo, and this list of samples is recorded, an operation easily understood as a mapping. In Program 2, there is no list of samples that is created and then recorded; rather, samples are created and recorded piecemeal. Recording is happening, but not in a way that can be understood separately from other operations in the program.

So what? For an experienced programmer, Program 2 isn't hard to write. One imagines the numbers one has to record, and figures out how to do the arithmetic that will produce these numbers, in the process of recording them. One's understanding of the physics of sound plays a role only in determining what the correct numbers are, not in writing a program to produce the numbers.

But for beginners, Program 2 is difficult to write. Experience is needed to relate what the numbers are to a method for producing them.

Program 1, on the other hand, is easier to write, if one understands the physics of echo production and the computational operations that correspond to operations on sounds, as discussed in the earlier post. Writing the program requires only that the computational operations that correspond to attenuation, delay, and mixing be deployed and combined, and that one knows how to record a list of samples.

Saturday, March 8, 2008

Human-Centered Computing and Representation

Here is the synopsis of the program description for the Human-Centered Computing Cluster at the National Science Foundation (see http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=500051):

This cluster, Human-Centered Computing (HCC), encompasses a rich panoply of diverse themes in Computer Science and IT, all of which are united by the common thread that human beings, whether as individuals, teams, organizations or societies, assume participatory and integral roles throughout all stages of IT development and use.

HCC topics include, but are not limited to:

  • Problem-solving in distributed environments, ranging across Internet-based information systems, grids, sensor-based information networks, and mobile and wearable information appliances.
  • Multimedia and multi-modal interfaces in which combinations of speech, text, graphics, gesture, movement, touch, sound, etc. are used by people and machines to communicate with one another.
  • Intelligent interfaces and user modeling, information visualization, and adaptation of content to accommodate different display capabilities, modalities, bandwidth and latency.
  • Multi-agent systems that control and coordinate actions and solve complex problems in distributed environments in a wide variety of domains, such as disaster response teams, e-commerce, education, and successful aging.
  • Models for effective computer-mediated human-human interaction under a variety of constraints, (e.g., video conferencing, collaboration across high vs. low bandwidth networks, etc.).
  • Definition of semantic structures for multimedia information to support cross-modal input and output.
  • Specific solutions to address the special needs of particular communities.
  • Collaborative systems that enable knowledge-intensive and dynamic interactions for innovation and knowledge generation across organizational boundaries, national borders, and professional fields.
  • Novel methods to support and enhance social interaction, including innovative ideas like social orthotics, affective computing, and experience capture.
  • Studies of how social organizations, such as government agencies or corporations, respond to and shape the introduction of new information technologies, especially with the goal of improving scientific understanding and technical design.

Can this broad area of work, defined mainly by example, be given conceptual coherence? Focusing on the role of people in the representational work of computational systems suggests that it can. Further, the representational perspective shows that HCC is not, as some have thought, a peripheral aspect of computing, but rather a central one, not just in its practical importance, but also in the intellectual content it shares with other areas of computing.

As we have seen, some operations in representational systems can be automated, that is, carried out by machine. But some operations, such as making a judgment from a bar chart, are carried out by people. For a representational system to work, the cost structure of these human operations has to meet the same conditions as the cost structure of automated operations, including conditions on accuracy and reliability.

While the cost structures of automated and human-performed operations have symmetric status in designing or analyzing representational systems, the specifics naturally are different. For example, what computers like to do isn't a relevant or even well-defined consideration, while what people like to do certainly is. Here are a few examples that illustrate some of the factors that influence the cost structure of operations performed by people in various representational situations.

Bar charts: The bars are often drawn by computer, and then people make the actual judgments. When a person implements a mapping in this context, they perceive the bars, and then create an output. As discussed earlier, if the bars are base-aligned and parallel, the length comparison will be easy and accurate.

Picture processing: Recognizing objects or people in a scene, or judging aesthetic qualities, or many other everyday operations, can be done from pictures. That is, if a real scene isn't at hand, it can be represented by a picture, and judgments made from the picture can substitute for the judgments that would be made from the real scene. Conveniently, suitable pictures can be created and displayed by computer, taking into account the following characteristics of human vision.

Color perception: For Homo sapiens, the brightnesses of just three fixed colors can be adjusted so that their mixture is indistinguishable from nearly any color (but see http://www.purveslab.net/main/ for complications that are at work in color vision). A display for dogs would need only two colors (dogs discriminate only short from long wavelengths).

Limited spatial resolution, and blending: Below a certain size, closely spaced dots merge into a smooth-appearing area. Exploiting this principle, and the principles of human color perception, most displays work by presenting a large number of tiny, closely spaced dots of three colors. Even though this arrangement produces images that are physically very different from "real" scenes in the world, to the human visual system these images look realistic, and they are processed effectively and easily.

Depth perception: There are many cues for depth, that is, distance from the viewer. Nearly all of these are included in two-dimensional images. For example, perspective, the effect by which more distant objects form smaller images, or aerial perspective, the effect by which more distant objects appear fainter or more hazy, because they are seen through thicker intervening atmosphere, are preserved in two dimensional images. A depth cue that is important for objects near at hand is stereopsis: images seen by the two eyes differ in a way that is related to distance. Displays that exploit this indication of depth present different images to the two eyes, for example by using differently polarized light for the two images, and placing differently polarized filters over the two eyes. As for images made of dots, notice that the resulting pattern of light is very different from the pattern of light in a real scene. But it is different in a way that the human visual system doesn't detect. There could be organisms whose visual system is sensitive to the polarization of light. For such organisms these displays wouldn't work.

Moving from pictures to moving scenes, presentation of movies and videos relies on further facts about how people see. Just as closely spaced dots merge into the appearance of smooth surfaces, so images closely spaced in time merge into the appearance of smooth motion. As we all know, a movie consists of a series of frames, each a static picture, changed very rapidly, but we don’t see it that way. One could imagine a Martian in a movie theater wondering why Earthlings like to sit in the dark and watch long series of very similar pictures flashed on a screen. Video is more complicated, in that the pictures aren’t even complete: they are collections of closely spaced lines, with only half the lines included in each frame. But this ridiculous presentation looks quite good to the human visual system.

These examples bring out what the point of human centered computing is. Effective displays are not based on faithful physical and geometric reproductions of the signals available from real scenes. Rather, they systematically exploit facts that are specific to the human perceptual apparatus. Representational systems in which humans play a role cannot be designed without at least implicit knowledge of these facts.

Input forms, created by people.

In the examples just discussed, operations are carried out by people on entities, like bars, that are presented to them by a computer. From the point of view of the computer these are output forms. But it’s usually also necessary for people to produce forms that are given to the computer, so as to provide data for the computer to operate on, or to control what the computer does. From the point of view of the computer these are input forms. Just as for output forms, input forms have to be shaped to fit people’s capabilities. For output forms the human abilities that matter center on perception. But since input forms have to be produced by people, the key human capabilities are those of action. What are the actions people can control, to produce forms usable by a computer?

Keypresses. Most people have dexterous fingers that can be moved quickly and accurately so as to apply force to appropriately sized objects. Keys and buttons are such objects, and are designed to fit people’s fingers in size, the force required to activate them, and (for high quality keys) the feedback they provide to confirm that they have been activated. Most people can move their fingers so as to press more than one key at a time, and this ability is exploited in many musical instruments and a few computer input devices, such as chord keyboards.

Text entry. Often keys are used in groups to allow people to specify sequences of characters making up a text. For people who know a written language, many pieces of text are familiar, and can be generated quickly and accurately, whereas random sequences of keypresses can be entered only slowly and with high error rates.

Drawing. Most people can use their hands and fingers to grasp a pointed object of appropriate size and shape, and move the point along a desired path. With more difficulty, most people can move an object that has no point (a mouse) so as to control a pointed marker whose movement traces a desired path. In either case, the path can be sensed and act as input to a computer.

We could add many examples to this list, and perhaps invent new ways to use actions to communicate. For example, HCC researchers are developing ways for people to use facial expression, or tone of voice, to create input forms.

HCC and social systems.

Humans are social animals: most things that people do are done in groups. This influences human-centered computing in a number of ways.

There are usually multiple people in a computational system. The designers of such systems therefore have to understand not just what individual people are likely to do, and can do, but what people working in groups will and can do. For example, sometimes systems fail because some users don’t do things, like entering information into a database, that are needed to support other users. But sometimes, as in Wikipedia, people working strictly as volunteers put huge amounts of effort into really useful contributions. If you are designing a system for a lot of people to use, you have to try to understand what shapes contrasts like this. In our terms, we need to understand the cost structures of operations performed by groups, not just by individuals. Plainly, our ability to predict what will happen naturally (and hence cheaply) in a group is poor.

Decisions about system design have to reflect the needs and wants of groups of people, not just those of individuals. Historically, before computers became cheap enough for individuals to buy them, nearly all computational systems were created to meet the needs of organizations, especially businesses. Today, even though it is common for individuals to own one or more computers, facilities for communicating with other people, especially via the Internet, are now crucial for almost all users. People need to communicate in many situations, and want to communicate in many others. The design of future computational systems will continue to be shaped strongly by these needs and wants. Think of MySpace or Facebook.

These considerations, too, shape the cost structures we are concerned with as designers of representational systems. Here we are seeing the value side of cost: systems need to deliver value to offset their costs.

HCC and particular user groups.

People have different capabilities. Because, as we've seen, representational systems are shaped by users’ capabilities, differences in capabilities need to be reflected in different designs. Assistive technology uses representations that are suited to the needs of people with limited vision, or limited ability to read text, to mention just two examples. Inclusive design seeks to create representational systems that can be used by people with the widest possible range of capabilities.

To come: representation and programs; philosophical perspectives.

Saturday, March 1, 2008

Computational Stuff, A Worked Example

Suppose you are listening to a sound, say a bird song, and you think, “I’d like to hear that song with an echo.” Good luck doing that in the target domain, the leafy glade in which you are enjoying the ambiance. You could try to wheel up some kind of sound reflector (if by some miracle there is one handy) but the bird would probably fly off while you do it. Even if it stays around it may not sing for you.

What you’ll need to do is map the bird’s song into some representation domain, say a pattern of magnetization in a film of oxide on a cassette tape. (Of course, doing a mapping of this kind is what we call "recording".) Now, with two cassette players (one to play the original song, and one to play the song again, starting a little later, at reduced volume) you can get your effect. But there’s a good deal of work involved (you have to copy the tape), and some dexterity needed to get the right delay.

If you map the bird’s song into computational stuff, things are much easier. You can automate the production of the echo, for the bird’s song or any other sound. That is, given any sound, represented computationally, you can get the sound with the echo just by asking for it. No more work than that.

Representing Sounds using Sequences of Numbers

You hear a sound when certain cells in your ears detect rapid changes in air pressure, with the pressure going up, going down, going up, going down and so on, very rapidly. How big the swing in pressure is determines the loudness of the sound. How rapidly the pressure swings up and down determines the pitch or frequency of the sound. The more rapidly the swings happen, the higher the pitch. Human ears are sensitive to swings that go up and down between about 20 times per second and about 20,000 times a second.

If you measure air pressure, very rapidly and accurately, you’ll get a collection of numbers. If you have some way of working out when each measurement was made, you have captured a representation of the sound: you know when the pressure goes up, and how high, when it goes down, and how far, and so on. If you had some way of taking these numbers, and creating air pressures that match them, with the right timing, you could reproduce the sound, at least roughly. The roughness comes in because you don’t really know what is supposed to happen to the air pressure in between your measurements. But if our measurements are close enough together we can hope that the roughness won’t matter.

In theory, you could do all of this by keeping track separately of the time associated with each air pressure measurement. In practice a simpler scheme is used. If we know how many measurements per second we are making, evenly spaced in time, we can work out the time associated with each measurement, without having to store the time for each measurement.
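The rule is simple arithmetic: measurement number i (counting from 0) was made at time i divided by the number of measurements per second. Here is a tiny sketch in Python, the language used later in this post (the function name sampleTime is ours, introduced for illustration):

#Time, in seconds, at which measurement number i was made,
#given evenly spaced measurements at rate per second.
def sampleTime(i, rate):
    return float(i)/rate

For example, sampleTime(22050, 44100) is 0.5: at 44,100 measurements per second, measurement number 22,050 happens half a second into the sound.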

If we want to represent a sound, then, we need a big collection of numbers (44,100 numbers for one second of sound), which are the pressure measurements. We also need to keep track of how often we made the measurements, 44,100 times a second in our example. We need to know this number to play back the sound: if we recreate the air pressure assuming the measurements were collected more or less often than they actually were, the sound we get won’t approximate the original, because the timing of the air pressure changes will be off.

The terminology that is used in talking about this stuff is different from what we’ve been using. Rather than “air pressure measurement” we’ll say sample from now on, and the number of measurements we collect per second when recording is called the sample rate. Using this terminology, we can represent a sound by a sequence of samples, also keeping track of the sample rate that was used when they were recorded, and that should be used to play them back.

Doing Work on Sounds

For our representation to be useful, we have to be able to do work on it that corresponds to work we might want to do on sounds. Here are some example operations.

Making a sound louder. You hear that bird song, and you want it louder. You could try to make the bird angry, but that isn’t very likely to work. Instead, you map the song to a sequence of samples, and you multiply each sample by some factor greater than one. If you now play back the new samples, you’ll get a louder version of the original song. Here’s a picture of the representation system at work here:


Making a sound fainter. If you multiply each sample by a number less than one, and play back, the sound will be softer (because the swings in pressure aren’t as big.)

Hearing one sound and then another. You hear one bird song, and then, after a while, another. You decide you’d like to hear them together, one right after the other. You map each song to a sequence of samples, and then you stick the two sequences together to get one long sequence. When you play back the long sequence you’ll hear what you want.

Delaying a sound. If you have a sequence representing some silence, and a sequence representing the sound you want to delay, you can stick the sequences together, with the silence first. When you play this sequence back, you’ll first hear the silence, and then, after a delay, the sound.

Hearing two sounds together. You hear one bird song, and then another. You think they might harmonize, so you wish the birds would sing together. Good luck trying to get them to do that! Using our representation, this is easy. Get the sequences of samples for each song, and then make a new sequence by adding together the first samples in each of the songs, then the second samples, and so on. It may not be clear that this will work. But if you think about the physics of the original sounds, you may be able to see that the individual air pressures add up in just this way.

Making an echo. Suppose we have our favorite bird song, and we want to hear it with an echo. What happens in an echo is that the original song hits our ears, but it also goes off and hits some reflector, like a cliff, and bounces back to our ears from there. So what is coming to our ears at any moment consists of whatever we are hearing of the original song, plus the song that is bouncing back to us. Since it takes time for the song to get to the cliff and back to us, what we hear in the echo is delayed. Also, since the bounce off the cliff produces some scattering (sound that doesn’t get back to our ears), the delayed, bounced version of the song is softer than the original.

Here’s how we can do all this, step by step.

1. Record the original song, getting a sequence of samples. Let’s call this sequence originalSong.

2. Make a softer version of the original, by multiplying each sample by (say) .5. Call this sequence softerSong.

3. Record .25 seconds of silence, getting a sequence called quarterSecondOfSilence.

4. Stick the sequence quarterSecondOfSilence onto the front of softerSong. Call the resulting sequence delayedSofterSong.

5. Add together the samples in originalSong and delayedSofterSong. If we play back the resulting sequence, we’ll hear the original song with an echo.

Representing Sounds using Computational Stuff

In theory you could do all of this with pencil and paper, if you had some way to read the sample numbers, and to get the samples from your paper into some kind of player. But at 44,100 samples in a second of bird song, that’s a lot of paper and pencil. You want to get a machine to do the work.

To do this, we are going to represent the sequences of numbers we are using to represent the sounds, using computational stuff. Notice the three levels here: sounds are what we are interested in, and we use sequences of numbers to represent them. But then to be able to automate work on the sequences of numbers, we represent them using computational stuff.

We’ll do this using the Python programming language.

We are going to use things called lists in our Python programs to represent our sequences of samples.

To do the above work, the things we need to do are: get a list of samples representing the original bird song, multiply each sample in a list by a factor, get a list of samples representing .25 sec of silence, stick two lists together, and produce a new list by adding the samples in two lists together. We’ll also need to play back a list of samples. Here’s how to do these things in Python.

Basics of lists. A list is a sequence of things, numbers in our case, written like this: [1,22,47,29]. You can give a list a name, using an assignment statement, like this:

eggplant=[1,22,47,29]

Now we can refer to any of the elements of the list eggplant by using a number as an index. The indices start with 0, so eggplant[0] is 1, eggplant[1] is 22, eggplant[2] is 47, and eggplant[3] is 29. (Indices are also often called subscripts, from mathematical notation.)

Get a list of samples representing the original bird song. Rather than make a field recording, we’ll get our samples from a WAV file. Because computational representations of sounds are so handy, machinery has been created for recording sounds and storing them as collections of information on computer systems. Collections of information that can be filed away on the computer for future use are called files. A WAV file is a file that contains samples of a sound, stored in a particular way.

Here is a program that gets a list of samples from a WAV file named "birdsong.wav", and prints the first 10 of them:

originalSong=readwave("birdsong.wav")
for i in range(10):
    print originalSong[i]

Multiply each sample in a list by a factor. This statement is all we need:

softerSong=[.5*e for e in originalSong]

This statement builds a new list, each of whose elements is made from an element of originalSong by multiplying it by .5. You can read it by thinking of the first thing in the brackets, .5*e, as a kind of pattern that shows how to make an element of the new list from an element of the old one. We know that e is an element of the list originalSong, because we wrote “for e in originalSong”. The square brackets tell us that we are making a list.
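To see what the pattern produces, here it is applied to a tiny made-up list (toy numbers, not real samples):

[.5*e for e in [100, -40, 6]]   #the result is [50.0, -20.0, 3.0]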

Get a list of samples representing .25 sec of silence. We could look for a WAV file that contains a recording of .25 sec of silence, but it will be easier just to make our own list of samples. This code will do it:

quarterSecondOfSilence=int(.25*44100)*[0]

This takes a little explaining. The number .25*44100 is the number of samples we need for a quarter second of sound, since we need 44,100 samples for each second of sound. The “int” with parens around it makes this into a whole number, or integer (you and I know that that product is already a whole number, but because .25 isn’t a whole number Python worries that the product might not be.) Then comes something funny looking: we are “multiplying” the list [0] by that number. When you “multiply” a list in Python by a number, Python sticks together that number of copies of the original list. So we are getting a list of .25*44100 copies of the list with just 0 in it. The result is a sequence of as many samples as we need to get .25 sec of sound, and each sample is 0.
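Here is the repetition trick by itself, on lists small enough to see whole (toy values again):

3*[0]     #the result is [0, 0, 0]
2*[5,7]   #the result is [5, 7, 5, 7]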

Stick two lists together. This one is really easy. In Python, if I have two lists, I can just “add” them together. For example, [1,2,3]+[11,12] is [1,2,3,11,12]. So to produce a delayed version of softerSong we can just write

delayedSofterSong=quarterSecondOfSilence+softerSong

Produce a new list by adding the samples in two lists together. We’ve just seen that Python won’t do this if we just “add” the two lists with +... it will concatenate them. So this is a little more involved. Here is Python code that will do the trick.

songWithEcho=[originalSong[i]+delayedSofterSong[i] for i in range(len(originalSong))]

The expression originalSong[i]+delayedSofterSong[i] is the pattern for elements of the new list. The values of i used in the pattern come from the list range(len(originalSong)). The function range produces the list [0,1,...] with as many elements as the number you specify. Finally, len(originalSong) is the number of elements in the list, originalSong. This means that the values of i will step up from 0 until we have used as many values as there are things in originalSong, so in the pattern, originalSong[i] will get to be all of the elements in the list originalSong, which is what we want. Note that delayedSofterSong[i] picks out the corresponding elements of that list.
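As a check on how the pattern works, here it is applied to two tiny lists (toy values once more):

a=[1,2,3]
b=[10,20,30]
[a[i]+b[i] for i in range(len(a))]   #the result is [11, 22, 33]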

Play back a list of samples. We now hope we have a version of the song with the echo added, but we can’t hear it. Unfortunately we don’t have a way to convert a list of samples directly to sound from Python. What we can do, though, is put our samples into a WAV file, and then play the WAV file on our computer. Let's assume that the statement makewave("modifiedsong.wav",44100,songWithEcho) will put the samples in the list songWithEcho into a WAV file named "modifiedsong.wav". (Note: functions readwave and makewave aren't part of standard Python, but they can be implemented.)
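For readers who want to run the examples, here is one way readwave and makewave might be implemented: a sketch using Python's standard wave and struct modules, assuming 16-bit mono WAV files (the versions used in the course may differ in details):

import wave, struct

def readwave(filename):
    #Open a 16-bit mono WAV file and return its samples
    #as a list of integers.
    w=wave.open(filename,"rb")
    n=w.getnframes()
    data=w.readframes(n)
    w.close()
    return list(struct.unpack("<%dh" % n, data))

def makewave(filename, sampleRate, samples):
    #Write a list of samples into a 16-bit mono WAV file,
    #recorded at the given sample rate.
    w=wave.open(filename,"wb")
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(sampleRate)
    #Clamp each sample to the 16-bit range, so that mixing
    #two loud sounds cannot overflow the file format.
    clamped=[max(-32768, min(32767, int(s))) for s in samples]
    w.writeframes(struct.pack("<%dh" % len(clamped), *clamped))
    w.close()

With definitions along these lines, the statement makewave("modifiedsong.wav",44100,songWithEcho) behaves as described above.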

Pulling these pieces together, here is a complete program for adding an echo to a sound, represented by a WAV file:

originalSong=readwave("birdsong.wav")
softerSong=[.5*e for e in originalSong]

quarterSecondOfSilence=int(.25*44100)*[0]
delayedSofterSong=quarterSecondOfSilence+softerSong
songWithEcho=[originalSong[i]+delayedSofterSong[i] for i in range(len(originalSong))]
makewave("modifiedsong.wav",44100,songWithEcho)

Below is a tabular summary of the methods we’ve discussed. The table distinguishes three domains: the sounds, the sequences of samples, and the lists (the computational stuff).

Sounds                       Sequences of samples                    Lists (computational stuff)
make a sound louder          multiply each sample by a factor > 1    [2*e for e in song]
make a sound fainter         multiply each sample by a factor < 1    [.5*e for e in song]
one sound, then another      concatenate the two sequences           song1+song2
delay a sound                put silence in front of the sequence    silence+song
two sounds together          add corresponding samples               [song1[i]+song2[i] for i in range(len(song1))]

What is Implementation?

As was mentioned earlier, the key difference between computer science and mathematics is that mappings used in computing are (or could be) implemented, whereas mappings in mathematics need not be. But what does it mean to implement a mapping?

An implementation of a mapping is a thing with two ends, an input end and an output end. In the case of computer implementation, the thing will be a technical device of some kind. For example, an adder is a device whose input end can be set to correspond to a pair of numbers, say 2 and 3. After a little delay, the adder sets its output end to correspond to the sum of the two numbers on the input end, 5 in this case. Any computer will have one or more adders in it.

Similarly, to draw a bar chart we will want a device of some kind whose input end can be set to represent a length, and whose output end will produce a colored bar. In practice, you won’t find a bar-drawer as a separate device inside a computer. Rather, simpler devices are hooked together so as to act like a bar-drawer when needed.

The program above, when installed on an appropriately configured computer, implements some of the key stages of the production of a song with an echo. When a WAV file is presented as input, after some delay an appropriate WAV file appears as output. The steps connecting the original bird song in the leafy glade to a WAV file (recording), and connecting the output WAV file to a sound (playback) are not implemented by the program. If we were to examine their implementation, in practice, we would find that recording is probably implemented by a person operating a piece of special equipment, while playback is implemented by another program on a computer. The fact that mappings are often implemented by people leads into the topic of human-centered computing, to be taken up later.
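Packaged as a single function, the whole program becomes one implemented mapping, with a file at its input end and a file at its output end. Here is a sketch (the name addEcho and its parameter names are ours, reusing the readwave and makewave helpers from above):

def addEcho(inputFilename, outputFilename):
    #Implements the mapping whose input end is a WAV file and whose
    #output end is a WAV file of the same sound with a .25 sec echo.
    song=readwave(inputFilename)
    softer=[.5*e for e in song]
    silence=int(.25*44100)*[0]
    delayed=silence+softer
    withEcho=[song[i]+delayed[i] for i in range(len(song))]
    makewave(outputFilename,44100,withEcho)

For example, addEcho("birdsong.wav","modifiedsong.wav") maps one file to the other.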

Input and output forms.

We'll often need to refer to the states of the input and output ends of implemented mappings. Because implementations are very diverse, we'll want a generic term: we'll describe these states as input and output forms. For example, a bar graph displayed by a program may act as an output form for the program, and as an input form for a person viewing it.

Aside: Communication and Storage as Implemented Mappings

Many of the mappings one thinks of are like addition, in that the output is different from the input. But there are two very useful families of implementations of the identity mapping, the mapping whose output is the same as its input. Because the ends of a physical device can be separated in space, we can create a communication system, whose input end is in New York and whose output end is in Singapore. Here it is an advantage that the output is identical to the input.

Similarly, the ends of a device can be separated in time. A device whose input end is in September, 2006, but whose output end is in September, 2016, can be a storage system. Again, we are happy if the output is the same as the input.

Reprise: Advantages of computational representations.

The example we've discussed here provides concrete illustrations of the characteristic advantages of computational representations, representations using "computational stuff", over other forms. A sound represented as a sequence of samples can be stored in very little space, and accurately retrieved at a later time. It can also be communicated, that is, made available in another place, very rapidly and cheaply. These advantages give us the iPod, and the ability to acquire all kinds of material to play on it at very low cost. Finally, we can automate all kinds of transformations on sounds, incomparably more easily than can be done in other representation domains, such as magnetic tape, let alone in the target domain, the leafy glade.