Thursday, March 20, 2008

What difference does it make?

Does it make any difference to view computational systems as representational systems? Yes. Here is a comparison of two programs I wrote as instructor's solutions for the same homework exercise in an introductory course using Python. The problem is to read a list of sound samples from a WAV file, and record a new sound, consisting of the original plus an echo. The echo is delayed by .15 sec, and attenuated by a factor of .25. This is the same example discussed in the post, Computational Stuff, a Worked Example.

Program 1 is the solution discussed in the earlier post, with the portion of the solution that records the song with echo slightly modified for comparability with the solution presented as Program 2.

#Program 1
softerSong=[.5*e for e in originalSong] #line 1
intervalOfSilence=int(.25*sampleRate)*[0] #line 6
delayedSofterSong=intervalOfSilence+softerSong #line 7 #line 2
#line 3:
songWithEcho=[originalSong[i]+delayedSofterSong[i] for i in range(len(originalSong))]
for sample in songWithEcho:

Program 2 is my solution to the same problem, written earlier without considering the representational perspective:

#Program 2
originalSong = readwave("song.wav").getLeft()
echospread = int( 44100 * .25 )
outputWAV = makewave("modifiedsong.wav", sampleRate)
for i in range(0, echospread):
outputWAV.record(originalSong[i], 0)
for i in range(echospread, len(originalSong)):
outputWAV.record( originalSong[i] + originalSong[i - echospread] * .5,0)

Both programs produce correct results. Indeed, they produce the same results. But they work quite differently.

It is difficult to identify any computational operation in Program 2 with an operation in the target domain of sounds, or to identify any data structure in Program 2 with a sound, except for originalSong. The thrust of Program 2 is to record certain numbers in a file, not to construct a representation of a sound that is produced in a certain way.

In contrast, in Program 1, softerSong, delayedSofterSong, and songWith Echo all correspond to sounds, and they are created using operations on sequences of samples that correspond to operations on sounds: attentuation (in line 1), delaying (in line 2), and mixing the original song with the echo (in line 3).

Here are diagrams showing these relationships in the two programs.

Here is shown the original song, bouncing off a reflector shown at right, in attenuated form, being delayed, and being mixed with the original song. Color coding and dashed lines show which things in the representation domain correspond to which things in the target domain. Solid lines show which operations in the representation domain correspond to mappings in the target domain.

Notice that the computational representation produces only an approximation to the actual echo effect. As shown, when the recorded sound is played back the echo is truncated, because when mixing is done in the program the resulting sound is only as long as the shorter of the two sounds being mixed.
The similar diagram for Program 2 has many fewer correspondences to point out. The value of originalSong corresponds to the original song in the target domain, but there are no other simple correspondences. Because Program 2 builds up the recording sample by sample, the operations of attentuation, delay, and mixing, are hidden in the arithmetic of the construction of the samples. The prominent decomposition of the program into two loops make sense in view of the change in arithmetic needed to produce two groups of samples, but this split doesn't reflect anything about how echos are produced physically.

Notice also that there is a mapping, "record samples" shown as an arrow for Program 1 that appears only as a floating label in Program 2. In Program 1 there is a list of samples created that corresponds to the song with echo, and this list of samples is recorded, an operation easily understood as a mapping. In Program 2, there is no list of samples that is created and then recorded; rather, samples are created and recorded piecemeal. Recording is happening, but not in a way that can be understood separately from other operations in the program.

So what? For an experienced programmer, Program 2 isn't hard to write. One imagines the numbers one has to record, and figures out how to do the arithmetic that will produce these numbers, in the process of recording them. One's understanding of the physics of sound plays a role only in determining what the correct numbers are, not in writing a program to produce the numbers.

But for beginners, Program 2 is difficult to write. Experience is needed to relate what the numbers are to a method for producing them.

Program 1, on the other hand, is easier to write, if one understands the physics of echo production and the computational operations that correspond to operations on sounds, as discussed in the earlier post. Writing the program requires only that the computational operations that correspond to attenuation, delay, and mixing be deployed and combined, and that one knows how to record a list of samples.

No comments: