Thursday, May 22, 2008

Notations for Composition

What notational devices do programming languages use to specify the composition of two mappings? A common one is borrowed from mathematical notation, nesting of operators. For example, the expression

2+3*4

uses nesting to compose the multiplication and addition mappings in a particular way. The expression

(2+3)*4

composes the same mappings in a different way. How the mappings are composed has to be worked out by analyzing the details of the expression, including the role of operator precedence (if there are no parentheses, multiplications are performed before additions, even when the addition comes first in the expression) and parentheses, if present.

Here is a version of an expression for do-echo-computation that uses this device heavily:

#Version 1
[originalSong[i]+(intervalOfSilence+[.5*e for e in originalSong])[i] for i in range(len(originalSong))]

The code assumes that originalSong is the sound to be operated upon, and that an appropriate constant intervalOfSilence has been constructed.

Is it clear in this notation what mappings are being composed, and how? Before considering this question more fully, here is an alternative form:

#Version 2
softerSong=[.5*e for e in originalSong]
echo=intervalOfSilence+softerSong
[originalSong[i]+echo[i] for i in range(len(originalSong))]

Version 2 of the program uses another common way of specifying composition, the use of variables. Rather than nesting operations to show composition of mappings, Version 2 shows the production of intermediate results, using some mappings, which are then used as inputs to other mappings, to accomplish the effect of composition.

Both Version 1 and Version 2 specify the mappings to be composed in terms of lower level operations like list comprehension in the main expressions. We can focus more clearly on the contrasts in notation by separating out these specifications, like this:

def attenuate(s):
return [.5*e for e in s]
def delay(s):
return intervalOfSilence+s
def mix(s1,s2):
return [s1[i]+s2[i] for i in range(len(s1))]

We then have

#Version 1a
mix(originalSong,delay(attenuate(originalSong)))

#Version 2a
softerSong=attenuate(originalSong)
echo=delay(softerSong)
mix(originalSong,echo)

Note that Version 2a has no nesting of operations; composition is specified entirely by the use of variables. What are the merits and demerits of these two notations?

The best developed framework for analyzing notations is Cognitive Dimensions analysis, developed by Thomas Green, Marian Petre, Alan Blackwell, and others (see http://www.cl.cam.ac.uk/%7Eafb21/CognitiveDimensions/). Cognitive Dimensions are aspects of a notation to examine; here are a few of them, with capsule definitions from Blackwell http://www.cl.cam.ac.uk/Teaching/2000/AGraphHCI/HCI/hcinotes.html#cds, and a discussion of Versions 1a and 2a with respect to each.

Hidden dependencies: important links between entities are not visible.

If the structure of the product means some parts are closely related to other parts, and changes to one may affect the other, are those dependencies visible? What kind of dependencies are hidden? In what ways can it get worse when you are creating a particularly large description? Do these dependencies stay the same, or are there some actions that cause them to get frozen? If so, what are they?

Both versions suffer from a hard to see connection between the definition of mix and its use. While "hearing two sounds together" is completely symmetric (hearing A and B together is just the same as hearing B and A together) the computational operation mix that approximates it is not symmetric: neither Version 1a nor Version 1b will work if mix(echo,originalSound) is substituted for mix(originalSound,echo): mix is defined in such a way that the longer sound has to be specified as the second parameter. There's no way to tell this without close reading of the definition of mix.

When operations have multiple parameters, and they are not symmetric, mathematical notation does not work well, because the notation does not convey the order in which parameters must appear. Perhaps to help with this, common mathematical notation uses special forms for very asymmetric operations like division and exponentiation, providing some help in keeping track of the different roles of numerator and denominator, and base and exponent. Even here, though, one just has to know which parameter is which.

Hard mental operations: high demand on cognitive resources.

What kind of things require the most mental effort with this notation? Do some things seem especially complex or difficult to work out in your head (e.g. when combining several things)? What are they?

There are hard mental operations required in both Version 1a and Version 1b. In 1a one has to parse the expression, keeping track of parentheses, to determine what is composed with what, and in what way (when mappings take more than one parameter the possible compositions multiply). In 2a one has to trace the history of the variables to recover the same information. Neither of these tasks is easy.

We'll consider the next two dimensions together.

Closeness of mapping: closeness of representation to domain.

How closely related is the notation to the result that you are describing? Why?


Role-expressiveness: the purpose of a component is readily inferred.

When reading the notation, is it easy to tell what each part is for? Why? Are there some parts that are particularly difficult to interpret? Which ones? Are there parts that you really don't know what they mean, but you put them in just because it's always been that way? What are they?

These two dimensions point to advantages of Version 2a over 1a that are similar to those pointed out in the earlier post (March 20, 2008 What difference does it make?) comparing two solutions to the echo problem. The introduction of named variables in 2a points up correspondences between entities in the program and entities in the target domain, an example of closeness of mapping. At the same time, these correspondences help the reader understand the roles of these parts in the program.

Note that while these uses of names seem quite important from the representational perspective, choice of names is outside the scope of programming language design, as traditionally conceived. This situation parallels that seen in the discussion of application: it's crucial that a program be applied to the right thing, but languages don't provide support for this.

A further benefit of Version 2a is that the order in which the mappings are composed corresponds to lexical order there, while it is reversed in Version 1a. That is, the operation mix, that comes last in composition order, is written last in Version 2a, but first in Version 1a.

On balance, these cognitive dimensions favor Version 2a over 1a. Yet common programming style, including the recommendations I have made to my intro students in the past, favors 1a. This mismatch seems to reflect the lack of a representational perspective on programs, a perspective that is needed to exploit closeness of mapping, with derivative advantages in role-expressiveness.

It's interesting to speculate whether notations could be devised that would reduce the problems of hidden dependencies and hard mental operations that both versions show.

No comments: