Saturday, March 8, 2008

Human-Centered Computing and Representation

Here is the synopsis of the program description for the Human-Centered Computing Cluster at the National Science Foundation (see http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=500051):

This cluster, Human-Centered Computing (HCC), encompasses a rich panoply of diverse themes in Computer Science and IT, all of which are united by the common thread that human beings, whether as individuals, teams, organizations or societies, assume participatory and integral roles throughout all stages of IT development and use.

HCC topics include, but are not limited to:

  • Problem-solving in distributed environments, ranging across Internet-based information systems, grids, sensor-based information networks, and mobile and wearable information appliances.
  • Multimedia and multi-modal interfaces in which combinations of speech, text, graphics, gesture, movement, touch, sound, etc. are used by people and machines to communicate with one another.
  • Intelligent interfaces and user modeling, information visualization, and adaptation of content to accommodate different display capabilities, modalities, bandwidth and latency.
  • Multi-agent systems that control and coordinate actions and solve complex problems in distributed environments in a wide variety of domains, such as disaster response teams, e-commerce, education, and successful aging.
  • Models for effective computer-mediated human-human interaction under a variety of constraints, (e.g., video conferencing, collaboration across high vs. low bandwidth networks, etc.).
  • Definition of semantic structures for multimedia information to support cross-modal input and output.
  • Specific solutions to address the special needs of particular communities.
  • Collaborative systems that enable knowledge-intensive and dynamic interactions for innovation and knowledge generation across organizational boundaries, national borders, and professional fields.
  • Novel methods to support and enhance social interaction, including innovative ideas like social orthotics, affective computing, and experience capture.
  • Studies of how social organizations, such as government agencies or corporations, respond to and shape the introduction of new information technologies, especially with the goal of improving scientific understanding and technical design.

Can this broad area of work, defined mainly by example, be given conceptual coherence? Focusing on the role of people in the representational work of computational systems suggests that it can. Further, the representational perspective shows that HCC is not, as some have thought, a peripheral aspect of computing, but rather a central one, not just in its practical importance, but also in the intellectual content it shares with other areas of computing.

As we have seen, some operations in representational systems can be automated, that is, carried out by machine. But some operations, such as making a judgment from a bar chart, are carried out by people. For a representational system to work, the cost structure of these human operations has to meet the same conditions as the cost structure of automated operations, including conditions on accuracy and reliability.

While the cost structures of automated and human-performed operations have symmetric status in designing or analyzing representational systems, the specifics naturally are different. For example, what computers like to do isn't a relevant or even well-defined consideration, while what people like to do certainly is. Here are a few examples that illustrate some of the factors that influence the cost structure of operations performed by people in various representational situations.

Bar charts: The bars are often drawn by computer, and then people make the actual judgments. When a person implements a mapping in this context, they perceive the bars and then produce an output, such as a judgment of which bar is longer. As discussed earlier, if the bars are base-aligned and parallel, the length comparison will be easy and accurate.
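
To make this concrete, here is a minimal sketch in Python, assuming the matplotlib plotting library (any plotting package would do); the category labels and values are invented for illustration. Because all the bars rise from the same baseline and run parallel, the human operation of comparing their lengths is easy and accurate.

    # Minimal sketch: base-aligned, parallel bars, so that a person can
    # compare lengths easily and accurately. Labels and values are invented.
    import matplotlib.pyplot as plt

    categories = ["A", "B", "C", "D"]   # hypothetical categories
    values = [23, 17, 30, 12]           # hypothetical quantities

    fig, ax = plt.subplots()
    ax.bar(categories, values)          # parallel bars sharing a baseline at zero
    ax.set_ylim(0, max(values) * 1.2)   # keep the common baseline in view
    ax.set_ylabel("Value")
    plt.show()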

Picture processing: Recognizing objects or people in a scene, or judging aesthetic qualities, or many other everyday operations, can be done from pictures. That is, if a real scene isn't at hand, it can be represented by a picture, and judgments made from the picture can substitute for the judgments that would be made from the real scene. Conveniently, suitable pictures can be created and displayed by computer, taking into account the following characteristics of human vision.

Color perception: For Homo sapiens, the intensities of just three colors can be chosen so as to reproduce, as far as the eye can tell, nearly any color (but see http://www.purveslab.net/main/ for complications at work in color vision). A display for dogs would need only two colors (dogs discriminate only short from long wavelengths).

Limited spatial resolution, and blending: Below a certain size, closely spaced dots merge into a smooth-appearing area. Exploiting this principle, and the principles of human color perception, most displays work by presenting a large number of tiny, closely spaced dots of three colors. Even though this arrangement produces images that are physically very different from "real" scenes in the world, to the human visual system these images look realistic, and they are processed effectively and easily.
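
As a rough sketch of this idea (assuming Python with NumPy and the Pillow imaging library; the image content is invented), a displayed picture is just a rectangular grid of tiny dots, each specified by three intensities, which the visual system blends into smooth, continuously colored areas.

    # Sketch: an image as a grid of closely spaced dots, each dot given by
    # three intensities (red, green, blue). Assumes NumPy and Pillow.
    import numpy as np
    from PIL import Image

    height, width = 200, 300
    pixels = np.zeros((height, width, 3), dtype=np.uint8)

    # A gradual left-to-right blend from blue to yellow: neighboring dots
    # differ only slightly, so the eye sees a smooth gradient, not dots.
    for x in range(width):
        t = x / (width - 1)
        pixels[:, x] = (int(255 * t), int(255 * t), int(255 * (1 - t)))  # R, G, B

    Image.fromarray(pixels).save("gradient.png")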

Depth perception: There are many cues for depth, that is, distance from the viewer. Nearly all of these can be reproduced in two-dimensional images. For example, perspective, the effect by which more distant objects form smaller images, and aerial perspective, the effect by which more distant objects appear fainter or hazier because they are seen through thicker intervening atmosphere, are both preserved in two-dimensional images. A depth cue that is important for objects near at hand is stereopsis: the images seen by the two eyes differ in a way that is related to distance. Displays that exploit this indication of depth present different images to the two eyes, for example by using differently polarized light for the two images and placing differently polarized filters over the two eyes. As with images made of dots, notice that the resulting pattern of light is very different from the pattern of light in a real scene. But it is different in a way that the human visual system doesn't detect. There are organisms, many insects among them, whose visual systems are sensitive to the polarization of light; for such organisms these displays wouldn't work.
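
The geometry of stereopsis can be sketched with a simple pinhole-style model, given here in Python; the eye separation and focal length figures are illustrative assumptions, not measurements. The horizontal offset (disparity) between the two eyes' images of a point shrinks rapidly with distance, which is why this cue matters mainly for nearby objects.

    # Rough sketch of binocular disparity under a simple pinhole model:
    #   disparity (pixels) = focal_length_px * eye_separation / distance
    # The numeric values below are illustrative assumptions.

    def disparity_px(distance_m, eye_separation_m=0.065, focal_length_px=1000):
        """Horizontal offset between the left- and right-eye images of a point."""
        return focal_length_px * eye_separation_m / distance_m

    for d in (0.5, 1.0, 2.0, 10.0):
        print(f"object at {d:5.1f} m -> disparity of about {disparity_px(d):5.1f} pixels")
    # Nearby objects produce large disparities; distant ones almost none.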

Moving from pictures to moving scenes, the presentation of movies and videos relies on further facts about how people see. Just as closely spaced dots merge into the appearance of smooth surfaces, so images closely spaced in time merge into the appearance of smooth motion. As we all know, a movie consists of a series of frames, each a static picture, changed very rapidly, but we don't see it that way. One could imagine a Martian in a movie theater wondering why Earthlings like to sit in the dark and watch long series of very similar pictures flashed on a screen. Video is more complicated, in that the pictures aren't even complete: they are collections of closely spaced lines, with only half the lines included in each successive picture (a "field"). But this ridiculous presentation looks quite good to the human visual system.
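
Here is a small sketch of the interlacing idea, assuming NumPy and using an invented array in place of a real picture: each transmitted image carries only half the lines, alternating between the even and odd rows, yet the rapid alternation of these half-pictures looks complete to a human viewer.

    # Sketch of interlaced video: each field carries only half the lines.
    # Assumes NumPy; the "frame" is just an invented array of pixel values.
    import numpy as np

    frame = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)

    even_field = frame[0::2]   # lines 0, 2, 4, ... (half the rows)
    odd_field = frame[1::2]    # lines 1, 3, 5, ... (the other half)

    # Shown in rapid alternation, the two half-pictures appear to the viewer
    # as one complete picture; interleaving them recovers the full frame.
    rebuilt = np.empty_like(frame)
    rebuilt[0::2] = even_field
    rebuilt[1::2] = odd_field
    assert np.array_equal(rebuilt, frame)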

These examples bring out the point of human-centered computing. Effective displays are not based on faithful physical and geometric reproductions of the signals available from real scenes. Rather, they systematically exploit facts that are specific to the human perceptual apparatus. Representational systems in which humans play a role cannot be designed without at least implicit knowledge of these facts.

Input forms, created by people.

In the examples just discussed, operations are carried out by people on entities, like bars, that are presented to them by a computer. From the point of view of the computer these are output forms. But it's usually also necessary for people to produce forms that are given to the computer, so as to provide data for the computer to operate on, or to control what the computer does. From the point of view of the computer these are input forms. Just as for output forms, input forms have to be shaped to fit people's capabilities. For output forms the human abilities that matter center on perception. But since input forms have to be produced by people, the key human capabilities are those of action. What are the actions people can control, to produce forms usable by a computer?

Keypresses. Most people have dexterous fingers that can be moved quickly and accurately so as to apply force to appropriately sized objects. Keys and buttons are such objects, and are designed to fit people's fingers in size, in the force required to activate them, and (for high-quality keys) in the feedback they provide to confirm that they have been activated. Most people can move their fingers so as to press more than one key at a time, and this ability is exploited in many musical instruments and in a few computer input devices, such as chord keyboards.
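
A sketch of the chording idea, in plain Python with an invented stream of key events (no real keyboard hardware or GUI toolkit is assumed): the computer tracks which keys are down together and treats each combination, released as a group, as a single input.

    # Sketch of chorded input: keys held down together form one "chord".
    # The event stream is invented; a real system would receive events
    # from keyboard hardware or a GUI toolkit.
    events = [("down", "j"), ("down", "k"), ("up", "k"), ("up", "j"),
              ("down", "a"), ("up", "a")]

    pressed = set()        # keys currently held down
    current_chord = set()  # every key seen down during the current stroke
    chords = []            # recognized chords, one per stroke

    for action, key in events:
        if action == "down":
            pressed.add(key)
            current_chord.add(key)
        else:
            pressed.discard(key)
            if not pressed and current_chord:   # stroke finished: all keys released
                chords.append(frozenset(current_chord))
                current_chord = set()

    print(chords)   # two chords: {'j', 'k'} together, then 'a' alone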

Text entry. Often keys are used in groups to allow people to specify sequences of characters making up a text. For people who know a written language, many pieces of text are familiar, and can be generated quickly and accurately, whereas random sequences of keypresses can be entered only slowly and with high error rates.

Drawing. Most people can use their hands and fingers to grasp a pointed object of appropriate size and shape, and move the point along a desired path. With more difficulty, most people can move an object that has no point (a mouse) so as to control a pointed marker whose movement traces a desired path. In either case, the path can be sensed and act as input to a computer.
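
As a minimal illustration (the sample points are invented), a drawn path might reach the computer as a sequence of timestamped positions, from which properties such as the total path length can be computed.

    # Sketch: a drawn stroke delivered to the computer as timestamped
    # (t, x, y) samples. The points are invented; a real system would get
    # them from a mouse, stylus, or touch surface.
    from math import hypot

    stroke = [(0.00, 10, 10), (0.02, 14, 13), (0.04, 20, 18), (0.06, 25, 21)]

    length = sum(hypot(x2 - x1, y2 - y1)
                 for (_, x1, y1), (_, x2, y2) in zip(stroke, stroke[1:]))
    duration = stroke[-1][0] - stroke[0][0]

    print(f"path length: {length:.1f} units drawn over {duration:.2f} seconds")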

We could add many examples to this list, and perhaps invent new ways to use actions to communicate. For example, HCC researchers are developing ways for people to use facial expression, or tone of voice, to create input forms.

HCC and social systems.

Humans are social animals: most things that people do are done in groups. This influences human-centered computing in a number of ways.

There are usually multiple people in a computational system. The designers of such systems therefore have to understand not just what individual people are likely to do, and can do, but what people working in groups will and can do. For example, sometimes systems fail because some users don't do things, like entering information into a database, that are needed to support other users. But sometimes, as in Wikipedia, people working strictly as volunteers put huge amounts of effort into really useful contributions. If you are designing a system for a lot of people to use, you have to try to understand what shapes contrasts like this. In our terms, we need to understand the cost structures of operations performed by groups, not just by individuals. Plainly, our ability to predict what will happen naturally (and hence cheaply) in a group is poor.

Decisions about system design have to reflect the needs and wants of groups of people, not just those of individuals. Historically, before computers became cheap enough for individuals to buy them, nearly all computational systems were created to meet the needs of organizations, especially businesses. Today, even though it is common for individuals to own one or more computers, facilities for communicating with other people, especially via the Internet, are crucial for almost all users. People need to communicate in many situations, and want to communicate in many others. The design of future computational systems will continue to be shaped strongly by these needs and wants. Think of MySpace or Facebook.

These considerations, too, shape the cost structures we are concerned with as designers of representational systems. Here we are seeing the value side of cost: systems need to deliver value to offset their costs.

HCC and particular user groups.

People have different capabilities. Because, as we've seen, representational systems are shaped by users’ capabilities, differences in capabilities need to be reflected in different designs. Assistive technology uses representations that are suited to the needs of people with limited vision, or limited ability to read text, to mention just two examples. Inclusive design seeks to create representational systems that can be used by people with the widest possible range of capabilities.

To come: representation and programs; philosophical perspectives.
