(The Basic Ingredient in Building a Mind)
It is probably dangerous to use this theory of information in fields for which it was not designed, but I think the danger will not keep people from using it.
—J. C. R. Licklider (1950)♦
MOST MATHEMATICAL THEORIES take shape slowly; Shannon’s information theory sprang forth like Athena, fully formed. Yet the little book of Shannon and Weaver drew scant public attention when it appeared in 1949. The first review came from a mathematician, Joseph L. Doob, who complained that it was more “suggestive” than mathematical—“and it is not always clear that the author’s mathematical intentions are honorable.”♦ A biology journal said, “At first glance, it might appear that this is primarily an engineering monograph with little or no application to human problems. Actually, the theory has some rather exciting implications.”♦ The Philosophical Review said it would be a mistake for philosophers to overlook this book: “Shannon develops a concept of information which, surprisingly enough, turns out to be an extension of the thermodynamic concept of entropy.”♦ The strangest review was barely a review at all: five paragraphs in Physics Today, September 1950, signed by Norbert Wiener, Massachusetts Institute of Technology.
Wiener began with a faintly patronizing anecdote:
Some fifteen years ago, a very bright young student came to the authorities at MIT with an idea for a theory of electric switching dependent on the algebra of logic. The student was Claude E. Shannon.
In the present book (Wiener continued), Shannon, along with Warren Weaver, “has summed up his views on communication engineering.”
The fundamental idea developed by Shannon, said Wiener, “is that of the amount of information as negative entropy.” He added that he himself—“the author of the present review”—had developed the same idea at about the same time.
Wiener declared the book to be work “whose origins were independent of my own work, but which has been bound from the beginning to my investigations by cross influences spreading in both directions.” He mentioned “those of us who have tried to pursue this analogy into the study of Maxwell’s demon” and added that much work remained to be done.
Then he suggested that the treatment of language was incomplete without greater emphasis on the human nervous system: “nervous reception and the transmission of language into the brain. I say these things not as a hostile criticism.”
Finally, Wiener concluded with a paragraph devoted to another new book: “my own Cybernetics.” Both books, he said, represent opening salvos in a field that promises to grow rapidly.
In my book, I have taken the privilege of an author to be more speculative, and to cover a wider range than Drs. Shannon and Weaver have chosen to do.… There is not only room, but a definite need for different books.
He saluted his colleagues for their well-worked and independent approach—to cybernetics.
Shannon, meanwhile, had already contributed a short review of Wiener’s book to the Proceedings of the Institute of Radio Engineers, offering praise that could be described as faint. It is “an excellent introduction,” he said.♦ There was a little tension between these men. It could be felt weighing down the long footnote that anchored the opening page of Weaver’s portion of The Mathematical Theory of Communication:
Dr. Shannon has himself emphasized that communication theory owes a great debt to Professor Norbert Wiener for most of its basic philosophy. Professor Wiener, on the other hand, points out that much of Shannon’s early work on switching and mathematical logic antedated his own interest in this field; and generously adds that Shannon certainly deserves credit for independent development of such fundamental aspects of the theory as the introduction of entropic ideas.
Shannon’s colleague John Pierce wrote later: “Wiener’s head was full of his own work.… Competent people have told me that Wiener, under the misapprehension that he already knew what Shannon had done, never actually found out.”♦
Cybernetics was a coinage, future buzzword, proposed field of study, would-be philosophical movement entirely conceived by this brilliant and prickly thinker. The word he took from the Greek for steersman: κυβερνήτης, kubernētēs, from which comes also (not coincidentally) the word governor.♦ He meant cybernetics to be a field that would synthesize the study of communication and control, also the study of human and machine. Norbert Wiener had first become known to the world as a curiosity: a sport, a prodigy, driven and promoted by his father, a professor at Harvard. “A lad who has been proudly termed by his friends the brightest boy in the world,” The New York Times reported on page 1 when he was fourteen years old, “will graduate next month from Tufts College.… Aside from the fact that Norbert Wiener’s capacity for learning is phenomenal, he is as other boys.… His intense black eyes are his most striking feature.”♦ When he wrote his memoirs, he always used the word prodigy in the titles: Ex-Prodigy: My Childhood and Youth and I Am a Mathematician: The Later Life of a Prodigy.
After Tufts (mathematics), Harvard graduate school (zoology), Cornell (philosophy), and Harvard again, Wiener left for Cambridge, England, where he studied symbolic logic and Principia Mathematica with Bertrand Russell himself. Russell was not entirely charmed. “An infant prodigy named Wiener, Ph.D. (Harvard), aged 18, turned up,” he wrote a friend. “The youth has been flattered, and thinks himself God Almighty—there is a perpetual contest between him and me as to which is to do the teaching.”♦ For his part, Wiener detested Russell: “He is an iceberg. His mind impresses one as a keen, cold, narrow logical machine, that cuts the universe into neat little packets, that measure, as it were, just three inches each way.”♦ On his return to the United States, Wiener joined the faculty of MIT in 1919, the same year as Vannevar Bush. When Shannon got there in 1936, he took one of Wiener’s mathematics courses. When war loomed, Wiener was one of the first to join the hidden, scattered teams of mathematicians working on antiaircraft fire control.
NORBERT WIENER (1956) (Illustration credit 8.1)
He was short and rotund, with heavy glasses and a Mephistophelian goatee. Where Shannon’s fire-control work drilled down to the signal amid the noise, Wiener stayed with the noise: swarming fluctuations in the radar receiver, unpredictable deviations in flight paths. The noise behaved statistically, he understood, like Brownian motion, the “extremely lively and wholly haphazard movement” that van Leeuwenhoek had observed through his microscope in the seventeenth century. Wiener had undertaken a thoroughgoing mathematical treatment of Brownian motion in the 1920s. The very discontinuity appealed to him—not just the particle trajectories but the mathematical functions, too, seemed to misbehave. This was, as he wrote, discrete chaos, a term that would not be well understood for several generations. On the fire-control project, where Shannon made a modest contribution to the Bell Labs team, Wiener and his colleague Julian Bigelow produced a legendary 120-page monograph, classified and known to the several dozen people allowed to see it as the Yellow Peril because of the color of its binder and the difficulty of its treatment. The formal title was Extrapolation, Interpolation, and Smoothing of Stationary Time Series. In it Wiener developed a statistical method for predicting the future from noisy, uncertain, and corrupted data about the past. It was too ambitious for the existing gun machinery, but he tested it on Vannevar Bush’s Differential Analyzer. Both the antiaircraft gun, with its operator, and the target airplane, with its pilot, were hybrids of machine and human. One had to predict the behavior of the other.
Wiener was as worldly as Shannon was reticent. He was well traveled and polyglot, ambitious and socially aware; he took science personally and passionately. His expression of the second law of thermodynamics, for example, was a cry of the heart:
We are swimming upstream against a great torrent of disorganization, which tends to reduce everything to the heat death of equilibrium and sameness.… This heat death in physics has a counterpart in the ethics of Kierkegaard, who pointed out that we live in a chaotic moral universe. In this, our main obligation is to establish arbitrary enclaves of order and system.… Like the Red Queen, we cannot stay where we are without running as fast as we can.♦
He was concerned for his place in intellectual history, and he aimed high. Cybernetics, he wrote in his memoirs, amounted to “a new interpretation of man, of man’s knowledge of the universe, and of society.”♦ Where Shannon saw himself as a mathematician and an engineer, Wiener considered himself foremost a philosopher, and from his fire-control work he drew philosophical lessons about purpose and behavior. If one defines behavior cleverly—“any change of an entity with respect to its surroundings”♦—then the word can apply to machines as well as animals. Behavior directed toward a goal is purposeful, and the purpose can sometimes be imputed to the machine rather than a human operator: for example, in the case of a target-seeking mechanism. “The term servomechanisms has been coined precisely to designate machines with an intrinsic purposeful behavior.” The key was control, or self-regulation.
To analyze it properly he borrowed an obscure term from electrical engineering: “feed-back,” the return of energy from a circuit’s output back to its input. When feedback is positive, as when the sound from loudspeakers is re-amplified through a microphone, it grows wildly out of control. But when feedback is negative—as in the original mechanical governor of steam engines, first analyzed by James Clerk Maxwell—it can guide a system toward equilibrium; it serves as an agent of stability. Feedback can be mechanical: the faster Maxwell’s governor spins, the wider its arms extend, and the wider its arms extend, the slower it must spin. Or it can be electrical. Either way, the key to the process is information. What governs the antiaircraft gun, for example, is information about the plane’s coordinates and about the previous position of the gun itself. Wiener’s friend Bigelow emphasized this: “that it was not some particular physical thing such as energy or length or voltage, but only information (conveyed by any means).”♦
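Feedback of this kind is easy to mimic in a few lines of code. The sketch below is a hypothetical illustration, not anything Wiener or Bigelow built: a simple proportional controller in which the only thing returned from output to input is information about the error, and that information alone steers the system toward equilibrium.

```python
# A minimal sketch of negative feedback, assuming a simple proportional
# controller; the gain and step count are illustrative, not historical.

def negative_feedback(target, state=0.0, gain=0.4, steps=12):
    """Drive `state` toward `target` using only the error signal."""
    history = [state]
    for _ in range(steps):
        error = target - state      # the information returned from output to input
        state += gain * error       # a correction proportional to the error
        history.append(state)
    return history

print([round(x, 2) for x in negative_feedback(target=10.0)])
# The values climb toward 10.0 and settle there: feedback as an agent of stability.
# With a gain above 2.0 each correction overshoots by more than the last error,
# and the sequence diverges instead of settling.
```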
Negative feedback must be ubiquitous, Wiener felt. He could see it at work in the coordination of eye and hand, guiding the nervous system of a person performing an action as ordinary as picking up a pencil. He focused especially on neurological disorders, maladies that disrupted physical coordination or language. He saw them quite specifically as cases of information feedback gone awry: varieties of ataxia, for example, where sense messages are either interrupted in the spinal cord or misinterpreted in the cerebellum. His analysis was detailed and mathematical, with equations—almost unheard of in neurology. Meanwhile, feedback control systems were creeping into factory assembly lines, because a mechanical system, too, can modify its own behavior. Feedback is the governor, the steersman.
So Cybernetics became the title of Wiener’s first book, published in the fall of 1948 in both the United States and France. Subtitle: Control and Communication in the Animal and the Machine. The book is a hodgepodge of notions and analysis, and, to the astonishment of its publishers, it became the year’s unexpected bestseller. The popular American news magazines, Time and Newsweek, both featured it. Wiener and cybernetics were identified with a phenomenon that was bursting into public consciousness just at that moment: computing machines. With the end of the war, a veil had been lifted from the first urgent projects in electronic calculation, particularly the ENIAC, a thirty-ton monster of vacuum tubes, relays, and hand-soldered wires stretching across eighty feet at the University of Pennsylvania’s electrical engineering school. It could store and multiply up to twenty numbers of ten decimal digits; the army used it to calculate artillery firing tables. The International Business Machines company, IBM, which provided punched card machines for the army projects, also built a giant calculating machine at Harvard, the Mark I. In Britain, still secret, the code breakers at Bletchley Park had gone on to build a vacuum-tube computing machine called the Colossus. Alan Turing was beginning work on another, at the University of Manchester. When the public learned about these machines, they were naturally thought of as “brains.” Everyone asked the same question: Can machines think?
“They are growing with fearful speed,” declared Time in its year-end issue. “They started by solving mathematical equations with flash-of-lightning rapidity. Now they are beginning to act like genuine mechanical brains.”♦ Wiener encouraged the speculation, if not the wild imagery:
Dr. Wiener sees no reason why they can’t learn from experience, like monstrous and precocious children racing through grammar school. One such mechanical brain, ripe with stored experience, might run a whole industry, replacing not only mechanics and clerks but many of the executives too.…
As men construct better calculating machines, explains Wiener, and as they explore their own brains, the two seem more & more alike. Man, he thinks, is recreating himself, monstrously magnified, in his own image.
Much of the success of his book, abstruse and ungainly as it was, lay in Wiener’s always returning his focus to the human, not the machine. He was not as interested in shedding light on the rise of computing—to which, in any case, his connections were peripheral—as in how computing might shed light on humanity. He cared profoundly, it turned out, about understanding mental disorders; about mechanical prostheses; and about the social dislocations that might follow the rise of smart machinery. He worried that it would devalue the human brain as factory machinery had devalued the human hand.
He developed the human-machine parallels in a chapter titled “Computing Machines and the Nervous System.” First he laid out a distinction between two types of computing machines: analog and digital, though he did not yet use those words. The first type, like the Bush Differential Analyzer, represented numbers as measurements on a continuous scale; they were analogy machines. The other kind, which he called numerical machines, represented numbers directly and exactly, as desk calculators did. Ideally, these devices would use the binary number system for simplicity. For advanced calculations they would need to employ a form of logic. What form? Shannon had answered that question in his master’s thesis of 1937, and Wiener offered the same answer:
the algebra of logic par excellence, or the Boolean algebra. This algorithm, like the binary arithmetic, is based on the dichotomy, the choice between yes and no, the choice between being in a class and outside.♦
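Shannon’s old thesis result, which Wiener is echoing here, can be shown in miniature: a relay circuit computes a Boolean function, with contacts in series acting as AND and contacts in parallel as OR, and from that dichotomy binary arithmetic follows. The sketch below is a generic illustration of the correspondence, not a reconstruction of any particular circuit.

```python
# Relays have two states; model them as booleans. Contacts in series pass
# current only if both are closed (AND); contacts in parallel pass it if
# either is closed (OR). From the yes/no dichotomy, binary arithmetic follows.

def series(a, b):        # two contacts in series: AND
    return a and b

def parallel(a, b):      # two contacts in parallel: OR
    return a or b

def half_adder(a, b):
    """Add two binary digits using nothing but switching logic."""
    total = parallel(a, b) and not series(a, b)   # exclusive-or: the sum bit
    carry = series(a, b)                          # the carry bit
    return total, carry

for a in (False, True):
    for b in (False, True):
        s, c = half_adder(a, b)
        print(int(a), "+", int(b), "= carry", int(c), "sum", int(s))
```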
The brain, too, he argued, is at least partly a logical machine. Where computers employ relays—mechanical, or electromechanical, or purely electrical—the brain has neurons. These cells tend to be in one of two states at any given moment: active (firing) or at rest (in repose). So they may be considered relays with two states. They are connected to one another in vast arrays, at points of contact known as synapses. They transmit messages. To store the messages, brains have memory; computing machines, too, need physical storage that can be called memory. (He knew well that this was a simplified picture of a complex system, that other sorts of messages, more analog than digital, seemed to be carried chemically by hormones.) Wiener suggested, too, that functional disorders such as “nervous breakdowns” might have cousins in electronics. Designers of computing machines might need to plan for untimely floods of data—perhaps the equivalent of “traffic problems and overloading in the nervous system.”♦
Brains and electronic computers both use quantities of energy in performing their work of logic—“all of which is wasted and dissipated in heat,” to be carried away by the blood or by ventilating and cooling apparatus. But this is really beside the point, Wiener said. “Information is information, not matter or energy. No materialism which does not admit this can survive at the present day.”
Now came a time of excitement.
“We are again in one of those prodigious periods of scientific progress—in its own way like the pre-Socratic period,” declared the gnomic, white-bearded neurophysiologist Warren McCulloch to a meeting of British philosophers. He told them that listening to Wiener and von Neumann put him in mind of the debates of the ancients. A new physics of communication had been born, he said, and metaphysics would never be the same: “For the first time in the history of science we know how we know and hence are able to state it clearly.”♦ He offered them heresy: that the knower was a computing machine, the brain composed of relays, perhaps ten billion of them, each receiving signals from other relays and sending them onward. The signals are quantized: they either happen or do not happen. So once again the stuff of the world, he said, turns out to be the atoms of Democritus—“indivisibles—leasts—which go batting about in the void.”
It is a world for Heraclitus, always “on the move.” I do not mean merely that every relay is itself being momentarily destroyed and re-created like a flame, but I mean that its business is with information which pours into it over many channels, passes through it, eddies within it and emerges again to the world.
That these ideas were spilling across disciplinary borders was due in large part to McCulloch, a dynamo of eclecticism and cross-fertilization. Soon after the war he began organizing a series of conferences at the Beekman Hotel on Park Avenue in New York City, with money from the Josiah Macy Jr. Foundation, endowed in the nineteenth century by heirs of Nantucket whalers. A host of sciences were coming of age all at once—so-called social sciences, like anthropology and psychology, looking for new mathematical footing; medical offshoots with hybrid names, like neurophysiology; not-quite-sciences like psychoanalysis—and McCulloch invited experts in all these fields, as well as mathematics and electrical engineering. He instituted a Noah’s Ark rule, inviting two of each species so that speakers would always have someone present who could see through their jargon.♦ Among the core group were the already famous anthropologist Margaret Mead and her then-husband Gregory Bateson, the psychologists Lawrence K. Frank and Heinrich Klüver, and that formidable, sometimes rivalrous pair of mathematicians, Wiener and von Neumann.
Mead, recording the proceedings in a shorthand no one else could read, said she broke a tooth in the excitement of the first meeting and did not realize it till afterward. Wiener told them that all these sciences, the social sciences especially, were fundamentally the study of communication, and that their unifying idea was the message.♦ The meetings began with the unwieldy name of Conferences for Circular Causal and Feedback Mechanisms in Biological and Social Systems and then, in deference to Wiener, whose new fame they enjoyed, changed that to Conference on Cybernetics. Throughout the conferences, it became habitual to use the new, awkward, and slightly suspect term information theory. Some of the disciplines were more comfortable than others. It was far from clear where information belonged in their respective worldviews.
The meeting in 1950, on March 22 and 23, began self-consciously. “The subject and the group have provoked a tremendous amount of external interest,” said Ralph Gerard, a neuroscientist from the University of Chicago’s medical school, “almost to the extent of a national fad. They have prompted extensive articles in such well known scientific magazines as Time, News-Week, and Life.”♦ He was referring, among others, to Time’s cover story earlier that winter titled “The Thinking Machine” and featuring Wiener:
Professor Wiener is a stormy petrel (he looks more like a stormy puffin) of mathematics and adjacent territory.… The great new computers, cried Wiener with mingled alarm and triumph, are … harbingers of a whole new science of communication and control, which he promptly named “cybernetics.” The newest machines, Wiener pointed out, already have an extraordinary resemblance to the human brain, both in structure and function. So far, they have no senses or “effectors” (arms and legs), but why shouldn’t they have?
It was true, Gerard said, that his field was being profoundly affected by new ways of thought from communications engineering—helping them think of a nerve impulse not just as a “physical-chemical event” but as a sign or a signal. So it was helpful to take lessons from “calculating machines and communications systems,” but it was dangerous, too.
To say, as the public press says, that therefore these machines are brains, and that our brains are nothing but calculating machines, is presumptuous. One might as well say that the telescope is an eye or that a bulldozer is a muscle.♦
Wiener felt he had to respond. “I have not been able to prevent these reports,” he said, “but I have tried to make the publications exercise restraint. I still do not believe that the use of the word ‘thinking’ in them is entirely to be reprehended.”♦
Gerard’s main purpose was to talk about whether the brain, with its mysterious architecture of neurons, branching dendrite trees, and complex interconnections alive within a chemical soup, could properly be described as analog or digital.♦ Gregory Bateson instantly interrupted: he still found this distinction confusing. It was a basic question. Gerard owed his own understanding to “the expert tutelage that I have received here, primarily from John von Neumann”—who was sitting right there—but Gerard took a stab at it anyway. Analog is a slide rule, where number is represented as distance; digital is an abacus, where you either count a bead or you do not; there’s nothing in between. A rheostat—light dimmer—is analog; a wall switch that snaps on or off, digital. Brain waves and neural chemistry, said Gerard, are analog.
Discussion ensued. Von Neumann had plenty to say. He had lately been developing a “game theory,” which he viewed effectively as a mathematics of incomplete information. And he was taking the lead in designing an architecture for the new electronic computers. He wanted the more analog-minded of the group to think more abstractly—to recognize that digital processes take place in a messy, continuous world but are digital nonetheless. When a neuron snaps between two possible states—“the state of the nerve cell with no message in it and the state of the cell with a message in it”♦—the chemistry of this transition may have intermediate shadings, but for theoretical purposes the shadings may be ignored. In the brain, he suggested, just as in a computer made of vacuum tubes, “these discrete actions are in reality simulated on the background of continuous processes.” McCulloch had just put this neatly in a new paper called “Of Digital Computers Called Brains”: “In this world it seems best to handle even apparent continuities as some numbers of some little steps.”♦ Remaining quiet in the audience was the new man in the group, Claude Shannon.
The next speaker was J. C. R. Licklider, an expert on speech and sound from the new Psycho-Acoustic Laboratory at Harvard, known to everyone as Lick. He was another young scientist with his feet in two different worlds—part psychologist and part electrical engineer. Later that year he moved to MIT, where he established a new psychology department within the department of electrical engineering. He was working on an idea for quantizing speech—taking speech waves and reducing them to the smallest quantities that could be reproduced by a “flip-flop circuit,” a homemade gadget made from twenty-five dollars of vacuum tubes, resistors, and capacitors.♦ It was surprising—even to people used to the crackling and hissing of telephones—how far speech could be reduced and still remain intelligible. Shannon listened closely, not just because he knew about the relevant telephone engineering but because he had dealt with the issues in his secret war work on audio scrambling. Wiener perked up, too, in part because of a special interest in prosthetic hearing aids.
When Licklider described some distortion as neither linear nor logarithmic but “halfway between,” Wiener interrupted.
“What does ‘halfway’ mean? X plus S over N?”
Licklider sighed. “Mathematicians are always doing that, taking me up on inexact statements.”♦ But he had no problem with the math and later offered an estimate for how much information—using Shannon’s new terminology—could be sent down a transmission line, given a certain bandwidth (5,000 cycles) and a certain signal-to-noise ratio (33 decibels), numbers that were realistic for commercial radio. “I think it appears that 100,000 bits of information can be transmitted through such a communication channel”—bits per second, he meant. That was a staggering number; by comparison, he calculated the rate of ordinary human speech this way: 10 phonemes per second, chosen from a vocabulary of 64 phonemes (2⁶, “to make it easy”—the logarithm of the number of choices is 6), so a rate of 60 bits per second. “This assumes that the phonemes are all equally probable—”
“Yes!” interrupted Wiener.♦
“—and of course they are not.”
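Licklider’s back-of-the-envelope arithmetic is easy to retrace. The sketch below follows the assumptions he states in the transcript (ten phonemes per second drawn from an alphabet of sixty-four, all equally probable); the skewed distribution at the end is purely made up, added only to show why Wiener’s objection matters: unequal probabilities can only lower the entropy below six bits per phoneme.

```python
import math

# Licklider's round numbers, as reported in the transcript.
phonemes_per_second = 10
alphabet_size = 64                               # 2**6, "to make it easy"

bits_per_phoneme = math.log2(alphabet_size)      # 6 bits if all are equally likely
print(phonemes_per_second * bits_per_phoneme)    # -> 60.0 bits per second

# Shannon's entropy for unequal probabilities, H = -sum(p * log2(p)), is at
# most log2(alphabet_size), so real speech carries less than 60 bits a second.
def entropy(probabilities):
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

skewed = [0.10] * 5 + [0.50 / 59] * 59           # a made-up, lopsided distribution
print(phonemes_per_second * entropy(skewed))     # noticeably below 60
```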
Wiener wondered whether anyone had tried a similar calculation for “compression for the eye,” for television. How much “real information” is necessary for intelligibility? Though he added, by the way: “I often wonder why people try to look at television.”
Margaret Mead had a different issue to raise. She did not want the group to forget that meaning can exist quite apart from phonemes and dictionary definitions. “If you talk about another kind of information,” she said, “if you are trying to communicate the fact that somebody is angry, what order of distortion might be introduced to take the anger out of a message that otherwise will carry exactly the same words?”♦
That evening Shannon took the floor. Never mind meaning, he said. He announced that, even though his topic was the redundancy of written English, he was not going to be interested in meaning at all.
He was talking about information as something transmitted from one point to another: “It might, for example, be a random sequence of digits, or it might be information for a guided missile or a television signal.”♦ What mattered was that he was going to represent the information source as a statistical process, generating messages with varying probabilities. He showed them the sample text strings he had used in The Mathematical Theory of Communication—which few of them had read—and described his “prediction experiment,” in which the subject guesses text letter by letter. He told them that English has a specific entropy, a quantity correlated with redundancy, and that he could use these experiments to compute the number. His listeners were fascinated—Wiener, in particular, thinking of his own “prediction theory.”
“My method has some parallelisms to this,” Wiener interrupted. “Excuse me for interrupting.”
There was a difference in emphasis between Shannon and Wiener. For Wiener, entropy was a measure of disorder; for Shannon, of uncertainty. Fundamentally, as they were realizing, these were the same. The more inherent order exists in a sample of English text—order in the form of statistical patterns, known consciously or unconsciously to speakers of the language—the more predictability there is, and in Shannon’s terms, the less information is conveyed by each subsequent letter. When the subject guesses the next letter with confidence, it is redundant, and the arrival of the letter contributes no new information. Information is surprise.
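A crude version of the quantity Shannon was computing can be estimated from letter frequencies alone. The sketch below is a first-order approximation only, blind to the longer-range patterns his guessing experiment exploited, but it shows the logic: the more skewed and predictable the statistics, the fewer bits each successive letter carries.

```python
import math
from collections import Counter

def bits_per_letter(text):
    """First-order estimate: H = -sum p(c) * log2 p(c) over letters and spaces."""
    symbols = [c for c in text.lower() if c.isalpha() or c == " "]
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

sample = ("the fundamental problem of communication is that of reproducing "
          "at one point either exactly or approximately a message selected "
          "at another point")

h = bits_per_letter(sample)
print(round(h, 2), "bits per letter")                    # well below log2(27) ~ 4.75
print("first-order redundancy:", round(1 - h / math.log2(27), 2))
# Shannon's guessing experiments, which capture far more structure than
# single-letter frequencies, pushed the per-letter figure down to roughly
# one bit -- English text is highly redundant.
```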
The others brimmed with questions about different languages, different prose styles, ideographic writing, and phonemes. One psychologist asked whether newspaper writing would look different, statistically, from the work of James Joyce. Leonard Savage, a statistician who worked with von Neumann, asked how Shannon chose a book for his test: at random?
“I just walked over to the shelf and chose one.”
“I wouldn’t call that random, would you?” said Savage. “There is a danger that the book might be about engineering.”♦ Shannon did not tell them that in point of fact it had been a detective novel.
Someone else wanted to know if Shannon could say whether baby talk would be more or less predictable than the speech of an adult.
“I think more predictable,” he replied, “if you are familiar with the baby.”
English is actually many different languages—as many, perhaps, as there are English speakers—each with different statistics. It also spawns artificial dialects: the language of symbolic logic, with its restricted and precise alphabet, and the language one questioner called “Airplanese,” employed by control towers and pilots. And language is in constant flux. Heinz von Foerster, a young physicist from Vienna and an early acolyte of Wittgenstein, wondered how the degree of redundancy in a language might change as the language evolved, and especially in the transition from oral to written culture.
Von Foerster, like Margaret Mead and others, felt uncomfortable with the notion of information without meaning. “I wanted to call the whole of what they called information theory signal theory,” he said later, “because information was not yet there. There were ‘beep beeps’ but that was all, no information. The moment one transforms that set of signals into other signals our brain can make an understanding of, then information is born—it’s not in the beeps.”♦ But he found himself thinking of the essence of language, its history in the mind and in the culture, in a new way. At first, he pointed out, no one is conscious of letters, or phonemes, as basic units of a language.
I’m thinking of the old Maya texts, the hieroglyphics of the Egyptians or the Sumerian tables of the first period. During the development of writing it takes some considerable time—or an accident—to recognize that a language can be split into smaller units than words, e.g., syllables or letters. I have the feeling that there is a feedback between writing and speaking.♦
The discussion changed his mind about the centrality of information. He added an epigrammatic note to his transcript of the eighth conference: “Information can be considered as order wrenched from disorder.”♦
Hard as Shannon tried to keep his listeners focused on his pure, meaning-free definition of information, this was a group that would not steer clear of semantic entanglements. They quickly grasped Shannon’s essential ideas, and they speculated far afield. “If we could agree to define as information anything which changes probabilities or reduces uncertainties,” remarked Alex Bavelas, a social psychologist, “changes in emotional security could be seen quite easily in this light.” What about gestures or facial expressions, pats on the back or winks across the table? As the psychologists absorbed this artificial way of thinking about signals and the brain, their whole discipline stood on the brink of a radical transformation.
Ralph Gerard, the neuroscientist, was reminded of a story. A stranger is at a party of people who know one another well. One says, “72,” and everyone laughs. Another says, “29,” and the party roars. The stranger asks what is going on.
His neighbor said, “We have many jokes and we have told them so often that now we just use a number.” The guest thought he’d try it, and after a few words said, “63.” The response was feeble. “What’s the matter, isn’t this a joke?”
“Oh, yes, that is one of our very best jokes, but you did not tell it well.”♦
The next year Shannon returned with a robot. It was not a very clever robot, nor lifelike in appearance, but it impressed the cybernetics group. It solved mazes. They called it Shannon’s rat.
He wheeled out a cabinet with a five-by-five grid on its top panel. Partitions could be placed around and between any of the twenty-five squares to make mazes in different configurations. A pin could be placed in any square to serve as the goal, and moving around the maze was a sensing rod driven by a pair of little motors, one for east-west and one for north-south. Under the hood lay an array of electrical relays, about seventy-five of them, interconnected, switching on and off to form the robot’s “memory.” Shannon flipped the switch to power it up.
“When the machine was turned off,” he said, “the relays essentially forgot everything they knew, so that they are now starting afresh, with no knowledge of the maze.” His listeners were rapt. “You see the finger now exploring the maze, hunting for the goal. When it reaches the center of a square, the machine makes a new decision as to the next direction to try.”♦ When the rod hit a partition, the motors reversed and the relays recorded the event. The machine made each “decision” based on its previous “knowledge”—it was impossible to avoid these psychological words—according to a strategy Shannon had designed. It wandered about the space by trial and error, turning down blind alleys and bumping into walls. Finally, as they all watched, the rat found the goal, a bell rang, a lightbulb flashed on, and the motors stopped.
Then Shannon put the rat back at the starting point for a new run. This time it went directly to the goal without making any wrong turns or hitting any partitions. It had “learned.” Placed in other, unexplored parts of the maze, it would revert to trial and error until, eventually, “it builds up a complete pattern of information and is able to reach the goal directly from any point.”♦
To carry out the exploring and goal-seeking strategy, the machine had to store one piece of information for each square it visited: namely, the direction by which it last left the square. There were only four possibilities—north, west, south, east—so, as Shannon carefully explained, two relays were assigned as memory for each square. Two relays meant two bits of information, enough for a choice among four alternatives, because there were four possible states: off-off, off-on, on-off, and on-on.
Next Shannon rearranged the partitions so that the old solution would no longer work. The machine would then “fumble around” till it found a new solution. Sometimes, however, a particularly awkward combination of previous memory and a new maze would put the machine in an endless loop. He showed them: “When it arrives at A, it remembers that the old solution said to go to B, and so it goes around the circle, A, B, C, D, A, B, C, D. It has established a vicious circle, or a singing condition.”♦
“A neurosis!” said Ralph Gerard.
Shannon added “an antineurotic circuit”: a counter, set to break out of the loop when the machine repeated the same sequence six times. Leonard Savage saw that this was a bit of a cheat. “It doesn’t have any way to recognize that it is ‘psycho’—it just recognizes that it has been going too long?” he asked. Shannon agreed.
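The strategy Shannon describes can be sketched in miniature. The program below is a loose, hypothetical reconstruction from the account above rather than his relay design: a five-by-five grid, two bits of memory per square holding the direction last taken, trial-and-error exploration, and a step counter standing in for the antineurotic circuit that forces re-exploration when following memory runs on too long.

```python
# A loose, hypothetical sketch of the maze-solver's strategy as described
# above, not Shannon's relay design. Each square remembers one of four
# directions (two bits). Exploring, the machine advances that memory until
# an open exit is found; repeating, it simply follows what it remembers.
# A step counter stands in for the "antineurotic circuit": if following
# memory runs on too long without reaching the goal, it re-explores.

DIRS = [("N", (0, -1)), ("E", (1, 0)), ("S", (0, 1)), ("W", (-1, 0))]

def open_move(pos, i, walls):
    dx, dy = DIRS[i][1]
    nxt = (pos[0] + dx, pos[1] + dy)
    if 0 <= nxt[0] < 5 and 0 <= nxt[1] < 5 and frozenset((pos, nxt)) not in walls:
        return nxt
    return None

def run(start, goal, walls, memory, explore, limit=2000):
    pos, steps = start, 0
    while pos != goal and steps < limit:
        if not explore and steps > 25:        # the antineurotic counter trips
            explore = True
        i = memory.get(pos, 0)
        if explore:                           # rotate until an exit is open
            i = (i + 1) % 4                   # (assumes no square is fully walled in)
            while open_move(pos, i, walls) is None:
                i = (i + 1) % 4
        memory[pos] = i                       # remember the direction last taken
        nxt = open_move(pos, i, walls)
        pos = nxt if nxt is not None else pos
        steps += 1
    return steps

walls = {frozenset({(1, 0), (1, 1)}), frozenset({(2, 2), (2, 3)})}   # a made-up maze
memory = {}
print("first run (exploring):", run((0, 0), (4, 4), walls, memory, explore=True), "steps")
print("second run (from memory):", run((0, 0), (4, 4), walls, memory, explore=False), "steps")
```

On the first run the finger wanders; on the second it follows the stored directions straight to the goal, and rearranging the walls sends it back to fumbling once the counter trips.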
SHANNON AND HIS MAZE (Illustration credit 8.2)
“It is all too human,” remarked Lawrence K. Frank.
“George Orwell should have seen this,” said Henry Brosin, a psychiatrist.
A peculiarity of the way Shannon had organized the machine’s memory—associating a single direction with each square—was that the path could not be reversed. Having reached the goal, the machine did not “know” how to return to its origin. The knowledge, such as it was, emerged from what Shannon called the vector field, the totality of the twenty-five directional vectors. “You can’t say where the sensing finger came from by studying the memory,” he explained.
“Like a man who knows the town,” said McCulloch, “so he can go from any place to any other place, but doesn’t always remember how he went.”♦
Shannon’s rat was kin to Babbage’s silver dancer and the metal swans and fishes of Merlin’s Mechanical Museum: automata performing a simulation of life. They never failed to amaze and entertain. The dawn of the information age brought a whole new generation of synthetic mice, beetles, and turtles, made with vacuum tubes and then transistors. They were crude, almost trivial, by the standards of just a few years later. In the case of the rat, the creature’s total memory amounted to seventy-five bits. Yet Shannon could fairly claim that it solved a problem by trial and error; retained the solution and repeated it without the errors; integrated new information from further experience; and “forgot” the solution when circumstances changed. The machine was not only imitating lifelike behavior; it was performing functions previously reserved for brains.
One critic, Dennis Gabor, a Hungarian electrical engineer who later won the Nobel Prize for inventing holography, complained, “In reality it is the maze which remembers, not the mouse.”♦ This was true up to a point. After all, there was no mouse. The electrical relays could have been placed anywhere, and they held the memory. They became, in effect, a mental model of a maze—a theory of a maze.
The postwar United States was hardly the only place where biologists and neuroscientists were suddenly making common cause with mathematicians and electrical engineers—though Americans sometimes talked as though it was. Wiener, who recounted his travels to other countries at some length in his introduction to Cybernetics, wrote dismissively that in England he had found researchers to be “well-informed” but that not much progress had been made “in unifying the subject and in pulling the various threads of research together.”♦ New cadres of British scientists began coalescing in response to information theory and cybernetics in 1949—mostly young, with fresh experience in code breaking, radar, and gun control. One of their ideas was to form a dining club in the English fashion—“limited membership and a post-prandial situation,” proposed John Bates, a pioneer in electroencephalography. This required considerable discussion of names, membership rules, venues, and emblems. Bates wanted electrically inclined biologists and biologically oriented engineers and suggested “about fifteen people who had Wiener’s ideas before Wiener’s book appeared.”♦ They met for the first time in the basement of the National Hospital for Nervous Diseases, in Bloomsbury, and decided to call themselves the Ratio Club—a name meaning whatever anyone wanted. (Their chroniclers Philip Husbands and Owen Holland, who interviewed many of the surviving members, report that half pronounced it RAY-she-oh and half RAT-ee-oh.♦) For their first meeting they invited Warren McCulloch.
They talked not just about understanding brains but “designing” them. A psychiatrist, W. Ross Ashby, announced that he was working on the idea that “a brain consisting of randomly connected impressional synapses would assume the required degree of orderliness as a result of experience”♦—in other words, that the mind is a self-organizing dynamical system. Others wanted to talk about pattern recognition, about noise in the nervous system, about robot chess and the possibility of mechanical self-awareness. McCulloch put it this way: “Think of the brain as a telegraphic relay, which, tripped by a signal, emits another signal.” Relays had come a long way since Morse’s time. “Of the molecular events of brains these signals are the atoms. Each goes or does not go.” The fundamental unit is a choice, and it is binary. “It is the least event that can be true or false.”♦
They also managed to attract Alan Turing, who published his own manifesto with a provocative opening statement—“I propose to consider the question, ‘Can machines think?’ ”♦—followed by a sly admission that he would do so without even trying to define the terms machine and think. His idea was to replace the question with a test called the Imitation Game, destined to become famous as the “Turing Test.” In its initial form the Imitation Game involves three people: a man, a woman, and an interrogator. The interrogator sits in a room apart and poses questions (ideally, Turing suggests, by way of a “teleprinter communicating between the two rooms”). The interrogator aims to determine which is the man and which is the woman. One of the two—say, the man—aims to trick the interrogator, while the other aims to help reveal the truth. “The best strategy for her is probably to give truthful answers,” Turing suggests. “She can add such things as ‘I am the woman, don’t listen to him!’ but it will avail nothing as the man can make similar remarks.”
But what if the question is not which gender but which genus: human or machine?
It is understood that the essence of being human lies in one’s “intellectual capacities”; hence this game of disembodied messages transmitted blindly between rooms. “We do not wish to penalise the machine for its inability to shine in beauty competitions,” says Turing dryly, “nor to penalise a man for losing in a race against an aeroplane.” Nor, for that matter, for slowness in arithmetic. Turing offers up some imagined questions and answers:
Q: Please write me a sonnet on the subject of the Forth Bridge.
A: Count me out on this one. I never could write poetry.
Before proceeding further, however, he finds it necessary to explain just what sort of machine he has in mind. “The present interest in ‘thinking machines,’ ” he notes, “has been aroused by a particular kind of machine, usually called an ‘electronic computer’ or ‘digital computer.’ ”♦ These devices do the work of human computers, faster and more reliably. Turing spells out, as Shannon had not, the nature and properties of the digital computer. John von Neumann had done this, too, in constructing a successor machine to ENIAC. The digital computer comprises three parts: a “store of information,” corresponding to the human computer’s memory or paper; an “executive unit,” which carries out individual operations; and a “control,” which manages a list of instructions, making sure they are carried out in the right order. These instructions are encoded as numbers. They are sometimes called a “programme,” Turing explains, and constructing such a list may be called “programming.”
The idea is an old one, Turing says, and he cites Charles Babbage, whom he identifies as Lucasian Professor of Mathematics at Cambridge from 1828 to 1839—once so famous, now almost forgotten. Turing explains that Babbage “had all the essential ideas” and “planned such a machine, called the Analytical Engine, but it was never completed.” It would have used wheels and cards—nothing to do with electricity. The existence (or nonexistence, but at least near existence) of Babbage’s engine allows Turing to rebut a superstition he senses forming in the zeitgeist of 1950. People seem to feel that the magic of digital computers is essentially electrical; meanwhile, the nervous system is also electrical. But Turing is at pains to think of computation in a universal way, which means in an abstract way. He knows it is not about electricity at all:
Since Babbage’s machine was not electrical, and since all digital computers are in a sense equivalent, we see that this use of electricity cannot be of theoretical importance.… The feature of using electricity is thus seen to be only a very superficial similarity.♦
Turing’s famous computer was a machine made of logic: imaginary tape, arbitrary symbols. It had all the time in the world and unbounded memory, and it could do anything expressible in steps and operations. It could even judge the validity of a proof in the system of Principia Mathematica. “In the case that the formula is neither provable nor disprovable such a machine certainly does not behave in a very satisfactory manner, for it continues to work indefinitely without producing any result at all, but this cannot be regarded as very different from the reaction of the mathematicians.”♦ So Turing supposed it could play the Imitation Game.
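The imaginary machine itself fits in a few lines: a tape of arbitrary symbols, a handful of states, and a table of steps. The sketch below runs a trivial made-up table that flips a string of binary digits, just to make “steps and operations” concrete; it is not one of Turing’s own examples.

```python
# A minimal Turing machine: a table maps (state, symbol) to (write, move, next
# state). The flip-the-bits table below is an invented example.

from collections import defaultdict

def turing(tape, table, state="start"):
    cells = defaultdict(lambda: "_", enumerate(tape))   # the tape, blank beyond the input
    head = 0
    while state != "halt":
        write, move, state = table[(state, cells[head])]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells))

table = {
    ("start", "0"): ("1", "R", "start"),   # flip the symbol and move right
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),    # a blank square: stop
}
print(turing("0110100", table))            # -> 1001011_
```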
He could not pretend to prove that, of course. He was mainly trying to change the terms of a debate he considered largely fatuous. He offered a few predictions for the half century to come: that computers would have a storage capacity of 10⁹ bits (he imagined a few very large computers; he did not foresee our future of ubiquitous tiny computing devices with storage many magnitudes greater than that); and that they might be programmed to play the Imitation Game well enough to fool some interrogators for at least a few minutes (true, as far as it goes).
The original question, “Can machines think?” I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.♦
He did not live to see how apt his prophecy was. In 1952 he was arrested for the crime of homosexuality, tried, convicted, stripped of his security clearance, and subjected by the British authorities to a humiliating, emasculating program of estrogen injections. In 1954 he took his own life.
Until years later, few knew of Turing’s crucial secret work for his country on the Enigma project at Bletchley Park. His ideas of thinking machines did attract attention, on both sides of the Atlantic. Some of the people who found the notion absurd or even frightening appealed to Shannon for his opinion; he stood squarely with Turing. “The idea of a machine thinking is by no means repugnant to all of us,” Shannon told one engineer. “In fact, I find the converse idea, that the human brain may itself be a machine which could be duplicated functionally with inanimate objects, quite attractive.” More useful, anyway, than “hypothecating intangible and unreachable ‘vital forces,’ ‘souls’ and the like.”♦
Computer scientists wanted to know what their machines could do. Psychologists wanted to know whether brains are computers—or perhaps whether brains are merely computers. At midcentury computer scientists were new; but so, in their way, were psychologists.
Psychology at midcentury had grown moribund. Of all the sciences, it always had the most difficulty in saying what exactly it studied. Originally its object was the soul, as opposed to the body (somatology) and the blood (hematology). “Psychologie is a doctrine which searches out man’s Soul, and the effects of it; this is the part without which a man cannot consist,”♦ wrote James de Back in the seventeenth century. Almost by definition, though, the soul was ineffable—hardly a thing to be known. Complicating matters further was the entanglement (in psychology as in no other field) of the observer with the observed. In 1854, when it was still more likely to be called “mental philosophy,” David Brewster lamented that no other department of knowledge had made so little progress as “the science of mind, if it can be called a science.”♦
Viewed as material by one inquirer, as spiritual by another, and by others as mysteriously compounded as both, the human mind escapes from the cognisance of sense and reason, and lies, a waste field with a northern exposure, upon which every passing speculator casts his mental tares.
The passing speculators were still looking mainly inward, and the limits of introspection were apparent. Looking for rigor, verifiability, and perhaps even mathematicization, students of the mind veered in radically different directions by the turn of the twentieth century. Sigmund Freud’s path was only one. In the United States, William James constructed a discipline of psychology almost single-handed—professor of the first university courses, author of the first comprehensive textbook—and when he was done, he threw up his hands. His own Principles of Psychology, he wrote, was “a loathsome, distended, tumefied, bloated, dropsical mass, testifying to but two facts: 1st, that there is no such thing as a science of psychology, and 2nd, that WJ is an incapable.”♦
In Russia, a new strain of psychology began with a physiologist, Ivan Petrovich Pavlov, known for his Nobel Prize–winning study of digestion, who scorned the word psychology and all its associated terminology. James, in his better moods, considered psychology the science of mental life, but for Pavlov there was no mind, only behavior. Mental states, thoughts, emotions, goals, and purpose—all these were intangible, subjective, and out of reach. They bore the taint of religion and superstition. What James had identified as central topics—“the stream of thought,” “the consciousness of self,” the perception of time and space, imagination, reasoning, and will—had no place in Pavlov’s laboratory. All a scientist could observe was behavior, and this, at least, could be recorded and measured. The behaviorists, particularly John B. Watson in the United States and then, most famously, B. F. Skinner, made a science based on stimulus and response: food pellets, bells, electric shocks; salivation, lever pressing, maze running. Watson said that the whole purpose of psychology was to predict what responses would follow a given stimulus and what stimuli could produce a given behavior. Between stimulus and response lay a black box, known to be composed of sense organs, neural pathways, and motor functions, but fundamentally off limits. In effect, the behaviorists were saying yet again that the soul is ineffable. For a half century, their research program thrived because it produced results about conditioning reflexes and controlling behavior.
Behaviorists said, as the psychologist George Miller put it afterward: “You talk about memory; you talk about anticipation; you talk about your feelings; you talk about all these mentalistic things. That’s moonshine. Show me one, point to one.”♦ They could teach pigeons to play ping-pong and rats to run mazes. But by midcentury, frustration had set in. The behaviorists’ purity had become a dogma; their refusal to consider mental states became a cage, and psychologists still wanted to understand what the mind was.
Information theory gave them a way in. Scientists analyzed the processing of information and built machines to do it. The machines had memory. They simulated learning and goal seeking. A behaviorist running a rat through a maze would discuss the association between stimulus and response but would refuse to speculate in any way about the mind of the rat; now engineers were building mental models of rats out of a few electrical relays. They were not just prying open the black box; they were making their own. Signals were being transmitted, encoded, stored, and retrieved. Internal models of the external world were created and updated. Psychologists took note. From information theory and cybernetics, they received a set of useful metaphors and even a productive conceptual framework. Shannon’s rat could be seen not only as a very crude model of the brain but also as a theory of behavior. Suddenly psychologists were free to talk about plans, algorithms, syntactic rules. They could investigate not just how living creatures react to the outside world but how they represent it to themselves.
Shannon’s formulation of information theory seemed to invite researchers to look in a direction that he himself had not intended. He had declared, “The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.” A psychologist could hardly fail to consider the case where the source of the message is the outside world and the receiver is the mind.
Ears and eyes were to be understood as message channels, so why not test and measure them like microphones and cameras? “New concepts of the nature and measure of information,” wrote Homer Jacobson, a chemist at Hunter College in New York, “have made it possible to specify quantitatively the informational capacity of the human ear,”♦ and he proceeded to do so. Then he did the same for the eye, arriving at an estimate four hundred times greater, in bits per second. Many more subtle kinds of experiments were suddenly fair game, some of them directly suggested by Shannon’s work on noise and redundancy. A group in 1951 tested the likelihood that listeners would hear a word correctly when they knew it was one of just a few alternatives, as opposed to many alternatives.♦ It seemed obvious but had never been done. Experimenters explored the effect of trying to understand two conversations at once. They began considering how much information an ensemble of items contained—digits or letters or words—and how much could be understood or remembered. In standard experiments, with speech and buzzers and key pressing and foot tapping, the language of stimulus and response began to give way to transmission and reception of information.
For a brief period, researchers discussed the transition explicitly; later it became invisible. Donald Broadbent, an English experimental psychologist exploring issues of attention and short-term memory, wrote of one experiment in 1958: “The difference between a description of the results in terms of stimulus and response, and a description in information theory terms, becomes most marked.… One could no doubt develop an adequate description of the results in S-R terms … but such a description is clumsy compared to the information theory description.”♦ Broadbent founded an applied psychology division at Cambridge University, and a flood of research followed, there and elsewhere, in the general realm of how people handle information: effects of noise on performance; selective attention and filtering of perception; short-term and long-term memory; pattern recognition; problem solving. And where did logic belong? To psychology or to computer science? Surely not just to philosophy.
An influential counterpart of Broadbent’s in the United States was George Miller, who helped found the Center for Cognitive Studies at Harvard in 1960. He was already famous for a paper published in 1956 under the slightly whimsical title “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information.”♦ Seven seemed to be the number of items that most people could hold in working memory at any one time: seven digits (the typical American telephone number of the time), seven words, or seven objects displayed by an experimental psychologist. The number also kept popping up, Miller claimed, in other sorts of experiments. Laboratory subjects were fed sips of water with different amounts of salt, to see how many different levels of saltiness they could discriminate. They were asked to detect differences between tones of varying pitch or loudness. They were shown random patterns of dots, flashed on a screen, and asked how many (below seven, they almost always knew; above seven, they almost always estimated). In one way and another, the number seven kept recurring as a threshold. “This number assumes a variety of disguises,” he wrote, “being sometimes a little larger and sometimes a little smaller than usual, but never changing so much as to be unrecognizable.”
Clearly this was a crude simplification of some kind; as Miller noted, people can identify any of thousands of faces or words and can memorize long sequences of symbols. To see what kind of simplification, he turned to information theory, and especially to Shannon’s understanding of information as a selection among possible alternatives. “The observer is considered to be a communication channel,” he announced—a formulation sure to appall the behaviorists who dominated the profession. Information is being transmitted and stored—information about loudness, or saltiness, or number. He explained about bits:
One bit of information is the amount of information that we need to make a decision between two equally likely alternatives. If we must decide whether a man is less than six feet tall or more than six feet tall and if we know that the chances are 50-50, then we need one bit of information.…
Two bits of information enable us to decide among four equally likely alternatives. Three bits of information enable us to decide among eight equally likely alternatives … and so on. That is to say, if there are 32 equally likely alternatives, we must make five successive binary decisions, worth one bit each, before we know which alternative is correct. So the general rule is simple: every time the number of alternatives is increased by a factor of two, one bit of information is added.
The magical number seven is really just under three bits. Simple experiments measured discrimination, or channel capacity, in a single dimension; more complex measures arise from combinations of variables in multiple dimensions—for example, size, brightness, and hue. And people perform acts of what information theorists call “recoding,” grouping information into larger and larger chunks—for example, organizing telegraph dots and dashes into letters, letters into words, and words into phrases. By now Miller’s argument had become something in the nature of a manifesto. Recoding, he declared, “seems to me to be the very lifeblood of the thought processes.”
The concepts and measures provided by the theory of information provide a quantitative way of getting at some of these questions. The theory provides us with a yardstick for calibrating our stimulus materials and for measuring the performance of our subjects.… Informational concepts have already proved valuable in the study of discrimination and of language; they promise a great deal in the study of learning and memory; and it has even been proposed that they can be useful in the study of concept formation. A lot of questions that seemed fruitless twenty or thirty years ago may now be worth another look.
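Miller’s “general rule” is the base-2 logarithm in disguise: the information needed to single out one of N equally likely alternatives is log₂ N bits. A minimal sketch (Python, purely illustrative; nothing like it appears in Miller’s paper) makes the arithmetic concrete:

```python
import math

def bits_needed(alternatives: int) -> float:
    """Bits required to single out one of N equally likely alternatives."""
    return math.log2(alternatives)

print(bits_needed(2))   # 1.0   -- one yes/no decision
print(bits_needed(32))  # 5.0   -- five successive binary decisions
print(bits_needed(7))   # ~2.81 -- the "magical number seven," just under three bits
```

Doubling the number of alternatives adds exactly one bit, which is Miller’s rule restated.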
This was the beginning of the movement called the cognitive revolution in psychology, and it laid the foundation for the discipline called cognitive science, combining psychology, computer science, and philosophy. Looking back, some philosophers have called this moment the informational turn. “Those who take the informational turn see information as the basic ingredient in building a mind,” writes Frederick Adams. “Information has to contribute to the origin of the mental.”♦ As Miller himself liked to say, the mind came in on the back of the machine.♦
Shannon was hardly a household name—he never did become famous to the general public—but he had gained an iconic stature in his own academic communities, and sometimes he gave popular talks about “information” at universities and museums. He would explain the basic ideas; puckishly quote Matthew 5:37, “Let your communication be, Yea, yea; Nay, nay: for whatsoever is more than these cometh of evil” as a template for the notions of bits and of redundant encoding; and speculate about the future of computers and automata. “Well, to conclude,” he said at the University of Pennsylvania, “I think that this present century in a sense will see a great upsurge and development of this whole information business; the business of collecting information and the business of transmitting it from one point to another, and perhaps most important of all, the business of processing it.”♦
With psychologists, anthropologists, linguists, economists, and all sorts of social scientists climbing aboard the bandwagon of information theory, some mathematicians and engineers were uncomfortable. Shannon himself called it a bandwagon. In 1956 he wrote a short warning notice—four paragraphs: “Our fellow scientists in many different fields, attracted by the fanfare and by the new avenues opened to scientific analysis, are using these ideas in their own problems.… Although this wave of popularity is certainly pleasant and exciting for those of us working in the field, it carries at the same time an element of danger.”♦ Information theory was in its hard core a branch of mathematics, he reminded them. He, personally, did believe that its concepts would prove useful in other fields, but not everywhere, and not easily: “The establishing of such applications is not a trivial matter of translating words to a new domain, but rather the slow tedious process of hypothesis and experimental verification.” Furthermore, he felt the hard slogging had barely begun in “our own house.” He urged more research and less exposition.
As for cybernetics, the word began to fade. The Macy cyberneticians held their last meeting in 1953, at the Nassau Inn in Princeton; Wiener had fallen out with several of the group, who were barely speaking to him. Given the task of summing up, McCulloch sounded wistful. “Our consensus has never been unanimous,” he said. “Even had it been so, I see no reason why God should have agreed with us.”♦
Throughout the 1950s, Shannon remained the intellectual leader of the field he had founded. His research produced dense, theorem-packed papers, pregnant with possibilities for development, laying foundations for broad fields of study. What Marshall McLuhan later called the “medium” was for Shannon the channel, and the channel was subject to rigorous mathematical treatment. The applications were immediate and the results fertile: broadcast channels and wiretap channels, noisy and noiseless channels, Gaussian channels, channels with input constraints and cost constraints, channels with feedback and channels with memory, multiuser channels and multiaccess channels. (When McLuhan announced that the medium was the message, he was being arch. The medium is both opposite to, and entwined with, the message.)
CLAUDE SHANNON (1963) (Illustration credit 8.3)
One of Shannon’s essential results, the noisy-channel coding theorem, grew in importance, showing that error correction can effectively counter noise and corruption. At first this was just a tantalizing theoretical nicety; error correction requires computation, which was not yet cheap. But during the 1950s, work on error-correcting methods began to fulfill Shannon’s promise, and the need for them became apparent. One application was the exploration of space with rockets and satellites; they needed to send messages very long distances with limited power. Coding theory became a crucial part of computer science, with error correction and data compression advancing side by side. Without it, modems, CDs, and digital television would not exist. For mathematicians interested in random processes, coding theorems are also statements about entropy.
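The simplest way to see how redundancy defeats noise (far cruder than the codes Shannon’s theorem actually promises, and offered here only as a toy sketch) is a repetition code: transmit every bit three times and let the receiver take a majority vote.

```python
import random

def encode(bits):
    """Triple-repetition code: send each bit three times."""
    return [b for b in bits for _ in range(3)]

def noisy_channel(bits, flip_prob=0.1):
    """Flip each transmitted bit independently with probability flip_prob."""
    return [b ^ 1 if random.random() < flip_prob else b for b in bits]

def decode(received):
    """Majority vote over each group of three received bits."""
    return [1 if sum(received[i:i + 3]) >= 2 else 0
            for i in range(0, len(received), 3)]

message = [random.randint(0, 1) for _ in range(1000)]
recovered = decode(noisy_channel(encode(message)))
errors = sum(m != r for m, r in zip(message, recovered))
print(f"residual errors: {errors} of {len(message)}")
```

A 10 percent raw error rate falls to roughly 3 percent, at the cost of tripling the transmission; Shannon’s theorem guarantees far better trade-offs, but the principle of spending redundancy to buy reliability is the same.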
Shannon, meanwhile, made other theoretical advances that planted seeds for future computer design. One discovery showed how to maximize flow through a network of many branches, where the network could be a communication channel or a railroad or a power grid or water pipes. Another was aptly titled “Reliable Circuits Using Crummy Relays” (though this was changed for publication to “… Less Reliable Relays”).♦ He studied switching functions, rate-distortion theory, and differential entropy. All this was invisible to the public, but the seismic tremors that came with the dawn of computing were felt widely, and Shannon was part of that, too.
As early as 1948 he completed the first paper on a problem that he said, “of course, is of no importance in itself”♦: how to program a machine to play chess. People had tried this before, beginning in the eighteenth and nineteenth centuries, when various chess automata toured Europe and were revealed every so often to have small humans hiding inside. In 1910 the Spanish mathematician and tinkerer Leonardo Torres y Quevedo built a real chess machine, entirely mechanical, called El Ajedrecista, that could play a simple three-piece endgame, king and rook against king.
Shannon now showed that computers performing numerical calculations could be made to play a full chess game. As he explained, these devices, “containing several thousand vacuum tubes, relays, and other elements,” retained numbers in “memory,” and a clever process of translation could make these numbers represent the squares and pieces of a chessboard. The principles he laid out have been employed in every chess program since. In these salad days of computing, many people immediately assumed that chess would be solved: fully known, in all its pathways and combinations. They thought a fast electronic computer would play perfect chess, just as they thought it would make reliable long-term weather forecasts. Shannon made a rough calculation, however, and suggested that the number of possible chess games was more than 10¹²⁰—a number that dwarfs the age of the universe in nanoseconds. So computers cannot play chess by brute force; they must reason, as Shannon saw, along something like human lines.
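The comparison is easy to check by hand (a back-of-the-envelope sketch; the 13.8-billion-year figure for the universe’s age is a modern value, not one Shannon used):

```python
# Rough scale check (assumption: the universe is ~13.8 billion years old).
SECONDS_PER_YEAR = 365.25 * 24 * 3600        # ~3.2e7 seconds
age_ns = 13.8e9 * SECONDS_PER_YEAR * 1e9     # age of the universe in nanoseconds, ~4.4e26
shannon_estimate = 10.0 ** 120               # Shannon's rough count of possible games

print(f"age of the universe: ~{age_ns:.1e} ns")
print(f"possible chess games: ~{shannon_estimate:.1e}")
print(f"ratio: ~{shannon_estimate / age_ns:.1e}")
```

Even examining one game per nanosecond since the big bang would exhaust only an infinitesimal fraction of the possibilities.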
He visited the American champion Edward Lasker in his apartment on East Twenty-third Street in New York, and Lasker offered suggestions for improvement.♦ When Scientific American published a simplified version of his paper in 1950, Shannon could not resist raising the question on everyone’s minds: “Does a chess-playing machine of this type ‘think’?”
From a behavioristic point of view, the machine acts as though it were thinking. It has always been considered that skillful chess play requires the reasoning faculty. If we regard thinking as a property of external actions rather than internal method, the machine is surely thinking.
Nonetheless, as of 1952 he estimated that it would take three programmers working six months to enable a large-scale computer to play even a tolerable amateur game. “The problem of a learning chess player is even farther in the future than a preprogrammed type. The methods which have been suggested are obviously extravagantly slow. The machine would wear out before winning a single game.”♦ The point, though, was to look in as many directions as possible for what a general-purpose computer could do.
He was exercising his sense of whimsy, too. He designed and actually built a machine to do arithmetic with Roman numerals: for example, IV times XII equals XLVIII. He dubbed this THROBAC I, an acronym for Thrifty Roman-numeral Backward-looking Computer. He created a “mind-reading machine” meant to play the child’s guessing game of odds and evens. What all these flights of fancy had in common was an extension of algorithmic processes into new realms—the abstract mapping of ideas onto mathematical objects. Later, he wrote thousands of words on scientific aspects of juggling♦—with theorems and corollaries—and included from memory a quotation from E. E. Cummings: “Some son-of-a-bitch will invent a machine to measure Spring with.”
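THROBAC itself was built from relays and stepping switches; in software the same whimsy reduces to a pair of conversions. The sketch below is a toy in Python, not a reconstruction of Shannon’s machine:

```python
# Value-symbol table for Roman numerals, largest first.
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
         (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
         (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]

def to_roman(n: int) -> str:
    """Convert a positive integer to Roman numerals."""
    out = []
    for value, symbol in ROMAN:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)

def from_roman(s: str) -> int:
    """Convert a Roman numeral string back to an integer."""
    n = 0
    for value, symbol in ROMAN:
        while s.startswith(symbol):
            n += value
            s = s[len(symbol):]
    return n

print(to_roman(from_roman("IV") * from_roman("XII")))  # XLVIII
```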
In the 1950s Shannon was also trying to design a machine that would repair itself.♦ If a relay failed, the machine would locate and replace it. He speculated on the possibility of a machine that could reproduce itself, collecting parts from the environment and assembling them. Bell Labs was happy for him to travel and give talks on such things, often demonstrating his maze-learning machine, but audiences were not universally delighted. The word “Frankenstein” was heard. “I wonder if you boys realize what you’re toying around with there,” wrote a newspaper columnist in Wyoming.
What happens if you switch on one of these mechanical computers but forget to turn it off before you leave for lunch? Well, I’ll tell you. The same thing would happen in the way of computers in America that happened to Australia with jack rabbits. Before you could multiply 701,945,240 by 879,030,546, every family in the country would have a little computer of their own.…
Mr. Shannon, I don’t mean to knock your experiments, but frankly I’m not remotely interested in even one computer, and I’m going to be pretty sore if a gang of them crowd in on me to multiply or divide or whatever they do best.♦
Two years after Shannon raised his warning flag about the bandwagon, a younger information theorist, Peter Elias, published a notice complaining about a paper titled “Information Theory, Photosynthesis, and Religion.”♦ There was, of course, no such paper. But there had been papers on information theory, life, and topology; information theory and the physics of tissue damage; and clerical systems; and psychopharmacology; and geophysical data interpretation; and crystal structure; and melody. Elias, whose father had worked for Edison as an engineer, was himself a serious specialist—a major contributor to coding theory. He mistrusted the softer, easier, platitudinous work flooding across disciplinary boundaries. The typical paper, he said, “discusses the surprisingly close relationship between the vocabulary and conceptual framework of information theory and that of psychology (or genetics, or linguistics, or psychiatry, or business organization).… The concepts of structure, pattern, entropy, noise, transmitter, receiver, and code are (when properly interpreted) central to both.” He declared this to be larceny. “Having placed the discipline of psychology for the first time on a sound scientific basis, the author modestly leaves the filling in of the outline to the psychologists.” He suggested his colleagues give up larceny for a life of honest toil.
These warnings from Shannon and Elias appeared in one of the growing number of new journals entirely devoted to information theory.
In these circles a notorious buzzword was entropy. Another researcher, Colin Cherry, complained, “We have heard of ‘entropies’ of languages, social systems, and economic systems and of its use in various method-starved studies. It is the kind of sweeping generality which people will clutch like a straw.”♦ He did not say, because it was not yet apparent, that information theory was beginning to change the course of theoretical physics and of the life sciences and that entropy was one of the reasons.
In the social sciences, the direct influence of information theorists had passed its peak. The specialized mathematics had less and less to contribute to psychology and more and more to computer science. But their contributions had been real. They had catalyzed the social sciences and prepared them for the new age under way. The work had begun; the informational turn could not be undone.
♦ As Jean-Pierre Dupuy remarks: “It was, at bottom, a perfectly ordinary situation, in which scientists blamed nonscientists for taking them at their word. Having planted the idea in the public mind that thinking machines were just around the corner, the cyberneticians hastened to dissociate themselves from anyone gullible enough to believe such a thing.”