The End of Moore’s Law

I have been working on an upcoming post about megatrends and how they drive tech.  I had included the end of Moore’s Law to illustrate how the end of a megatrend might also have a big influence on tech, but that section got away from me, becoming much larger than the sections on each individual current megatrend. So I decided to break it out into a separate post and publish it first.  Here it is.

Moore’s Law, concerning what we put on silicon wafers, is over after a solid fifty year run that completely reshaped our world. But that end unleashes lots of new opportunities.


Moore, Gordon E., Cramming more components onto integrated circuits, Electronics, Vol 38, No. 8, April 19, 1965.

Electronics was a trade journal that published monthly, mostly, from 1930 to 1995. Gordon Moore’s four and a half page contribution in 1965 was perhaps its most influential article ever. That article not only articulated the beginnings, and it was the very beginnings, of a trend, but the existence of that articulation became a goal/law that has run the silicon based circuit industry (which is the basis of every digital device in our world) for fifty years. Moore was a Cal Tech PhD, cofounder in 1957 of Fairchild Semiconductor, and head of its research and development laboratory from 1959. Fairchild had been founded to make transistors from silicon at a time when they were usually made from much slower germanium.

One can find many files on the Web that claim to be copies of the original paper, but I have noticed that some of them have the graphs redrawn and that they are sometimes slightly different from the ones that I have always taken to be the originals. Below I reproduce two figures from the original that as far as I can tell have only been copied from an original paper version of the magazine, with no manual/human cleanup.

The first one that I reproduce here is the money shot for the origin of Moore’s Law. There was however an equally important earlier graph in the paper which was predictive of the future yield over time of functional circuits that could be made from silicon. It had less actual data than this one, and as we’ll see, that is really saying something.

This graph is about the number of components on an integrated circuit. An integrated circuit is made through a process that is like printing. Light is projected onto a thin wafer of silicon in a number of different patterns, while different gases fill the chamber in which it is held. The different gases cause different light activated chemical processes to happen on the surface of the wafer, sometimes depositing some types of material, and sometimes etching material away. With precise masks to pattern the light, and precise control over temperature and duration of exposures, a physical two dimensional electronic circuit can be printed. The circuit has transistors, resistors, and other components. Lots of them might be made on a single wafer at once, just as lots of letters are printed on a single page at once. The yield is how many of those circuits are functional–small alignment or timing errors in production can screw up some of the circuits in any given print. Then the silicon wafer is cut up into pieces, each containing one of the circuits and each is put inside its own plastic package with little “legs” sticking out as the connectors–if you have looked at a circuit board made in the last forty years you have seen it populated with lots of integrated circuits.

The number of components in a single integrated circuit is important. Since the circuit is printed it involves no manual labor, unlike earlier electronics where every single component had to be placed and attached by hand. Now a complex circuit which involves multiple integrated circuits only requires hand construction (later this too was largely automated) to connect up a much smaller number of components. And as long as one has a process which gets good yield, it is constant time to build a single integrated circuit, regardless of how many components are in it. That means fewer total integrated circuits that need to be connected by hand or machine. So, as Moore’s paper’s title references, cramming more components into a single integrated circuit is a really good idea.

The graph plots the logarithm base two of the number of components in an integrated circuit on the vertical axis against calendar years on the horizontal axis. Every notch upwards on the left doubles the number of components. So while 3 means 2^3 = 8 components, 13 means 2^{13} = 8,192 components. That is a thousand fold increase from 1962 to 1972.
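As a small sketch of how to read that vertical axis, the snippet below converts axis values back into component counts. The function name `components_from_axis` is my own, purely for illustration.

```python
def components_from_axis(log2_value: int) -> int:
    """Convert a reading on the log-base-two vertical axis into a component count."""
    return 2 ** log2_value


for tick in (0, 3, 13):
    print(tick, components_from_axis(tick))

# Going from 3 on the axis (1962) to 13 (1972) is ten doublings.
growth = components_from_axis(13) // components_from_axis(3)
print(growth)  # 1024, the "thousand fold" increase
```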

There are two important things to note here.

The first is that he is talking about components on an integrated circuit, not just the number of transistors. Generally there are many more components than transistors, though the ratio did drop over time as different fundamental sorts of transistors were used. But in later years Moore’s Law was often turned into purely a count of transistors.

The other thing is that there are only four real data points here in this graph which he published in 1965. In 1959 the number of components is 2^0 = 1, i.e., that is not about an integrated circuit at all, just about single circuit elements–integrated circuits had not yet been invented. So this is a null data point.  Then he plots four actual data points, which we assume were taken from what Fairchild could produce, for 1962, 1963, 1964, and 1965, having 8, 16, 32, and 64 components. That is a doubling every year. It is an exponential increase in the true sense of exponential^{\big 1}.

What is the mechanism for this, how can this work? It works because it is in the digital domain, the domain of yes or no, the domain of 0 or 1.

In the last half page of the four and a half page article Moore explains the limitations of his prediction, saying that for some things, like energy storage, we will not see his predicted trend. Energy takes up a certain number of atoms and their electrons to store a given amount, so you can not just arbitrarily change the number of atoms and still store the same amount of energy. Likewise if you have a half gallon milk container you can not put a gallon of milk in it.

But the fundamental digital abstraction is yes or no. A circuit element in an integrated circuit just needs to know whether a previous element said yes or no, whether there is a voltage or current there or not. In the design phase one decides above how many volts or amps, or whatever, means yes, and below how many means no. And there needs to be a good separation between those numbers, a significant no man’s land compared to the maximum and minimum possible. But the magnitudes do not matter.

I like to think of it like piles of sand. Is there a pile of sand on the table or not? We might have a convention about how big a typical pile of sand is. But we can make it work if we halve the normal size of a pile of sand. We can still answer whether or not there is a pile of sand there using just half as many grains of sand in a pile.

And then we can halve the number again. And the digital abstraction of yes or no still works. And we can halve it again, and it still works. And again, and again, and again.
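The pile of sand argument can be sketched in a few lines of code. This is only an illustration: the 70% and 30% thresholds and the starting pile of 1,000 grains are numbers I made up, and `read_bit` is a hypothetical name. The point is that the yes/no reading depends on the level relative to full scale, not on the absolute magnitude.

```python
def read_bit(level: float, full_scale: float) -> int:
    """Interpret an analog level as yes (1) or no (0), relative to full scale."""
    high_threshold = 0.7 * full_scale  # assumed: at or above this means yes
    low_threshold = 0.3 * full_scale   # assumed: at or below this means no
    if level >= high_threshold:
        return 1
    if level <= low_threshold:
        return 0
    raise ValueError("level falls in the no man's land between yes and no")


full_scale = 1000.0  # hypothetical number of grains in a full pile
for _ in range(8):
    # A nearly full pile reads as yes and a nearly empty one as no,
    # no matter how many times the size of a pile has been halved.
    assert read_bit(0.9 * full_scale, full_scale) == 1
    assert read_bit(0.1 * full_scale, full_scale) == 0
    full_scale /= 2
print(full_scale)  # 3.90625 grains after eight halvings
```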

This is what drives Moore’s Law, which in its original form said that we could expect to double the number of components on an integrated circuit every year for 10 years, from 1965 to 1975.  That held up!

Variations of Moore’s Law followed; they were all about doubling, but sometimes doubling different things, and usually with slightly longer time constants for the doubling. The most popular versions were doubling of the number of transistors, doubling of the switching speed of those transistors (so a computer could run twice as fast), doubling of the amount of memory on a single chip, and doubling of the secondary memory of a computer–originally on mechanically spinning disks, but for the last five years in solid state flash memory. And there were many others.

Let’s get back to Moore’s original law for a moment. The components on an integrated circuit are laid out on a two dimensional wafer of silicon. So to double the number of components for the same amount of silicon you need to double the number of components per unit area. That means that the size of a component, in each linear dimension of the wafer needs to go down by a factor of \frac{1}{\sqrt{2}}. In turn, that means that Moore was seeing the linear dimension of each component go down to 71\% of what it was in a year, year over year.

But why was it limited to just a measly factor of two per year? Given the pile of sand analogy from above, why not just go to a quarter of the size of a pile of sand each year, or one sixteenth? It gets back to the yield one gets, the number of working integrated circuits, as you reduce the component size (most commonly called feature size). As the feature size gets smaller, the alignment of the projected patterns of light for each step of the process needs to get more accurate. Since \sqrt{2} = 1.41, approximately, it needs to get better by \frac{\sqrt{2}-1}{\sqrt{2}} = 29\% as you halve the area of each feature. And because of impurities in the materials that are printed on the circuit, the material from the gases that are circulating and that are activated by light, the gases need to get more pure, so that there are fewer bad atoms in each component, now half the area of before. Implicit in Moore’s Law, in its original form, was the idea that we could expect the production equipment to get better by about 29\% per year, for 10 years.
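For anyone who wants to check the arithmetic, here is the 29% figure worked through, using the rounded value 0.707 for \frac{1}{\sqrt{2}}. The last line confirms that shrinking each linear dimension to 71% halves the area.

```latex
\frac{\sqrt{2}-1}{\sqrt{2}} \;=\; 1 - \frac{1}{\sqrt{2}} \;\approx\; 1 - 0.707 \;=\; 0.293 \;\approx\; 29\%
\qquad\qquad
\left(\frac{1}{\sqrt{2}}\right)^{2} \;=\; \frac{1}{2}
```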

For various forms of Moore’s Law that came later, the time constant stretched out to 2 years, or even a little longer, for a doubling, but nevertheless the processing equipment has gotten that 29\% better time period over time period, again and again.

To see the magic of how this works, let’s just look at 25 doublings. The equipment has to operate with things \sqrt{2}^{25} times smaller, i.e., roughly 5,793 times smaller. But we can fit 2^{25} more components in a single circuit, which is 33,554,432 times more. The accuracy of our equipment has improved 5,793 times, but that has gotten a further acceleration of 5,793 on top of the original 5,793 times due to the linear to area impact. That is where the payoff of Moore’s Law has come from.
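Those numbers are easy to check. The sketch below redoes the arithmetic for 25 doublings; the variable names are my own.

```python
import math

doublings = 25
linear_shrink = math.sqrt(2) ** doublings  # how many times smaller each linear dimension gets
component_gain = 2 ** doublings            # how many times more components fit in the same area

print(round(linear_shrink))  # 5793
print(component_gain)        # 33554432
# The component gain is the linear improvement squared (up to floating point rounding).
print(math.isclose(linear_shrink ** 2, component_gain))  # True
```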

In his original paper Moore only dared project out, and only implicitly, that the equipment would get 29\% better every year for ten years. In reality, with somewhat slowing time constants, that has continued to happen for 50 years.

Now it is coming to an end. But not because the accuracy of the equipment needed to give good yields has stopped improving. No. Rather it is because those piles of sand we referred to above have gotten so small that they only contain a single metaphorical grain of sand. We can’t split the minimal quantum of a pile into two any more.


Perhaps the most remarkable thing is Moore’s foresight into how this would have an incredible impact upon the world.  Here is the first sentence of his second paragraph:

Integrated circuits will lead to such wonders as home computers–or at least terminals connected to a central computer–automatic controls for automobiles, and personal portable communications equipment.

This was radical stuff in 1965. So called “mini computers” were still the size of a desk, and to be useful usually had a few peripherals such as tape units, card readers, or printers, that meant they would be hard to fit into a home kitchen of the day, even with the refrigerator, oven, and sink removed. Most people had never seen a computer and even fewer had interacted with one, and those who had, had mostly done it by dropping off a deck of punched cards, and a day later picking up a printout from what the computer had done when humans had fed the cards to the machine.

The electrical systems of cars were unbelievably simple by today’s standards, with perhaps half a dozen on off switches, and simple electromechanical devices to drive the turn indicators, windshield wipers, and the “distributor” which timed the firing of the spark plugs–every single function producing piece of mechanism in auto electronics was big enough to be seen with the naked eye.  And personal communications devices were rotary dial phones, one per household, firmly plugged into the wall at all times. Or handwritten letters that needed to be dropped into the mail box.

That sentence quoted above, given when it was made, is to me the bravest and most insightful prediction of technology future that we have ever seen.

By the way, the first computer made from integrated circuits was the guidance computer for the Apollo missions, one in the Command Module, and one in the Lunar Lander. The integrated circuits were made by Fairchild, Gordon Moore’s company. The first version had 4,100 integrated circuits, each implementing a single 3 input NOR gate. The more capable manned flight versions, which first flew in 1968, had only 2,800 integrated circuits, each implementing two 3 input NOR gates. Moore’s Law had its impact on getting to the Moon, even in the Law’s infancy.


In the original magazine article this cartoon appears:

At a fortieth anniversary of Moore’s Law at the Chemical Heritage Foundation^{\big 2} in Philadelphia I asked Dr. Moore whether this cartoon had been his idea. He replied that he had nothing to do with it, and it was just there in the magazine in the middle of his article, to his surprise.

Without any evidence at all on this, my guess is that the cartoonist was reacting somewhat skeptically to the sentence quoted above. The cartoon is set in a department store, as back then US department stores often had a “Notions” department, although this was not something of which I have any personal experience as they are long gone (and I first set foot in the US in 1977). It seems that notions is another word for haberdashery, i.e., pins, cotton, ribbons, and generally things used for sewing. As still today, there is also a Cosmetics department. And plop in the middle of them is the Handy Home Computers department, with the salesman holding a computer in his hand.

I am guessing that the cartoonist was making fun of this idea, trying to point out the ridiculousness of it. It all came to pass in only 25 years, including being sold in department stores. Not too far from the cosmetics department. But the notions departments had all disappeared. The cartoonist was right in the short term, but blew it in the slightly longer term^{\big 3}.


There were many variations on Moore’s Law, not just his original about the number of components on a single chip.

Amongst the many there was a version of the law about how fast circuits could operate, as the smaller the transistors were the faster they could switch on and off. There were versions of the law for how much RAM memory, main memory for running computer programs, there would be and when. And there were versions of the law for how big and fast disk drives, for file storage, would be.

This tangle of versions of Moore’s Law had a big impact on how technology developed. I will discuss three modes of that impact; competition, coordination, and herd mentality in computer design.

Competition

Memory chips are where data and programs are stored as they are run on a computer. Moore’s Law applied to the number of bits of memory that a single chip could store, and a natural rhythm developed of that number of bits going up by a multiple of four on a regular but slightly slowing basis. By jumping over just a doubling, the cost of the silicon foundries could be depreciated over long enough time to keep things profitable (today a silicon foundry is about a $7B capital cost!), and furthermore it made sense to double the number of memory cells in each dimension to keep the designs balanced, again pointing to a step factor of four.

In the very early days of desktop PCs memory chips had 2^{14} = 16384 bits. The memory chips were called RAM (Random Access Memory–i.e., any location in memory took equally long to access, there were no slower or faster places), and a chip of this size was called a 16K chip, where K means not exactly 1,000, but instead 1,024 (which is 2^{10}). Many companies produced 16K RAM chips. But they all knew from Moore’s Law when the market would be expecting 64K RAM chips to appear. So they knew what they had to do to not get left behind, and they knew when they had to have samples ready for engineers designing new machines so that just as the machines came out their chips would be ready to be used having been designed in. And they could judge when it was worth getting just a little ahead of the competition at what price. Everyone knew the game (and in fact all came to a consensus agreement on when the Moore’s Law clock should slow down just a little), and they all competed on operational efficiency.
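To see why doubling in each dimension gives a step factor of four, here is a small sketch. It assumes, purely for illustration, that the memory cells are laid out as a square array; real chips need not have been organized exactly this way.

```python
K = 1024  # in memory sizes, K means 2**10, not 1,000


def bits_for_side(side: int) -> int:
    """Number of memory cells in an assumed square array with `side` cells per edge."""
    return side * side


side = 128  # 128 x 128 = 16384 cells, a 16K chip
for _ in range(3):
    bits = bits_for_side(side)
    print(f"{side} x {side} cells = {bits} bits = {bits // K}K")
    side *= 2  # doubling each dimension quadruples the number of bits
```

Under that assumption the loop prints the 16K, 64K, and 256K generations in turn.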

Coordination

Technology Review talks about this in their story on the end of Moore’s Law. If you were the designer of a new computer box for a desktop machine, or any other digital machine for that matter, you could look at when you planned to hit the market and know what amount of RAM memory would take up what board space because you knew how many bits per chip would be available at that time.  And you knew how much disk space would be available at what price and what physical volume (disks got smaller and smaller diameters just as they increased the total amount of storage). And you knew how fast the latest processor chip would run. And you knew what resolution display screen would be available at what price. So a couple of years ahead you could put all these numbers together and come up with what options and configurations would make sense by the exact time when you were going to bring your new computer to market.

The company that sold the computers might make one or two of the critical chips for their products but mostly they bought other components from other suppliers. The clockwork certainty of Moore’s Law let them design a new product without having horrible surprises disrupt their flow and plans. This really let the digital revolution proceed. Everything was orderly and predictable so there were fewer blind alleys to follow. We had probably the single most sustained continuous and predictable improvement in any technology over the history of mankind.

Herd mentality in computer design

But with this good came some things that might be viewed negatively (though I’m sure there are some who would argue that they were all unalloyed good).  I’ll take up one of these as the third thing to talk about that Moore’s Law had a major impact upon.

A particular form of general purpose computer design had arisen by the time that central processors could be put on a single chip (see the Intel 4004 below), and soon those processors on a chip, microprocessors as they came to be known, supported that general architecture.  That architecture is known as the von Neumann architecture.

A distinguishing feature of this architecture is that there is a large RAM memory which holds both instructions and data–made from the RAM chips we talked about above under coordination. The memory is organized into consecutive indexable (or addressable) locations, each containing the same number of binary bits, or digits. The microprocessor itself has a few specialized memory cells, known as registers, and an arithmetic unit that can do additions, multiplications, divisions (more recently), etc. One of those specialized registers is called the program counter (PC), and it holds an address in RAM for the current instruction. The CPU looks at the pattern of bits in that current instruction location and decodes them into what actions it should perform. That might be an action to fetch another location in RAM and put it into one of the specialized registers (this is called a LOAD), or to send the contents the other direction (STORE), or to take the contents of two of the specialized registers, feed them to the arithmetic unit, and take their sum from the output of that unit and store it in another of the specialized registers. Then the central processing unit increments its PC and looks at the next consecutive addressable instruction. Some specialized instructions can alter the PC and make the machine go to some other part of the program and this is known as branching. For instance if one of the specialized registers is being used to count down how many elements of an array of consecutive values stored in RAM have been added together, right after the addition instruction there might be an instruction to decrement that counting register, and then branch back earlier in the program to do another LOAD and add if the counting register is still more than zero.
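The description above can be turned into a toy simulator. This is a minimal sketch, not any real instruction set: apart from LOAD and STORE, the instruction names (LOADP, ADD, INC, DEC, BGTZ, HALT), the register names, and the memory layout are all made up for illustration. The program sums an array using the count down and branch back loop just described, and a single list serves as RAM for both the program and its data.

```python
def run(memory):
    """Run a toy von Neumann machine until HALT.

    `memory` is one list of consecutive addressable locations holding both
    the program (tuples) and its data (integers).
    """
    regs = {"A": 0, "B": 0, "N": 0, "P": 0}  # a few specialized registers
    pc = 0                                    # program counter: address of the current instruction
    while True:
        op, *args = memory[pc]                # fetch the instruction and decode it
        pc += 1                               # by default, continue with the next address
        if op == "LOAD":                      # RAM location -> register
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "LOADP":                   # RAM location addressed by a register -> register
            reg, pointer = args
            regs[reg] = memory[regs[pointer]]
        elif op == "STORE":                   # register -> RAM location
            reg, addr = args
            memory[addr] = regs[reg]
        elif op == "ADD":                     # sum of two registers -> a register
            dest, left, right = args
            regs[dest] = regs[left] + regs[right]
        elif op == "INC":
            regs[args[0]] += 1
        elif op == "DEC":
            regs[args[0]] -= 1
        elif op == "BGTZ":                    # branch: alter the PC if the register is above zero
            reg, target = args
            if regs[reg] > 0:
                pc = target
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction {op!r}")


memory = [
    ("LOAD", "N", 10),       # 0: N = number of array elements
    ("LOAD", "P", 11),       # 1: P = address of the first element
    ("LOADP", "B", "P"),     # 2: B = the element P points at
    ("ADD", "A", "A", "B"),  # 3: A = A + B, the running sum
    ("INC", "P"),            # 4: point at the next element
    ("DEC", "N"),            # 5: one fewer element left to add
    ("BGTZ", "N", 2),        # 6: branch back while elements remain
    ("STORE", "A", 12),      # 7: write the sum back to RAM
    ("HALT",),               # 8
    0,                       # 9: unused
    4,                       # 10: array length
    13,                      # 11: address of the array
    0,                       # 12: the result goes here
    5, 7, 11, 13,            # 13-16: the array itself
]

run(memory)
print(memory[12])  # 36
```

Nothing in the list distinguishes instructions from data except how the machine happens to use each location.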

That’s pretty much all there is to most digital computers.  The rest is just hacks to make them go faster, while still looking essentially like this model. But note that the RAM is used in two ways by a von Neumann computer–to contain data for a program and to contain the program itself. We’ll come back to this point later.

With all the versions of Moore’s Law firmly operating in support of this basic model it became very hard to break out of it. The human brain certainly doesn’t work that way, so it seems that there could be powerful other ways to organize computation. But trying to change the basic organization was a dangerous thing to do, as the inexorable march of Moore’s Law based existing architecture was going to continue anyway. Trying something new would most probably set things back a few years. So brave big scale experiments like the Lisp Machine^{\big 4} or Connection Machine which both grew out of the MIT Artificial Intelligence Lab (and turned into at least three different companies) and Japan’s fifth generation computer project (which played with two unconventional ideas, data flow and logical inference) all failed, as before long the Moore’s Law doubling conventional computers overtook the advanced capabilities of the new machines, and software could better emulate the new ideas.

Most computer architects were locked into the conventional organizations of computers that had been around for decades. They competed on changing the coding of the instructions to make execution of programs slightly more efficient per square millimeter of silicon. They competed on strategies to cache copies of  larger and larger amounts of RAM memory right on the main processor chip. They competed on how to put multiple processors on a single chip and how to share the cached information from RAM across multiple processor units running at once on a single piece of silicon. And they competed on how to make the hardware more predictive of what future decisions would be in a running program so that they could precompute the right next computations before it was clear whether they would be needed or not. But, they were all locked in to fundamentally the same way of doing computation. Thirty years ago there were dozens of different detailed processor designs, but now they fall into only a small handful of families, the X86, the ARM, and the PowerPC. The X86’s are mostly desktops, laptops, and cloud servers. The ARM is what we find in phones and tablets.  And you probably have a PowerPC adjusting all the parameters of your car’s engine.

The one glaring exception to the lock in caused by Moore’s Law is that of Graphics Processing Units, or GPUs. These are different from von Neumann machines. The demand was for better performance for video and graphics, and in particular gaming, and the main processor getting better and better under Moore’s Law was just not enough to make real time rendering perform well as the underlying simulations got better and better. In this case a new sort of processor was developed. It was not particularly useful for general purpose computations but it was optimized very well to do additions and multiplications on streams of data which is what is needed to render something graphically on a screen. Here was a case where a new sort of chip got added into the Moore’s Law pool much later than conventional microprocessors, RAM, and disk. The new GPUs did not replace existing processors, but instead got added as partners where graphics rendering was needed. I mention GPUs here because it turns out that they are useful for another type of computation that has become very popular over the last three years, and that is being used as an argument that Moore’s Law is not over. I still think it is and will return to GPUs in the next section.


As I pointed out earlier we can not halve a pile of sand once we are down to piles that are only a single grain of sand. That is where we are now, we have gotten down to just about one grain piles of sand. Gordon Moore’s Law in its classical sense is over. See The Economist from March of last year for a typically thorough, accessible, and thoughtful report.

I earlier talked about the feature size of an integrated circuit and how with every doubling that size is divided by \sqrt{2}.  By 1971 Gordon Moore was at Intel, and they released their first microprocessor on a single chip, the 4004 with 2,300 transistors on 12 square millimeters of silicon, with a feature size of 10 micrometers, written 10μm. That means that the smallest distinguishable aspect of any component on the chip was 1/100th of a millimeter.

Since then the feature size has regularly been reduced by a factor of \frac{1}{\sqrt{2}}, or reduced to 71\% of its previous size, doubling the number of components in a given area, on a clockwork schedule.  The schedule clock has however slowed down. Back in the era of Moore’s original publication the clock period was a year.  Now it is a little over 2 years.  In the first quarter of 2017 we are expecting to see the first commercial chips in mass market products with a feature size of 10 nanometers, written 10nm. That is 1,000 times smaller than the feature size of 1971, or 20 applications of the 71\% rule over 46 years.  Sometimes the jump has been a little better than 71\%, and so we have actually seen 17 jumps from 10μm down to 10nm. You can see them listed in Wikipedia. In 2012 the feature size was 22nm, in 2014 it was 14nm, now in the first quarter of 2017 we are about to see 10nm shipped to end users, and it is expected that we will see 7nm in 2019 or so. There are still active areas of research working on problems that are yet to be solved to make 7nm a reality, but industry is confident that it will happen. There are predictions of 5nm by 2021, but a year ago there was still much uncertainty over whether the engineering problems necessary to do this could be solved and whether they would be economically viable in any case.
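The count of 20 applications of the 71% rule can be checked directly. This sketch uses only the numbers in the paragraph above; the variable names are mine.

```python
import math

start_nm = 10_000  # 10 micrometers in 1971, expressed in nanometers
end_nm = 10        # 10 nanometers in 2017

shrink = start_nm / end_nm                         # 1,000 times smaller
steps = math.log(shrink) / math.log(math.sqrt(2))  # applications of the 71% rule needed
years_per_jump = (2017 - 1971) / 17                # average over the 17 jumps actually seen

print(round(steps, 2))           # 19.93, i.e. about 20
print(round(years_per_jump, 1))  # 2.7
```

The 2.7 years is an average over the whole 46 years, so it blends the early one year clock with the slower recent one.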

Once you get down to 5nm features they are only about 20 silicon atoms wide.  If you go much below this the material starts to be dominated by quantum effects and classical physical properties really start to break down. That is what I mean by only one grain of sand left in the pile.

Today’s microprocessors have a few hundred square millimeters of silicon, and 5 to 10 billion transistors. They have a lot of extra circuitry these days to cache RAM, predict branches, etc., all to improve performance. But getting bigger comes with many costs as they get faster too. There is heat to be dissipated from all the energy used in switching so many signals in such a small amount of time, and the time for a signal to travel from one side of the chip to the other, ultimately limited by the speed of light (in reality, in copper it is about 5\% less), starts to be significant. The speed of light is approximately 300,000 kilometers per second, or 300,000,000,000 millimeters per second. So light, or a signal, can travel 30 millimeters (just over an inch, about the size of a very large chip today) in no less than one over 10,000,000,000 seconds, i.e., no less than one ten billionth of a second.

Today’s fastest processors have a clock speed of 8.760GigaHertz, which means by the time the signal is getting to the other side of the chip, the place it came from has moved on to the next thing to do. This makes synchronization across a single microprocessor something of a nightmare, and at best a designer can know ahead of time how late different signals from different parts of the processor will be, and try to design accordingly. So rather than push clock speed further (which is also hard) and rather than make a single microprocessor bigger with more transistors to do more stuff at every clock cycle, for the last few years we have seen large chips go to “multicore”, with two, four, or eight independent microprocessors on a single piece of silicon.
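Here is that comparison as a quick calculation, using the approximate speed of light and the 30 millimeter chip from above. It is a best case sketch: it ignores the slowdown in real wires, and the variable names are my own.

```python
SPEED_OF_LIGHT_MM_PER_S = 300_000_000_000  # approximate figure, in millimeters per second
chip_width_mm = 30                          # a very large chip
clock_hz = 8.760e9                          # 8.760 GigaHertz

crossing_time_s = chip_width_mm / SPEED_OF_LIGHT_MM_PER_S  # best case time to cross the chip
clock_period_s = 1 / clock_hz                              # duration of one clock cycle
cycles_to_cross = crossing_time_s / clock_period_s

print(crossing_time_s)            # 1e-10, one ten billionth of a second
print(round(cycles_to_cross, 2))  # 0.88 of a clock cycle
```

So even at the speed of light a signal uses up most of a clock cycle just getting across the chip.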

Multicore has preserved the “number of operations done per second” version of Moore’s Law, but at the cost of a simple program not being sped up by that amount–one cannot simply smear a single program across multiple processing units. For a laptop or a smart phone that is trying to do many things at once that doesn’t really matter, as there are usually enough different tasks that need to be done at once, that farming them out to different cores on the same chip leads to pretty full utilization.  But that will not hold, except for specialized computations, when the number of cores doubles a few more times.  The speed up starts to disappear as silicon is left idle because there just aren’t enough different things to do.
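A crude way to see the idle silicon: suppose, as a simplifying assumption of mine, that each independent task can keep at most one core busy, and that there are eight such tasks at any moment. The function name and the task count are hypothetical.

```python
def utilization(independent_tasks: int, cores: int) -> float:
    """Fraction of cores kept busy when each task can occupy at most one core."""
    return min(independent_tasks, cores) / cores


tasks = 8  # hypothetical number of things a laptop wants to do at once
for cores in (2, 4, 8, 16, 32):
    print(cores, utilization(tasks, cores))
```

Under those assumptions utilization is full up to eight cores, then drops to a half at sixteen and a quarter at thirty two.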

Despite the arguments that I presented a few paragraphs ago about why Moore’s Law is coming to a silicon end, many people argue that it is not, because we are finding ways around those constraints of small numbers of atoms by going to multicore and GPUs.  But I think that is changing the definitions too much.

Here is a recent chart that Steve Jurvetson, cofounder of the VC firm DFJ^{\big 5} (Draper Fisher Jurvetson), posted on his Facebook page.  He said it is an update of an earlier chart compiled by Ray Kurzweil.


In this case the left axis is a logarithmically scaled count of the number of calculations per second per constant dollar. So this expresses how much cheaper computation has gotten over time. In the 1940’s there are specialized computers, such as the electromagnetic computers built to break codes at Bletchley Park. By the 1950’s they become general purpose, von Neumann style computers and stay that way until the last few points.

The last two points are both GPUs, the GTX 450 and the NVIDIA Titan X.  Steve doesn’t label the few points before that, but in every earlier version of a diagram that I can find on the Web (and there are plenty of them), the points beyond 2010 are all multicore.  First dual cores, and then quad cores, such as Intel’s quad core i7 (and I am typing these words on a 2.9GHz version of that chip, powering my laptop).

That GPUs are there and that people are excited about them is because besides graphics they happen to be very good at another very fashionable computation.  Deep learning, a form of something known originally as back propagation neural networks, has had a big technological impact recently. It is what has made speech recognition so fantastically better in the last three years that Apple’s Siri, Amazon’s Echo, and Google Home are useful and practical programs and devices. It has also made image labeling so much better than what we had five years ago, and there is much experimentation with using networks trained on lots of road scenes as part of situational awareness for self driving cars. For deep learning there is a training phase, usually done in the cloud, on millions of examples. That produces a few million numbers which represent the network that is learned. Then when it is time to recognize a word or label an image that input is fed into a program simulating the network by doing millions of multiplications and additions. Coincidentally GPUs just happen to be perfect for the way these networks are structured, and so we can expect more and more of them to be built into our automobiles. Lucky break for GPU manufacturers! While GPUs can do lots of computations they don’t work well on just any problem. But they are great for deep learning networks and those are quickly becoming the flavor of the decade.
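To make "multiplications and additions" concrete, here is a tiny sketch of running an input through a two layer network. Everything in it is made up for illustration: the weights stand in for the "few million numbers" that training would produce, and the layer sizes are absurdly small. A real network does the same thing with vastly more numbers, which is why hardware built for streams of multiplies and adds suits it.

```python
def dense(inputs, weights, biases):
    """One layer: each output is a sum of input * weight products, plus a bias."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Keep positive values and replace negative ones with zero."""
    return [max(0.0, v) for v in values]


# Hypothetical "learned" numbers; in practice these come from the training phase.
w1 = [[0.5, -1.0], [1.5, 2.0]]
b1 = [0.0, -1.0]
w2 = [[1.0, -0.5]]
b2 = [0.25]

x = [2.0, 1.0]                   # the input to recognize or label
hidden = relu(dense(x, w1, b1))  # first layer
output = dense(hidden, w2, b2)   # second layer

print(hidden)  # [0.0, 4.0]
print(output)  # [-1.75]
```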

While it is right to claim that we continue to see exponential growth as in the chart above, exactly what is being measured has changed. That is a bit of a sleight of hand.

And I think that change will have big implications.


I think the end of Moore’s Law, as I have defined the end, will bring about a golden new era of computer architecture.  No longer will architects need to cower at the relentless improvements that they know others will get due to Moore’s Law. They will be able to take the time to try new ideas out in silicon, now safe in the knowledge that a conventional computer architecture will not be able to do the same thing in just two or four years in software. And the new things they do may not be about speed. They might be about making computation better in other ways.

Machine learning runtime

We are seeing this with GPUs as runtime engines for deep learning networks.  But we are also seeing some more specific architectures. For instance, for about a year Google has had their own chips called Tensor Processing Units (or TPUs) that save power for deep learning networks by effectively reducing the number of significant digits that are kept around, as neural networks work quite well at low precision. Google has placed many of these chips in the computers in their server farms, or cloud, and is able to use learned networks in various search queries, at higher speed for lower electrical power consumption.
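As a rough illustration of why fewer significant digits can be good enough, here is a minimal sketch that squeezes weights and inputs into 8 bit integers and compares the resulting dot product with the full precision one. The 8 bit width, the single shared scale factor, and the sample numbers are all assumptions chosen for the example; this is not a description of what Google's chips actually do internally.

```python
def quantize(values, bits=8):
    """Map floats to signed integers that share one scale factor."""
    levels = 2 ** (bits - 1) - 1            # 127 for 8 bits
    peak = max(abs(v) for v in values) or 1.0
    scale = peak / levels
    return [round(v / scale) for v in values], scale

def dot(a, b):
    """Sum of pairwise products."""
    return sum(x * y for x, y in zip(a, b))

# Made-up numbers standing in for learned weights and one input.
weights = [0.82, -0.31, 0.05, -0.77, 0.44]
inputs = [0.10, 0.90, -0.40, 0.25, 0.60]

qw, sw = quantize(weights)
qi, si = quantize(inputs)

exact = dot(weights, inputs)
approx = dot(qw, qi) * sw * si   # integer multiply-adds, rescaled once at the end
print(exact, approx, abs(exact - approx))
```

The inner loop only ever multiplies and adds small integers, which takes less silicon and less power than floating point arithmetic, and the answer differs from the full precision one only slightly.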

Special purpose silicon

Typical mobile phone chips now have four ARM processor cores on a single piece of silicon, plus some highly optimized special purpose processors on that same piece of silicon. Those special purpose processors manage data flowing from cameras and optimize speech quality, and on some chips there is even a special highly optimized processor for detecting human faces. That is used in the camera application (you’ve probably noticed little rectangular boxes around peoples’ faces as you are about to take a photograph) to decide what regions in an image should be most in focus and with the best exposure timing–the faces!

New general purpose approaches

We are already seeing the rise of special purpose architectures for very specific computations. But perhaps we will see more general purpose architectures, with a different style of computation, making a comeback.

Conceivably the dataflow and logic models of the Japanese fifth generation computer project might now be worth exploring again. But as we digitalize the world the cost of bad computer security will threaten our very existence. So perhaps if things work out, the unleashed computer architects can slowly start to dig us out of our current deplorable situation.

Secure computing

We all hear about cyber hackers breaking into computers, often half a world away, or sometimes now in a computer controlling the engine, and soon everything else, of a car as it drives by. How can this happen?

Cyber hackers are creative but many ways that they get into systems are fundamentally through common programming errors in programs built on top of the von Neumann architectures we talked about before.

A common case is exploiting something known as “buffer overrun”. A fixed size piece of memory is reserved to hold, say, the web address that one can type into a browser, or the Google query box. If all programmers wrote very careful code and someone typed in way too many characters those past the limit would not get stored in RAM at all. But all too often a programmer has used a coding trick that is simple, and quick to produce, that does not check for overrun and the typed characters get put into memory way past the end of the buffer, perhaps overwriting some code that the program might jump to later. This relies on the feature of von Neumann architectures that data and programs are stored in the same memory. So, if the hacker chooses some characters whose binary codes correspond to instructions that do something malicious to the computer, say setting up an account for them with a particular password, then later as if by magic the hacker will have a remotely accessible account on the computer, just as many other human and program services may. Programmers shouldn’t oughta make this mistake but history shows that it happens again and again.
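The mechanics of an overrun can be modeled in a few lines. This is only a sketch under loud assumptions: Python itself checks bounds, so the "memory" here is a simulated flat byte array in which an 8 byte buffer sits directly in front of some bytes standing in for code. The names and sizes are invented for illustration.

```python
BUFFER_SIZE = 8

def make_memory():
    """A flat memory: an 8 byte input buffer followed directly by 'code' bytes."""
    return bytearray(BUFFER_SIZE) + bytearray(b"SAFECODE")

def unchecked_copy(memory, data):
    """The quick trick: copy every byte, never asking whether it fits."""
    for i, byte in enumerate(data):
        memory[i] = byte

def checked_copy(memory, data):
    """The careful version: bytes past the end of the buffer are dropped."""
    for i, byte in enumerate(data[:BUFFER_SIZE]):
        memory[i] = byte

attack = b"AAAAAAAA" + b"EVIL"   # 8 bytes of filler, then a payload

careless = make_memory()
unchecked_copy(careless, attack)

careful = make_memory()
checked_copy(careful, attack)

print(bytes(careless[BUFFER_SIZE:]))  # the region after the buffer, now altered
print(bytes(careful[BUFFER_SIZE:]))   # the region after the buffer, untouched
```

In the careless case the typed characters spill past the buffer and replace the first bytes of what was supposed to be program, which is exactly the opening the paragraph above describes.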

Another common way in is that in modern web services sometimes the browser on a lap top, tablet, or smart phone, and the computers in the cloud need to pass really complex things between them. Rather than the programmer having to know in advance all those complex possible things and handle messages for them, it is set up so that one or both sides can pass little bits of source code of programs back and forth and execute them on the other computer. In this way capabilities that were never originally conceived of can start working later on in an existing system without having to update the applications. It is impossible to be sure that a piece of code won’t do certain things, so if the programmer decided to give a fully general capability through this mechanism there is no way for the receiving machine to know ahead of time that the code is safe and won’t do something malicious (this is a generalization of the halting problem — I could go on and on… but I won’t here). So sometimes a cyber hacker can exploit this weakness and send a little bit of malicious code directly to some service that accepts code.

Beyond that cyber hackers are always coming up with new inventive ways in–these have just been two examples to illustrate a couple of ways of how it is currently done.

It is possible to write code that protects against many of these problems, but code writing is still a very human activity, and there are just too many human-created holes that can leak, from too many code writers. One way to combat this is to have extra silicon that hides some of the low level possibilities of a von Neumann architecture from programmers, by only giving the instructions in memory a more limited set of possible actions.

This is not a new idea. Most microprocessors have some version of “protection rings” which let more and more untrusted code only have access to more and more limited areas of memory, even if they try to access it with normal instructions. This idea has been around a long time but it has suffered from not having a standard way to use or implement it, so most software, in an attempt to be able to run on most machines, usually only specifies two or at most three rings of protection. That is a very coarse tool and lets too much through.  Perhaps now the idea will be thought about more seriously in an attempt to get better security when just making things faster is no longer practical.

Another idea, that has mostly only been implemented in software, with perhaps one or two exceptions, is called capability based security, through capability based addressing. Programs are not given direct access to regions of memory they need to use, but instead are given unforgeable cryptographically sound reference handles, along with a defined subset of things they are allowed to do with the memory. Hardware architects might now have the time to push through on making this approach completely enforceable, getting it right once in hardware so that mere human programmers pushed to get new software out on a promised release date can not screw things up.

From one point of view the Lisp Machines that I talked about earlier were built on a very specific and limited version of a capability based architecture. Underneath it all, those machines were von Neumann machines, but the instructions they could execute were deliberately limited. Through the use of something called “typed pointers”, at the hardware level, every reference to every piece of memory came with restrictions on what instructions could do with that memory, based on the type encoded in the pointer. And memory could only be referenced by a pointer to the start of a chunk of memory of a fixed size at the time the memory was reserved. So in the buffer overrun case, a buffer for a string of characters would not allow data to be written to or read from beyond the end of it. And instructions could only be referenced from another type of pointer, a code pointer. The hardware kept the general purpose memory partitioned at a very fine grain by the type of pointers granted to it when reserved. And to a first approximation the type of a pointer could never be changed, nor could the actual address in RAM be seen by any instructions that had access to a pointer.
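Here is a minimal software model of the typed pointer idea, offered as a sketch rather than a description of any real Lisp Machine. The class name, the two kinds ("string" and "code"), and the error choices are all hypothetical, and of course Python can only enforce these rules by convention, whereas the point of the paragraph above is that the hardware enforced them.

```python
class TypedPointer:
    """Hypothetical handle: a type tag, a fixed length, and no visible address."""

    def __init__(self, kind, length):
        self._kind = kind              # e.g. "string" or "code"; fixed at reservation
        self._cells = [0] * length     # stands in for the reserved chunk of memory

    @property
    def kind(self):
        """Read-only view of the type tag."""
        return self._kind

    def write(self, index, value):
        if self._kind == "code":
            raise PermissionError("data writes are not allowed through a code pointer")
        if not 0 <= index < len(self._cells):
            raise IndexError("write outside the reserved chunk")
        self._cells[index] = value

    def read(self, index):
        if not 0 <= index < len(self._cells):
            raise IndexError("read outside the reserved chunk")
        return self._cells[index]

def store_string(pointer, text):
    """Store characters one at a time; returns how many fit before the check fired."""
    stored = 0
    try:
        for i, ch in enumerate(text):
            pointer.write(i, ord(ch))
            stored += 1
    except IndexError:
        pass
    return stored

buffer = TypedPointer("string", 8)
print(store_string(buffer, "way too many characters"))
```

No matter how careless the calling code is, the extra characters have nowhere to go, and nothing that holds only a string pointer can turn it into a way of writing over code.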

There have been ideas out there for a long time on how to improve security through this use of hardware restrictions on the general purpose von Neumann architecture.  I have talked about a few of them here. Now I think we can expect this to become a much more compelling place for hardware architects to spend their time, as security of our computational systems becomes a major Achilles heel for the smooth running of our businesses, our lives, and our society.

Quantum computers

Quantum computers are at this time a largely experimental and very expensive technology. From the need to cool them to physics experiment level ultra cold, and the expense that entails, to the confusion over how much speed up they might give over conventional silicon based computers and for what class of problem, they are a large investment, high risk research topic. I won’t go into all the arguments (I haven’t read them all, and frankly I do not have the expertise that would make me confident in any opinion I might form) but Scott Aaronson’s blog on computational complexity and quantum computation is probably the best source for those interested. Claims on speedups either achieved or hoped to be achieved on practical problems range from a factor of 1 to thousands (and I might have that upper bound wrong). In the old days just waiting 10 or 20 years would let Moore’s Law get you there. Instead we have seen well over a decade of sustained investment in a technology that people are still arguing over whether it can ever work. To me this is yet more evidence that the end of Moore’s Law is encouraging new investment and new explorations.

Unimaginable stuff

Even with these various innovations around, triggered by the end of Moore’s Law, the best things we might see may not yet be in the common consciousness. I think the freedom to innovate, without the overhang of Moore’s Law, the freedom to take time to investigate curious corners, may well lead to a new garden of Eden in computational models. Five to ten years from now we may see a completely new form of computer arrangement, in traditional silicon (not quantum), that is doing things and doing them faster than we can today imagine. And with a further thirty years of development those chips might be doing things that would today be indistinguishable from magic, just as today’s smart phone would have seemed like utter magic to 50 year ago me.


^{\big 1}Many times the popular press, or people who should know better, refer to something that is increasing a lot as exponential.  Something is only truly exponential if there is a constant ratio in size between any two points in time separated by the same amount.  Here the ratio is 2, for any two points a year apart.  The misuse of the term exponential growth is widespread and makes me cranky.

^{\big 2}Why the Chemical Heritage Foundation for this celebration? Both of Gordon Moore’s degrees (BS and PhD) were in physical chemistry!

^{\big 3}For those who read my first blog, once again see Roy Amara’s Law.

^{\big 4}I had been a post-doc at the MIT AI Lab and loved using Lisp Machines there, but when I left and joined the faculty at Stanford in 1983 I realized that the more conventional SUN workstations being developed there and at spin-off company Sun Microsystems would win out in performance very quickly. So I built a software based Lisp system (which I called TAIL (Toy AI Language) in a nod to the naming conventions of most software at the Stanford Artificial Intelligence Lab, e.g., BAIL, FAIL, SAIL, MAIL)  that ran on the early Sun workstations, which themselves used completely generic microprocessors. By mid 1984 Richard Gabriel, I, and others had started a company called Lucid in Palo Alto to compete on conventional machines with the Lisp Machine companies. We used my Lisp compiler as a stop gap, but as is often the case with software, that was still the compiler used by Lucid eight years later when it ran on 19 different makes of machines. I had moved back to MIT to join the faculty in late 1984, and eventually became the director of the Artificial Intelligence Lab there (and then CSAIL). But for eight years, while teaching computer science and developing robots by day, I also at night developed and maintained my original compiler as the work horse of Lucid Lisp. Just as the Lisp Machine companies got swept away so too eventually did Lucid. Whereas the Lisp Machine companies got swept away by Moore’s Law, Lucid got swept away as the fashion in computer languages shifted to a winner take all world, for many years, of C.

^{\big 5}Full disclosure. DFJ is one of the VC’s who have invested in my company Rethink Robotics.

A Fair Fight?

In all science fiction movies and TV shows when there are aliens and the humans have to fight them it turns out to be close to a fair fight.  Of course the humans always win in the end, but there is plenty of drama as the fight is balanced and there is room for the pendulum to swing back and forth.

But what if we met aliens for real, and what if for some stupid reason they or we wanted to fight each other instead of study each other and perhaps revel in the wonderful idea that two different life forms who could communicate with each other had somehow found each other in this very empty Solar System/Galaxy/Universe that we occupy?

Would it possibly be a fair fight?

The Universe is 13.8 billion years old, and our galaxy is almost as old at 13.2 billion years. But the Earth is a relative newcomer at 4.6 billion years old–there are many planets in our galaxy much older than ours and many much younger. Life arose on Earth about 4.1 billion years ago.  Mostly it has been single cell life ever since–it would usually lose in a fight with blasters. Animals, weird ones, but recognizable as animals, showed up around 500 or 600 million years ago, or about 96\% into the life of the Universe. Humans first appeared about 5 million years ago, or just in the last 1\% of the 4\% of the age of the Universe, or about the last 0.04\% of the history of time.  Humans only figured out how to organize into armies no more than 10,000 years ago, or the last 0.2\% of human history or 0.00008\% of the age of the Universe. And we only got anything like guns in the last 400 years, or the last 4\% of that so then the last 0.0000032\% of the age of the Universe.  And we won’t get blasters, according to most science fiction, for another 200 years.
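For anyone who wants to check the percentages, here is a small sketch that computes each one directly from the ages quoted above. The 550 million year figure for animals is my assumed midpoint of "500 or 600 million", and dividing directly gives slightly smaller numbers than the paragraph, which chains rounded ratios together (1\% of 4\%, and so on).

```python
UNIVERSE_AGE_YEARS = 13.8e9

# Years before the present, as quoted in the paragraph above.
EVENTS = {
    "animals": 550e6,   # assumed midpoint of "500 or 600 million years"
    "humans": 5e6,
    "armies": 10_000,
    "guns": 400,
}

def percent_of_universe(years_ago):
    """Percentage of the Universe's age covered by the last `years_ago` years."""
    return 100.0 * years_ago / UNIVERSE_AGE_YEARS

for name, years_ago in EVENTS.items():
    print(f"{name:8s} last {percent_of_universe(years_ago):.7f}% of the age of the Universe")

# The window from first guns to blasters: 400 years so far plus 200 to come.
window = percent_of_universe(600)
print(f"guns to blasters window: {window:.7f}%")
```

Whichever way the rounding goes, the window in which two species would be technologically matched is a few millionths of a percent of the time available.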

So, if another species that arose somewhere else in our Galaxy, and that has interstellar travel, shows up here on Earth and turns out to be nasty and wants to fight us (oh, wait, maybe that is us who wants to do the fighting), what is the chance that they will be anywhere near to the 600 year period of our history of guns turning into blasters, and not way beyond? What is the chance that we and they will have anywhere close to the same technological level so that a fight even makes sense?  Does even one single scene in any movie or TV show about humans fighting aliens have even the slightest chance of being technologically plausible?

Or let’s put it another way.  The lion is the apex predator in Africa.  But when it comes to an encounter with an American dentist the lion always ends up dead, and the dentist walks away unscathed. Every time.

Adding to the Alphabet of Life

There was a really important scientific result reported on this week^{\big 1} in the press. The original paper^{\big 2}, by a team at Scripps Research Institute in La Jolla, CA, a person in Grenoble, France, and a person in Henan, China, is behind a paywall at the National Academy of Sciences.

This team had previously introduced a new, unnatural base pair (UBP) into the DNA of an organism based on E. coli. In the past it had caused some toxicity to the organism and also tended to get deleted during reproduction.  The new result is that they synthetically modified the organism, getting rid of the toxicity, and showed that the UBP could survive 60 generations of reproduction.

Here is what normal DNA (deoxyribonucleic acid) looks like (from Wikimedia Commons):

There are two backbone chains, left and right, of alternating 2-deoxyribose and phosphate molecules joined by complementary pairs of nucleotides, either Adenine (A) and Thymine (T) or Guanine (G) and Cytosine (C).  So reading down the left side of this fragment of DNA we have the code ACTG, and reading up the right side we have CAGT.
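The relationship between the two readings is mechanical enough to write down. A minimal sketch, with a function name of my own choosing: swap each letter for its partner, then read in the opposite direction.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def opposite_strand(strand):
    """Pair each base with its partner, then read the result in the other direction."""
    return "".join(PAIRS[base] for base in reversed(strand))

print(opposite_strand("ACTG"))  # CAGT
```

Applying the same function to CAGT gives back ACTG, which is another way of saying that either strand carries enough information to rebuild the other.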

There are lots of mechanisms about DNA and RNA that are not fully understood still, but DNA is used for two purposes.  The letters on it encode genetic sequences which are used to construct proteins (it gets more complex every decade as we understand more), the stuff of life, and it is used to make copies of itself so that one copy can remain in a parent cell and another copy goes to a new child cell.

For producing proteins the two strands or backbones are pried apart with a molecular machine moving along it, and an RNA molecule is built with complementary bases for a sub-length of the DNA. RNA (ribonucleic acid) looks like this, with just one backbone chain where ribose (which has five Oxygen atoms rather than the four of deoxyribose) molecules and phosphate molecules alternate and single bases, one of the four letters, hang off at regular intervals.

The process of producing this RNA in this way is known as transcription.  It then gets translated by another mechanism into amino acids which are linked together to produce proteins.  In all life on earth the series of letters is used three at a time (which means 64 possible combinations of the four letters, 64 = 4\times 4\times 4) of which in the “standard” setting 61 of the codings select for one of 20 amino acids, and the remaining three codings are used to say stop.  These 64 cases can easily be written down as a table for all the possible three letter sequences (which themselves are known as codons).  There are currently close to 30 (numbers change all the time…) variations on this code found in life on Earth–for instance vertebrates, invertebrates, and yeasts, each use their own slightly different version of the table in translating the DNA in the mitochondria of their cells, coding for a total of 23 amino acids (I think…)
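The counting can be checked by simply listing every codon. In this small sketch the three stop codons are written in DNA letters as TAA, TAG, and TGA, which is the standard table; the variable names are just for illustration.

```python
from itertools import product

BASES = "ACGT"
# The three stop codons of the standard table, written in DNA letters.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Every possible three letter sequence over the four bases.
codons = ["".join(triple) for triple in product(BASES, repeat=3)]
coding = [c for c in codons if c not in STOP_CODONS]

print(len(codons), len(coding), len(STOP_CODONS))  # 64 61 3
```

With 61 codings for only 20 amino acids the table is heavily redundant, so most amino acids can be spelled several different ways.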

But here is thing one: since 1990 people have done experiments where they have modified simple organisms to change the meanings of some codons to produce amino acids (there are many of them known in nature) which are not coded for in any natural system.  We will come back to this.

The second thing that happens to DNA is in reproduction and that works as follows.  The double stranded DNA is fed into a little molecular machine which unzips it where the base pairs join, and then lets a complementary base and newly constructed backbone attach to each half of the DNA, spitting out, in a continuous fashion, two copies of the original DNA, where each copy has half of the actual atoms of the original.

Now what does this new paper do?  It has added a new pair of bases to an E. coli genome, and built a version of E. coli where that reproduction mechanism for DNA handles the new letters well, and where the existence of the new letters causes no real harm to the cell.

We can call the new bases by the letters X and Y, though as you can see from this diagram they have longer names.  This is figure 1A from the paper:

At the top we see a standard Cytosine-Guanine pair, and below that two variations of X and Y (the same X in the two cases) pairings.  In this later paper they have shown that they can build a robust semi synthetic organism that carries these X and Y letters in the DNA, and preserve those letters well over at least 60 generations–that means at least 60 consecutive zippings apart and copying of the DNA including the X’s and Y’s.  In one variation they experiment with all 16 possible three letter sequences which have X in the middle and one of the regular G, A, C, or T on either side.  They state that the “loss was minimal to undetectable in 13 of the 16 cases”.

For my commentary below let’s call this thing two.  We have now seen unnatural base pairs in a living organism being reproduced reliably.

Now the next thing that one imagines these scientists must be excited about is getting the transcription mechanism to handle the new letters, and then expanding the translation table from 64 entries to some bigger number.  The theoretical maximum would be 216 = 6\times 6\times 6, though so far they have not shown any sequences that have X’s or Y’s adjacent to each other are preserved.  But let’s call this combined result of two mechanisms thing three.
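The size of a six letter table, and how much of it is within reach without adjacent unnatural letters, can be counted directly. This is a sketch only: treating "no X or Y next to another X or Y" as the constraint is my reading of the caveat above, not a claim made in the paper.

```python
from itertools import product

NATURAL = "GACT"
UNNATURAL = "XY"

def triplets(alphabet):
    """All three letter sequences over the given alphabet."""
    return ["".join(t) for t in product(alphabet, repeat=3)]

def has_adjacent_unnatural(codon):
    """True if two unnatural letters sit next to each other in the codon."""
    return any(a in UNNATURAL and b in UNNATURAL for a, b in zip(codon, codon[1:]))

all_six = triplets(NATURAL + UNNATURAL)
no_adjacent = [c for c in all_six if not has_adjacent_unnatural(c)]
x_in_middle = [c for c in all_six
               if c[1] == "X" and c[0] in NATURAL and c[2] in NATURAL]

print(len(all_six), len(no_adjacent), len(x_in_middle))  # 216 176 16
```

So even under that restriction there would be 176 possible codons rather than 64, and the 16 sequences with X in the middle that the paper tested are just one small corner of the expanded table.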

Thing one and thing two have been demonstrated.  Thing three has not.

But why am I writing this post? It is because I think thing two is a big deal about what life elsewhere might look like.

There has been some debate over whether life everywhere might look at the molecular level just like life here on Earth. I.e., perhaps it is the case that there is only one way to make life out of the chemistry that exists in our Universe (and we assume here for argument’s sake that chemistry is the same everywhere in the Universe though there is debate about that).

We already thought that it might be reasonable to expect life elsewhere in the Solar System or further afield, if we ever find it, to have different translation tables. That was due both to the multiple natural translation tables in Earth life, admittedly small variations on each other, and to thing one having been done, varying them further. In fact that has been a key question if we were to find life on Mars. If it has the same translation tables as on Earth we might presume that both forms of life came from the same place, perhaps Mars.  We have identified many meteorites on Earth that were once part of Mars, blasted off the surface of Mars by a large impact and eventually falling to Earth millions of years later. Perhaps they brought life with them.  But if we found DNA-based life on Mars to have a very different translation table from that on Earth we would tend to think that the life had arisen twice independently.

Now with thing two having been demonstrated in this new paper we might expect DNA based life on Mars to be even more different than that on Earth, perhaps using a different set of base pairs. Since we have XY and XY’ demonstrated in this paper, we could imagine that it is not such a big step to have life with none of GACT, but perhaps all based in XYZW, or PQRS, or perhaps IJKLMN. This opens up the possibilities mightily. It is no longer enough to assay samples from Mars for the four base nucleotides that we find on Earth and declare no life if we do not see them. Before we get ahead of ourselves however, we must wait for thing three to be demonstrated. But that will seal the fate of how we must look for life on Mars–in a much more expansive way.

Is there a thing four?  Yes, perhaps in another version of DNA/RNA based biology there are not three letters used for each amino acid.  In a simpler version there might be only two letters to determine a smaller number of possible amino acids, or in a more complex version four letters to determine a larger number.  The engineering challenges to modify Earth based life to perform this way are significant, so I would not expect to see that any time soon.  But it could have implications for life elsewhere.

Getting back to Earth biology, people have been trying to understand how RNA and DNA showed up to make life anywhere. A fairly sure bet is that there were simpler mechanisms before the current mechanisms we see. Perhaps all that life got obliterated, competed away, by the much more stable RNA/DNA based life we see today. Or perhaps some of it is still hiding in isolated environments on Earth and we haven’t yet recognized it.

One hypothesis is that perhaps a much less stable form of life relied on the much simpler PNA (peptide nucleic acid) shown here, but using the same modern GACT.

This is a much simpler backbone and there are arguments that it could more easily have arisen spontaneously in the primordial soup, but it is not as stable as DNA for long term storage of genetic information.  People have been doing lab experiments for twenty years getting PNA with the standard GACT bases to interact with and transfer sequences with RNA and DNA.  There are independently arguments about how the redundant standard translation table (61 coding entries but only 20 different amino acids), could have evolved from a much simpler coding system.

I think thing two shows that we must be more expansive on what we believe the biochemistry of life elsewhere might be.

My own suspicion is that there is plenty of life out there that uses totally different coding systems, and totally different molecules than RNA and DNA.

And I am getting more and more convinced that our current tools for detecting life are “all the harder to see you with”!

^{\big 1}This particular story has some questionable wording in places. This is not an entirely new type of DNA. Rather it is completely conventional DNA but it carries a new pair of base nucleotides.

^{\big 2}Yorke Zhang, Brian M. Lamb, Aaron W. Feldman, Anne Xiaozhou Zhou, Thomas Lavergne, Lingjun Li, and Floyd E. Romesberg, A semisynthetic organism engineered for the stable expansion of the genetic alphabet, Proceedings of the National Academy of Sciences,

Research Needed on Robot Hands

This is a short piece I wrote for a workshop on what are good things to work on in robotics research.

One measure of success of robots is how many of them get deployed doing real work in the real world. One way to get more robots deployed is to reduce the friction that comes up during typical deployments. For intelligent robots in factories there are many sources of friction, some sociological, some financial, some concerning takt time, some concerning PLCs and other automation, but perhaps the most friction that can be attributed to a lack of relevant research results is the problem of getting a gripper suitable for a particular task.

Today in factories the most commonly used grippers are either a set of custom configured suction cups to pick up a very particular object, or one of a myriad of parallel jaw grippers varying over a large number of parameters, and custom fingers, again carefully selected for a particular object. In both cases just one grasp is used for that particular object. Getting the right gripper for initial deployment can be a weeks long source of friction, and then changing the gripper when new objects are to be handled is another source of friction. Furthermore, grip failure can be a major source of run time errors.

Human hands just work. Give them an object from a very wide class of objects and they grip that object, usually with a wide variety of possible grips. They sense when the grip is failing and adjust. They work reliably and quickly.

Building more general hands for robots that require very little customization, that can dynamically grasp millions of different sized and shaped objects, that can do so quickly, that have a long lifetime over millions of cycles, and that just work would have significant impact on deployment of robots in factories, in fulfillment centers, and in homes.

Things like SLAM took many hundreds of researchers working for many years with an ultimately well defined problem (that definition took a few years to appear), and with access to low cost robots that could be used to produce dynamic data sets in many different environments.

Right now it is hard to define a mathematical criterion for a good robot hand, i.e., we can see nothing, and may never see anything, of comparable clarity to what we had for SLAM.

My strawman is that we will need concurrent progress in at least five areas, each feeding off the other, in order to come up with truly useful and general robot hands:

– new (low cost) mechanisms for both kinematics and force control
– materials to act as a skin (grasp properties and longevity)
– long life sensors that can be embedded in the skin and mechanism
– algorithms to dynamically adjust grasps based on sensing
– learning mechanisms on visual/3D data to inform hands for pregrasp

I think progress on one of these alone is hard to get adopted by research groups working on others. The constraints between them are not well understood and need to be challenged and adapted to by all the researchers. This is a tall order. This is why grippers on factory robots today look just like they did forty years ago.

Humanoids of Star Trek

I am a big Star Trek fan.  But there is one little problem…

How come all the races they meet are essentially humanoid, apart from the occasional pool of tar which both speaks and absorbs well loved security officers? Why is the whole Universe, well at least the whole of Alpha Quadrant of our Galaxy, full of aliens who are remarkably human like in size and form, even though they may have extra organs, always unseen except by the various “Doctors” in weird places in their torsos? Oh, and despite that, they all happen to be wonderfully sexually compatible with each other….

It all goes back to the sixties when The Original Series (TOS) was made. That was before computer graphics were anywhere near good enough to be used on film, and so all the aliens had to be played by human actors.  If it could be arranged that only the voice of the human had to be “seen” by everyone then the form of the body could be as weird as a pool of tar. But if there needed to be visible interaction then the aliens had to have human form, because that is what the available actors had.

And we won’t go into how the universal translators (nice dodge!) know how to communicate in English with any alien race before anyone has heard them first speak a word, or a paragraph, in order to learn their language…

Unexpected Consequences of Self Driving Cars

Many new technologies have unexpected impacts on the physical or social world in which we live.

When the first IMPs^{\big 1} for the fledgling ARPANET were being built starting in 1969 at BBN^{\big 2} in Cambridge, MA, I think it safe to say that no one foresaw the devastating impact that the networking technology being developed would have on journalism thirty to fifty years later. Craigslist replaced classified ads in newspapers and took a huge amount of their revenue away, and then Google provided a new service of search for things that one might buy and at the same time delivered ads for those things, taking away much of the rest of advertising revenue from print, radio, and TV, the homes of most traditional journalism. Besides loss of advertising cutting the income stream for journalism, and thus cutting the number of employed journalists, the new avenues for versions of journalism are making it more difficult for traditional print journalists to compete, as John Markoff recently talked about in announcing his retirement from the New York Times.

A way of sharing main frame computer power between research universities ended up completely disrupting how we get our news, and perhaps even who we elected as President.

Where might new unexpected upendings of our lives be coming from?

Perhaps the new technology with the biggest buzz right now is self driving cars.

In this post I will explore two possible consequences of having self driving cars, two consequences that I have not seen being discussed, while various car companies, non-traditional players, and startups debate what level of autonomy we might expect in our cars and when. These potential consequences  are self-driving cars as social outcasts and anti-social behavior of owners.  Both may have tremendous and unexpected influence on the uptake of self-driving cars.  Both are more about the social realm than the technical realm, which is perhaps why technologists have not addressed them. And then I’ll finish, however, by dissing a non-technical aspect of self driving cars that has been overdone by technologists and other amateur philosophers with an all out flame. And yes, I am at best an amateur philosopher too. That’s why it is a flame.

But first…

Levels of Autonomy

There is general agreement on defining different levels of autonomy for cars, numbered 0 through 5, although different sources have slightly different specifications for them.  Here are the levels from the autonomous car entry in Wikipedia which attributes this particular set to the SAE (Society of Automotive Engineers):

  • Level 0: Automated system has no vehicle control, but may issue warnings.
  • Level 1: Driver must be ready to take control at any time. Automated system may include features such as Adaptive Cruise Control (ACC), Parking Assistance with automated steering, and Lane Keeping Assistance (LKA) Type II in any combination.
  • Level 2: The driver is obliged to detect objects and events and respond if the automated system fails to respond properly. The automated system executes accelerating, braking, and steering. The automated system can deactivate immediately upon takeover by the driver.
  • Level 3: Within known, limited environments (such as freeways), the driver can safely turn their attention away from driving tasks, but must still be prepared to take control when needed.
  • Level 4: The automated system can control the vehicle in all but a few environments such as severe weather. The driver must enable the automated system only when it is safe to do so. When enabled, driver attention is not required.
  • Level 5: Other than setting the destination and starting the system, no human intervention is required. The automatic system can drive to any location where it is legal to drive and make its own decision.

Some versions of level 4 specify that there may be geographical restrictions, perhaps to places that have additional external infrastructure installed.
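For readers who think in code, the levels above can be sketched as a small lookup. This is only an illustration of the list as quoted here, not anything published by the SAE: the names AutonomyLevel and needs_human_fallback are made up for this sketch, and the rule simply restates the bullets, where levels 0 through 3 need a person driving or prepared to take control and levels 4 and 5 do not once the system is engaged.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """Levels 0-5 as paraphrased from the list above (member names are illustrative)."""
    WARNINGS_ONLY = 0   # no vehicle control, may issue warnings
    DRIVER_ASSIST = 1   # e.g. adaptive cruise control, lane keeping
    PARTIAL = 2         # system accelerates, brakes, steers; driver monitors
    CONDITIONAL = 3     # driver may look away in known, limited environments
    HIGH = 4            # no driver attention needed once safely enabled
    FULL = 5            # only a destination is needed


def needs_human_fallback(level: AutonomyLevel) -> bool:
    """True when a person must drive or be prepared to take control."""
    return level <= AutonomyLevel.CONDITIONAL


# The levels with no human at the ready.
driverless = [lvl.name for lvl in AutonomyLevel if not needs_human_fallback(lvl)]
print(driverless)
```

The single comparison in needs_human_fallback is the dividing line that matters for the rest of this post: everything below it still has a person to make eye contact with.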

Today almost all new cars have level 1 autonomy features, and level 2 autonomy is becoming more common in production products. Some manufacturers are releasing software for level 4 though the legality and prudence of doing so right now is open to question.

There is much debate on how to have safe versions of level 2 and level 3 autonomy as both require a human to jump into the control loop when their attention has been wandering.  The time available for the person to reorient their concentration in order to respond to events in the world is often much shorter than what people really need.   I think most people agree that there might be a natural progression from level 4 to level 5, but there are different opinions on whether going from level 2 to level 3, or, more vociferously, from level 3 to level 4 are natural progressions.  As a result there are advocates for going straight to level 4, and there are many start up companies, and non-traditional players (e.g., Google) trying to go directly to level 4 or level 5 autonomy.

The rest of this post is about level 4 and level 5 autonomy.  What are the unexpected social consequences of having cars driving around without a human driver in command or at the ready to be in command?


1. Social Outcasts

Suppose you are walking along a twisting narrow country road at night, with no shoulder and thick vegetation right at the edge of the pavement, and with no moon out, and you hear a car approaching.  What do you do?  I know what I would do!  I’d get off the road, climbing into the bushes if necessary, until the car had passed.  Why would I do that?  Because I would have no indication of whether the driver of the car had seen me and was going to avoid hitting me.

We all realize that on dark narrow country roads anonymous cars are first class citizens and we pedestrians are second class.  We willingly give cars the right of way.

But what about in the daytime (or even night time) in an urban area where you live?  There, pedestrians and cars interact all the time.  And much of that interaction is social interaction between the person on the street and the person behind the wheel.  Sometimes it is one way interaction, but often it is two way interaction.  Two questions arise.  If self driving cars can not participate in these interactions how will people feel about these new aliens sharing their space with them?  And in the interests of safety for pedestrians, how much will the performance of self driving cars need to be detuned relative to human driven cars, and how will that impact the utility of those cars, and degrade the driving experience of people that have traditional level 1 cars?

Within a few blocks of where I live  in Cambridge, MA, are two different microcosms of how people and cars interact.  Other neighborhoods will have other ways of interacting but the important point is how common interaction is.

The streets for a few blocks around where I live are all residential with triple decker apartment buildings or single or duplex houses on small lots.  The streets are narrow, and many of them are one-way.  There are very few marked pedestrian crossings.  People expect to be able to cross a street at any point, but they know there is give and take between drivers and pedestrians and there are many interactions between cars and people walking.  They do not think that cars are first class citizens and that they are second class.  Cars and people are viewed as equals, unlike on a narrow country road at night.

Cars and people interact in three ways that I have noticed in this area.  First, on the longer through roads the cars travel without stop signs, but there are stop signs on the entering or cross side streets.  People expect the right of way on the longer streets too, expecting that cars that have stopped on a side street will let them walk in front if they are about to step off the curb.  But people look to the driver for acknowledgement that they have been seen before they step in front of the car.  Second, when people want to cross a street between intersections or on one of the through streets without stop signs they wait for a bit of a gap between cars, step out cautiously if one is coming and confirm that the car is slowing down before committing to be in the middle of the road.   But often they will step off the curb and partially into the road not expecting the very next car to let them go, but the one that is second to reach where they are–they do expect that second car to let them cross.  And third, the sidewalks are narrow, and especially when there is snow can be hard to walk on (residents are responsible for the sidewalk in front of their properties, and can take a while to clear them) so in winter people often walk along the roads, trying to give room for the cars to go by, but nevertheless expecting the cars to be respectful of them and give them room to walk along the road.

A few blocks further away from where I live is a somewhat different environment, a commercial shopping, bar, and restaurant area (with the upper floors occupied by M.I.T. spin-off startups), known as Central Square^{\big 3}. There are marked pedestrian crossings there, and mostly people stick to crossing the roads at those designated places.  Things are a little less civil here, perhaps because more people driving through are not local residents from right around the neighborhood.

People step out tentatively into the marked cross walks and visually check whether on-coming drivers are slowing down, or indicate in some way that they have seen the pedestrian.  During the day it is easy to see into the cars and get an idea of what the driver is paying attention to, and the same is actually true at night as there is enough ambient light around to see into most cars.  Pedestrians and drivers mostly engage in a little social interaction, and any lack of interaction is usually an indicator to the pedestrian that the driver has not seen them.  And when such a driver barrels through the crossing the pedestrians get angry and yell at the car, or even lean their hands out in front of the car to show the driver how angry they are.

Interestingly, many pedestrians reward good behavior by drivers.  Getting on the main street or off of the main street from or onto a small side street can often be tricky for a driver.  There are often so many people on the sidewalks that there is a constant flow of foot traffic crossing the exits or entrances of the side streets.   Drivers have to be patient and ready for a long wait to find a break.  Often pedestrians who have seen how patient a driver is being will voluntarily not step into the cross walk, and either with a head or hand signal indicate to a driver that they should head through the crossing.  And if the driver doesn’t respond they make the signal again–the pedestrian has given the turn to the driver and expects them to take it.

There are big AI perception challenges, just in my neighborhood, to get driverless cars to interact with people as well as driverful cars do. What if level 4 and level 5 autonomy self driving cars are not able to make that leap of fitting in as equals as current cars do?

Cars will clearly have to be able to perceive people walking along the street, even and especially on a snowy day, and not hit them.  That is just not debatable.  What is debatable is whether the cars will need to still pass them, or whether they will slowly follow people not risking passing them as a human driver would.  That slows down the traffic for both the owner of the driverless car, and for any human drivers.  The human drivers may get very annoyed with being stuck behind driverless cars.  Driverless cars would then be a nuisance.

In the little side streets, when at a stop sign, cars will have to judge when someone is about to cross in front of them.  But sometimes people are just chatting at the corner, or it is a parent and child waiting for the school bus that pulls up right there.  How long should the driverless car wait?  And might someone bully such cars by teasing them that they are about to step off the curb–people don’t try that with human drivers as there will soon be repercussions, but driverless cars doing any percussioning will just not be acceptable.

Since there are no current ways that driverless cars can give social signals to people, beyond inching forward to indicate that they want to go, how will they indicate to a person that they have seen them and it is safe to cross in front of the car at a stop sign?  Perhaps the cars will instead need to be 100% guaranteed to let people go.  Otherwise without social interactions it would be like the case of the dark country road.  In that case driverless cars would have a privileged position compared to cars with human drivers and pedestrians.  That is not going to endear them to the residents.  “These damn driverless cars act like they own the road!”  So instead, driverless cars will need to be very wimpy drivers, slowing down traffic for everybody.

At a cross walk in Central Square driverless cars potentially might be stuck for hours. Will people take pity on them as they do on human drivers? To take advantage of this the cars would need to understand human social signals of giving them a turn, but without a reciprocal signal it is going to be confusing to the generous pedestrians and they may soon decide to not bother being nice to driverless cars at all. That will only make it more frustrating for a human driver stuck behind them, and in Central Square at least, that will quickly lead to big traffic jams. “Damn those driverless cars, they just jam the place up!”

According to this report from the UK, there are predictions that traffic on highways will slow down somewhat because of timid autonomous systems until some threshold of autonomous density is reached.   I think the dynamics where we consider the role of pedestrians is going to be very different and much more serious.

If self driving cars are not playing by the unwritten rules of how pedestrians and other drivers expect cars to interact, there will be ire directed at someone.  In the case of cars with level 2 or level 3 autonomy there will be a driver in the driver’s seat, and pedestrians will see them, see their concerns being ignored by the person, and direct their ire at that person, most likely the owner or the person temporarily using the car as a service.  If the car is under level 4 or level 5 autonomy it may be totally unoccupied, or have no seating in what would be the driver’s seat, and then the ire will be directed at that class of car.

I see a real danger of contempt arising for cars with level 4 and level 5 autonomy.   It will come from pedestrians and human drivers in urban areas.  And when there is contempt and lack of respect, people will not be shy about expressing that contempt.

At least one manufacturer is afraid that human drivers will bully self driving cars operating with level 2 autonomy, so they are taking care that in their level 3 real world trials the cars look identical to conventional models, so that other drivers will not cut them off and take advantage of the heightened safety levels that lead autonomous vehicles to drive more cautiously.

2. Anti-social Behavior of Owners

The flip side of autonomous cars not understanding social mores well enough, is owners of self driving cars using them as a shield to be anti-social themselves.

Up from Central Square towards Harvard Square is a stretch of Massachusetts Avenue that is mixed residential and commercial, with metered parking.  A few weeks ago I needed to stop at the UPS store there and ship a heavy package.  There were no free parking spots so I soon found myself cruising up and down along about a 100 meter stretch, waiting for one to open up.  The thought occurred to me that if I had had a level 4 or 5 self driving car I could have left it to do that circling, while I dropped into the store.

Such is the root of anti-social behavior. Convenience for the individual, me not having to find a parking spot, versus over exploitation of the commons, filling the active roadway with virtually parked cars. Without autonomous vehicles UPS locations that are in places without enough parking shed some of their business to locations that have more extensive parking. That dynamic of self balancing may change once car owners have an extra agent at their beck and call, the self driving system of their automobiles.

We have seen many groups, including Tesla, talk about the advantage to individuals  of having their cars autonomously dealing with parking, so from a technical point of view I think this capability is one that is being touted as an advantage of autonomous cars. However, it gets to interact with human nature and then anti-social behavior can arise.

I think there will be plenty of opportunity for people to take other little short cuts with their autonomous cars. I’m sure the owners will be more creative than I can be, but here are three additional examples.

(1) People will jump out of their car at a Starbucks to run in and pick up their order knowingly leaving it not in a legal parking spot, perhaps blocking others, but knowing that it will take care of getting out of the way if some other car needs to move or get by. That will be fine in the case there is no such need, but in the case of need it will slow everything down just a little. And perhaps the owner will be able to set the tolerance on how uncomfortable things have to get before the car moves. Expect to see lots of annoyed people. And before long grocery store parking lots, especially in a storm, will just be a sea of cars improperly parked waiting for their owners.

(2) This is one for the two (autonomous) car family. Suppose someone is going to an event in the evening and there is not much parking nearby. And suppose autonomous cars are now always prowling neighborhoods waiting for their owners to summon them, so it takes a while for any particular car to get through the traffic to the pick up location. Then the two car family may resort to a new trick so that they don’t have to wait quite so long as others for their cars to get to the front door pick up at the conclusion of the big social event. They send one of their cars earlier in the day to find the closest parking spot that it can, and it settles in for a long wait. They use their second car to drop them at the event and send it home immediately. When the event is over their first autonomous car is right there waiting for them–the cost to the commons was a parking spot occupied all day by one of their cars.

(3) In various suburban schools that my kids went to when they were young there was a pick up ritual, which I see being repeated today when I drive past a school at the right time. Mothers, mostly, would turn up in their cars just before dismissal time and line up in the order that they arrived with the line backing out beyond the school boundary often. When school was over the teachers would come outside with all the kids and the cars would pull up to the pick up point^{\big 4}, the parents and teachers would cooperate to get the kids into their car seats, and off would go the cars with the kids, one at a time. When the first few families have fully driverless cars, one can imagine them sending their cars to wait in line first, so that their kids get picked up first and brought home. Not only does that mean that other parents would have to invest more of their personal time waiting in order to get their kids earlier, while the self driving car owners do not, but it ends up putting more responsibility on the teachers. Expect to see push back on this practice from the schools. But people will still try it.

Early on in the transition to driverless cars the 1% will have a whole new way to alienate the rest of the society. If you don’t think so, take a drive south from San Francisco on 101 in the morning and see the Teslas speeding down the left most lane.

What This Means

There are currently only fifteen fully driverless train systems in the United States, mostly in airports, and all with at most a handful of miles of track, all of which is completely spatially separated from any rights of way for any vehicles or pedestrians outside of the systems.  The first large scale driverless mass transit system in the US is going to be one that is under construction in Honolulu at this time, scheduled to be in initial operation in 2020 (though in late 2014 it was scheduled to begin operation in 2017).

There have been designs for larger scale systems to be driverless for almost fifty years–for instance the San Francisco BART (Bay Area Rapid Transit) trains, first introduced in 1972, had lots of control automation features right at the beginning.  Failures and accidents however meant that many manual systems were added and sometimes later removed, sometimes having serious negative impact on overall efficiency of the system.

The aspirations for driverless train systems most closely correspond to level 4 autonomy for cars, but in very restrictive geographical environments.  Level 5 autonomy for trains would correspond to trains on tracks with level crossings, or street cars that share space with automobiles and pedestrians.  No one is advocating for, or testing, level 5 train autonomy at this moment.

Note also that train navigation is very much simpler than automobile navigation.  There are guide rails!  They physically restrict where the trains can go.  And note further that all train systems are very much operated by organizations full of specialists.  Individual consumers do not go out and buy trains and use them personally–but that is what we are expecting will happen with individual consumers buying and using self driving cars.

Level 4 autonomy for trains is much easier than level 4 autonomy for cars.  Likewise for level 5.  But we hardly have any level 4 autonomous trains in the US.

Gill Pratt, CEO of Toyota Research Institute^{\big 5} said just a few days ago that “none of us in the automobile or IT industries are close to achieving true Level 5 autonomy”.

The preceding two sections talked about two ways in which self driving cars are going to get a bad name for themselves, as social outcasts in situations where there are pedestrians and other drivers, and in enabling anti-social behavior on behalf of their owners. Even ignoring the long tail of technical problems remaining to be solved for level 5 autonomy, to which Pratt refers, I think we are going to see push back from the public against level 5 and against widespread level 4 autonomy.  This pushback is going to come during trials and early deployments.  It may well be fierce.  People are going to be agitated.

Technically we will be able to make reasonable systems with level 4 autonomy in the not too distant future, but the social issues will mean that the domains of freedom for level 4 autonomous vehicles will be rather restricted.

We’ll see autonomous trucks convoying behind a single human occupied truck (perhaps itself a level 3 vehicle) in designated lanes on highways. But once off the highway we’ll demand individual humans in each truck to supervise the off highway driving.

Just as in airports where we have had self driving trains for quite a while we’ll see limited geographic domains where we have level 4 autonomous cars operating in spaces where there are no pedestrians and no other human drivers.

For instance, it will not be too long before we’ll have garages where drivers drop off their cars which then go and park themselves with only inches on each side in tightly packed parking areas. Your car will take up much less room than a human parked car, so there will be an economic incentive to develop these parking garages.

Somewhat later we might see level 4 autonomy for ride hailing services in limited areas of major cities.  The ride will have to begin and terminate within a well defined geographic area where it is already the case that pedestrian and automobile traffic is well separated by fairly strong social norms about the use of  walk signals at the corner of every block.  Some areas of San Francisco might work for this.

We might also see level 4 autonomy on some delivery vehicles in dense urban environments.  But they will need to be ultra deferential to pedestrians, and not operate and clog things up for other cars during peak commuting periods.  This could happen on a case by case basis in not too many years, but I think it will be a long time before it gets close to being universally deployed as a means of delivery.

We’ll see a slower than techies expect deployment of level 4 autonomy, with a variety of different special cases leading the way.  Level 5 autonomy over large geographical areas is going to be a long time coming.  Eventually it will come, but not as quickly as many expect today.

The futurist Roy Amara was well known for saying: We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.

That is where we are today.  People are overestimating how quickly level 5 autonomy will come, and even overestimating how widespread level 4 autonomy will be any time soon.  They are seeing the technical possibilities and not seeing the resistance that will come with autonomous agents invading human spaces, be they too rude or overly polite. But things will march on and at some point every single car will be level 5 autonomy and we’ll no longer let people drive.  Eventually it will creep up on us and we’ll hardly notice^{\big 6} when it does happen.

Eventually manual driving will disappear in all but specialized entertainment zones.  But by then we won’t notice.  It is inevitable.  But, that day will not be soon.  And the flying cars will be even later.

And now we get to a little flaming:


There is a serious question about how safe is safe.  35,000 people in the US are killed in motor vehicle accidents per year, with about 1.25 million world wide.  Right now all these deaths involve human drivers. They are both horribly large numbers.  Over the last 120 years we, the human race, have decided that such high numbers of deaths are acceptable for the usefulness that automobiles provide.

My guess is that we will never see close to such high numbers of deaths involving driverless cars.  We just will not find them acceptable, and instead we will delay adopting levels 4 and 5 autonomy, at the cost of more overall lives lost, rather than have autonomous driving systems cause many deaths at all.  Rather than 35,000 annual deaths in the US it will not be acceptable unless it is a relatively tiny number.  Ten deaths per year may be deemed too much, even though it could be viewed as minus 34,990 deaths.  A very significant improvement over the current state of affairs.
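The arithmetic behind that "minus 34,990" is worth spelling out, if only to show how lopsided the comparison is. The sketch below uses the 35,000 figure quoted above and the hypothetical ten deaths; it assumes, purely for illustration, that driverless cars replace all human driving, and the function name net_change_in_deaths is made up here.

```python
US_ANNUAL_ROAD_DEATHS = 35_000  # figure quoted in the text, all involving human drivers


def net_change_in_deaths(driverless_deaths: int,
                         baseline: int = US_ANNUAL_ROAD_DEATHS) -> int:
    """Change in annual deaths if driverless cars replaced all human driving.

    Assumes (hypothetically) a complete replacement, so the baseline drops to zero.
    """
    return driverless_deaths - baseline


change = net_change_in_deaths(10)
print(change)  # -34990
```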

It won’t be rational. But that is how it is going to unfold.

Meanwhile, there has been a cottage industry of academics and journalists looking for click bait (remember, their whole business model got disrupted by the Internet–they are truly desperate, and have been driven a little mad), asking questions about whether we will trust our cars to make moral decisions when they are faced with horrible choices.

You can go here to a web site at M.I.T. to see the sorts of moral decisions people are saying that autonomous cars will need to make.  When the brakes suddenly fail should the car swerve to miss a bunch of babies in strollers and instead hit a gaggle of little old ladies?  Which group should the car decide to kill and which to save, and who is responsible for writing the code that makes these life and death decisions?

Here’s a question to ask yourself. How many times when you have been driving have you had to make a forced decision on which group of people to drive into and kill? You know, the five nuns or the single child? Or the ten robbers or the single little old lady? For every time that you have faced such decision, do you feel you made the right decision in the heat of the moment? Oh, you have never had to make that decision yourself? What about all your friends and relatives? Surely they have faced this issue?

And that is my point. This is a made up question that will have no practical impact on any automobile or person for the foreseeable future. Just as these questions never come up for human drivers they won’t come up for self driving cars. It is pure mental masturbation dressed up as moral philosophy. You can set up web sites and argue about it all you want. None of that will have any practical impact, nor lead to any practical regulations about what can or can not go into automobiles. The problem is both nonexistent and irrelevant.

Nevertheless there is endless hand wringing and theorizing, in this case at Newsweek, about how this is an oh so important problem that must be answered before we entrust our cars to drive autonomously.

No it is not an important question, and it is not relevant. What is important is to make self driving cars as safe as possible. And handling the large tail of perceptual cases that arise in the real world will be key to that.

Over the years many people have asked me and others whether our robots are “three laws safe”. They are referring to Asimov’s three laws from his science fiction books in the 1950’s about humanoid robots.

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

But those who have actually read Asimov’s books know that Asimov used these laws as a source of plot, where ambiguities led to a plot twist, or where, through a clever set up, conflicts between the laws were introduced. They were a joke!  It has not stopped the press breathlessly picking up on this as an important factor for robots.  Almost as bad as how the press picks up on the Turing test (itself a rhetorical device used by Alan Turing to make a point, not an actual certification of intelligent behavior).  Not that it is all the fault of the press.  There are plenty of academics (and recently Lords, physicists, and billionaires) who have also chosen to draw attention to a supposed barrier to the use of AI–whether machines will be moral.  There is nothing sensible to say on these issues at this time.

For Asimov’s laws none of our robots or perception systems can figure out the state of the world well enough for any robot today, or in the foreseeable future, to figure out when which law applies. And we won’t have cars that can tell nuns from robbers–how about robbers dressed as nuns, all the better when out on a bank robbing spree?

The Newsweek article, somewhat tongue in cheek, suggests:
“To handle these relative preferences, we could equip people with beacons on their cellphones to signal nearby cars that they are a certain type of person (child, elderly, pedestrian, cyclist). Then programmers could instruct their autonomous systems to make decisions based on priorities from surveys or experiments like the Moral Machine.”
Err, yeah.  This is going to work well, as no robber is ever going to choose the nun setting on their phone–I’m sure they will identify themselves as a robber, as they should!

My favorite answer to this general moral dilemma, known as the trolley problem, was given by Nicholas, the two year old son of E. J. Masicampo who teaches a moral philosophy class. Seen here, dad sets up Nicholas’ wooden train set so that taking one fork will kill one person, and the other fork will kill five. Asked what should the train do, Nicholas moves the singleton to lie on the same track as the other five, then drives his train into all six of them, scatters them all, and declares “oh, oh”!



^{\big 1}Interface Message Processors. Today they would be referred to as Internet protocol routers.

^{\big 2}Bolt, Beranek and Newman in Cambridge, MA, a company that was always known as BBN. As distinct from BBN, the Buckingham Browne and Nichols school in Cambridge, MA — no doubt many employees of BBN sent their kids to school at BBN.

^{\big 3}Like all things called “Squares” in Massachusetts there is absolutely nothing to do with squareness in Central Square.  It is just a region of Massachusetts Avenue in Cambridge where there is so much commercial activity that there are zero buildings with residential occupancy at ground level.

^{\big 4}My kids all went to a private pre-school in Concord, MA, and almost all the parents owned dark blue Volvo 240DL station wagons.  Although our kids could all tell their parents’ car from the others at the grocery store, it just didn’t work at this pre-school.  The kids could never tell when it was their parent rolling up for the next pickup.  That was back when the 1% was a few more percentage points of the population, and not quite as hollowed out as now…

^{\big 5}Full disclosure.  I am on the advisory board for the Toyota Research Institute, but this blog post represents only my own thoughts on autonomous driving.

^{\big 6}Most people failed to notice that a technology, analog TV,  that had been omnipresent for most of their lives was overtaken and then one day it just disappeared as it did in the US on June 12, 2009.  Poof!  It was gone.