Blog – Page 3 – Rodney Brooks

December 16, 2020 — Quick Takes

An Analogy For The State Of AI

rodneybrooks.com/an-analogy-for-the-state-of-ai/

In surveys of AI “experts” on when we are going to get to human level intelligence in our AI systems, I am usually an outlier, predicting it will take ten or twenty times longer than the second most pessimistic person surveyed. Others have a hard time believing that it is not right around the corner given how much action we have seen in AI over the last decade.

Could I be completely wrong? I don’t think so (surprise!), and I have come up with an analogy that justifies my beliefs. Note, I started with the beliefs, and then found an analogy that works. But I think it actually captures why I am uneasy with the predictions of just two, or three, or seven decades, that abound for getting to human level intelligence. It’s a more sophisticated and detailed version of the story about how building longer and longer ladders will not get us to the Moon.

The analogy is to heavier than air flight.

All the language that follows is expressing that analogy.

Starting in 1956 our AI research got humans up in the air in hot air balloons, in tethered kites, and in gliders. About a decade ago Deep Learning cracked the power to weight ratio of engines that let us get to heavier than air powered flight.

If we look at the arc of the 20^th century, heavier than air flight transformed our world in major ways.

It had a major impact on war in just its first two decades, and four decades in it completely transformed how wars were fought, and continued that transformation for the rest of the century.

From the earliest days it changed supply chains for goods with high ratios of value to mass. First it was with mail–the speed of sending long messages (telegraphs were good only for short messages), and then later the speed of getting bank instructions and letters of credit, and receipts, all paper based, around the globe transforming commerce into a global enterprise. Later it could be used for the supply chains of intrinsically high value goods, including digital devices, and even bluefin tuna.

It also transformed human mobility, giving rise to global business even in the PZ¹ era. But it also gave the richest parts of the world a new level of personal mobility and a new sort of leisure for them. Airplanes caused overcrowding in Incan Citadels and palaces of divine kings, steadily getting worse as more countries and larger populations reached the wealth threshold of personal air travel. Who knew? I doubt that this implication was on the mind of either Wilbur or Orville.

Note that for the bulk of the 20^th century most heavier than air flight used a human pilot on board, and even today that is still true for the majority of flying vehicles. The same for AI. Humans will still be in the loop for AI applications, at one level or another, for many decades.

This analogy between AI and heavier than air flight embodies two things that I firmly believe about AI.

First, there is tremendous potential for our current version of AI. It will have enormous economic, cultural, and geopolitical impact. It will change life for ordinary people over the next many decades. It will create great riches for some. It will lead to different world views for many. Like heavier than air flight it will transform our world in ways in which we can not yet guess.

Second, we are not yet done with the technology by a long shot. The airplanes of 1910 (and AI of today) in hindsight look incredibly dangerous, their use was really only for daredevils without too much regard for fatal consequences, and no one but a complete nutcase flies a 1910 technology airplane any more. But the airplanes did get better, and the technology underwent radical changes. No modern airplane engine works at all like those of 1903 or even 1920. Deep Learning, too, will be replaced over time. It will seem quaint, and barely viable in retrospect. And woefully energy inefficient.

But what about human level intelligence? Well, my friends, I am afraid that in this analogy that lies on the Moon. No matter how much we improve on our basic architecture of our current AI over the next 100 years, we are just not going to get there. We’re going to need to invent something else. Rockets. To be sure, the technologies used in airplanes and rockets are not completely disjoint. But rockets are not just over developed heavier than air flying machines. They use a different set of principles, equations, and methodologies.

And now the real kicker for this analogy. We don’t yet know how deep is our gravity well. We don’t yet know how to build human level intelligence. Our current approaches to AI (and neuroscience) might be so far off that in retrospect it will seen to have been so wacky that it will be classified as “not even wrong”. (For those who do not know the reference that is a severe diss.)

Our gravity well? For a fixed density of its innards the surface gravity of a planet is proportional to its radius or diameter. If our Earth was twice its actual diameter, gravity would be twice what it is for us, and today’s chemical rockets probably wouldn’t have enough oomph to get people into orbit. Our space faring aspirations would be much harder to achieve, even at the tiny deployed level we have so far, 117 years after our first manned heavier than air flights. But could it be worse than two times? Earth with a 32,000 mile diameter, rather than our 8,000 mile diameter and gosh, chemical rockets might never be enough (and we would probably be much shorter and squatter…). That is how little we know today. We don’t know what the technology of human level intelligence is going to look like, so we can’t estimate how hard it is going to be to achieve. Two hundred years ago no one could have given a convincing argument on whether chemical powered rockets would get us to the Moon or not. We didn’t know the relationships between chemical reactions and Earth’s gravity that now give us the answer.

That human level intelligence up there on the Moon is going to be out of reach for a long, long time. We just do not know for how long, but we surely know that we are not going to fly there on winged aircraft.

¹Pre Zoom.

August 3, 2020 — Quick Takes

How Much Things Can Change

rodneybrooks.com/how-much-things-can-change/

This post is about how much things can change in the world over a lifetime. I’m going to restrict my attention to science, though there are many parallels in technology, human rights, and social justice.

I was born in late 1954 so I am 65 years old. I figure I have another 30 years, with some luck, of active intellectual life. But if I look backward and forward within my family, I knew as an adult some of my grandparents who were born late in the nineteenth century, and I expect that both I will know some of my own grandchildren when they are adults, and that they will certainly live into the twenty second century. My adult to adult interactions with members of my own direct genetic line will span five generations and well over two hundred years from the beginning to the end of their collective lives, from the nineteenth to the twenty second century.

I’m going to show how shocking the changes have been in science throughout just my lifetime, how even more shocking the changes have been since my grandparents were born, and by induction speculate on how much more shock there will be during my grandchildren’s lifetimes. All people who I have known.

Not everything will change, but certainly some things that we treat as truth and obvious today will no longer seem that way by early next century. We can’t know exactly which of them will be discarded, but I will put up quite a few candidates. I would be shocked if at least some of them have not fallen by the wayside a century from now.

How Has Science Changed Since Relatives I knew were born?

My oldest grandparent was born before the Michelson-Morley experiment of 1887 that established that the universe was not filled with aether, but rather with vacuum.

Even when all four of my grandparents were first alive neither relativity nor quantum mechanics had been thought of. Atoms with a nucleus of protons and neutrons, surrounded by a cloud of electrons were unknown. X rays and other radiation had not been detected.

The Earth was thought to be just 20 to 40 million years old, and it wasn’t until after I was born that the current estimates were first published.

It was two months after my father was born that Edwin Hubble described galaxies and declared that we lived in one of many, in our case the Milky Way. While my father was young the possibility that some elements might be fissile was discovered, along with the idea that some might be fusible. He was an adult back from the war when the big bang theory of the universe was first proposed, and it is only in the last 30 years that alternatives were largely shouted down.

In my lifetime we started out with nine planets, went down to eight, and now have observed thousands of them in nearby star systems. Plate tectonics, which first revealed that continents were not historically statically in place and explained both earthquakes and volcanoes were first hypothesized after I was born.

Crick and Watson determined the structure of DNA just the year before I was born. I was a toddler when Crick hypothesized the DNA to RNA translation and transcription mechanism, in school when the first experimental results showing how it might work came in, and in college before it was mostly figured out. Then it was realized that most animal and plant DNA does not code for proteins, and so it was labeled as junk DNA. Gradually over time other functions for that DNA have been discovered and it is now called non-coding DNA. All its functions have still not been worked out.

I was in graduate school when it was figured out that the split of all life on Earth into prokaryotes (cells without a nucleus) and eukaryotes (cells with a nucleus) was inadequate. All animals, plants, and fungi, belong to the latter class. But in fact there are two very distinct sorts of prokaryotes, both single celled, the bacteria and the archaea. The latter were completely unknown until the first ones were found in 1977. Now the tree of life has archaea and eukaryotes branching off from a common point on a different branch than the bacteria. We are more closely related to the unknown archaea than we are to bacteria. We had completely missed a major type of living organism; the archaea on Earth have a combined mass of more than three times that of all animals. We were just plain unaware of them–they are predominantly located in deep subsurface environments, so admittedly they were not hiding in plain sight.

In just the last few years we have realized that human bodies contain ten times more cells that are bacteria, than they contain cells that have our DNA in them, though the bacterial mass is only about 3% of our body weight. Before that we thought we were mostly us.

The physical structure of neurons and the way they connect with each other was first discovered when my grandparents were already teenagers and young adults. The rough way that neurons operate was elucidated in the decade before my birth, but the first paper that laid out a functional explanation of how neurons could operate in an ensemble did not occur until I was already in kindergarten (in the “What the Frog’s Eye Tells the Frog’s Brain paper”). Half the cells in brains, the glia, were long thought to be physical support and suppliers of nutrients to the neurons, playing no direct role in what neural systems did. In the second half of the twentieth century we have come to understand that they play a role in neurotransmission, and modulate all sorts of behavior of the neurons. There is still much to be learned. More recently the role of small molecules diffusing locally in the brain have been shown to also affect how neurons operate.

What is known in science about cosmology, physics, the mechanisms of life, and neuroscience had changed drastically since my grand parents were born, and has continued to change right up until today. Our scientific beliefs have not been static, and have constantly evolved.

Will science continue to change?

It seems entirely unlikely to me that my grandchildren will one day be able to say that up until they were born scientific ideas came and went with accepted truth regularly changing, but since they were born science has been very stable in its set of accepted truths.

Things will continue to change. Below I have put a few things that I think could change from now into the beginning of the next century. I am not saying that any particular one of these will be what changes. And I would be very surprised if more than half of these will be adopted. But I have selected the ideas that currently gnaw at me and do not feel as solid as some other ideas in science. Some will no doubt become more solid. But it will not surprise me so much if any individual one of these turns into accepted wisdom.

Cosmology:

There is no dark matter.
The Universe is not expanding.
The big bang was wrong.

Physics:

There is a big additional part of quantum mechanics to be understood.
String theory is bogus.
The many worlds interpretation is decided to be confused and discarded.

Life:

We discover that there is a common ancestor to archaea, bacteria, and eukarya, a fourth domain of life, that still exists in some places on Earth–and it is clearly a predecessor to the three that we know about now in that it does not have the full modern DNA version of genetics, but instead is a mixture of DNA and RNA based genetics, or purely RNA, or perhaps purely PNA, and has a simplified ancestral coding scheme.
We discover life elsewhere in the solar system and it is clearly not related to life on Earth. It is different, with different components and mechanisms, and its abiogenesis was clearly independent of the one on Earth.
We detect life on a planet that we can observe in a nearby solar system.
We detect an unambiguously artificial signal from much further away.

Neuroscience:

We discover the principles of a rich control system in plants that is not based on neurons, but nevertheless explains the complex behaviors of plants that we can observe when we speed up videos of them operating in the world. So much for electro-centrism!
We move away from computational neuroscience with a new set of metaphors that turn out to have both better explanatory power and provide tools that explain otherwise obtuse aspects of neural systems.
We find not just a different metaphor, but actual new mechanisms in neural systems of which we have not previously been aware, and they become dominant in our understanding of the brain.

How to have impact in the world

If you want to be a famous scientist then figure one of these out. You will redirect major intellectual pursuits of mankind. But, it will be a long lonely road.

May 18, 2020 — Quick Takes

Peer Review

rodneybrooks.com/peer-review/

This blog is not peer reviewed at all. I write it, I put it out there, and people read it or not. It is my little megaphone that I alone control.

But I don’t think anyone, or at least I hope that no-one, thinks that I am publishing scientific papers here. They are my opinion pieces, and only worthwhile if there are people who have found my previous opinions to have turned out to be right in some way.

There has been a lot of discussion recently about peer review. This post is to share some of my experiences with peer review, both as an author and as an editor, from three decades ago.

In my opinion peer review is far from perfect. But with determination new and revolutionary ideas can get through the peer review process, though it may take some years. The problem is, of course, that most revolutionary ideas are wrong, so peer review tends to stomp hard on all of them. The alternative is to have everyone self publish and that is what is happening with the arXiv distribution service. Papers are getting posted there with no intent of ever undergoing peer review, and so they are effectively getting published with no review. This can be seen as part of the problem of populism where all self proclaimed experts are listened to with equal authority, and so there is no longer any expertise.

My Experience with Peer Review as an Author

I have been struggling with a discomfort about where the herd has been headed in both Artificial Intelligence (AI) and neuroscience since the summer of 1984. This was a time between my first faculty job at Stanford and my long term faculty position at MIT. I am still concerned and I am busy writing a longish technical book on the subject–publishing something as a book gets around the need for full peer review, by the way…

When I got to MIT in the fall of 1984 I shifted my research based on my concerns. A year later I was ready to talk about the what I was doing, and submitted a journal paper describing the technical idea and an initial implementation. Here is one of the two reviews.

It was encouraging, but both it and a second review recommended that the paper not be published. That would have been my first rejection. However, the editor, George Bekey, decided to publish it anyway, and it appeared as:

Brooks, R. A. “A Robust Layered Control System for a Mobile Robot“, IEEE Journal of Robotics and Automation, Vol. 2, No. 1, March 1986, pp. 14–23; also MIT AI Memo 864, September 1985.

Google Scholar reports just under 12,000 citations of this paper, my most cited paper ever. The approach to controlling robots, the subsumption architecture that it proposed led directly to the Roomba, a robot vacuum cleaner, which with over 30 million sold is the most produced robot ever. Furthermore the control architecture was formalized over the years by a series of researchers, and its descendant, behavior trees, is now the basis for most video games. (Both Unity and Unreal use behavior trees to specify behavior.) The paper still has multi billion dollar impact every year.

Most researchers who stray, believing the herd is wrong, end up heading off in their own wrong direction. I was extraordinarily lucky to choose a direction that has had incredible practical impact.

However, I was worried at a deeper intellectual level, and so almost simultaneously started writing about the philosophical underpinnings of research in AI, and how my approach differed. There the reviews were more brutal, as is shown in a review here:

This was a a review of lab memo AIM-899, Achieving Artificial Intelligence through Building Robots which I had submitted to a conference.This paper was the first place that I talked about the possibility of robot vacuum cleaners as an example of how the philosophical approach I was advocating could lead to new practical results.

The review may be a little hard to read in the image above. It says:

This paper is an extended, wandering complaint that the world does not view the author’s work as the salvation of mankind.

There is no scientific content here; little in the way of reasoned argument, as opposed to petulant assertions and non-sequiturs; and ample evidence of ignorance of the literature on these questions. The only philosopher cited is Dreyfus–but many of the issues raised have been treated more intelligibly by others (the chair definition problem etc. by Wittgenstein and many successors; the interpreted toy proscription by Searle; the modularity question by Fodor; the multiple behaviors ideas by Tinbergen; and the constructivist approach by Glymour (who calls it computational positivism). The argument about evolution leaks all over, and the discussion on abstraction indicates the author has little understanding of analytic thought and scientific investigation.

Ouch! This was like waving a red flag at a bull. I posted this and other negative reviews on my office door where they stayed for many years. By June of the next year I had added to it substantially, and removed the vacuum cleaner idea, but kept in all the things that the reviewer did not like, and provocatively retitled it Intelligence Without Representation. I submitted the paper to journals and got further rejections–more posts for my door. Eventually its fame had spread to the point that the Artificial Intelligence Journal, the mainstream journal of the field, published it unchanged (Artificial Intelligence Journal (47), 1991, pp. 139–159) and it now has 6,900 citations. I outlasted the criticism and got published.

That same year at the major international conference IJCAI: International Joint Conference on Artificial Intelligence I was honored to win the Computers and Thought award, quite a surprise to me, and I think to just about every one else. With that honor came an invitation to have a paper in the proceedings without the six page limit that applied to everyone else, and without the peer review process that applied to everyone else. My article was twenty seven pages long, double column, a critical review article of the history of AI, also with a provocative and complementary title, Intelligence Without Reason, (Proceedings of 12th Int. Joint Conf. on Artificial Intelligence, Sydney, Australia, August 1991, pp. 569–595). It now has over 3,100 citations.

My three most cited papers were either rejected under peer review or accepted with no peer review. So I am not exactly a poster child for peer reviewed papers.

My Experience With Peer Review As an Editor

In 1987 I co-founded a journal, the International Journal of Computer Vision. It was published by Kluwer as a hardcopy journal for many years, but now it is run by Springer and is totally online. It is now in its 128^th volume, and has had many hundreds of issues. I co-edited the first seven volumes which together had a total of twenty eight issues.

The journal has a very strong reputation and consistently ranks in the top handful of places to publish in computer vision, itself a very hot topic of research today.

As an editor I soon learned a lot of things.

If a paper was purely theoretical with lots of equations and no experiments involving processing an image it was much more likely to get accepted than a paper which did have experimental results. I attributed this to people being unduly impressed by mathematics (I had a degree in pure mathematics and was not as easily impressed by equations and complex notation). I suspected that many times the reviewers did not fully read and understand the mathematics as many of them had very few comments about the contents of such papers. If, however, a paper had experiments with real images (and back then computers were so slow it was rarely more than a handful of images that had been processed), the same reviewers would pick apart the output, faulting it for not being as good as they thought it should be.
I soon learned that one particular reviewer would always read the mathematics in detail, and would always find things to critique about the more mathematical papers. This seemed good. Real peer review. But soon I realized that he would always recommend rejection. No paper was ever up to his standard. Reject! There were other frequent rejecters, but none as dogmatic as this particular one.
Likewise I found certain reviewers would always say accept. Now it was just a matter of me picking the right three referees for almost any paper and I could know whether the majority of reviewers would recommend acceptance or rejection before I had even sent the paper off to be reviewed. Not so good.
I came to realize that the editor’s job was real, and it required me to deeply understand the topic of the paper, and the biases of the reviewers, and not to treat the referees as having the right to determine the fate of the paper themselves. As an editor I had to add judgement to the process at many steps along the way, and to strive for the process to improve the papers, but also to let in ideas that were new. I now came to understand George Bekey and his role in my paper from just a couple of years before.

Peer reviewing and editing is a lot more like the process of one on one teaching than it is of processing the results of a multiple choice exam. When done right it is about coaxing the best out of scientists, and encouraging new ideas to flourish and the field to proceed.

The UPSHOT?

Those who think that peer review is inherently fair and accurate are wrong. Those who think that peer review necessarily suppresses their brilliant new ideas are wrong. It is much more than those two simple opposing tendencies.

Peer review grew up in a world where there were many fewer people engaging in science than today. Typically an editor would know everyone in the world who had contributed to the field in the past, and would have enough time to understand the ideas of each new entrant to the field as they started to submit papers. It relied on personal connections and deep and thoughtful understanding.

That has changed just due to the scale of the scientific endeavor today, and is no longer possible in that form.

There is a clamor for double blind anonymous review, in the belief that that produces a level playing field. While in some sense that is true, it also reduces the capacity for the nurturing of new ideas. Clamorers need to be careful what they wish for–metaphorically it reduces them to competing in a speed trial, rather than being appreciated for virtuosity. What they get in return for zeroing the risk of being rejected on the basis of their past history or which institution they are from is that they are condemned to forever aiming their papers at the middle of a field of mediocrity, with little chance for achieving greatness.

Another factor is that the number of new journals has changed. Institutions, and sometimes whole countries, decide that the way for them to get a better name for themselves is to have a scientific journal, or thirty. They set them up and put one of their local people who has no real understanding of the flow of ideas in the particular field at the global scale, as editor. Now editing becomes a mechanical process, with no understanding of the content of the paper or the qualifications of who they ask to do the reviews. I know this to be true as I regularly get asked to review papers in fields in which I have absolutely no knowledge, by journal editors that I have never heard of, nor of their journal, nor its history. I have been invited to submit a review that can not possibly be a good review. I must induce that other reviews may also not be very good.

I don’t have a solution, but I hope my observations here might be interesting to some.

April 11, 2020 — Quick Takes

What Networks Will Co-Evolve With AI and Robotics?

rodneybrooks.com/what-networks-will-co-evolve-with-ai-and-robotics/

Again and again in human history networks spanning physical geography have both enabled and been enabled by the very same innovations. Networks are the catalysts for the innovations and the innovations are the catalysts for the networks. This is autocatalysis at human civilization scale.

The Roman empire brought for people within its expanding borders long distance trade, communication, peace and stability. Key to this was the network of roads, many of which survive as the routes of modern transportation systems, and ports. And the stability of that network was made possible by the things that the empire brought.

The Silk Road, a network of trade routes, enabled many civilizations that themselves supported the continued existence of those trade routes.

In the eighteenth century England’s network of canals enabled both the delivery of raw materials, coal for power, and access to ports for the finished goods, enabling the industrial revolution with the invention of factories. The canals were built on the wealth of the factory owners who formed syndicates to build those canals.

Later the train network enhanced and replaced the canal network in England. And building a train network in the United States enabled large scale farming in the mid-west to have access to markets on the east coast, and later to ports on both coasts to make the US a major source of food. At the same time, a second network, the telegraph, was overlaid on the same physical route system, first to operate the train network itself and later to form the basis of new forms of communications.

As the later telephone networks were built they ushered in the world of commerce and general business became the principal industry instead of farming. And as business grew the need for more extensive telephone networks with more available lines grew with it.

When Henry Ford started mass producing automobiles he realized that a network of roads was necessary for the masses to have somewhere to drive. And as there were more and more roads the demand for automobiles increased. As a side effect the roads came to replace much of the rail network for moving goods around the country.

The personal computer of the 1980’s was not ubiquitous in ordinary households until it was coupled to the second generation data packet network that had started out as a reliable communications network for the military and for sharing scarce computer resources in academia. The pull on network bandwidth lead to rapid growth of the Internet, and that enabled the World Wide Web, a network of information overlaid on the data packet network, that gave a real impetus for more people to own their own computer.

As commerce started to be carried out on the Web, demand rose even more, and ultimately large data centers needed to be built as the backend of that commerce system. Then those data centers got offered to other businesses and cloud computing became a network of computational resources, on top of what had been a network for moving data from place to place.

Cloud computing enabled the large scale training needed for deep networks, a computational technique very vaguely inspired by the network of neurons in the brain. Deep networks are what many people call AI today. Those networks and their demands for computation for training are driving the growth of the cloud computing network, and a world wide network of low paid piece workers who label data needed to drive the training, using the substrate of the Web network to move the data around, and to get paid.

Are we at the end game for AI driving networks? Or when we can get past the very narrow capabilities of deep networks to new AI technologies will there be new networks that arise and are autocatalytic with the new AI?

And what about robotics?

The disruptions to world supply chains from COVID-19 are only just beginning to be seen–there will be turbulence in many areas later in 2020. The exceptionally lean supply chains we have lived with over the last few years (relying on a network of shipping routes that rely on the standardization of shipping containers to grow and interact) are likely to feel pressure to get a little fatter. That is likely to increase the demand for robotics and automation in those supply chains, a phenomenon that we have already see starting over the last few years.

Another lesson which may be drawn from the current pandemic is that more automation is needed in healthcare, as trained medical professions have been pushed to their limits of endurance, besides being in personal mortal peril at times.

So what might be the new networks that arise over the next few years, demanded by the way we change automation, and supported by that very change?

Here are a few ideas, none of which seem particularly compelling at the moment, certainly not in comparison to Roman roads or to the Internet itself:

A commerce network of data sets, and of sets of weights for networks trained on large data sets.

A physical network of supply points, down to the city or county level, for major robot components; mobile bases for indoors or outdoors, legged tracked and wheeled; various sensor packages; human-robot interaction displays and sensors; and all sorts of arms with different characteristics. These can be assembled plug and play to produced appropriate robots as needed to respond to all sorts of emergency needs.

A network of smart sensors embedded in almost everything in our lives, which lives on top of the current Internet–this is already getting built and is called IoT (Internet of Things).

A new supply network of both partially finished goods (e.g., standard embedded processor boards) and materials (seventy eight different sorts of raw stock for 3D printers a generation or two out) so that much more manufacturing can be done closer to end customers, using automation and robots.

An automated distribution network down to the street corner level in cities, with short term storage units on the sidewalk (probably a little bigger than the green storage units that the United States Postal Service has on street corners throughout US cities). Automated vehicles would supply these, perhaps at off peak traffic times, and then smaller neighborhood sidewalk robots would distribute to individual houses, or people could come to pick up.

I’m not particularly proud of or happy with any of these ideas. But based on history over the last 2,000 plus years I am confident that some sort of new networks will soon arise. Please do not let my lack of imagination dissuade you from the idea that new forms of networks will be coming.

January 1, 2020 — Dated Predictions

Predictions Scorecard, 2020 January 01

rodneybrooks.com/predictions-scorecard-2020-january-01/

On January 1^st, 2018, I made predictions (here) about self driving cars, Artificial Intelligence and machine learning, and about progress in the space industry. Those predictions had dates attached to them for 32 years up through January 1^st, 2050.

I made my predictions because at the time I saw an immense amount of hype about these three topics, and the general press and public drawing conclusions about all sorts of things they feared (e.g., truck driving jobs about to disappear, all manual labor of humans about to disappear) or desired (e.g., safe roads about to come into existence, a safe haven for humans on Mars about to start developing) being imminent. My predictions, with dates attached to them, were meant to slow down those expectations, and inject some reality into what I saw as irrational exuberance.

As part of self certifying the seriousness of my predictions I promised to review them, as made on January 1^st, 2018, every following January 1^st for 32 years, the span of the predictions, to see how accurate they were.

On January 1st, 2019, I posted my first annual self appraisal of how well I did. This post, today, January 1st, 2020, is my second annual self appraisal of how well I did–I have 30 more annual appraisals ahead of me. I think in the two years since my predictions, there has been a general acceptance that certain things are not as imminent or as inevitable as the majority believed just then. So some of my predictions now look more like “of course”, rather than “really, that long in the future?” as they did then.

This is a boring update. Despite lots of hoopla in the press about self driving cars, Artificial Intelligence and machine learning, and the space industry, this last year, 2019, was not actually a year of big milestones. Not much that will matter in the long run actually happened in 2019.

Furthermore, this year’s summary indicates that so far none of my predictions have turned out to be too pessimistic. Overall I am getting worried that I was perhaps too optimistic, and had bought into the hype too much. There is only one dated prediction of mine that I am currently worried may have been too pessimistic–I won’t name it here as perhaps I will turn out to be right after all.

Repeat of Last Year’s Explanation of Annotations

As I said last year, I am not going to edit my original post, linked above, at all, even though I see there are a few typos still lurking in it. Instead I have copied the three tables of predictions below from last year’s update post, and have simply added a total of six comments to the fourth column. As with last year I have highlighted dates in column two where the time they refer to has arrived.

I tag each comment in the fourth column with a cyan colored date tag in the form yyyymmdd such as 20190603 for June 3^rd, 2019.

The entries that I put in the second column of each table, titled “Date” in each case, back on January 1^st of 2018, have the following forms:

NIML meaning “Not In My Lifetime, i.e., not until beyond December 31^st, 2049, the last day of the first half of the 21^st century.

NET some date, meaning “No Earlier Than” that date.

BY some date, meaning “By” that date.

Sometimes I gave both a NET and a BY for a single prediction, establishing a window in which I believe it will happen.

For now I am coloring those statements when it can be determined already whether I was correct or not.

I have started using LawnGreen (#7cfc00) for those predictions which were entirely accurate. For instance a BY 2018 can be colored green if the predicted thing did happen in 2018, as can a NET 2019 if it did not happen in 2018 or earlier. There are five predictions now colored green, the same ones as last year, with no new ones in January 2020.

I will color dates Tomato (#ff6347) if I was too pessimistic about them. No Tomato dates yet. But if something happens that I said NIML, for instance, then it would go Tomato, or if in 2020 something already had happened that I said NET 2021, then that too would have gone Tomato.

If I was too optimistic about something, e.g., if I had said BY 2018, and it hadn’t yet happened, then I would color it DeepSkyBlue (#00bfff). None of these yet either. And eventually if there are NETs that went green, but years later have still not come to pass I may start coloring them LightSkyBlue (#87cefa).

In summary then: Green splashes mean I got things exactly right. Red means provably wrong and that I was too pessimistic. And blueness will mean that I was overly optimistic.

So now, here are the updated tables.

Self Driving Cars

No predictions have yet been relevant for self driving cars, but I have augmented one comment from last year in this first table. Also, see some comments right after this title.

Prediction [Self Driving Cars]	Date	2018 Comments	Updates
A flying car can be purchased by any US resident if they have enough money.	NET 2036	There is a real possibility that this will not happen at all by 2050.
Flying cars reach 0.01% of US total cars.	NET 2042	That would be about 26,000 flying cars given today's total.
Flying cars reach 0.1% of US total cars.	NIML
First dedicated lane where only cars in truly driverless mode are allowed on a public freeway.	NET 2021	This is a bit like current day HOV lanes. My bet is the left most lane on 101 between SF and Silicon Valley (currently largely the domain of speeding Teslas in any case). People will have to have their hands on the wheel until the car is in the dedicated lane.
Such a dedicated lane where the cars communicate and drive with reduced spacing at higher speed than people are allowed to drive	NET 2024
First driverless "taxi" service in a major US city, with dedicated pick up and drop off points, and restrictions on weather and time of day.	NET 2022	The pick up and drop off points will not be parking spots, but like bus stops they will be marked and restricted for that purpose only.	20190101 Although a few such services have been announced every one of them operates with human safety drivers on board. And some operate on a fixed route and so do not count as a "taxi" service--they are shuttle buses. And those that are "taxi" services only let a very small number of carefully pre-approved people use them. We'll have more to argue about when any of these services do truly go driverless. That means no human driver in the vehicle, or even operating it remotely. 20200101 During 2019 Waymo started operating a 'taxi service' in Chandler, Arizona, with no human driver in the vehicles. While this is a big step forward see comments below for why this is not yet a driverless taxi service.
Such "taxi" services where the cars are also used with drivers at other times and with extended geography, in 10 major US cities	NET 2025	A key predictor here is when the sensors get cheap enough that using the car with a driver and not using those sensors still makes economic sense.
Such "taxi" service as above in 50 of the 100 biggest US cities.	NET 2028	It will be a very slow start and roll out. The designated pick up and drop off points may be used by multiple vendors, with communication between them in order to schedule cars in and out.
Dedicated driverless package delivery vehicles in very restricted geographies of a major US city.	NET 2023	The geographies will have to be where the roads are wide enough for other drivers to get around stopped vehicles.
A (profitable) parking garage where certain brands of cars can be left and picked up at the entrance and they will go park themselves in a human free environment.	NET 2023	The economic incentive is much higher parking density, and it will require communication between the cars and the garage infrastructure.
A driverless "taxi" service in a major US city with arbitrary pick and drop off locations, even in a restricted geographical area.	NET 2032	This is what Uber, Lyft, and conventional taxi services can do today.
Driverless taxi services operating on all streets in Cambridgeport, MA, and Greenwich Village, NY.	NET 2035	Unless parking and human drivers are banned from those areas before then.
A major city bans parking and cars with drivers from a non-trivial portion of a city so that driverless cars have free reign in that area.	NET 2027 BY 2031	This will be the starting point for a turning of the tide towards driverless cars.
The majority of US cities have the majority of their downtown under such rules.	NET 2045
Electric cars hit 30% of US car sales.	NET 2027
Electric car sales in the US make up essentially 100% of the sales.	NET 2038
Individually owned cars can go underground onto a pallet and be whisked underground to another location in a city at more than 100mph.	NIML	There might be some small demonstration projects, but they will be just that, not real, viable mass market services.
First time that a car equipped with some version of a solution for the trolley problem is involved in an accident where it is practically invoked.	NIML	Recall that a variation of this was a key plot aspect in the movie "I, Robot", where a robot had rescued the Will Smith character after a car accident at the expense of letting a young girl die.

Chandler is a suburb of Phoenix and is itself the 84th largest city in the US. With apologies to residents of Chandler, I do not think that it comes to mind as a major US city for most Americans. Furthermore, the service has so far not been open to the public, but instead started with just a few hundred people (out of a population of about one quarter of a million residents) who had previously been approved to use the service when there was a human safety driver on board. These riders are banned from talking about when things go wrong so we really don’t know how well the systems works. Over 2019 the number of riders has grown to 1,500 monthly users, and a total of about 100,000 rides. Recently there has been an announcement that a phone app will make the service available to more users.

BUT, while there is no human driver in the taxi there is a remote human safety driver for all rides, as detailed in this story. While the humans can monitor more than one vehicle at a time, obviously there is a scaling issue, and the taxis are not truly autonomous. To make them so would be a big step. Also the taxis do not operate when it is raining. That would be the peak usage time for taxis in most cities. But they just don’t operate in the rain.

So… no self driving taxi service yet, even in a relatively small city with a population density many times less than that of major US cities.

The last twelve months have seen a real shakeout in expectations for deployment of self driving cars. Companies are realizing that it is much harder than the came to believe for a while, and that there are many issues beyond simply “driving”, that need to be addressed. I previously talked about a some of those issues in on this blog in January and June of 2017.

To illustrate how predictions have been slipping, here is a slide that I made for talks based on a snapshot of predictions about driverless cars from March 27, 2017. The web address still seems to give the same predictions with a couple more at the end that I couldn’t fit on my slide. In parentheses are the years the predictions were made, and in blue are the dates for when the innovation was predicted to happen.

Recently I had added some arrows to this slide. The skinny red arrows point to dates that have passed without the prediction coming to pass. The fatter orange arrows point to cases where company executives have since come out with updated predictions that are later than the ones given here. E.g., in the fourth line from the bottom, the Daimler chairman had said in 2014 that fully autonomous vehicles could be ready by 2025. In November of 2019 the chairman announced a reality check on self driving cars, as one can see in numerous online stories. Here is the first paragraph of one report on his remarks:

Mercedes-Benz parent Daimler has taken a “reality check” on self-driving cars. Making autonomous vehicles safe has proven harder than originally thought, and Daimler is now questioning their future earnings potential, CEO Ola Kaellenius told Reuters and other media.

Other reports of the same story can be found here and here.

None of the original predictions have come to pass, and those still standing are getting rather sparse.

<rant>

At the same time, however, there have been more outrageously optimistic predictions made about fully self driving cars being just around the corner. I won’t name names, but on April 23^rdof 2019, i.e., less than nine months ago, Elon Musk said that in 2020 Tesla would have “one million robo-taxis” on the road, and that they would be “significantly cheaper for riders than what Uber and Lyft cost today”. While I have no real opinion on the veracity these predictions, they are what is technically called bullshit. Kai-Fu Lee and I had a little exchange on Twitter where we agreed that together we would eat all such Tesla robo-taxis on the road at the end of this year, 2020.

</rant>

Artificial Intelligence and Machine Learning

I had not predicted any big milestones for AI and machine learning for the current period, and indeed there were none achieved.

We have seen certain proponents be very proud of how much more compute they have, growing at many times what Moore’s Law at its best would provide. I think it is fair to say that the results of all that computing since 2012 are not very impressive when compared to what a single human brain, powered at just 20 Watts has been able to achieve in the same time frame — one just has to look at someone who’s 20^th birthday is today, January 1^st, 2020, and compare what they know now and what they can achieve now to what they could do in 2012.

And there has even been a little backlash about the carbon footprint that modern ML data sets cause in training. There are even tools and best practices for cutting down the carbon footprint of your ML research. People can argue about the details, but no one can make a case that the energy usage is not many orders of magnitude more than used by the meat machine inside people’s heads, and that human performance is way more impressive than any machine performance to date. People get fooled all the time by the slick marketing around each new achievement by the machine learning companies, but when you poke them you see that the achievements are rather pathetic compared to human performance.

Without any retraining make a Go playing program compete against a human on a 25 by 25 board, or even an 18 by 18 board. Or change all the colors of the pixels in a Quake Three Arena, or change the screen resolution, and humans will adapt seamlessly while the ML trained systems will have to start from zero again.

While ML conference attendance has gone up by a factor of 20 or so, the results are not so interestingly more powerful in terms of impact they have on the real world.

Right after the Artificial Intelligence and machine learning table I have some links to back up today’s assertion in it that there are more blog posts pushing back on DL as being all we will need to get to human level (whatever that might mean) Artificial Intelligence.

Prediction [AI and ML]	Date	2018 Comments	Updates
Academic rumblings about the limits of Deep Learning	BY 2017	Oh, this is already happening... the pace will pick up.	20190101 There were plenty of papers published on limits of Deep Learning. I've provided links to some right below this table. 20200101 Go back to last year's update to see them.
The technical press starts reporting about limits of Deep Learning, and limits of reinforcement learning of game play.	BY 2018		20190101 Likewise some technical press stories are linked below. 20200101 Go back to last year's update to see them.
The popular press starts having stories that the era of Deep Learning is over.	BY 2020		20200101 We are seeing more and more opinion pieces by non-reporters saying this, but still not quite at the tipping point where reporters come at and say it. Axios and WIRED are getting close.
VCs figure out that for an investment to pay off there needs to be something more than "X + Deep Learning".	NET 2021	I am being a little cynical here, and of course there will be no way to know when things change exactly.
Emergence of the generally agreed upon "next big thing" in AI beyond deep learning.	NET 2023 BY 2027	Whatever this turns out to be, it will be something that someone is already working on, and there are already published papers about it. There will be many claims on this title earlier than 2023, but none of them will pan out.
The press, and researchers, generally mature beyond the so-called "Turing Test" and Asimov's three laws as valid measures of progress in AI and ML.	NET 2022	I wish, I really wish.
Dexterous robot hands generally available.	NET 2030 BY 2040 (I hope!)	Despite some impressive lab demonstrations we have not actually seen any improvement in widely deployed robotic hands or end effectors in the last 40 years.
A robot that can navigate around just about any US home, with its steps, its clutter, its narrow pathways between furniture, etc.	Lab demo: NET 2026 Expensive product: NET 2030 Affordable product: NET 2035	What is easy for humans is still very, very hard for robots.
A robot that can provide physical assistance to the elderly over multiple tasks (e.g., getting into and out of bed, washing, using the toilet, etc.) rather than just a point solution.	NET 2028	There may be point solution robots before that. But soon the houses of the elderly will be cluttered with too many robots.
A robot that can carry out the last 10 yards of delivery, getting from a vehicle into a house and putting the package inside the front door.	Lab demo: NET 2025 Deployed systems: NET 2028
A conversational agent that both carries long term context, and does not easily fall into recognizable and repeated patterns.	Lab demo: NET 2023 Deployed systems: 2025	Deployment platforms already exist (e.g., Google Home and Amazon Echo) so it will be a fast track from lab demo to wide spread deployment.
An AI system with an ongoing existence (no day is the repeat of another day as it currently is for all AI systems) at the level of a mouse.	NET 2030	I will need a whole new blog post to explain this...
A robot that seems as intelligent, as attentive, and as faithful, as a dog.	NET 2048	This is so much harder than most people imagine it to be--many think we are already there; I say we are not at all there.
A robot that has any real idea about its own existence, or the existence of humans in the way that a six year old understands humans.	NIML

There are outlets now for non-journalists, perhaps practitioners in a scientific field, to write position papers that get widely referenced in social media. These position papers are often forerunners of what the popular press will soon start reporting.

During 2019 we saw many, many well informed such position papers/blogposts. We have seen explanations on how machine learning has limitations on when it makes sense to be used and that it may not be a universal silver bullet. There have been posts that deep learning may be hitting limits as it has no common sense. We have seen questions about the practical value of the results of deep learning on game playing as game playing is precisely where we have massive amounts of completely relevant data–problems in the real world more commonly have very little data and reasoning from other domains is imperative to figuring out how to make progress on the problem. And we have seen warnings that all the over-hype of machine and deep learning may lead to a new AI winter when those tens of thousands of jolly conference attendees will no longer have grants and contracts to pay for travel to and attendance at their fiestas.

I am very concerned about what will happen when the current machine/deep learning bubble bursts. We have seen the bursting of hype bubbles decimate AI research before. The self driving cars bubble and its bubble bursting having a potential negative impact in AI research also worries me.

Space

There were no target dates that have been hit or missed in the last year in the space launch domain, but I have made a couple of update comments in the following table, and then follow it with details in the text below.

Prediction [Space]	Date	2018 Comments	Updates
Next launch of people (test pilots/engineers) on a sub-orbital flight by a private company.	BY 2018		20190101 Virgin Galactic did this on December 13, 2018. 20200101 On February 22, 2019, Virgin Galactic had their second flight, this time with three humans on board, to space of their current vehicle. As far as I can tell that is the only sub-orbital flight of humans in 2019. Blue Origin's new Shepard flew three times in 2019, but with no people aboard as in all its flights so far.
A few handfuls of customers, paying for those flights.	NET 2020
A regular sub weekly cadence of such flights.	NET 2022 BY 2026
Regular paying customer orbital flights.	NET 2027	Russia offered paid flights to the ISS, but there were only 8 such flights (7 different tourists). They are now suspended indefinitely.
Next launch of people into orbit on a US booster.	NET 2019 BY 2021 BY 2022 (2 different companies)	Current schedule says 2018.	20190101 It didn't happen in 2018. Now both SpaceX and Boeing say they will do it in 2019. 20200101 Both Boeing and SpaceX had major failures with their systems during 2019, though no humans were aboard in either case. So this goal was not achieved in 2019. Both companies are optimistic of getting it done in 2020, as they were for 2019. I'm sure it will happen eventually for both companies.
Two paying customers go on a loop around the Moon, launch on Falcon Heavy.	NET 2020	The most recent prediction has been 4th quarter 2018. That is not going to happen.	20190101 I'm calling this one now as SpaceX has revised their plans from a Falcon Heavy to their still developing BFR (or whatever it gets called), and predict 2023. I.e., it has slipped 5 years in the last year.
Land cargo on Mars for humans to use at a later date	NET 2026	SpaceX has said by 2022. I think 2026 is optimistic but it might be pushed to happen as a statement that it can be done, rather than for an pressing practical reason.
Humans on Mars make use of cargo previously landed there.	NET 2032	Sorry, it is just going to take longer than every one expects.
First "permanent" human colony on Mars.	NET 2036	It will be magical for the human race if this happens by then. It will truly inspire us all.
Point to point transport on Earth in an hour or so (using a BF rocket).	NIML	This will not happen without some major new breakthrough of which we currently have no inkling.
Regular service of Hyperloop between two cities.	NIML	I can't help but be reminded of when Chuck Yeager described the Mercury program as "Spam in a can".

During a ground test of the SpaceX Crewed Dragon capsule, on April 20^th, 2019, it exploded catastrophically. This delayed the SpaceX program so that no manned test could be done in 2019. SpaceX traced the problem to a valve failure when starting up the capsule abort engines, needed during launch if the booster rocket is undergoing failure. They currently have a test scheduled for early 2020 where these engines will be ignited during a launch so that the capsule can safely fly away from the launch vehicle.

In December of 2019 Boeing had a major test of its CST-100 Starliner capsule, and ended up with both a failure and a success for the mission. It was supposed to be the final unmanned test of the vehicle, and was planned to dock with the International Space Station (ISS) and then do a soft landing on the ground. It launched on December 20^th and achieved orbit, but due to software failures it was the wrong orbit and there was not enough fuel left to get it to the ISS. This was a major failure. On the other hand it achieved a major success in doing a soft landing in New Mexico on December 22^nd.

Other Hype Magnets

I have not felt qualified to talk about the hype impact for both quantum computing and block chain. Just at the end of 2019 there was a very interesting blog post by Scott Aaronson, a true expert and theoretical contributor to the field of quantum computing, on how to read announcements about quantum computing results. I recommend it.

May 26, 2019 — Quick Takes

Guest Post by Phillip Alvelda: Pondering the Empathy Gap

rodneybrooks.com/guest-post-by-phillip-alveda-pondering-the-empathy-gap/

[Phillip Alvelda is an old friend from MIT, and CEO of Brainworks.]

Pondering how to close what seems to be a rapidly widening empathy gap here in the U.S. and globally.

I used to just be resigned to the fact that many of my white friends who had never felt, or experienced discrimination directed at themselves seem incapable of seeing or recognizing implicit, or even explicit, bias directed at others. I didn’t used to think of these people as mean or racist…just oblivious through lack of direct experience.

But now, with a nation inflamed by our own government inciting and validating hatred and bigotry, with brown asylum seekers and children dying in mass US internment camps, and LGBTQ and women’s’ rights under mounting assault, the discrimination has literally turned lethal. And the empathy gap is enabling these crimes against humanity to continue and grow in the US now, just like the silent majority in Weimar Germany allowed the Jewish genocide to advance.

I’ve come to see supporters of this corrupt and criminal administration as increasingly complicit in the ongoing crimes. It is no longer just a matter of not seeing discrimination that doesn’t impact your family directly.

Trump supporters and anyone who supports any of his Republican enablers must now find some way to look past the growing reports of discrimination, minority voter suppression and gerrymandering, hate crimes, repression, the roll back of women’s and LGBTQ rights, a measurable biased justice system, mass internment camps, and now even the murder of the weak and vulnerable kidnapped children that commit no crime other than to follow our own ancestors to seek freedom and opportunity in the US….. This growing mass of willfully blind conservatives have abandoned fair morality, and are direct enablers of evil.

We are now in an era I never thought to see in the US, when government manufactured propaganda is purposely driving the dehumanization of women, LGBTQ people, and people of color. The US empathy gap is widening rapidly. How can we fight these dark divisive forces and narrow the gap, when our polarized society can’t even agree on measurable objective realities like the climate crisis?

Otherwise, I fear the U.S. is on a path to dissolve into at least two countries, divided along a border between those states who value empathy and seek an inclusive and pluralistic future society, and those who seek to retreat to tribal protectionism of historical rights for a shrinking privileged majority.

That this struggle rises now really baffles me. Consider the world’s obviously increasing wealth and abundance, with declining poverty and starvation and increasing access to virtually unlimited renewable energy. The need for tribal dominance to horde resources is dissapearing. The need for borders to protect resources that are no longer scarce, is vanishing.

Just imagine if all of our military and arms spending, all of the money we spend enforcing borders and limiting access to food and medicine and energy and education were instead directed towards sharing this abundance!

Pluralism and empathy are clearly the answer. How can we get more people to realize this despite the onslaught of vitriol and tribal Incitement from the likes of Fox News?

May 17, 2019 — Quick Takes

AGI Has Been Delayed

rodneybrooks.com/agi-has-been-delayed/

A very recent article follows in the footsteps of many others talking about how the promise of autonomous cars on roads is a little further off than many pundits have been predicting for the last few years. Readers of this blog will know that I have been saying this for over two years now. Such skepticism is now becoming the common wisdom.

In this new article at The Ringer, from May 16^th, the author Victor Luckerson, reports:

Elon Musk, the driverless car is always right around the corner. At an investor day event last month focused on Tesla’s autonomous driving technology, the CEO predicted that his company would have a million cars on the road next year with self-driving hardware “at a reliability level that we would consider that no one needs to pay attention.” That means Level 5 autonomy, per the Society of Automotive Engineers, or a vehicle that can travel on any road at any time without human intervention. It’s a level of technological advancement I once compared to the Batmobile.

Musk has made these kinds of claims before. In 2015 he predicted that Teslas would have “complete autonomy” by 2017 and a regulatory green light a year later. In 2016 he said that a Tesla would be able to drive itself from Los Angeles to New York by 2017, a feat that still hasn’t happened. In 2017 he said people would be able to safely sleep in their fully autonomous Teslas in about two years. The future is now, but napping in the driver’s seat of a moving vehicle remains extremely dangerous.

When I saw someone tweeting that Musk’s comments meant that a million autonomous taxis would be on the road by 2020, I tweeted out the following:

Let’s count how many truly autonomous (no human safety driver) Tesla taxis (public chooses destination & pays) on regular streets (unrestricted human driven cars on the same streets) on December 31, 2020. It will not be a million. My prediction: zero. Count & retweet this then.

I think these three criteria need to be met before someone can say that we have autonomous taxis on the road.

The first challenge, no human safety driver, has not been met by a single experimental deployment of autonomous vehicles on public roads anywhere in the world. They all have safety humans in the vehicle. A few weeks ago I saw an autonomous shuttle trial along the paved beachside public walkways at the beach on which I grew up, in Glenelg, South Australia, where there were two “two onboard stewards to ensure everything runs smoothly” along with eight passengers. Today’s demonstrations are just not autonomous. In fact in the article above Luckerson points out that Uber’s target is to have their safety drivers intervene only once every 13 miles, but they are way off that capability at this time. Again, hardly autonomous, even if they were to meet that goal. Imagine having a breakdown of your car that you are driving once every 13 miles–we expect better.

And if normal human beings can’t simply use these services (in Waymo’s Phoenix trial only 400 pre-approved people are allowed to try them out) and go anywhere that they can go in a current day taxi, then really the things deployed will not be autonomous taxis. They will be something else. Calling them taxis would be redefining what a taxi is. And if you can just redefine words on a whim there is really not much value to your words.

I am clearly skeptical about seeing autonomous cars on our roads in the next few years. In the long term I am enthusiastic. But I think it is going to take longer than most people think.

In response to my tweet above, Kai-Fu Lee, a very strong enthusiast about the potential for AI, and a large investor in Chinese AI companies, replied with:

If there are a million Tesla robo-taxis functioning on the road in 2020, I will eat them. Perhaps @rodneyabrooks will eat half with me?

I readily replied that I would be happy to share the feast!

Luckerson talks about how executives, in general, are backing off from their previous predictions about how close we might be to having truly autonomous vehicles on our roads. Most interestingly he quotes Chris Urmson:

Chris Urmson, the former leader of Google’s self-driving car project, once hoped that his son wouldn’t need a driver’s license because driverless cars would be so plentiful by 2020. Now the CEO of the self-driving startup Aurora, Urmson says that driverless cars will be slowly integrated onto our roads “over the next 30 to 50 years.”

Now let’s take note of this. Chris Urmson was the leader of Google’s self-driving car project, which became Waymo around the time he left, and is the CEO of a very well funded self-driving start up. He says “30 to 50 years”. Chris Urmson has been a leader in the autonomous car world since before it entered mainstream consciousness. He has lived and breathed autonomous vehicles for over ten years. No grumpy old professor is he. He is a doer and a striver. If he says it is hard then we know that it is hard.

I happen to agree, but I want to use this reality check for another thread.

If we were to have AGI, Artificial General Intelligence, with human level capabilities, then certainly it ought to be able to drive a car, just like a person, if not better. Now a self driving car does not need to have general human level intelligence, but a self driving car is certainly a lower bound on human level intelligence. Urmson, a strong proponent of self driving cars says 30 to 50 years.

So what does that say about predictions that AGI is just around the corner? And what does it say about it being an existential threat to humanity any time soon. We have plenty of existential threats to humanity lining up to bash us in the short term, including climate change, plastics in the oceans, and a demographic inversion. If AGI is a long way off then we can not say anything sensible today about what promises or threats it might provide as we need to completely re-engineer our world long before it shows up, and when it does show up it will be in a world that we can not yet predict.

Do people really say that AGI is just around the corner? Yes, they do…

Here is a press report on a conference on “Human Level AI” that was held in 2018. It reports that $37\%$ of respondents to a survey at that conference said they expected human level AI to be around in 5 to 10 years. Now, I must say that looking through the conference site I see more large hats than cattle, but these are mostly people with paying corporate or academic jobs, and $37\%$ of them think this.

Ray Kurzweil still maintains, in Martin Ford’s recent book that we will see a human level intelligence by 2029–in the past he has claimed that we will have a singularity by then as the intelligent machines will be so superior to human level intelligence that they will exponentially improve themselves (see my comments on belief in magic as one of the seven deadly sins in predicting the future of AI). Mercifully the average prediction of the 18 respondents for this particular survey was that AGI would show up around 2099. I may have skewed that average a little as I was an outlier amongst the 18 people at the year 2200. In retrospect I wish I had said 2300 and that is the year I have been using in my recent talks.

And a survey taken by the Future of Life Institute (warning: that institute has a very dour view of the future of human life, worse than my concerns of a few paragraphs ago) says were are going to get AGI around 2050.

But that is the low end of when Urmson thinks we will have autonomous cars deployed. Suppose he is right about his range. And suppose I am right that autonomous driving is a lower bound on AGI, and I believe it is a very low bound. With these very defensible assumptions then the seemingly sober experts in Martin Ford’s new book are on average wildly optimistic about when AGI is going to show up.

AGI has been delayed.

March 19, 2019 — Reviews

A Better Lesson

rodneybrooks.com/a-better-lesson/

Just last week Rich Sutton published a very short blog post titled The Bitter Lesson. I’m going to try to keep this review shorter than his post. Sutton is well known for his long and sustained contributions to reinforcement learning.

In his post he argues, using many good examples, that over the 70 year history of AI, more computation and less built in knowledge has always won out as the best way to build Artificial Intelligence systems. This resonates with a current mode of thinking among many of the newer entrants to AI that it is better to design learning networks and put in massive amounts of computer power, than to try to design a structure for computation that is specialized in any way for the task. I must say, however, that at a two day work shop on Deep Learning last week at the National Academy of Science, the latter idea was much more in vogue, something of a backlash against exactly what Sutton is arguing.

I think Sutton is wrong for a number of reasons.

One of the most celebrated successes of Deep Learning is image labeling, using CNNs, Convolutional Neural Networks, but the very essence of CNNs is that the front end of the network is designed by humans to manage translational invariance, the idea that objects can appear anywhere in the frame. To have a Deep Learning network also have to learn that seems pedantic to the extreme, and will drive up the computational costs of the learning by many orders of magnitude.
There are other things in image labeling that suffer mightily because the current crop of CNNs do not have certain things built in that we know are important for human performance. E.g., color constancy. This is why the celebrated example of a traffic stop sign with some pieces of tape on it is seen as a 45 mph speed limit sign by a certain CNN trained for autonomous driving. No human makes that error because they know that stop signs are red, and speed limit signs are white. The CNN doesn’t know that, because the relationship between pixel color in the camera and the actual color of the object is a very complex relationship that does not get elucidated with the measly tens of millions of training images that the algorithms are trained on. Saying that in the future we will have viable training sets is shifting the human workload to creating massive training sets and encoding what we want the system to learn in the labels. This is just as much building knowledge in as it would be to directly build a color constancy stage. It is sleight of hand in moving the human intellectual work to somewhere else.
In fact, for most machine learning problems today a human is needed to design a specific network architecture for the learning to proceed well. So again, rather than have the human build in specific knowledge we now expect the human to build the particular and appropriate network, and the particular training regime that will be used. Once again it is sleight of hand to say that AI succeeds without humans getting into the loop. Rather we are asking the humans to pour their intelligence into the algorithms in a different place and form.
Massive data sets are not at all what humans need to learn things so something is missing. Today’s data sets can have billions of examples, where a human may only require a handful to learn the same thing. But worse, the amount of computation needed to train many of the networks we see today can only be furnished by very large companies with very large budgets, and so this push to make everything learnable is pushing the cost of AI outside that of individuals or even large university departments. That is not a sustainable model for getting further in intelligent systems. For some machine learning problems we are starting to see a significant carbon foot print due to the power consumed during the learning phase.
Moore’s Law is slowing down, so that some computer architects are reporting the doubling time in amount of computation on a single chip is moving from one year to twenty years. Furthermore the breakdown of Dennard scaling back in 2006 means that the power consumption of machines goes up as they perform better, and so we can not afford to put even the results of machine learning (let alone the actual learning) on many of our small robots–self driving cars require about 2,500 Watts of power for computation–a human brain only requires 20 Watts. So Sutton’s argument just makes this worse, and makes the use of AI and ML impractical.
Computer architects are now trying to compensate for these problems by building special purpose chips for runtime use of trained networks. But they need to lock in the hardware to a particular network structure and capitalize on human analysis of what tricks can be played without changing the results of the computation, but with greatly reduced power budgets. This has two drawbacks. First it locks down hardware specific to particular solutions, so every time we have a new ML problem we will need to design new hardware. And second, it once again is simply shifting where human intelligence needs to be applied to make ML practical, not eliminating the need for humans to be involved in the design at all.

So my take on Rich Sutton’s piece is that the lesson we should learn from the last seventy years of AI research is not at all that we should just use more computation and that always wins. Rather I think a better lesson to be learned is that we have to take into account the total cost of any solution, and that so far they have all required substantial amounts of human ingenuity. Saying that a particular solution style minimizes a particular sort of human ingenuity that is needed while not taking into account all the other places that it forces human ingenuity (and carbon footprint) to be expended is a terribly myopic view of the world.

This review, including this comment, is seventy six words shorter than Sutton’s post.

January 1, 2019 — Dated Predictions

Predictions Scorecard, 2019 January 01

rodneybrooks.com/predictions-scorecard-2019-january-01/

On January 1st, 2018, I made predictions (here) about self driving cars, Artificial Intelligence and machine learning, and about progress in the space industry. Those predictions had dates attached to them for 32 years up through January 1st, 2050.

So, today, January 1st, 2019, is my first annual self appraisal of how well I did. I’ll try to do this every year for 32 years, if I last that long.

I am not going to edit my original post, linked above, at all, even though I see there are a few typos still lurking in it. Instead I have copied the three tables of predictions below. I have changed the header of the third column in each case to “2018 Comments”, but left the comments exactly as they were, and added a fourth column titled “Updates”. In one case I fixed a typo (about self driving taxis in Cambridgeport and Greenwich Village) in the left most column. I have started highlighting the dates in column two where the time they refer to has arrived, and I am starting to put comments in the updates fourth column.

I will tag each comment in the fourth column with a cyan colored date tag in the form yyyymmdd such as 20190603 for June 3^rd, 2019.

The entries that I put in the second column of each table, titled “Date” in each case, back on January 1^st of 2018, have the following forms:

NIML meaning “Not In My Lifetime, i.e., not until beyond December 31^st, 2049, the last day of the first half of the 21^st century.

NET some date, meaning “No Earlier Than” that date.

BY some date, meaning “By” that date.

Sometimes I gave both a NET and a BY for a single prediction, establishing a window in which I believe it will happen.

For now I am coloring those statements when it can be determined already whether I was correct or not.

I will color dates Tomato (#ff6347) if I was too pessimistic about them. No Tomato dates yet. But if something happens that I said NIML, for instance then it would go Tomato, or if in 2019 something already had happened that I said NET 2020, then that too would go Tomato.

In summary then: Green splashes mean I got things exactly right. Red means provably wrong and that I was too pessimistic. And blueness will mean that I was overly optimistic.

So now, here are the updated tables. So far none of my predictions have been at all wrong–there is only one direction to go from here!

No predictions have yet been relevant for self driving cars, but I have added one comment in this first table.

Prediction [Self Driving Cars]	Date	2018 Comments	Updates
A flying car can be purchased by any US resident if they have enough money.	NET 2036	There is a real possibility that this will not happen at all by 2050.
Flying cars reach 0.01% of US total cars.	NET 2042	That would be about 26,000 flying cars given today's total.
Flying cars reach 0.1% of US total cars.	NIML
First dedicated lane where only cars in truly driverless mode are allowed on a public freeway.	NET 2021	This is a bit like current day HOV lanes. My bet is the left most lane on 101 between SF and Silicon Valley (currently largely the domain of speeding Teslas in any case). People will have to have their hands on the wheel until the car is in the dedicated lane.
Such a dedicated lane where the cars communicate and drive with reduced spacing at higher speed than people are allowed to drive	NET 2024
First driverless "taxi" service in a major US city, with dedicated pick up and drop off points, and restrictions on weather and time of day.	NET 2022	The pick up and drop off points will not be parking spots, but like bus stops they will be marked and restricted for that purpose only.	20190101 Although a few such services have been announced every one of them operates with human safety drivers on board. And some operate on a fixed route and so do not count as a "taxi" service--they are shuttle buses. And those that are "taxi" services only let a very small number of carefully pre-approved people use them. We'll have more to argue about when any of these services do truly go driverless. That means no human driver in the vehicle, or even operating it remotely.
Such "taxi" services where the cars are also used with drivers at other times and with extended geography, in 10 major US cities	NET 2025	A key predictor here is when the sensors get cheap enough that using the car with a driver and not using those sensors still makes economic sense.
Such "taxi" service as above in 50 of the 100 biggest US cities.	NET 2028	It will be a very slow start and roll out. The designated pick up and drop off points may be used by multiple vendors, with communication between them in order to schedule cars in and out.
Dedicated driverless package delivery vehicles in very restricted geographies of a major US city.	NET 2023	The geographies will have to be where the roads are wide enough for other drivers to get around stopped vehicles.
A (profitable) parking garage where certain brands of cars can be left and picked up at the entrance and they will go park themselves in a human free environment.	NET 2023	The economic incentive is much higher parking density, and it will require communication between the cars and the garage infrastructure.
A driverless "taxi" service in a major US city with arbitrary pick and drop off locations, even in a restricted geographical area.	NET 2032	This is what Uber, Lyft, and conventional taxi services can do today.
Driverless taxi services operating on all streets in Cambridgeport, MA, and Greenwich Village, NY.	NET 2035	Unless parking and human drivers are banned from those areas before then.
A major city bans parking and cars with drivers from a non-trivial portion of a city so that driverless cars have free reign in that area.	NET 2027 BY 2031	This will be the starting point for a turning of the tide towards driverless cars.
The majority of US cities have the majority of their downtown under such rules.	NET 2045
Electric cars hit 30% of US car sales.	NET 2027
Electric car sales in the US make up essentially 100% of the sales.	NET 2038
Individually owned cars can go underground onto a pallet and be whisked underground to another location in a city at more than 100mph.	NIML	There might be some small demonstration projects, but they will be just that, not real, viable mass market services.
First time that a car equipped with some version of a solution for the trolley problem is involved in an accident where it is practically invoked.	NIML	Recall that a variation of this was a key plot aspect in the movie "I, Robot", where a robot had rescued the Will Smith character after a car accident at the expense of letting a young girl die.

Right after the Artificial Intelligence and machine learning table I have some links to back up my assertions.

Prediction [AI and ML]	Date	2018 Comments	Updates
Academic rumblings about the limits of Deep Learning	BY 2017	Oh, this is already happening... the pace will pick up.	20190101 There were plenty of papers published on limits of Deep Learning. I've provided links to some right below this table.
The technical press starts reporting about limits of Deep Learning, and limits of reinforcement learning of game play.	BY 2018		20190101 Likewise some technical press stories are linked below.
The popular press starts having stories that the era of Deep Learning is over.	BY 2020
VCs figure out that for an investment to pay off there needs to be something more than "X + Deep Learning".	NET 2021	I am being a little cynical here, and of course there will be no way to know when things change exactly.
Emergence of the generally agreed upon "next big thing" in AI beyond deep learning.	NET 2023 BY 2027	Whatever this turns out to be, it will be something that someone is already working on, and there are already published papers about it. There will be many claims on this title earlier than 2023, but none of them will pan out.
The press, and researchers, generally mature beyond the so-called "Turing Test" and Asimov's three laws as valid measures of progress in AI and ML.	NET 2022	I wish, I really wish.
Dexterous robot hands generally available.	NET 2030 BY 2040 (I hope!)	Despite some impressive lab demonstrations we have not actually seen any improvement in widely deployed robotic hands or end effectors in the last 40 years.
A robot that can navigate around just about any US home, with its steps, its clutter, its narrow pathways between furniture, etc.	Lab demo: NET 2026 Expensive product: NET 2030 Affordable product: NET 2035	What is easy for humans is still very, very hard for robots.
A robot that can provide physical assistance to the elderly over multiple tasks (e.g., getting into and out of bed, washing, using the toilet, etc.) rather than just a point solution.	NET 2028	There may be point solution robots before that. But soon the houses of the elderly will be cluttered with too many robots.
A robot that can carry out the last 10 yards of delivery, getting from a vehicle into a house and putting the package inside the front door.	Lab demo: NET 2025 Deployed systems: NET 2028
A conversational agent that both carries long term context, and does not easily fall into recognizable and repeated patterns.	Lab demo: NET 2023 Deployed systems: 2025	Deployment platforms already exist (e.g., Google Home and Amazon Echo) so it will be a fast track from lab demo to wide spread deployment.
An AI system with an ongoing existence (no day is the repeat of another day as it currently is for all AI systems) at the level of a mouse.	NET 2030	I will need a whole new blog post to explain this...
A robot that seems as intelligent, as attentive, and as faithful, as a dog.	NET 2048	This is so much harder than most people imagine it to be--many think we are already there; I say we are not at all there.
A robot that has any real idea about its own existence, or the existence of humans in the way that a six year old understands humans.	NIML

With regards to academic rumblings about deep learning, in 2017 there was a new cottage industry in attacking deep learning by constructing fake images for which a deep learning network gave high scores for ridiculous interpretations. These are known as adversarial attacks on deep learning, and some defenders counter claim that such images will never arrive in practice.

But then in 2018 others found images that were completely natural that fooled particular deep learning networks. A group of researchers from Auburn University in Alabama show how an otherwise well trained network can just completely misclassify objects with unusual orientations, in ways which no human would get wrong at all. Here are some examples:

We humans can see why or how a network might get the first one wrong for instance. It is a large yellow object across a snowy road. But other clues, like the size of the person standing in front of it immediately get us to understand that it is a school bus on its side across the road, and we are looking at its roof.

And here is a paper from researchers at York University and the University of Toronto (both in Toronto) with this abstract:

We showcase a family of common failures of state-of-the art object detectors. These are obtained by replacing image sub-regions by another sub-image that contains a trained object. We call this “object transplanting”. Modifying an image in this manner is shown to have a non-local impact on object detection. Slight changes in object position can affect its identity according to an object detector as well as that of other objects in the image. We provide some analysis and suggest possible reasons for the reported phenomena.

In all their images a human can easily see that an object (e.g., an elephant, say, and hence the very clever title of the paper, “The Elephant in the Room”) has been pasted on to a real scene, and both understand the real scene and identify the object pasted on. The deep learning network can often do neither.

Other academics took to more popular press outlets to express their concerns that the press was overhyping deep learning, and showing what the limits are in reality. There was a piece by Michael Jordan of UC Berkeley in Medium, an op-ed in the New York Times by Gary Marcus and Ernest Davis of NYU and a story on the limits of Google Translate in the Atlantic by Douglas Hofstadter of Indiana University at Bloomington.

As for stories in the technical press there were many that sounded warning alarms about how deep learning was not necessarily going to the greatest most important technical breakthrough in the history of mankind. I must admit, however, that more than 99% of the popular press stories did lean towards that far fetched conclusion, especially in the headlines.

Here is PC Magazine talking about the limits in language understanding, Forbes magazine on the overhyping of deep learning. A national security newsletter quotes a Nobel prizewinner on AI:

Intuition, insight, and learning are no longer exclusive possessions of human beings: any large high-speed computer can be programed to exhibit them also.

This was said by Herb Simon in 1958. The newsletter goes on to warn that over hype is nothing new in AI and that it could well lead to another AI winter. Harvard Magazine reports on the dangers applying a an inadequate AI system to decision making about humans. And many many outlets reported on an experimental Amazon recruiting tool that learned biases against women candidates from looking at how humans had evaluated CVs.

The press is not yet fully woke with regard to AI, and deep learning in particular, but there are signs and examples of wokeness showing up all over.

Developments in space were the most active for this first year, and fortunately both my optimism and pessimism were well place and were each rewarded.

Prediction [Space]	Date	2018 Comments	Updates
Next launch of people (test pilots/engineers) on a sub-orbital flight by a private company.	BY 2018		20190101 Virgin Galactic did this on December 13, 2018.
A few handfuls of customers, paying for those flights.	NET 2020
A regular sub weekly cadence of such flights.	NET 2022 BY 2026
Regular paying customer orbital flights.	NET 2027	Russia offered paid flights to the ISS, but there were only 8 such flights (7 different tourists). They are now suspended indefinitely.
Next launch of people into orbit on a US booster.	NET 2019 BY 2021 BY 2022 (2 different companies)	Current schedule says 2018.	20190101 It didn't happen in 2018. Now both SpaceX and Boeing say they will do it in 2019.
Two paying customers go on a loop around the Moon, launch on Falcon Heavy.	NET 2020	The most recent prediction has been 4th quarter 2018. That is not going to happen.	20190101 I'm calling this one now as SpaceX has revised their plans from a Falcon Heavy to their still developing BFR (or whatever it gets called), and predict 2023. I.e., it has slipped 5 years in the last year.
Land cargo on Mars for humans to use at a later date	NET 2026	SpaceX has said by 2022. I think 2026 is optimistic but it might be pushed to happen as a statement that it can be done, rather than for an pressing practical reason.
Humans on Mars make use of cargo previously landed there.	NET 2032	Sorry, it is just going to take longer than every one expects.
First "permanent" human colony on Mars.	NET 2036	It will be magical for the human race if this happens by then. It will truly inspire us all.
Point to point transport on Earth in an hour or so (using a BF rocket).	NIML	This will not happen without some major new breakthrough of which we currently have no inkling.
Regular service of Hyperloop between two cities.	NIML	I can't help but be reminded of when Chuck Yeager described the Mercury program as "Spam in a can".

July 15, 2018 — Essays

[FoR&AI] Steps Toward Super Intelligence IV, Things to Work on Now

rodneybrooks.com/forai-steps-toward-super-intelligence-iv-things-to-work-on-now/

[This is the fourth part of a four part essay–here is Part I.]

We have been talking about building an Artificial General Intelligence agent, or even a Super Intelligence agent. How are we going to get there? How are we going get to ECW and SLP? What do researchers need to work on now?

In a little bit I’m going to introduce four pseudo goals, based on the capabilities and competences of children. That will be my fourth big list of things in these four parts of this essay. Just to summarize so the numbers and lists don’t get too confusing here is what I have described and proposed over these four sub essays:

Part I	4 Previous approaches to AI
Part II	2 New Turing Test replacements
Part III	7 (of many) things that are currently hard for AI
Part IV	4 Ways to make immediate progress

But what should AI researchers actually work on now?

I think we need to work on architectures of intelligent beings, whether they live in the real world or in cyber space. And I think that we need to work on structured modules that will give the base compositional capabilities, ground everything in perception and action in the world, have useful spatial representations and manipulations, provide enough ability to react to the world on short time scales, and to adequately handle ambiguity across all these domains.

First let’s talk about architectures for intelligent beings.

Currently all AI systems operate within some sort of structure, but it is not the structure of something with ongoing existence. They operate as transactional programs that people run when they want something.

Consider AlphaGo, the program that beat 18 time world Go champion, Lee Sedol, in March of 2016. The program had no idea that it was playing a game, that people exist, or that there is two dimensional territory in the real world–it didn’t know that a real world exists. So AlphaGo was very different from Lee Sedol who is a living, breathing human who takes care of his existence in the world.

I remember seeing someone comment at the time that Lee Sedol was supported by a cup of coffee. And Alpha Go was supported by 200 human engineers. They got it processors in the cloud on which to run, managed software versions, fed AlphaGo the moves (Lee Sedol merely looked at the board with his own two eyes), played AlphaGo’s desired moves on the board, rebooted everything when necessary, and generally enabled AlphaGo to play at all. That is not a Super Intelligence, it is a super basket case.

So the very first thing we need is programs, whether they are embodied or not, that can take care of their own needs, understand the world in which they live (be it the cloud or the physical world) and ensure their ongoing existence. A Roomba does a little of this, finding its recharger when it is low on power, indicating to humans that it needs its dust bin emptied, and asking for help when it gets stuck. That is hardly the level of self sufficiency we need for ECW, but it is an indication of the sort of thing I mean.

Now about the structured modules that were the subject of my second point.

The seven examples I gave, in Part III, of things which are currently hard for Artificial Intelligence, are all good starting points. But they were just seven that I chose for illustrative purposes. There are a number of people who have been thinking about the issue, and they have come up with their own considered lists.

Some might argue, based on the great success of letting Deep Learning learn not only spoken words themselves but the feature detectors for early processing of phonemes that we are better off letting learning figure everything out. My point about color constancy is that it is not something that naturally arises from simply looking at online images. It comes about in the real world from natural evolution building mechanisms to compensate for the fact that objects don’t actually change their inherent color when the light impinging on them changes. That capability is an innate characteristic of evolved organisms whenever it matters to them. We are most likely to get there quicker if we build some of the important modules ahead of time.

And for the hard core learning festishists here is a question to ask them. Would they prefer that their payroll department, their mortgage provider, or the Internal Revenue Service (the US income tax authority) use an Excel spreadsheet to calculate financial matters for them, or would they trust these parts of their lives to a trained Deep Learning network that had seen millions of examples of spreadsheets and encoded all that learning in weights in a network? You know what they are going to answer. When it comes to such a crunch even they will admit that learning from examples is not necessarily the best approach.

Gary Marcus, who I quoted along with Ernest Davis about common sense in Part III, has talked about his list of modules¹ that are most important to build in. They are:

Representations of objects
Structured, algebraic representations
Operations over variables
A type-token distinction
A capacity to represent sets, locations, paths, trajectories, obstacles and enduring individuals
A way of representing the affordances of objects
Spatiotemporal contiguity
Causality
Translational invariance
Capacity for cost-benefit analysis

Others will have different explicit lists, but as long as people are working on innate modules that can be combined within a structure of some entity with an ongoing existence and its own ongoing projects, that can be combined within a system that perceives and acts on its world, and that can be combined within a system that is doing something real rather than a toy online demonstration, then progress will be being made.

And note, we have totally managed to avoid the question of consciousness. Whether either ECW or SLP need to conscious in any way at all, is, I think, an open question. And it will remain so as long as we have no understanding at all of consciousness. And we have none!

HOW WILL WE KNOW IF WE ARE GETTING THERE?

Alan Turing introduced The Imitation Game, in his 1950 paper Computing Machinery and Intelligence. His intent was, as he said in the very first sentence of the paper, to consider the question “Can Machines Think?”. He used the game as a rhetorical device to discuss objections to whether or not a machine could be capable of “thinking”. And while he did make a prediction of when a machine would be able to play the game (a 70% change of fooling a human that the machine was a human in the year 2000), I don’t think that he meant the game as a benchmark for machine intelligence.

But the press, over the years, rather than real Artificial Intelligence researchers, picked up on this game and it became known as the Turing Test. For some, whether or not a machine could beat a human at this parlor game, became the acid test of progress in Artificial Intelligence. It was never a particularly good test, and so the big “tournaments” organized around it were largely ignored by serious researchers, and eventually pretty dumb chat bots that were not at all intelligent started to get crowned as the winners.

Meanwhile real researchers were competing in DARPA competitions such as the Grand Challenge, Urban Grand Challenge (which lead directly to all the current work on self driving cars), and the Robot Challenge.

We could imagine tests or competitions being set up for how well an embodied and a disembodied Artificial Intelligence system perform at the ECW and SLP tasks. But I fear that like the Turing Test itself these new tests would get bastardized and gamed. I am content to see the market choose the best versions of ECW and SLP–unlike a pure chatterer that can game the Turing Test, I think such systems can have real economic value. So no tests or competitions for ECWs and SLPs.

I have never been a great fan of competitions for research domains as I have always felt that it leads to group think, and a lot of effort going into gaming the rules. And, I think that specific stated goals can lead to competitions being formed, even when none may have been intended, as in the case of the Turing Test.

Instead I am going to give four specific goals here. Each of them is couched in terms of the competence of capabilities of human children of certain ages.

The object recognition capabilities of a two year old.
The language understanding capabilities of a four year old.
The manual dexterity of a six year old.
The social understanding of an eight year old.

Like most people’s understanding of what is pornography or art there is no formal definition that I want to use to back up these goals. I mean them in the way that generally informed people would gauge the performance of an AI system after extended interaction with it, and assumes that they would also have had extended interactions with children of the appropriate age.

These goals are not meant to be defined by “performance tests” that children or an AI system might take. They are meant as unambiguous levels of competence. The confusion between performance and competence was my third deadly sin in my recent post about the mistakes people make in understanding how far along we are with Artificial Intelligence.

If we are going to make real progress towards super, or just every day general, Artificial Intelligence then I think it is imperative that we concentrate on general competence in areas rather than flashy hype bait worthy performances.

Down with performance as a measure, I say, and up with the much fuzzier notion of competence as a measure of whether we are making progress.

So what sort of competence are we talking about for each of these for cases?

2 year old Object Recognition competence. A two year old already has color constancy, and can describe things by at least a few color words. But much more than this they can handle object classes, mapping what they see visually to function.

A two year old child can know that something is deliberately meant to function as a chair even if it is unlike any chair they have seen before. It can have a different number of legs, it can be made of different material, its legs can be shaped very oddly, it can even be a toy chair meant for dolls. A two year old child is not fazed by this at all. Despite having no visual features in common with any other chair the child has ever seen before the child can declare a new chair to be a chair. This is completely different from how a neural network is able to classify things visually.

But more than that, even, a child can see something that is not designed to function as a chair, and can assess whether the object, or location can be used as a chair. The can see a rock and decide that it can be sat upon, or look for a better place where there is something that will functionally act as a seat back.

So two year old children have sophisticated understandings of classes of objects. Once, while I was giving a public talk, a mother felt compelled to leave with her small child who was making a bit of a noisy fuss. I called her back and asked her how old the child was. “Two” came the reply. Perfect for the part of the talk I was just getting to. Live, with the audience watching I made peace with the little girl and asked if she could come up on stage with me. Then I pulled out my key ring, telling the audience that this child would be able to recognize the class of a particular object that she had never seen before. Then I held up one key and asked the two year old girl what it was. She looked at me with puzzlement. Then said, with a little bit of scorn in her voice, “a key”, as though I was an idiot for not knowing what it was. The audience loved it, and the young girl was cheered by their enthusiastic reaction to her!

But wait, there is more! A two year old can do one-shot visual learning from multiple different sources. Suppose a two year old has never been exposed to a giraffe in any way at all. Then seeing just one of a hand drawn picture of a giraffe, a photo of a giraffe, a stuffed toy giraffe, a movie of a giraffe, or seeing one in person for just a few seconds, will forever lock the concept of a giraffe into that two year old’s mind. That child will forever be able to recognize a giraffe as a giraffe, whatever form it is represented in. Most people have never seen a live giraffe, and none have ever seen a live dinosaur, but the are easy for anyone to recognize.

Try that, Deep Learning. One example, in one form!

4 year old Language Understanding competence. Most four year old children can not read or write, but they can certainly talk and listen. They well understand the give and take of vocal turn-taking, know when they are interrupting, and know when someone is interrupting them. They understand and use prosody to great effect, along with animation of their faces, heads and whole bodies. Likewise they read these same cues from other speakers, and make good use of both projecting and detecting gaze direction in conversations amongst multiple people, perhaps as side conversations occur.

Four year old children understand when they are in conversation with someone, and (usually) when that conversation has ended, or the participants have changed. If there are three of four people in a conversation they do not need to name who they are delivering remarks to, nor to hear their name at the beginning of an utterance in order to understand when a particular remark is directed at them–they use all the non-spoken parts of communication to make the right inferences.

All of this is very different from today’s speech with agents such as the Amazon Echo, or Google Home. It is also different in that a four year old child can carry the context generated by many minutes of conversation. They can understand incomplete sentences, and can generate short meaningful interjections of just a word or two that make sense in context and push forward everyone’s mutual understanding.

A four year old child, like the remarkable progress in computer speech understanding over the last five years due to Deep Learning, can pick out speech in noisy environments, tuning out background noise and concentrating on speech directed at them, or just what they want to hear from another ongoing conversation not directed at them. They can handle strong accents that they have never heard before and still extract accurate meaning in discussions with another person.

They can deduce gender and age from the speech patterns of another, and they are finely attuned to someone they know speaking differently than usual. They can understand shouted, whispered, and sung speech. They themselves can sing, whisper and shout, and often do so appropriately.

And they are skilled in the complexity of sentences that they can handle. They understand many subtleties of tense, they can talk in and understand hypotheticals. Then can engage in and understand nonsense talk, and weave a pattern of understanding through it. They know when the are lying, and can work to hide that fact in their speech patterns.

They are so much more language capable than any of our AI systems, symbolic or neural.

6 year old Manual Dexterity competence. A six year old child, unless some super prodigy, is not able to play Chopin on the piano. But they are able to do remarkable feats of manipulation, with their still tiny hands, that no robot can do. When they see an object for the first time they fairly reliably estimate whether they can pick it up one handed, two handed, or two arms and whole body (using their stomach or chests as an additional anchor region), or not at all. For a one handed grasp they preshape their hand as they reach towards it having decided ahead of time what sort of grasp they are going to use. I’m pretty sure that a six old can do all these human grasps:

[I do not know the provenance of this image–I found it at a drawing web site here.] A six year old can turn on faucets, tie shoe laces, write legibly, open windows, raise and lower blinds if they are not too heavy, and they can use chopsticks in order to eat, even with non-rigid food. They are quite dexterous. With a little instruction they can cut vegetables, wipe down table tops, open and close food containers, open and close closets, and lift stacks of flat things into and out of those closets.

Six year old children can manipulate their non-rigid clothes, fold them, though not as well as skilled adult (I am not a skilled adult in this regard…), manipulate them enough to put them on and off themselves, and their dolls.

Furthermore, they can safely pick up a cat and even a moderately sized dog, and often are quite adept and trustworthy picking up their very young siblings. They can caress their grandparents.

They can wipe their bums without making a mess (most of the time).

ECW will most likely need to be able to do all these things, with scaled up masses (e.g., lifting or dressing a full sized adult which is beyond the strength capabilities of a six year old child).

We do not have any robots today that can do any of these things in the general case where a robot can be placed in a new environment with new instances of objects that have not been seen before, and do any of these tasks.

Going after these levels of manipulation skill will result in robots backed by new forms of AI that can do the manual tasks that we expect of humans, and that will be necessary for giving care to other humans.

8 year old Social Understanding competence. By age eight children are able to articulate their own beliefs, desires, and intentions, at least about concrete things in the world. They are also able to understand that other people may have different beliefs, desires, and intentions, and when asked the right questions can articulate that too.

Furthermore, they can reason about what they believe versus what another person might believe and articulate that divergence. A particular test for this is known as the “false-belief task”. There are many variations on this, but essentially what happens is that an experimenter lets a child see a person make an observation of a person seeing that Box A contains, say, a toy elephant, and that Box B is empty. That person leaves the room, and the experimenter then, in full sight of the child moves the toy elephant to Box B. They then ask the child which box contains the toy elephant, and of course the child says Box B. But the crucial question is to ask the child where the person who left the room will look for the toy elephant when they are asked to find it after they have come back into the room. Once the child is old enough (and there are many experiments and variations here) they are able to tell the experimenter that the person will look in Box A, knowing that is based on a belief the person has which is now factually false.

There is a vast literature on this and many other aspects of understanding other people, and also a vast literature on testing such knowledge for very young children but also for chimpanzees, dogs, birds, and other animals on what they might understand–without the availability of language these experiments can be very hard to design.

And there are many many aspects of social understanding, including inferring a person’s desire or intent from their actions, and understanding why they may have those desires and intents. Some psychological disorders are manifestations of not being able to make such inferences. But in our normal social environment we assume a functional capability in many of these areas about others with whom we are interacting. We don’t feel the need to explain certain things to others as surely they will know from what they are observing. And we also observe the flow of knowledge ourselves and are able to make helpful suggestions as we see people acting in the world. We do this all the time, pointing to things, saying “over there”, or otherwise being helpful, even to complete strangers.

Social understanding is the juice that makes us humans into a coherent whole. And, we have versions of social understanding for our pets, but not for our plants. Eight year old children have enough of it for much of every day life.

Improvement in Competence will lead the way

These competencies of two, four, six, and eight year old children will all come into play for ECW and SLP. Without these competencies, our intelligent systems will never seem natural or as intelligent as us. With these competencies, whether they are implemented in ways copied from humans or not (birds vs airplanes) our intelligent systems will have a shot at appearing as intelligent as us. They are crucial for an Artificial Generally Intelligent system, or for anything that we will be willing to ascribe Super Intelligence to.

So, let’s make progress, real progress, not simple hype bait, on all four of these systems level goals. And then, for really the first time in sixty years we will actually be part ways towards machines with human level intelligence and competence.

In reality it will just be a small part of the way, and even less of the way to towards Super Intelligence.

It turns out that constructing deities is really really hard. Even when they are in our own image.

¹“Innateness, AlphaZero, and Artificial Intelligence“, Gary Marcus, submitted to arXiv, January 2018.