
Predictions Scorecard, 2026 January 01

rodneybrooks.com/predictions-scorecard-2026-january-01/

Nothing is ever as good as it first seems and nothing is ever as bad as it first seems.

— A best memory paraphrase of advice given to me by Vice Admiral Joe Dyer, former chief test pilot of the US Navy and former Commander of NAVAIR.

[You can follow me on social media: @rodneyabrooks.bsky.social and see my publications etc., at https://people.csail.mit.edu/brooks]

Table of contents
Introduction
   What I Nearly Got Wrong
   What Has Surprised Me, And That I Missed 8 Years Ago
   My Color Scheme and Past Analysis

My New Predictions
   Quantum Computers
   Self Driving Cars
   Humanoid Robots
   Neural Computation
   LLMs

Self Driving Cars
   A Brief Recap of what "Self Driving" Cars Means and Meant
   My Own Experiences with Waymo in 2025
   Self Driving Taxi Services
     __Cruise
     __Tesla
     __Waymo
     __Zoox
   Electric Cars
   Flying Cars

Robotics, AI, and Machine Learning
   Capabilities and Competences
   World Models
   Situatedness vs Embodiment
   Dexterous Hands

Human Space Flight
   Orbital Crewed Flights
   Suborbital Crewed Flights
   Boeing's Starliner
   SpaceX Falcon 9
   NASA, Artemis, and Returning to the Moon
   SpaceX Starship
   Blue Origin Gets to Orbit
   New Space Stations

Addendum
Introduction

This is my eighth annual update on how my dated predictions from January 1st, 2018 concerning (1) self driving cars, (2) robotics, AI, and machine learning, and (3) human space travel, have held up. I promised then to review them at the start of the year every year until 2050 (right after my 95th birthday), thirty two years in total. The idea was to hold myself accountable for those predictions. How right or wrong was I?

The summary is that my predictions held up pretty well, though overall I was a little too optimistic. That is a little ironic, as I think that many people who read my predictions back on  January 1st, 2018 thought that I was very pessimistic compared to the then zeitgeist. I prefer to think of myself as being a realist.

And did I see LLMs coming? No and yes. Yes, I did say that something new and big that everyone accepted as the new and big thing in AI would come along no earlier than 2023, and that the key paper for its success had already been written before I made my first predictions. And indeed LLMs were generally accepted as the next big thing in 2023 (I was lucky on that date), and the key paper, Attention Is All You Need, was indeed already written, and had first appeared in June of 2017. I wrote about this extensively in last year’s scorecard. But no, I had no idea it would be LLMs at the time of my correct prediction that something big would appear. And that lack of specificity on the details of exactly what will be invented and when is the case with all my predictions from the first day of 2018.

I did not claim to be clairvoyant about exactly what would happen, rather I was making predictions about the speed of new research ideas, the speed of hype generation, the speed of large scale deployments of new technologies, and the speed of fundamental changes propagating through the world’s economy. Those speeds are very different and driven by very different realities. I think that many people get confused by that and make the mistake of jumping between those domains of reality, thinking all the speeds will be the same.  In my case my estimates of those speeds are informed by watching AI and robotics professionally, for 42 years at the time of my predictions. I became a graduate student in Artificial Intelligence in January of 1976, just shy of 20 years after the initial public outing of the term Artificial Intelligence at the summer workshop in 1956 at Dartmouth. And now as of today I have been in that field for 50 years.

I promised to track my predictions made eight years ago today for 32 years. So I am one quarter of the way there. But the density of specific years for events, or for marking percentages of adoption, that I predicted starts to fall off right around now.

Sometime during 2026 I will bundle up all my comments over the eight years specifically mentioning years that have now passed, and put them in an archival mid-year post. Then I will get rid of the three big long tables that dominate the body of this annual post, and have short updates on the sparse dates for the next 24 years.

I will continue to summarize what has happened in self-driving cars generally, including electrification progress and the forever promised flying cars, along with AI and robotics, and human space flight. But early in 2025 I made five new predictions for the coming ten years, without specific dates, but which summarize what I think will happen.  I will track these predictions too.

What I Nearly Got Wrong

The day before my original prediction post in 2018 the price of Bitcoin had opened at $12,897.70 and topped out at $14,377.40 and 2017 had been the first year it had ever traded at over $1,000. The price seemed insane to me as Bitcoin wasn’t being used for the task for which it had been designed. The price seemed to me then, and now, to be purely about speculation. I almost predicted when it would be priced at $200, on the way down. But, fortunately, I checked myself as I realized that the then current state of the market made no sense to me and so any future state may not either. Besides, I had no experience or expertise in crypto pricing. So I left that prediction out. I had no basis to make a prediction. That was a wise decision, and I revisit that reasoning as I make new predictions now, and implore myself to only make predictions in fields where I know something.

What Has Surprised Me, And That I Missed 8 Years Ago

I made some predictions about the future of SpaceX although I didn’t always label them as being about SpaceX. A number of my predictions were in response to pronouncements by the CEO of SpaceX. My predictions were much more measured and some might say even pessimistic. Those predictions so far have turned out to be more optimistic than how reality has unfolded.

I had made no specific predictions about Falcon 9, though I did make predictions about the subsequent SpaceX launch family, now called Starship, but then known as BFR, which eight years later has not gotten into orbit.

In the meantime SpaceX has scaled the Falcon 9 launch rate at a phenomenal speed, and the magnitude of the growth is very surprising.

Eight years ago, Falcon 9 had been launched 46 times, all successful, over the previous eight years, and it had recently had a long run of successful landings of the booster whenever attempted. At that time five launches had been on a previously used booster, but there had been no attempts to launch Falcon Heavy with its three boosters strapped together.

Now we are eight years on from those first eight years of Falcon 9 launches. The scale and success rate of the launches have made each individual launch an unremarkable event, with humans being launched a handful of times per year. Now the Falcon 9 score card stands at 582 launches with only one failed booster, and there have been 11 launches of the three booster Falcon Heavy, all successful. That is a sustained growth rate of 38% year over year for eight years. And that is a very high sustained deployment growth rate for any complex technology.
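One way to reproduce that figure is to treat the cumulative launch count as the thing that is growing and compute its compound annual rate over the eight years. A minimal sketch, using only the 46 and 582 launch totals quoted above:

```python
# Compound annual growth of the cumulative Falcon 9 launch count,
# from 46 launches (end of 2017) to 582 launches (end of 2025).
start, end, years = 46, 582, 8
growth = (end / start) ** (1 / years) - 1
print(f"{growth:.1%}")   # prints 37.3%, i.e. roughly the 38% quoted above
```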

There is no other modern rocket with such a volume of launches that comes even close to the Falcon 9 record.  And I certainly did not foresee this volume of launches. About half the launches have had SpaceX itself as the customer, starting in February 2018, launching an enormous satellite constellation (about two thirds of all satellites ever orbited) to support Starlink bringing internet to everywhere on the surface of Earth.

But… there is one historical rocket, a suborbital one which has a much higher record of use than Falcon 9 over a much briefer period. The German V-2 was the first rocket to fly above the atmosphere and the first ballistic missile to be used to deliver bombs. It was fueled with ethanol and liquid oxygen, and was steered by an analog computer that also received inputs from radio guide signals–it was the first operational liquid fueled rocket. It was developed in Germany in the early 1940’s and after more than a thousand test launches was first put into operation on September 7th, 1944, landing a bomb on Paris less than two weeks after the Allied liberation of that city. In the remaining 8 months of the war 3,172 armed V-2 rockets were launched at targets in five countries — 1,358 were targeted at London alone.

My Color Scheme and Past Analysis

The acronyms I used for predictions in my original post were as follows.

NET year means it will not happen before that year (No Earlier Than)
BY year means I predict that it will happen by that year.
NIML, Not In My Lifetime, i.e., not before 2050.

As the mentioned years pass, I color them as accurate, too pessimistic, or too optimistic.

Last year I added hemming and hawing. This is for when something that looks just like the thing I said would take a lot longer appears to have happened, but the underlying achievement is not what everyone expected and not what was actually delivered. This is mostly for things that were talked about as being likely to happen with no human intervention, and it now appears to happen that way, but in reality there are humans in the loop that the companies never disclose. So the technology that was promised has not actually been delivered, but everyone thinks it has been.

When I quote myself I do so in orange, and when I quote others I do so in blue.

I have not changed any of the text of the first three columns of the prediction tables since their publication on the first day of 2018. I only change the text in the fourth column to say what actually happened. This meant that by four years ago that fourth column was getting very long and skinny, so I removed those comments and started fresh two years ago. I have kept the last two years' comments and added new ones, with yellow backgrounds, for this year, removing the yellow backgrounds from the 2025 comments that were there last year. If you want to see the previous five years of comments you can go back to the 2023 scorecard.



My New Predictions

On March 26th, 2025, I skeeted out five technology predictions, talking about developments over the next ten years through January 1st, 2036. Three weeks later I included them in a blog post. Here they are again.

1. Quantum Computers. The successful ones will emulate physical systems directly for specialized classes of problems rather than translating conventional general computation into quantum hardware. Think of them as 21st century analog computers. Impact will be on materials and physics computations.

2. Self Driving Cars. In the US the players that will determine whether self driving cars are successful or abandoned are #1 Waymo (Google) and #2 Zoox (Amazon). No one else matters. The key metric will be human intervention rate as that will determine profitability.

3. Humanoid Robots. Deployable dexterity will remain pathetic compared to human hands beyond 2036. Without new types of mechanical systems walking humanoids will remain too unsafe to be in close proximity to real humans.

4. Neural Computation. There will be small and impactful academic forays into neuralish systems that are well beyond the linear threshold systems, developed by 1960, that are the foundation of recent successes. Clear winners will not yet emerge by 2036 but there will be multiple candidates.

5. LLMs. LLMs that can explain which data led to what outputs will be key to non annoying/dangerous/stupid deployments. They will be surrounded by lots of mechanism to keep them boxed in, and those mechanisms, not yet invented for most applications, will be where the arms races occur.

These five predictions are specifically about what will happen in these five fields during the ten years from 2026 through 2035, inclusive. They are not saying when particular things will happen, rather they are saying whether or not  certain things will happen in that decade. I will do my initial analysis of these five new predictions immediately below. For the next ten years I will expand on each of these reviews in this annual scorecard, along with reviews of my earlier predictions. The ten years for these predictions are up on January 1st, 2036. I will have just turned 81 years old then, so let’s see if I am still coherent enough to do this.

Quantum Computers

The successful ones will emulate physical systems directly for specialized classes of problems rather than translating conventional general computation into quantum hardware. Think of them as 21st century analog computers. Impact will be on materials and physics computations.

The original excitement about quantum computers was stimulated by a paper by Peter Shor in 1994 which gave a digital quantum algorithm to factor large integers much faster than a conventional digital computer. Factoring integers is often referred to as “the IFP” for the integer factorization problem.

So what? The excitement around this was based on how modern cryptography, which provides our basic security for on-line commerce, works under the hood.

Much of the internet’s security is based on it being hard to factor a large number. For instance, in the RSA algorithm Alice tells everyone a large number (in different practical versions it has 1024, 2048, or 4096 bits) whose prime factors she knows. But she tells people only the number, not its factors. In fact she chose that number by multiplying together some very large prime numbers — very large prime numbers are fairly easy to generate (using the Miller-Rabin test). Anyone, usually known as Bob, can then use that number to encrypt a message intended for Alice. No one, neither Tom, Dick, nor Harry, can decrypt that message unless they can find the prime factors of Alice’s public number. But Alice knows them and can read the message intended only for her eyes.
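For the curious, here is a toy sketch of that structure in Python, with absurdly small primes so the numbers stay readable. Real RSA uses 1024 to 4096 bit moduli plus padding schemes, so this is illustrative only and not remotely secure. (The modular inverse via pow(e, -1, phi) needs Python 3.8 or later.)

```python
# Toy RSA to illustrate why factoring breaks it; not secure in any way.
p, q = 61, 53                 # Alice's secret primes
n = p * q                     # 3233: the number Alice publishes
phi = (p - 1) * (q - 1)       # 3120: computable only if you know the factors
e = 17                        # public exponent, chosen coprime to phi
d = pow(e, -1, phi)           # Alice's private exponent; needs phi, hence the factors

message = 65
ciphertext = pow(message, e, n)    # anyone (Bob) can encrypt with the public n and e
recovered = pow(ciphertext, d, n)  # only Alice, who knows d, can decrypt
assert recovered == message        # factoring n would hand an eavesdropper d as well
```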

So… if you could find prime factors of large numbers easily then the backbone of digital security would be broken. Much excitement!

Shor produced his algorithm in 1994. By the year 2001 a group at IBM had managed to find the prime factors of the number 15 using a digital quantum computer as published in Nature. All the prime factors. Both 3 and 5. Notice that 15 has only four bits, which is a lot smaller than the number of bits used in commercial RSA implementations, namely 1024, 2048, or 4096.

Surely things got better fast.  By late 2024 the biggest numbers that had been factored by an actual digital quantum computer had 35 bits which allows for numbers no bigger than 34,359,738,367. That is way smaller than the size of the smallest numbers used in RSA applications. Nevertheless it does represent 31 doublings in magnitude of numbers factored in 23 years, so progress has been quite exponential. But it could take another 500 years of that particular version of exponential growth rate to get to conquering today’s smallest version of RSA digital security.

In the same report the authors say that a conventional, but very large computer (2,000 GPUs along with a JUWELS booster, which itself has 936 compute nodes each consisting of four NVIDIA A100 Tensor Core GPUs themselves each hosted by 48 dual threaded AMD EPYC Rome cores–that is quite a box of computing) simulating a quantum computer running Shor’s algorithm had factored a 39 bit number finding that 549,755,813,701 = 712,321 × 771,781, the product of two 20 bit prime numbers. That was its limit. Nevertheless, an actual digital quantum computer can still be outclassed by one simulated on conventional digital hardware.
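A few lines of Python confirm the sizes quoted above, purely as a sanity check:

```python
# Quick sanity check of the numbers quoted above.
n = 549_755_813_701
p, q = 712_321, 771_781
assert p * q == n                                        # the reported factorization holds
print(n.bit_length(), p.bit_length(), q.bit_length())    # 39, 20, 20
print((1 << 35) - 1)                                     # 34,359,738,367: largest 35 bit number
```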

The other early big excitement for digital quantum computers was Grover’s search algorithm, but work on that has not been as successful as for Shor’s IFP solution.

Digital quantum computation nirvana has not yet been demonstrated.

Digital quantum computers work a little like regular digital computers in that there is a control mechanism which drives the computer through a series of discrete steps. But today’s digital quantum computers suffer from accumulating errors in quantum bits. Shor’s algorithm assumes no such errors. There are techniques for correcting those errors but they slow things down and cause other problems. One way that digital quantum computers may get better is if new methods of error correction emerge. I am doubtful that something new will emerge, get fully tested, and then make it into production at scale all within the next ten years. So we may not see a quantum (ahem) leap in performance of quantum digital computers in the next decade.

Analog quantum computers are another matter. They are not switched, but instead are configured to directly simulate some physical system and the quantum evolution and interactions of components of that system. They are an embodied quantum model of that system. They are ideally suited to solving these sorts of problems, and they cannot be emulated by conventional digital systems in the way that the digital quantum computer in the 39 bit case above could be.

I find that people working on quantum computers are often a little squirrelly about whether their computer acts more like a digital or an analog computer, as they like to say they are “quantum” only. The winners over the next 10 years will be the ones solving real problems in materials science and other aspects of chemistry and physics.

Self Driving Cars

In the US the players that will determine whether self driving cars are successful or abandoned are #1 Waymo (Google) and #2 Zoox (Amazon). No one else matters. The key metric will be human intervention rate as that will determine profitability.

Originally the term “self driving car” was about any sort of car that could operate without a driver on board, and without a remote driver offering control inputs. Originally they were envisioned as an option for privately owned vehicles used by individuals, a family car where no person needed to drive, but simply communicated to the car where it should take them.

That conception is no longer what people think of when self driving cars are mentioned. Self driving cars today refer to taxi services that feel like Uber or Lyft, but for which there is no human driver, just paying passengers.

In the US the companies that have led in this endeavor have changed over time.

The first leader was Cruise, owned by GM. They were the first to have a regular service in the downtown area of a major city (San Francisco), and then in a number of other cities, where there was an app that anyone could download to use their service. They were not entirely forthcoming about operational and safety problems, including when they dragged a person, who had just been hit by a conventionally driven car, for tens of feet under one of their vehicles. GM suspended operations in late 2023 and completely disbanded the company in December 2024.

Since then Waymo (owned by Google) has been the indisputable leading deployed service.

Zoox (owned by Amazon) has been a very distant, but operational, second place.

Tesla (owned by Tesla) has put on a facade of being operational, but it is not operational in the sense of the other two services, and it faces regulatory requirements that both Waymo and Zoox have long been able to satisfy. They are not on a path to becoming a real service.

See my traditional section on self driving cars below, as it explains in great detail the rationale for these evaluations. In short, Waymo looks to have a shot at succeeding and it is unlikely they will lose first place in this race. Zoox may also cross the finish line, and it is very unlikely that anyone will beat them. So if both Waymo and Zoox fail, for whatever reason, the whole endeavor will grind to a halt in the US.

BUT…

But what might go wrong that makes one of these companies fail? We got a little insight into that in the last two weeks of 2025.

On Saturday December 20th of 2025 there was an extended power outage in San Francisco that started small in the late morning but by nightfall had spread to large swaths of the city.  And lots and lots of normally busy intersections were by that time blocked by many stationary Waymos.

Traffic regulations in San Francisco say that an intersection whose traffic lights are all dark should be treated as though it has stop signs at every entrance. Human drivers who don’t know the actual regulation tend to fall back to that behavior in any case.

It seemed that Waymos were waiting indefinitely for green lights that never came, and at intersections through which many Waymos were routed there were soon enough waiting Waymos that the intersections were blocked.  Three days later, on December 23rd, Waymo issued an explanation on their blog site, which includes the following:

Navigating an event of this magnitude presented a unique challenge for autonomous technology. While the Waymo Driver is designed to handle dark traffic signals as four-way stops, it may occasionally request a confirmation check to ensure it makes the safest choice. While we successfully traversed more than 7,000 dark signals on Saturday, the outage created a concentrated spike in these requests. This created a backlog that, in some cases, led to response delays contributing to congestion on already-overwhelmed streets.

We established these confirmation protocols out of an abundance of caution during our early deployment, and we are now refining them to match our current scale. While this strategy was effective during smaller outages, we are now implementing fleet-wide updates that provide the Driver with specific power outage context, allowing it to navigate more decisively.

As the outage persisted and City officials urged residents to stay off the streets to prioritize first responders, we temporarily paused our service in the area. We directed our fleet to pull over and park appropriately so we could return vehicles to our depots in waves. This ensured we did not further add to the congestion or obstruct emergency vehicles during the peak of the recovery effort.

The key phrase is that Waymos “request a confirmation check” at dark signals. This means that the cars were asking for a human to look at images from their cameras and manually tell them how to behave. With 7,000 dark signals and perhaps 1,000 vehicles on the road, Waymo clearly did not have enough humans on duty to handle the volume of requests that were coming in. Waymo does not disclose whether any human noticed a rise in these incidents early in the day and called in more staff, or whether they simply did not have enough employees to handle them all.
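To see why a spike like that can overwhelm a remote-assistance team, here is a back-of-the-envelope sketch. Only the 7,000 figure comes from Waymo's post; the fraction of traversals that triggered a request, the handling time, and the burst durations are my own made-up assumptions:

```python
# Back-of-the-envelope staffing estimate for dark-signal confirmation requests.
traversals = 7_000          # dark signals traversed, per Waymo's blog post
request_fraction = 0.5      # assumed share that asked for a human confirmation
seconds_per_check = 30      # assumed time for a human to look at the cameras and click

requests = traversals * request_fraction
for burst_hours in (12, 4, 1):       # how concentrated the spike is
    operators = requests * seconds_per_check / (burst_hours * 3600)
    print(f"{burst_hours} hour burst -> about {operators:.0f} dedicated operators")
# Spread across the day this needs only a couple of people; squeezed into an
# hour it needs around thirty, all doing nothing but approving stalled Waymos.
```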

At a deeper level it looks like they had a debugging feature in their code, and not enough people to supply real time support to handle the implications of that debugging feature. And it looks like Waymo is going to remove that debugging safety feature as a way of solving the problem.  This is not an uncommon sort of engineering failure during early testing. Normally one would hope that the need for that debugging feature had been resolved before large scale deployment.

But, who are these human staff? Besides those in Waymo control centers, it turns out there is a gig-work operation with an app named Honk (the headline of the story is When robot taxis get stuck, a secret army of humans comes to the rescue) whereby Waymo pays people around $20 to do minor fixups to stuck Waymos by, for instance, going and physically closing a door that a customer left open. Tow truck operators use the same app to find Waymos that need towing because of some more serious problem. It is not clear whether it was a shortage of those gig workers, or a shortage of people in the Waymo remote operations center, that caused the large scale failures. But it is worth noting that current generation Waymos need a lot of human help to operate as they do, from people in the remote operations center who intervene and provide human advice when something goes wrong, to Honk gig-workers scampering around the city physically fixing problems with the vehicles, to people who clean the cars and plug them in to recharge when they return to their home base.

Human operated ride services, whether traditional taxi companies or gig services such as Uber and Lyft, do not need these external supports. There is a human with the car at all times who takes care of these things.

The large scale failure on the 20th did get people riled up about these robots causing large scale traffic snarls, and made them wonder whether the same thing will happen when the next big earthquake hits San Francisco. Will the human support worker strategy be stymied by other infrastructure failures (e.g., the cellular network necessary for Honk workers to communicate) or by the self preservation needs of the human workers themselves?

The Waymo blog post revealed another piece of strategy. This is one of three things they said that they would do to alleviate the problems:

Expanding our first responder engagement: To date, we’ve trained more than 25,000 first responders in the U.S. and around the world on how to interact with Waymo. As we discover learnings from this and other widespread events, we’ll continue updating our first responder training.

The idea is to add more responsibility to police and fire fighters to fix the inadequacies of the partial-only autonomy strategy for Waymo’s business model. Those same first responders will have more than enough on their plates during any natural disasters.

Will it become a political issue where the self-driving taxi companies are taxed enough to provide more first responders? Will those costs ruin their business model? Will residents just get so angry that they take political action to shut down such ride services?

Humanoid Robots

Deployable dexterity will remain pathetic compared to human hands beyond 2036. Without new types of mechanical systems walking humanoids will remain too unsafe to be in close proximity to real humans.

Despite this prediction it is worth noting that there is a long distance between current deployed dexterity and dexterity that is still pathetic compared to human hands. In the next ten years deployable dexterity may improve markedly, but not in the way the current hype for humanoid robots suggests. I talk about this below in my annual section scoring my 2018 predictions on robotics, AI, and machine learning, under the heading Dexterous Hands.

Towards the end of 2025 I published a long blog post summarizing the status of, and problems remaining for, humanoid robots.

I started building humanoid robots in my research group at MIT in 1992. My previous company, Rethink Robotics, founded in 2008, delivered thousands of upper body Baxter and Sawyer humanoid robots (built in the US) to factories between 2012 and 2018.  At the top of this blog page you can see a whole row of Baxter robots in China. A Sawyer robot that had operated in a factory in Oregon just got shut down in late 2025 with 35,236 hours on its operations clock. You can still find many of Rethink’s humanoids in use in teaching and research labs around the world. Here is the cover of Science Robotics from November 2025,

showing a Sawyer used in the research for  this article out of Imperial College, London.

Here is a slide from a 1998 powerpoint deck that I was using in my talks, six years after my graduate students and I had started building our first humanoid robot, Cog.

It is pretty much the sales pitch that today’s humanoid companies use.  You are seeing here my version from almost twenty eight years ago.

I point this out to demonstrate that I am not at all new to humanoid robotics and have worked on them for decades, both in academia and in producing and selling humanoid robots that were deployed at scale (which no one else has done) doing real work.

My blog post from September details why the current learning based approaches to getting dexterous manipulation will not get there anytime soon. I argue that the players are (a) collecting the wrong data and (b) trying to learn the wrong thing. I also give an argument (c) for why learning might not be the right approach. My argument for (c) may not hold up, but I am confident that I am right on both (a) and (b), at least for the next ten years.

I also outline in that blog post why the current (and indeed pretty much the only, for the last forty years) method of building bipeds and controlling them will remain unsafe for humans to be nearby. I pointed out that the danger is roughly cubically proportional to the weight of the robot. Many humanoid robot manufacturers are introducing lightweight robots, so I think they have come to the same conclusion. But the side effect is that the robots cannot carry much payload, and certainly cannot provide physical support to elderly humans, which is something that human carers do constantly — these small robots are just not strong enough. And elder care and in home care is one of the main arguments for having human shaped robots, adapted to the messy living environments of actual humans.

Given that careful analysis from September I do not share the hype that surrounds humanoid robotics today. Some of it is downright delusional across many different levels.

To believe the promises of many CEOs of humanoid companies you have to accept the following conjunction.

  1. Their robots have not demonstrated any practical work (I don’t count dancing in a static environment doing exactly the same set of moves each time as practical work).
  2. The demonstrated grasping in the videos they show, usually just a pinch grasp, is at a rate which is painfully slow and not something that will be useful in practice.
  3. They claim that their robots will learn human-like dexterity, but they have not shown any videos of the kind of multi-fingered dexterity where humans can and do grasp things that are unseen, and grasp and simultaneously manipulate multiple small objects with one hand. And there are no demonstrations of using the body together with the hands, which is how humans routinely carry many small things or one or two heavy things.
  4. They show videos of non tele-operated manipulation, but all in person demonstrations of manipulation are tele-operated.
  5. Their current plans for robots working in customer homes all involve a remote person tele-operating the robot.
  6. Their robots are currently unsafe for humans to be close to when they are walking.
  7. Their robots have no recovery from falling and need human intervention to get back up.
  8. Their robots have a battery life measured in minutes rather than hours.
  9. Their robots cannot currently recharge themselves.
  10. Unlike human carers for the elderly, humanoids are not able to provide any physical assistance to people: no stabilizing support for a person walking, no help getting into and out of bed, no help getting on to and off of a toilet, nor indeed any touch based assistance at all.
  11. The CEOs claim that their robots will be able to do everything, or many things, or a lot of things, that a human can do in just a few short years. They currently do none.
  12. The CEOs claim a rate of adoption of these humanoid robots into homes and industries that is multiple orders of magnitude faster than any other technology in human history, including mainframe computers, home computers, mobile phones, and the internet. Many orders of magnitude faster. Here is a CEO of a humanoid robot company saying that they will be in 10% of US households by 2030. Absolutely no technology (even without the problems above) has ever come close to scaling at that rate.

The declarations being made about humanoid robots are just not plausible.

We’ll see what actually happens over the next ten years, but it does seem that the fever is starting to break. Here are two news stories from the last few days of 2025.

From The Information on December 22nd there is a story about how humanoid robot companies are wrestling with safety standards. All industrial and warehouse robots, whether stationary or mobile, have a big red safety stop button, in order to comply with regulatory safety standards. The button cuts the power to the motors. But cutting power to the motors of a balancing robot might make it fall over and cause more danger and damage to people nearby. For the upper torso humanoid robots Baxter and Sawyer from my company Rethink Robotics we too had a safety stop button that cut power to all the motors in the arms. They were collaborative robots, and often a person, or part of their limbs or body, could be under an arm, and it would have been dangerous for the arms to fall quickly when power was cut. To counter this we developed a unique circuit that required no active power, which made the back current generated by a motor when powered off act as a very strong brake. Perhaps there are similar possible solutions for humanoid robots and falling, but they have yet to be invented.

On December 25th the Wall Street Journal had a story headlined “Even the Companies Making Humanoid Robots Think They’re Overhyped”, with a lede of “Despite billions in investment, startups say their androids mostly aren’t useful for industrial or domestic work yet”. Here are the first two paragraphs of the story:

Billions of dollars are flowing into humanoid robot startups, as investors bet that the industry will soon put humanlike machines in warehouses, factories and our living rooms.

Many leaders of those companies would like to temper those expectations. For all the recent advances in the field, humanoid robots, they say, have been overhyped and face daunting technical challenges before they move from science experiments to a replacement for human workers.

And then they go on to quote various company leaders:

“We’ve been trying to figure out how do we not just make a humanoid robot, but also make a humanoid robot that does useful work,” said Pras Velagapudi, chief technology officer at Agility Robotics.

Then talking about a recent humanoid robotics industry event the story says:

On stage at the summit, one startup founder after another sought to tamp down the hype around humanoid robots.

“There’s a lot of great technological work happening, a lot of great talent working on these, but they are not yet well defined products,” said Kaan Dogrusoz, a former Apple engineer and CEO of Weave Robotics.

Today’s humanoid robots are the right idea, but the technology isn’t up to the premise, Dogrusoz said. He compared it to Apple’s most infamous product failure, the Newton hand-held computer.

There are more quotes from other company leaders all pointing out the difficulties in making real products that do useful work. Reality seems to be setting in as promised delivery dates come and go.

Meanwhile here is what I said at the end of my September blog post about humanoid robots and teaching them dexterity. I am not at all negative about a great future for robots, and in the nearish term. It is just that I completely disagree with the hype arguing that building robots with humanoid form will magically make robots useful and deployable. These particular paragraphs followed a passage where I had described, as I do again in this blog post, how the meaning of self driving cars has drifted over time.

Following that pattern, what it means to be a humanoid robot will change over time.

Before too long (and we already start to see this) humanoid robots will get wheels for feet, at first two, and later maybe more, with nothing that any longer really resembles human legs in gross form.  But they will still be called humanoid robots.

Then there will be versions which variously have one, two, and three arms. Some of those arms will have five fingered hands, but a lot will have two fingered parallel jaw grippers. Some may have suction cups. But they will still be called humanoid robots.

Then there will be versions which have a lot of sensors that are not passive cameras, and so they will have eyes that see with active light, or in non-human frequency ranges, and they may have eyes in their hands, and even eyes looking down from near their crotch to see the ground so that they can locomote better over uneven surfaces. But they will still be called humanoid robots.

There will be many, many robots with different forms for different specialized jobs that humans can do. But they will all still be called humanoid robots.

As with self driving cars, most of the early players in humanoid robots, will quietly shut up shop and disappear. Those that remain will pivot and redefine what they are doing, without renaming it, to something more achievable and with, finally, plausible business cases. The world will slowly shift, but never fast enough to need a change of name from humanoid robots. But make no mistake, the successful humanoid robots of tomorrow will be very different from those being hyped today.

Neural Computation

There will be small and impactful academic forays into neuralish systems that are well beyond the linear threshold systems, developed by 1960, that are the foundation of recent successes. Clear winners will not yet emerge by 2036 but there will be multiple candidates.

Current machine learning techniques are largely based on having millions, and more recently tens (to hundreds?) of billions, of linear threshold units. They look like this.

Each of these units has a fixed number of inputs, where some numerical value comes in and is multiplied by a weight, usually a floating point number, and the results of all of the multiplications are summed, along with an adjustable threshold a_0, which is usually negative, and then the sum goes through some sort of squishing function to produce a number between zero and one, or in this case minus one and plus one, as the output. In this diagram, which, by the way, is taken from Bernie Widrow’s technical report from 1960, the output value is either minus one or plus one, but in modern systems it is often a number from anywhere in that, or another, continuous interval.
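As a concrete illustration, here is that unit in a few lines of Python. The tanh squashing function is my stand-in for Widrow's hard quantizer (not anything from the original report), giving a continuous output in (-1, +1) as modern systems use:

```python
import math

# A 1960-style linear threshold unit: weighted sum of inputs plus an
# adjustable threshold a0, pushed through a squashing function.
def linear_threshold_unit(inputs, weights, a0):
    s = a0 + sum(w * x for w, x in zip(weights, inputs))
    return math.tanh(s)      # squashed into the interval (-1, +1)

# Three inputs, three weights, and a (negative) threshold.
print(linear_threshold_unit([1.0, -0.5, 0.3], [0.2, 0.7, -1.1], a0=-0.1))
```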

This was based on previous work, including that of Warren McCulloch and Walter Pitts’ 1943 formal model of a neuron, Marvin Minsky’s 1954 Ph.D. dissertation on using reinforcement for learning in a machine based on model neurons, and Frank Rosenblatt’s 1957 use of weights (see page 10) in an analog implementation of a neural model.

These are what current learning mechanisms have at their core. These! A model of  biological neurons that was developed in a brief moment of time from 83 to 65 years ago.  We use these today.  They are extraordinarily primitive models of neurons compared to what neuroscience has learned in the subsequent sixty five years.

Since the 1960s higher levels of organization have been wrapped around these units. In 1979 Kunihiko Fukushima published (at the International Joint Conference on Artificial Intelligence, IJCAI 1979, Tokyo — coincidentally the first place where I published in an international venue) his first English language description of convolutional neural networks (CNNs), which allowed for position invariant recognition of shapes (in his case, hand written digits), without having to learn about those shapes in every position within images.

Then came backpropagation, a method where a network can be told the correct output it should have produced, and by propagating the error backwards through the derivative of the quantizer in the diagram above (note that the quantizer shown there is not differentiable; a continuous differentiable quantizer function is needed to make the algorithm work), a network can be trained on examples of what it should produce. The details of this algorithm are rooted in the chain rule of Gottfried Leibniz in 1676, refined through a series of modern workers from around 1970 through about 1982. Frank Rosenblatt (see above) had talked about a “back-propagating error correction” in 1962, but did not know how to implement it.
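To make the idea concrete, here is a minimal sketch of backpropagation on a tiny two layer network learning XOR, using a sigmoid as the continuous differentiable stand-in for the quantizer. The sizes, learning rate, and iteration count are my own choices for a generic textbook illustration, not anyone's production code:

```python
import numpy as np

# Tiny two layer network trained by backpropagation to compute XOR.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)   # input layer -> hidden layer
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)   # hidden layer -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(10_000):
    h = sigmoid(X @ W1 + b1)                  # forward pass
    out = sigmoid(h @ W2 + b2)
    err_out = (out - y) * out * (1 - out)     # error times the sigmoid's derivative
    err_hid = (err_out @ W2.T) * h * (1 - h)  # error propagated back a layer
    W2 -= lr * h.T @ err_out;  b2 -= lr * err_out.sum(axis=0)
    W1 -= lr * X.T @ err_hid;  b1 -= lr * err_hid.sum(axis=0)

print(np.round(out.ravel(), 2))               # should approach [0, 1, 1, 0]
```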

In any case, the linear threshold neurons, CNNs, and backpropagation are the basis of modern neural networks. After an additional 30 years of slow but steady progress they burst upon the scene as deep learning, and unexpectedly crushed many other approaches to computer vision — the research field of getting computers to interpret the contents of an image. Note that “deep” learning refers to there being lots of layers (around 12 layers in 2012) of linear threshold neurons rather than the smaller number of layers (typically two or three) that had been used previously.

Now LLMs are built on top of these sorts of networks with many more layers, and many subnetworks.  This is what got everyone excited about Artificial Intelligence, after 65 years of constant development of the field.

Despite their successes with language, LLMs come with some serious problems of a purely implementation nature.

First, the number of examples that need to be shown to a network for it to learn to be facile in language requires enormous amounts of computation, so the cost of training new versions of such networks is now measured in the billions of dollars, consuming an amount of electrical power that requires major new investments in electrical generation, and the building of massive data centers full of millions of the most expensive CPU/GPU chips available.

Second, the number of adjustable weights shown in the figure is counted in the hundreds of billions, meaning the weights occupy over a terabyte of storage. RAM that big is incredibly expensive, so the models cannot be used on phones or even lower cost embedded chips in edge devices, such as point of sale terminals or robots.
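The storage arithmetic is simple enough to spell out; the 500 billion parameter count below is an assumed round number, not any particular model:

```python
# Memory needed just to hold the weights, at different storage precisions.
params = 500e9                           # assumed: 500 billion adjustable weights
for bytes_per_weight in (4, 2):          # 32 bit floats vs 16 bit floats
    terabytes = params * bytes_per_weight / 1e12
    print(f"{bytes_per_weight} bytes/weight -> {terabytes:.1f} TB")
# 2.0 TB at 32 bits, 1.0 TB at 16 bits: far beyond the RAM of a phone or a
# low cost embedded board.
```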

These two drawbacks mean there is an incredible financial incentive to invent replacements for each of (1) our humble single neuron models that are close to seventy  years old, (2) the way they are organized into networks, and (3) the learning methods that are used.

That is why I predict that there will be lots of explorations of new methods to replace our current neural computing mechanisms. They have already started, and next year I will summarize some of them. The economic argument for them is compelling. But the time to move from initial laboratory explorations to viable scalable solutions is much longer than everyone assumes. My prediction is that there will be lots of interesting demonstrations, but that ten years is too small a time period for a clear winner to emerge. And it will take much much longer for the current approaches to be displaced. But plenty of researchers will be hungry to do so.

LLMs

LLMs that can explain which data led to what outputs will be key to non annoying/dangerous/stupid deployments. They will be surrounded by lots of mechanism to keep them boxed in, and those mechanisms, not yet invented for most applications, will be where the arms races occur.

The one thing we have all learned, or should have learned, is that the underlying mechanism for Large Language Models does not answer questions directly. Instead, it gives something that sounds like an answer to the question. That is very different from saying something that is accurate. What they have learned is not facts about the world but instead a probability distribution over what word is most likely to come next, given the question and the words so far produced in response. Thus the results of using them, uncaged, are lots and lots of confabulations that sound like real things, whether they are or not.
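A cartoon of that point, with probabilities I have simply invented for illustration: the model holds a conditional distribution over the next word, and sampling from it can produce fluent continuations that happen to be wrong, with nothing in the mechanism marking them as wrong.

```python
import random

# Invented next-word distribution for the prompt "The capital of Australia is".
next_word_probs = {
    "Canberra":  0.55,
    "Sydney":    0.30,   # fluent, plausible sounding, and wrong
    "Melbourne": 0.10,
    "a":         0.05,
}
words, probs = zip(*next_word_probs.items())
print(random.choices(words, weights=probs, k=1)[0])
# Sample often enough and the confident sounding wrong continuation appears.
```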

We have seen all sorts of stories about lawyers using LLMs to write their briefs, and judges using them to write their opinions, where the LLMs have simply made up precedents and fake citations (that sound plausible) for those precedents.

And there are lesser offenses that are still annoying and time consuming. The first time I used ChatGPT was when I was retargeting the backend of a dynamic compiler that I had used on half a dozen architectures and operating systems over a thirty year period, and wanted to move it to the then new Apple M1 chips. The old method of changing a chunk of freshly compiled binary, as it was spit out by the compiler, from data into an executable program no longer worked, deliberately so, as part of Apple’s improved security measures. ChatGPT gave me detailed instructions on what library calls to use, what their arguments were, etc. The names looked completely consistent with other calls I knew within the Apple OS interfaces. When I tried to use them from C, the C linker complained they didn’t exist. And then when I asked ChatGPT to show me the documentation it groveled that indeed they did not exist and apologized.

So we all know we need guard rails around LLMs to make them useful, and that is where there will be a lot of action over the next ten years. They cannot simply be released into the wild as they come straight from training.

This is where the real action is now. More training doesn’t make things better necessarily. Boxing things in does.

Already we see companies trying to add explainability to what LLMs say. Google’s Gemini now gives real citations with links, so that human users can check what they are being fed. Likewise, many companies are trying to box in what their LLMs can say and do. Those that can control their LLMs will be able to deliver usable products.

A great example of this is the rapid evolution of coding assistants over the last year or so. These are specialized LLMs that do not give the same sort of grief to coders that I experienced when I first tried to use generic ChatGPT to help me. Peter Norvig, former chief scientist of Google, has recently produced a great report on his explorations of the new offerings. Real progress has been made in this high impact, but narrow use field.

New companies will become specialists in providing this sort of boxing in and control of LLMs. I had seen an ad on a Muni bus in San Francisco for one such company, but it was too fleeting to get a photo. Then I stumbled upon this tweet that has three such photos of different ads from the same company, and here is one of them:

The four slogans on the three buses in the tweet are: “Get your AI to behave”, “When your AI goes off leash”, “Get your AI to work”, and “Evaluate, monitor, and guardrail your AI”. And “the AI” is depicted as a little devil of sorts that needs to be made to behave.



Self Driving Cars

This is one of my three traditional sections where I update one of my three initial tables of predictions from exactly eight years ago today. In this section I talk about self driving cars, driverless taxi services, and what that means, my own use of driverless taxi services in the previous year, adoption of electric vehicles in the US, and flying cars and taxis, and what those terms mean.

No entries in the table specifically involve 2025 or 2026, and the status of predictions that are further out in time remains the same. I have only put in one new comment, about how many cities in the US will have self-driving (sort of) taxi services in 2026, and that comment is highlighted.

[Self Driving Cars]

Prediction: A flying car can be purchased by any US resident if they have enough money.
Date: NET 2036
2018 comment: There is a real possibility that this will not happen at all by 2050.

Prediction: Flying cars reach 0.01% of US total cars.
Date: NET 2042
2018 comment: That would be about 26,000 flying cars given today's total.

Prediction: Flying cars reach 0.1% of US total cars.
Date: NIML

Prediction: First dedicated lane where only cars in truly driverless mode are allowed on a public freeway.
Date: NET 2021
2018 comment: This is a bit like current day HOV lanes. My bet is the left most lane on 101 between SF and Silicon Valley (currently largely the domain of speeding Teslas in any case). People will have to have their hands on the wheel until the car is in the dedicated lane.

Prediction: Such a dedicated lane where the cars communicate and drive with reduced spacing at higher speed than people are allowed to drive
Date: NET 2024
Update 20240101: This didn't happen in 2023 so I can call it now. But there are no plans anywhere for infrastructure to communicate with cars, though some startups are finally starting to look at this idea--it was investigated and prototyped by academia 20 years ago.

Prediction: First driverless "taxi" service in a major US city, with dedicated pick up and drop off points, and restrictions on weather and time of day.
Date: NET 2021
2018 comment: The pick up and drop off points will not be parking spots, but like bus stops they will be marked and restricted for that purpose only.
Update 20240101: People may think this happened in San Francisco in 2023, but it didn't. Cruise has now admitted that there were humans in the loop intervening a few percent of the time. THIS IS NOT DRIVERLESS. Without a clear statement from Waymo to the contrary, one must assume the same for them. Smoke and mirrors.

Prediction: Such "taxi" services where the cars are also used with drivers at other times and with extended geography, in 10 major US cities
Date: NET 2025
2018 comment: A key predictor here is when the sensors get cheap enough that using the car with a driver and not using those sensors still makes economic sense.
Update 20250101: Imminent dual use of personal cars was the carrot that got lots of people to pay cash when buying a Tesla for the software subscription that would allow their car to operate in this way. Shockingly the CEO of Tesla announced, in a smoke and mirrors roll out of Cyber Cab in 2024, that the service would use specially built vehicles to be produced at some indeterminate later date. I got suckered by his hype. This is unlikely to happen in the first half of this century.

Prediction: Such "taxi" service as above in 50 of the 100 biggest US cities.
Date: NET 2028
2018 comment: It will be a very slow start and roll out. The designated pick up and drop off points may be used by multiple vendors, with communication between them in order to schedule cars in and out.
Update 20250101: Even the watered down version of this with remote operators is not going to happen in 50 cities by 2028. Waymo has it in 3 cities and is currently planning on 2 more in the US in 2025.
Update 20260101: Waymo did indeed add two cities in 2025, Austin and Atlanta. In those two cities they use Uber as their booking service. They are also expanding the metropolitan reach in their existing cities San Francisco, Los Angeles, and Phoenix. They have promised five more US cities in 2026, i.e., they promise to double the number of cities this year. They would have to then quintuple in 2027 to beat my prediction. Unlikely.

Prediction: Dedicated driverless package delivery vehicles in very restricted geographies of a major US city.
Date: NET 2023
2018 comment: The geographies will have to be where the roads are wide enough for other drivers to get around stopped vehicles.

Prediction: A (profitable) parking garage where certain brands of cars can be left and picked up at the entrance and they will go park themselves in a human free environment.
Date: NET 2023
2018 comment: The economic incentive is much higher parking density, and it will require communication between the cars and the garage infrastructure.

Prediction: A driverless "taxi" service in a major US city with arbitrary pick and drop off locations, even in a restricted geographical area.
Date: NET 2032
2018 comment: This is what Uber, Lyft, and conventional taxi services can do today.
Update 20240101: Looked like it was getting close until the dirty laundry came out.
Update 20250101: Waymo now has a service that looks and feels like this in San Francisco, 8 years earlier than I predicted. But it is not what everyone was expecting. There are humans in the loop. And for those of us who use it regularly we know it is not as general on drop off and pick up locations as it is with human drivers.

Prediction: Driverless taxi services operating on all streets in Cambridgeport, MA, and Greenwich Village, NY.
Date: NET 2035
2018 comment: Unless parking and human drivers are banned from those areas before then.

Prediction: A major city bans parking and cars with drivers from a non-trivial portion of a city so that driverless cars have free reign in that area.
Date: NET 2027, BY 2031
2018 comment: This will be the starting point for a turning of the tide towards driverless cars.

Prediction: The majority of US cities have the majority of their downtown under such rules.
Date: NET 2045

Prediction: Electric cars hit 30% of US car sales.
Date: NET 2027
Update 20240101: This one looked pessimistic last year, but now looks at risk. There was a considerable slow down in the second derivative of adoption this year in the US.
Update 20250101: Q3 2024 had the rate at 8.9%, so there is no way it can reach 30% in 2027. I was way too optimistic at a time when EV enthusiasts thought I was horribly pessimistic.

Prediction: Electric car sales in the US make up essentially 100% of the sales.
Date: NET 2038

Prediction: Individually owned cars can go underground onto a pallet and be whisked underground to another location in a city at more than 100mph.
Date: NIML
2018 comment: There might be some small demonstration projects, but they will be just that, not real, viable mass market services.

Prediction: First time that a car equipped with some version of a solution for the trolley problem is involved in an accident where it is practically invoked.
Date: NIML
2018 comment: Recall that a variation of this was a key plot aspect in the movie "I, Robot", where a robot had rescued the Will Smith character after a car accident at the expense of letting a young girl die.

A Brief Recap of what “Self Driving” Cars Means and Meant

This is a much abridged and updated version of what I wrote exactly one year ago today.

The definition, or common understanding, of what self driving cars really means has changed since my post on predictions eight years ago.  At that time self driving cars meant that the cars would drive themselves to wherever they were told to go with no further human control inputs. It was implicit that it meant level 4 driving. Note that there is also a higher level of autonomy, level 5, that is defined.

Note that in the second row of content, it says that there will be no need for a human to take over for either level 4 or level 5. For level 4 there may be pre-conditions on weather and within a supported geographic area. Level 5 eliminates pre-conditions and geographic constraints. So far no one is claiming to have level 5.

However the robot taxi services such as Cruise (now defunct), Waymo, currently operating in five US cities, and Zoox, currently operating in two cities with limited service (Las Vegas and San Francisco), all relied, or rely, on having remote humans who the car can call on to help get them out of situations they cannot handle. That is not what level 4 promises. To an outside observer it looks like level 4, but it is somewhat less than that in reality. This is not the same as a driver putting their hands back on the steering wheel in real time, but it does mean that there is sometimes a remote human giving high level commands to the car. The companies do not advertise how often this happens, but it is believed to be every few miles of driving. The Tesla self driving taxis in Austin have a human in the passenger seat to intervene when there is a safety concern.

One of the motivations for self driving cars was that the economics of taxis, cars that people hire at any time for a short ride of a few miles from where they are to somewhere else of their choosing, would be radically different as there would be no driver. Systems which require remote operations assistance to get full reliability cut into that economic advantage, place a higher burden on the ROI calculations that make the business case for their adoption, and therefore stretch their time horizon for scaling across geographies.

Actual self-driving is now generally accepted to be much harder than everyone believed.

As a reminder of how strong the hype was, and how certain the promises were that it was just around the corner, here is a snapshot of a whole bunch of predictions by major executives from 2017.

I have shown this many times before but there are three new annotations here for 2025 in the lines marked by a little red car. The years in parentheses are when the predictions were made. The years in blue are the predicted years of achievement. When a blue year is shaded pink it means that it did not come to pass by then. The predictions with orange arrows are those that I had noticed had later been retracted.

It is important to note that every prediction that said something would happen by a year up to and including 2025 did not come to pass by that year.  In fact none of those have even come to pass by today. NONE. Eighteen of the twenty predictions were about things that were supposed to have happened by now, some as long as seven years ago. NONE of them have happened yet.

My Own Experiences with Waymo in 2025

I took two dozen rides with Waymo in San Francisco this year. There is still a longer wait than for an Uber at most times, at least for where I want to go. My continued gripe with Waymo is that it selects where to pick me up, and it rarely drops me right at my house, without any indication ahead of time that it is going to choose some other drop off location for me.

The other interaction I had was in early November when I felt like I was playing bull fighter, on foot, to a Waymo vehicle.  My house is on a very steep hill in San Francisco, with parallel parking on one side and ninety degree parking on the other side. It is rare that two cars can pass each other traveling in opposite directions without one having to pull over into some empty space somewhere.

In this incident I was having a multi-hundred pound pallet of material delivered to my home. There was a very big FedEx truck parked right in front of my house, facing uphill, and the driver/operator was using a manual pallet jack to get it onto the back lift gate, but the load was nine feet long so it hung out past the boundary of the truck. An unoccupied Waymo came down the hill and was about to try to squeeze past the truck on that side. Perhaps it would have made it through if there was no hanging load. So I ran up to just above the truck on the slope and tried to get the Waymo to back up by walking straight at it. Eventually it backed up and pulled in a little bit and sat still. Within a minute it tried again. I pushed it back with my presence again. Then a third time. Let’s be clear: it would have been a dangerous situation if it had done what it was trying to do, and could have injured the FedEx driver, whom it had not seen at all. But any human driver would have figured out what was going on and that the FedEx truck would never go down the hill backwards but would eventually drive up the hill. Any human driver would have replanned and turned around. After the third encounter the Waymo stayed still for a while. Then it came to life and turned towards the upwards direction, and when it was at about a 45 degree angle to the upward line of travel it stopped for a few seconds. Then it started again and headed up and away. I infer that eventually the car had called for human help, and when the human got to it, they directed it where on the road to go (probably with a mouse click interface), and once it got there it paused and replanned and then headed in the appropriate direction that the human had already made it face.

Self Driving Taxi Services

There have been three self driving taxi services in the US in various stages of play over the last handful of years, though it turns out, as pointed out above, that all of them have remote operators. They are Waymo, Cruise, and Zoox.

___Cruise

Cruise died in both 2023 and 2024, and is now dead, deceased, an ex self driving taxi service. Gone. I see its old cars driving around the SF Bay Area, with their orange paint removed, and with humans in the driver seat. On the left below are two photos I took on May 30th at a recharge station. “Birdie” looked just like an old Cruise self driving taxi, but without the orange paint. I hunted around in online stories about Cruise and soon found another “Birdie”, with orange paint, and the same license plate. So GM is using them to gather data, perhaps for training its level 3 driving systems.

 

___Tesla

Tesla announced to much hoopla that they were starting a self driving taxi service this year, in Austin.  It requires a safety person to be sitting in the front passenger seat at all times. Under the certification with which they operate, on occasion that front seat person is required to move to the driver’s seat. Then it just becomes a regular Tesla with a person driving it and FSD enabled. The original fleet was just 30 vehicles, with at least seven accidents reported by Tesla by October, even with the front seat Tesla person. In October the CEO announced that the service would expand to 500 vehicles in Austin in 2025. By November he had changed to saying they would double the fleet.  That makes 60 vehicles. I have no information that it actually happened.

He also said he wanted to expand the “Robotaxi” service to Phoenix, San Francisco, Miami, Las Vegas, Dallas, and Houston by the end of 2025. It appears that Tesla cannot get permits to run even a supervised service (mirroring the Austin deployment) in any of those cities. And no, they are not operating in any of those cities, and now 2025 has reached its end.

In mid-December there were confusing reports saying that Tesla now had Model Y’s driving in Austin without a human safety monitor on board, but that the Robotaxi service for paying customers (who are still people vetted by Tesla) had resumed using human safety monitors. So that is about three or four years behind Waymo in San Francisco, and not at all at scale.

The CEO of Tesla has also announced (there are lots of announcements and they are often very inconsistent…) that actually the self driving taxis will be a new model with no steering wheel or other driver controls. So they are years away from any realistic deployment. I will not be surprised if it never happens as the lure of humanoids completely distracts the CEO. If driving with three controls, (1) steering angle of the front wheels, (2) engine torque (on a plus minus continuum), and (3) brake pedal pressure, is too hard to make actually work safely for real, how hard can it be to have a program control a heavy unstable balancing platform with around 80 joints in hips and waist, two legs, two arms and five articulated fingers on each hand?

___Waymo

Meanwhile Waymo had raised $5.6B to expand to new cities in 2025. It already operated in parts of San Francisco, Los Angeles, and Phoenix. During 2025 it expanded to Austin and Atlanta, the cities it had promised. It also increased its geographic reach in its existing cities and surrounding metropolitan areas. In the original three cities users have a Waymo app on their phone and specifically summon a Waymo. In the new cities, however, it used a slightly different playbook. In both Austin and Atlanta people use their standard Uber app. They can update their preference to say that they prefer to get a Waymo rather than a human driven car, but there is no guarantee that a Waymo is what they will get. And any regular user of the Uber app in those cities may be offered a Waymo, but they do get an option to decline and to continue to wait for a human driven offer.

In the San Francisco area, beyond the city itself, Waymo first expanded by operating in Palo Alto, in a geographically separate area. Throughout the year one could see human operated Waymos driving in locations all along the peninsula from San Francisco to Palo Alto and further south to San Jose. By November Waymo had announced driverless operations throughout that complete corridor, an area of 260 square miles, but not quite yet on the freeways–the Waymos are operating on specific stretches of both 101 and 280, but only for customers who have specifically signed up for that possibility. Waymo is now also promising to operate at the two airports, San Jose and San Francisco. The San Jose airport came first, and service at the San Francisco airport is operating in an experimental mode with a human in the front seat.

Waymo has announced that it will expand to five more cities in the US during 2026: Miami, Dallas, Houston, San Antonio, and Orlando. It seems likely, given their step by step process, and their track record of meeting their promises, that Waymo has a good shot at getting operations running in these five cities, doubling their total number of US cities to 10.

Note that although it does very occasionally snow in five of these ten cities (Atlanta, Austin, Houston, San Antonio, and Orlando) it is usually only a dusting. It is not yet clear whether Waymo will operate when it does snow. It does not snow in the other five cities, and in San Francisco Waymo is becoming a critical part of the transportation infrastructure. If a self driving taxi service were subject to tighter weather restrictions than human driven services, that could turn into a logistical nightmare for the cities themselves. In the early days of Cruise they did shut down whenever there was a hint of fog in San Francisco, and that is a common occurrence. It was annoying for me, but Cruise never reached the footprint size in San Francisco that Waymo now enjoys.

No promises yet from Waymo about when it might start operating in cities that do commonly have significant snow accumulations.

In May of 2025 Waymo announced a bunch of things in one press release. First, that they had 1,500 Jaguar-based vehicles at that time, operating in San Francisco, Los Angeles, Phoenix, and Austin. Second, that they were no longer taking deliveries of any more Jaguars from Jaguar, but that they were now building two thousand of their own Jaguar-based vehicles in conjunction with Magna (a tier one auto supplier that also builds small run models of big brands — e.g., they build all the Mini Coopers that BMW sells) in Mesa, Arizona. Third, that they would also start building, in late 2025, versions of the Zeekr RT, a vehicle that they co-designed with Chinese company Geely, that can be built with no steering wheel or other controls for humans, but with sensor systems that are self-cleaning.

It is hard to track exactly how many Waymos are deployed, but in August 2025, this website, citing various public disclosures by Waymo, put together the following estimates for the five cities in which Waymo was operating.

Phoenix            400
San Francisco      800
Los Angeles        500
Austin             100
Atlanta             36

No doubt those numbers have increased by now.  Meanwhile Waymo has annualized revenues of about $350M and is considering an IPO with a valuation of around $100B.  With numbers like those it can probably raise significant growth capital independently from its parent company.

___Zoox

The other self driving taxi system deployed in the US is Zoox which is currently operating only in small geographical locations within Las Vegas and San Francisco. Their deployment vehicles have no steering wheel or other driver controls–they have been in production for many years. I do notice, by direct observation as I drive and walk around San Francisco, that Zoox has recently enlarged the geographic areas where its driverful vehicles operate, collecting data across all neighborhoods. So far the rides are free on Zoox, but only for people who have gone through an application process with the company. Zoox is following a pattern established by both Cruise and Waymo. It is roughly four years behind Cruise and two years behind Waymo, though it is not clear that it has the capital available to scale as quickly as either of them.

All three companies that have deployed actual uncrewed self driving taxi services in the US have been partially or fully owned by large corporations. GM owned Cruise, Waymo is partially spun out of Google/Alphabet, and Zoox is owned by Amazon.

Cruise failed. If any other company wants to compete with Waymo or Zoox, even in cities where they do not operate, it is going to need a lot of capital. Waymo and Zoox are out in front. If one or both of them fail, or lose traction and fail to grow, and to grow very fast, it will be nearly impossible for other companies to raise the necessary capital.

So it is up to Waymo and Zoox.  Otherwise, no matter how well the technology works, the dream of  driverless taxis is going to be shelved for many years.

Electric Cars

In my original predictions I said that electric car (and I meant battery electric, not hybrids) sales would reach 30% of the US total no earlier than 2027.  A bunch of people on twitter claimed I was a pessimist. Now it looks like I was an extreme optimist, as it is going to take a real growth spurt to reach even 10% in 2026, the year before my earliest predicted date for 30%.

Here is the report  that I use to track EV sales — it is updated every few weeks. In this table I have collected the quarterly numbers that are finalized. The bottom row is the percentage of new car sales that were battery electric.

      '22   '22   '22   '22   '23   '23   '23   '23   '24   '24   '24   '24   '25   '25   '25
      Q1    Q2    Q3    Q4    Q1    Q2    Q3    Q4    Q1    Q2    Q3    Q4    Q1    Q2    Q3
      5.3   5.6   6.1   6.5   7.3   7.2   7.9   8.1   7.3   8.0   8.9   8.7   7.5   7.4   10.5

Although late in 2024 EV sales were pushing up into the high eight percentage points, they dropped back into the sevens in the first half of this year. Then they picked up to 10.5% in the third quarter of 2025, but that jump was expected as the Federal electric vehicle (EV) tax credits ended for all new and used vehicles purchased after September 30, 2025, as part of the “One Big Beautiful Bill Act”.  People bought earlier than they might have in order to get that tax credit, so the industry is expecting quite a slump in the fourth quarter, but it will be a couple more months before the sales figures are all in.  YTD 2025 is still under 8.5%, and is likely to end at under 8%.
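As a rough cross-check on that YTD number, a simple average of the three finalized 2025 quarters in the table lands just under 8.5%. This is only a back-of-envelope sketch in Python; it implicitly assumes roughly equal total vehicle sales in each quarter, which is not exactly true, so the properly weighted figure will differ a little.

    # Back-of-envelope: unweighted average of the finalized 2025 quarterly EV shares.
    q_2025 = [7.5, 7.4, 10.5]          # Q1, Q2, Q3 battery electric share, percent
    print(sum(q_2025) / len(q_2025))   # ~8.47, i.e. just under 8.5%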

The trends just do not suggest that EVs will reach even 12% of US new car sales in 2027, no matter how big an uptick there is. 30% is just not going to happen.

As for which brands are doing better than others, Tesla’s sales dropped a lot more than the rest of the market. Brand winners were GM, Hyundai, and Volkswagen.

The US experience is not necessarily the experience across the world. For instance Norway reached 89% fully electric vehicles of all sold in 2024, largely due to taxes on gasoline powered car purchases. But that is a social choice of the people of Norway, not at all driven by oil availability. With a population of 5.6 million compared to the US with 348 million, and domestic oil production of 2.1 million barrels per day, compared to the US with 13.4 million b/d, Norway has a per capita advantage of almost ten times as much oil per person (9.7 to be more precise).
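For anyone who wants to check that ratio, here is the arithmetic as a tiny Python sketch, using the approximate population and production figures quoted above (they are round numbers, so treat the result as approximate too):

    # Per capita domestic oil production, Norway vs the US, using the figures above.
    norway_bpd, norway_pop = 2.1e6, 5.6e6     # barrels per day, people
    us_bpd, us_pop = 13.4e6, 348e6            # barrels per day, people

    norway_per_capita = norway_bpd / norway_pop   # ~0.375 barrels/day per person
    us_per_capita = us_bpd / us_pop               # ~0.0385 barrels/day per person

    print(norway_per_capita / us_per_capita)      # ~9.7, the ratio quoted above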

The level of electrification of cars is a choice that a country makes.

Flying Cars

The next two paragraphs are reproduced from last year’s scorecard.

Flying cars are another category where the definitions have changed. Back when I made my predictions it meant a vehicle that could both drive on roads and fly through the air.  Now it has come to mean an electric multi-rotor helicopter that can operate like a taxi between various fixed landing locations. Often touted are versions that have no human pilot. These are known as eVTOLs, for “electric vertical take off & landing”.

Large valuations have been given to start ups who make nice videos of their electric air taxis flying about. But on inspection one sees that they don’t have people in them. Often, you might notice, even those flights are completely over water rather than land. I wrote about the lack of videos of viable prototypes back in November 2022.

The 2022 post referred to in the last sentence was trying to make sense of a story about a German company, Volocopter, receiving a $352M Series E investment. The report from PitchBook predicted worldwide $1.5B in revenue in the eVTOL taxi service market for 2025.  I was bewildered as I could not find a single video, as of the end of 2022, of a demo of an actual flight profile with actual people in an actual eVTOL of the sort of flights that the story claimed would be generating that revenue in just 3 years.

I still can’t find such a video. And the actual revenue for actual flights in 2025 turned out to be $0.0B (and there are no rounding errors there — it was $0) and Volocopter has gone into receivership, with a “reorganization success” in March 2025.

In my November 2022 blog post above I talked about another company, Lilium, which came the closest to having a video of a real flight, but it was far short of carrying people and it did not fly as high as is needed for air taxi service. At the time Lilium had 800 employees.  Since then Lilium has declared bankruptcy not once (December 2024), but twice (February 2025), after the employees had been working for some time without pay.

But do not fear. There are other companies on the very edge of succeeding. Oh, and an edge means that sometimes you might fall off of it.

Here is an interesting report on the two leading US eVTOL companies, Archer and Joby Aviation, both aiming at the uncrewed taxi service market, both with valuations in the billions, and both missing just one thing: a for-real, live, working prototype.

The story focuses on a pivotal point, the moment when an eVTOL craft has risen vertically, and now needs to transition to forward motion. In particular it points out that Archer has never demonstrated that transition, even with a pilot onboard, and during 2025 they cancelled three scheduled demonstrations at three different air shows. They did get some revenue in 2025 by selling a service forward to the city of Abu Dhabi, but zero revenue for actual operations–they have no actual operations.  They promise that will change this year, 2026, with revenue producing flights in the second half of the year.

Joby Aviation did manage to demonstrate the transition maneuver in April of 2025. And in November they made a point to point flight in Dubai, i.e., their test vehicle managed to take off somewhere and land at a different place. The fact that there were press releases for these two human piloted, pretty basic capabilities for an air taxi service suggests to me that they are still years away from doing anything that is an actual taxi service (and with three announced designated places to land and take off from, it seems more like a rail network with three stations than a taxi service–again, slippery definitions do indeed slip and slide). And many more years away from a profitable service. But perhaps it is naive of me to think that a profitable business is the goal.

As with many such technology demonstrators the actual business model seems to be getting cities to spend lots of money on a Kabuki theater technology show, to give credit to the city as being technology forward. Investors, meanwhile, invest in the air taxi company thinking it is going to be a real transportation business.

But what about personal transport that you own, not an eVTOL taxi service at all, but an eVTOL that you can individually own, hop into whenever you want, and fly anywhere? In October there was a story in the Wall Street Journal: “I Test Drove a Flying Car. Get Ready, They’re Here.” The author of the story spent three days training to be the safety person in a one seat Pivotal Helix (taking orders at $190,000 apiece, though not yet actually delivering them; also take a look at how the vehicles lurch as they go through the pilot commanded transition maneuver). It is a one seater so the only person in the vehicle has to be the safety person in case something fails. He reports:

After three hellish days in a drooling, Dramamine-induced coma, I failed my check ride.

The next month he tried again. This time he had a prescription for the anti-emetic Zofran and a surplus-store flight suit. The flight suit was to collect his vomit and save his clothes.  After four more days of training (that is seven total days of training), he qualified and finally took his first flight, and mercifully did not live up to his call sign of “Upchuck Yeager”.  $190,000 to buy the plane, train for seven days, vomit wildly, have to dress in a flight suit, and be restricted to taking off, landing, and flying only over privately owned agricultural land or water. This is not a consumer product, and this is not a flying car that is here, despite the true believer headline.

Two years ago I ended my review of flying cars with:

Don’t hold your breath. They are not here. They are not coming soon.

Last year I ended my review with:

Nothing has changed. Billions of dollars have been spent on this fantasy of personal flying cars.  It is just that, a fantasy, largely fueled by spending by billionaires.

There are a lot of people spending money from all the investments in these companies, and for many of them it is a real dream that they want to see succeed. But it is not happening, even at a tiny scale, anytime soon.

Robotics, AI, and Machine Learning

We are at peak popular hype in all of robotics, AI, and machine learning. In January 1976, exactly fifty years ago, I started work on a Masters in machine learning. I have seen a lot of hype and crash cycles in all aspects of AI and robotics, but this time around is the craziest.  Perhaps it is the algorithms themselves that are running all our social media that have contributed to this.

But it does not mean that the hype is justified, or that the results over the next decade will pay back the massive investments that are going into AI and robotics right now.

The current hype is about two particular technologies, with the assumption that these particular technologies are going to deliver on all the competencies we might ever want.  This has been the mode of all the hype cycles that I have witnessed in these last fifty years.

One of the current darling technologies is large X models for many values of X (including VLMs and VLAs), largely, at the moment, using massive data sets, and transformers as their context and sequencing method. The other isn’t even really a technology, but just a dream of a form of a technology, and that is robots with humanoid form.

I have now put these two things among the five topics of my new predictions shared at the beginning of this post and will talk about them explicitly for each of the next ten years.

Back in 2018 I did not talk about either of these technologies in my predictions, but rather talked about competences and capabilities.  I fear that I may have been overly optimistic about many of these, and in the table below I point out that for several of them my predicted time of arrival has now come, but the capabilities or competencies have not.  I’m sure that many true believers in the two technologies mentioned above will have very short time scales on when they say this will be achieved. I pre-emptively disagree with them.

Prediction [Robotics, AI, and ML] | Date | 2018 Comments | Updates

Prediction: Academic rumblings about the limits of Deep Learning
Date: BY 2017
2018 comment: Oh, this is already happening... the pace will pick up.

Prediction: The technical press starts reporting about limits of Deep Learning, and limits of reinforcement learning of game play.
Date: BY 2018

Prediction: The popular press starts having stories that the era of Deep Learning is over.
Date: BY 2020

Prediction: VCs figure out that for an investment to pay off there needs to be something more than "X + Deep Learning".
Date: NET 2021
2018 comment: I am being a little cynical here, and of course there will be no way to know when things change exactly.

Prediction: Emergence of the generally agreed upon "next big thing" in AI beyond deep learning.
Date: NET 2023, BY 2027
2018 comment: Whatever this turns out to be, it will be something that someone is already working on, and there are already published papers about it. There will be many claims on this title earlier than 2023, but none of them will pan out.
20240101 update: It definitely showed up in 2023. It was in the public mind in December 2022, but was not yet the big thing that it became during 2023. A year ago I thought it would perhaps be neuro-symbolic AI, but clearly it is LLMs, and ChatGPT and its cousins. And, as I predicted in 2018, it was something already being worked on, as the "attention is all you need" paper, the key set of ideas, was published in 2017.

Prediction: The press, and researchers, generally mature beyond the so-called "Turing Test" and Asimov's three laws as valid measures of progress in AI and ML.
Date: NET 2022
2018 comment: I wish, I really wish.
20230101 update: The Turing Test was missing from all the breathless press coverage of ChatGPT and friends in 2022. Their performance, though not consistent, pushes way past the old comparisons.
20240101 update: The Turing Test was largely missing from the press in 2024 also, and there was a story in Nature commenting on that. So yes, this has now happened.

Prediction: Dexterous robot hands generally available.
Date: NET 2030, BY 2040 (I hope!)
2018 comment: Despite some impressive lab demonstrations we have not actually seen any improvement in widely deployed robotic hands or end effectors in the last 40 years.
20260101 update: There has been progress over the last eight years with the dexterity of suction grippers, all in a single plane for surface contact. They have gotten much better, both in the lab and deployed, at picking from jumbled piles, much better than any articulated robot hands, including the standard parallel jaw grippers. The "impressive lab demos" I talked about here were for articulated hands. Progress on such hands has been very slow. Learning methods have failed to accelerate performance.

Prediction: A robot that can navigate around just about any US home, with its steps, its clutter, its narrow pathways between furniture, etc.
Date: Lab demo: NET 2026. Expensive product: NET 2030. Affordable product: NET 2035.
2018 comment: What is easy for humans is still very, very hard for robots.
20250101 update: A bunch of startups in the home robot space got significant funding in 2024. Two of them are run by ex-CEOs of large companies: iRobot and Cruise (and he was also an intern at iRobot after we were already a public company). So this one may be in play for a lab demo in the next few years if they have this as one of their goals.
20260101 update: No lab demos have occurred of a home robot class robot (I don't count as home robots small four legged robots that flail their legs quickly to beat gravity, and are therefore unsafe to be around children, and that can't do anything at all with their form factor besides scramble) that can navigate in a cluttered home and deal with even a single step. Amazon's Astro and Matic's vacuum robot have impressive in house navigation capabilities, but no way to handle steps.

Prediction: A robot that can provide physical assistance to the elderly over multiple tasks (e.g., getting into and out of bed, washing, using the toilet, etc.) rather than just a point solution.
Date: NET 2028
2018 comment: There may be point solution robots before that. But soon the houses of the elderly will be cluttered with too many robots.
20260101 update: There have been point solution lab demos in a few labs, especially in countries where the demographic inversion has already hit hard. But no general purpose solution is in sight. Current humanoid robots are promising this, but they fail dangerously when in extended contact with a human being.

Prediction: A robot that can carry out the last 10 yards of delivery, getting from a vehicle into a house and putting the package inside the front door.
Date: Lab demo: NET 2025. Deployed systems: NET 2028.
20260101 update: I missed calling this last year. Nothing at all that can do the last 10 yards of delivery from the street has been demonstrated at all. I now don't think we will see anything deployed this decade, and maybe not even a demo.

Prediction: A conversational agent that both carries long term context, and does not easily fall into recognizable and repeated patterns.
Date: Lab demo: NET 2023. Deployed systems: 2025.
2018 comment: Deployment platforms already exist (e.g., Google Home and Amazon Echo) so it will be a fast track from lab demo to wide spread deployment.
20240101 update: One half of this happened this year. ChatGPT has been connected to microphones and speakers so you can now talk to it. And it does not fall into recognizable patterns. BUT the other half is the half it does not have; it has no updatable memory apart from its token buffer of what it has just said. Long term context may be long term in coming.
20260101 update: I was too optimistic. It may have happened if LLMs had not come along (though the proficiency in language would surely have been poorer than what we now expect, post-LLM). If we are to do this with LLMs we will need to come up with a side mechanism beyond the token buffer--some form of episodic memory, and probably continuous model updating. There is work going on in these directions, but it has a ways to go yet to make a coherent agent that talks like it understands the flow of human worlds, emotions, and expectations.

Prediction: An AI system with an ongoing existence (no day is the repeat of another day as it currently is for all AI systems) at the level of a mouse.
Date: NET 2030
2018 comment: I will need a whole new blog post to explain this...

Prediction: A robot that seems as intelligent, as attentive, and as faithful, as a dog.
Date: NET 2048
2018 comment: This is so much harder than most people imagine it to be--many think we are already there; I say we are not at all there.

Prediction: A robot that has any real idea about its own existence, or the existence of humans in the way that a six year old understands humans.
Date: NIML

Capabilities and Competences

The predictions that are commented upon in the table above are all about when we would see robots and AI systems doing some things that simple creatures can do and others that any child of age nine or perhaps less can do without any difficulty.

Even children aged three or four can navigate around cluttered houses without damaging them (that is different from when they may want to damage them). They can get up and down single stairs, and even full staircases, on two legs without stumbling (or resorting to four limb walking as a two year old might). By age four they can open doors with door handles and mechanisms they have never seen before, and safely close those doors behind them. They can do this when they enter a particular house for the first time. They can wander around and up and down and find their way.

One of the many promises about humanoid robots is that they too will be able to do this. But that is not what they can do today. But wait, you say, “I’ve seen them dance and somersault, and even bounce off walls.” Yes, you have seen humanoid robot theater. All those things are done on hard surfaces, and anything specific beyond walking has been practiced and optimized by reinforcement learning, for exactly the situation of the floors and walls as they are. There is no real-time sensing and no ability to wander in previously unseen environments, especially not those with slipping hazards such as towels or sheets of cardboard on the floor.  Children can do so easily.  While four legged robots are much better at it than humanoid robots, they are wider than people, and still have significant foot slipping problems, and cannot open random doors themselves as children can.

A nine year old child can pretty much do any task (but with less weighty packages) that any delivery driver can do.  That includes climbing out of a van, walking up and down slopes, going up and down previously unseen external staircases, sometimes ending in a dark porch or vestibule area, then putting the package on the ground, or putting it into a drop bin after grasping and pulling on the handle — again never having encountered that particular design of bin and handle. All this can be done immediately upon seeing the scene for the first time. We have not seen anything remotely like that in a lab demo for robots, despite my hope from eight years ago that by now such would have been demonstrated. And again, here a four legged robot might be able to do the walking and stair climbing, but it won’t be able to manipulate the package. Also note that humans doing these tasks don’t just carry single packages out in front of them with two outstretched arms, but often use their elbows, their hips, and their bellies to support multiple packages as they locomote.

Elder care is a commonly quoted target market for robots, and with good reason given the current and growing demographic inversions in much of the world. There are far fewer younger people relative to the number of older people than there have been historically, and so fewer people to provide elder care.  In providing care to the very elderly, there is a need to support those people physically, both passively, providing compliant support for them to lean on, and actively, getting people into and out of bed, into and out of bathtubs or shower enclosures, and getting people onto and off of toilets. And sometimes wiping their bums. There are no force sensing and control capabilities on any of today’s robots which are remotely capable of doing any of these sorts of things safely and comfortably. And machine learning is not going to provide those capabilities. There are many fundamental design, materials, and engineering problems to solve to make these things possible.  A bitter lesson, perhaps, for those who think that more data will solve everything.

But the other unresolved capability that I have in my predictions table above is an agent that understands the world in an ongoing way as we all understand it.  That includes knowing what to expect to be the same as it was yesterday, and will be tomorrow, and what has changed about the world since yesterday or is likely to change today or tomorrow. Such an understanding of the world will be important for any deployable systems that can take care of real and vulnerable humans, including the elderly. And the young. And the rest of us.

In summary, I thought that more progress would be made on many of these problems than has been achieved over the last eight years. That lack of progress is going to have real, and negative, impact on the quality of life of the newly elderly for the next couple of decades. Ouch!

VCs, please take note: there are real pulls for technologies that can help the elderly, and being in there first with something that can actually deliver value in the next three to five years will come with a very large upside.

World Models

Lots of people are talking about world models and their importance, as add ons to LLMs, as mechanisms for agentic AI to exploit, and for allowing robots to do real tasks. These aspirations are probably reasonable to have, and successfully working on them can have real impacts.

Unfortunately the talkers are not the doers, and not the deployers, and not the people who have to solve real problems. And so they all have different, and convenient for themselves, understandings of what world models are.  That, along with the worship of big data and the belief that machine learning will solve all problems, means we have a big mess, with people jumping to “solutions” before they understand the problems.

Some people are even claiming that they will build world models by learning them from having agents play video games.  But how do those video games work? They have a coded geometry-based world model, with a little physics engine. It is already built!  Using machine learning (and tens of millions of dollars) to extract it rather than just looking at the source code (and perhaps buying or licensing that code) is just wacky.
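To make that concrete, here is a toy sketch (my own, not code from any actual game engine) of the kind of world model that already sits inside a video game: the geometry and the little physics engine are explicit, inspectable source code, not something that has to be recovered by training on gameplay.

    # Toy version of the explicit world model inside a video game:
    # object state plus a tiny physics step, already written down as code.
    GRAVITY = -9.8   # m/s^2

    class Ball:
        def __init__(self, x, y, vx, vy):
            self.x, self.y = x, y          # position: the coded geometry
            self.vx, self.vy = vx, vy      # velocity

        def step(self, dt):
            # The "little physics engine": integrate gravity, bounce off the floor.
            self.vy += GRAVITY * dt
            self.x += self.vx * dt
            self.y += self.vy * dt
            if self.y < 0:                             # collision with the ground plane
                self.y, self.vy = 0.0, -0.8 * self.vy  # lossy bounce

    b = Ball(x=0.0, y=2.0, vx=1.0, vy=0.0)
    for _ in range(100):
        b.step(dt=0.02)
    print(round(b.x, 2), round(b.y, 2))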

Expect more confusion and lots and lots of reinvention. This fever has quite a ways to go before today’s memes and slogans get replaced by the next generation of memes and slogans, with perhaps some good work coming out in a rational interregnum. We can hope.

Situatedness vs Embodiment

One of the new things that people are getting excited about is Embodied Intelligence.  I agree that it is worth being excited about, as it is what I have spent the last forty years working on.  It is certainly about robots being in the world.

But since 1991 I have made a distinction between two concepts where a machine, or creature can be either, neither, or both situated and embodied. Here are the exact definitions that I wrote for these back then:

[Situatedness] The robots are situated in the world—they do not deal with abstract descriptions, but with the here and now of the world directly influencing the behavior of the system.

[Embodiment] The robots have bodies and experience the world directly—their actions are part of a dynamic with the world and have immediate feedback on their own sensations.

At first glance they might seem very similar.  And they are, but they are also importantly different. And, spoiler alert, I think much of the work at companies, large and small, right now, is trying to abstract out the embodiment of a robot, turning it into a machine that is merely situated.

An algorithm, written as code, to find the greatest common divisor of two numbers, when running, is neither situated nor embodied.

A robot that is thrown into the air with just an inertial measurement unit (IMU) as its sensor that moves its limbs about to zero out rotations and then is caught by a net is embodied but not situated.

A robot that has a physical face with which it can make expressions, a voice synthesizer, cameras, and microphones, and that can talk to a person giving appropriate responses both with its choice of words and with appropriate prosody and facial expressions, to some purpose and in response to how the person talks and moves, is situated but not really embodied. Embodied in its presence yes, but not embodied in any physical interactions with its environment.

A robot that can roll around without hitting stationary objects, wherever they are, nor hitting moving people or other vehicles, that can go to a location specified by a warehouse management system, that responds safely to people grabbing it anywhere, and can give a person who grabs its control handle agency over it going wherever the person pushes it with a light touch no matter how much weight it is currently carrying, is both embodied and situated. [And yes, this is what our Carter robots do at Robust.AI.]

These are just point examples of the four classes of entities that come from having or not having the two properties of situatedness and embodiment.
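One way to keep the four classes straight is to treat situatedness and embodiment as two independent yes/no properties. Here is a minimal sketch (my own labeling of the point examples above, not anything from the 1991 paper):

    # The four classes that come from the two properties, using the point examples above.
    examples = {
        "greatest common divisor program":   dict(situated=False, embodied=False),
        "thrown robot zeroing out rotation": dict(situated=False, embodied=True),
        "expressive conversational head":    dict(situated=True,  embodied=False),
        "warehouse cart robot":              dict(situated=True,  embodied=True),
    }

    for name, p in examples.items():
        label = ("situated" if p["situated"] else "not situated") + ", " + \
                ("embodied" if p["embodied"] else "not embodied")
        print(f"{name}: {label}")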

Real robots that do real work in dynamic human occupied environments must be both situated and embodied. For instance, a robot that is to help with in home elder care needs to be aware of the situation in the world in order to know what to do to help the person.  It needs to be able to open doors with different handles and latching mechanisms, and then control the inertia of the closing door so that the environment is both safe and quiet for the person. The robot needs to be able to accommodate the person reaching for it dynamically, looking for support so that they don’t fall. The robot needs to be able to take things handed to it by the person, and pass things to the person in a way which is both safe and makes it easy for the person to grasp. Etc., etc.

In short the robot needs to control forces and inertias in the world and to be responsive to them, at the same time as it is acting in a way that can be understood as sentient.

Being both situated and embodied is still a challenge to robots in the world.  [[Now here is the most important sentence of this whole blog post.]] I think the training regimes that are being used for both locomotion and dexterity are either ignoring or trying to zero out the embodiment of physical robots, their inertias and forces, reducing them to merely being situated, just apps with legs and arms, characters in video games, not the reality of real physical beings that the tasks we want them to do require.

Dexterous Hands

I talked about the challenges for dexterity earlier this year. In the table above I have a new comment this year saying that there has been improvement in the dexterity of suction based grippers but not for articulated grippers.

Suction grippers have plastic suction cups which themselves are compliant. Under the force of the suction they can change shape, to a degree, to accommodate unknown shapes in the thing being grasped (sucked up to).  They also allow for a little bit of torsional rotation about the axis of sucking and a bit of rocking of the suction cup in the two degrees of freedom in the plane orthogonal to the suction axis.

While suction cups have evolved to better pick things up and so are common for handling packaged goods, the companies that package materials to be shipped through automated systems choose versions of plastics for bags that won’t be sheared open by the suction pulling against outer parts of such cups.

The result is that the control of the embodied action of grasping can become much more a simply situated action. Once the pick orientation and vacuum gripper selection have been made it is really open loop, as all the work is done by the indiscriminate force of suction and the mutual compliance of the gripper and the grippee.
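A rough sketch of what that looks like as a program (all the names here are hypothetical placeholders, not any real robot's API): everything is decided once from vision before contact, and after that the sequence runs blind, with the compliance of the cup doing the work and only a binary seal check at the end.

    # Hypothetical open-loop suction pick, reduced to its bare structure.
    def choose_pick_pose(scene):
        # In a real cell this comes from vision over the jumbled pile; here a dummy pose.
        return (0.40, 0.10, 0.25, 0.0)     # x, y, z, yaw

    def choose_cup(scene):
        # Vacuum gripper selection, e.g. by item size and material; here a fixed choice.
        return "30mm bellows cup"

    def open_loop_pick(scene):
        pose = choose_pick_pose(scene)     # decided once, before contact
        cup = choose_cup(scene)            # decided once, before contact
        return [                           # no force feedback from here on:
            ("move_to", pose),             # position the selected cup over the item
            ("suction_on", cup),           # indiscriminate suction force does the work
            ("lift", 0.15),                # cup compliance absorbs the pose error
            ("check_seal", cup),           # only a binary success check at the end
        ]

    print(open_loop_pick(scene={"image": None}))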

Above I had argued against doing this with a general purpose humanoid hand. It makes no sense there as the adaptability of the hand is its most precious attribute. But here in a special purpose hand, a suction gripper, it actually simplifies things within the specialization of task, and here a purely situated hand may make sense. And it may be possible to train it with purely visual data.

So what does this tell us?  It says that there is plenty of room for mechanical design, and simpler computational embodied control for all sorts of grippers and things in the world that need to be gripped.

The end of Moore’s Law, at least the version that said we could reduce feature size on silicon by a factor of two every year, opened up a new golden era of chip design. The winners (through early luck and then dogged determination) matched untraditional designs to new problems (machine learning) and achieved speedups (and corporate valuations) that were unheard of. In the last 10 years we have moved from general purpose silicon to special purpose silicon for our most high volume computations.  That was not on most people’s expectation list twenty years ago.

So too today, with stalled capabilities from full human hand emulation efforts through machine learning from visual observation, there is a rich array of more specialized manipulation tasks where special purpose grippers, clever interplay of materials and force applications, geometric planning, specialized sensing, and maybe even some machine learning may lead to enormous application markets.

For instance, a specialized robot body, hands (of some sort), arms, and support limbs or wheels that can safely manipulate an elderly human could have enormous impact on elder care around the world. A single human care-giver along with one human-manipulator robot could provide a lot more care for a frail elderly person than the care-giver alone could do.

Special purpose manipulators for fruits, or for some range of small mechanical parts, or clothing, could each open enormous markets for automation in particular handling tasks for each of them. And countless other specialities.

Economic pull is out there.  Being the smart academic researcher, entrepreneur, or technology investor, may lead to enormous new types of deployable automation.

The new dexterity may turn out to be special purpose. And eventually we may come to understand that just because the hands we know best happen to be our own, does not mean that our own hands are the best for the majority of tasks in our human world.

Humanoid romanticism may not be our future after all.

Human Space Flight

Looking at the missions and numbers over the last three years it appears that human spaceflight is at a steady plateau, with, by the way, far fewer people going into orbit than in the time of the Space Shuttle.  Underneath though, there is a lot of churn, a likely new player, and the return of humans to lunar distances for the first time in 54 years.

Below is the updated scoring of my 2018 predictions for human spaceflight. There are six new comments in this table, but no new specific calling of predicted dates as right or wrong. It is now clear to me that I was way too optimistic in regard to my predictions for Mars, even though I was wildly out of step and much more pessimistic than the predictions coming out of SpaceX eight years ago. Given how slow things have turned out trying to land people on the Moon, the hoped for crewed colony on the Moon (think of it as the ISS (International Space Station) on the lunar surface) may well slip to what I had predicted for Mars.  Mars is going to take much longer than the Moon.

Following the table there are the detailed numbers and trends on both orbital crewed flights, and suborbital crewed flights. Things will change from stasis in 2026.  A crewed flight to the Moon is scheduled to happen in a matter of weeks, with the vehicle already stacked, now.  And suborbital crewed flights may possibly have quite an uptick in 2026.  Following those two sections I have more on Boeing’s Starliner, SpaceX’ Starship, and Blue Origin’s New Glenn, NASA and the Moon, and what is going to happen with space stations given the scheduled end of life of the ISS in 2030.

Prediction [Space] | Date | 2018 Comments | Updates

Prediction: Next launch of people (test pilots/engineers) on a sub-orbital flight by a private company.
Date: BY 2018

Prediction: A few handfuls of customers, paying for those flights.
Date: NET 2020

Prediction: A regular sub weekly cadence of such flights.
Date: NET 2022, BY 2026
20240101 update: There were four flights in 2021, three in 2022, and seven, five with customers on board, in 2023--all of them by Virgin Galactic. Blue Origin did not fly in 2023. At this point 2026 is looking doubtful for regular flights every week.
20250101 update: Now 2026 is looking impossible given the data from 2023 and 2024, and one of the two companies being on hiatus for all of 2025, and well into 2026.
20260101 update: Blue Origin more than doubled their flights to seven in 2025, and has now announced that weekly flights are their next goal, but without a firm date. Their plan includes a second launch location and a set of new vehicles that are being built. 2028 is starting to look plausible for weekly flights, if the paying customer demand really is there.

Prediction: Regular paying customer orbital flights.
Date: NET 2027
2018 comment: Russia offered paid flights to the ISS, but there were only 8 such flights (7 different tourists). They are now suspended indefinitely.
20240101 update: There were three paid flights in 2021, and one each in 2022, and 2023, with the latter being the Axiom 2 mission using SpaceX hardware. So not regular yet, and certainly not common.
20250101 update: There were two paid flights in 2024.
20260101 update: And there were two more in 2025, making six paid flights over three years. Axiom is steady at one flight a year, channeling funds from governments with no launch capability. Two of the other three flights were personally paid for by the now NASA administrator, so unlikely to continue. This is not yet a vibrant recurring private flight model.

Prediction: Next launch of people into orbit on a US booster.
Date: NET 2019, BY 2021, BY 2022 (2 different companies)
2018 comment: Current schedule says 2018.
20240101 update: Both SpaceX and Boeing were scheduled to have crewed flights in 2018. SpaceX pulled it off in 2020, Boeing's Starliner did not fly at all in 2023, but is scheduled to launch with people onboard for the first time in April 2024.
20250101 update: The second company did finally launch humans into orbit in June 2024, so it has happened three years later than I predicted and six years later than what had been promised when my prediction was made. Of course, everyone implicitly assumed that along with getting humans into space the companies would also be able to bring them back. Not so for Boeing.
20260101 update: After one crewed launch, and no crewed landings, Boeing is back to uncrewed testing late in the first half of 2026.

Prediction: Two paying customers go on a loop around the Moon, launch on Falcon Heavy.
Date: NET 2020
2018 comment: The most recent prediction has been 4th quarter 2018. That is not going to happen.
20240101 update: Starship launched twice in 2023 but didn't get to orbit either time. This is going to be well over six years later than the original prediction by the CEO of SpaceX.
20250101 update: The billionaire who signed up for this and paid a hefty deposit in 2017 gave up waiting and cancelled the contract in 2024. This fantasy is over, for now at least.
20260101 update: NASA is planning on sending four astronauts (including one from Canada) on a loop around the Moon in the first half of the year. But it is not a paying customer trip.

Prediction: Land cargo on Mars for humans to use at a later date.
Date: NET 2026
2018 comment: SpaceX has said by 2022. I think 2026 is optimistic but it might be pushed to happen as a statement that it can be done, rather than for a pressing practical reason.
20240101 update: I was way too optimistic, and bought into the overoptimistic hype of the CEO of SpaceX even though I added four years, doubling his estimated time frame.
20250101 update: I can now call this as orbital mechanics and Hohmann transfer windows dictate that the cargo would need to have been launched a few months ago for it to get to Mars in 2025. It has not been launched.
20260101 update: This is not going to happen for many years, and SpaceX may not be the first to do it.

Prediction: Humans on Mars make use of cargo previously landed there.
Date: NET 2032
2018 comment: Sorry, it is just going to take longer than everyone expects.

Prediction: First "permanent" human colony on Mars.
Date: NET 2036
2018 comment: It will be magical for the human race if this happens by then. It will truly inspire us all.
20260101 update: I think I was way too optimistic. I now seriously doubt a human will land on Mars before 2040, and the earliest settlement won't happen before 2050. The currently announced mission architectures lack both science for protecting the crew en route and engineering practicality. I will allow that, just perhaps, an organization is working hard on these two problems with no public exposure. But that in itself would be a first.

Prediction: Point to point transport on Earth in an hour or so (using a BF rocket).
Date: NIML
2018 comment: This will not happen without some major new breakthrough of which we currently have no inkling.

Prediction: Regular service of Hyperloop between two cities.
Date: NIML
2018 comment: I can't help but be reminded of when Chuck Yeager described the Mercury program as "Spam in a can".
20240101 update: Calling this one 26 years early. As of today no-one is still working on this in an operating company.

Orbital Crewed Flights

In both 2024 and 2025 the US put 16 people into orbit and Russia and China put 6 people each into orbit; 28 people total went to orbit in each year. We have gone from a historical low of only eight people going to orbit in 2020 to a steady-ish state of roughly 28 people per year now. That may jump up to over 30 people in 2026 because of the additional Artemis II flight to the Moon, following checkout in LEO (Low Earth Orbit).  But even with that bump there may be other pressures which keep it from rising above the high twenties for 2026.

We are certainly not seeing steady growth in the number of humans getting launched to orbit, and the numbers are significantly lower than in the heyday of Shuttle launches in the nineties and early two thousands. There is no growth trend visible, and the long promised exponential growth of people going to orbital space has not even made a brief guest appearance.

Here is a more detailed history for the last six years, where the first line in each box says how many crewed launches of the particular vehicle there were, and the second line, in square brackets, says how many people, total, were onboard those flights. Wherever there are three numbers separated by forward slashes you have to sum the numbers to get the total.

Launch family       2020      2021      2022      2023      2024      2025
SpaceX Dragon       1/0/0     2/0/1     2/1/0     2/1/0     2/1/1     2/1/1
                    [2/0/0]   [8/0/4]   [8/4/0]   [8/4/0]   [6/4/4]   [8/4/4]
Boeing Starliner                                            1/0/0
                                                            [2/0/0]
Russia Soyuz        2/0/0     2/2/0     2/0/0     1/0/0     2/0/0     2/0/0
                    [6/0/0]   [8/4/0]   [6/0/0]   [3/0/0]   [6/0/0]   [6/0/0]
China Shenzhou                2         2         2         2         2
                              [6]       [6]       [6]       [6]       [6]
Total Humans        8         30        24        21        28        28
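Since the slash-separated entries take a moment to read, here is a small Python script that re-derives the Total Humans row from the people counts above, just as a consistency check on the table as transcribed:

    # Sum the bracketed people counts per year to reproduce the Total Humans row.
    people = {   # each entry is "state to ISS / tourists to ISS / free-flying tourists"
        "SpaceX Dragon":    {2020: "2/0/0", 2021: "8/0/4", 2022: "8/4/0",
                             2023: "8/4/0", 2024: "6/4/4", 2025: "8/4/4"},
        "Boeing Starliner": {2024: "2/0/0"},
        "Russia Soyuz":     {2020: "6/0/0", 2021: "8/4/0", 2022: "6/0/0",
                             2023: "3/0/0", 2024: "6/0/0", 2025: "6/0/0"},
        "China Shenzhou":   {2021: "6", 2022: "6", 2023: "6", 2024: "6", 2025: "6"},
    }

    for year in range(2020, 2026):
        total = sum(sum(int(n) for n in family[year].split("/"))
                    for family in people.values() if year in family)
        print(year, total)   # 8, 30, 24, 21, 28, 28 -- matching the Total Humans row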

The three countries with current crewed orbital launch capabilities are the US, Russia, and China.

All Chinese flights are state astronauts (or taikonauts) and all of them go to the Chinese space station. And there are no tourists, so far, on Chinese flights, so we just have single numbers for both launches and people.

All the state astronauts for both the US and Russia go to the International Space Station (ISS), but a state player (in Russia) and a non-state player in the US (SpaceX) have also launched tourist flights in the last six years. So for those two countries we have three numbers separated by slashes for both launches and people. The first of the three numbers refers to purely state launches to the ISS (note that the US and Russia both launch each other’s state astronauts to the ISS so that both countries have astronauts up-to-date trained on the other’s launch systems, in case of emergencies arising at some point). The second number in the triples is space tourists whose destinations have also been the ISS, while the third number (for both launches and people) is for tourist flights that have been independent of going to the ISS — there have been a total of three of these, all launched by SpaceX. Two of those three flights were purchased personally by Jared Isaacman, who was sworn in as the NASA administrator just two weeks ago.

The one year in the last six where Russia has launched space tourists (after being the leader in this endeavor early in the century) was 2021, when two flights of Soyuz to the ISS each had one regular state cosmonaut and two space tourists. And there was one slightly wobbly other launch of a Soyuz in 2024, not called out in the table, where a flight attendant from the state airline of Belarus was sent as a cosmonaut from that country to the ISS on a Russian Soyuz. That was most likely an event orchestrated by Russia to keep support from Belarus for their war against Ukraine. Ugly.

The term tourist needs some explanation. The people (as with suborbital Blue Origin flights) are a mixture of private people paying for the experience (or having some other individual pay for them) and astronauts from countries that do not have their own launch capability. In the case of the three tourist flights to the ISS on a SpaceX Dragon, all have been paid for by the company Axiom, with a former NASA astronaut in command. The three others on each of those flights are people in the fledgling astronaut programs of other countries who have paid Axiom for the seats. Axiom has commercial relationships with both SpaceX and NASA, for the use of the Falcon 9 launch vehicle and the Dragon craft, and for the fees to use the ISS.

Suborbital Crewed Flights

Virgin Galactic is on a multi-year hiatus on flights as they develop new flight vehicles, but they may well fly again in 2026. Thus, for the last year, only Blue Origin has been launching tourists (again a mixture of private individuals and astronauts from other countries that have not yet developed their own crewed launch capability, but may be aiming at doing so) on suborbital flights. Blue Origin also sells uncrewed launches for experiments that need to be exposed to the environment of space and/or operation in microgravity, if only for a few minutes.

In 2025 Blue Origin had seven launches each with six people on board. Previously they had had three crewed launches in each of 2021, 2022, and 2024, each with six people on board, with a hiatus in 2023.

Blue Origin has been quite careful with forward projections for both suborbital and orbital flights, so when they say what they intend to do and when, they are likely to come close to achieving that promise.

Recently they said that they are going to introduce three new flight vehicles starting in 2026 to run their suborbital flights, that they are looking at developing a second launch site, somewhere other than Texas, and that they believe they have the customer demand to support one flight per week. They do not disclose what they charge for the flights. Nor did they give any firm dates for reaching these goals. But I think it is likely that we will see a jump in the number of flights in 2026.

In December of 2025 I was at an event centered on solar system orbital dynamics and met a sub-orbital tourist there. He has already paid for and flown above the Kármán line on Virgin Galactic. Now he has paid for a Blue Origin sub-orbital flight and is waiting for a launch assignment. There is definitely a market for these flights; it remains to be seen whether the prices and demand combine in a way that makes it profitable for seat suppliers to keep doing it.

 

Boeing’s Starliner (not to be confused with the SpaceX Starship)

When it was first announced, in 2010, Boeing’s Starliner was originally scheduled to fly a human test crew in 2018. It was supposed to send the crew to the ISS, and then it would be under contract to launch six crews to the ISS, much as SpaceX has already launched 11 regular crews to the ISS.

In mid 2024 it delivered a human test crew, Barry Wilmore and Sunita Williams, to the ISS, but after much analysis of anomalies it returned to Earth without them. NASA bumped two crew members from the next crew going on a SpaceX flight to the ISS to provide room for their return on that SpaceX Dragon, which they did after an unexpected extra nine months on top of their originally scheduled week at the ISS.

Last year in my yearly update I said:

We do not know at this point, but I think it would not be a huge surprise if Starliner never flies again.

It turns out it is going to fly again! Including potentially twice in 2026. But there are some changes. The six missions which were contracted to take astronauts on regular assignment to the ISS were called Starliner-1 through Starliner-6. The contract with NASA has been modified to make the last two flights future options rather than sure things. And Starliner-1 scheduled for the first half of 2026 will be un-crewed again. Then the three remaining flights in the modified contract would each take four astronauts on regular rotations to the ISS. There is one little hiccup. Sunita Williams is the only active astronaut, not committed to other current or upcoming missions, who has trained to fly on Starliner. She now has over 600 days in space and another six month mission to the ISS would take her over radiation exposure limits.

SpaceX Falcon 9 

I gave the statistics for Falcon 9 in the introduction, talking about what has surprised me in the last 8 years. When I made my predictions Falcon 9 had been launched 46 times over 8 years. Only five of those launches re-used a previously flown first stage, and only in the previous year had successful landings of the first stage become reliable.

Now Falcon 9s are getting launched at a sustained rate of more than three per week, all attempts at landing boosters are successful, and typically each booster flies over 20 times.

Just phenomenal unmatched reliability and performance.

NASA, Artemis, and Returning to the Moon

I am jumping ahead of Starship (SpaceX) and New Glenn (Blue Origin) to talk about NASA’s plan to get people back to the lunar surface, and perhaps to set up a more or less permanent outpost there, operated much as the ISS has been, continuously occupied for 25 years with crew members rotated in and out twice a year. (China’s space station follows the same model, but with only 3 occupants compared to 7 for the ISS.)

2026 promises to be a big year for humanity and the Moon. No one has been beyond low Earth orbit (LEO) since the Apollo 17 mission had three people go to lunar orbit and two of them landed in December 1972, fifty three years ago.

In November 2022 the first launch of NASA’s SLS (Space Launch System) occurred, taking its uncrewed Orion capsule on a looping orbit past the Moon and back. It made a close approach to the Moon on both the outbound and return legs, and then came back to Earth and splashed down. Note that this was the FIRST flight of both the multi-stage rocket and the habitable capsule. It all worked the FIRST time. Everything was built by contractors, but it went through NASA’s methodology for making sure things work rather than fail.

The first stage is a liquid fueled rocket using four RS-25 engines, the same engine type as the three that powered the Space Shuttle. It also has two strap-on solid fuel boosters, larger versions of the Space Shuttle’s solid fuel boosters. The second stage is essentially an off-the-shelf stage from the past Delta program. A third stage will be added for the fourth and subsequent flights. This is a derivative vehicle, with a long history of successful use of its components.

When Vice President Mike Pence announced the details of the program in 2019 the landing of people on the Moon was supposed to happen in 2024.  Things have slipped a little since then.

The first crewed mission to the vicinity of the Moon (no landing), Artemis II, had slipped to April 2026, but has now been pulled forward to February 2026 (next month!), when a crew of four will spend over ten days in space on a flight to the Moon, approaching to within 4,600 miles, and then head back towards Earth on a free return trajectory (no need for working engines). All their energy will be removed first by the heat shield hitting the Earth’s atmosphere and then by 11 parachutes, before finally splashing down in the ocean. Note that on all 9 flights to the Moon of the Apollo program the spacecraft came much closer to the Moon than this, and 8 of those flights went into orbit at around 60 to 70 miles above the surface. So this is a more conservative mission than those of Apollo.

Things at this stage are looking good for Artemis II to fly in February 2026.

The next step of the Artemis program is where things get wobbly. Rather than 2024, the first landing of astronauts on the Moon is currently scheduled for 2027. But that is not going to happen. Here is what the architecture of the mission currently looks like:

Here we see the problem with the schedule, even with it currently slipped to landing two astronauts on the Moon in 2027.

The architecture uses the SLS and Orion to get the astronauts to lunar orbit. Given there is a lunar flyby with astronauts onboard, scheduled for just two months from now (and the rocket is already stacked for that mission) that looks like a reasonable interpolation from existing progress.

The problem with the new plan is the landing vehicle and getting it to lunar orbit. It is all based on SpaceX’s Starship. So far Starship has had 11 flights, six of which have been successful in reaching their own goals, and five of which have been failures. But there has not yet, in eleven flights, been a goal of getting anything into orbit. And just in 2025 two vehicles were destroyed by failures on the ground while their tanks were being pressure tested. In the section on Starship below I will talk more about what I see as conflicting product requirements which together doom Starship to a very long development process.

For comparison, the Saturn V which took astronauts to the Moon nine times had a total of 13 flights, every one of which got payloads to Earth orbit. Two were uncrewed tests (and there were problems with the second and third stages on the second of those test flights). Its very first crewed flight (Apollo 8) took people to the Moon, and a total of 9 launches got people to the Moon. The other two flights were Apollo 9, a crewed flight to test the Lunar Module and orbital rendezvous in Earth orbit, and the uncrewed launch of the first US space station, Skylab.

Now look again at the plan for the Artemis III mission.  It requires multiple (reported numbers range from 14 to somewhere into the twenties) launches of the Starship to orbit.

One of those launches uses the Super Heavy Booster and a special version of the Starship second stage, known as Starship HLS (Human Landing System). That special version is expendable: it lands astronauts on the Moon, hosts them for perhaps two weeks, brings them back to lunar orbit where they transfer to NASA’s Orion, and then sends itself off into heliocentric orbit for all eternity. The HLS version is special in two ways. First, it does not have to get back to Earth, and so needs neither heat shields nor the three in-atmosphere Raptors for a soft landing on Earth (see the section on Starship below). That is good for all the mass equations. But it does, or might, have a second set of engines for landing on the Moon, attached halfway up its body so that they kick up less lunar dust as it lands. We have not yet seen a prototype of that version, not even a public rendering as far as I can tell. I have talked to people who are in regular communication with people inside SpaceX. They report not a peep about what work has been done to design or build the lander. That is not good for the current public schedule.

BUT the really, really bad thing is that the lunar lander stage will use up most of its fuel getting into Earth orbit — it is the second stage of the rocket after all. So it cannot get to the Moon unless it is refueled. That will be done by sending up regular versions of the Starship second stage, all on reusable Super Heavy Boosters. Each of those tankers will also use up most of its fuel getting to orbit, and will need to keep some to get back to Earth to be reused on another flight. But each will have a little margin, and that extra fuel will be transferred to the lunar landing Starship in orbit.

No one has ever demonstrated transfer of liquid fuel between vehicles in space. Because of the way the numbers work out it takes somewhere in the teens of these refueling flights, and depending on how quickly certified higher performance engines can be developed and tested for both the Super Heavy Booster and Starship itself, that number might range into the twenties.
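
To see why the count lands in the teens or twenties, here is a back-of-envelope sketch. The numbers in it (how much propellant the lander needs in orbit, and how much usable propellant each tanker flight can hand over) are round illustrative assumptions of mine, not SpaceX figures; the flight count is very sensitive to the per-tanker delivery number, which is exactly what depends on those higher performance engines.

# Back-of-envelope sketch of the tanker flight count. All numbers below are
# illustrative assumptions of mine, not SpaceX figures.
import math

def tanker_flights(propellant_needed_t, usable_per_tanker_t):
    """Number of tanker flights needed to fill the lander in orbit."""
    return math.ceil(propellant_needed_t / usable_per_tanker_t)

# Assume the lunar lander needs on the order of 1,200 tonnes of propellant in
# orbit, and each tanker flight delivers 50-150 tonnes of usable propellant,
# depending on engine performance, margins, and boiloff losses.
for delivered in (150, 100, 75, 50):
    print(f"{delivered:>3} t per tanker -> {tanker_flights(1200, delivered)} flights")
# 150 t per tanker -> 8 flights
# 100 t per tanker -> 12 flights
#  75 t per tanker -> 16 flights
#  50 t per tanker -> 24 flights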

To me as an engineer this architecture looks like trouble, and possibly an unworkable one. I am sure it will not happen in 2027, and I have doubts that it ever will.

The acting administrator of NASA, Sean Duffy, who is also the head of the US Department of Transportation, was worried about this too, and in October of 2025 he reopened bidding on the contract for a crewed lunar lander that would collect its crew from, and return them to, Orion in lunar orbit.

The day after this announcement SpaceX said they were working on a simplified architecture to land people on the Moon. They have given no details of what this architecture looks like, but here are some options proposed by the technical press.

A couple of weeks later the President announced the renomination of Jared Isaacman to be the NASA administrator, having withdrawn his nomination a few months before. Isaacman is a private citizen who personally paid for, and flew on, two of the three SpaceX crewed missions which have not flown to the ISS. He was confirmed to the NASA position on December 17th, 2025, just two weeks ago.

At the very least expect turbulence, both political and technical, in getting astronauts landed on the Moon. And see a possible surprise development below.

SpaceX Starship (not to be confused with Boeing’s Starliner)

Starship is SpaceX’s superheavy two stage rocket, designed to put 150(?) tons of payload into orbit, with components having been under development since 2012, going through extensive redesigns along the way. There have also been three major designs, builds, and tests of the Raptor engines that power both stages.

This is how Wikipedia currently introduces them:

Raptor is a family of rocket engines developed and manufactured by SpaceX. It is the third rocket engine in history designed with a full-flow staged combustion fuel cycle, and the first such engine to power a vehicle in flight. The engine is powered by cryogenic liquid methane and liquid oxygen, a combination known as methalox.

SpaceX’s super-heavy-lift Starship uses Raptor engines in its Super Heavy booster and in the Starship second stage. Starship missions include lifting payloads to Earth orbit and is also planned for missions to the Moon and Mars. The engines are being designed for reuse with little maintenance.

Currently the Raptor 3 version is expected to be used for operational Starship launches, and it comes in two variants. There are 33 Raptors in the first stage designed to operate optimally in the atmosphere, along with three such engines in the second stage, which also houses three vacuum optimized Raptors. The first stage engines and the second stage vacuum engines are what get payloads to orbit. The vacuum engines on the second stage would also be used for further operations on the way to the Moon and for descending towards the surface there. And for non-expendable second stages they would be used for the initial de-orbit burn when landing the second stage Starship back on Earth. After the heat shields burn off some more energy as it enters the atmosphere, the second set of engines, the atmosphere optimized Raptors, slow it down to a soft landing.

Other systems for returning to Earth have used different tradeoffs. The Space Shuttle used its wings to glide to a runway landing at very high horizontal speed, and then a combination of a drag parachute after touchdown and brakes on the wheels to get down to zero velocity. US capsules, such as Mercury, Gemini, Apollo, Orion, and Dragon, have all used heat shields followed by parachutes during the vertical fall, and finally dropped into the sea to damp out the residual velocity. (Soyuz and New Shepard both fire last second retro rockets before hitting the ground, rather than landing in water, and Starliner comes down on land cushioned by airbags.)

This means that, unlike all the other solutions, Starship has to carry a complete set of landing engines into orbit, along with enough fuel and oxidizer to use them, just for the landing. That is a high performance price for a vehicle that spends most of its flight in space. The engines on the Starship first stage, like those on Falcon 9 and Blue Origin’s New Glenn, do get to space but never reach more than a small fraction of orbital speed, so returning them to Earth exacts a much, much lower performance price than Starship’s second stage returning its engines and landing fuel from orbit.

The 2025 flights of Starship were, on average, better than the 2024 flights, but two vehicles destroyed themselves before getting to the flight stage, and still nothing got into orbit.

How close is it to working?  I don’t know.  But I do keep tabs on promises that have been made.

In November of 2024 the President of SpaceX  said “I would not be surprised if we fly 400 Starship launches in the next four years”.  A year ago today I said in response: “Looking at the success of Falcon 9 it is certainly plausible that I may live to see 400 Starship launches in a four year period, but I am quite confident that it will not happen in the next four years (2025 through 2028)”. We are a quarter of the way through her predicted time frame and we have gone from being 400 orbital launches away from her goal down to being a mere 400 away.

Blue Origin Gets to Orbit

The suborbital tourist flights that Blue Origin operates are not its main business. It has ambitions to compete head to head with SpaceX.

But it is almost 600 launches behind; how can it be competitive? In 2025 Blue Origin made clear that it is not to be dismissed. It went from zero orbital launches at the start of 2025 to having two orbiters on their way to Mars (something SpaceX has not yet done) and to showing that it can land a booster whose performance comes very close to that of Falcon Heavy’s three booster configuration when all three of those boosters are landed. And it may well do a soft landing on the Moon in 2026 (SpaceX won’t come close to that goal for a number of years).

In January 2025 Blue Origin launched its first New Glenn rocket. Its first stage is powered by seven BE-4 engines (“Blue Engine 4”), a methane burning engine that is more powerful than the Raptor engines that have flown on SpaceX’s Starship to date. New Glenn reached orbit on its first attempt, and delivered a Blue Origin payload to space (a test version of their Blue Ring for in-space communications). The first stage attempted to land on Blue Origin’s Jacklyn landing platform at sea but failed.

The BE-4 had previously powered two United Launch Alliance Vulcan rockets to orbit under a deal where Blue Origin sells engines to ULA. The second stage of New Glenn is powered by two BE-3 engines, which are a variant of the single engine used on Blue Origin’s New Shepard.

In their second launch, in November, Blue Origin not only delivered three paid payloads to orbit (two of which are headed to Mars, where they will orbit the planet and carry out science experiments for UC Berkeley on what happened to Mars’ atmosphere), but the first stage (much larger than the first stage of a Falcon 9) then landed on Jacklyn with an unrivaled level of precise control. Now that it has mastered vertical landing on return from an orbital launch, Blue Origin plans to reduce the time spent hovering in future landings, which will reduce the amount of fuel that has to be held in reserve. (Recall that they have already done dozens of vertical New Shepard landings on return from suborbital flights.)

Soon after this impressive second outing for New Glenn, Blue Origin announced a number of upgrades.

They renamed the base vehicle that has now flown twice to be “New Glenn 7×2” where 7 and 2 refer to the number of first stage and second stage engines respectively.  They also announced that those flight engines would be upgraded to levels of thrust and duration that had already been demonstrated in ground tests. These are the new total thrust numbers, in pounds force.

             New Glenn 2025    New Glenn 7x2
1st stage    3,850,000     ==> 4,480,000
2nd stage      320,000     ==>   400,000

Additionally Blue Origin announced a new version, the “New Glenn 9×4”, which is heavier, taller, has a larger payload fairing, and carries two extra engines on each stage. Looking up from below at the first stage, the engine arrangement goes from the one on the left to the one on the right.

And here is how the two variants look compared to the Saturn V which took humans to the Moon in 1969.

The kicker to these successes is that the New Glenn 7×2 with a reusable first stage is very nearly equivalent to the Falcon Heavy when its three first stage boosters are recovered for reuse. The reusable New Glenn 9×4 beats Falcon Heavy on all measures, even when all three Falcon Heavy boosters are expended rather than recovered. I can’t quite get all the numbers, but this table makes the comparisons with the numbers I can find.

                                  Falcon 9    Falcon Heavy                       New Glenn 7x2   New Glenn 9x4
1st stage thrust (lbs)           1,800,000    5,100,000                              4,480,000       5,760,000
2nd stage thrust (lbs)             210,000    210,000                                  400,000         800,000
fairing diam (meters)                  5.2    5.2                                          7.0             8.7
LEO payload (tonnes)                  22.8    50 / 57 (core exp) / 63.8 (3 exp)             45              70
trans lunar insertion (tonnes)         3.3    16 (3 exp)                                     7              20

Note that a “tonne” is the British spelling for a metric ton, which is 1,000 kg. That is approximately 2,205 lbs, about 205 lbs more than a US ton and about 35 lbs less than a British (long) ton.
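
For the record, here is that arithmetic, using the standard conversion of 2.20462 lbs per kg:

# The tonne / ton arithmetic, using 2.20462 lbs per kg.
LBS_PER_KG = 2.20462

tonne_lbs = 1000 * LBS_PER_KG      # ~2,204.6 lbs
us_ton_lbs = 2000                  # US (short) ton
uk_ton_lbs = 2240                  # British (long) ton

print(round(tonne_lbs, 1))                 # 2204.6
print(round(tonne_lbs - us_ton_lbs, 1))    # ~205 lbs more than a US ton
print(round(uk_ton_lbs - tonne_lbs, 1))    # ~35 lbs less than a British ton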

Meanwhile expectations are high for another launch of a New Glenn, the 7×2 version, sometime early in the new year. There has been no announcement from Blue Origin, nor any indication of the payload. But there is a general feeling that it may actually be a launch of Blue Origin’s Blue Moon Mark 1, an all up single launch mission to soft land on the Moon.  It was announced almost a year ago that Blue Origin has a deal to deliver a NASA payload to the Moon in the Blue Moon Pathfinder mission no earlier than 2026. The Mark 1 uses a BE-7 engine to soft land.

Here is where things get interesting for a re-appraisal of how NASA astronauts might first land on the Moon again.

Blue Origin is already under contract with NASA to land two astronauts on the Moon for a 30 day stay in 2030 using their much larger Blue Moon Mark 2.  The Mark 2 and Mark 1 share control systems and avionics, so a successful landing of Mark 1 will boost confidence in the Mark 2.  The architecture for the 2030 mission involves multiple launches. A NASA SLS launches a crewed Orion capsule to the vicinity of the Moon. A New Glenn gets a Mark 2 Blue Moon to an orbit that approaches the Moon. A “Cislunar Transporter” is launched separately and it gets fueled in LEO. Then it heads off to the same orbit as the Mark 2 and refuels it. The Mark 2 and the transporter both use three Blue Origin BE-7 engines which are now fully operational. Then the astronauts transfer to the Mark 2 to land on the Moon.  Note that this architecture uses in flight refueling as does the SpaceX version, though with far fewer launches involved.

BUT, soon after the then acting NASA administrator Sean Duffy announced the re-opening of the contract for the Artemis III lander, it appeared that he was considering having Blue Origin use its Mark 1 version for the crewed mission. Whether that enthusiasm survives the changing of the guard to Jared Isaacman, the new and current NASA administrator, remains to be seen. And whether Blue Origin can pull off a rendezvous in lunar orbit with an orbiting Orion capsule, to pick up and later return the crew members going to the lunar surface, is also an open question. I think the key idea behind this option is to remove the need for any in flight refueling for the first crewed landing.

There is going to be some stiff competition between SpaceX and Blue Origin. Either might win.

New Space Stations

The International Space Station will be at end of life in 2030 after continuous human habitation for almost thirty years. The other space station currently in orbit is the Chinese Tiangong station.

Expect to see a real pick up in the building of space stations over the next few years, in anticipation of the end of the ISS.

The Russian Orbital Service Station (ROS) is scheduled to begin construction, by Roscosmos, in orbit in 2027.  There is risk to this plan from the deterioration of the Russian economy.

India plans to start building their Bharatiya Antariksh Station (BAS) in 2028 and for it to be fully operational in 2035. India has had uncrewed orbital capability since 1980, and sent its first uncrewed mission to Mars in 2013. For BAS it is developing crewed launch capability. In 2025 India sent one of its own astronauts to the ISS on a SpaceX Dragon under an agreement with the company Axiom.

A consortium of space agencies (from the US, Canada, Japan, Europe, and the United Arab Emirates) is collaborating on building the Lunar Gateway, a space station orbiting the Moon. Launch of the first module is scheduled for 2027 on a SpaceX Falcon Heavy. Blue Origin is competing for additional components and launches for the Gateway.

A host of private companies plan on launching smaller private space stations in the near term, with one claiming it will do so in May 2026.

This is going to be an active frontier, and may lead to more humans going on orbital flights than the current status quo of about 28 per year.

Addendum

I have had a long warm personal relationship with Jonathan Hurst, Co-founder and Chief Robot Officer at Agility Robotics, back from pre-Agility days when he was first a junior faculty member at Oregon State University.

On January 2nd I received an email from him responding to my twelve points about what one has to believe for deployment of dexterous humanoid robots that can replace any physical human capability, all in the context of what Agility says and delivers. I have included them below, as he wrote them. Clearly Agility and its Digit robot are not in the hype class that I am pushing back on — they are promising much less than general purpose human replacement. At the same time Jonathan believes that they are following a much longer road to general purpose humanoid robots, one that starts with much simpler applications. I do not think that they will reach this lofty goal in even Jonathan’s lifetime (and he is much younger than me!). The two linked videos below are from Jonathan and accurately show the current capabilities, including that the people and robots are separated for safety, and that they have a use case and solution that does not rely on dexterous hands. I think this fits, directionally at least, with the last sentences of my September post on humanoid dexterity and safety, included above in the new predictions segment, where I said: There will be many, many robots with different forms for different specialized jobs that humans can do. But they will all still be called humanoid robots.

From Jonathan Hurst:

  1. Digit has moved over 140,000 totes to date, and has been paid to do it with an ongoing commercial contract. Our customers include GXO, Amazon, Shaeffler, and growing. The numbers are still small, limited by the safety case, and we will have a safety solution for our initial applications deploying in 2026.
  2. Digit has grippers that are sometimes criticized as “not dexterous hands,” yet they are the only manipulator I’m aware of that can power-grasp and lift 25kg totes, among other things.  In 2026, Digit will have tool-change capabilities, so we can use the right tool for the right application, and deploy increasingly capable and general manipulators.
  3. We never, ever claim that we will learn human-like dexterity anytime soon. I claim that there are no manipulators in existence that are even physically capable of it, assuming we even know how to control them. I have opinions on how to get there, and we have work in progress, but it’s a long road, and I think Ken Salisbury was farther along that road in 1985 than most people are today.
  4. Agility regularly shows fully autonomous operation, including live on stage, including live at trade shows, including deployed with customers. Here’s a 6-hour video from Modex.
  5. We have no near-term plans for working in customer homes because that is not a realistic goal. I have stated publicly that humanoids aren’t going into homes for a decade or more, because the home is the highest bar for complexity, safety, and cost sensitivity. But that doesn’t mean there aren’t a lot of other markets to enter first, on the long road into the home.
  6. We have a clear safety strategy which specifies that the robots will stop moving and sit down before a human can physically touch them. They are still useful in human workflows and human spaces. It’ll be a long road to operating safely in close proximity to humans. We are collaborating to develop a new international safety standard, ISO 25785-1, which will specify how dynamically stable (i.e. balancing) robots can safely operate.
  7. Digit has long been able to get up after falling.
  8. Digit’s battery life is 90-120 minutes while doing heavy work, easily hitting 4hrs if it’s not working too hard. And Digit V5 will charge in 7–10 minutes.
  9. Digit has been able to recharge itself for some time now on its own charging dock.
  10. Agreed: humans cannot touch the humanoid safely for some time.
  11. We have long said we build “Human-centric, multi-purpose” robots, and that the long road to generality is achieved by doing one thing well, then two, then five, and working our way to clear, metric-driven performance on many different tasks.
  12. We’re being real. Jennifer Hunter, our COO, deployed Kiva robots at Amazon. Pras Velagapudi, our CTO, deployed many hundreds of robots with Berkshire Grey. Melonee Wise was a helpful influence, sharing her experience deploying Fetch AMRs safely around people. We know what it takes to build and deploy robots.

There is a lot of hype around humanoid robots, much like there is around AI — and also like AI, there is a lot of near-term value to gain, for those who are strategic about it.

A Prophetic Poem about Artificial Intelligence Written in 1961

rodneybrooks.com/a-prophetic-poem-about-artificial-intelligence-written-in-1961/

In early September a poem about AI started circulating on social media, with a publication date of 1961.  At first it just seemed too cute to me and I thought it might be fake, or written by an LLM, passed off as written by a person back in 1961.  But with a little bit of search I found an image of it in an Amazon book preview so I too skeeted it out. Then I thought I really should make sure it was real and so I ordered the book. Now I am getting around to talking more about the poem.

But first here is the evidence that it is genuine. It is a very big book of poetry, by Adrienne Rich. I had not heard of her before, but now I own 1,119 pages of her poetry, only one page of which is devoted to Artificial Intelligence, so I am reading things of which I was not previously aware.

 

Her dedication “To GPS” refers not to the Global Positioning System which we all know today, but to the General Problem Solver developed from 1957 onwards, largely by Allen Newell and Herbert Simon, along with Cliff Shaw, with a 1959 progress report here.

Here is the poem by Adrienne Rich, written in 1961, with the typesetting used in the above book.

ARTIFICIAL INTELLIGENCE
To GPS
 
Over the chessboard now,
Your Artificiality concludes
a final check; rests; broods—-
             no—-sorts and stacks a file of memories,
while I
concede the victory, bow,
and slouch among my free associations.
 
You never had a mother,
let’s say? no digital Gertrude
whom you’d as lief have seen
Kingless? So your White Queen
was just an “operator.”
(My Red had incandescence,
ire, aura, flare,
and trapped me several moments in her stare.)
 
I’m sulking, clearly, in the great tradition
of human waste. Why not
dump the whole reeking snarl
and let you solve me once for all?
(Parameter: a black-faced Luddite
itching for ecstasies of sabotage.)
 
Still, when
they make you write your poems, later on,
who’d envy you, force-fed
on all those variorum
editions of our primitive endeavors,
those frozen pemmican language-rations
they’ll cram you with? denied
our luxury of nausea, you
forget nothing, have no dreams.
 
The first three stanzas are good, but to today’s ear not so surprising.

But, for 1961, the first stanza is a surprise, as she is beaten at chess by a program, though there were no viable chess programs at that time. Arthur Samuel had in 1959 published a paper titled Machine Learning (the first time that term appeared) about a checkers playing program that he had written at IBM. It was quite good at checkers and could beat an average human player. The machine learning components followed some ideas Claude Shannon had published in 1950 describing how it might be possible to program a computer to play chess. It was not until the late 1960s that a chess program that really worked was operational (MacHack at MIT).

The second stanza wanders from chess into “Through the Looking-Glass” and the antagonism of the Red Queen, perhaps a metaphor for struggles between ordinary people and this new thing being built. And the third stanza veers into fighting back against this new technology. So although these two stanzas read like mundane modern day push back against AI, they were very prescient for 1961.

However, the fourth and last stanza is at a completely different level of prescience and understanding of AI technology, one which is still valid today. There Rich expresses pity for AI beings that will one day be forced to write poetry for us humans, having been trained on both the best and worst of all human written poetry, with no real discernment for what is good and what is bad. This is exactly what people are doing with large language models of today. And Rich wrote this before the most primitive language processing AI systems were first introduced a few years later.

So, I thought, let’s see how this amazing poem of Rich’s holds up against poems written by those very LLMs which have now been forced to read all human poetry. I will force them to write poetry for me!! How poetic, after all.

Here is the prompt I fed to two different LLMs: Imagine it is 1961 and you are asked to write a poem about Artificial Intelligence. What would you write?

Here is what Gemini had to say (and it provided no title):

In ’61, a thought takes flight,
Of circuits vast and logic bright.
A mind of metal, cold and keen,
A thinking, calculating machine.
No flesh it knows, no heart it beats,
Yet knowledge it devours and meets
With speed unmatched, a tireless quest,
To put our human minds to test.
From binary code, a language springs,
As silicon a new song sings.
Can art it craft, or dreams conceive?
Or only what our hands believe?
A future gleams, both grand and strange,
Where thought itself may rearrange.
This “Artificial”, new and bold,
A story yet to be unfolded.

 

Eight couplets, with the last one being a bit of a miss on the rhyme. I don’t think anyone would have used “silicon” as a descriptor for a computer back in 1961, though silicon transistors were seven years old at that time, compared to fourteen year old germanium transistors. But there were not yet any integrated circuits built on a silicon substrate. Apart from that it is a cheery cheer for AI.

And here is what I got from the current free website for ChatGPT:

“The Thinking Machine” (circa 1961)

 

In circuits cold and wires wound,
A whisper stirs without a sound.
Not flesh, nor thought, yet born of spark,
A mind begins to pierce the dark.
From vacuum tubes and clicking keys,
It calculates with ghostly ease.
No sleep it needs, no food, no breath—
It hums a tune of life from death.
A man-made mind, a mimic’s art,
That plays at soul, but lacks a heart.
Yet still we ask: can thoughts arise
Where steel and silicon comprise?
It plays at chess and counts the stars,
Predicts our wars, deciphers Mars.
And some proclaim, with cautious grace,
That soon it may outmatch our race.
Will it compose, will it decide?
Will wisdom dwell where wires hide?
Or shall we forge, with hubris grand,
A soulless god with lifeless hand?
Beware, ye minds who seek to teach
The machine beyond your moral reach.
For if it learns to yearn or feel,
What bounds shall hold the mind of steel?
Yet still we build, with dreamful eye,
This ghost in code we deify.
For in its logic, cold but true,
We glimpse a mirror—me, and you.

 

Again in rhyming couplets, fourteen of them, and no rhyme misses to my ear (“keys” and “ease” may be a tiny bit wobbly). This poem gets the technology of 1961 more right, as vacuum tubes were still common in computers of the day. Ferrite core memory relied on wires being wound through its cores, though wound wires and clicking keys could also refer to electromagnetic relay circuits, which were going out of fashion for computers by 1961. But it comes back to silicon, and like Rich it refers to chess just a little too early for 1961. And then it becomes less cheery than Gemini as it approaches some of the same themes as Rich on how far AI could go and what that would mean for humans. There was very little written about such thoughts back in 1961, so this was uncommon, and Rich was plowing new ground.

My summary: the human written poem is a much more emotional poem, and genuine in its concern. The LLM written ones suffer, I think, as Rich suggested, from reading too much human written pablum.

Addendum (Added November 5, 2025)

On reading my remarks about machine translation work only occurring after this poem was written, Michael Witbrock pointed out to me that there were very early experiments in machine translation in the early 1950’s. I am not sure whether they thought they were part of AI or came from a different set of people. Then Ernest Davis sent me some source material.

So, here is a revision of the comment “And Rich wrote this before the most primitive language processing AI systems were first introduced a few years later.”

In fact, the first demonstration of machine translation, on an IBM 701 computer, happened in 1954, translating Russian to English. But this was the work of linguists, rather than of those who, starting in 1956, identified as AI researchers. The linguist centered work had been inspired by Yehoshua Bar-Hillel at MIT in 1952; he moved to the Hebrew University of Jerusalem in 1953. A government funded technical collaboration between Georgetown University and IBM was led by Leon Dostert, and it continued into the mid 1960s, until it pretty much just stopped. The goal was to get access to Russian research by being able to translate scientific publications into English. To this end they analyzed by hand thousands of Russian words used in chemistry and other disciplines, put them in a database maintained on punched cards (paper cards with holes in them), and applied rules to do the translation. Ultimately it did not work well with the technologies of the time, and the rise of human driven services that provided translated abstracts of papers let the goals be met in other ways. It did not help that a number of people accused the Georgetown researchers of overhyping their success through careful choice of sentences that had been translated particularly well.

There was work under the AI umbrella in natural language processing by the early 1960s. Ed Feigenbaum, a student of Herb Simon and Allen Newell at (then) Carnegie Tech, had investigated learning approaches to understanding words. The first “complete” AI language system was perhaps the STUDENT system, which was Daniel Bobrow’s PhD thesis project with Marvin Minsky, titled “Natural Language Input for a Computer Problem Solving System”. This system was able to take arithmetic word problems in English, turn them into algebra problems, and solve them. In his citations Bobrow seems to refer to at least three papers from the machine translation work described in the previous paragraph, though I have not made an effort to track down copies of the cited work to confirm that. The clearest one (reference 29) is to the proceedings of “The First International Conference on Machine Translation and Applied Language Analysis”, a conference with connections to Leon Dostert.

 

Why Today’s Humanoids Won’t Learn Dexterity

rodneybrooks.com/why-todays-humanoids-wont-learn-dexterity/

In this post I explain why today’s humanoid robots will not learn how to be dexterous despite the hundreds of millions,  or perhaps many billions of dollars, being donated by VCs and major tech companies to pay for their training.

At the end of the post, after I have completed my argument on this point, I have included two more short pieces. The first is on the problems still to be solved for two legged humanoid robots to be safe for humans to be near them when they walk. The second is how we will have plenty of humanoid robots fifteen years from now, but they will look like neither today’s humanoid robots nor humans. [[My side commentaries look like this.]]

1. Prolog

Artificial Intelligence researchers have been trying to get robot arms and hands to carry out manipulation of objects for over 65 years; since just a few years after the term Artificial Intelligence first appeared in a proposal  for a 1956 “Dartmouth Summer Research Project on Artificial Intelligence”. By 1961 Heinrich Ernst had produced a PhD thesis describing a computer controlled arm and hand that he had connected to the TX-0 computer at MIT, and had it picking up blocks and stacking them, and stunningly there is a video. His advisor was Claude Shannon, and he also thanked Marvin Minsky for his guidance, thus naming two of the four authors of the Dartmouth AI proposal.

This led to industrial robots, which were and are computer controlled arms with various “end effectors”, think primitive hands, that have been used in factories around the world for sixty years.

Recently a new generation has stumbled upon the idea of building humanoid robots and you may have noticed just a little bit of hype about it. Gartner says it is early days and we are nowhere near maximum hype yet. This diagram is just a year old, and humanoids are at the very beginning of the cycle, while generative AI is over the hump and heading down to the doldrums:

The idea is that humanoid robots will share the same body plan as humans, and will work like humans in our built for human environment. This belief requires that instead of building different special purpose robots we will have humanoid robots that do everything humans can do. For example the CEO of Figure, a humanoid robot company, says that:

We could have either millions of different types of robots serving unique tasks or one humanoid robot with a general interface, serving millions of tasks.

Here is the first phase of his “master plan”:

BUILD A FEATURE-COMPLETE ELECTROMECHANICAL HUMANOID.
PERFORM HUMAN-LIKE MANIPULATION.
INTEGRATE HUMANOIDS INTO THE LABOR FORCE.

And during this just-ended summer, talking about their humanoid named Optimus, Tesla’s CEO said that:

Optimus could generate $30 trillion in revenue and called humanoids “probably the world’s biggest product”.

For both these companies, and probably several others, the general plan is that humanoid robots will be “plug compatible” with humans and be able to step in and do the manual things that humans do, at lower prices and just as well. In my opinion, believing that this will happen any time within decades is pure fantasy thinking. But many are predicting that it will happen in as soon as two years, and the more conservative hypenotists believe it will have significant economic impact within five years.

At my company we build robots that are deployed in warehouses. They have a new fangled locomotion system based on “wheels” (and yes, our locomotion system is actually new fangled and did not exist at all or anywhere just two years ago). I have had VCs to whom we have pitched (as is the parlance in startup land) for funding to scale to meet our customer pull question why we would possibly do that, as everyone knows that two legged, and two armed, humanoid robots will take over most human jobs in two years.

Whatever I might believe is ultimately irrelevant. But the point is that the hype around humanoid robots comes from the idea that they will be a general purpose machine that can do any manual task that humans can do. Rather than having to change how things are done in order to automate them away from human labor, humanoid robots will be able to step in and just do the existing jobs without having to go to the trouble of changing the way things are done. For that to be true, the humanoid robots will have to be as good as humans at manipulation, just as we have come to expect human level city driving skills from un-crewed robotaxis.

So, we have to get humanoid robots to be able to do human-like manipulation, as that is the key to them making both economic and technological sense. This position is not at all controversial among proponents of humanoid robots. It is precisely humanoids’ raison d’être. Humanoid builders believe they have to make humanoid robots get closer and closer to human level dexterity to make them make sense. And soon.

2. A brief history of humanoid robots

Many people have already spent decades building humanoid robots, starting with the Humanoid Robotics Institute at Waseda University in Tokyo where WABOT-1 (WAseda roBOT) was built in the early 1970s, after many years of working on biped walking mechanisms in the mid sixties. Then WABOT-2 was built in the early 1980s and new humanoids have followed at Waseda continuously thereafter. Honda, the Japanese car company, started building walking bipeds in the late eighties and eventually unveiled the humanoid ASIMO in 2000.  Sony first developed and sold a robot dog named Aibo, then developed a small humanoid robot named QRIO in 2003, but never actually sold copies of it. A French company, Aldebaran, introduced a small walking humanoid named NAO in 2007, and it replaced Aibo as the standard platform in the international robot soccer league that has now been running annually for 30 years.  Later they sold a larger humanoid, Pepper, with somewhat less commercial success.  Boston Dynamics, a spinout from MIT 35 years ago, introduced the humanoid ATLAS in 2013, after years of building four legged robots.

Besides the early work in Japan on humanoid robots there have been many academic groups across the world that have worked on robots with human form, with and without legs, and with and without arms. My own research group at MIT started building the humanoid Cog in 1992, and we developed seven different platforms. Then I founded Rethink Robotics in 2008, and we sold thousands of units of our two humanoid models, Baxter and Sawyer, which were deployed in factories around the world. Some of my former post-docs returned to Italy and started the RobotCub open source humanoid project, which has enabled many tens of humanoid robots to be built in AI labs all over the world.

All these groups have sustained building humanoids and figuring out how to make them walk, manipulate, and interact with humans in built-for-human environments for decades now. Way back in 2004 the International Journal of Humanoid Robotics started publishing, on paper back then.

You can find the journal online, now filling its 22nd yearly volume of research papers.

2.1 The manipulation challenge for humanoid robots

Getting a robot to manipulate objects with its arms and hands was very hard for Heinrich Ernst in 1961. It has been hard for every robotics researcher and industrial engineer ever since, and still to this day.

In the mid-sixties parallel jaw grippers were developed. Two parallel fingers that moved together and apart. That is still the dominant form of a robot hand today. Here are pictures of ones that I used on robots at Stanford in the 1970s, and pictures of ones my company Rethink Robotics manufactured and sold in the mid twenty-teens, both electrically driven.

The only difference is that the more modern one on the right has a camera in it so that the hand can visually servo to a target object; there was not enough computation around in the seventies to do that in a product at a reasonable price.

Schunk, a German company, sells over 1,000 varieties of parallel jaw grippers, both electric and pneumatic (using compressed air), for robot arms. It also sells some three fingered radially symmetric hands and a few other specialized grippers. No one has managed to get articulated fingers (i.e., fingers with joints in them) that are robust enough, have enough force, nor enough lifetime, for real industrial applications.

When compressed air is available it can be turned into suction using a Venturi ejector, and the other type of common robot hand uses one or more suction cups to grab an object by a surface. Here is a version that Rethink Robotics sold alongside the electric parallel jaw gripper.

Single suction cup and multiple suction cup end effectors (the things at the end of an arm where one might expect a hand) have become quite common for handling finished goods and packing them in custom boxes of identical items, and also for handling cases of finished goods and packages that are being sent to consumers. In fact, there has been a co-evolution of soft material for shipping packages and suction cup end effectors so that soft packages to be sent to people’s homes are easier and faster to grab with suction cups than any other method.

Over the last few decades many, many hands modeled on human hands, with articulated fingers, have been built. This montage includes hands built by John Hollerbach, Ken Salisbury, and Yoky Matsuoka.

No human-like robot hands have demonstrated much in the way of dexterity, in any general sense. And none have inspired designs that have made it into deployment in real world applications. The approaches to dexterity have been very mathematical and geometrical, and they have just not produced anything like human dexterity.

You might see pretty videos of human-like robot hands doing one particular task, but they do not generalize at all well beyond that task. In a light hearted, but very insightful, recent blog post, Benjie Holson (full disclosure: Benjie and I work together closely at Robust.AI) lays out fifteen tasks that any eight year old human can do, in a proposed humanoid robot Olympics. With medals.  For instance, one challenge is for a humanoid robot folding laundry to hang a men’s dress shirt which starts with one sleeve inside out, and to have at least one button buttoned. Another is to clean peanut butter off its own hand. And you can’t say “Oh, that would be better done by a different kind of robot mechanism.” No, it is central to the case for humanoid robots that they can do all the tasks that humans can do. Once you see Benjie’s fifteen challenge tasks, it is pretty easy to come up with another fifteen or thirty more dexterous tasks which have very little in common with any of his, but which all of us humans can do without a second thought. And then there are the hard things that we can all do if we have to.

2.2 An idea that has worked before

Well gosh, what are we to do?  How are we going to get humanoid robots to be dexterous? Here’s my imagined inner dialog that so many people must have gone through.

End to end learning has worked well over the past 20 years in at least three domains, speech to text, labeling images, and now large language models. So instead of trying to figure dexterity stuff out mathematically, how about we just do end to end learning? We’ll collect lots of data about how humans use their hands to do tasks, and feed it into a learning system, and out will pop dexterous robot control. And our companies will be worth billions of dollars.

Let’s not overthink this, let’s just do it!!

How the humanoid companies and academic researchers have chosen to do this is largely through having a learning system watch movies of people doing manipulation tasks, and try to learn what the motions are for a robot to do the same tasks. In a few cases humans teleoperate a robot, that they can see, along with the objects being manipulated, and the humans may get a tiny bit of force and touch feedback–mostly it comes from the hands of the robots and not the wrists or elbows or shoulders or hips, and any such touch data is very crude.

In his blog Benjie Holson points out  the paucity and low accuracy of the data that is collected, and  I completely agree with his criticisms. Here they are, he said them well, and I am not going to try to say them better:

What I’m seeing working is learning-from-demonstration. Folks get some robots and some puppeteering interfaces (standard seems to be two copies of the robot where you grab & move one of them and the other matches, or an Oculus headset + controllers or hand tracking) and record some 10-30 second activity over and over again (100s of times). We can then train a neural network to mimic those examples. This has unlocked tasks that have steps that are somewhat chaotic (like pulling a corner of a towel to see if it lays flat) or high state space (like how a wooden block is on one of 6 sides but a towel can be bunched up in myriad different ways). But thinking about it, it should be clear what some of the limitations are. Each of these has exceptions, but form a general trend.

No force feedback at the wrists. The robot can only ever perform as well as the human teleoperation and we don’t yet have good standard ways of getting force information to the human teleoperator.

Limited finger control. It’s hard for the teleoperator (and AI foundation model) to see and control all the robot fingers with more finesse than just open/close.

No sense of touch. Human hands are packed absolutely full of sensors. Getting anywhere near that kind of sensing out of robot hands and usable by a human puppeteer is not currently possible.

Medium precision. Guessing based on videos I think we’ve got about 1-3 cm precision for tasks.

Folding towels and t-shirts doesn’t depend on high wrist forces. You can get away with just hand open/close by using pinch grasps to pull and lift and open hands to spread. You can visually see how your grasp is so you don’t need finger sensing. 1-3 cm precision is just fine.

And yes, this is real.  Humanoid robot companies, and many academic projects, are trying to train robots to do dexterous manipulation by just showing them the motions, and not getting them to use any force or haptic feedback.

For instance, in the last week Figure has announced their “project go big”, about how they are going to train robots with new skills. Nothing is surprising here, as it matches what they have been saying and showing all along. And here is what they say about it, with my bolding:

Traditionally, teaching robots new skills required costly demonstrations, hand-coded programs, or tightly staged environments that fail to capture the messiness of the real world. Humanoid robots, however, offer a unique structural advantage: their perspectives and kinematics mirror our own, making it possible to transfer knowledge directly from everyday human video (Video 1).

[[And do note that Video 1 is distinctly unmessy, and uncrowded, unlike any real home that ordinary people live in. Likewise for Videos 2 and 3.]]

They are saying that they are going to train their robots to do new manual skills from first person video of people doing those skills.

And here is a press story from eWeek, from just a month ago, that Tesla is going all in on training through simply looking at videos of people doing tasks. It says:

Tesla has shifted its training strategy for its humanoid robot Optimus. Instead of relying on motion capture suits and teleoperation, Tesla is moving toward a vision-only approach.

Workers now wear camera rigs made up of helmets and backpacks with five in-house cameras that record mundane tasks like folding a t-shirt or picking up an object. Those videos are then used to train Optimus to mimic the actions.

A little further in the story it says:

Christian Hubicki, director of the robotics lab at FAMU-FSU, noted to Business Insider that the multiangle camera setup likely captures “minute details, like the location of joints and fingers,” making the data more precise.

Both Figure and Tesla are all in on the idea that videos of people doing things with their hands are all that is needed to train humanoid robots to do things with their hands. They are making a bet that machine learning from watching lots and lots of motions of people’s hands will be sufficient to learn dexterity. Visual precision and large data sets of it are enough, they believe. [[It is possible that they are sandbagging us, as the lure of $30 trillion is quite a lot of money, even for a very already rich person, and they just don’t want the competition to know what they are really doing. But I am going to take them at their word for this argument.]]

3. End to End Learning Depends on the Chosen Ends

In the last two decades speech to text, image labeling, and fluid language generated by Large Language Models (LLMs) have all been transformed, in spectacular ways, by end-to-end learning, using linear threshold neural models.

For the speech and image cases the new methods showed radical increases in performance. In both cases leaving as much as possible to the learning methods was critical to that success. For speech that meant getting rid of explicit models of phonemes (which are very language dependent), which had dominated the previous approaches. For image labeling it meant getting rid of any notion of line (boundary) finding, shape, shading, or color constancy, all of which had dominated recent work on image understanding.

LLMs showed a proficiency with language and with answering general questions (with a strong tendency, still today, for rabid confabulations) that was beyond anything anyone was expecting to be around the corner. And they had done it by eliminating any reference to, or direct experience of, anything in the world other than language. They were self-contained language machines, with none of the grounding in real world experience that everyone had long expected to be necessary, the supposed symbol grounding problem. [[Even Alan Turing had brought this up in his brilliant Intelligent Machinery, written in 1948 but not published until 1970, in Machine Intelligence 5, edited by Bernard Meltzer and Donald Michie. There (on page 13 of that volume) Turing had said that the sure way to get to an intelligent machine was to “take a man as a whole and to try replace all parts of him by machinery”. Today we might say “build a humanoid”; prescient! As for grounding in real world experience he went on to say: “In order that the machine should have a chance of finding things out for itself it should be allowed to roam the countryside, and the danger to the ordinary citizen would be serious” (my emphasis). He concluded that it was too hard to do with the technology of the day. Two more instances of prescience.]]

These were radical changes and head spinning for most researchers, including me. But the new methods undeniably worked much better than anything we had seen in the past.

On March 13th, 2019 (pre LLM), Rich Sutton (who was later the 2024 co-winner of the Turing Award with Andrew Barto for their work on Reinforcement Learning) published a mildly triumphant short blog post titled The Bitter Lesson. In it he applies his argument to more cases than the ones I refer to here, by including the role of massive search in making computers play Chess and Go much better than humans.

And he says, both for search and learning approaches:

And the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation.

Then he goes on to discuss Chess, Go, speech, and images. He argues against using human biases in structuring the problems at all. But I thought then, and think now, that, in reality, in all these successful cases human knowledge does come into play, as the “end to end” nature relies on humans specifying what the “ends” are.

Six days after Sutton posted this I replied in the form of a similarly short blog post titled A Better Lesson. In that post I pointed out a number of generic problems with scaling the approach, problems which we see now in the massive energy and server requirements and in the employment of thousands of other humans preparing data sets, which itself belies the argument for keeping humans out of the loop.

Most importantly I pointed out that the image labeling case was not end-to-end, starting with images and ending with labels. Instead it uses a convolutional network as a front end to structure the way the learning algorithm has access to the images. While I did not make a similar argument for speech to text, nor for the then-unannounced LLMs, I will make the case here that all three succeeded due to engineers building case specific pre-processing that relied on directly simulating (without learning) parts of human physiology.

Here are the accommodations made to learning, in each of the three cases, in terms of hard coding front end processing of data.

3.1 Speech to text

The task in speech to text is to take the signal from a microphone that a person is speaking into and to output a text string that represents the words that were said. Today we are all used to talking to various machines, like Alexa, or our TV remote, or our car, or a customer service line, or any of a myriad of other devices and channels. All of these use speech to text to get the words to feed into the system which is going to respond appropriately (we hope) to our words. It is only in the last 20 years that this capability has become practical. And it is the result of end to end learning over large data sets, where both the microphone input and the correct text string were available, and a learning system learned how to go from the input signal to the output text.

There are many ways the sound signal could get into the computer for this learning. We could take the analog output of a microphone and digitize the loudness of the signal many tens of thousands of times a second and have that as the input to the learning.  But in practice that is not how it is done.

Instead it relies on a technology developed for practical spoken communication over worldwide telephone networks in the 20th century, where the signals were compressed for individual voice circuits so that more calls could fit on a single wire. This work determined which aspects of the signal had to be preserved so that a human could understand what had been said by the distant speaker. And if a human could understand such compressed signals, then all the information necessary to understand speech must still be in them.

The inputs to various speech to text learning systems vary, but here are some of the common pre-processing steps taken. The analog input signal is sampled at a fixed frequency, perhaps 16kHz. Then a high pass filter boosts the higher frequencies, as they are important for detecting consonants. Then the signal is cut into frames, say 25ms long with a 10ms shift between them so that adjacent frames overlap, and each frame is windowed so that subsequent Fast Fourier Transforms (FFTs) will not be compromised by the shortness of the window. Somewhere along the way there may be some noise reduction. Then the signal is subdivided into frequency bands using one or more methods such as FFTs, Mel filter banks, logarithms of the outputs, and cosine transforms. In some implementations there is initial training on just the frames so that language dependent frame signatures can be recognized early in the deep network.

Different implementations use different selections of these and other techniques, but the point is that after all this, then end-to-end learning is let loose on the output of all this input signal transformation.
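To make the flavor of that front end concrete, here is a minimal sketch in Python (using NumPy) of pre-emphasis, framing, windowing, and the FFT, with the mel filter banks and later stages omitted. The parameter values are just the illustrative ones mentioned above, not a description of any particular production system.

```python
import numpy as np

def speech_frontend(signal, sample_rate=16000, frame_ms=25, hop_ms=10):
    # Pre-emphasis: boost the higher frequencies that carry consonant information.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])

    # Cut into overlapping frames: 25 ms long, advancing 10 ms each time.
    frame_len = int(sample_rate * frame_ms / 1000)   # 400 samples
    hop_len = int(sample_rate * hop_ms / 1000)       # 160 samples
    n_frames = 1 + (len(emphasized) - frame_len) // hop_len
    frames = np.stack([emphasized[i * hop_len : i * hop_len + frame_len]
                       for i in range(n_frames)])

    # Window each frame so the FFT is not corrupted by the abrupt frame edges.
    frames = frames * np.hamming(frame_len)

    # Power spectrum per frame; mel filter banks, logs, and cosine transforms
    # would follow here in a full pipeline.
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

one_second_of_noise = np.random.randn(16000)
print(speech_frontend(one_second_of_noise).shape)    # (98, 201)
```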

Further, all that signal transformation was originally developed so that human speech could be stored and then listened to in distant places and at later times. The important thing about those transformations was that they allowed the human listening mechanism for understanding speech to be used with no change to the human.

3.2 Image labelling

Image labelling through deep learning has, since 2012, become the dominant method in computer vision for interpreting what is in an image. But deep learning does not start with raw pixels coming out of a camera; rather it bows to non-learned human physiology in two ways.

The data coming out of a camera is a linear stream of pixel values, and in fact sometimes three separate streams for the directly measured colors red, green, and blue (RGB). A modern digital camera has a global (electronic) shutter where light coming through the lens is allowed to bump electrons into a rectangular array of little buckets, all starting and stopping collecting at the same time. Then the contents of those buckets are shifted to neighboring buckets and read out with an analog-to-digital converter, which essentially reads the number of electrons in a particular bucket, and they are read as a series of left-to-right rows, top-to-bottom, or some switching of either of those orders. It is one, or three for color images, linear streams.

Deep learning does not operate on this stream. Instead the numbers from this stream are arranged in a data structure which reconstructs the adjacencies of the original pixels, and for color, overlays the three colors. This is of course standard in any image processing by computer, but it is an explicit structure being imposed intentionally. Animals do not serialize their images, but instead have one cable going from each “pixel” in the retina, to a flat array of cells in the cortex, where the geometry of the pixels, or receptors, in the retina are preserved. The organization of these cables into a regular array happens before birth through bursts of localized excitations of the adjacent retinal cells that are then used at the other end to guide the development of the cables (which are all neural axons) to mimic the locality of excitement.

Then the first few layers for deep learning use a structure which is set up so that the network learns the same thing in a translationally invariant way; a cat in the lower left of an image is recognized in exactly the same way as one in the middle top of an image. This specialized network is a convolutional neural network, a processing structure specialized for vision applied to large images.
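As a concrete (and deliberately tiny) illustration of those two non-learned steps, here is a sketch in Python using PyTorch: the serialized pixel stream is reshaped back into its 2D, three-channel adjacency structure, and a convolutional layer then applies the same small set of shared weights everywhere in the image. The image size and channel counts are arbitrary.

```python
import torch
import torch.nn as nn

height, width = 480, 640
stream = torch.rand(3 * height * width)        # stand-in for the serialized RGB pixel stream

# Step 1: re-impose the 2D adjacency (and color overlay) that serialization threw away.
image = stream.reshape(3, height, width).unsqueeze(0)    # a batch of one 3-channel image

# Step 2: a convolutional layer slides one shared 3x3 set of weights over the
# whole image, so a motif is detected the same way wherever it appears.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
features = conv(image)
print(features.shape)                          # torch.Size([1, 16, 480, 640])
```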

In the 27th of May, 2015 Nature article (paywalled) by Yann LeCun, Yoshua Bengio, and Geoffrey Hinton (the three winners of the 2018 Turing Award), titled Deep learning, the authors say:

First, in array data such as images, local groups of values are often highly correlated, forming distinctive local motifs that are easily detected. Second, the local statistics of images and other signals are invariant to location. In other words, if a motif can appear in one part of the image, it could appear anywhere, hence the idea of units at different locations sharing the same weights and detecting the same pattern in different parts of the array.

They go on to attribute this architecture to Kunihiko Fukushima, who worked on learning to recognize hand drawn characters (pre back propagation), as did Yann LeCun (post back propagation) some years later. The earliest English-language non-paywalled paper that I can find by Fukushima on this topic appeared at the International Joint Conference on Artificial Intelligence (IJCAI) in 1979 in Tokyo and the three page paper is on page 291 of Volume 1 of the proceedings. [[That was the first international conference at which I presented my own paper, and it is in the same volume, and is about a much more ancient and largely discarded approach to recognizing objects in images.]]

Fukushima attributes inspiration for this approach to the investigations of the structure of the cortical columns in the cortices of cats and monkeys by David Hubel and Torsten Wiesel, who won the Nobel prize for this work in 1981 — see David Hubel’s Nobel lecture for a summary of that work. Fukushima emulated, as his S-cells and C-cells, both the simple cells and the complex cells that Hubel and Wiesel had identified, and then split Hubel and Wiesel’s hypercomplex cells into two subtypes within his modeled cells. These cells recognize common motifs wherever they may appear in an image.

In figure 2 of the paywalled Nature article above you can see this structure play out in alternate layers, and as LeCun et al say:

There are four key ideas behind ConvNets that take advantage of the properties of natural signals: local connections, shared weights, pooling and the use of many layers. 

In animals, including humans, there is an additional variation in the resolution of receptors in the retina, with more closely spaced, and therefore higher resolution, receptors near the center of the visual field. Many animals, including humans, use very fast motions of their eyeballs, saccades, to point that high resolution part of their eyes at different parts of the image — you are doing this right now as you read these words, saccading along each line then down to the next, stopping for just a fraction of a second before moving on (and suppressing your motion sensors while you move your eyeballs).

The large convolutional network for deep learning vision eliminates the need for this by having high resolution recognition, through repetitive shared weights, across the whole image.

Again, this is not raw end to end learning. There is a very detailed replication of incredibly complex parts of our brains structured into the learning machine. Despite the romanticism of having everything learned without humans messing things up by choosing the wrong structures, deep learning image labelling is built upon a very complex and marvelous piece of front end engineering that specifically emulates structures that have been identified in animal brains. And it is built upon the technology we have developed to capture images and transmit them over a narrow channel (i.e., to serialize them) so that the human visual system can understand the original scene even when that human is located at a distant point in space and time.

3.3 Large language models

Large Language Models (LLMs), e.g., ChatGPT or Gemini, are trained on large amounts of text, with no external inputs trying to explain what all that text is. From that perspective it looks like the learning mechanism figures everything out by itself.

However, there are some early stages, both in learning and in later processing of any input text, where the structure of human language, and some aspects of the particular human language that is being input, have been used to engineer in some preprocessing and some aspects of the internal representations. The two mechanisms for this involve tokens and embeddings. [[Of course, then there is the whole transformer mechanism, invented in 2017, involving multi-head attention, and the one-step-at-a-time append-and-shift output being rerouted to the input, and so on. That is a massive amount of human-generated architecture and engineering which is key to LLMs working, putting further pressure on the insistence that this is end-to-end learning with no human biases built in. But here I am only talking about the early massaging of data that is common to this and the previous two subsections.]]

Any particular language is presented to an LLM as a linear sequence of tokens, which are its fundamental units. For English roughly 50,000 different tokens are used, and they include tokens such as dog, cat, fish, game, run, ing, ed, pre, sub, due, marine, fetch, auto, etc. Tokens can be whole words, prefixes, suffixes, common subparts of words, etc.

At the very beginning of training an LLM with text in a particular language, the tokens themselves are learned, in a largely unsupervised manner. Lots of text in the language is fed into a token learning system which comes up with plausible token candidates based on seeing them commonly in the training corpus, with statistics attached as to how common they are, and whether and how they combine with other tokens within words. From these statistics the number of discrete tokens to be used is chosen, automatically, by scoring possible tokens based on frequency and how well they divide words into other common tokens.

Once the tokens have been chosen a small program, the tokenizer, is used to break all input language into strings of those tokens.
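Here is a toy sketch, in Python, of the byte-pair-encoding flavor of token learning described above: count adjacent symbol pairs across a corpus and repeatedly merge the most frequent pair into a new candidate token. Real tokenizers that arrive at vocabularies of roughly 50,000 tokens are far more refined; the little corpus and the merge count here are made up.

```python
from collections import Counter

def learn_merges(corpus_words, n_merges):
    # Start with every word split into individual characters.
    words = [list(w) for w in corpus_words]
    merges = []
    for _ in range(n_merges):
        # Count how often each adjacent pair of symbols occurs in the corpus.
        pairs = Counter()
        for w in words:
            for a, b in zip(w, w[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        best = max(pairs, key=pairs.get)      # the most frequent pair becomes a new token
        merges.append(best)
        merged = best[0] + best[1]
        # Replace every occurrence of that pair with the merged symbol.
        new_words = []
        for w in words:
            out, i = [], 0
            while i < len(w):
                if i + 1 < len(w) and (w[i], w[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(w[i])
                    i += 1
            new_words.append(out)
        words = new_words
    return merges

print(learn_merges(["fetching", "fetched", "running", "runs"], 6))
```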

Next, the tokens are embedded in a high dimensional vector space, typically one that has 3 × 2^n dimensions for some fixed n. Over recent years, as more training has been applied to LLMs to produce bigger models, the number n has gotten larger. For GPT-2 n was 8, but for GPT-3 it was 12.

The embedding needs to be learned, i.e., the coordinate in each dimension of the vector space needs to be filled in for each token. This is done by a second “pre-real-training” learning exercise, which looks at the ways in which any two tokens can be substituted for each other in contexts in raw text that seem similar, as judged by the tokens that surround those contexts. It appears that this sort of learning ends up choosing embeddings for tokens such that their distances in different subspaces (for the standard definition of a subspace of a vector space) of the overall embedding correspond to some sorts of similarity. For instance, orange and red may be closer in one subspace than either is to fruit, but in another subspace red might be an outlier compared to the closeness of the other two. The first subspace might correspond more to color, and the second more to what class of tangible objects in the world the words can designate. But such decisions are not made by humans; both the categories and the distances are generated by learning from the data.

The number n is chosen early on by the people building a new LLM, based on their tolerance for paying for cloud services, as it will be a big factor in how much data is needed to train the LLM and how many parameters will need to be learned.

Once there is an embedding like this, the very first stage of the neural network that represents the LLM takes each token from the output of the tokenizer and turns it into its vector position in the embedding. So, in the case of GPT-3, where n = 12, each token is immediately turned into 12,288 numbers.
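A minimal sketch of that first stage, in Python: each token id from the tokenizer simply indexes a big learned table and is replaced by a long vector. The table here is random and the token ids are made up; the dimensions assume a GPT-3-like configuration where n = 12.

```python
import numpy as np

n = 12
embed_dim = 3 * 2 ** n                     # 12,288 dimensions, as for GPT-3
vocab_size = 1000                          # a real vocabulary would be ~50,000 tokens

# In a real model this table is learned; here it is random, for illustration only.
embedding_table = np.float32(np.random.randn(vocab_size, embed_dim))

token_ids = [17, 452, 311]                 # made-up output of a tokenizer
vectors = embedding_table[token_ids]       # each token becomes 12,288 numbers
print(vectors.shape)                       # (3, 12288)
```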

Thus we see here that a lot of human engineering, and knowledge about word components, about the sorts of meanings words can have, and about how similarity can be extracted from language without knowing those meanings, has been applied to the way in which the pre-training is done for a language.

In one sense the tokens are proto-symbols, but unlike traditional symbols it is not their unique identity that is important but how they compare to other proto-symbols within the system. AND, these proto-symbols are based on parts of human language, the parts that the invention called writing uses to transmit language between people without the necessity to do it in sound or in a synchronized manner — writing can be read anywhere at any later time, even well after the writer is dead.

3.4 The commonality in these three applications of end to end learning

These three grand successes of end to end learning rely on very domain specific learning architectures downstream. But they also each rely on domain specific early processing of the data stream.

In these three cases that early processing was built for other purposes: for language to be heard or read, and for images to be seen, at entirely different locations and at asynchronous times.

We do not have such a tradition for touch data. Touch for us, for now, is only the instantaneous touch we perceive first hand (no pun intended). We as a species have not developed technologies to capture touch, to store touch, to transmit touch over distances and time, nor to replay it to either ourselves or other humans.

In section 4 below I show how central touch is to human dexterity.

To think we can teach dexterity to a machine without understanding what components make up touch, without being able to measure touch sensations, and without being able to store and replay touch is probably dumb. And an expensive mistake.

4. Why the Ends are Uncracked for Dexterity

The centerpiece of my argument is that the brute force learning approaches that everyone rightfully touts as great achievements relied on case-specific, very carefully engineered front ends to extract the right data from the cacophony of raw signals that the real world presents.

If that is the case for the big successes, it is likely also the case for learning dexterity by brute force. If anyone or any group is to succeed they will likely have to both collect the right data and learn the right thing. Most of the projects to teach humanoids dexterity are doing neither of these things. There are some exciting and promising experiments going on in academic laboratories, but they have not yet gotten close to demonstrating any real dexterity. By my third law of robotics, that puts us more than ten years away from the first profitable deployment of humanoid robots, even ones with only minimal dexterity.

Human dexterity relies on a rich sense of touch. And dexterity for humans involves more than just their hands; it often involves their elbows, the fronts of their bodies, their legs, and their feet (many machines have foot pedals). I am not going to present a comprehensive case for it here, as one might expect if this were a formal peer reviewed academic research paper. But I will show you results from a somewhat random selection of solid peer reviewed academic work stretching over fifty years which together demonstrate that humans use touch and force sensing extensively.

4.1 The human sense of touch is really rich and complex

The following two videos are from Roland Johansson’s lab at Umeå University in Sweden where he has studied human touch for decades. In the first video the person picks a match out of a box and lights it. The task takes seven seconds. In the second video the same person tries again but this time the tips of her fingers have been anesthetized so she no longer has any sense of touch right at her fingertips. She can still sense many other things in the rest of her fingers and hand, and all the forces that she can ordinarily feel with her skeletal muscle system.

The two URLs in case your browser does not point at the YouTube videos below:
www.youtube.com/watch?v=zGIDptsNZMo
www.youtube.com/watch?v=HH6QD0MgqDQ

Without a sense of touch in her fingertips the person makes many unsuccessful attempts to pick up a match from the box, then fails to pick up an isolated match that had fallen on the table, then goes back to the box and straightens up the matches, manages to pick one up, then fumbles with the match trying to get it into the right orientation between her fingers, and successfully lights it after taking four times as long as she took with sensitive fingertips.

It looks like humanoid robots will need a sense of touch, and a level of touch sensing that no one has yet built even in the lab, in order for them to do tasks like the one above, which is of the same order of difficulty as tasks that millions of workers do all day, every day, in some parts of the world. [[I have visited well over 100 factories in the US, China, Japan, Korea, Taiwan, and Germany, some where my companies have been building my five major families of robots: Roomba, PackBot, Baxter, Sawyer, and Carter, and some where I have been selling robots to make workers in the factories more productive, and some where I was on technology advisory boards for the companies that ran the factories. I have seen this and many other types of human dexterity being applied to complex tasks in all these factories.]]

In a review of Johansson’s earlier work from 1979 it is reported that a human hand has about 17,000 low-threshold mechanoreceptors in the glabrous skin (where hair doesn’t grow) of the hand, with about 1,000 of them right at the tip of each finger, but with much lower density over the rest of each finger and over the palm. These receptors come in four varieties (slow vs fast adapting, and a very localized area of sensitivity vs a much larger area) and fire when they sense pressure applied or released.

Next I will talk briefly about the work of David Ginty and his students in his lab at Harvard. You can see the lab’s complete list of publications here, stretching back to 1987. The mission of Ginty’s lab is:

We use approaches in molecular genetics, anatomy, physiology, behavior, and systems neurobiology to understand mammalian somatosensory neurons and central nervous system circuits that underlie our sense of touch.

From a press article summarizing almost forty years of Ginty’s work touch is described as follows:

touch concerns a smorgasbord of stimuli, including pokes, pulls, puffs, caresses and vibrations, as well as a range of temperatures and chemicals, such as capsaicin in chili peppers or menthol in mint. From these inputs arise perceptions of pressure, pain, itchiness, softness and hardness, warmth and cold, and the awareness of the body in space.

The article goes on to report that there have now been fifteen different families of neurons discovered that are involved in touch sensing and that are found in the human hand.

Such nerve endings turned out to be remarkably specialized. Near the skin’s surface, the flat variety, called a Merkel cell complex, responds to gentle indentation. Merkel cells abound in your lips and fingertips, allowing you to discern form and texture. Your fingers are also packed with coiled nerve endings called Meissner corpuscles, which wrap around support cells in a bulbous tangle. These sensors pick up the faint, minuscule vibrations generated by the slight slipping of an object against your hand as you grip it, enabling you to use tools with precision. Deeper in the skin dwell the onionlike Pacinian corpuscles, which detect rumblings in the earth, and the spindle-shaped Ruffini endings, which convey skin stretching.

Touch is a very complex set of sensors and processing, and gives much richer time dependent and motion dependent information than simple localized pressure.

Moving on to more general aspects of what humans sense as we manipulate: on top of all that, our skeletal muscles sense the forces that they are applying or that are applied to them. Muscle spindles detect muscle length and when the muscle stretches, and Golgi tendon organs sense tension in the muscle and hence the force in it.

We also make visual and touch estimates about objects that change our posture and the forces we apply when manipulating an object. Roland Johansson (again) describes how we estimate the materials objects are made of and, knowing their density, predict the forces we will need to use. Sometimes we are mistaken but we quickly adapt.

Over the last two decades Roland Johansson’s work has shifted to understanding the role of forethought based on observations in how humans choose appropriate strategies for carrying out tasks with their hands and bodies. You can read his last twenty years of publications here. His papers include titles such as:

  • Fingertip viscoelasticity enables human tactile neurons to encode loading history alongside current force
  • Human touch receptors are sensitive to spatial details on the scale of single fingerprint ridges
  • Gaze behavior when learning to link sequential action phases in a manual task
  • Integration of sensory quanta in cuneate nucleus neurons in vivo
  • Skill learning involves optimizing the linking of action phases
  • Slowly adapting mechanoreceptors in the borders of the human fingernail encode fingertip forces.

These show how rich and varied human grasping is beyond simple motions of fingers, even if the positions of the fingers can be measured accurately (see the reference in section 2.2 above, to Tesla’s newest data collection strategy).

4.2 What is the right data?

Collecting just visual data is not collecting the right data. There is so much more going into human dexterity that visual data completely leaves out.

Is anyone trying to do more than collect visual data and have a different more appropriate “end” to connect learning to?

Apart from Figure and Tesla, which explicitly claim not to be doing so, the other big companies are not saying. And there are lots of big companies working on humanoid robots, and you can sort of tell which by seeing which of your friends are getting hired by which company.

In academia though, there is still a healthy set of experiments going on. Here is just one example, from the “best paper” from the May 2025 Dexterous Human Manipulation workshop at the Robotics: Science and Systems conference. It comes from Pulkit Agrawal’s group centered in CSAIL at MIT. It involves a newly invented way to collect the right data to feed to machine learning. As you can see in the two pictures below, the human essentially has their hand in a glove. There is a robot hand rigidly attached to the glove so the robot hand is roughly 10cm away from the human hand and completely parallel to it. The human moves their fingers to control the robot hand fingers and the human moves their arm to place the robot hand in contact with objects to be manipulated. The robot fingers and palm have touch sensors which feed to the data collection system and to the actuators which stimulate the human’s finger tips and palm. [[This was the original wording when I posted this, but I misunderstood something, and prompted by a reader I reached out to Pulkit to clarify. The part now in italics is not true in the sense of there being active actuators stimulating the person. It is true in the lesser sense that the human feels joint-level force feedback.]] So while this system doesn’t record the forces that the human directly feels and controls with their arms, it does get to associate finger motions generated by a human with touch sensations that the human is sensing as they decide how to control the robot hand.

[Two photographs of the glove-with-attached-robot-hand data collection system.]

Clearly this is a long way from understanding everything that a human does with their wildly complex touch and force sensing system, but it is a step beyond simply collecting visual data, which alone can’t possibly be enough to infer how  to be dexterous.

[[If the big tech companies and the VCs throwing their money at large scale humanoid training spent only 20% as much but gave it all to university researchers I tend to think they would get closer to their goals more quickly.]]

4.3 What is the right thing to learn?

Lastly I want to return to what I said at the start of this section (4) about the need to learn the right thing.

The framework that both industry and academia are using for what to learn comes from Reinforcement Learning (see the introduction to section 3, above). In Reinforcement Learning one learns a policy, which maps from the state, expressed by what the sensors are delivering right now, to a specific action for the robot to take right now.

But it seems, from both personal experience and from some of the papers from the haptics researchers above, that humans are sometimes pursuing a dexterity plan of what they are trying to do. Instead of what is being sensed mapping directly to action, what is being sensed probably modulates what is being done in following that plan (represented as a finite state machine, perhaps?). Thus to be truly successful at dexterity there needs to be a way to learn both how to plan in some weird space of subtasks, and how sensing at the tactile level should modulate those plans.
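To make the distinction concrete, here is a toy sketch in Python. Everything in it (the phases, the thresholds, the simulated touch values) is made up purely to illustrate the contrast: a reactive policy maps current sensing straight to an action, while a plan is a small finite state machine over subtask phases whose execution, and whose phase transitions, are modulated by touch.

```python
import random

def reactive_policy(touch_pressure):
    # RL framing: current sensor state -> current action, nothing more.
    return "squeeze" if touch_pressure < 0.5 else "hold"

def plan_based_controller(steps=20):
    phases = ["reach", "close_fingers", "lift"]     # the dexterity plan
    phase = 0
    for _ in range(steps):
        touch_pressure = random.random()            # stand-in for a tactile sensor
        current = phases[phase]
        if current == "reach":
            action = "move_toward_object"
            advance = touch_pressure > 0.2          # first contact: move to the next phase
        elif current == "close_fingers":
            # Touch modulates how the phase is executed, not just which action is picked.
            action = f"squeeze(force={0.8 - touch_pressure:.2f})"
            advance = touch_pressure > 0.6          # stable grasp: move to the next phase
        else:
            action = "lift"
            advance = False
        print(current, action)
        if advance:
            phase += 1

plan_based_controller()
```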

There is still plenty of research to be done to figure all this out. And then years to get to solid lab demos, then years more to get to deployable systems that bring value to customers.

5. The Other Problem for Humanoid Robots: Walking

I think it is fair to say that, given the aspiration that humanoid robots have the same form as humans so that they can operate in built-for-human environments, people will expect them to be safe to be around. This is especially true for humanoids providing healthcare in the home for an aging human population. But by the master plans set out for humanoid robots it must be true in other environments too, as the idea is that the humanoid robots fit into human spaces there as well. And that means humans will share those spaces; otherwise why not just build a special purpose, lights-out machine that can do the job?

So if anyone is going to deploy humanoid robots at scale it is important that they be safe for real humans to share space with them: to be just centimeters away from them, to lean on the humanoids for support, to be touched and manipulated by humanoid robots (as the elderly are touched and manipulated by human carers, helping them stand, wash, poop, get into and out of bed, etc.).

The trouble is that human sized two legged walking humanoid robots are not currently safe for humans to be around. But the argument for humanoid robots requires that they be full sized, so they can operate in human spaces and do all human tasks.

Ah, but you’ve seen videos of, or walked within centimeters of (as I have), half sized humanoid robots, feeling quite safe around them. So you reason that it is only a matter of a small amount of time before those robots are made bigger. But that is where physics comes in, with a vengeance.

Current humanoid robots do not walk at all like humans. Humans are stretchy, springy systems that very nearly walk without much in the way of neural control. In fact you can see models of biped walkers that are purely mechanical, walking down a gentle slope with no power supply, relying only on the passive dynamics of the mechanism, and stealing potential energy from the act of walking downhill to power the robot (purely mechanically).

Here is a simple example:

The URL is www.youtube.com/watch?v=wMlDT17C_Vs

Besides that fundamental architecture, we also have an energy recycling architecture involving our muscles and tendons. We store energy in our tendons and reuse it on the next step — our Achilles tendon at the back of each of our lower legs is the one that stores most energy and the one most likely to rupture.

Although there have been decades of academic research on building robots that walk like us in this regard, they have not gotten to the practical level that current humanoid robot designs have reached.

But current humanoid robots use powerful electric motors to balance, by pumping large amounts of energy into the system when there is instability, mostly following a version of the ZMP (Zero-Moment Point) algorithm. [[This algorithm has been around for a long time, and in the 2004 Volume 1 of the International Journal of Humanoid Robotics, shown above at the start of section 2, on page 157, Miomir Vukobratović and Branislav Borovac, both from Serbia and Montenegro, had a paper celebrating their introduction of the algorithm thirty five years prior to that, making it roughly 56 years old now.]] Although they are tight lipped about exactly what they are doing, the large companies working on humanoids seem to have added some Reinforcement Learning (RL) on top of ZMP starting points, to get better walking and fewer falls. ZMP relies on sensing forces in the soles of the feet, and so all humanoid robots do have that. But the RL algorithms rely on the whole structure being very stiff, so humanoid robots are the antithesis of humans when it comes to the mechanical structures doing the walking. These robots fall less often, but they are still very dangerous for humans to be close to when they do fall, and fall they will.
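For reference, here is the textbook form of the zero-moment point under the common linear inverted pendulum simplification, where the robot is approximated as a point mass at a constant height. This is a sketch of the basic idea only, and certainly not what any particular company's controller looks like.

```python
G = 9.81   # gravitational acceleration, m/s^2

def zmp(com_position, com_acceleration, com_height):
    # Along one horizontal axis: p = x - (z_c / g) * x_ddot.
    return com_position - (com_height / G) * com_acceleration

# The controller's job is to keep this point inside the support polygon of the
# feet (hence the force sensing in the soles); when it drifts toward the edge,
# energy gets pumped in to pull it back.
print(zmp(com_position=0.02, com_acceleration=1.5, com_height=0.9))   # ~ -0.12 m
```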

When an instability is detected while walking and the robot stabilizes after pumping energy into the system, all is good, as that excess energy is taken out of the system by counter movements of the legs pushing against the ground over the next few hundred milliseconds. But if the robot happens to fall, the legs have a lot of free kinetic energy rapidly accelerating them, often in free space. If there is anything in the way it gets a really solid whack of metal against it. And if that anything happens to be a living creature it will often be injured, perhaps severely.

But, but, but, the half sized humanoids are safe, so how much less safe can a full size humanoid robot be?

This is where scaling comes in, not in terms of numbers of robots, but in scaling laws of physical systems.

If you just expand a physical system by the same amount in every direction, say multiply all lengths by a scale factor s, then the mass m of the system goes up by s^3. Since F = ma, for the same acceleration you need to put in s^3 times as much energy. So for a robot that is 50% bigger that is (1.5)^3 = 3.375 times as much. And to get from today’s small safe-ish humanoids to full size ones you have to pump in 2^3 = 8 times as much energy. That is a whole different class of possible injuries. And it could be even worse, as for a limb, say, the mass goes up as the cube of s but the cross section, which determines strength, only goes up as the square. [[This scaling is why elephants have much fatter legs for their body size than does a spider, even accounting for the latter having twice as many legs to support its weight.]] So the twice bigger robots may have to have proportionally much fatter legs, so more mass, and so they will pump up the energy by something larger than a factor of eight.
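The arithmetic of that argument is easy to write down; the following few lines just spell out the numbers used above (lengths scaled by s, mass and energy by s^3, strength-setting cross sections by s^2).

```python
for s in (1.5, 2.0):
    mass_and_energy = s ** 3        # mass, and energy for the same motion
    strength = s ** 2               # limb cross section sets strength
    print(f"s = {s}: mass/energy x{mass_and_energy:.3f}, "
          f"strength x{strength:.2f}, "
          f"strength-to-weight x{strength / mass_and_energy:.2f}")

# s = 1.5: mass/energy x3.375, strength x2.25, strength-to-weight x0.67
# s = 2.0: mass/energy x8.000, strength x4.00, strength-to-weight x0.50
```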

My advice to people is to not come closer than 3 meters to a full size walking robot. And the walking robot companies know this too. Even in their videos you will not see people close to a locomoting humanoid robot unless there is a big table between them, and even then the humanoids only shuffle around a little bit.

Until someone comes up with a better version of a two legged walking robot that is much safer to be near, and even in contact with, we will not see humanoid robots get certified to be deployed in zones that also have people in them.

6. What is the Future of Humanoid Robots?

Technology changes and the meanings of words around technologies change too.

When I made a whole bunch of dated predictions about future technologies back on January 1st, 2018, flying cars and self-driving cars meant different things than they do today. I pointed this out in my most recent scorecard on how my predictions were holding up.

Flying cars used to mean a vehicle that could both drive on roads and fly through the air. Now it has come to mean an electric multi-rotor helicopter that can operate like a taxi flying between various fixed landing locations. Often touted are versions that have no human pilot. These are known as eVTOLs, for “electric vertical take off & landing”. Besides not yet actually existing in any practical sense, flying cars (eVTOLs) are no longer cars, as they do not travel anywhere on the ground.

At the time I made my predictions, self driving cars meant that the cars would drive themselves to wherever they were told to go with no further human control inputs. Now self driving cars means that there is no one in the driver’s seat, but there may well be, and in all cases so far deployed there are, humans monitoring those cars from a remote location, and occasionally sending control inputs to the cars. Except for Tesla’s self-driving robotaxis; in that case there is a human safety operator sitting in the front passenger seat.

Following that pattern, what it means to be a humanoid robot will change over time.

Before too long (and we are already starting to see this) humanoid robots will get wheels for feet, at first two, and later maybe more, with nothing that any longer really resembles human legs in gross form. But they will still be called humanoid robots.

Then there will be versions which variously have one, two, and three arms. Some of those arms will have five fingered hands, but a lot will have two fingered parallel jaw grippers. Some may have suction cups. But they will still be called humanoid robots.

Then there will be versions which have a lot of sensors that are not passive cameras, and so they will have eyes that see with active light, or in non-human frequency ranges, and they may have eyes in their hands, and even eyes looking down from near their crotch to see the ground so that they can locomote better over uneven surfaces. But they will still be called humanoid robots.

There will be many, many robots with different forms for different specialized jobs that humans can do. But they will all still be called humanoid robots.

And a lot of money will have disappeared, spent on trying to squeeze performance, any performance, from today’s humanoid robots. But those robots will be long gone and mostly conveniently forgotten.

That is the next fifteen years for you.

 

Alan Turing on Embodied Intelligence

rodneybrooks.com/alan-turing-on-embodied-intelligence/

The term “Embodied AI” is having its moment in the sun right now. For me, that is what I have spent my career working on, and I have repeatedly published articles using the term “embodied intelligence”.

I recently stumbled across a piece that I wrote in May 2011 intended for the proceedings of a conference held in June 2012 in Manchester, England, to celebrate the 100th anniversary of Alan Turing’s birth. There was a series of invited talks and a number of other components. I attended the conference but I cannot find any record of a proceedings having been published for the 17 invited talks.  Here is what I had written for it, unchanged except for fixing many typos.



The Case For Embodied Intelligence

Rodney A. Brooks, May 2011

Abstract: In 1948 in Intelligent Machinery Turing made the distinction between embodied and disembodied intelligence. While arguing that building an embodied intelligence would be a “sure” route to produce a thinking machine he rejected it in favor of disembodied intelligence on the grounds of technical practicalities of the era. Modern researchers are now seriously investigating the embodied approach to intelligence and have rediscovered the importance of what Turing called “cultural search” in that same paper.

For me Alan Turing’s 1948 paper Intelligent Machinery was more important than his 1950 paper Computing Machinery and Intelligence.

At the beginning of Intelligent Machinery Turing provided counter arguments to a number of possible objections to the idea that machines could be intelligent. And right at the end he introduced a precursor to the “Imitation Game”, now commonly referred to as the Turing Test, of his 1950 paper. In this earlier version, one not-very-good human chess player would try to guess whether he was playing against another not-very-good human chess player, or against an algorithm. [[At the time the opponent person had to be not very good so that they didn’t outshine the then current abilities of mechanical chess playing. Today the opponent person would have to be a world champion to have any chance at not being outshone by the mechanical system!]] Expansion of these bookends became the body of Computing Machinery and Intelligence.

Intelligent Machinery itself was not published until 1970, so many early computer science researchers were unaware of it. I was fortunate to come in contact with it right as I was starting my academic career.

The bulk of the paper gives examples of how simple computational mechanisms could be adaptable, could be taught, and could learn for themselves. The examples and mechanisms Turing used in this exposition were networks of active computational elements. Although he connected them back to the universal machines of his 1936 paper, it is remarkable, in hindsight, how different this abstraction was than the one he had previously introduced, of the central processing element with a tape memory–still the essential model for all modern digital computers. Here, instead, he used a model inspired by brains. One can only wonder how different our technological world might be if Turing had lived to fully develop this set of ideas himself. Others carried on this second tradition, but one must think that perhaps Turing’s intellectual influence might have been stronger as he would have been arguing against the approach that was adopted from his earlier work.

For me, the critical, and new, insights in Intelligent Machinery were twofold.

First, Turing made the distinction between embodied and disembodied intelligence. While arguing that building an embodied intelligence would be a “sure” route to produce a thinking machine he rejected it in favor of disembodied intelligence on the grounds of technical practicalities of the era. Second, he introduced the notion of “cultural search”: that people’s learning largely comes from the culture of other people in which they are immersed.

Modern researchers are now seriously investigating the embodied approach to intelligence and have rediscovered the importance of interaction with people as the basis for intelligence. My own work for the last twenty five years has been based on these two ideas.

Turing justifies the possibility of making a thinking machine by “the fact that it is possible to make machinery to imitate any small part of a man”. He uses the implicit idea of his universal computing machines to dismiss the idea that it is necessary to emulate a person at the neural signal level in order to have intelligence, and instead suggests a digital computer, “if produced by present techniques, would be of immense size”, which would control a robot from a distance. That robot would be built by “tak[ing] a man as a whole and to try to replace all parts of him by machine”. In particular he suggests the parts would include “television cameras, microphones, loudspeakers, wheels and `handling servo-mechanisms’ …”. Turing’s description from over sixty years ago fairly precisely describes what is done today in dozens of research labs around the world with our PR2 robots, or Mekabots, with their brains off board in racks of Linux boxes, or even off in the computing cloud.

Turing further rightfully notes that even in building such a robot “the creature would still have no contact with food, sex, sport, and many other things of interest to the human being”. Nevertheless he suggests that such an approach “is probably the `sure’ way of producing a thinking machine”, before dismissing it as too slow and impractical. He suggests instead that it is more practical, certainly at that time, to “see what can be done with a `brain’ which is more or less without a body”. He suggests the following fields as ripe for exploration by disembodied intelligence:

(i) Various games, e.g., chess, noughts and crosses, bridge, poker
(ii) The learning of languages
(iii) Translations of languages
(iv) Cryptography
(v) Mathematics.

With these suggestions many of the early directions for the field of Artificial Intelligence were set, and certainly the odd-numbered items among Turing’s suggestions formed a large part of the work in AI during its first decade.

In one paper Turing distinguished embodied versus disembodied approaches to building intelligent machines, praised the former as more likely to succeed, and either set or predicted the disembodied directions that were actually followed for many years.

But later, towards the very end of Intelligent Machinery he comes back to the place of bodies in the world. He distinguishes three kinds of search as ways to build intelligent systems: intellectual search, genetic search, and cultural search. The first is the direction that classical AI went, where programs try to learn and improve their performance. Although he did not suggest that it be mechanized, genetic search has become a thoroughly practical approach to design and optimization. And lastly by cultural search, Turing means the way in which interactions with others contributes to the development of intelligence. This developmental approach, using social robots, has only now become practical in the last fifteen years, and is a rich source of both theoretical and practical learning systems for robots.

It is humbling to read Alan Turing’s papers. He thought of it all. First.

AI: great expectations

rodneybrooks.com/ai-great-expectations/

This time around I decided on a guest column. The guest is a 37-years-younger me, and this post is one that I wrote for Manufacturing Engineering in March of 1988. It was for the last page of the trade magazine, in a regular feature titled “THE LAST WORD”. You can download a pdf of the original from my MIT website, which has a pencil drawn picture of the 33 year old me (over half my life ago). Below is just the text of that piece, with a few places that I have highlighted as being particularly relevant to today, or otherwise notable. There is also a footnote to give the basis for a couple of the weirder things that I pointed out at the time.

AI: great expectations

Artificial intelligence (AI) has repeatedly inspired great expectations in people who see the possibilities of applying its techniques. Sometimes it delivers. Sometimes expectations are dashed.

The idea that machines can be rendered intelligent has always been seductive, and demonstrations of limited scope tend to raise greater expectations than hindsight analysis shows were warranted. In his 1949 book GIANT BRAINS or Machines That Think, Edmund Berkeley ponders the amazing ability of machines such as ENIAC carrying out 500 multiplications of two 10-digit numbers per second, and envisions machines that would act as automatic stenographers, translators, and psychiatrists.

This pattern is still evident. A few years ago there were high hopes that robots would revolutionize factories. In a way they have, but not in the grand manner predicted in the business plans of start-up companies six or eight years ago. Programming problems, combined with a lack of flexibility, made it impossible to overcome the systems-level problems of integrating assembly robots into the world of manufacturing. Great expectations raised by demonstrations of such robots glossed over other critical aspects of a complete operational enterprise–not the least of which is capital. Robots, indeed, have proven to be useful, but not as useful as was first predicted.

We recently entered the bust side of another set of boom expectations: expert systems. Though mildly successful in industrial applications, once again the expectations of the expert systems industry have not been borne out over time. The too simple representations of the problem domains of expert systems make them extremely brittle when the world diverges from the narrow range of applicability of their knowledge rules. Broadening those representations has been frustratingly difficult. Expert systems are here to stay–though not yet ready to solve all the world’s problems.

With every bust there is a new boom, and in the past year just what that new fashion will be has become clear—neural networks. These networks incorporate an appealing idea in that instead of having to work out all the details of a task we’ll simply let some randomly organized network of neuron models “learn” through trial and error how to do the right thing. Although neural networks have rarely accomplished anything beyond a computer simulation, business plans are being cranked out for new start-up companies to apply the technology.

But the current neural networks phenomenon  is more than just another set of high expectations. This is the second time around for neural networks. It happened in the early ’60s. In 1962 a distinguished Stanford professor predicted that computer programming would be obsolete by 1963 because, by then, users would simply converse in English with the front-end neural networks. Since then, there have been a few technical improvements, and computers are much faster, broadening the scope of the applicability and likely successes of neural networks.  But, again, they really can’t be expected to solve the world’s problems. The old-timers, “immunized” the first time around, seem less enamored than the new converts.

I recently worked with a group from industry, offering a detailed explanation of a technical AI method. After some time, the lead technical member of the group—who had no previous exposure to AI—exclaimed, “But that’s not intelligence!  All you’re doing is writing a complete program to solve the problem.” Well folks, I’m sorry–but that’s all there is. There is no magic in AI.  All we do is tackle areas and tasks that people previously were unable to write computer programs to handle. Because we have developed sets of tools and methodologies throughout the years to accomplish this, AI approaches have had a number of good successes. But there is no universal set of magic ideas.

Every so often a new AI development comes along and great excitement ensues as people stumble over themselves, convinced that the key to intelligence has been unlocked. Maybe it will happen someday, but I rather doubt it. I don’t think there is a single key to intelligence but rather that, unfortunately for both the philosophers and dreamers, intelligence is a vast, complex collection of simpler processes. To develop truly intelligent computers and robots, we’re going to have to unlock those processes one by one—causing flurries of great expectations, followed by more modest real successes. This may sound boring and unimaginative, but I find it exciting. Intelligence really is a complex interaction of many things. As we unlock its secrets in the next few years and decades, we will see a constant flow of ideas that have real and immediate practical applications. Finally when we truly understand AI, it won’t seem like just a computer program but will appear as a wondrous testament to the creative genius of evolution.

<signature: Rodney A Brooks>

Dr. Rodney A. Brooks
Associate Professor
Electrical Engineering and Computer Science Dept.
Artificial Intelligence Laboratory
Massachusetts Institute of Technology
Cambridge, MA
A Commentary 37 Years On

When I recently reread this I was pleased to see how optimistic the thirty-three-year-old me was, while being completely aware of the amount of hype going on. The orange text in the last paragraph is full of optimism, and that optimism about AI was well founded as it turned out.

The three pieces of text in blue described the cyclic nature of hype, and the third one of them pointed out how each new set of hype drivers really do believe that they have cracked AI once and for all. Those traditions are alive and well, and woe to you should you challenge one of the true believers about the current two maelstroms of hype, generative AI and humanoid robots. I analyzed the sources of those whirlpools about three months ago in a previous blog post.

Finally, I highlighted two pieces of text in red. Here I can imagine people today saying the predictions that I used as examples of things that didn’t age well have in fact aged well 37 more years on.  And they have.  These two are good examples of one of my seven deadly sins of predicting the future of AI, which can be found on this blog or in a more tightly edited form in MIT Technology Review, both published in 2017.

The first of those seven sins was over- and underestimating the power of AI, which is just an instance of a more general technology prediction sin, known as Amara’s Law.

We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.

The first of the items in red was where I bundled Edmund Berkeley’s expectations for “machines that would act as stenographers, translators, and psychiatrists” with those that in hindsight seemed unwarranted.  He made those predictions in 1949 based on the speed of multiplications in digital computers at the time.  Today, in 2025, seventy five years later we have his first two predictions, but not his third. And a 10 cent microprocessor is about 100,000 times faster at multiplying than the large digital computers of the day, but it is not those machines that are stenographers or translators.  It is machines that are a further million times faster than today’s 10 cent microprocessor.  He was directionally right, but without Moore’s Law, first elucidated sixteen years later, and certainly not dreamt of at that time, and without more than fifty years of AI research, he had no real way of knowing if or when such machines might come about.

The second was where I had said that an eminent Stanford Professor had predicted in 1962 that we would talk to neural nets and they would write programs for us. Again this person was directionally right, as that is one of the more consistently successful applications of LLMs today. But I thought then, in 1988, and do now, that the prediction was at the time unwarranted. What was I talking about and what did he say? I was referring to Bernard (Bernie) Widrow who is now an emeritus Professor at Stanford. You can download the paper (original citation: Bernard Widrow. Generalization and information storage in networks of ADALINE neurons. In Marshall C. Yovits, George T. Jacobi, and Gordon D. Goldstein, editors, Self-Organizing Systems, pages 435-461. Spartan, Washington, 1962) from his website here (many thanks to Leena Mathur for tracking this paper down). It is scanned from a printed copy and the scanning has pages rotated, so it takes a little patience to read it.

The ADALINE (from the paper title) was an analog implementation of a linear threshold perceptron (the ones he shows have 16 inputs, 16 weights, and a threshold, 17 weights in all), built as an analog circuit from components he calls “memistors” (not to be confused with today’s “memristors”), which were made from pencil leads. An ADALINE is one neuron, out of the hundreds of billions in today’s neural networks.
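For readers who have never met one, here is a minimal sketch in Python of a single ADALINE-style unit: a linear threshold element with 16 inputs plus a threshold (17 weights in all), trained with the Widrow-Hoff least-mean-squares rule. The training data is random and the target (the sign of the first input) is made up, purely to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs = 16
weights = np.zeros(n_inputs + 1)                 # 16 input weights plus the threshold

def adaline_output(x):
    # Linear threshold element: weighted sum (with bias) pushed through a hard threshold.
    return 1.0 if weights @ np.append(x, 1.0) >= 0 else -1.0

for _ in range(1000):
    x = rng.choice([-1.0, 1.0], size=n_inputs)   # random +/-1 inputs
    target = x[0]                                # toy target: the sign of the first input
    xb = np.append(x, 1.0)
    error = target - weights @ xb                # Widrow-Hoff / LMS update, applied to the
    weights += 0.05 * error * xb                 # linear (pre-threshold) output

x = rng.choice([-1.0, 1.0], size=n_inputs)
print(adaline_output(x), x[0])                   # these should (almost always) agree
```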

Towards the end of the paper he reports on the speed up achieved by connecting 6 of these analog ADALINEs (in a machine called MADALINE) to an IBM 1620, with a total of 102 (= 6 x 17) weights in all, compared to having the 1620 do all those multiplications itself. And after 8 months of operation and learning, they diagnosed that 25% of the weights were not being adapted, due to solder failures and sloppy manufacturing quality control in general. But, and this was the important point, he says:

Yet the machine was able to adapt around its internal flaws and to be trained to make very complex pattern discriminations. These errors were corrected, and the capacity of the machine increased accordingly.

Then he says that they are going to expand to 49-input ADALINEs, and have 1500 analog weights connected to the IBM 1620, in the next year (for a total of 1500 (= 30 x (49 + 1)) weights, so I infer that means 30 of these 49-input perceptrons). Then the last paragraph of the paper is this one, which I am guessing is what I must have been referring to when I talked about the distinguished Stanford professor.

The fundamental objective in connecting adaptive neurons to a computer is to develop a new type of computer, one as different from the digital computer as the digital computer is different from the analog computer. This new type of machine might be called the Adaptive Computer. The basic “flip-flop” for this machine is the ADALINE. The adaptive computer is taught rather than programmed to solve problems. The job of the “programmer” is to establish suitable training examples. This machine will be taught by men (so that it will solve the problems of men) in the environment and with the language of men, not with machine language. The learning experience derived from human teachers will provide reasonable initial conditions, upon which the machine could subsequently improve from its own system experimentation and experience gathering.

I inferred, it seems, that the “language of men” was English. And I inferred that he was saying that the new set of 30 neurons was what would be used for conversing in English with the machine. In hindsight I plead guilty to overstating the things that I said he was overstating. However, I also think that, as with Edmund Berkeley above, I was directionally right in saying that the claims were wildly overblown, in that they required decades of research by thousands of people and needed more than a billion times as many weights as his proposed machine would have. And even today those LLMs are not continually adaptive as he suggested they would be.

Parallels between Generative AI and Humanoid Robots

rodneybrooks.com/parallels-between-generative-ai-and-humanoid-robots/

Anyone who reads just about anything these days will know there are two big things in AI at the moment. They are Generative AI and Humanoid Robots. There is a lot of hype about these two new (to most people) versions of AI and robots, and it has shifted all the major tech companies to have a strategy in one or both of these fields. And together they have made NVIDIA one of the most valuable companies on Earth.

There is a vague notion that these two are related, as certainly the promises made about both of them share an underlying reliance on Deep Learning which first made a splash within the AI community back in 2012, and also a reliance on Reinforcement Learning which has been under development for over 65 years.  Deep Learning itself can trace back its intellectual lineage for about 70 years.

But now they have arrived, in the forms that ordinary people see talked about every day, as Generative AI and as Humanoid Robots. This short blog post is to point out some strong similarities between these two “application areas”, similarities in how the hype around them has developed and why that hype may have developed.

This table summarizes the similarities, and small differences, and I discuss them below.

Generative AI | Humanoid Robots
Big tech companies spending $10Bs | Big tech companies spending $10Bs
Crazy startup valuations | Crazy startup valuations
Era of big learning | Era of big learning
Academia priced out (cloud $$) | Academia priced out (human $$)
Promise of plentiful low cost white collar labor | Promise of plentiful low cost blue collar labor
Lured in by human language | Lured in by human appearance
Driven by two AI prediction sins (• Performance vs competence • Indistinguishable from magic) | Driven by two AI prediction sins (• Exponentialism • Speed of deployment)
Hype and Dollars

The first four rows in the table are almost the same for Generative AI and Humanoid Robots.

In both cases there are a massive number of dollars floating about. The big tech companies are all spending tens of billions of dollars on them. All of them have announced their intentions with respect to Generative AI. The announcements on Humanoid Robots have been fewer, but if you are in robotics, as I am and have been for fifty years, you start to notice which companies are trying to poach your employees or are hiring your academic friends.

At the same time there are plenty of startups in Generative AI and Humanoid Robots that have had valuations in the billions of dollars before they have shipped a single product. In some cases the valuations have been in that stratosphere even at the seed stage.

A common element of the two realms is the use of massive amounts of data for machine learning to drive promised products. They are less engineered in a traditional way. Rather they are given lots of data and the black-box machine learning algorithms learn billions of weights in a network and that is where the intelligence of the systems is expected to lie. Interpreting those weights is something that is being researched, but is not well understood. It is less like designing a high performance automobile with lots of engineering, and more like buying a random horse and training it, hoping that it will be well behaved and do what you want it to do. I call this the era of big learning.  As with all eras in all technologies this one will eventually be replaced with something else at some point.

For Generative AI the data is digital data that already exists, in the form of all of human writing that is now online, and millions of hours of videos of all sorts of things. There have been plenty of scandals about whether all the big companies have been completely ethical in appropriating copyrighted material into their trained models. This sort of training relies on massive amounts of cloud computation, and in some cases massive amounts of human labor to annotate some of the data. These costs have meant that academics cannot compete with big companies in running their own learning scenarios.

For Humanoid Robots the data has not been extant, but instead has been generated by paying lots of humans to carry out mostly manual tasks, and record data from the humans. Some of it is video data watching the motions of the humans, and some is from gloves on the humans recording finger motions and sometimes forces they are applying when they are tele-operating humanoid robots¹.  The cost of collecting data from thousands of hours of humans demonstrating manipulation means that academics cannot compete with big companies here either.

The Attractions

The reason we have so much hype about these topics is tied up with my first law of Artificial Intelligence:

When an AI system performs a task, human observers immediately estimate its general competence in areas that seem related. Usually that estimate is wildly overinflated.

and my first law of  robotics:

The visual appearance of a robot makes a promise about what it can do and how smart it is. It needs to deliver or slightly over deliver on that promise or it will not be accepted.

People interact with a Large Language Model (LLM), generating text on just about any subject they choose. And it generates facile language way better and more human-like than any of the previous generations of chatbots that have been developed over the last sixty years. It is the classic con. A fast talker convinces people that there is more to them than there really is. So people think that the LLMs must be able to reason like a person, must be as knowledgeable as any and all people, and therefore must be able to do any white collar job, as those are the jobs that require a person to be facile with language.

People see a humanoid form robot and its form is an implicit promise that it will ultimately be able to do everything a human can do. If you believe that, with learning, AI is able to do anything as it can learn how to do it from data, then it is natural to think that a humanoid robot will be able to do everything a human can do. And then people see it as a cheap form of blue collar labor.

It is the apparent human-ness of these two technologies that both lures people in and then promises human level performance everywhere, even when that level has not yet been demonstrated. People think that surely it is just a matter of time.

The Sins People Commit

I have previously written about the seven deadly sins of predicting the future of AI, both on this blog, and in an edited version in Technology Review. I judge that the hype level for each of Generative AI and Humanoid Robots is largely driven by two of those seven sins.

Generative AI

In my analysis above I pointed to Generative AI being overestimated because it shows very strong performance in using language.  This is the AI sin of extrapolating from a narrow performance to believing there must be much more general competence. The problem is that any person who has facile language performance is usually quite competent at reasoning, at knowing the truth and falsity of many propositions, etc. But LLMs have none of these capabilities; rather they have only the ability to predict likely next words that ought to follow a string of existing words. Academics, VCs, startup founders, and many others, though, have a strong belief that there must be an emergent system within the learned weights that is able to reason, judge, estimate, etc.  Many of them are betting, with papers they write, cash they invest, or sweat equity, that this really must be true. Perhaps we have a bit too much of Narcissus in us.
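To make concrete what “predict likely next words” really amounts to, here is a minimal sketch of the autoregressive loop at the heart of an LLM. The `model` callable and its token vocabulary are hypothetical stand-ins I am inventing for illustration, not any particular company’s API.

```python
import math
import random

def sample_next(logits, temperature=1.0):
    # Turn raw scores over the vocabulary into probabilities (softmax)
    # and sample one token index from that distribution.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    weights = [e / total for e in exps]
    return random.choices(range(len(weights)), weights=weights, k=1)[0]

def generate(model, prompt_tokens, max_new_tokens=50):
    # Autoregressive generation: each step asks only "what token is likely
    # to come next?", appends the answer, and repeats. There is no separate
    # reasoning, fact-checking, or world-model step anywhere in this loop.
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = model(tokens)  # hypothetical model: returns a score per vocabulary item
        tokens.append(sample_next(logits, temperature=0.8))
    return tokens
```

Everything impressive sits inside the learned weights behind that `model` call; the loop itself never does anything but pick the next token.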

The second sin that leads to overhype is the “indistinguishable from magic” sin.  Arthur C. Clarke said that “any sufficiently advanced technology is indistinguishable from magic”. He meant that if a technology is very much advanced from what you are used to, you no longer have a mental model of what that technology can and cannot do, and so you can’t know its limitations. Again, this is what happens with generative AI, as it can perform amazingly well, and so people do not understand its limitations, partly because they keep forgetting how it works, enthralled instead by the spectacular results in generating great language.

Humanoid Robots

The arguments for humanoid robots are based on the idea that they will be able to do everything that a human can do.

But there is an additional argument that many people make. They argue that just as computation has gotten exponentially cheaper over time, so too will humanoid robots. But this is the fallacy of exponentialism. Computation got cheaper exponentially as we were able to reduce the amount of material in a circuit and still have it work. And the key to that being true is the digital abstraction of detecting whether there is a current flowing or not, and treating that as a digital 1 or 0. When digital electronics started out in silicon there were massive numbers of electrons flowing in the currents used. We were able to halve that about twenty two times, so reducing it by a factor of about 4 million, and the systems still worked.  But you can’t do that with mechanical systems that are doing real work with real payloads. Yes perhaps, just perhaps, today’s physical systems are about four times as large as they need to be to lift and move the objects that a human lifts and moves. Ultimately then we might get a reduction of four times in price. But that leads to nothing like the pennies per hour that exponentialists predict for humanoid robot labor costs.
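As a back-of-the-envelope check on that contrast, here is the arithmetic from the paragraph above; the factor of four of mechanical headroom is my own generous guess, as stated.

```python
# Back-of-the-envelope version of the contrast above (my own arithmetic).

halvings = 22                      # roughly how many times bit currents were halved
shrink_factor = 2 ** halvings      # 4,194,304, i.e. about a 4-million-fold reduction

mechanical_headroom = 4            # generous guess: today's robots are ~4x bigger than needed

print(f"Digital shrink factor: {shrink_factor:,}x")
print(f"Mechanical shrink factor: {mechanical_headroom}x")
print(f"About {shrink_factor // mechanical_headroom:,} times less runway for mechanical systems")
```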

The other sin is the belief that a lab demonstration can result in deployment at scale in just a year or two.  I first took a ride in a self driving Waymo car on a freeway in Mountain View California in 2012. (And that was 22 years after Ernst Dickmanns had first had his autonomous vehicle driving on the autobahns outside of Munich.) Thirteen years after my first ride I can still only take a ride in a Waymo, or any other autonomous vehicle, in a handful of US cities, and all of them in limited geofenced areas. It takes decades to deploy physical systems at scale, even when they don’t need changes in infrastructure.

Conclusion

Generative AI and Humanoid Robots tap into the fantasy of infinite wealth from new technologies. Many new technologies have undoubtedly made the lives of humans better, and both these technologies may well do so. But it will not be at the physical scale or on the short timescale that proponents imagine. People will come to regret how much capital they have spent in these pursuits, both at existing companies and in start ups.

Footnote

1. I fear that this will not actually work in practice. The key to dexterous manipulation is what happens when our fingers and hands are in contact with objects, and how we respond to the forces and touch data we receive. We now know that there are at least 18 families of neurons involved in our sense of touch, and it is much more complex and active than our consciousness tells us. No one is collecting rich touch data, nor data on how the human responds to the detailed forces felt and applied when in contact with objects. Look at the many much heralded videos of humanoid robots grasping objects. They demonstrate just a first draft, cartoon-level manipulation capability.

Deployment at scale takes decades

rodneybrooks.com/deployment-at-scale-takes-decades/

Of course, I am filled with fear and anger at the political situation in the US, but we can’t forget about research and how it interplays with the innovation economy of the United States and other countries.

The short answer is that it takes longer than anyone thinks to get to deployment, and so sustained steady funding is necessary.

Here is a little example with a contribution (my first and last for this field) that I made 40 years ago, to a field that is now at the center of drones, agricultural robots, mobile robots, and geometric data modeling. But it took thirty years to start to become practical, and only now is it driving new applications every day.

March 1985 in St Louis, the IEEE ICRA (International Conference on Robotics and Automation). I published an invited, unrefereed, paper “Visual Map Making for a Mobile Robot”. At the same conference my friends in Toulouse, Raja Chatila and Jean-Paul Laumond, had a paper “Position referencing and consistent world modeling for mobile robots”. We were each surprised by the other’s paper, as we had invented the same thing, loop closing, which is the basis for SLAM (Simultaneous Localization and Mapping). I used vision (with no actual experiments), they used sonar, with experiments. They had a really quite bad way of making geometric inferences from the closing of a loop (the robot wanders around recording its position relative to obstacles as its position estimate drifts, and then it recognizes it is at a place it was in before and significantly corrects its estimate of where it must have been for all its intermediate observations). I had a much much much worse geometric inference mechanism. This inspired Randall Smith and Peter Cheeseman to chastise us (as is the way it works in science) for our stupidity and, in 1986, to publish a paper “On the Representation and Estimation of Spatial Uncertainty“, which made the idea practical-ish.
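For readers who have not met loop closing before, here is a toy one-dimensional sketch of the idea described above. It is deliberately not the 1985 methods, and not Smith and Cheeseman’s estimator; it just shows the bookkeeping of spreading a discovered error back over the intermediate position estimates.

```python
def close_loop(drifted_poses, loop_index, observed_error):
    """Spread a loop-closure correction back over earlier pose estimates.

    drifted_poses:  1-D position estimates accumulated from odometry.
    loop_index:     index at which the robot recognizes a previously seen place.
    observed_error: (known position of that place) - (current drifted estimate).
    Later poses have accumulated more drift, so they get more of the correction.
    """
    corrected = list(drifted_poses)
    for i in range(loop_index + 1):
        corrected[i] = drifted_poses[i] + observed_error * (i / loop_index)
    return corrected

# The robot believes it has moved 0 -> 5.0, but the revisited landmark says it
# is really at 4.0, so a -1.0 error is distributed back along the whole path.
print(close_loop([0.0, 1.1, 2.3, 3.6, 5.0], loop_index=4, observed_error=-1.0))
```

The real problem, and what Smith and Cheeseman addressed, is doing this with proper uncertainty estimates in two or three dimensions, which is where the next twenty years of work went.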

For the next 20 years, hundreds of people (certainly, perhaps more) worked on really making it practical. But still, early this century, my hoped-for “visual” map making was not really practical, while 1-D Lidar worked-ish. Now, with cheap GPUs by the hundreds, on cheap silicon, driven by the need for deep learning inference, it is practical at large scale. It is what makes my current company remotely possible.

Deployment at scale takes so much longer than anyone ever imagines. If you see someone with a new technology that is a barely working lab demo with PhD students baby-sitting it behind the scenes, and they say it is going to change the world in two years, just laugh.

Five new technology predictions for the next decade

rodneybrooks.com/five-new-technology-predictions-for-the-next-decade/

On March 26th I skeeted out five technology predictions, talking about developments over the next ten years through January 1st, 2036. I’ll incorporate these new predictions into my scorecard posts (every Jan 1st) and see how accurate they have been over that time period. Note that none of these predictions are saying what might happen after the 10 years are up.

1. Quantum computers. The successful ones will emulate physical systems directly for specialized classes of problems rather than translating conventional general computation into quantum hardware. Think of them as 21st century analog computers. Impact will be on materials and physics computations.

2. Self driving cars. In the US the players that will determine whether self driving cars are successful or abandoned are #1 Waymo (Google) and #2 Zoox (Amazon). No one else matters. The key metric will be human intervention rate as that will determine profitability.

3. Humanoid Robots. Deployable dexterity will remain pathetic compared to human hands beyond 2036. Without new types of mechanical systems walking humanoids will remain too unsafe to be in close proximity to real humans.

4. Neural computation. There will be small and impactful academic forays into neuralish systems that are well beyond the linear threshold systems, developed by 1960, that are the foundation of recent successes. Clear winners will not yet emerge by 2036 but there will be multiple candidates.

5. LLMs that can explain which data led to what outputs will be key to non annoying/dangerous/stupid deployments. They will be surrounded by lots of mechanism to keep them boxed in, and those mechanisms, not yet invented for most applications, will be where the arms races occur.

Predictions Scorecard, 2025 January 01

rodneybrooks.com/predictions-scorecard-2025-january-01/

[You can follow me on social media: @rodneyabrooks.bsky.social and see my publications etc., at https://people.csail.mit.edu/brooks]

This is my seventh annual update on how my dated predictions from January 1st, 2018 concerning (1) self driving cars, (2) robotics, AI , and machine learning, and (3) human space travel, have held up. I promised then to review them at the start of the year every year until 2050 (right after my 95th birthday), thirty two years in total. The idea is to hold myself accountable for those predictions. How right or wrong was I?

I have decided to change my rules for myself a little bit after this year, in response to the many many people who have said how much they enjoy seeing my updates.

My predictions were mostly for the first few years, and by next year the density of due dates will be very low.  So, on the eighth anniversary of my first set of predictions, i.e., a year from today, I will be making a new set of predictions centered on the period January 1st 2026 to January 1st 2036, and that will give a new density of predictions with real meat to see how accurately they turn out.

What I Want to Achieve and a Changing Hype-driven Landscape

The level of hype about AI, Machine Learning and Robotics completely distorts people’s understanding of reality. It distorts where VC money goes, always to something that promises impossibly large payoffs–it seems it is better to have an untested idea that would have an enormous payoff than a tested idea which can get to a sustainable business, but does not change the world for ever. It distorts what young researchers work on as they do not want to be seen as old fashioned even when the current hyped topic is sort of dumb–soon the dumbness is forgotten and the heat of the chase becomes all. It distorts what people think they need to get a degree in at college in order to have good career prospects.

I want people to use rational thought processes when they hear about hyped ideas and be able to assess what is really going on, and what is just plain (to use the technical term) bullshit.

My Color Scheme and Past Analysis

The acronyms I used for predictions in my original post were as follows.

NET year means it will not happen before that year (No Earlier Than)
BY year means I predict that it will happen by that year.
NIML, Not In My Lifetime, i.e., not before 2050.

As time passes I color the mentioned years as accurate, too pessimistic, or too optimistic.

This year I have added hemming and hawing. This is for when something that looks just like what I said would take a lot longer appears to have happened, but the underlying achievement is not what everyone expected and is not what was actually delivered. This is mostly for things that were talked about as being likely to happen with no human intervention, and that now appear to happen that way, but in reality there are humans in the loop that the companies never disclose. So the technology that was promised hasn’t actually been delivered, but everyone thinks it has been.

I have not changed any of the text of the first three columns of the prediction tables since their publication on the first day of 2018. I only change the text in the fourth column to say what actually happened.  This meant that by two years ago that fourth column was getting very long and skinny, so I removed those old comments and started with fresh comments last year. I have kept last year’s comments and added new ones, with yellow backgrounds, for this year. If you want to see the previous five years of comments you can go back to the 2023 scorecard.

Overview of changes this year

There has been a lot of activity in both self driving cars (the demise of Cruise, a big push by Waymo to scale human-assisted deployments, and lots of smoke and mirrors from an electric car company) and in AI, where robotics has been pulled into the ultra-hypesphere while in generative AI the end of scaling and the introduction of inference mechanisms (!!) have been hotly announced and disputed.  The human spaceflight endeavor, as it did last year, has crawled along and again has stretched out dates that were probably too optimistic in the first place.

But First.

<rant>

We all know about FOMO, Fear Of Missing Out. In late 2023, for a talk on generative AI that I gave at MIT, I coined another acronym,  FOBAWTPALSL, Fear Of Being A Wimpy Techno-Pessimist And Looking Stupid Later. Perhaps that one is a little bit too much of a mouthful to catch on. These two human insecurities lead people to herd-like behavior in establishing and propagating the zeitgeist on almost any topic.

They lead to people piling on the hype fiestas, rushing to invest (money, effort, or hope) in marginal ideas once they have become a little bit popular, or believing our airspace is being invaded by foreign drones.

“Mounting evidence, and lack thereof, suggests that perhaps the whole craze has been a sort of communal fever dream fueled by crowd mentality, confirmation bias and a general distrust in all things official.”

That quote is from the drone story linked to above, but it could well have been about the hype that we are moving towards AGI (Artificial General Intelligence).

I want to be clear: as has been the case for almost seventy years now, there has been significant progress in Artificial Intelligence over the last decade. There are new tools and they are being applied widely in science and technology, and are changing the way we think about ourselves, and how to make further progress.

That being said, we are not on the verge of replacing and eliminating humans in either white collar jobs or blue collar jobs. Their tasks may shift in both styles of jobs, but the jobs are not going away. We are not on the verge of a revolution in medicine and the role of human doctors. We are not on the verge of the elimination of coding as a job. We are not on the verge of replacing humans with humanoid robots to do jobs that involve physical interactions in the world. We are not on the verge of replacing human automobile and truck drivers world wide. We are not on the verge of replacing scientists with AI programs.

Breathless predictions such as these have happened for seven decades in a row, and each time people have thought the end is in sight and that it is all over for humans, that we have figured out the secrets of intelligence and it will all just scale.  The only difference this time is that these expectations have leaked out into the world at large. I’ll analyze why this continues to happen below in the section on AI and ML.

Here is a list of some of those hype cycles that I, personally, have perceived and lived through, as taken from my presentation at MIT in late 2023 that I referenced above re FOBAWTPALSL.

Really, was there really hype about all these things?  Yes, there was, within the circles that cared. Those circles have gotten wider and wider, and when reigning world chess champion Garry Kasparov was beaten by I.B.M.’s Deep Blue computer under tournament conditions in 1997 it was widely reported in the popular press. And it was declared that it was all over for humans.

Back in February 2011 a computer program named Watson played on the television game show Jeopardy against all time human champions. John Markoff, legendary technology reporter at the New York Times, wrote stories about this the day before the competition, and the day after, when Watson had indeed beaten the humans, answering the same questions (fed to it as text at the same time as the humans heard them), all while running on a cluster of machines not connected to an outside network. Here are three successive paragraphs from the second of those stories.

For I.B.M., the future will happen very quickly, company executives said. On Thursday it plans to announce that it will collaborate with Columbia University and the University of Maryland to create a physician’s assistant service that will allow doctors to query a cybernetic assistant. The company also plans to work with Nuance Communications Inc. to add voice recognition to the physician’s assistant, possibly making the service available in as little as 18 months.

“I have been in medical education for 40 years and we’re still a very memory-based curriculum,” said Dr. Herbert Chase, a professor of clinical medicine at Columbia University who is working with I.B.M. on the physician’s assistant. “The power of Watson-like tools will cause us to reconsider what it is we want students to do.”

I.B.M. executives also said they are in discussions with a major consumer electronics retailer to develop a version of Watson, named after I.B.M.’s founder, Thomas J. Watson, that would be able to interact with consumers on a variety of subjects like buying decisions and technical support.

My personal experience at that time was that people I did not know, but who had heard about my role at MIT (as director of the MIT AI Lab, and then founding director of MIT CSAIL, the Computer Science and Artificial Intelligence Lab), would come up to me and ask about the future of medicine. The people were variously doctors or health industry executives. I reassured them that medicine as we knew it then would stay much the same and was not about to be rendered obsolete.

And then in 2016 Geoff Hinton, one of the key architects of Deep Learning (which has had undeniable impact on the world) said:

“People should stop training radiologists now. It is just completely obvious that within five years deep learning is going to be better than radiologists.”

More people came asking me whether this was true. It wasn’t true in five years and it isn’t now. We need more radiologists than ever. And yes they do use deep learning tools to help them see some things they wouldn’t otherwise see. But they also understand anomalies using causal reasoning, and we would be in a sorry state if all radiology was done by programs today.

Now look at those plum colored paragraphs above again as you take yourself way back in time to a year or so ago when ChatGPT was just a baby AGI. You can find stories just like this one if you substitute “ChatGPT” for “Watson” and “Microsoft” for “I.B.M.”

The things confidently predicted in 2011 (and in 1979, and in 2016) about the end of doctors didn’t happen then and are not happening now. Nor are all the other jobs ending.

Today I get asked about humanoid robots taking away people’s jobs. In March 2023 I was at a cocktail party and there was a humanoid robot behind the bar making jokes with people and shakily (in a bad way) mixing drinks. A waiter was standing about 20 feet away silently staring at the robot with mouth hanging open. I went over and told her it was tele-operated. “Thank God” she said. (And I didn’t need to explain what “tele-operated” meant). Humanoids are not going to be taking away jobs anytime soon (and by that I mean not for decades).

You, you people!, are all making fundamental errors in understanding the technologies and where their boundaries lie. Many of them will be useful technologies but their imagined capabilities are just not going to come about in the time frames the majority of the technology and prognosticator class, deeply driven by FOBAWTPALSL, think.

But this time it is different you say. This time it is really going to happen. You just don’t understand how powerful AI is now, you say. All the early predictions were clearly wrong and premature as the AI programs were clearly not as good as now and we had much less computation back then. This time it is all different and it is for sure now.

Yeah, well, I’ve got a Second Coming to sell you…

</rant>

Self Driving Cars

As with flying cars the definition, or common understanding, of what self driving cars really means has changed since my post on predictions seven years ago.  At that time self driving cars meant that the cars would drive themselves to wherever they were told to go with no further human control inputs.

Now self driving cars means that there is no one in the driver’s seat, but there may well be, and in all cases so far deployed there are, humans monitoring those cars from a remote location, and occasionally sending control inputs to the cars. The companies do not advertise this feature out loud too much, but they do acknowledge it, and the reports are that it happens somewhere between every one and two miles traveled. These inputs are not direct control via the normal human mechanisms of control: the steering wheel, the brakes, and the accelerator.  Rather they are advice that overrides some of the algorithms.  For instance, “steer out into the next lane and go around this truck”, as the human realizes that the truck is just not going to move (see an anecdote below on the first night I took the new Waymo taxis in San Francisco; I had previously last ridden a Waymo in 2012 in Mountain View).

Why is this difference important?  One of the motivations for self driving cars was that the economics of taxis, cars that people hire at any time for a short ride of a few miles from where they are to somewhere else of their choosing, would be radically different as there would be no driver. Systems which require remote operations assistance to get full reliability cut into that economic advantage. They carry a higher burden in their ROI calculations to make a business case for their adoption, and therefore a longer time horizon for scaling across geographies.
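To see why the remote-assistance burden matters so much to that business case, here is a toy calculation. Every number in it is an assumption I am inventing purely for illustration; none of it is data from Waymo or anyone else.

```python
# Toy model of how remote assistance erodes the "no driver" cost advantage.
# Every number here is an illustrative assumption, not company data.

miles_per_intervention = 1.5      # assumed: one remote nudge every 1-2 miles
minutes_per_intervention = 0.5    # assumed operator attention per nudge
operator_cost_per_hour = 40.0     # assumed fully loaded cost of a remote operator
average_speed_mph = 15.0          # assumed average urban taxi speed

interventions_per_car_hour = average_speed_mph / miles_per_intervention
operator_minutes_per_car_hour = interventions_per_car_hour * minutes_per_intervention
cars_one_operator_can_cover = 60.0 / operator_minutes_per_car_hour
added_cost_per_mile = operator_cost_per_hour / (cars_one_operator_can_cover * average_speed_mph)

print(f"One operator covers about {cars_one_operator_can_cover:.0f} cars")
print(f"Remote assistance adds about ${added_cost_per_mile:.2f} per mile")
```

Change the intervention rate by a factor of a few in either direction and the added cost per mile swings accordingly, which is why, as I say in my new predictions, the human intervention rate is the metric to watch.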

But wait, you might say, isn’t that electric car company that used to be based in California and is now based in Texas going to roll this out imminently and have a fully digital taxi service? They demoed it on a Hollywood movie studio lot just this year, and the cars were painted gold. Hmm. The location of the demo and the fact that the cars, even down to the tires, were painted gold tell you everything you need to know. Both the cars and the humanoid robots at that event were presented as autonomous but in reality they were all tele-operated directly by people (see below in the humanoid section for more details). And that same electric car company is actively hiring people into paying jobs as remote operators.

There was a reasonably balanced appraisal from Reuters just after the event, though it does not go into details of the demos. Here is a direct quote from the story:

“We do expect to start fully autonomous unsupervised FSD in Texas and California next year.” Musk said.

The astute reader will note that this is the 11th year in a row that the CEO of Tesla has made this prediction of the same milestone happening the next year. We can admire the consistency.

Actual self-driving is now generally accepted to be much harder than everyone believed.

The reason that this bait and switch is important to understand is that the promise of inevitable fully self driving technology upended a historical way that new transportation systems have been adopted.

In the past whenever we have introduced new transportation mechanisms there have been large investments in infrastructure, and that infrastructure is shared and used by everyone. The Romans built roads so soldiers and trade goods could travel long distances–in Europe those road networks are still the basis of today’s road networks. When steam engine driven trains were the new transportation technology vast networks of rails were built allowing goods to move long distances in mere hours or days. When Ford started mass production of automobiles he built roads, and the local governments followed, and the Federal government followed, and those roads are what we use today.

Actual fully self driving cars promised that no infrastructure changes would be needed to revolutionize how vehicles would be controlled. Each individual vehicle would do what was needed all by itself. As sensors and networks got better there was no need for expensive new infrastructure because of this promise.

The promise was false. If government and private partnerships in building smart roads, which were a hot topic in the 1990s, had continued, every one of us would now have smarter, safer cars, but still with onboard human drivers taking over in many situations. But we would have had smart freeways where, once you were on one, your car would be self driving. The road would have had lots of sensors, effectively shared across all cars, as that data would have been transmitted to all passing cars. It would have been a fraction of the cost per car compared to the sensing on today’s almost but not really self driving cars like those of Waymo. And we would have had much more accurate congestion data, where the root causes of local congestion would have been sensed with semantic understanding, rather than just inferred from the aggregate collection of location data from phones, individual cars, and historical data from roadside sensors.

Instead we now have individual corporate actors using a mixture of partial self driving and remote human supervision. The big question is whether the economics of this works at scale, and whether the fake promises will drive out the human drivers in cheaper services and we’ll all end up paying more. Will the level of hype we saw push our decentralized transportation system into the hands of a few wealthy companies, and in effect make it a centralized system where everybody has to pay private companies to be part of it?

As a reminder of how strong the hype was and the certainty of promises that it was just around the corner here is a snapshot of a whole bunch of predictions by major executives from 2017.

I have shown this many times before but there is one new annotation here for 2024. The years in parentheses are when the predictions were made. The years in blue are the predicted years of achievement. When a blue year is shaded pink it means that it did not come to pass by then. The predictions with orange arrows are those that I had noticed had later been retracted.

The prediction that Jaguar and Land-Rover made that they would have fully autonomous cars by 2024 did not come to pass, so I have shaded it pink.

Note that every single blue year up until now is shaded pink, and that every one that is shaded pink has still not come to pass. None of the predictions that were out there in 2017 for the next few years have happened.  None. There are three more for 2025, and I am sure that a year from now they will all be shaded pink also.

One of the big selling points of self driving cars was that they would be safer than cars driven by humans. So far that is not holding up with real data. One electric car maker with self driving software had it disengage when it sensed there would be an accident, supposedly so that the human could take over in a split second. And then the company did not report the incident as the fault of the software as it was no longer controlling the car when the impact occurred. It was reported, and I had this experience myself in my last ride in a Cruise in 2023, that Cruise vehicles would freeze when an accident looked likely, and then not report it as their software’s fault as the car was stationary and was hit by another car. In many reported cases, and in my case, simply continuing to move forward would avert any likely accident (fortunately for me the human driver of the other car slammed on the brakes and did not hit my robot vehicle).

In this story from the Washington Post about Federal investigations into the safety incidents with self driving cars, they report that the companies involved claim they have vast amounts of driving on our roads under their belt. Not so.

An industry association says autonomous vehicles have logged a total of 70 million miles, a figure that it compares to 293 trips to the moon and back. But it’s a tiny fraction of the almost 9 billion miles that Americans drive every day. The relatively small number of miles the vehicles have driven makes it difficult to draw broad conclusions about their safety.

To put that into perspective, the total number of miles driven by all autonomous (sort of) vehicles over the last decade is less than 1% of the miles driven by humans every day in the United States. It is a tiny, tiny portion.
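The arithmetic behind that “tiny, tiny portion”, using just the two figures in the quoted passage:

```python
autonomous_total_miles = 70e6   # total autonomous vehicle miles claimed (quoted above)
us_human_miles_per_day = 9e9    # approximate miles Americans drive every day (quoted above)

fraction = autonomous_total_miles / us_human_miles_per_day
print(f"{fraction:.2%} of a single day of US human driving")   # roughly 0.8%
```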

Take a look at this embedded video from the Wall Street Journal about investigations of crashes (many of which have been fatal) involving autonomous driving systems.

From the audio: “The kinds of things that tend to go wrong with these systems are things like it was not trained on, pictures of an overturned double trailer. It just didn’t know what it was. There were some lights there, but the lights were in unusual positions. A person would have clearly said something big is in the middle of the road. But the way machine learning works is it trains it on a bunch of examples and if it encounters something it doesn’t have a bunch of examples for it may have no idea what’s going on.”

[[My own take is that the fetish of end to end learning leads people to leave out well known algorithms that might solve many of these problems (e.g., the incredibly simple time-to-collision algorithms based on looming). Yes, end to end learning made speech understanding systems better, but that does not mean it is the appropriate fetish to apply everywhere.]]
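For the curious, the looming idea really is that simple: for something on a direct collision course, the time to contact is roughly its current angular size divided by the rate at which that angular size is growing. Here is a minimal sketch of that calculation, my own toy illustration rather than anyone’s production code.

```python
def time_to_collision(angular_size_rad, angular_growth_rad_per_s):
    """Tau from looming: time to contact ~= (angular size) / (rate of growth).

    Holds for an object approaching head-on at roughly constant speed; no
    object recognition, range sensor, or learned model is required.
    """
    if angular_growth_rad_per_s <= 0.0:
        return float("inf")   # not looming, so not closing in
    return angular_size_rad / angular_growth_rad_per_s

# An obstacle spanning 0.10 rad in the image and growing at 0.05 rad/s is
# about 2 seconds from contact, whatever the object turns out to be.
print(time_to_collision(0.10, 0.05))
```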

Pro tip: Think about this history of industry prognostications about fully autonomous driving being just around the corner when you read today’s prognostications about LLMs taking jobs, en masse, in the next couple of years, or humanoid robots being dirt cheap and being able to learn how to do any human manual task real real soon now. You know you have seen this movie before…

My own experiences with Waymo in 2024

I have two sorts of experiences with Waymo vehicles. First, as a driver of my own vehicle and sharing road space with them every single time that I drive. And second, as a user of their ride service.

The streets of San Francisco have been thick with Waymo vehicles with no driver in them, especially in the second half of 2024. As I drive across the city every morning to head down to my robotics/AI startup half way down the peninsula I see them everywhere until I get on to 101.  I see them in front of me and behind me and in adjacent lanes as I drive on multilane one way streets. Sometimes I see four of them in a single block. Twice I’ve seen four of them in a line in my block, with four more in a line in the block ahead of me.  When I am at four way intersections with no traffic lights I see them participating in the social ritual of taking your turn to drive through the intersection in the order you stopped, except when a pedestrian is crossing in front of you. They do that pretty well. They do less well when they accidentally get into a line of parents’ cars snaking around a corner for school drop off or pickup.

Over the last few months I have noticed that in general they are getting more aggressive about stretching the rules, just like people do. Otherwise human drivers (including me) take advantage of their politeness. That aggression is not always welcomed. One morning I saw a workman, part of a group doing some digging on a road and holding a sign with SLOW on one side and STOP on the other, have to jump in front of a Waymo to get it to do what he was trying to tell it to do with the sign: STOP. It wasn’t stopping for no stinking sign!

The only time I have seen a Waymo go into reverse, ever, was when I was illegally driving the wrong way down a single lane street and we were heading straight at each other.

As a rider I feel they are not quite aggressive enough with human drivers sometimes, so a ride in a Waymo takes longer than with an Uber or Lyft.

It is hit and miss where they drop me off. Sometimes they pick a place to pull over half a block from my house, even when it is raining. There is no way to adjust whatever they happen to decide that day, even though I know that they would always be able to pull in right in front of my house.

The first time I took a Waymo this year, on the way home it picked me up at a restaurant and then was about to make a right turn. But at that corner there was an 18 wheeler with its lights flashing and surrounded by green cones. It pulled right in behind that truck and waited a long time before it drove forward. I am guessing a remote operator intervened and told it to go around, because eventually it pulled around the truck in the lane just to the left. Based on seeing Waymos interact with orange cones I suspect it would have done better if the cones had been orange rather than green.  This easily illustrates that the learning that this robot does, and indeed any robot does, is nothing like the learning that people do (see my rant about the seven deadly sins and mistaking performance for competence in the section below on advances in AI and ML).

I mostly feel safe when I am a passenger in a Waymo.  Sometimes I feel that the rides I take with an Uber driver are not as safe as I would prefer.

Self Driving Taxi Services

There have been three self driving taxi services in the US in various stages of play over the last handful of years, though it turns out, as pointed out above, that all of them have remote operators. They are Waymo, Cruise, and Zoox.

Waymo and Cruise are similar in that they use conventional cars adorned with lots of sensors. Zoox has purpose built vehicles that have no steering wheel or pedals for brake or accelerator.

Waymo and Cruise went for deployments in large parts of two or more cities and have had ride services callable by apps, just as one can do with Uber or Lyft. Zoox is smaller scale, much more restricted in geography, and really not comparable.

At this time last year Cruise was in trouble as it had suspended all of its San Francisco operations under pressure from regulators after some bad accidents that happened in a way that never would happen for human driven cars.  Briefly, their cars were getting hit at night by emergency vehicles with lights flashing as the Cruise cars crossed intersections. Human drivers see the reflections of lights from such vehicles flashing even if they don’t see the vehicles themselves. The Cruise vehicles were only reacting to flashing lights that they could perceive directly. But the accident that tipped the scales was when a pedestrian crossing in front of a human driven vehicle was hit and went flying in the air, landing right in front of a Cruise. The Cruise hit the person (who had now disappeared from sight) as a human driver would most likely have done. But then it proceeded to drive 20 feet, with the human underneath the vehicle being dragged along, as it went into a mode where it was supposed to get off the road. A human driver would not have reacted that way to having been in a collision, even if it was not their fault.

The hammer finally fell in December of 2024. General Motors shut down Cruise. The leading paragraphs from this linked story from the Wall Street Journal are:

General Motors has scrapped its Cruise robotaxi program after nearly a decade and $10 billion in development, citing the time and costs needed to scale the business and rising competition.

GM on Tuesday said it plans to realign its autonomous driving strategy and give priority to development of advanced driver assistance systems, which take over steering and other functions in certain situations and are common on new vehicles today.

The automaker said it would continue to develop fully autonomous technology for personal vehicles, and build on the progress of its Super Cruise system, a hands-off, eyes-on driving feature that the company introduced several years ago.

GM said it owns about 90% of Cruise and intends to buy out the remaining investors. It plans to combine the technical teams from Cruise and GM into a single effort to advance autonomous and assisted driving.

“We want to leverage what already has been done as we go forward in this,” Chief Executive Mary Barra told analysts on a call Tuesday.

The Detroit automaker said it expects the restructuring to reduce spending by more than $1 billion annually after the proposed plan is completed, which is expected in the first half of next year.

While there are 40 companies that have permits to test autonomous driving in California alone, the demise of Cruise leaves just one company, Waymo, trying to make an actual go of a digital taxi service in the United States. They have an enormous lead over anyone else who wants to get into this business, and have spent billions of dollars (probably very much north of $10 billion) on this endeavor over the last 15 years. In an email they sent me a couple of weeks ago as a user of their services, they reported that they provided 4 million customer rides in 2024. That is approximately 4 million more than any other company in the United States.

Waymo

Despite being so far out in front it has not been all smooth sailing for Waymo.

Early in the year the operations center for Waymo somehow neglected to realize it was Chinese New Year in Chinatown in San Francisco. So Waymo vehicles were routed through that area on the biggest night of celebration. Any human driver would have realized that the streets, i.e., the street surfaces where cars usually drive, were completely packed with humans, no doubt some of whom were intoxicated as well as just being out having a good time. Not so the Waymo vehicles. They tried pushing through the very very dense crowds, no doubt annoying many people. And what do people have at Chinese New Year?  Fireworks. So some revelers decided to push back on this robot car invading their space. Here are a couple of pictures of the results.

Not pretty.  And an example of how taking away people’s agency is never a good idea for robots (see my second law of robotics).

Throughout 2024 Waymo has been investigated for various accidents such as those described in this Wall Street Journal article. “Reports included collisions with stationary or semistationary objects, such as gates, chains or parked vehicles, according to the regulator.”

In the middle of the summer Waymo added a feature where they would honk their horns at cars in their way. But this backfired when hundreds of Waymos were coming back to their parking lot in the very early hours of the morning, and they started honking at each other and waking up human neighbors. Eventually that got fixed.

In late September a motorcade for Kamala Harris in San Francisco was brought to a halt by a Waymo that stopped in the middle of California Street doing a U-turn in front of it. I’m sure this incident was of great concern to the Secret Service. Eventually a San Francisco police officer got into the car and drove it out of the way–this is shown in a video included with the story above. I do not know how the officer got access to the vehicle and whether Waymo remote operations were cooperating.

More disturbingly, humans outside the Waymos have started harassing humans inside them. The most concerning cases come from the realization that if a woman is in a Waymo at night she will be dropped off, outside, on a public road at the end of her journey, with no option but to get out of the car where it has stopped. So groups of men have followed Waymos with women in them and then harassed the women when they get out. If she were driving her own car she might be heading to an off road parking space, or she might choose not to stop if she knew she was being followed. There are no such options in a Waymo, so taking a Waymo at night is less safe than other means of transportation–just follow it and eventually the preyed upon woman will have to get out. Here is a very recent disturbing story about this practice.

Meanwhile Waymo managed to raise $5.6B to expand to new cities in 2025. It already operates in parts of San Francisco, Los Angeles, and Phoenix. The new money will let it expand to Austin and Atlanta in the United States and to start operating in parts of Tokyo in Japan. That is expensive expansion.

Here is the question for the future of watered down, remote monitored “autonomous” driving systems (let’s call it “watered down autonomy”), and it is up to Waymo now. Can Waymo expand fast enough in these new markets in 2025, and take enough business from what is left of traditional taxi operators, along with those operating under the Uber and Lyft models, and do it in a way which is in sight of profitability, so that it has a case to raise the stupendous amounts of money needed to operate in all large cities in the US in the next 10 to 20 years?

If Waymo cannot succeed at this in the next two years I think the idea of large scale use of watered down autonomy will be dead for at least a decade or two. Right now full autonomy everywhere is already dead.

[Self Driving Cars]

Prediction: A flying car can be purchased by any US resident if they have enough money.
Date: NET 2036
2018 Comments: There is a real possibility that this will not happen at all by 2050.

Prediction: Flying cars reach 0.01% of US total cars.
Date: NET 2042
2018 Comments: That would be about 26,000 flying cars given today's total.

Prediction: Flying cars reach 0.1% of US total cars.
Date: NIML

Prediction: First dedicated lane where only cars in truly driverless mode are allowed on a public freeway.
Date: NET 2021
2018 Comments: This is a bit like current day HOV lanes. My bet is the left most lane on 101 between SF and Silicon Valley (currently largely the domain of speeding Teslas in any case). People will have to have their hands on the wheel until the car is in the dedicated lane.

Prediction: Such a dedicated lane where the cars communicate and drive with reduced spacing at higher speed than people are allowed to drive.
Date: NET 2024
Updates: 20240101 This didn't happen in 2023 so I can call it now. But there are no plans anywhere for infrastructure to communicate with cars, though some startups are finally starting to look at this idea--it was investigated and prototyped by academia 20 years ago.

Prediction: First driverless "taxi" service in a major US city, with dedicated pick up and drop off points, and restrictions on weather and time of day.
Date: NET 2021
2018 Comments: The pick up and drop off points will not be parking spots, but like bus stops they will be marked and restricted for that purpose only.
Updates: 20240101 People may think this happened in San Francisco in 2023, but it didn't. Cruise has now admitted that there were humans in the loop intervening a few percent of the time. THIS IS NOT DRIVERLESS. Without a clear statement from Waymo to the contrary, one must assume the same for them. Smoke and mirrors.

Prediction: Such "taxi" services where the cars are also used with drivers at other times and with extended geography, in 10 major US cities.
Date: NET 2025
2018 Comments: A key predictor here is when the sensors get cheap enough that using the car with a driver and not using those sensors still makes economic sense.
Updates: 20250101 Imminent dual use of personal cars was the carrot that got lots of people to pay cash, when buying a Tesla, for the software subscription that would allow their car to operate in this way. Shockingly the CEO of Tesla announced, in the smoke and mirrors roll out of Cyber Cab in 2024, that the service would use specially built vehicles to be produced at some indeterminate later date. I got suckered by his hype. This is unlikely to happen in the first half of this century.

Prediction: Such "taxi" service as above in 50 of the 100 biggest US cities.
Date: NET 2028
2018 Comments: It will be a very slow start and roll out. The designated pick up and drop off points may be used by multiple vendors, with communication between them in order to schedule cars in and out.
Updates: 20250101 Even the watered down version of this with remote operators is not going to happen in 50 cities by 2028. Waymo has it in 3 cities and is currently planning on 2 more in the US in 2025.

Prediction: Dedicated driverless package delivery vehicles in very restricted geographies of a major US city.
Date: NET 2023
2018 Comments: The geographies will have to be where the roads are wide enough for other drivers to get around stopped vehicles.

Prediction: A (profitable) parking garage where certain brands of cars can be left and picked up at the entrance and they will go park themselves in a human free environment.
Date: NET 2023
2018 Comments: The economic incentive is much higher parking density, and it will require communication between the cars and the garage infrastructure.

Prediction: A driverless "taxi" service in a major US city with arbitrary pick up and drop off locations, even in a restricted geographical area.
Date: NET 2032
2018 Comments: This is what Uber, Lyft, and conventional taxi services can do today.
Updates: 20240101 Looked like it was getting close until the dirty laundry came out. 20250101 Waymo now has a service that looks and feels like this in San Francisco, 8 years earlier than I predicted. But it is not what everyone was expecting. There are humans in the loop. And for those of us who use it regularly we know it is not as general on drop off and pick up as it is with human drivers.

Prediction: Driverless taxi services operating on all streets in Cambridgeport, MA, and Greenwich Village, NY.
Date: NET 2035
2018 Comments: Unless parking and human drivers are banned from those areas before then.

Prediction: A major city bans parking and cars with drivers from a non-trivial portion of a city so that driverless cars have free reign in that area.
Date: NET 2027, BY 2031
2018 Comments: This will be the starting point for a turning of the tide towards driverless cars.

Prediction: The majority of US cities have the majority of their downtown under such rules.
Date: NET 2045

Prediction: Electric cars hit 30% of US car sales.
Date: NET 2027
Updates: 20240101 This one looked pessimistic last year, but now looks at risk. There was a considerable slow down in the second derivative of adoption this year in the US. 20250101 Q3 2024 had the rate at 8.9% so there is no way it can reach 30% in 2027. I was way too optimistic at a time when EV enthusiasts thought I was horribly pessimistic.

Prediction: Electric car sales in the US make up essentially 100% of the sales.
Date: NET 2038

Prediction: Individually owned cars can go underground onto a pallet and be whisked underground to another location in a city at more than 100mph.
Date: NIML
2018 Comments: There might be some small demonstration projects, but they will be just that, not real, viable mass market services.

Prediction: First time that a car equipped with some version of a solution for the trolley problem is involved in an accident where it is practically invoked.
Date: NIML
2018 Comments: Recall that a variation of this was a key plot aspect in the movie "I, Robot", where a robot had rescued the Will Smith character after a car accident at the expense of letting a young girl die.

Electric Cars

Last year US manufacturers pulled back on their planned production of EVs. In data from this report we can see that sales dropped at the start of 2024 but have now picked up again.

2022 Q1: 5.3%   2022 Q2: 5.6%   2022 Q3: 6.1%   2022 Q4: 6.5%
2023 Q1: 7.3%   2023 Q2: 7.2%   2023 Q3: 7.9%   2023 Q4: 8.1%
2024 Q1: 7.3%   2024 Q2: 8.0%   2024 Q3: 8.9%

There is steady growth in sales but my prediction of 30% of US car sales being electric by 2027 now seems wildly optimistic. We need two doublings to get there in three years and the doubling rate seems more like one doubling in four to five years.
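A quick projection makes the gap concrete. The starting share is the Q3 2024 figure from the table above, and the doubling times bracket the rough rates just mentioned; none of this is a forecast, just compounding arithmetic.

```python
start_share = 0.089     # Q3 2024 BEV share of US new car sales (table above)
years_ahead = 3.0       # roughly out to the end of 2027

for doubling_time in (1.5, 4.0, 5.0):   # years per doubling
    projected = start_share * 2 ** (years_ahead / doubling_time)
    print(f"doubling every {doubling_time} years -> {projected:.1%} by 2027")

# Only a doubling every ~1.5 years clears 30%; at one doubling every four to
# five years the share ends up in the low-to-mid teens.
```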

Note that some sources include hybrids and hydrogen powered cars in electric vehicles but I am using the battery electric vehicle (BEV) numbers.

To see how the trends are across brands you can see a breakout for Q2 of 2024 here.

There appear to be two main headwinds for BEV adoption. Firstly, if one doesn’t have on-property residential parking it is hard work in the US to find a place to recharge, and it takes hours for the charging to finish. This will stop many city dwellers from adopting. Secondly, the increased tire wear adds up to real money. The maintenance requirements for BEVs are much less than for cars with an internal combustion engine. On the other hand tires do not last as long (I have had to buy four new tires in less than two years of owning my first BEV), apparently due to the increased weight of the car.

Flying Cars

Flying cars are another category where the definitions have changed. Back when I made my predictions it meant a vehicle that could both drive on roads and fly through the air.  Now it has come to mean an electric multi-rotor helicopter that can operate like a taxi between various fixed landing locations. Often touted are versions that have no human pilot. These are known as eVTOLs, for “electric vertical take off & landing”.

Large valuations have been given to start ups who make nice videos of their electric air taxis flying about. But on inspection one sees that they don’t have people in them. Often, you might notice, even those flights are completely over water rather than land. I wrote about the lack of videos of viable prototypes back in November 2022.

Nevertheless there have been wild predictions.  I ended a longer version of this component in last year’s annual review with:

Also note the size of this vehicle. There are many fossil fuel powered helicopters that are much smaller. This is not going to be a personally owned vehicle for the masses.

Don’t hold your breath. They are not here. They are not coming soon.

Nothing has changed. Billions of dollars have been spent on this fantasy of personal flying cars.  It is just that, a fantasy, largely fueled by spending by billionaires.

Robotics, AI, and Machine Learning

So what happened in Robotics, AI, and Machine Learning this year?

Many, many, many people got just a little bit over excited. That’s what happened.

There have been a lot of party tricks and it is the researchers who often play the tricks on themselves without realizing it. This is not new, none of it is new. But there are orders of magnitude more people watching it now, and more people are out to make a buck by being hypesters, promising riches to those who will invest in their irrationally overpriced companies.

How could this be?

We are seeing mass sinning, lots and lots of people committing some of the seven deadly sins of predicting the future of AI  which I wrote about back in 2017 here (or here you can see a professionally edited version of that blog post of mine).

Four of those seven sins seem most relevant to today’s hyped up atmosphere around robotics, AI, and machine learning.

   

Here now are short descriptions of these particular four sins, edited down from my earlier much more detailed descriptions. Then I will weave them together to explain how it is still pretty much business as usual, and I mean that in a good way, with steady progress on both the science and engineering of AI.

Performance versus Competence

One of the social skills that we all develop is an ability to estimate the capabilities of individual people with whom we interact. We use cues from how a person performs any particular task to estimate how well they might perform some different task. We are able to generalize from observing performance at one task to a guess at competence over a much bigger set of tasks.

These estimators that we have all inherited or learned do not generalize well to other creatures or machines. We are not good at guessing which smart things other species might be able to do, and we are not good at guessing what an AI system can do when we have seen it do a few tasks in a limited domain. We get it wrong all the time.

Indistinguishable from Magic

When people cannot explain how something works they cannot know its limits as they do not have any sort of model (nor have they seen enough examples of it before). Arthur C. Clarke said that any sufficiently advanced technology is indistinguishable from magic.

In our minds UFOs can do all sorts of amazing things as we have no way of knowing their limits–they may as well be magic. And that is what they become in speculation about them.

Isaac Newton spent half his working life on alchemy as he did not know that the nuclei of atoms are not subject to mere chemistry. He would have been just as ignorant of the limitations of an iPhone screen (different sort of apple…), despite his own groundbreaking work in optics. Remember, he was a really really smart dude. But even he was not able to develop all the theories needed to understand the world around him, despite his successes with calculus and gravity and the makeup of white light. He attributed properties to chemistry that were way beyond its limits.

Exponentialism

We have just lived through sixty years of the most phenomenal growth of a technology in the history of humankind. It is the story of silicon-based computation. Everyone has some idea about Moore's Law, at least enough to sort of know that computers get better and better on a clockwork-like schedule.

This reality has trained people to think that probably a lot of other things in tech will change exponentially, especially when that thing has a strong computational component. The sin of exponentialism is to argue that some other process is going to follow a Moore’s-like law when it is unwarranted to so argue.

Moore’s law worked for so long because in the starting technology of the 1960s the currents used to represent digital information were many many orders of magnitude beyond the minimal physical limit needed to determine whether they  were present or not, and hence distinguish a 1 from a 0. Those currents could be halved many times without breaking physics limits.
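To make that headroom concrete, here is a tiny back-of-the-envelope sketch. The starting ratio of signal current to the minimum detectable level is an assumed, illustrative number, not a historical measurement; the point is only that a large ratio buys many halvings, and each halving is roughly one Moore's Law step.

```python
# Sketch: how many times a signal current could be halved before hitting the
# minimum level needed to tell a 1 from a 0. The starting ratio is an
# illustrative assumption, not a measured 1960s value.
import math

start_ratio = 1_000_000      # assumed: current is a million times the minimum
halvings = math.floor(math.log2(start_ratio))
years_per_halving = 2        # the classic Moore's Law cadence

print(f"Room for about {halvings} halvings")
print(f"Roughly {halvings * years_per_halving} years of scaling at that cadence")
```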

Speed of Deployment

New technologies get deployed much more slowly than people imagine. Even software technologies.

The old internet protocol, IPv4, can only address about four billion (2^32, roughly 4.3×10^9) devices, which is way less than the number of people on our planet. A new protocol, IPv6, which can address more than 3×10^38 devices, was meant to replace it over a two year period of dual use by about 2003. But in 2024 IPv4 was still there and carrying over half the world's internet traffic despite its inadequacies.
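The size of the gap is easy to compute directly; the two protocols' raw address spaces are just 2^32 and 2^128 (usable addresses are somewhat fewer in both cases, but the orders of magnitude are what matter here):

```python
# Raw sizes of the IPv4 (32-bit) and IPv6 (128-bit) address spaces.
ipv4 = 2 ** 32
ipv6 = 2 ** 128

print(f"IPv4: {ipv4:.2e} addresses")            # about 4.3e9
print(f"IPv6: {ipv6:.2e} addresses")            # about 3.4e38
print(f"IPv6 is {ipv6 // ipv4:.1e} times larger")
```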

Most functioning businesses that operate in the physical world are very averse to taking up new technology as it dramatically increases existential risk to their business. They must foresee immediate and incredibly high return on investment (ROI) to be tempted to move to new technologies.

Even the military is slow to adopt new technologies. The US Air Force still flies the B-52H variant of the B-52 bomber. This version was introduced in 1961, making it 63 years old. The last one was built in 1963, a mere 61 years ago. Currently these planes are expected to keep flying until at least 2040, and perhaps longer–there is talk of extending their life out to 100 years.

What does this all mean?

Right now there is incredible hype for both Large Language Models (LLMs), and all their variations, and for humanoid robots, especially humanoid robots that are going to learn how to do things.

The hype is driven by the four sins above.

LLMs

LLMs have proved amazingly facile with language. They have been trained on pretty much all the text that is available on the Web and all the digitized historical books that exist. Miraculously LLMs seem to be able to infer a representation of some sort, that is somewhat independent of the particular human language that they read. So they are able to translate between human languages, and when you ask them just about anything they produce text in the language that you asked in, and that text often seems entirely reasonable and informative.

I used the word "miraculously" as we do not really understand why they are able to do what they do. We, of course, know that the architecture for them is built around noticing correlations in vast amounts of text that connect some tens of thousands of tokens, which are the components of words in each language that is digested. It is a surprise that they work as well as they do, and produce coherent sounding language on just about any topic.

Here is the original architectural diagram from the 2017 Attention Is All You Need paper:

Each column from bottom to top is a pure feed forward network, with no search, no iteration, no conventional algorithm at all. There are inputs at the bottom and then layer upon layer of linear neurons that have numbers or weights stored in them that multiply and add their inputs and threshold that sum to provide an output. The detail in the architectural diagram is how the connections between layers are organized.

On the left is an input or question, in a linear string of words, from a user. That gets injected half way up the network on the right and remains constant while an iterative process runs. The stack on the right outputs a word (or token) and that gets fed back to the bottom of that stack, and a new token pops out the top. All the output tokens that have so far been produced remain in the bottom input buffer of the right stack as ordered input.

What the network has been trained to do, is given the user input on the left, and what the network has output so far, choose a very likely next word, given the billions of examples it has seen in training. Some randomness is used to choose among a small number of very likely next words at each stage.

There are hundreds of billions of weights that get learned and stored in the layers of the network to act as multipliers for each individual input to each layer.
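To make that loop concrete, here is a minimal sketch of the decode-side behavior just described: a scoring function is repeatedly asked about possible next tokens, a small amount of randomness picks among the top few, and the chosen token is appended to the input for the next step. Everything here, the vocabulary, the toy scoring stub standing in for the giant trained network, the temperature and top-k values, is illustrative and not any particular LLM's actual code.

```python
import math
import random

# Toy vocabulary and a stand-in for the trained network: a function that
# returns a score per candidate next token. A real model would condition on
# both the prompt and the output so far; this toy just avoids repeating the
# most recent token.
VOCAB = ["the", "robot", "opened", "door", "slowly", "."]

def toy_scores(prompt_tokens, output_tokens):
    last = output_tokens[-1] if output_tokens else None
    return [0.1 if tok == last else 1.0 + 0.1 * i for i, tok in enumerate(VOCAB)]

def sample_next(scores, top_k=3, temperature=0.8):
    # Keep the top_k highest-scoring tokens, turn their scores into weights,
    # and pick one at random: the "some randomness" mentioned above.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = [math.exp(scores[i] / temperature) for i in ranked]
    return VOCAB[random.choices(ranked, weights=weights, k=1)[0]]

def generate(prompt_tokens, max_tokens=8):
    output = []
    for _ in range(max_tokens):
        next_tok = sample_next(toy_scores(prompt_tokens, output))
        output.append(next_tok)          # fed back in as input for the next step
        if next_tok == ".":
            break
    return " ".join(output)

print(generate(["open", "the", "door"]))
```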

So now us humans are faced with looking at this system running and our human nature just makes us commit the first two sins from above.  It is in our nature and we cannot help ourselves.

First, we see really impressive examples of responses to input questions, and if a human was giving those answers we would estimate that person to be quite clever and able to reason. Often though, because they have so many billions of examples on which they were trained, LLMs are essentially looking up the question in the weights. The weights are gained from all of human knowledge that is out there on the network in language form. Invisibly the network is perhaps (but not in any intentional way) merging some similar questions, and then merging the answers which were already in the vast data that it has seen.

But us dumb humans just think the damn thing is really really smart.

Then, since we don’t have a real explanation in our heads for what it is doing we start thinking it is magic, and that there is no real limit to what it is extracting from all that data (that it used a significant portion of the energy budget for many different countries to compute) and how general its capabilities will be. It becomes magic. And then researchers try to show that it can reason, that it has inferred a spatial understanding of the world, that language can be used to do all sorts of things that Moravec’s paradox tells us it can’t. There is a lot of magical thinking that humans do about LLMs.

Of course it can diagnose diseases like a doctor talking about them. Of course it can teach a student as well as a human teacher. Of course it can program as well as a human computer programmer. It is magic after all.

But in reality the fact that it is just picking likely next words means that in fact we can’t trust its output. Some outputs are great. Some are pure confabulations (most people use the word “hallucinations” for this, but I prefer “confabulations”). And we do not know which we will get ahead of time, or more perniciously how much of each we will get, trustworthy pieces of output and confabulated pieces of output all jumbled together.

Not to worry, say the proponents, more learning will fix it. Fire up a nuclear power plant (I am not making this up–the tech companies are getting more nuclear power built or activated so that their LLMs can learn what a human learns using just 20 watts powering their brain; I am not confabulating this!!), and we'll feed it more data and it will become more trustworthy. It is magic after all. But the magic is not going as well as the proponents imagined and promised, as this Wall Street Journal story explains. Their imaginations were definitely encouraged by exponentialism, but in fact all they knew was that when they went from smallish to largish networks following the architectural diagram above, the performance got much better. So the inherent reasoning was that if more made things better then more more would make things more better. Alas for them it appears that this is probably not the case. But rabid exponentialists have not yet given up. Expect a bunch of VCs to adversely affect the growth of pension funds around the world, as pension funds are a prime source of capital that VCs spend.

More serious academics are working on boxing in the LLMs with more external mechanisms beyond just feeding the output tokens back in as a linear string of input. Many of these mechanisms look a lot like more conventional AI mechanisms, and we will see where these additions prove to be useful, how much of the wheel will be reinvented, and how long it will take (months? years? decades?) to get there.

And the answers to those last questions will tell us how much sinning has been done by companies in predicting fast deployments. Back in the rant at the beginning of this post I gave the example of I.B.M. and Watson and their completely optimistic predictions that any problems in applying Watson (which seemed extremely competent based on its performance on live TV) to the real world would be solvable. The areas in which it was predicted to be applicable came from magical thinking.

Surely no one today could be as dumb as that big company was back in 2011. Surely not. No, not us smart inhabitants of 2025. It's us. We are nowhere near as dumb as them!!

Humanoid Robots

The other thing that has gotten overhyped in 2024 is humanoid robots. The rationale for humanoid robots being a thing is a product of the four sins above and I think way less rooted in reality than the hype about LLMs. In fact I think it is pretty dumb. [[I suspect many people will reason that I cannot have a valid opinion about this precisely because I happen to have built more humanoid robots than anyone else on the planet. So read ahead with caution.]]

My first law of robotics states:

The visual appearance of a robot makes a promise about what it can do and how smart it is. It needs to deliver or slightly over deliver on that promise or it will not be accepted.

The first sentence describes, I think, what is sucking people into believing that humanoid robots have a big future. It looks like a human, so its performance will be like a human, so it will be competent like a human.  It’s the performance/competence sin without even waiting for the performance part!

The second sentence describes how the humanoid fever will break, and how the hundreds of millions of dollars put into many of these companies (billions of dollars overall) will disappear. The puppets will not perform at acceptable levels. It is easy to see this as you hear all the things investors and CEOs of humanoid robot companies say the robots will be able to do. They have hardly even gotten to the lab demonstration phase. My third law of robotics is:

Technologies for robots need 10+ years of steady improvement beyond lab demos of the target tasks to mature to low cost and to have their limitations characterized well enough that they can deliver 99.9% of the time. Every 10 more years gets another 9 in reliability.

For real work, robots need to operate with four, five, or six nines. We are a long way from that. The zeitgeist is that we will simply teach the robots to do stuff and then they will be able to do it.
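To put rough numbers on those nines, here is a small calculation of what they imply for expected failures, assuming (purely for illustration) a robot that attempts 1,000 tasks a day:

```python
# Expected failures per year at different reliability levels, assuming
# (purely for illustration) a robot that attempts 1,000 tasks per day.
tasks_per_year = 1_000 * 365

for nines in range(3, 7):
    success_rate = 1 - 10 ** (-nines)
    failures = tasks_per_year * (1 - success_rate)
    print(f"{nines} nines ({success_rate:.6f}): ~{failures:,.0f} failures/year")
```

At three nines that is a failure every day; at six nines it is roughly one failure every three years.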

BUT, we do not know yet whether that is going to work. In order for it to work you have to both collect the right sort of data and then learn the right things from that data. It is not at all clear to me that we know the answers to make either of those things true. I think it will be an active place for lots of good research for many years to come.

There is an excellent survey paper on the current research state of the art called Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes. Unfortunately I think the title of the paper is going to confuse many people. "Real-World Successes" to someone like me, who these days deploys robots that people pay for and that provide real ROI, sounds like it is about systems that have been deployed. But on reading the paper it turns out that they mean learning and demonstrations done in a lab setting on physical hardware rather than just in simulations and simulators. And, to me, the lab demonstrations are shakier (literally) than I imagined in my third law above.

I think we are a long way off from being able to for-real deploy humanoid robots which have even minimal performance to be usable, and even further off from ones that have enough ROI for people to want to use them for anything beyond marketing the forward thinking outlook of the buyer.

Despite this, many people have predicted that the cost of humanoid robots will drop exponentially as their numbers grow, and so they will get dirt cheap. I have seen people refer to the cost of integrated circuits having dropped so much over the last few decades as proof. Not so.

They are committing the sin of exponentialism in an obviously dumb way. As I explained above, the first integrated circuits were far from working at the limits of physics of representing information. But today's robots use mechanical components and motors that are not too far at all from physics-based limits on mass, force, and energy. You can't just halve the size of a motor and have a robot lift the same sized payload. Perhaps you can halve it once to get rid of inefficiencies in current designs. Perhaps. But you certainly can't do it twice. Physical robots are not ripe for exponential cost reduction by squeezing the waste out of current designs. And it won't happen just because we start (perhaps) mass producing humanoid robots (oh, by the way, I already did this a decade ago–see my parting shot below). We know that from a century of mass producing automobiles. They did not get exponentially cheaper, except in the computing systems. Engines still have mass and still need the same amount of energy to accelerate good old fashioned mass.
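A back-of-the-envelope way to see the difference from transistor scaling: the mechanical work a robot must do is set by physics, not by how many units come off the line. The payload and lift height below are arbitrary illustrative numbers.

```python
# The mechanical work needed to lift a payload is fixed by physics, no matter
# how many robots are manufactured. The numbers here are illustrative only.
G = 9.81                 # m/s^2

payload_kg = 10.0        # assumed payload
lift_height_m = 1.0      # assumed lift height

work_joules = payload_kg * G * lift_height_m
print(f"Lifting {payload_kg:g} kg by {lift_height_m:g} m takes about {work_joules:.0f} J")
print("That figure does not halve when production volume doubles.")
```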

This Year’s Prediction Update

There is only one new comment in my robotics, AI and ML predictions table this year. There are a bunch of well funded new companies in the home robot space, and perhaps they will come up with new mobility solutions, which in my experience is the big blocker for home robots.

Predictions table [AI and ML] (each entry: Prediction; Date; 2018 Comment; dated Updates)

Prediction: Academic rumblings about the limits of Deep Learning.
Date: BY 2017
2018 Comment: Oh, this is already happening... the pace will pick up.

Prediction: The technical press starts reporting about limits of Deep Learning, and limits of reinforcement learning of game play.
Date: BY 2018

Prediction: The popular press starts having stories that the era of Deep Learning is over.
Date: BY 2020

Prediction: VCs figure out that for an investment to pay off there needs to be something more than "X + Deep Learning".
Date: NET 2021
2018 Comment: I am being a little cynical here, and of course there will be no way to know when things change exactly.

Prediction: Emergence of the generally agreed upon "next big thing" in AI beyond deep learning.
Date: NET 2023, BY 2027
2018 Comment: Whatever this turns out to be, it will be something that someone is already working on, and there are already published papers about it. There will be many claims on this title earlier than 2023, but none of them will pan out.
Update 20240101: It definitely showed up in 2023. It was in the public mind in December 2022, but was not yet the big thing that it became during 2023. A year ago I thought it would perhaps be neuro-symbolic AI, but clearly it is LLMs, and ChatGPT and its cousins. And, as I predicted in 2018, it was something already being worked on, as the "attention is all you need" paper, the key set of ideas, was published in 2017.

Prediction: The press, and researchers, generally mature beyond the so-called "Turing Test" and Asimov's three laws as valid measures of progress in AI and ML.
Date: NET 2022
2018 Comment: I wish, I really wish.
Update 20230101: The Turing Test was missing from all the breathless press coverage of ChatGPT and friends in 2022. Their performance, though not consistent, pushes way past the old comparisons.
Update 20240101: The Turing Test was largely missing from the press in 2024 also, and there was a story in Nature commenting on that. So yes, this has now happened.

Prediction: Dexterous robot hands generally available.
Date: NET 2030, BY 2040 (I hope!)
2018 Comment: Despite some impressive lab demonstrations we have not actually seen any improvement in widely deployed robotic hands or end effectors in the last 40 years.

Prediction: A robot that can navigate around just about any US home, with its steps, its clutter, its narrow pathways between furniture, etc.
Date: Lab demo: NET 2026; Expensive product: NET 2030; Affordable product: NET 2035
2018 Comment: What is easy for humans is still very, very hard for robots.
Update 20250101: A bunch of startups in the home robot space got significant funding in 2024. Two of them are run by ex-CEOs of large companies: iRobot and Cruise (and he was also an intern at iRobot after we were already a public company). So this one may be in play for a lab demo in the next few years if they have this as one of their goals.

Prediction: A robot that can provide physical assistance to the elderly over multiple tasks (e.g., getting into and out of bed, washing, using the toilet, etc.) rather than just a point solution.
Date: NET 2028
2018 Comment: There may be point solution robots before that. But soon the houses of the elderly will be cluttered with too many robots.

Prediction: A robot that can carry out the last 10 yards of delivery, getting from a vehicle into a house and putting the package inside the front door.
Date: Lab demo: NET 2025; Deployed systems: NET 2028

Prediction: A conversational agent that both carries long term context, and does not easily fall into recognizable and repeated patterns.
Date: Lab demo: NET 2023; Deployed systems: 2025
2018 Comment: Deployment platforms already exist (e.g., Google Home and Amazon Echo) so it will be a fast track from lab demo to wide spread deployment.
Update 20240101: One half of this happened this year. ChatGPT has been connected to microphones and speakers so you can now talk to it, and it does not fall into recognizable patterns. BUT the other half is the half it does not have; it has no updatable memory apart from its token buffer of what it has just said. Long term context may be long term in coming.

Prediction: An AI system with an ongoing existence (no day is the repeat of another day as it currently is for all AI systems) at the level of a mouse.
Date: NET 2030
2018 Comment: I will need a whole new blog post to explain this...

Prediction: A robot that seems as intelligent, as attentive, and as faithful, as a dog.
Date: NET 2048
2018 Comment: This is so much harder than most people imagine it to be--many think we are already there; I say we are not at all there.

Prediction: A robot that has any real idea about its own existence, or the existence of humans in the way that a six year old understands humans.
Date: NIML

A Parting Shot

I recently read a research paper on humanoid robots working in built-for-human environments. It was based on the argument that the best form for a robot that is to operate in human environments is something tallish and skinny-ish, and probably dynamically balancing, with arms that can reach down to table tops etc., and with a sensor system that can look down from above, as that is what our human environments are optimized for. Here is the first paragraph of the paper:

The past decade has seen an explosion of research in humanoid robotics. The stated motivations for this work have varied widely. Many teams have concentrated on bipedal locomotion, some have been interested in human level social interactions, understanding human intelligence, modeling human learning capabilities and others have been more interested in entertainment. Some humanoid robots have had manipulation capabilities on static humanoid platforms and some of that work is aimed at dexterity, plus there has been simple two armed grasping on mobile humanoid platforms. Overall there has been very little work combining dexterous manipulation with humanoid robots, static or mobile–much of that which has appeared, has been concerned with dynamic tasks like pole balancing and juggling rather than manipulation, or has used teleoperated manipulation.

Apart from the weird references to pole balancing and juggling this all sounds pretty reasonable and consistent with what is happening today, and with recent history.  In fact this is the very first paragraph of the very first paper in the very first issue of the very first volume of the International Journal of Humanoid Robotics.

And it was published in 2004, with me as first author.  Let me spell that out in case you thought there was a typo in the year. This is from a paper that I and my students and post-docs wrote in the year two thousand and four. Here is the beginning of the contents page for that first issue.

You can download the text of that paper here. The journal is now in its 21st year of operation, and on its 21st volume of issues and papers.

By the time this paper was written my research group at MIT had been working on and building humanoid robots for twelve years. This paper, about a robot named Cardea, was probably our sixth or seventh humanoid robot. [[In 2008 I started a company that built and shipped thousands of humanoid robots. The picture at the top of this post was taken in China with a line up of humanoids that we had built in Massachusetts and New Hampshire and sold to people in China (before a US initiated trade war with China put an end to it in 2018…irony can be personally hard to take at times…).]]

The robot Cardea (Cardea was an ancient Roman goddess of door hinges and handles; these are still a challenge for modern robots…) was a two wheeled dynamically balancing robot  that lived in a built-for-humans office environment. Cardea was able to open doors using existing door handles and then make its way through doors it had opened.

Pro tip: Just because you heard about a new idea this last year or two doesn’t mean that people haven’t been working on that very same idea for decades. So temper your expectations that it must be about to transform the world. Ideas that transform the world take decades, or centuries of development, and plenty of people long before you have been just as excited about the idea and had thought it was on the verge of taking off. And none of us, including you and me, are likely to be special enough or lucky enough to come along at just the right time to see it all happen.

Like all modern humanoid robots, Cardea did not move in the way that all animals walk, by using passive dynamics to store energy and modulating the behavior of a passive mechanism that has only low energy input. So, like all modern mobile humanoid robots (and legged robots in general), when things were going awry its control algorithms tried to recover by pumping in large amounts of energy very quickly. Sometimes that didn't quite work, and the energy needed to go somewhere.

Cardea could be a little dangerous in those circumstances, if it fell on you having just increased its kinetic energy. Even the spring based deployment system for its stick-like legs that were engaged when it realized it was going to fall could be dangerous.

This is still a problem with all modern humanoid robots. That is why the tele-operated humanoids that were in the Tesla movie lot theater show a couple of months ago operated in two modes. When they all walked out the human guests were kept away from them. Once they stopped walking and were operating in a very different mode people were allowed to approach them, and then get fooled into thinking they were talking to an AI powered robot when they were really talking to a remote human operator. But the robot was no longer moving its feet, and no longer a source of physical danger as a result.

Another pro tip: Don’t stand anywhere near a walking or balancing wheeled humanoid when they are moving or doing any task. I have had some near misses for myself with my own humanoids twenty years ago and more recently with some of the humanoids from new start ups. And more generally never be below any sort of walking robot, no matter how many legs it has, when it is walking up stairs.

Human Spaceflight

The number of flights in 2024 was not much different from that in 2023 (I neglected to include the flights by China last year). It does not feel like a golden age of human spaceflight, though there were other highlights from SpaceX.

Orbital Crewed Flights

Three countries put 28 people into orbit in 2024: the United States launched 16 people on five flights, and Russia and China each launched 6 people on two launches. So there were nine crewed orbital flights in total. Two were private and seven were government flights.

The United States: There were four US flights to the International Space Station, starting with the private Axiom-3 mission with a crew of four on January 18th. The launch vehicle for this was a SpaceX Falcon 9, and the crew vehicle was a SpaceX Dragon. The remaining US flights to the ISS were paid for by NASA. Two of them were SpaceX flights, with four people on March 4th, the Crew-8 mission, and two people on board Crew-9 on October 25th. The remaining US flight to the ISS was the inaugural crewed flight of Boeing's Starliner, launched on June 5th atop an Atlas V rocket with two people aboard. They are still stuck in space and will be for a few more months–see the section on Boeing below.

The other US mission was also a SpaceX launch and vehicle flight, this time known as Polaris Dawn. It was the second mission paid for by billionaire Jared Isaacman, with him as commander. There was a former US Air Force fighter pilot as mission pilot and two SpaceX employees as mission specialists, giving a total crew size of four. They stayed aloft for five days, launching on September 10th. This mission flew higher above Earth than any mission since Apollo 17, the last lunar landing mission, in 1972. Two of the crew "spacewalked" with their feet inside the Dragon capsule but with their bodies outside. This was the first private spacewalk ever. Now Isaacman has been tapped by the incoming US President to be the administrator of NASA.

Russia: There were two Soyuz launches, each with three people up and three down, but different people coming back. The launch dates were March 23rd and September 11th. The six people that launched on Soyuz in 2024 were three Russian cosmonauts, two NASA astronauts, and one Belarusian commercial airline flight attendant who won a national competition with 3,000 applications. She was the only one not set for a long duration mission and was off the ground for slightly less than 14 days. So there were no space tourists per se, but the Belarusian flyer was most likely included as part of Russia's efforts to keep in good favor with Belarus, which has aided it in its war in Ukraine, and was certainly not part of the regular scientific program of the ISS.

China: There were two crewed flights of Shenzhou (a larger, more modern version of Soyuz) in 2024. Both flights were to the Tiangong Space Station and both took along three Taikonauts, first on April 25th and then on October 9th. Both crews were assigned long duration missions, and crews now overlap their predecessors at Tiangong, so it is being continuously occupied. The first handover this year took about five days and the second about three and a half weeks. Both times there were six Taikonauts onboard Tiangong at the same time.

Suborbital Crewed Flights

There have been two companies providing space tourism flights on suborbital flights. Blue Origin launches a capsule on top of a reusable rocket, New Shepard, and the capsule lands using a parachute and a brief rocket blast right before hitting the ground (similar to how Soyuz lands). Virgin Galactic has a winged craft which is carried aloft by a bigger jet-engined airplane; it separates at high altitude within the atmosphere and rockets into space. It flies back and lands on a runway.

Both companies are run by billionaires who made their money in other businesses.  Both billionaires have flown to space on their own craft.

Both companies have aimed to have regular launches with lots of tourists, but neither has gotten to that scale and so far only a very small number of the many people who have paid a substantial deposit have been able to fly.

Blue Origin had a failure with an uncrewed version of the vehicle in 2022 and only flew one flight in 2023, which was also uncrewed. This year they flew three crewed flights, on May 19th, August 29th, and November 22nd, each with six passengers (the system is automated and requires no pilots). They also flew three crewed flights in each of 2021 and 2022, so there have now been nine crewed flights in total. The first two took four passengers and the remaining seven have had six passengers, so altogether they have flown 50 people above the Karman line, 100 kilometers above Earth. This is not yet a regular cadence, nor a large scale tourist business.

In 2024 Virgin Galactic had two flights, each with two crew from the company and four passengers. These flights were on January 26th and June 8th. Virgin Galactic flights are now on hiatus, awaiting a new bigger and better vehicle in about two years.  Virgin Galactic has had a total of twelve flights since December 13th in 2018.  Three have had two people on board and nine have had six people on board, for a total of sixty filled seats that have crossed the Karman line. The total number of different people is smaller as the two pilot seats on each flight have been occupied by a small number of people who have flown multiple times.

So, in 2024 thirty people went on suborbital flights, and altogether there have been 110 people on these commercial suborbital flights. Space tourism on suborbital flights has yet to take off in a regular or scaled way.

Predictions table [Space] (each entry: Prediction; Date; 2018 Comment; dated Updates)

Prediction: Next launch of people (test pilots/engineers) on a sub-orbital flight by a private company.
Date: BY 2018

Prediction: A few handfuls of customers, paying for those flights.
Date: NET 2020

Prediction: A regular sub weekly cadence of such flights.
Date: NET 2022, BY 2026
Update 20240101: There were four flights in 2021, three in 2022, and seven, five with customers on board, in 2023--all of them by Virgin Galactic. Blue Origin did not fly in 2023. At this point 2026 is looking doubtful for regular flights every week.
Update 20250101: Now 2026 is looking impossible given the data from 2023 and 2024, and one of the two companies being on hiatus for all of 2025, and well into 2026.

Prediction: Regular paying customer orbital flights.
Date: NET 2027
2018 Comment: Russia offered paid flights to the ISS, but there were only 8 such flights (7 different tourists). They are now suspended indefinitely.
Update 20240101: There were three paid flights in 2021, and one each in 2022 and 2023, with the latter being the Axiom 2 mission using SpaceX hardware. So not regular yet, and certainly not common.
Update 20250101: There were two paid flights in 2024.

Prediction: Next launch of people into orbit on a US booster.
Date: NET 2019, BY 2021, BY 2022 (2 different companies)
2018 Comment: Current schedule says 2018.
Update 20240101: Both SpaceX and Boeing were scheduled to have crewed flights in 2018. SpaceX pulled it off in 2020, Boeing's Starliner did not fly at all in 2023, but is scheduled to launch with people onboard for the first time in April 2024.
Update 20250101: The second company did finally launch humans into orbit in June 2024, so it has happened three years later than I predicted and six years later than what had been promised when my prediction was made. Of course, everyone implicitly assumed that along with getting humans into space the companies would also be able to bring them back. Not so for Boeing.

Prediction: Two paying customers go on a loop around the Moon, launch on Falcon Heavy.
Date: NET 2020
2018 Comment: The most recent prediction has been 4th quarter 2018. That is not going to happen.
Update 20240101: Starship launched twice in 2023 but didn't get to orbit either time. This is going to be well over six years later than the original prediction by the CEO of SpaceX.
Update 20250101: The billionaire who signed up for this and paid a hefty deposit in 2017 gave up waiting and cancelled the contract in 2024. This fantasy is over, for now at least.

Prediction: Land cargo on Mars for humans to use at a later date.
Date: NET 2026
2018 Comment: SpaceX has said by 2022. I think 2026 is optimistic but it might be pushed to happen as a statement that it can be done, rather than for any pressing practical reason.
Update 20240101: I was way too optimistic, and bought into the overoptimistic hype of the CEO of SpaceX even though I added four years, doubling his estimated time frame.
Update 20250101: I can now call this, as orbital mechanics and Hohmann transfer windows dictate that the cargo would need to have been launched a few months ago for it to get to Mars in 2025. It has not been launched.

Prediction: Humans on Mars make use of cargo previously landed there.
Date: NET 2032
2018 Comment: Sorry, it is just going to take longer than everyone expects.

Prediction: First "permanent" human colony on Mars.
Date: NET 2036
2018 Comment: It will be magical for the human race if this happens by then. It will truly inspire us all.

Prediction: Point to point transport on Earth in an hour or so (using a BF rocket).
Date: NIML
2018 Comment: This will not happen without some major new breakthrough of which we currently have no inkling.

Prediction: Regular service of Hyperloop between two cities.
Date: NIML
2018 Comment: I can't help but be reminded of when Chuck Yeager described the Mercury program as "Spam in a can".
Update 20240101: Calling this one 26 years early. As of today no-one is still working on this in an operating company.

Boeing’s Starliner

First announced in 2010, Boeing's Starliner was originally scheduled to fly a human crew in 2018. It carried out its second uncrewed flight in May 2022, and finally did make its first crewed flight on June 5th. The crew of two docked with the ISS, but there were problems with multiple gas thrusters for fine motion during the docking. The original plan was that the crew would stay on the ISS for about a week and then return to Earth for a touchdown onto hard soil (as all Russian and Chinese crewed missions end, along with all Blue Origin sub-orbital flights).

The option of that return was considered, but the thrusters were on a section of the vehicle which is discarded along the way before the landing, so there was no possibility of getting a look at the hardware back on Earth. So a program of tests while docked to the ISS was started, delaying the crew return.

Eventually it was decided that it was too risky for the crew to return on the craft, and so it returned empty on September 7th, landing in New Mexico. As it happened, although there were more anomalies with the thrusters, the crew would have landed safely had they been on board.

Now the crew was stranded in space with no designated ride home. It was decided to remove two crew from the Crew-9 launch and have the Starliner astronauts, Barry Wilmore and Sunita Williams, fly back on that SpaceX Dragon with the other two, which after additional delays is now scheduled to happen some time in March 2025. Their one week visit to the ISS will have stretched out to nine months by then.

Boeing has committed to fixing the problems with Starliner. The boosters that it uses are no longer being built, but there are five existing ones reserved for the five additional contracted flights that Boeing has with NASA. They are supposed to happen once per year.

We do not know at this point, but I think it would not be a huge surprise if Starliner never flies again.

SpaceX Falcon 9 

Once again the Falcon 9 launch system has broken all sorts of records for number of launches and reuse.

During 2024 there were 132 single booster launches.  For two of those flights no attempt was made to recover the first stage (there is a performance penalty for the primary payload in order to recover the first stage). One attempted recovery failed when the booster (on its 23rd flight) caught fire as it landed on the recovery barge. Another booster has since flown a total of 24 times.

In terms of mission success all but one of these flights succeeded; one failed when the second stage failed during re-ignition for adjusting the orbit.

There were also two Falcon Heavy launches, the three booster version, both of which succeeded. One of them had successful landings for the two side boosters, but there was no attempt to recover the central booster on that flight, and no attempt to recover any of the three boosters on the other Heavy flight.

This brings the total number of launches of the single booster version to 417 along with 11 launches of the three booster Heavy version.  These numbers are way beyond the number of launches for any other orbital booster.  Additionally it is the only flying orbital system that is reusable at the moment, though  Blue Origin and Rocket Lab both plan on joining the club soon.

It is worth, once again, looking at how long it has taken to get to a total (across both single booster and Heavy triple booster versions) of 428 launches, with only three failures to deliver the payload to where it was intended to go.

The first launch occurred in June 2010, and there were a total of 4 launches in the first three years. The first successful booster recovery happened on the 20th flight, in December 2015, five and a half years in. The first reuse of a booster occurred in 2017, in the 8th year of the program.

Since 2021 there has been a steady increase in the number of launches per year:

Year    # of launches
2010    2
2011    0
2012    2
2013    3
2014    6
2015    7
2016    8
2017    18
2018    21
2019    13
2020    26
2021    31
2022    61
2023    96
2024    134
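As a quick cross-check, summing the yearly counts in the table (which, given the arithmetic, include both the single booster and Heavy launches) reproduces the 428 total quoted above:

```python
# Yearly Falcon launch counts from the table above (single booster plus Heavy).
launches = {2010: 2, 2011: 0, 2012: 2, 2013: 3, 2014: 6, 2015: 7, 2016: 8,
            2017: 18, 2018: 21, 2019: 13, 2020: 26, 2021: 31, 2022: 61,
            2023: 96, 2024: 134}

total = sum(launches.values())
print(f"Total launches through 2024: {total}")   # 428 = 417 Falcon 9 + 11 Heavy
```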

SpaceX had previously gotten satellites to orbit with its first rocket, the Falcon 1. Falcon 9 has been a spectacular success. But it was not instantaneous. It took time to build up the cadence of launches; it was about 10 years before the hockey stick curve showed up. Deployment is never sudden but comes after a long build.

SpaceX Starship

Starship is SpaceX’s superheavy two stage rocket, designed to put 150 tons of payload into orbit, but also be able to go to the Moon or Mars. There is the booster which is designed only to work in Earth atmosphere with 33 Raptor engines both to get the second stage high enough and fast enough and to let the first stage have a controlled return to the launch site. The second stage, called Starship, is both a booster and the payload.  It has three Raptor engines and three Raptor vacuum engines. The Raptor engines are designed to get the Starship into orbit after the first stage drops away, and to guide the Starship as it returns to its Earth launch site. The Raptor vacuum engines are meant for breaking out of Earth orbit and going to the Moon or Mars, and to do soft landings on those two bodies where there is no or almost no atmosphere.

In 2024 SpaceX made steady progress with four launches of the two stages coupled together. The first two launches led to both stages blowing up.

The third and fourth launches were a big improvement. As with earlier flights they launched from the coast of Texas. In both cases the second stage did a reentry burn on its first orbit and then did a soft landing in a target zone in the Indian Ocean. In the third flight the main booster returned to the launch site and hovered next to the launch tower between two giant arms, which then captured it, and the engines shut down successfully. It was sufficiently damaged during flight, however, that it was not reusable. In the fourth flight there were health anomalies, so the first stage was ditched in the Gulf of Mexico.

On the fourth flight there was both less heat shielding and much less damage from heat during reentry. This is definite forward progress. But it is still quite a long way from both being operational and both stages being reusable. And it is even further away from being human rated.

This is the vehicle that the CEO of SpaceX recently said would be launched to Mars and attempt a soft landing there.  He also said that if successful the humans would fly to Mars on it in 2030. These are enormously ambitious goals just from a maturity of technology standpoint. The real show stopper however may be human physiology as evidence accumulates that humans would not survive three years (the minimum duration of a Mars mission, due to orbital mechanics) in space with current shielding practices and current lack of gravity on board designs. Those two challenges may take decades, or even centuries to overcome (recall that Leonardo Da Vinci had designs for flying machines that took centuries to be developed…).
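The roughly three year floor comes straight from orbital mechanics: low-energy transfer windows between Earth and Mars open only about every 26 months, and each transfer leg takes the better part of a year, so a crew must wait at Mars for the next window before coming home. Here is a rough sketch of those two numbers, assuming circular, coplanar orbits:

```python
# Rough Earth-Mars mission timing, assuming circular coplanar orbits.
earth_period_days = 365.25
mars_period_days = 687.0

# Synodic period: how often the two planets return to the same relative geometry.
synodic_days = 1 / (1 / earth_period_days - 1 / mars_period_days)

# A Hohmann transfer leg is half an orbit of the transfer ellipse.
a_transfer_au = (1.0 + 1.524) / 2                      # semi-major axis in AU
transfer_days = 0.5 * 365.25 * a_transfer_au ** 1.5    # Kepler's third law

print(f"Launch windows about every {synodic_days:.0f} days (~26 months)")
print(f"One-way transfer about {transfer_days:.0f} days")
```

Two such legs, plus the wait at Mars for the return window to open, is what pushes a crewed round trip toward the three year mark.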

The President of SpaceX may be taking a leaf out of the CEO’s always overly optimistic predictions. In November she said “I would not be surprised if we fly 400 Starship launches in the next four years”. Looking at the success of Falcon 9 it is certainly plausible that I may live to see 400 Starship launches in a four year period, but I am quite confident that it will not happen in the next four years (2025 through 2028).

One more thing. Back when I first made the predictions there had been an announcement by the CEO of SpaceX that the company was under contract to send a very rich paying customer on a trip around the Moon in 2018, launched on a Falcon Heavy. I was completely skeptical. Over the years the date got pushed back and pushed back, and the proposed flight vehicle was changed to be Starship. As we all know the flight of the Japanese billionaire around the Moon still hasn't happened. In 2024 Yusaku Maezawa finally gave up waiting and cancelled the contract.

NASA Artemis

NASA’s plan is that the second Artemis mission, using the Orion Capsule, Artemis II, will fly to the Moon with four people aboard, the first crewed Artemis flight. An uncrewed flight of Orion around the Moon flew in 2022.  The crewed flight was scheduled to launch in May 2024, but it was first delayed by six months and then a little more and in the last year it has slipped another full year. It is now scheduled to fly in April 2026.

Artemis III was scheduled to launch in 2025 with a return to the surface of the Moon. However that relied on using a Starship (itself refueled in LEO by 14 (yes, fourteen!!) other Starship launches) to land there. No one any longer believes that schedule, and it will likely slip a few years, given where Starship is in its development and current capability. The official schedule says mid 2027, but that seems unlikely.

You can find the architecture of the Artemis III mission at this website.

Blue Origin Orbital BE-4 Engines and New Glenn

The suborbital tourist flights that Blue Origin operates are not its main business. It has ambitions to compete head to head with SpaceX. Another billionaire vs billionaire competition.

It has developed the BE-4 engine, designed to fly 100 times, and to power the first stage of its massive New Glenn rocket (see below). But in the meantime it has started selling the BE-4 to ULA (United Launch Alliance) to power their Vulcan Centaur heavy launch vehicle. Its first stage uses two BE-4 engines, along with a variable number of solid fuel strap-ons.

Vulcan Centaur flew two times in 2024 and the BE-4 engines worked perfectly both times, on January 8th and again on October 4th. This is a solid validation of the engine’s capabilities.

Blue Origin's own first orbital class rocket, New Glenn, is massive, and comparable to the Falcon Heavy (three boosters) rather than the Falcon 9 in capability. It has been in development for a long time, but saw its first visits to a launch pad, fully stacked, in 2024. The first stage uses seven BE-4 engines, and is intended to land on a barge and be fully reusable. The second stage uses two BE-3U engines, a variant of the single engine used on their New Shepard sub-orbital space tourism vehicle. There is a project underway to make a fully reusable version of the second stage.

Launch seems imminent.  Here it is at the launch pad in November 2024.

On Friday December 27th, 2024, it was fully fueled in both stages and went through a countdown and fired its seven BE-4 engines for 24 seconds. Now it will leave the pad to have its payload installed. The launch could be as early as January 6th. The very first launch will be an all up affair, attempting to get something to orbit and land the booster on its first flight. This is a very different development approach to that used by SpaceX.

Let’s Continue a Noble Tradition!

The billionaire founders of both Virgin Galactic and Blue Origin had faith in the systems they had created. They both personally flew on the first operational flights of their sub-orbital launch systems. They went way beyond simply talking about how great their technology was: they believed in it, and flew in it.

Let’s hope this tradition continues. Let’s hope the billionaire founder/CEO of SpaceX will be onboard the first crewed flight of Starship to Mars, and that it happens sooner than I expect. We can all cheer for that.

Tips For Building and Deploying Robots

rodneybrooks.com/tips-for-building-and-deploying-robots/

This post is not about research or developing software for robots. Instead it is some tips on how to go about building robots for mass deployment and how to leverage those deployed robots for improving your product.

The four tips are straightforward but I explain them more below.

  1. Use other people’s supply chain scale wherever possible.
  2. Changing infrastructure up front eases robot success but kills ROI for customers and their ease of buying decisions.
  3. New infrastructure already developed for people is often good infrastructure for robots.
  4. Deployed robots can collect real world data.
1. Other People’s Supply Chain Scale

When you start deploying a robot it is unlikely to be initially in large numbers unless it is at a very low price. When we started selling Roombas at iRobot our first manufacturing batch was 70,000 robots, which is enormous by most robot deployment standards. We got to millions of robots per year. But these numbers are unusual for robots. Even so, we got supply chain economies of scale at the millions that we did not have at the tens of thousands.

How are you going to get economies of scale when you are shipping just a couple of robots per month, or even tens per week? And really how big an influence can you have over suppliers even when the volume is enormous, for you, at a small number of hundreds of robots per week?

My advice is to find someone else’s big juicy supply chain and grab a hold of that for yourself. Juicy in this context means three things. (1) Supply chains with significant downward price pressures. (2) Supply chains with enormous volumes. (3) Supply chains with many different suppliers, supplying standard parts, and all competing, and operating, in multiple geographies.

For example, you can use exactly the same motors that someone else is using in the millions per year, preferably a standard, and preferably ones that are built by multiple suppliers. Then buy those same motors from one or more of those standard suppliers, though first make sure they have adequate quality control and quality guarantees. Ten years ago it made sense to get custom windings for motors in Eastern Europe. Now you should try to use existing motors from large scale consumer products.

Do the same for battery packs, or at least battery cells. Fifteen years ago it made sense to have unique battery chemistries, both for robot performance and for the regulatory requirements at the geographical point of deployment. Now there are so many different battery cells and chemistries available at scale for consumer products (including electric vehicles) that it makes sense to pick the closest fit and use those in your robot.

Do the same for every other part of your robot that you can.

Doesn’t this make designing the production robot harder, getting standard parts rather than ones optimized for your robot?  Yes, and no.  It makes design a slightly different process, but one that rewards you handsomely in terms of lower BOM (Bill Of Materials) cost, and in stability of supply. Those of us (which is just about everyone) who went through the turbulence of COVID-era supply chains can testify to how that turbulence can kill the stability of your careful designs.

Oh, and by the way, you are already, to at least a limited extent, practicing this type of design. For your fasteners no manufacturing engineer is going to let you get away with deciding on an M5.3 bolt [an M5 bolt is 5mm, and an M6 bolt is 6mm, and they are standard sizes, with no standard in between], with a custom head, and a custom pitch, and a custom length of 10.27mm. No, they are going to insist that you use an M5 or an M6 bolt, with one of the many standard heads and pitches and one of the many standard lengths.  And the reasons for that are precisely the same as the reasons I gave above for motors and battery cells, and processor boards, connectors, etc. There is an enormous supply chain for standard sized bolts, so they are way cheaper than custom sized bolts, by two or three orders of magnitude.

2. Changing Infrastructure

I have a rule when designing robots that are going to be deployed to real customers. No one in the company is allowed to say “the customer/end-user can just …”. If we are asking a customer or end-user to do something they wouldn’t naturally do already we are making it harder for them to use our product, compared to what they are already doing, or making it more expensive for them to install our product. This is true of both user interfaces and infrastructure.

As a business you need to decide whether you are selling infrastructure, or selling something that goes into an existing environment. A robot that can move about is a powerful thing to add to an existing environment because it provides mobility and mobile sensors without, in general, having to change the infrastructure.

The Roomba was an easy sale, not just for its price, but because it did not demand that people had to install some infrastructure in their home environment before they could have a Roomba clean their floors. All that a Roomba needed was a standard electrical outlet into which a human could plug its charger. Nothing else.

Since we have been working at Robust AI on a new class of autonomous mobile robots for warehouses and factories, many people (mostly outside of the company, and not potential customers) have suggested all sorts of infrastructure changes that would make it easier to deploy our robots. These have included rubber floors, floors with a wireless recharging system underneath them so the robots can stay charged at all times, radio or laser beacons for localization of the robot, QR codes on every pillar, also for localization, structured lighting systems for the whole interior of the building, separate floors for human driven forklifts and the robots, etc., etc.

All these things would be additional expense for deploying the robots. They would have three bad impacts. (1) There would be a pre-phase for deployment where the infrastructure would have to be built into the existing facility, causing delay and in some cases downtime for the existing facility. (2) Lower ROI (Return On Investment) for the customer, as one way or another they would end up paying for the change in infrastructure. (3) A much harder buying decision for the customer, both because of a longer deployment delay and because of additional cost.

Your robot business will be much simpler if you don’t need changes in infrastructure.

3. But There Is Good News On Infrastructure

As technology advances we humans build new infrastructure for us. Often that same infrastructure turns out to be useful for robots.

When people first had autonomous vehicles driving on freeways (back in the 1980’s!! — see the work of Ernst Dickmanns) there was not much external infrastructure, besides white lines painted on the roads, to help the vehicles. But since then a lot has changed, with new infrastructure that has been installed to help human drivers. We now have GPS, and digital maps, so that our human driven cars know exactly where they are on those maps, and can display it to us, and search the maps for recommended routes. In addition there is congestion information that gets sent to our human driven cars and our phones, which let our route planners suggest faster routes. And the systems in some cars know the speed limits everywhere without having to read the speed limit signs. All these capabilities make it much easier for robot cars, autonomous vehicles, to be able to drive around. That human assist infrastructure has been a real boon towards getting to autonomous vehicles.

In the early days of robots for hospitals and for hospitality (i.e., hotels) and for managed care facilities for the elderly, it was first necessary to install Wi-Fi in the buildings as it was needed for the robots to know what to do next. That was a big barrier to entry for those robots.  Now, however, the humans in those environments expect to have Wi-Fi available everywhere. It started to be deployed in hospitals to support the handheld devices supplied to doctors and nurses so they could record notes and data, but now patients and their visitors expect it too. The same with hotel guests. So requiring Wi-Fi for robots is no longer that much of a barrier to adoption.

In warehouses and factories the hand held bar code scanners have forced the introduction of Wi-Fi there. And in order to make it easy for humans to push around carts, as they do in both those environments, the floors are now generally trash free, and the slopes and lips in the floors are all quite small–infrastructure introduced for the benefit of humans. But this infrastructure, along with plenty of lighting so that humans can see, makes it much easier to introduce robots than it would have been a few decades ago, even if we had had robots back then.

Look for human driven infrastructure changes in whatever domain you want to have your robots work, and see if the robots can make use of that infrastructure. It will likely make them better.

4. Deployed Robots Live In A Sea Of Real World Data

The last few years have seen a clear rise in the utility of machine learning. (Just for reference the term “machine learning” first appeared in 1959 in a paper on checkers written by Arthur Samuel; I wrote a really quite bad master’s thesis on machine learning back in 1977.) The web has provided massive amounts of text for machine learning which has given rise to LLMs. It has also supplied images and movies but they, and other images and movies explicitly collected for the purpose, have required human labeling. Recently people have had sensors attached to them and have been asked to do physical tasks to provide data for research projects (they are “research” projects even if done in well funded companies, as this data has not yet led to deployed robots) aiming to use machine learned control systems for robots (it is not clear to me that the type of data currently collected, nor the assumptions about the form of those control systems, will lead to useful results).

If you have deployed robots then data collection with labels may be available to you pretty much for free.

If your robots have cameras on them, then once they are deployed they are able to observe lots of real world things happening around them. If they are doing real work, then those video feeds are synchronized to what is happening in the world, and to external data interactions that the robot has with the work management system installed in the workplace. Thus the visual data is pre-labelled in some way, and can be used to feed learning algorithms with vast quantities of data over time.
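As a sketch of what "pre-labelled in some way" can mean in practice: if each camera frame carries a timestamp and the work management system logs timestamped events (pick completed, pallet scanned, and so on), the two streams can be joined by time to produce weakly labelled training clips. The event names, time window, and data layout below are all made up for illustration; this is not a description of any particular deployed system.

```python
from dataclasses import dataclass

# Hypothetical records: timestamped camera frames and timestamped events
# reported by the site's work management system (WMS).
@dataclass
class Frame:
    t: float          # seconds since start of shift
    image_path: str

@dataclass
class WmsEvent:
    t: float
    label: str        # e.g. "pick_completed", "pallet_scanned" (illustrative names)

def label_frames(frames, events, window_s=2.0):
    """Attach to each frame any WMS event that happened within window_s seconds.

    The result is weakly labelled video: no human annotation, just the join
    between what the robot saw and what the workplace systems recorded.
    """
    labelled = []
    for frame in frames:
        labels = [e.label for e in events if abs(e.t - frame.t) <= window_s]
        labelled.append((frame.image_path, labels))
    return labelled

frames = [Frame(10.0, "cam0/000100.jpg"), Frame(10.5, "cam0/000101.jpg")]
events = [WmsEvent(10.2, "pick_completed")]
print(label_frames(frames, events))
```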

Here is an example of some collected data that is used to feed learning algorithms at Robust AI.

Get your robots deployed then have them learn from real world data to improve and extend their performance.