Can reinforcement learning replace the robotics approach for self-driving cars? It looks like the second generation of autonomous vehicles is here
Robots are robots, after all
The robotics approach traditionally taken by self-driving cars treats driving as a sequence of distinct problems: route identification, perception, on-road decision-making, vehicle control, and so on. Each problem is handled by a dedicated automation module, and the modules communicate with each other to run the vehicle. Combined with high-definition 3D maps that meticulously plot an area at street level, this means traditional autonomous vehicle (AV) makers treat the car as a robot programmed to run within pre-set parameters, making decisions according to pre-defined algorithms.
But that is far from the realities of a city road! However exhaustively you predict situations, it is impossible to write an algorithm for every conceivable on-road situation. However thoroughly you chart a city through high-definition 3D maps, with every street detail plotted centimeter by centimeter, road layouts can change at any moment, rendering the maps obsolete and exposing the AV to accident risks. And logically, such an AV can be 100 percent reliable only if the entire world is defined through such maps, which is an absurd task.
Let’s face it: a human driver does not drive merely by executing task “X” in situation “Y” according to some pre-defined collision-avoidance technique etched in the mind. At the steering wheel, we humans take in the road conditions in real time and make decisions accordingly, as the moment demands. Two similar situations might not elicit exactly the same response from a human driver, depending on a host of other factors at that moment.
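To make the modular idea concrete, here is a minimal Python sketch of such a pipeline. Everything in it is an illustrative assumption rather than any AV maker's actual design: the module names (`perceive`, `decide`, `control`), the `Obstacle` type, the 20-metre safe gap, and the throttle and brake values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Obstacle:
    distance_m: float  # distance ahead of the vehicle, in metres


def perceive(sensor_frame: dict) -> List[Obstacle]:
    """Perception module: turn raw sensor readings into a list of obstacles."""
    return [Obstacle(d) for d in sensor_frame.get("obstacle_distances_m", [])]


def decide(obstacles: List[Obstacle], safe_gap_m: float = 20.0) -> str:
    """Decision module: a hand-written rule, not a learned one."""
    if any(o.distance_m < safe_gap_m for o in obstacles):
        return "brake"
    return "cruise"


def control(decision: str) -> dict:
    """Control module: map the decision to (hypothetical) actuator commands."""
    if decision == "brake":
        return {"throttle": 0.0, "brake": 1.0}
    return {"throttle": 0.3, "brake": 0.0}


def drive_step(sensor_frame: dict) -> dict:
    """One tick of the pipeline: perception -> decision -> control."""
    return control(decide(perceive(sensor_frame)))
```

Each module can be swapped out independently, but the car can only ever do what the hand-written rule in `decide` anticipates.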
Chuck the map!
The traditional robotics-based AV technology is now being challenged by a new breed of AV start-ups. They are discarding both the module-based robotics approach and static map-based navigation. Instead, they are relying on AI and neural-network-based reinforcement learning.
Almost all of them have relinquished HD maps. Rather, they would prefer that their vehicles learn to read the road using sensor data alone. And just like human drivers, these cars will actually have to “learn” through trial and error, how to react to any specific situation on the go. This is where reinforcement learning comes into play. The process might be more time-consuming and error-prone, to begin with – but once perfected, such autonomous vehicles will never need to remain tied to any particular location via maps. They will be able to adjust to any road condition without meticulous mapping.
The other major change in approach is replacing the multi-module, robotics-based system with a flexible neural network that figures out the details by itself. And instead of building one overarching system in which multiple neural networks are interconnected, most AV start-ups are focusing on a single composite neural network that builds up the details by itself from live inputs. This is much closer to what a human driver does.
The idea is to capture all live on-road data through the vehicle-mounted sensors and feed it to the AI. The algorithm takes this input data and, depending on the situation, converts it into an output action for the vehicle to perform. How does it know which output to choose for a specific input? Simply by trial and error, as humans do, and by retaining what it learns for future use. In this way, the more such an AI-controlled vehicle drives, the more accomplished it becomes through accumulated learning. Training a neural network to perform a task via trial and error, guided by rewards and penalties, is called reinforcement learning. This is the technique AlphaZero used to master Go and chess through self-play. GPT-3 itself was trained by predicting the next word in text rather than by reinforcement learning, although later language models were fine-tuned using reinforcement learning from human feedback.
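The trial-and-error loop can be illustrated with a deliberately tiny example. The sketch below uses tabular Q-learning, a classic reinforcement learning method, on a made-up lane-keeping toy. It is not what any of these start-ups run: they train neural networks on real sensor data, whereas here the "sensor input" is just a lateral offset from the lane centre, and the reward values, learning rate, and episode counts are assumptions chosen for illustration.

```python
import random

POSITIONS = [-2, -1, 0, 1, 2]  # lateral offset from lane centre; +/-2 means off the road
ACTIONS = [-1, 0, 1]           # steer left, hold, steer right


def step(position, action):
    """Toy environment: returns (next_position, reward, done)."""
    nxt = max(-2, min(2, position + action))
    if abs(nxt) == 2:
        return nxt, -10.0, True  # left the lane: large penalty, episode ends
    return nxt, (1.0 if nxt == 0 else -1.0), False


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values by trial and error (epsilon-greedy Q-learning)."""
    rng = random.Random(seed)
    q = {(p, a): 0.0 for p in POSITIONS for a in ACTIONS}
    for _ in range(episodes):
        pos = rng.choice([-1, 0, 1])
        for _ in range(20):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore: try something at random
            else:
                action = max(ACTIONS, key=lambda a: q[(pos, a)])  # exploit
            nxt, reward, done = step(pos, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Nudge the estimate toward reward + discounted future value.
            q[(pos, action)] += alpha * (reward + gamma * best_next - q[(pos, action)])
            pos = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best learned action for each on-road position."""
    return {p: max(ACTIONS, key=lambda a: q[(p, a)]) for p in (-1, 0, 1)}
```

Calling `greedy_policy(train())` yields a policy that steers back toward the centre from either side and holds course when centred. Nobody programmed that rule; it emerged from the rewards.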
Meet the start-up kids
- Wayve: This UK-based start-up was the first to rock the boat in the self-driving vehicle industry. It all started in 2017, when Alex Kendall, the founder, fitted his test vehicle with a neural network and a string of cameras and took the car for a spin in the British countryside. Kendall drove it like a regular vehicle for some time and then let go of the steering wheel. As the car drifted off course, he intervened and corrected it. Again he took his hands off, and again the car swerved to one side. This went on for half an hour, by the end of which the neural network had learnt to correct the car whenever it went off course. Kendall’s experiment was the first successful use of reinforcement learning to teach a car to drive from scratch on a real road.
Currently, Wayve trains its cars directly on rush-hour London streets. It has also demonstrated that its London-trained cars can navigate other cities without any additional training. No traditional AV-maker can yet claim that! Wayve did it in five UK cities: Cambridge, Coventry, Leeds, Liverpool, and Manchester.
Although Wayve aims to be the first to deploy driverless cars in 100 cities, it is not rushing things. Its logic: getting everything right in the first city takes time, but once that is done, the approach can be scaled anywhere. Wayve has recently collaborated with Microsoft to train its neural network on Microsoft’s Azure cloud supercomputing infrastructure. Looks like Wayve is playing its cards right.
- Autobrains Technologies: A start-up based in Israel, Autobrains has built an automotive visual intelligence platform that applies deep learning to AVs through a self-learning approach meant to mimic human driving perception. It uses an end-to-end approach too, but in a way that differs from Wayve’s. Instead of training one massive neural network to figure out all possible driving situations, Autobrains trains thousands of smaller networks, each handling a specific situation. The logic is that a real-life driver never takes in every available cue from the road ahead. Rather, a human driver concentrates on the most immediate problems in context and focuses on those alone.
Autobrains’ AI algorithm has analyzed a million miles of driving data to come up with nearly 200,000 unique driving situations or scenarios. These are the “contextual problems”, and the developers are training individual neural networks to handle each of them. During driving, the vehicle runs the sensor data through an AI that matches the on-road situation to one of the many scenarios in its databank: rain, pedestrians ahead, traffic signals, a bike turning right or left, or a speeding car behind, to name just a few. Although the approach is unique, it is still experimental. So far, the start-up has partnered with car manufacturers to test its technology. Very recently, it acquired a fleet of its own vehicles.
- Waabi: This Canadian start-up is also leveraging AI-based end-to-end learning, similar to Wayve. However, its approach is different. It is developing its AI almost entirely inside a super-realistic closed-loop driving simulation. This enables testing at scale, across both common and safety-critical driving scenarios, in a safer and more affordable environment. Waabi is leveraging deep learning, probabilistic inference, and complex optimization to create a self-driving algorithm that is end-to-end trainable, interpretable, and capable of very complex reasoning. All this means that Waabi is not yet using real cars on the road, and we will have to wait to see how things shape up for this promising AV start-up.
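The scenario-matching idea described for Autobrains above, where a matcher picks the scenario and a small specialist handles it, can be sketched as a simple dispatch table. This is only an illustration of the general pattern, not Autobrains’ implementation: the scenario names, the feature keys, and the plain functions standing in for trained networks are all hypothetical.

```python
from typing import Callable, Dict


# Plain functions standing in for small, scenario-specific trained networks.
def handle_pedestrian_ahead(features: dict) -> str:
    return "slow_down"


def handle_rain(features: dict) -> str:
    return "increase_gap"


def handle_clear_road(features: dict) -> str:
    return "keep_speed"


# Hypothetical scenario databank: scenario name -> specialist.
SPECIALISTS: Dict[str, Callable[[dict], str]] = {
    "pedestrian_ahead": handle_pedestrian_ahead,
    "rain": handle_rain,
    "clear_road": handle_clear_road,
}


def classify_scenario(features: dict) -> str:
    """Stand-in for the matching model that maps sensor data to a scenario."""
    if features.get("pedestrian_ahead"):
        return "pedestrian_ahead"
    if features.get("raining"):
        return "rain"
    return "clear_road"


def act(features: dict) -> str:
    """Match the situation to a scenario, then let its specialist decide."""
    scenario = classify_scenario(features)
    return SPECIALISTS[scenario](features)
```

For example, `act({"pedestrian_ahead": True})` routes to the pedestrian specialist, while an empty feature set falls through to the clear-road one. In a real system, both the matcher and the specialists would be learned models rather than hand-written rules.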
Old versus new
Traditional AV manufacturers are not yet ready to give full credence to this new breed of self-driving carmakers. Innovative, yes, but not practical, they say. They are not convinced that the state-of-the-art AI-based technology is anywhere near where traditional driverless-vehicle companies like Cruise or Waymo are right now.
Some experts argue that the mainstream modular approach would be easier to scale, as modules could simply be replaced when more advanced technology emerges. Also, the robotics-based approach has already proved itself in complex urban settings, and hence scaling to other cities would not be as big an obstacle as these start-ups are projecting it to be. And indeed, Cruise’s robo-taxis are already being used by paying customers in a city as populous as San Francisco. The new start-ups are yet to get their vehicles out of the stable, so to speak!
Even so, the developments look exciting. The new AV start-ups are already being hailed as AV2.0 firms, harbingers of the second generation of autonomous vehicles. Reinforcement learning did bring about a revolution in several areas of advanced computing, so why not in the AV industry too?