ML Planning & Simulation

Speaker: Peter Ondruska

Modern production self-driving systems still rely excessively on hand-engineering, especially for planning and simulation. This is becoming a limiting factor in self-driving development. Autonomy 2.0, which we cover in this tutorial, is a paradigm of using ML-first approaches for these components, offering greater scalability, safety and comfort for self-driving cars.

Bio: Peter Ondruska is Head of Research at Lyft's Level 5 division, where he leads a team working on ML planning and simulation technology for self-driving cars. Prior to Lyft, Peter was the co-founder and CEO of Blue Vision Labs, a company that developed large-scale localisation and mapping solutions for robotics and AR. The company was acquired by Lyft in 2018. Peter did his Ph.D. at the Oxford Robotics Institute and worked at Google, Facebook, Microsoft Research, and Metamind on computer vision and ML systems.

Scaling the Perception and Autonomous Driving Systems via Efficient Data, Learning, and Simulation Techniques

Speaker: Chen Wu

Modern machine learning techniques already achieve strong baseline performance for many components of autonomous driving systems. The real challenge in reaching product-level performance lies in the ability to evolve an extremely large, hybrid machine learning system very quickly. In 2020, Waymo announced the 5th-generation Waymo Driver. Waymo’s Perception system adapts to such major sensing changes rapidly and methodically; the same holds for different vehicle platforms such as Waymo Via. Such broad cross-platform support is enabled by aggressively applying domain adaptation techniques, together with effective use of new platform data (even at very low volume). One key challenge on the path to productionizing autonomous driving systems is the long-tail issue, and the ability to respond quickly as long-tail cases are discovered. Waymo machine learning researchers and practitioners apply numerous data augmentation techniques to make our system learn faster and, more importantly, to achieve safety. Finally, I’ll talk about how advanced simulation helps close the fast feedback loop, so that all the aforementioned techniques can be deployed in the fleet safely and in a timely manner.

Bio: Chen Wu is an Engineering Director in Perception at Waymo, a self-driving technology company with a mission to make it safe and easy for people and things to move around. In her role, Chen leads a team responsible for how our vehicles use Waymo’s custom sensors, including cameras, lidar and radar, to see the world around them and recognize a diverse set of objects. Her team develops a wide range of machine learning techniques across individual sensor modalities as well as sensor fusion, applying them to the vehicle’s perception system, which enables the vehicle to make real-time decisions. Prior to Waymo, Chen worked on the cameras on Google Glass. Before that, Chen was at YouTube, where she used machine learning to enable 2D videos to be viewed in 3D. Chen holds a Ph.D. and M.S. in electrical engineering from Stanford University, and a B.S. in control theory from Tsinghua University in China.

Towards Realistic and Scalable Simulation for Autonomous Driving

Speaker: Shenlong Wang

Self-driving vehicles could make ground transportation safer, cleaner, and more scalable. Achieving this requires earning public trust by showing self-driving vehicles have demonstrably safe behavior. Computer simulations are a reliable and scalable tool for validating and training self-driving autonomy. However, there is a significant realism gap between the simulated environment and the real world – a demonstration of safety in a simulation might not be reliable. In this talk, I will first present how we integrate real-world assets, learnable components, and physical models to simulate realistic sensory inputs, such as LiDARs and cameras. I will then demonstrate a novel deep generative model to generate high-fidelity and diverse traffic scenarios. Finally, I will give a brief personal outlook on open research topics on building simulation environments that benefit the development of self-driving autonomy.

Bio: Shenlong Wang will be joining the UIUC Department of Computer Science as an Assistant Professor in Fall 2021. He is currently visiting Intel Intelligent Systems Lab, working with Dr. Vladlen Koltun. He was previously a research scientist at Uber ATG and a Ph.D. student at the University of Toronto under the supervision of Prof. Raquel Urtasun. Shenlong's research interests span computer vision, robotics, and machine learning. His recent work involves developing robust algorithms for self-driving and making autonomous vehicles more reliable and scalable. His research has resulted in over 40 papers at top conferences, including over 15 oral and spotlight presentations. His co-authored work was an IROS Best Application Award Finalist in 2020. He was selected as the recipient of the Facebook, Adobe, and Royal Bank of Canada Fellowships in 2017.

Flexible and safe intent-driven predictions

Speaker: Anca Dragan

In control theory, we do human "prediction" by treating the person as adversarial -- anything they can physically do (their "forward reachable set") is an option to be guarded against by the robot. This fails when guarding against all options means our cars cannot leave the garage without violating a safety constraint. In machine learning, we do human prediction by fitting high capacity models to large datasets of human behavior. This fails when the model picks up on some spurious correlation from training that no longer holds at test time. In my group, we have been working on robustifying such predictors without going all the way to the forward reachable set -- we do this by leveraging the fact that human actions are not arbitrary, but based on decisions people make in service of the intentions they have, e.g. we drive the way we do because we want to get places while staying safe. Instead of learning the human actions directly from data, we learn the space of intentions people have and the reward functions that make sense of their actions under those intentions. However, this too can fail in novel situations, where the person's intent and what they care about go unmodelled. In such cases, it'd be great to just fall back on avoiding their forward reachable set. This talk introduces a model of human behavior that, by estimating online the extent to which intent-driven predictions fit with the current human's behavior, automatically interpolates between the learned model and the forward reachable set. In turn, this leads to the robot acting more conservatively whenever people start acting in unmodelled ways.
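
The interpolation idea in this abstract can be illustrated with a toy sketch. It is not the speaker's implementation. It assumes a discrete action set, a Boltzmann (noisily rational) human model with a confidence parameter `beta`, and a Bayesian belief over a few candidate `beta` values. In this toy setting, `beta = 0` gives a uniform distribution over all actions, which stands in for the forward reachable set. All function names and numbers are hypothetical.

```python
import numpy as np

def action_likelihood(q_values, beta):
    """Boltzmann human model: P(a) is proportional to exp(beta * Q(a))."""
    logits = beta * np.asarray(q_values, dtype=float)
    logits -= logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

def update_confidence(belief, betas, q_values, observed_action):
    """One Bayesian update of the belief over beta from an observed action."""
    likelihoods = np.array(
        [action_likelihood(q_values, b)[observed_action] for b in betas]
    )
    posterior = belief * likelihoods
    return posterior / posterior.sum()

def predict(belief, betas, q_values):
    """Action prediction marginalised over the belief on beta."""
    return sum(w * action_likelihood(q_values, b) for w, b in zip(belief, betas))

# Hypothetical setup: three actions, and the intent model prefers action 0.
betas = np.array([0.0, 1.0, 10.0])
q_values = np.array([2.0, 0.0, -2.0])

# Case 1: the human keeps choosing the action the model rates worst.
belief_surprised = np.full(3, 1.0 / 3.0)
for _ in range(5):
    belief_surprised = update_confidence(belief_surprised, betas, q_values, 2)
prediction_surprised = predict(belief_surprised, betas, q_values)

# Case 2: the human keeps choosing the action the model rates best.
belief_expected = np.full(3, 1.0 / 3.0)
for _ in range(5):
    belief_expected = update_confidence(belief_expected, betas, q_values, 0)
prediction_expected = predict(belief_expected, betas, q_values)
```

In the first case the belief collapses onto `beta = 0` and the prediction spreads almost evenly over every action, so a planner using it would guard against everything the person could do. In the second case the belief shifts toward high `beta` and the prediction stays concentrated on the modelled intent.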

Bio: Anca Dragan is an Assistant Professor in EECS at UC Berkeley, where she runs the InterACT lab. Her goal is to enable robots to work with, around, and in support of people. She works on algorithms for a) coordinating with people in shared spaces, and b) learning what people want the robot to do in the first place. She is also a Staff Research Scientist at Waymo, where she advises the Behavior team, which is responsible for prediction and planning. Anca did her Ph.D. in the Robotics Institute at Carnegie Mellon University on legible motion planning. At Berkeley, she helped found the Berkeley AI Research Lab and is a co-PI for the Center for Human-Compatible AI. Anca has been honored by the Presidential Early Career Award for Scientists and Engineers (PECASE), the Sloan Fellowship, the NSF CAREER award, MIT's TR35, and the RAS Early Career Award.

Data-Driven Control: Reinforcement Learning without Trial and Error

Speaker: Sergey Levine

How can reinforcement learning be applied to autonomous driving problems? On the surface, the notion of using reinforcement learning, which has come to be synonymous with learning by trial and error, seems highly problematic in safety-critical settings such as autonomous driving. However, the appeal of an end-to-end learning paradigm is considerable: if we can dispense with hand-engineered perception, prediction, and planning components, we could directly learn near-optimal strategies that optimally adapt perception and control systems to one another, in principle providing both better performance and a significantly simpler approach to the problem. In this talk, I will discuss some of the techniques being developed today that could make this possible. The core principle behind these techniques is the idea that reinforcement learning can learn directly from previously collected data, rather than learning through trial and error or active interaction. I will discuss how we can develop offline reinforcement learning algorithms that carry appealing theoretical guarantees, how these methods can be extended to provide safety during online data collection, and describe a few recent real-world robotics experiments that illustrate how offline data can be leveraged to learn robotic navigation strategies.
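
To make "learning directly from previously collected data" concrete, here is a minimal tabular sketch. It is an illustration only, not one of the algorithms from the talk. It assumes a fixed log of `(state, action, reward, next_state, done)` tuples and runs Q-learning sweeps over that log without ever querying an environment. As a simple guard against out-of-distribution actions, the bootstrap target only considers actions that appear in the log. The three-state chain and all names are hypothetical.

```python
import numpy as np

def offline_q_learning(dataset, n_states, n_actions, gamma=0.9, sweeps=200, lr=0.5):
    """Tabular Q-learning over a fixed log of transitions (no environment access)."""
    q = np.zeros((n_states, n_actions))
    seen = np.zeros((n_states, n_actions), dtype=bool)
    for s, a, _, _, _ in dataset:
        seen[s, a] = True  # remember which state-action pairs the log covers
    for _ in range(sweeps):
        for s, a, r, s_next, done in dataset:
            if done or not seen[s_next].any():
                target = r
            else:
                # Bootstrap only from actions observed in the data.
                target = r + gamma * q[s_next][seen[s_next]].max()
            q[s, a] += lr * (target - q[s, a])
    return q, seen

# Hypothetical logged data from a chain 0 -> 1 -> 2 (goal).
# Action 0 stays in place; action 1 advances; reaching state 2 pays reward 1.
logged = [
    (0, 1, 0.0, 1, False),
    (0, 0, 0.0, 0, False),
    (1, 1, 1.0, 2, True),
    (1, 0, 0.0, 1, False),
]
q_table, seen_mask = offline_q_learning(logged, n_states=3, n_actions=2)

# Greedy policy restricted to logged actions (states with no data default to 0).
policy = np.where(seen_mask, q_table, -np.inf).argmax(axis=1)
```

On this log the learned policy advances in both non-terminal states, even though the data also contains the "stay" action, because the values are propagated backwards from the logged reward.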

Bio: Sergey Levine is an Assistant Professor in the Department of Electrical Engineering and Computer Sciences at UC Berkeley. In his research, he focuses on the intersection between control and machine learning, with the aim of developing algorithms and techniques that can endow machines with the ability to autonomously acquire the skills for executing complex tasks. In particular, he is interested in how learning can be used to acquire complex behavioral skills, in order to endow machines with greater autonomy and intelligence.

Offboard Perception for Autonomous Driving

Speaker: Charles R. Qi

While current 3D object recognition research mostly focuses on the real-time, onboard scenario, there are many offboard use cases of perception that are largely underexplored, such as using machines to automatically generate high-quality 3D labels. Existing 3D object detectors fail to satisfy the high-quality requirement for offboard uses due to the limited input and speed constraints. In this talk, we introduce a novel offboard 3D object detection pipeline using point cloud sequence data. Observing that different frames capture complementary views of objects, we design the offboard detector to make use of the temporal points through both multi-frame object detection and novel object-centric refinement models. Evaluated on the Waymo Open Dataset, our pipeline named 3D Auto Labeling shows significant gains compared to the state-of-the-art onboard detectors and our offboard baselines. Its performance is even on par with human labels verified through a human label study. Further experiments demonstrate the application of auto labels for semi-supervised learning and unsupervised domain adaptation, as well as the application to build a large scale motion forecasting dataset.

Bio: Charles Qi is a research scientist at Waymo LLC. Previously, he was a postdoctoral researcher at Facebook AI Research (FAIR) in 2019 and received his Ph.D. from Stanford University (Stanford AI Lab and Geometric Computation Group) in 2018, advised by Professor Leonidas J. Guibas. Prior to Stanford, he received his B.Eng. from Tsinghua University in 2013. His research focuses on deep learning, computer vision and 3D. He has developed novel deep learning architectures for 3D data (point clouds, volumetric grids and multi-view images) that have wide applications in 3D object classification, object part segmentation, semantic scene parsing, scene flow estimation and 3D reconstruction; such deep architectures have been well adopted by both academic and industrial groups across the world.

When our Human Modeling Assumptions Fail

Speaker: Dorsa Sadigh

Prediction and behavior modeling remain an important challenge in autonomous driving. Most current techniques, model-based or data-driven, make strong assumptions about human behaviors, and simplistic assumptions about how autonomous cars should drive near the end of the risk spectrum. In this talk, we discuss settings where these assumptions fail to hold, and provide techniques and overarching paradigms that can capture human behavior even in complex scenarios. We will first discuss a hierarchical approach for driving in near-accident scenarios under phase transitions. We will then go over how to model human behaviors in scenarios near the end of the risk spectrum using ideas from cumulative prospect theory. Finally, we end the talk by going over other structures present in human interaction data, and how they can be useful in building better predictive models for autonomous driving.
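
For readers unfamiliar with cumulative prospect theory, the sketch below shows its two standard ingredients in the Tversky and Kahneman (1992) form: a value function that penalises losses more than it rewards gains, and a probability weighting function that overweights rare events. The abstract does not specify how the talk uses these, so the driving example and its numbers are hypothetical. The parameter values are the commonly cited estimates from that paper.

```python
import numpy as np

def cpt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value function: concave for gains, steeper (loss-averse) for losses."""
    x = np.asarray(x, dtype=float)
    magnitude = np.abs(x)
    return np.where(x >= 0, magnitude ** alpha, -lam * magnitude ** beta)

def cpt_weight(p, gamma=0.61):
    """Inverse-S probability weighting: small p is inflated, large p is deflated."""
    p = np.asarray(p, dtype=float)
    return p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)

# Hypothetical risky manoeuvre: 1% chance of a cost of 100, otherwise no cost.
p_bad, cost = 0.01, -100.0
expected_value = p_bad * cost
# With a single non-zero outcome, the CPT score reduces to w(p) * v(x).
cpt_score = float(cpt_weight(p_bad, gamma=0.69) * cpt_value(cost))
```

Under these assumptions the CPT score is several times more negative than the expected value, which is one way to capture why a person might avoid a manoeuvre that a risk-neutral model considers acceptable.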

Bio: Dorsa Sadigh is an assistant professor in Computer Science and Electrical Engineering at Stanford University. Her research interests lie at the intersection of robotics, learning, and control theory. Specifically, she is interested in developing algorithms for safe and adaptive human-robot interaction. Dorsa received her doctoral degree in Electrical Engineering and Computer Sciences (EECS) from UC Berkeley in 2017, and received her bachelor’s degree in EECS from UC Berkeley in 2012. She has been awarded the NSF CAREER award, the AFOSR Young Investigator award, the IEEE TCCPS early career award, the Google Faculty Award, and the Amazon Faculty Research Award.

The Value Proposition of Forecasting for motion planning

Speaker: Arun Venkatraman

One of the greatest challenges of self-driving is the interaction with other actors on the road, whether they be drivers, cyclists, or pedestrians. Autonomous Vehicles (AVs) would likely already be ubiquitous if these technical challenges of interaction were not so significant. A key recognition in the community is that the ability to forecast other actors is a powerful tool for computing safe, interpretable, and responsive driving actions. Forecasting other actors has traditionally been treated as a step to be cascaded after perception to provide key input to the motion planning sub-system. Aurora’s clean-sheet approach to self-driving gave us the space to identify a better way: integrating forecasting within our decision-making (i.e., Motion Planning) architecture. By understanding that Forecasting drives the creation of the Utility or Value function that governs the actions of the AV, we are able to directly consider the impact of the AV’s own decisions on the motions of other actors. Such reasoning empowers the Aurora Driver to keep the goal of making the correct decisions, rather than forecasting other actors' AV-agnostic multi-modal futures. We’re not interested in how accurately we can forecast another actor’s actions in the abstract but only in how it leads the Aurora Driver to make safe, expert-driver-like decisions.

Bio: Arun Venkatraman is a founding engineer at Aurora, the company delivering self-driving technology safely, quickly, and broadly. Arun completed his PhD, Training Strategies for Time Series: Learning for Prediction, Filtering, and Reinforcement Learning, at the Robotics Institute at Carnegie Mellon University co-advised by Dr. Drew Bagnell and Dr. Martial Hebert. During his time at CMU and NREC, Arun worked on a variety of robotics applications and received a best paper award at Robotics Science and Systems 2015 for work on autonomy assisted teleoperation via a brain-computer interface. At Aurora, Arun leads the Motion Planning Behavior Planning team, bringing together the best in machine learning with the best practices in robotics development to develop the Aurora Driver.

Persistence priors aid 3D detection with past traversals

Speaker: Kilian Q. Weinberger

While in operation, autonomous vehicles collect LiDAR data, which is typically discarded immediately after its use for object detection and subsequent actions. In this talk, we propose to utilize this data for future traversals of the same route (by the same or other vehicles). By comparing the point clouds across multiple traversals, we generate a persistence prior over the local scene, indicating if a local region is part of an object that is stationary over time. This prior information can be computed and queried very efficiently (in parallel) for each LiDAR point on demand, and a self-driving car traversing this route can use it as a scene prior for the detector. More importantly, the persistence prior is compatible with most modern 3D detection architectures, as an extra input channel, and can be incorporated easily on-the-fly by a detector trained with such information. I will also discuss Ithaca-365, a new dataset we have been collecting for facilitating this research.
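
A heavily simplified sketch of the idea follows. It is not the formulation used in the talk. It assumes that past traversals are already aligned in a common coordinate frame, and it scores each current LiDAR point by the fraction of past traversals that contain at least one point within a small radius. The function name, the radius, and the toy point clouds are all hypothetical, and the brute-force distance computation is only suitable for tiny examples.

```python
import numpy as np

def persistence_prior(query_points, past_traversals, radius=0.3):
    """Fraction of past traversals with a point within `radius` of each query point."""
    query_points = np.asarray(query_points, dtype=float)
    hits = np.zeros(len(query_points))
    for cloud in past_traversals:
        cloud = np.asarray(cloud, dtype=float)
        # Pairwise distances, shape (num_query, num_cloud_points).
        dists = np.linalg.norm(
            query_points[:, None, :] - cloud[None, :, :], axis=-1
        )
        hits += dists.min(axis=1) <= radius
    return hits / len(past_traversals)

# Hypothetical aligned traversals: a wall near (5, 0, 0) appears in all three,
# while a parked car at (2, 1, 0) appears in only one.
past = [
    [[5.0, 0.0, 0.0], [2.0, 1.0, 0.0]],
    [[5.0, 0.05, 0.0]],
    [[5.0, 0.0, 0.0]],
]
current = np.array([[5.0, 0.0, 0.1], [2.0, 1.0, 0.0], [8.0, 8.0, 0.0]])

prior = persistence_prior(current, past)
# Append the prior as an extra per-point input channel for a detector.
points_with_prior = np.hstack([current, prior[:, None]])
```

Points on persistent structure score close to 1, points in regions that were usually empty score close to 0, and the extra column can be fed to a detector alongside the xyz coordinates.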

Bio: Kilian Weinberger is an Associate Professor in the Department of Computer Science at Cornell University. He received his Ph.D. from the University of Pennsylvania in Machine Learning under the supervision of Lawrence Saul and his undergraduate degree in Mathematics and Computing from the University of Oxford. During his career he has won several best paper awards at ICML (2004), CVPR (2004, 2017), AISTATS (2005) and KDD (2014, runner-up award). In 2011 he was awarded the Outstanding AAAI Senior Program Chair Award and in 2012 he received an NSF CAREER award. He was elected co-Program Chair for ICML 2016 and for AAAI 2018. In 2016 he was the recipient of the Daniel M Lazar '29 Excellence in Teaching Award. Kilian Weinberger's research focuses on Machine Learning and its applications. In particular, he focuses on learning under resource constraints, metric learning, Gaussian Processes, computer vision and deep learning. Before joining Cornell University, he was an Associate Professor at Washington University in St. Louis and before that he worked as a research scientist at Yahoo!