
SoftBank Corp. Develops Traffic Understanding Multimodal AI for Autonomous Driving

Press Release, 5 November 2024

SoftBank Corp. (“SoftBank”) announced it has developed a traffic understanding multimodal AI for the remote support of autonomous driving, designed to operate on low-latency edge AI servers with the goal of achieving fully unmanned operations. The traffic understanding multimodal AI aims to address key challenges in the social implementation of autonomous driving by enhancing vehicle safety and reducing operational costs through external support. By running the multimodal AI in real time on SoftBank’s MEC (Multi-access Edge Computing) and other edge AI servers that offer low latency and high security, the system can understand the status of autonomous vehicles as it changes and provide reliable remote support for autonomous driving.

In October 2024, SoftBank launched a field trial of this remote support solution for autonomous driving, using the traffic understanding multimodal AI, at the Keio University Shonan Fujisawa Campus (“SFC”) in Fujisawa, Kanagawa. The trial aims to verify whether the traffic understanding multimodal AI can remotely support autonomous vehicles so that they continue to operate smoothly even when they encounter unforeseen situations that impede driving.

Key Features of the Traffic Understanding Multimodal AI

The traffic understanding multimodal AI processes forward-facing footage from autonomous vehicles (such as dashcam video) together with prompts about current traffic conditions to assess complex driving situations and potential risks, generating recommended actions that enable safe driving. The foundational AI model has been trained on a broad range of Japanese traffic knowledge, including traffic manuals and regulations, along with general driving scenarios, risk situations that are difficult to predict, and the corresponding countermeasures. This training gives the traffic understanding multimodal AI the wide spectrum of knowledge essential for the safe operation of autonomous vehicles, together with an advanced understanding of traffic conditions and potential driving risks.
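The press release does not disclose the model's interface, but the input-output contract it describes can be sketched as follows. In this minimal Python illustration, TrafficVLM, Assessment, and assess are hypothetical names, and a canned answer stands in for real model inference.

```python
# Hypothetical sketch of the input-output contract described above.
# TrafficVLM, Assessment, and assess() are illustrative names, not
# SoftBank's actual interface; a fixed answer stands in for inference.
from dataclasses import dataclass

@dataclass
class Assessment:
    traffic_conditions: str   # verbalized description of the scene
    driving_risks: str        # hazards inferred from the footage
    recommended_action: str   # countermeasure enabling safe driving

class TrafficVLM:
    """Stand-in for a vision-language model trained on Japanese traffic
    manuals, regulations, and hard-to-predict risk scenarios."""

    def assess(self, frames: list[bytes], prompt: str) -> Assessment:
        # Real inference would run on an edge GPU; we return a fixed answer.
        return Assessment(
            traffic_conditions="A vehicle is stopped ahead of a crosswalk.",
            driving_risks="A pedestrian may emerge from the blind spot.",
            recommended_action="Come to a complete stop before the crosswalk.",
        )

model = TrafficVLM()
result = model.assess(
    frames=[b"<jpeg bytes of a forward-facing dashcam frame>"],
    prompt="Describe the current traffic conditions and any driving risks.",
)
print(result.recommended_action)
```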

Overview of the Remote Support Solution for Autonomous Driving

In this solution, camera footage from autonomous vehicles is transmitted in real time over 5G (5th Generation Mobile Network) to MEC (Multi-access Edge Computing) servers. The traffic understanding multimodal AI, running on GPUs (Graphics Processing Units), instantly analyzes potential driving risks based on the transmitted footage and other data, and verbalizes those risks and recommended countermeasures to provide real-time remote support for autonomous driving. This allows autonomous vehicles to continue driving safely even in situations where they cannot assess risks independently. Currently, remote operators issue instructions to autonomous vehicles based on the information analyzed and verbalized by the traffic understanding multimodal AI; the ultimate goal, however, is fully unmanned operation, with the traffic understanding multimodal AI issuing instructions to the vehicles directly.
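As a rough sketch of this pipeline, the loop below shows how an edge server might consume frames arriving over 5G, run the model on a GPU, and route the verbalized result either to a remote operator (the current stage) or straight back to the vehicle (the fully unmanned target). The model, operator_console, and vehicle_link interfaces are assumptions, not SoftBank's published design.

```python
# Illustrative edge-side loop for the pipeline described above. The model,
# operator_console, and vehicle_link objects are assumed interfaces; the
# press release does not specify SoftBank's actual implementation.
import queue

frame_queue: "queue.Queue[bytes]" = queue.Queue()  # filled by the 5G ingest

def edge_support_loop(model, operator_console, vehicle_link,
                      unmanned_mode: bool = False) -> None:
    while True:
        frame = frame_queue.get()  # latest forward-camera frame via 5G
        assessment = model.assess(
            frames=[frame],
            prompt="Assess current traffic conditions and driving risks.",
        )
        if unmanned_mode:
            # Target stage: the AI issues instructions to the vehicle directly.
            vehicle_link.send(assessment.recommended_action)
        else:
            # Current stage: a remote operator reads the verbalized analysis
            # and decides what instruction to issue.
            operator_console.show(assessment)
```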

Example of the Field Trial

In the field trial conducted at SFC, one scenario tested was approaching a crosswalk with a vehicle stopped in front of it. In this scenario, there is a risk of overlooking a person attempting to cross from behind the stopped vehicle on the left, which could result in a collision with a pedestrian emerging from the vehicle’s blind spot as the autonomous vehicle approaches the crossing. Under Japanese traffic regulations, when approaching an unsignalized crosswalk with a vehicle stopped in front of it, drivers are required to come to a complete stop before proceeding.

When there is a vehicle stopped in front of a crosswalk, a complete stop is required

In this case, if the autonomous vehicle approaches the crosswalk at high speed or fails to stop, the remote operator must intervene to prevent an accident. However, because remote operators monitor multiple vehicles simultaneously, they may not detect dangers immediately or may be unable to intervene appropriately. To address this, the traffic understanding multimodal AI generates information on the current “traffic conditions,” “driving risks,” and “recommended actions to mitigate risks” in real time and issues instructions directly to the autonomous vehicle, enabling effective remote support for autonomous driving from an external location.
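A minimal sketch of this escalation logic, assuming a numeric risk score and an intervention threshold that the press release does not specify:

```python
# Minimal sketch of the escalation described above: verbalized analysis is
# always available to the operator, and when the risk level rises the AI
# pushes the instruction itself. The score scale and threshold are assumed.
RISK_THRESHOLD = 0.8  # hypothetical score above which the AI intervenes

def support_decision(risk_score: float, recommended_action: str) -> str:
    if risk_score >= RISK_THRESHOLD:
        return f"INSTRUCT_VEHICLE: {recommended_action}"
    return "NOTIFY_OPERATOR: analysis available for review"

# Example: the crosswalk scenario from the field trial.
print(support_decision(
    0.9,
    "There is a vehicle stopped ahead of the crosswalk. Please come to a "
    "complete stop, as a pedestrian may suddenly appear.",
))
```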

In the field trial, when the autonomous vehicle approached the stopped vehicle and the crosswalk and the risk level rose, the system generated the instruction: “There is a vehicle stopped ahead of the crosswalk. Please come to a complete stop, as a pedestrian may suddenly appear.” This confirmed the feasibility of remotely supporting autonomous driving from an external location.

Future Developments

This remote support solution for autonomous driving is being used experimentally in autonomous driving trials conducted by MONET Technologies Inc. By continuously learning from the unpredictable driving risks encountered in real driving environments and the corresponding recommended actions, the traffic understanding multimodal AI will become even more accurate.

Hironobu Tamba, Vice President and Head of the Data Platform Strategy Division at SoftBank, said, “SoftBank is advancing the development of one of the largest AI computing infrastructures in Japan, alongside a domestic large language model (LLM). With the successful development of SoftBank’s proprietary traffic understanding multimodal AI and its trials in remote support solutions for autonomous driving, I am confident that the integration of communication technology and AI can offer promising solutions to societal challenges.”

Going forward, SoftBank will continue to promote research and development aimed at the social implementation of autonomous driving, striving to address societal challenges.
