APA Style
Sonda Fourati, Wael Jaafar, Noura Baccar. (2025). A Novel MLLM-based Approach for Autonomous Driving in Different Weather Conditions. Computing&AI Connect, 2 (Article ID: 0018). https://doi.org/Registering DOI
MLA Style
Sonda Fourati, Wael Jaafar, Noura Baccar. "A Novel MLLM-based Approach for Autonomous Driving in Different Weather Conditions". Computing&AI Connect, vol. 2, 2025, Article ID: 0018, https://doi.org/Registering DOI.
Chicago Style
Sonda Fourati, Wael Jaafar, Noura Baccar. 2025. "A Novel MLLM-based Approach for Autonomous Driving in Different Weather Conditions." Computing&AI Connect 2 (2025): 0018. https://doi.org/Registering DOI.
Volume 2, Article ID: 2025.0018
Sonda Fourati
sonda.fourati@medtech.tn
Wael Jaafar
wael.jaafar@etsmtl.ca
Noura Baccar
noura.baccar@medtech.tn
1 The Computer Systems Engineering Department, Mediterranean Institute of Technology (MedTech), Tunis, Tunisia.
2 The Software and IT Engineering Department, École de Technologie Supérieure (ÉTS), Montreal, QC H3C 1K3, Canada.
* Author to whom correspondence should be addressed
Received: 09 Feb 2025 Accepted: 22 Jun 2025 Available Online: 01 Jul 2025
Autonomous driving (AD) technology promises to revolutionize daily transportation by making it safer, more efficient, and more comfortable. Its role in reducing traffic accidents and improving mobility is vital to the future of intelligent transportation systems. AD systems (ADS) are expected to function reliably across diverse and challenging environments. However, existing solutions often struggle under harsh weather conditions such as fog, rain, or storms, and mostly rely on unimodal inputs, which limits their adaptability and performance. Meanwhile, multimodal large language models (MLLMs) have shown remarkable capabilities in perception, reasoning, and decision-making, yet their application in AD, particularly under extreme environmental conditions, remains largely unexplored. Consequently, this paper proposes MLLM-AD-4o, a novel AD agent that leverages prompt engineering to integrate camera and LiDAR inputs for enhanced perception and control. MLLM-AD-4o dynamically adapts to the available sensor modalities and is built upon GPT-4o for contextual reasoning and decision-making. To support realistic evaluation, the agent was developed using the LimSim++ framework, which integrates the SUMO and CARLA driving simulators. Experiments were conducted under harsh conditions, including bad weather, poor visibility, and complex traffic scenarios. The robustness and performance of the MLLM-AD-4o agent were assessed in terms of decision-making, perception, and control. The results demonstrate the agent’s ability to maintain high levels of safety and efficiency, even in extreme conditions and with different perception configurations (e.g., cameras only, cameras with LiDAR). Finally, this work provides valuable insights into integrating MLLMs with AD frameworks, paving the way for fully safe ADS.
Disclaimer: This is not the final version of the article. Changes may occur when the manuscript is published in its final format.