Abstract
Large Language Models (LLMs) have demonstrated remarkable reasoning capabilities in robotics; however, their deployment in multi-robot systems remains limited and largely unstructured, particularly in managing task dependencies. This study introduces DART-LLM, which employs Directed Acyclic Graphs (DAGs) to model task dependencies, enabling the effective decomposition of natural language instructions into interdependent subtasks for coordinated multi-robot execution. The DART-LLM framework integrates a Question-Answering (QA) LLM module for instruction parsing and dependency-aware task decomposition, a Breakdown Function module for task parsing and robot assignment, an Actuation module for task execution, and a Vision-Language Model (VLM)-based object detection module for environmental perception, achieving end-to-end task execution. Experimental results across three task complexity levels demonstrate that DART-LLM achieves state-of-the-art performance, significantly outperforming the baseline across all evaluation metrics. Among the tested models, DeepSeek-r1 achieves the highest success rate, while Llama3.1 delivers the most reliable response times. Ablation studies further confirm that incorporating explicit dependency modeling improves performance across all models, particularly enhancing the reasoning capabilities of smaller models.
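To make the DAG-based dependency modeling concrete, the sketch below shows one way a decomposed instruction could be represented and ordered for execution. It is a minimal illustration only: the subtask names, fields, and robot identifiers are hypothetical and do not reflect the actual DART-LLM breakdown schema.

```python
from graphlib import TopologicalSorter

# Minimal illustration of dependency-aware task decomposition.
# Subtask names, fields, and robot IDs are hypothetical; they do not
# reflect the actual DART-LLM breakdown schema.
# Example instruction (L1-T1-001): "Dump truck 1 goes to the puddle
# for inspection, after which all robots avoid the puddle."
subtasks = {
    "inspect_puddle":   {"robot": "dump_truck_1", "deps": []},
    "truck_avoids":     {"robot": "dump_truck_1", "deps": ["inspect_puddle"]},
    "excavator_avoids": {"robot": "excavator_1",  "deps": ["inspect_puddle"]},
}

# Each subtask maps to its prerequisites, forming a Directed Acyclic
# Graph (DAG); a topological order yields a valid execution sequence.
graph = {name: set(spec["deps"]) for name, spec in subtasks.items()}
for name in TopologicalSorter(graph).static_order():
    print(f"{subtasks[name]['robot']} -> {name}")
```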
Experimental Results
Note: The simulation results shown below cover the first six examples from the YongdongWang/dart_llm_tasks dataset. We have also conducted real robot experiments with two of these examples. The natural language prompt displayed below each demonstration video is the original task instruction provided to the DART-LLM system. For the real robot experiments, annotations such as "(Real Robot) - * view" are added for clarity and are not part of the input prompts. An illustrative dependency sketch for the most complex simulation prompt appears after the list below.
"L1-T1-001: Dump truck 1 goes to the puddle for inspection, after which all robots avoid the puddle."
"L1-T2-001: Drive the Excavator 1 to the obstacle, and perform excavation to clear the obstacle."
"L2-T1-001: Send Excavator 1 and Dump Truck 1 to the soil area; Excavator 1 will excavate and unload, followed by Dump Truck 1 proceeding to the puddle for unloading."
"L2-T2-001: Move Excavator 1 and Dump Truck 1 to soil area 2; Excavator 1 will excavate and unload, then Dump Truck 1 returns to the starting position to unload."
"L3-T1-001: Excavator 1 is guided to the obstacle to excavate and unload to clear the obstacle, then excavator 1 and dump truck 1 are moved to the soil area, and the excavator excavates and unloads. Finally, dump truck 1 unloads the soil into the puddle."
"L3-T2-001: Excavator 1 goes to the obstacle to excavate and unload to clear the obstacle. Once the obstacle is cleared, mobilize all available robots to proceed to the puddle area for inspection."
Real Robot Experiments
"L1-T1-001: Dump truck 1 goes to the puddle for inspection, after which all robots avoid the puddle. (Real Robot) - Top view"
"L1-T1-001: Dump truck 1 goes to the puddle for inspection, after which all robots avoid the puddle. (Real Robot) - Camera view"
"L2-T1-001: Send Excavator 1 and Dump Truck 1 to the soil area; Excavator 1 will excavate and unload, followed by Dump Truck 1 proceeding to the puddle for unloading. (Real Robot) - Top view"
"L2-T1-001: Send Excavator 1 and Dump Truck 1 to the soil area; Excavator 1 will excavate and unload, followed by Dump Truck 1 proceeding to the puddle for unloading. (Real Robot) - Camera view"