Mono-Camera-Only Target Chasing for a Drone in a Dense Environment by Cross-Modal Learning

  • Published May 26, 2024
    * Status: Accepted for publication in IEEE Robotics and Automation Letters (RA-L)
    * Category: Vision-Based Navigation, Visual Learning, Deep Learning for Visual Perception
    * Authors: Seungyeon Yoo¹, Seungwoo Jung¹, Yunwoo Lee, Dongseok Shim, and H. Jin Kim
    * Abstract: Chasing a dynamic target in a dense environment is one of the challenging applications of autonomous drones. The task typically requires multi-modal data, such as RGB and depth, to achieve safe and robust maneuvers. However, carrying multiple sensor modalities is difficult given the limited capacity of drones in terms of hardware complexity and sensor cost. Our framework removes this restriction in the target-chasing task by using only a monocular camera instead of multiple sensor inputs. From an RGB input, the perception module extracts a cross-modal representation containing information from multiple data modalities. To learn cross-modal representations at training time, we employ variational autoencoder (VAE) structures with a joint objective function across the heterogeneous data. Subsequently, using latent vectors obtained from the pre-trained perception module, the planning module generates an appropriate next-time-step waypoint by imitating an expert that performs numerical optimization on privileged RGB-D data. Furthermore, the planning module exploits temporal information about the target through consecutive cross-modal representations to improve tracking performance. Finally, we demonstrate the effectiveness of our framework through the reconstruction results of the perception module, the target-chasing performance of the planning module, and the zero-shot sim-to-real deployment of a drone. (Rough sketches of both modules are given below.)
    * Contact: syeon.yoo@snu.ac.kr; tmddn833@snu.ac.kr
  • Science & Technology
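
The perception module described in the abstract can be illustrated with a minimal PyTorch sketch: a single RGB encoder produces a Gaussian latent, and two decoders reconstruct RGB and depth, so the joint objective forces the RGB-derived latent to carry cross-modal (depth) information. All layer sizes, the 64x64 input resolution, and the `beta` KL weight here are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalVAE(nn.Module):
    """Sketch of a cross-modal VAE: encode RGB only, decode RGB and depth."""
    def __init__(self, latent_dim=128):
        super().__init__()
        # RGB encoder -> flattened feature for the Gaussian posterior.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(), # 16 -> 8
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128 * 8 * 8, latent_dim)
        self.fc_logvar = nn.Linear(128 * 8 * 8, latent_dim)
        # One shared latent feeds two modality-specific decoders.
        self.fc_dec = nn.Linear(latent_dim, 128 * 8 * 8)
        self.rgb_decoder = self._make_decoder(out_channels=3)
        self.depth_decoder = self._make_decoder(out_channels=1)

    @staticmethod
    def _make_decoder(out_channels):
        return nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),  # 8 -> 16
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # 16 -> 32
            nn.ConvTranspose2d(32, out_channels, 4, stride=2, padding=1),    # 32 -> 64
        )

    def forward(self, rgb):
        h = self.encoder(rgb)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        h_dec = self.fc_dec(z).view(-1, 128, 8, 8)
        return self.rgb_decoder(h_dec), self.depth_decoder(h_dec), mu, logvar

def joint_loss(rgb_hat, depth_hat, rgb, depth, mu, logvar, beta=1e-3):
    # Joint objective across heterogeneous modalities: both reconstructions
    # must succeed from the single RGB-derived latent, plus a KL regularizer.
    rec_rgb = F.mse_loss(rgb_hat, rgb)
    rec_depth = F.mse_loss(depth_hat, depth)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec_rgb + rec_depth + beta * kl
```

At test time only the encoder is needed: depth never has to be sensed on the drone, because the latent was trained to predict it.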

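The planning module can be sketched in the same spirit: a short history of consecutive cross-modal latents is regressed onto the expert's next-time-step waypoint (behavior cloning). The GRU, the five-step history, and the 3-D waypoint output are assumptions for illustration; per the abstract, the expert labels come from a numerical optimization over privileged RGB-D data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentPlanner(nn.Module):
    """Sketch: map consecutive perception latents to the next waypoint."""
    def __init__(self, latent_dim=128, hidden_dim=64):
        super().__init__()
        # The recurrent layer aggregates temporal information about the target.
        self.gru = nn.GRU(latent_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, 64), nn.ReLU(),
            nn.Linear(64, 3),  # hypothetical next waypoint (x, y, z)
        )

    def forward(self, z_seq):  # z_seq: (batch, history_len, latent_dim)
        _, h = self.gru(z_seq)
        return self.head(h[-1])

# Behavior cloning against precomputed expert waypoints (placeholder data).
planner = LatentPlanner()
opt = torch.optim.Adam(planner.parameters(), lr=1e-3)
z_seq = torch.randn(32, 5, 128)   # consecutive latents from the frozen perception module
expert_wp = torch.randn(32, 3)    # placeholder labels from the privileged expert
loss = F.mse_loss(planner(z_seq), expert_wp)
opt.zero_grad(); loss.backward(); opt.step()
```

Feeding a sequence of latents rather than a single frame is what lets the planner infer target motion, which the abstract credits for the improved tracking performance.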