Article Abstract
Object Detection Method Based on LiDAR-Vision BEV Feature Fusion
Submitted: 2025-10-24  Revised: 2025-11-28
DOI:
Keywords: Quadrotor UAV; Panoramic LiDAR; RGB Camera; Object Detection; Feature Fusion
Funding: Aeronautical Science Foundation of China (20230055063008)
Author Affiliations
Chen Guangyong  China National Aeronautical Radio Electronics Research Institute, National Key Laboratory of Avionics Integration and Aviation System-of-Systems Synthesis
Hou Qianlei  School of Control Science and Engineering, Dalian University of Technology
Yan Fei*  School of Control Science and Engineering, Dalian University of Technology
Abstract:
      This study addresses the challenges UAVs face in object detection when relying on a single sensor: limited perception range, missing depth information, and performance degradation in low light. We propose an object detection method based on LiDAR-vision Bird's Eye View (BEV) feature fusion, developed on a multimodal perception platform composed of a panoramic LiDAR and an RGB camera. Features are extracted separately from point clouds and images and aligned in BEV space, effectively combining the precise ranging capability of LiDAR with the rich texture information of the camera. The method incorporates depthwise separable convolution and a channel attention mechanism to improve fusion efficiency, and a Transformer decoder applies self-attention and cross-attention to the multimodal features before a detection head predicts object categories, positions, and sizes. Experiments on the NuScenes dataset and a self-collected campus dataset show that the proposed method maintains high detection accuracy while significantly improving robustness and practicality in complex environments, making it suitable for UAV applications that demand high perceptual reliability.
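The abstract does not give implementation details, so the following is only a minimal NumPy sketch of one plausible reading of the channel-attention fusion step: aligned LiDAR and camera BEV feature maps are concatenated along the channel axis, and a squeeze-and-excitation style channel attention reweights the fused channels. All function names, the SE-style formulation, and the random weights are illustrative assumptions, not the paper's implementation; the depthwise separable convolution and Transformer decoder stages are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """SE-style channel attention (assumed form): squeeze each channel by
    global average pooling, excite through a two-layer MLP, then reweight."""
    # feat: (C, H, W) BEV feature map
    squeeze = feat.mean(axis=(1, 2))           # (C,) per-channel statistics
    hidden = np.maximum(w1 @ squeeze, 0.0)     # ReLU bottleneck, (C // r,)
    weights = sigmoid(w2 @ hidden)             # (C,) attention weights in (0, 1)
    return feat * weights[:, None, None]       # broadcast over H, W

def fuse_bev(lidar_bev, camera_bev, w1, w2):
    """Concatenate spatially aligned LiDAR and camera BEV features along the
    channel axis, then let channel attention emphasize informative channels."""
    fused = np.concatenate([lidar_bev, camera_bev], axis=0)  # (2C, H, W)
    return channel_attention(fused, w1, w2)

# Toy example with hypothetical sizes: C channels per modality, r = reduction ratio.
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4
lidar_bev = rng.standard_normal((C, H, W))
camera_bev = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((2 * C // r, 2 * C))   # squeeze MLP, layer 1
w2 = rng.standard_normal((2 * C, 2 * C // r))   # squeeze MLP, layer 2
fused = fuse_bev(lidar_bev, camera_bev, w1, w2)
print(fused.shape)  # (16, 16, 16): 2C fused channels on the shared BEV grid
```

The key design point this sketch illustrates is that fusion happens in a shared BEV grid, so a simple channel-wise concatenation suffices and the attention mechanism only has to learn which modality's channels to trust per feature, not any spatial correspondence.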