MV-FCOS3D++

Authors	Tai Wang, Qing Lian, Chenming Zhu, Xinge Zhu, Wenwei Zhang
Description	We build a multi-view framework with temporal stereo modeling that converts multi-view features into a 3D grid space and performs 3D detection on it. The ResNet101-DCN backbone, based on FCOS3D++, is pretrained on Waymo using only object annotations. We do not use lidar depth labels during either training or inference. Code will be released in MMDetection3D.
Project Link	Link
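
The step of converting multi-view features to a 3D grid space, as described above, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `lift_features_to_grid`, the pinhole camera model, nearest-neighbor sampling, and plain averaging across views are all assumptions chosen for brevity, and temporal stereo modeling is not covered.

```python
import numpy as np

def lift_features_to_grid(feats, intrinsics, extrinsics, voxel_centers):
    """Average per-view image features at the projection of each voxel center.

    Hypothetical helper for illustration only.
    feats:         (V, C, H, W) per-view feature maps
    intrinsics:    (V, 3, 3) pinhole matrices at feature-map resolution
    extrinsics:    (V, 4, 4) world-to-camera transforms (camera looks along +z)
    voxel_centers: (N, 3) voxel centers in the world frame
    returns:       (N, C) fused features, (N,) count of views seeing each voxel
    """
    V, C, H, W = feats.shape
    N = voxel_centers.shape[0]
    homo = np.concatenate([voxel_centers, np.ones((N, 1))], axis=1)  # (N, 4)
    fused = np.zeros((N, C))
    hits = np.zeros(N)
    for v in range(V):
        cam = (extrinsics[v] @ homo.T).T[:, :3]      # points in camera frame
        depth = cam[:, 2]
        in_front = depth > 1e-6                      # drop points behind the camera
        safe_depth = np.where(in_front, depth, 1.0)  # avoid division by zero
        pix = (intrinsics[v] @ cam.T).T              # unnormalized pixel coordinates
        col = np.rint(pix[:, 0] / safe_depth).astype(int)
        row = np.rint(pix[:, 1] / safe_depth).astype(int)
        valid = in_front & (col >= 0) & (col < W) & (row >= 0) & (row < H)
        # Nearest-neighbor sampling: (C, K) -> (K, C) for the K valid voxels.
        fused[valid] += feats[v][:, row[valid], col[valid]].T
        hits[valid] += 1
    seen = hits > 0
    fused[seen] /= hits[seen][:, None]               # mean over views that see the voxel
    return fused, hits
```

Voxels that no camera sees keep a zero feature vector. A 3D detection head would then operate on the fused features reshaped to the grid dimensions.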

TYPE_VEHICLE