LOSP (Learnable Orthogonal Subspace Projection)
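Core idea
- Highlights are an additive term: I = R⊙L + S. A logarithm turns the product R⊙L into a sum but leaves the additive S entangled, so conventional log-based methods cannot separate it out.
- Assume the highlight vector S is low-rank in RGB space (its direction is relatively fixed).
- Learn a projection basis W such that W·S ≈ 0, which physically removes the highlight.
- The projected features can then be fed through YOLA's Log-Ratio logic.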
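The projection step can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the highlight direction is known and equal to the white direction (1, 1, 1); in LOSP the basis W is learned rather than constructed. The helper names `build_projection` and `project` are hypothetical, and plain Python is used to stay dependency-free.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def build_projection(s_dir):
    """Return a 2x3 matrix W whose orthonormal rows are orthogonal to s_dir."""
    s = normalize(s_dir)
    # Pick a helper axis that is not parallel to s.
    helper = [1.0, 0.0, 0.0] if abs(s[0]) < 0.9 else [0.0, 1.0, 0.0]
    w1 = normalize(cross(s, helper))
    w2 = cross(s, w1)  # already unit length: s and w1 are orthogonal unit vectors
    return [w1, w2]

def project(W, pixel):
    """Compute W·pixel for a list-of-rows matrix W."""
    return [dot(row, pixel) for row in W]

# Assumed highlight direction (illustrative, not learned).
W = build_projection([1.0, 1.0, 1.0])

diffuse = [0.30, 0.20, 0.10]    # stands in for R ⊙ L
highlight = [0.25, 0.25, 0.25]  # stands in for S, along the assumed direction
observed = [d + s for d, s in zip(diffuse, highlight)]  # I = R⊙L + S

print(project(W, highlight))  # both components are ~0
print(project(W, observed))   # matches the projection of the diffuse term
print(project(W, diffuse))
```

Because the projection is linear, W·I = W·(R⊙L) + W·S, so whenever W·S ≈ 0 the highlight drops out of the projected feature.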
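Module composition
- LOSPProjection: learnable orthogonal projection layer (the core innovation)
- DualInvariantExtractor: dual-stream feature extraction (spectral + spatial invariants)
- AdaptiveFusion: adaptive fusion
- SoftTailLoss: differentiable long-tail suppression loss
- OrthogonalityLoss: orthogonality and diversity constraint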
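The notes do not spell out how OrthogonalityLoss is defined. One common way to encourage orthonormal, non-redundant basis vectors is to penalize the squared Frobenius norm of W·Wᵀ − I; the sketch below assumes that form, and `orthogonality_penalty` is a hypothetical name rather than the module's actual code.

```python
def orthogonality_penalty(W):
    """Squared Frobenius norm of W @ W.T - I for a list-of-rows matrix W.

    Assumed form of the constraint; the actual OrthogonalityLoss may differ.
    """
    k = len(W)
    total = 0.0
    for i in range(k):
        for j in range(k):
            gram_ij = sum(a * b for a, b in zip(W[i], W[j]))
            target = 1.0 if i == j else 0.0
            total += (gram_ij - target) ** 2
    return total

orthonormal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # distinct unit rows
redundant = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]    # the same row twice

print(orthogonality_penalty(orthonormal))  # 0.0
print(orthogonality_penalty(redundant))    # 2.0
```

Under this assumed form the penalty is zero only when the rows are orthonormal, and duplicated rows are penalized through the off-diagonal Gram entries, which is one way to read the "diversity" part of the constraint.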
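Mathematical basis
- Projection: proj_W(I) = W·I, where the rows of W form an orthonormal basis
- Spectral invariant: log(C_i) - log(C_j) cancels multiplicative illumination
- Spatial invariant: ∇(log(C)) preserves single-channel texture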
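A short numerical check of the two invariants, under stated assumptions: illumination is modeled as a single scalar gain shared by all channels and by neighboring pixels, and the gradient is approximated by a forward difference along a 1-D row. The function names and sample values are illustrative only.

```python
import math

def log_ratio(pixel, i, j):
    """Spectral invariant: log(C_i) - log(C_j) for one pixel."""
    return math.log(pixel[i]) - math.log(pixel[j])

def log_gradient(row):
    """Spatial invariant: forward difference of log intensities in one channel."""
    return [math.log(b) - math.log(a) for a, b in zip(row, row[1:])]

gain = 3.0  # hypothetical multiplicative illumination factor

# Spectral: the gain is shared across channels, so it cancels in the difference.
pixel = [0.40, 0.20, 0.10]
lit_pixel = [gain * c for c in pixel]
print(log_ratio(pixel, 0, 1), log_ratio(lit_pixel, 0, 1))

# Spatial: the gain is shared by neighboring pixels, so it cancels in the gradient.
row = [0.10, 0.20, 0.80]
lit_row = [gain * c for c in row]
print(log_gradient(row), log_gradient(lit_row))

# An additive highlight does not cancel, which is why it is projected out first.
highlight = 0.25
shiny_pixel = [c + highlight for c in pixel]
print(log_ratio(pixel, 0, 1), log_ratio(shiny_pixel, 0, 1))
```

The first two pairs agree up to floating-point rounding, while the last pair differs, matching the point that a log transform handles the multiplicative term but not an additive S.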
Metrics
Baseline of the baselines
yolov3
| Class | GT count | Detections | Recall | AP |
|---|---|---|---|---|
| Bicycle | 418 | 1311 | 0.866 | 0.801 |
| Boat | 515 | 1221 | 0.817 | 0.722 |
| Bottle | 433 | 1382 | 0.813 | 0.724 |
| Bus | 164 | 511 | 0.909 | 0.848 |
| Car | 919 | 2429 | 0.862 | 0.783 |
| Cat | 425 | 1348 | 0.831 | 0.669 |
| Chair | 609 | 2970 | 0.772 | 0.651 |
| Cup | 356 | 1084 | 0.831 | 0.711 |
| Dog | 490 | 1296 | 0.898 | 0.786 |
| Motorbike | 242 | 1435 | 0.826 | 0.641 |
| People | 2235 | 6152 | 0.858 | 0.768 |
| Table | 311 | 2079 | 0.746 | 0.494 |
| mRecall | | | 0.836 | |
| mAP | | | | 0.717 |
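The summary rows appear to be unweighted means over the 12 classes. The sketch below checks that reading against the per-class yolov3 values above; the variable names are illustrative, and because the per-class numbers are already rounded to three decimals, agreement is only expected to within about 0.001.

```python
# Per-class (Recall, AP) copied from the yolov3 table.
yolov3 = {
    "Bicycle": (0.866, 0.801), "Boat": (0.817, 0.722),
    "Bottle": (0.813, 0.724), "Bus": (0.909, 0.848),
    "Car": (0.862, 0.783), "Cat": (0.831, 0.669),
    "Chair": (0.772, 0.651), "Cup": (0.831, 0.711),
    "Dog": (0.898, 0.786), "Motorbike": (0.826, 0.641),
    "People": (0.858, 0.768), "Table": (0.746, 0.494),
}

# Unweighted mean over classes (every class counts equally, regardless of GT count).
m_recall = sum(r for r, _ in yolov3.values()) / len(yolov3)
m_ap = sum(a for _, a in yolov3.values()) / len(yolov3)

print(f"mRecall ~ {m_recall:.4f} (reported 0.836)")
print(f"mAP     ~ {m_ap:.4f} (reported 0.717)")
```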
yolov3 with IIM only
| Class | GT count | Detections | Recall | AP |
|---|---|---|---|---|
| Bicycle | 418 | 5830 | 0.713 | 0.491 |
| Boat | 515 | 11495 | 0.713 | 0.338 |
| Bottle | 433 | 9673 | 0.637 | 0.393 |
| Bus | 164 | 3939 | 0.811 | 0.586 |
| Car | 919 | 22777 | 0.691 | 0.394 |
| Cat | 425 | 10333 | 0.706 | 0.165 |
| Chair | 609 | 18323 | 0.696 | 0.368 |
| Cup | 356 | 14124 | 0.643 | 0.315 |
| Dog | 490 | 10147 | 0.810 | 0.349 |
| Motorbike | 242 | 13163 | 0.624 | 0.175 |
| People | 2235 | 33656 | 0.723 | 0.419 |
| Table | 311 | 12463 | 0.640 | 0.180 |
| mRecall | | | 0.701 | |
| mAP | | | | 0.348 |
TOOD
| Class | GT count | Detections | Recall | AP |
|---|---|---|---|---|
| Bicycle | 418 | 3331 | 0.911 | 0.802 |
| Boat | 515 | 6075 | 0.926 | 0.765 |
| Bottle | 433 | 5791 | 0.868 | 0.720 |
| Bus | 164 | 2208 | 0.970 | 0.892 |
| Car | 919 | 9996 | 0.930 | 0.780 |
| Cat | 425 | 3840 | 0.894 | 0.725 |
| Chair | 609 | 11617 | 0.890 | 0.672 |
| Cup | 356 | 5500 | 0.904 | 0.713 |
| Dog | 490 | 4477 | 0.971 | 0.861 |
| Motorbike | 242 | 5284 | 0.917 | 0.624 |
| People | 2235 | 22387 | 0.906 | 0.762 |
| Table | 311 | 10395 | 0.897 | 0.501 |
| mRecall | | | 0.915 | |
| mAP | | | | 0.735 |
YOLA-yolov3
| Class | GT count | Detections | Recall | AP |
|---|---|---|---|---|
| Bicycle | 418 | 1159 | 0.895 | 0.832 |
| Boat | 515 | 1275 | 0.860 | 0.770 |
| Bottle | 433 | 1404 | 0.818 | 0.726 |
| Bus | 164 | 446 | 0.939 | 0.881 |
| Car | 919 | 2327 | 0.890 | 0.816 |
| Cat | 425 | 1407 | 0.845 | 0.685 |
| Chair | 609 | 2648 | 0.788 | 0.668 |
| Cup | 356 | 1201 | 0.820 | 0.699 |
| Dog | 490 | 1345 | 0.906 | 0.801 |
| Motorbike | 242 | 1182 | 0.798 | 0.638 |
| People | 2235 | 6357 | 0.868 | 0.778 |
| Table | 311 | 2123 | 0.807 | 0.494 |
| mRecall | | | 0.853 | |
| mAP | | | | 0.732 |
YOLA-TOOD
| Class | GT count | Detections | Recall | AP |
|---|---|---|---|---|
| Bicycle | 418 | 3584 | 0.933 | 0.833 |
| Boat | 515 | 5438 | 0.940 | 0.785 |
| Bottle | 433 | 5793 | 0.891 | 0.742 |
| Bus | 164 | 2042 | 0.957 | 0.888 |
| Car | 919 | 9011 | 0.933 | 0.794 |
| Cat | 425 | 3733 | 0.908 | 0.749 |
| Chair | 609 | 10100 | 0.893 | 0.706 |
| Cup | 356 | 5269 | 0.899 | 0.722 |
| Dog | 490 | 4176 | 0.963 | 0.862 |
| Motorbike | 242 | 5374 | 0.905 | 0.633 |
| People | 2235 | 20178 | 0.905 | 0.778 |
| Table | 311 | 8944 | 0.884 | 0.495 |
| mRecall | | | 0.918 | |
| mAP | | | | 0.749 |
LOSP-TOOD, first version (original RGB not fused)
| Class | GT count | Detections | Recall | AP |
|---|---|---|---|---|
| Bicycle | 418 | 4115 | 0.933 | 0.819 |
| Boat | 515 | 7038 | 0.951 | 0.789 |
| Bottle | 433 | 6269 | 0.882 | 0.741 |
| Bus | 164 | 2061 | 0.976 | 0.896 |
| Car | 919 | 9976 | 0.939 | 0.800 |
| Cat | 425 | 5306 | 0.911 | 0.724 |
| Chair | 609 | 12550 | 0.910 | 0.690 |
| Cup | 356 | 6336 | 0.919 | 0.709 |
| Dog | 490 | 4977 | 0.969 | 0.858 |
| Motorbike | 242 | 5505 | 0.930 | 0.651 |
| People | 2235 | 24726 | 0.923 | 0.782 |
| Table | 311 | 9962 | 0.900 | 0.512 |
| mRecall | | | 0.929 | |
| mAP | | | | 0.748 |
LOSP-yolov3, first version (original RGB not fused)
| Class | GT count | Detections | Recall | AP |
|---|---|---|---|---|
| Bicycle | 418 | 1271 | 0.888 | 0.816 |
| Boat | 515 | 1361 | 0.845 | 0.740 |
| Bottle | 433 | 1635 | 0.813 | 0.714 |
| Bus | 164 | 480 | 0.909 | 0.859 |
| Car | 919 | 2508 | 0.887 | 0.801 |
| Cat | 425 | 1334 | 0.849 | 0.685 |
| Chair | 609 | 3055 | 0.801 | 0.677 |
| Cup | 356 | 1373 | 0.846 | 0.698 |
| Dog | 490 | 1545 | 0.914 | 0.784 |
| Motorbike | 242 | 1412 | 0.781 | 0.608 |
| People | 2235 | 6582 | 0.864 | 0.774 |
| Table | 311 | 2395 | 0.765 | 0.448 |
| mRecall | | | 0.847 | |
| mAP | | | | 0.717 |
LOSP-TOODv2
No improvement.
LOSP-yolov3v2
No improvement.