LOSP (Learnable Orthogonal Subspace Projection)
2026-02-04

Core Idea#

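  1. Specular highlights are an additive term: I = R⊙L + S, so traditional log-based methods cannot separate them.
  2. Assume the highlight vector S is low-rank in RGB space (its direction is relatively fixed).
  3. Learn a projection basis W such that W·S ≈ 0, physically removing the highlight.
  4. The projected features can still be fed into YOLA's log-ratio logic.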
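
The four steps above can be illustrated with a small numeric sketch. It is not the LOSP implementation: here the highlight direction is assumed to be known and white, and W is built by Gram-Schmidt instead of being learned. The pixel values and the helper names (`complement_basis`, `project`) are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def complement_basis(s_dir):
    """Two orthonormal rows spanning the plane orthogonal to s_dir (Gram-Schmidt)."""
    basis = [normalize(s_dir)]
    for e in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
        v = list(e)
        for b in basis:
            c = dot(v, b)
            v = [x - c * y for x, y in zip(v, b)]
        if math.sqrt(dot(v, v)) > 1e-8:
            basis.append(normalize(v))
        if len(basis) == 3:
            break
    return basis[1:]  # drop the highlight direction itself

def project(W, pixel):
    return [dot(w, pixel) for w in W]

# Hypothetical values: reflectance R, illumination L, white highlight S.
R = [0.6, 0.3, 0.2]
L = [0.5, 0.5, 0.5]
S = [0.4, 0.4, 0.4]
diffuse = [r * l for r, l in zip(R, L)]         # R ⊙ L
observed = [d + s for d, s in zip(diffuse, S)]  # I = R ⊙ L + S

W = complement_basis(S)  # assumed known here; LOSP learns W instead

# The log-ratio of two channels is shifted by the additive highlight...
log_ratio_clean = math.log(diffuse[0] / diffuse[1])
log_ratio_observed = math.log(observed[0] / observed[1])

# ...while W·S is numerically zero, so W·I equals W·(R ⊙ L).
print(project(W, S))
print(project(W, observed), project(W, diffuse))
print(log_ratio_clean, log_ratio_observed)
```

The projection drops one dimension (3 channels to 2 coordinates), which is the price of removing the highlight direction.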

Module Components#

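  • LOSPProjection: learnable orthogonal projection layer (the core contribution)
  • DualInvariantExtractor: dual-stream feature extraction (spectral + spatial invariants)
  • AdaptiveFusion: adaptive fusion
  • SoftTailLoss: differentiable long-tail suppression loss
  • OrthogonalityLoss: orthogonality and diversity constraint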
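
The post does not give the formula behind OrthogonalityLoss. One common way to express such a constraint, shown here only as an assumed sketch, is the squared Frobenius norm of W·Wᵀ − I, which is zero exactly when the rows of W are orthonormal. The function name `orthogonality_penalty` is hypothetical.

```python
def gram(W):
    """Gram matrix W·Wᵀ of the rows of W."""
    return [[sum(a * b for a, b in zip(wi, wj)) for wj in W] for wi in W]

def orthogonality_penalty(W):
    """Squared Frobenius norm of (W·Wᵀ - I); zero iff the rows are orthonormal."""
    G = gram(W)
    return sum(
        (G[i][j] - (1.0 if i == j else 0.0)) ** 2
        for i in range(len(W))
        for j in range(len(W))
    )

orthonormal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
collapsed = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]  # both rows point the same way

print(orthogonality_penalty(orthonormal))  # 0.0
print(orthogonality_penalty(collapsed))    # 2.0
```

A penalty of this shape discourages the basis vectors from collapsing onto one direction, which matches the "diversity" wording in the list above.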

Mathematical Basis#

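  • Projection: proj_W(I) = W·I, where W is an orthonormal basis
  • Spectral invariant: log(C_i) - log(C_j) cancels multiplicative illumination
  • Spatial invariant: ∇(log(C)) preserves single-channel texture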
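
The two invariants above can be checked numerically. The sketch below assumes the illumination acts as a single scale factor k shared by all channels and all pixels, and approximates ∇ with a forward difference along a 1-D row. The sample values and function names are hypothetical.

```python
import math

def spectral_invariant(pixel, i, j):
    """log(C_i) - log(C_j) for one pixel."""
    return math.log(pixel[i]) - math.log(pixel[j])

def spatial_invariant(row):
    """Forward difference of log(C) along a 1-D row of one channel."""
    logs = [math.log(v) for v in row]
    return [b - a for a, b in zip(logs, logs[1:])]

k = 0.25  # hypothetical illumination scale (a darker scene)
pixel = [0.6, 0.3, 0.2]
dark_pixel = [k * c for c in pixel]
row = [0.2, 0.4, 0.8, 0.4]
dark_row = [k * c for c in row]

# Both quantities should agree between the bright and the dark version.
print(spectral_invariant(pixel, 0, 1), spectral_invariant(dark_pixel, 0, 1))
print(spatial_invariant(row), spatial_invariant(dark_row))
```

In both cases log(k) appears once with each sign and cancels, which is why neither quantity depends on k.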

Metrics#

Baseline of the baselines#

yolov3#

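| Class | GT count | Detections | Recall | AP |
| --- | --- | --- | --- | --- |
| Bicycle | 418 | 1311 | 0.866 | 0.801 |
| Boat | 515 | 1221 | 0.817 | 0.722 |
| Bottle | 433 | 1382 | 0.813 | 0.724 |
| Bus | 164 | 511 | 0.909 | 0.848 |
| Car | 919 | 2429 | 0.862 | 0.783 |
| Cat | 425 | 1348 | 0.831 | 0.669 |
| Chair | 609 | 2970 | 0.772 | 0.651 |
| Cup | 356 | 1084 | 0.831 | 0.711 |
| Dog | 490 | 1296 | 0.898 | 0.786 |
| Motorbike | 242 | 1435 | 0.826 | 0.641 |
| People | 2235 | 6152 | 0.858 | 0.768 |
| Table | 311 | 2079 | 0.746 | 0.494 |
| mRecall | | | 0.836 | |
| mAP | | | | 0.717 |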
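
The mRecall and mAP rows appear to be unweighted means over the 12 classes rather than means weighted by GT count. A quick check on the yolov3 numbers, with the per-class values copied from the table above:

```python
# (recall, AP) per class, copied from the yolov3 baseline table.
yolov3 = {
    "Bicycle": (0.866, 0.801),
    "Boat": (0.817, 0.722),
    "Bottle": (0.813, 0.724),
    "Bus": (0.909, 0.848),
    "Car": (0.862, 0.783),
    "Cat": (0.831, 0.669),
    "Chair": (0.772, 0.651),
    "Cup": (0.831, 0.711),
    "Dog": (0.898, 0.786),
    "Motorbike": (0.826, 0.641),
    "People": (0.858, 0.768),
    "Table": (0.746, 0.494),
}

# Unweighted (macro) average: every class counts equally.
m_recall = sum(r for r, _ in yolov3.values()) / len(yolov3)
m_ap = sum(ap for _, ap in yolov3.values()) / len(yolov3)

print(f"mRecall={m_recall:.4f}  mAP={m_ap:.4f}")
```

Both means land within rounding distance of the reported 0.836 and 0.717, so small classes such as Bus weigh as much as People in the summary rows.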

yolov3 with IIM only#

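| Class | GT count | Detections | Recall | AP |
| --- | --- | --- | --- | --- |
| Bicycle | 418 | 5830 | 0.713 | 0.491 |
| Boat | 515 | 11495 | 0.713 | 0.338 |
| Bottle | 433 | 9673 | 0.637 | 0.393 |
| Bus | 164 | 3939 | 0.811 | 0.586 |
| Car | 919 | 22777 | 0.691 | 0.394 |
| Cat | 425 | 10333 | 0.706 | 0.165 |
| Chair | 609 | 18323 | 0.696 | 0.368 |
| Cup | 356 | 14124 | 0.643 | 0.315 |
| Dog | 490 | 10147 | 0.810 | 0.349 |
| Motorbike | 242 | 13163 | 0.624 | 0.175 |
| People | 2235 | 33656 | 0.723 | 0.419 |
| Table | 311 | 12463 | 0.640 | 0.180 |
| mRecall | | | 0.701 | |
| mAP | | | | 0.348 |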

TOOD#

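| Class | GT count | Detections | Recall | AP |
| --- | --- | --- | --- | --- |
| Bicycle | 418 | 3331 | 0.911 | 0.802 |
| Boat | 515 | 6075 | 0.926 | 0.765 |
| Bottle | 433 | 5791 | 0.868 | 0.720 |
| Bus | 164 | 2208 | 0.970 | 0.892 |
| Car | 919 | 9996 | 0.930 | 0.780 |
| Cat | 425 | 3840 | 0.894 | 0.725 |
| Chair | 609 | 11617 | 0.890 | 0.672 |
| Cup | 356 | 5500 | 0.904 | 0.713 |
| Dog | 490 | 4477 | 0.971 | 0.861 |
| Motorbike | 242 | 5284 | 0.917 | 0.624 |
| People | 2235 | 22387 | 0.906 | 0.762 |
| Table | 311 | 10395 | 0.897 | 0.501 |
| mRecall | | | 0.915 | |
| mAP | | | | 0.735 |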

YOLA-yolov3#

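| Class | GT count | Detections | Recall | AP |
| --- | --- | --- | --- | --- |
| Bicycle | 418 | 1159 | 0.895 | 0.832 |
| Boat | 515 | 1275 | 0.860 | 0.770 |
| Bottle | 433 | 1404 | 0.818 | 0.726 |
| Bus | 164 | 446 | 0.939 | 0.881 |
| Car | 919 | 2327 | 0.890 | 0.816 |
| Cat | 425 | 1407 | 0.845 | 0.685 |
| Chair | 609 | 2648 | 0.788 | 0.668 |
| Cup | 356 | 1201 | 0.820 | 0.699 |
| Dog | 490 | 1345 | 0.906 | 0.801 |
| Motorbike | 242 | 1182 | 0.798 | 0.638 |
| People | 2235 | 6357 | 0.868 | 0.778 |
| Table | 311 | 2123 | 0.807 | 0.494 |
| mRecall | | | 0.853 | |
| mAP | | | | 0.732 |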

YOLA-TOOD#

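| Class | GT count | Detections | Recall | AP |
| --- | --- | --- | --- | --- |
| Bicycle | 418 | 3584 | 0.933 | 0.833 |
| Boat | 515 | 5438 | 0.940 | 0.785 |
| Bottle | 433 | 5793 | 0.891 | 0.742 |
| Bus | 164 | 2042 | 0.957 | 0.888 |
| Car | 919 | 9011 | 0.933 | 0.794 |
| Cat | 425 | 3733 | 0.908 | 0.749 |
| Chair | 609 | 10100 | 0.893 | 0.706 |
| Cup | 356 | 5269 | 0.899 | 0.722 |
| Dog | 490 | 4176 | 0.963 | 0.862 |
| Motorbike | 242 | 5374 | 0.905 | 0.633 |
| People | 2235 | 20178 | 0.905 | 0.778 |
| Table | 311 | 8944 | 0.884 | 0.495 |
| mRecall | | | 0.918 | |
| mAP | | | | 0.749 |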

LOSP-TOOD, initial version (raw RGB not fused)#

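| Class | GT count | Detections | Recall | AP |
| --- | --- | --- | --- | --- |
| Bicycle | 418 | 4115 | 0.933 | 0.819 |
| Boat | 515 | 7038 | 0.951 | 0.789 |
| Bottle | 433 | 6269 | 0.882 | 0.741 |
| Bus | 164 | 2061 | 0.976 | 0.896 |
| Car | 919 | 9976 | 0.939 | 0.800 |
| Cat | 425 | 5306 | 0.911 | 0.724 |
| Chair | 609 | 12550 | 0.910 | 0.690 |
| Cup | 356 | 6336 | 0.919 | 0.709 |
| Dog | 490 | 4977 | 0.969 | 0.858 |
| Motorbike | 242 | 5505 | 0.930 | 0.651 |
| People | 2235 | 24726 | 0.923 | 0.782 |
| Table | 311 | 9962 | 0.900 | 0.512 |
| mRecall | | | 0.929 | |
| mAP | | | | 0.748 |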

LOSP-yolov3, initial version (raw RGB not fused)#

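| Class | GT count | Detections | Recall | AP |
| --- | --- | --- | --- | --- |
| Bicycle | 418 | 1271 | 0.888 | 0.816 |
| Boat | 515 | 1361 | 0.845 | 0.740 |
| Bottle | 433 | 1635 | 0.813 | 0.714 |
| Bus | 164 | 480 | 0.909 | 0.859 |
| Car | 919 | 2508 | 0.887 | 0.801 |
| Cat | 425 | 1334 | 0.849 | 0.685 |
| Chair | 609 | 3055 | 0.801 | 0.677 |
| Cup | 356 | 1373 | 0.846 | 0.698 |
| Dog | 490 | 1545 | 0.914 | 0.784 |
| Motorbike | 242 | 1412 | 0.781 | 0.608 |
| People | 2235 | 6582 | 0.864 | 0.774 |
| Table | 311 | 2395 | 0.765 | 0.448 |
| mRecall | | | 0.847 | |
| mAP | | | | 0.717 |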
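
To read the summary rows side by side, the sketch below subtracts each detector's plain baseline from the YOLA and LOSP variants. The numbers are the mRecall and mAP rows copied from the tables in this post; the dictionary layout and labels are only for illustration.

```python
# (mRecall, mAP) summary rows copied from the tables in this post.
summary = {
    "yolov3": (0.836, 0.717),
    "yolov3 + IIM only": (0.701, 0.348),
    "YOLA-yolov3": (0.853, 0.732),
    "LOSP-yolov3 (initial)": (0.847, 0.717),
    "TOOD": (0.915, 0.735),
    "YOLA-TOOD": (0.918, 0.749),
    "LOSP-TOOD (initial)": (0.929, 0.748),
}

# Which plain detector each variant should be compared against.
baseline_of = {
    "yolov3 + IIM only": "yolov3",
    "YOLA-yolov3": "yolov3",
    "LOSP-yolov3 (initial)": "yolov3",
    "YOLA-TOOD": "TOOD",
    "LOSP-TOOD (initial)": "TOOD",
}

deltas = {}
for name, base in baseline_of.items():
    d_recall = summary[name][0] - summary[base][0]
    d_ap = summary[name][1] - summary[base][1]
    deltas[name] = (d_recall, d_ap)
    print(f"{name:24s} vs {base:7s} dRecall={d_recall:+.3f} dAP={d_ap:+.3f}")
```

On these numbers the initial LOSP-TOOD gains 0.013 mAP and 0.014 mRecall over TOOD and sits about level with YOLA-TOOD on mAP (0.748 vs 0.749), while the initial LOSP-yolov3 matches the yolov3 baseline mAP (0.717) and trails YOLA-yolov3 (0.732).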

LOSP-TOODv2#

No improvement.

LOSP-yolov3v2#

No improvement.

Author: Zenfish
Published at: 2026-02-04
License: CC BY-NC-SA 4.0
