[Deep Thinking] Some Thoughts and Doubts about CenterNet: Comparing the Speed and Accuracy of CenterNet and the U-version (Ultralytics) YoloV3

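0. Introduction

I am very fond of CenterNet's minimalist architecture. Using only a fully convolutional network (FCN), it performs object detection and classification without anchors, NMS, or similar machinery, so it is efficient while its accuracy remains competitive. With simple modifications, the same structure can also be applied to human pose estimation and 3D object detection.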
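
To make the "no NMS" point concrete, here is a minimal sketch of how detections can be read off a center heatmap by keeping only local maxima in a 3x3 neighbourhood. It is written in plain Python on a toy 2D list rather than on real network output; the function name `heatmap_peaks` and the `threshold` and `top_k` parameters are assumptions for illustration, not the repository's API.

```python
def heatmap_peaks(heatmap, threshold=0.3, top_k=100):
    """Return (score, row, col) for cells that are local maxima of a 2D heatmap.

    A cell is kept when its score is at least `threshold` and no cell in its
    3x3 neighbourhood has a higher score. This plays the role that NMS plays
    in anchor-based detectors.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the 3x3 neighbourhood, clipped at the heatmap border.
            neighbourhood = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            if score >= max(neighbourhood):
                peaks.append((score, r, c))
    peaks.sort(reverse=True)  # highest score first
    return peaks[:top_k]


# Toy heatmap (hypothetical values) with two object centers.
toy_heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.0],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.0, 0.1, 0.4],
]
peaks = heatmap_peaks(toy_heatmap)
```

In a real model the same idea is usually expressed as a 3x3 max-pooling over the heatmap tensor, one heatmap channel per class; the loop above only illustrates the logic.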

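Later works that apply the CenterNet structure to other tasks have also achieved good results, for example CenterFace for face detection, and CenterTrack and FairMOT for object tracking. I will cover these after studying them, and I expect to write a summary comparing CenterNet-style architectures, so interested readers may want to follow along.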

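Now for the motivation behind this post. While studying CenterNet I saw its comparison with YoloV3, in which CenterNet comes out ahead in both speed and accuracy. I am somewhat skeptical of that conclusion.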

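YoloV3 is known for being fast, with moderate accuracy, and is widely used in practical detection projects for real-time detection and recognition. It is faster than the two-stage Faster R-CNN, and it has both speed and accuracy advantages over the one-stage SSD (Single Shot MultiBox Detector) and RetinaNet.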

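So I still have doubts about CenterNet's claimed speed-up over YoloV3. YoloV3 is arguably the most widely used and most practical object detection algorithm in industry today. If the paper's conclusion holds, then CenterNet, which is also structurally simple and easy to use (setting DCN aside; full deployment support is only a matter of time), would surely take YoloV3's place.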

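I therefore decided to run a comparison on identical hardware and software. To keep the experiment simple, I only measure speed, comparing CenterNet and YoloV3 at the same input sizes; accuracy figures are taken from the published results.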

1. Experimental setup

So that readers can trust and easily reproduce my results, the hardware and environment are listed here:

  • OS: Ubuntu 18.04.4 LTS
  • CPU: Intel® Core™ i5-9400F CPU @ 2.90GHz × 6
  • GPU: GeForce RTX 2060 SUPER/PCIe/SSE2
  • CUDA: 10.1
  • PyTorch: 1.5.0

Open-source implementations used:

CenterNet: https://github.com/xingyizhou/CenterNet

YoloV3: https://github.com/ultralytics/yolov3

2. Experiments

1. U-version (Ultralytics) YoloV3

1. Longest side scaled to 320

Run: ~/yolov3$

python detect.py --weights weights/yolov3-spp-ultralytics.pt --img-size 320 --cfg cfg/yolov3-spp.cfg --source data/samples/

Result: average model inference time 12 ms

Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
image 1/10 data/samples/16004479832_a748d55f21_k.jpg: 256x320 4 persons, 2 dogs, Done. (0.011s)
image 2/10 data/samples/17790319373_bd19b24cfc_k.jpg: 192x320 9 persons, 2 cars, 3 motorcycles, 1 trucks, Done. (0.013s)
image 3/10 data/samples/18124840932_e42b3e377c_k.jpg: 256x320 3 persons, 2 boats, Done. (0.012s)
image 4/10 data/samples/19064748793_bb942deea1_k.jpg: 256x320 14 cars, Done. (0.011s)
image 5/10 data/samples/24274813513_0cfd2ce6d0_k.jpg: 256x320 11 persons, 1 cars, Done. (0.013s)
image 6/10 data/samples/33823288584_1d21cf0a26_k.jpg: 256x320 9 persons, 7 bicycles, 1 backpacks, Done. (0.011s)
image 7/10 data/samples/33887522274_eebd074106_k.jpg: 256x320 4 persons, 1 cars, 1 buss, 1 trucks, Done. (0.011s)
image 8/10 data/samples/34501842524_3c858b3080_k.jpg: 256x320 3 cars, 2 stop signs, Done. (0.011s)
image 9/10 data/samples/bus.jpg: 320x256 4 persons, 1 buss, 1 handbags, Done. (0.011s)
image 10/10 data/samples/zidane.jpg: 192x320 3 persons, 3 ties, Done. (0.010s)
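
The average quoted above can be recomputed from the log itself. The following is a small sketch that assumes the log format shown here, where each per-image line starts with `image ` and ends with `Done. (X.XXXs)`; the helper name `mean_inference_ms` is made up for this example.

```python
import re

# Matches the per-image time at the end of a line such as "... Done. (0.011s)".
DONE_TIME = re.compile(r"Done\. \((\d+\.\d+)s\)\s*$")


def mean_inference_ms(log_text):
    """Average the per-image 'Done. (X s)' times in a detect.py log, in ms."""
    times = []
    for line in log_text.splitlines():
        if not line.startswith("image "):
            continue  # skip the model summary and the final total line
        match = DONE_TIME.search(line)
        if match:
            times.append(float(match.group(1)))
    if not times:
        raise ValueError("no per-image timing lines found")
    return 1000.0 * sum(times) / len(times)


# First three image lines of the 320 run, copied from the log above.
sample_log = """\
Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
image 1/10 data/samples/16004479832_a748d55f21_k.jpg: 256x320 4 persons, 2 dogs, Done. (0.011s)
image 2/10 data/samples/17790319373_bd19b24cfc_k.jpg: 192x320 9 persons, 2 cars, 3 motorcycles, 1 trucks, Done. (0.013s)
image 3/10 data/samples/18124840932_e42b3e377c_k.jpg: 256x320 3 persons, 2 boats, Done. (0.012s)
"""
average_ms = mean_inference_ms(sample_log)
```

Feeding the full ten-line log of a run to the same function gives that run's per-image average.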

2. Longest side scaled to 512

Run: ~/yolov3$

python detect.py --weights weights/yolov3-spp-ultralytics.pt --img-size 512 --cfg cfg/yolov3-spp.cfg --source data/samples/

Result: average model inference time 20 ms

Using CUDA device0 _CudaDeviceProperties(name='GeForce RTX 2060 SUPER', total_memory=7979MB)

Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
image 1/10 data/samples/16004479832_a748d55f21_k.jpg: 384x512 4 persons, 2 dogs, Done. (0.018s)
image 2/10 data/samples/17790319373_bd19b24cfc_k.jpg: 320x512 9 persons, 4 cars, 2 motorcycles, 1 trucks, 1 benchs, 1 chairs, Done. (0.019s)
image 3/10 data/samples/18124840932_e42b3e377c_k.jpg: 384x512 3 persons, 3 boats, 1 birds, Done. (0.018s)
image 4/10 data/samples/19064748793_bb942deea1_k.jpg: 384x512 4 persons, 18 cars, 2 traffic lights, Done. (0.019s)
image 5/10 data/samples/24274813513_0cfd2ce6d0_k.jpg: 384x512 13 persons, 1 trucks, Done. (0.020s)
image 6/10 data/samples/33823288584_1d21cf0a26_k.jpg: 384x512 16 persons, 6 bicycles, 2 backpacks, 1 bottles, 1 cell phones, Done. (0.020s)
image 7/10 data/samples/33887522274_eebd074106_k.jpg: 384x512 3 persons, 1 cars, 1 buss, Done. (0.020s)
image 8/10 data/samples/34501842524_3c858b3080_k.jpg: 384x512 3 cars, 1 stop signs, Done. (0.020s)
image 9/10 data/samples/bus.jpg: 512x384 4 persons, 1 buss, 1 stop signs, 1 ties, 1 skateboards, Done. (0.018s)
image 10/10 data/samples/zidane.jpg: 320x512 3 persons, 2 ties, Done. (0.015s)

3. Longest side scaled to 800

Run: ~/yolov3$

python detect.py --weights weights/yolov3-spp-ultralytics.pt --img-size 800 --cfg cfg/yolov3-spp.cfg --source data/samples/

Result: average model inference time 40 ms

Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
image 1/10 data/samples/16004479832_a748d55f21_k.jpg: 544x800 7 persons, 2 dogs, 1 handbags, Done. (0.041s)
image 2/10 data/samples/17790319373_bd19b24cfc_k.jpg: 480x800 10 persons, 8 cars, 1 motorcycles, 2 trucks, 1 chairs, Done. (0.040s)
image 3/10 data/samples/18124840932_e42b3e377c_k.jpg: 544x800 3 persons, 4 boats, Done. (0.045s)
image 4/10 data/samples/19064748793_bb942deea1_k.jpg: 544x800 6 persons, 27 cars, 1 buss, 1 trucks, 6 traffic lights, Done. (0.045s)
image 5/10 data/samples/24274813513_0cfd2ce6d0_k.jpg: 544x800 14 persons, 2 cars, 1 trucks, 1 ties, Done. (0.047s)
image 6/10 data/samples/33823288584_1d21cf0a26_k.jpg: 544x800 21 persons, 11 bicycles, 1 backpacks, 2 handbags, 4 bottles, 1 cell phones, Done. (0.038s)
image 7/10 data/samples/33887522274_eebd074106_k.jpg: 608x800 6 persons, 1 cars, 1 buss, Done. (0.038s)
image 8/10 data/samples/34501842524_3c858b3080_k.jpg: 544x800 4 cars, 1 trucks, 2 stop signs, Done. (0.037s)
image 9/10 data/samples/bus.jpg: 800x608 4 persons, 1 bicycles, 1 buss, 1 ties, Done. (0.039s)
image 10/10 data/samples/zidane.jpg: 480x800 3 persons, 2 ties, Done. (0.029s)

4. Longest side scaled to 1024

Run: ~/yolov3$

python detect.py --weights weights/yolov3-spp-ultralytics.pt --img-size 1024 --cfg cfg/yolov3-spp.cfg --source data/samples/

Result: average model inference time 50 ms

Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
image 1/10 data/samples/16004479832_a748d55f21_k.jpg: 704x1024 5 persons, 1 dogs, 1 handbags, Done. (0.054s)
image 2/10 data/samples/17790319373_bd19b24cfc_k.jpg: 576x1024 13 persons, 5 cars, 1 motorcycles, 2 trucks, 1 umbrellas, 1 chairs, Done. (0.049s)
image 3/10 data/samples/18124840932_e42b3e377c_k.jpg: 704x1024 3 persons, 6 boats, 4 birds, Done. (0.049s)
image 4/10 data/samples/19064748793_bb942deea1_k.jpg: 704x1024 10 persons, 24 cars, 1 buss, 2 trucks, 7 traffic lights, Done. (0.051s)
image 5/10 data/samples/24274813513_0cfd2ce6d0_k.jpg: 704x1024 14 persons, 1 cars, 1 trucks, 2 handbags, 2 ties, Done. (0.048s)
image 6/10 data/samples/33823288584_1d21cf0a26_k.jpg: 704x1024 25 persons, 8 bicycles, 1 backpacks, 1 handbags, 1 kites, 2 bottles, 1 cell phones, Done. (0.051s)
image 7/10 data/samples/33887522274_eebd074106_k.jpg: 768x1024 5 persons, 1 cars, 1 buss, Done. (0.052s)
image 8/10 data/samples/34501842524_3c858b3080_k.jpg: 704x1024 4 cars, 1 trucks, 1 stop signs, Done. (0.048s)
image 9/10 data/samples/bus.jpg: 1024x768 3 persons, 1 bicycles, 1 buss, 1 cell phones, Done. (0.054s)
image 10/10 data/samples/zidane.jpg: 576x1024 1 persons, 3 ties, Done. (0.042s)
Results saved to /home/song/yolov3/output
Done. (0.943s) avg time (0.094s)

[Figure: YoloV3 output image]

2. CenterNet

1. Longest side scaled to 320

Run: ~/CenterNet/src$

python demo.py ctdet --demo ../data/samples --load_model ../models/ctdet_coco_dla_2x.pth --input_h 256 --input_w 320

Result: average model inference time 16 ms

Creating model...
loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
torch.Size([1, 3, 256, 320])
/opt/conda/conda-bld/pytorch_1587428398394/work/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
tot 0.202s |load 0.005s |pre 0.003s |net 0.191s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.059s |load 0.031s |pre 0.009s |net 0.016s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.027s |load 0.010s |pre 0.002s |net 0.012s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.052s |load 0.022s |pre 0.008s |net 0.018s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.053s |load 0.021s |pre 0.009s |net 0.019s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.057s |load 0.028s |pre 0.008s |net 0.017s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.036s |load 0.008s |pre 0.005s |net 0.019s |dec 0.001s |post 0.003s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.053s |load 0.027s |pre 0.006s |net 0.016s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.044s |load 0.017s |pre 0.005s |net 0.019s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.028s |load 0.009s |pre 0.003s |net 0.012s |dec 0.002s |post 0.002s |merge 0.000s |
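
In the CenterNet log the first timing line is much slower than the rest (net 0.191s versus roughly 0.012s to 0.019s), presumably because of one-off start-up cost, so it is reasonable to leave it out of the average. The sketch below parses the `tot ... |load ... |net ...` lines and averages each stage while skipping a configurable number of warm-up lines; the function name `mean_stage_ms` and the `skip_warmup` parameter are assumptions for illustration.

```python
import re

# One "name value" pair such as "net 0.016s" in a demo.py timing line.
STAGE = re.compile(r"(\w+) (\d+\.\d+)s")


def mean_stage_ms(log_text, skip_warmup=1):
    """Average each stage of the 'tot ... |load ... |net ...' lines, in ms.

    The first `skip_warmup` timing lines are dropped before averaging.
    """
    rows = [
        {name: float(value) for name, value in STAGE.findall(line)}
        for line in log_text.splitlines()
        if line.startswith("tot ")
    ]
    rows = rows[skip_warmup:]
    if not rows:
        raise ValueError("no timing lines left after skipping warm-up")
    return {
        name: 1000.0 * sum(row[name] for row in rows) / len(rows)
        for name in rows[0]
    }


# First three timing lines of the 320 run, copied from the log above.
sample_log = """\
torch.Size([1, 3, 256, 320])
tot 0.202s |load 0.005s |pre 0.003s |net 0.191s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.059s |load 0.031s |pre 0.009s |net 0.016s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 256, 320])
tot 0.027s |load 0.010s |pre 0.002s |net 0.012s |dec 0.001s |post 0.002s |merge 0.000s |
"""
stage_ms = mean_stage_ms(sample_log)
```

Here `net` corresponds to the model inference time compared in this section, while `tot` also includes image loading, pre-processing, decoding and post-processing.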

2. Longest side scaled to 512

Run: ~/CenterNet/src$

python demo.py ctdet --demo ../data/samples --load_model ../models/ctdet_coco_dla_2x.pth --input_h 384 --input_w 512

Result: average model inference time 20 ms

loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
torch.Size([1, 3, 384, 512])
/opt/conda/conda-bld/pytorch_1587428398394/work/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
tot 0.206s |load 0.005s |pre 0.006s |net 0.193s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 384, 512])
tot 0.041s |load 0.012s |pre 0.007s |net 0.019s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 384, 512])
tot 0.031s |load 0.005s |pre 0.005s |net 0.019s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 384, 512])
tot 0.060s |load 0.018s |pre 0.016s |net 0.022s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 384, 512])
tot 0.046s |load 0.009s |pre 0.010s |net 0.023s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 384, 512])
tot 0.055s |load 0.018s |pre 0.014s |net 0.018s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 384, 512])
tot 0.059s |load 0.021s |pre 0.015s |net 0.019s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 384, 512])
tot 0.037s |load 0.008s |pre 0.007s |net 0.018s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 384, 512])
tot 0.070s |load 0.038s |pre 0.009s |net 0.020s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 384, 512])
tot 0.062s |load 0.027s |pre 0.014s |net 0.019s |dec 0.001s |post 0.002s |merge 0.000s |

3. Longest side scaled to 800

Run: ~/CenterNet/src$

python demo.py ctdet --demo ../data/samples --load_model ../models/ctdet_coco_dla_2x.pth --input_h 544 --input_w 800

Result: average model inference time 41 ms

Creating model...
loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
torch.Size([1, 3, 544, 800])
/opt/conda/conda-bld/pytorch_1587428398394/work/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
tot 0.234s |load 0.005s |pre 0.015s |net 0.211s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 544, 800])
tot 0.105s |load 0.031s |pre 0.027s |net 0.044s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 544, 800])
tot 0.092s |load 0.023s |pre 0.023s |net 0.042s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 544, 800])
tot 0.073s |load 0.009s |pre 0.021s |net 0.040s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 544, 800])
tot 0.091s |load 0.021s |pre 0.026s |net 0.040s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 544, 800])
tot 0.085s |load 0.019s |pre 0.022s |net 0.042s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 544, 800])
tot 0.091s |load 0.021s |pre 0.026s |net 0.040s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 544, 800])
tot 0.085s |load 0.017s |pre 0.022s |net 0.042s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 544, 800])
tot 0.098s |load 0.033s |pre 0.020s |net 0.040s |dec 0.003s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 544, 800])
tot 0.074s |load 0.011s |pre 0.020s |net 0.040s |dec 0.001s |post 0.002s |merge 0.000s |

4. Longest side scaled to 1024

Run: ~/CenterNet/src$

python demo.py ctdet --demo ../data/samples --load_model ../models/ctdet_coco_dla_2x.pth --input_h 704 --input_w 1024

Result: average model inference time 53 ms

loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
torch.Size([1, 3, 704, 1024])
/opt/conda/conda-bld/pytorch_1587428398394/work/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
tot 0.260s |load 0.005s |pre 0.025s |net 0.227s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 704, 1024])
tot 0.100s |load 0.007s |pre 0.027s |net 0.063s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 704, 1024])
tot 0.112s |load 0.014s |pre 0.034s |net 0.060s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 704, 1024])
tot 0.117s |load 0.022s |pre 0.029s |net 0.062s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 704, 1024])
tot 0.119s |load 0.021s |pre 0.033s |net 0.061s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 704, 1024])
tot 0.090s |load 0.007s |pre 0.018s |net 0.061s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 704, 1024])
tot 0.108s |load 0.020s |pre 0.033s |net 0.051s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 704, 1024])
tot 0.078s |load 0.007s |pre 0.018s |net 0.051s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 704, 1024])
tot 0.116s |load 0.036s |pre 0.025s |net 0.050s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 704, 1024])
tot 0.082s |load 0.006s |pre 0.020s |net 0.051s |dec 0.003s |post 0.002s |merge 0.000s | 

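3. Supplementary experiment

After studying the two codebases further, I found a problem with the experiment: it compares only model inference time. That shows how efficient each model is, but in real applications pre- and post-processing also take time. I therefore added a comparison of overall time at sizes 640 and 1280 to show the speed difference in practical use.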
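
The distinction between model-only time and overall time can be sketched as a tiny timing harness that measures each pipeline stage separately and also reports their sum. Everything in it is hypothetical: `time_stages` and the three stand-in stage functions are not taken from either repository. For GPU inference the work is typically asynchronous, so a real measurement would also need to wait for the device (in PyTorch, for example, by calling `torch.cuda.synchronize()`) before reading the clock; that step is omitted here.

```python
import time


def time_stages(stages, inputs, warmup=1):
    """Time each named stage per input; return mean seconds per stage.

    `stages` is an ordered list of (name, function) pairs; each function
    receives the previous stage's output. The first `warmup` inputs are run
    but not counted.
    """
    totals = {name: 0.0 for name, _ in stages}
    counted = 0
    for index, item in enumerate(inputs):
        elapsed = {}
        for name, stage in stages:
            start = time.perf_counter()
            item = stage(item)
            elapsed[name] = time.perf_counter() - start
        if index < warmup:
            continue  # warm-up iteration: executed, not measured
        counted += 1
        for name, seconds in elapsed.items():
            totals[name] += seconds
    if counted == 0:
        raise ValueError("need more inputs than warm-up iterations")
    means = {name: total / counted for name, total in totals.items()}
    means["total"] = sum(means.values())  # overall time = sum of stage means
    return means


# Hypothetical stand-ins for real pre-processing, network and post-processing.
def preprocess(x):
    return [v / 255.0 for v in x]


def network(x):
    return [v * 2.0 for v in x]


def postprocess(x):
    return max(x)


stages = [("pre", preprocess), ("net", network), ("post", postprocess)]
report = time_stages(stages, [[1, 2, 3]] * 5)
```

Comparing `report["net"]` across models corresponds to the model-only comparison of the previous section, and `report["total"]` to the overall comparison added here.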

1. U-version (Ultralytics) YoloV3

1. Longest side scaled to 640

Run: ~/yolov3$

python detect.py --weights weights/yolov3-spp-ultralytics.pt --img-size 640 --cfg cfg/yolov3-spp.cfg --source data/samples/

Result: average model inference time 26 ms, average overall time 64 ms

Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
image 1/10 data/samples/16004479832_a748d55f21_k.jpg: 448x640 5 persons, 2 dogs, Done. (0.028s)
image 2/10 data/samples/17790319373_bd19b24cfc_k.jpg: 384x640 12 persons, 6 cars, 3 motorcycles, 1 trucks, 1 chairs, Done. (0.026s)
image 3/10 data/samples/18124840932_e42b3e377c_k.jpg: 448x640 3 persons, 4 boats, Done. (0.025s)
image 4/10 data/samples/19064748793_bb942deea1_k.jpg: 448x640 8 persons, 22 cars, 1 buss, 1 trucks, 3 traffic lights, 1 clocks, Done. (0.028s)
image 5/10 data/samples/24274813513_0cfd2ce6d0_k.jpg: 448x640 14 persons, 1 cars, 1 trucks, Done. (0.026s)
image 6/10 data/samples/33823288584_1d21cf0a26_k.jpg: 448x640 19 persons, 6 bicycles, 1 backpacks, 2 bottles, 1 cell phones, Done. (0.028s)
image 7/10 data/samples/33887522274_eebd074106_k.jpg: 512x640 5 persons, 1 cars, 1 buss, Done. (0.025s)
image 8/10 data/samples/34501842524_3c858b3080_k.jpg: 448x640 5 cars, 1 trucks, 2 stop signs, Done. (0.024s)
image 9/10 data/samples/bus.jpg: 640x512 4 persons, 1 buss, Done. (0.025s)
image 10/10 data/samples/zidane.jpg: 384x640 3 persons, 2 ties, Done. (0.021s)
Results saved to /home/song/yolov3/output
Done. (0.638s) avg time (0.064s)

2. Longest side scaled to 1280

Run: ~/yolov3$

python detect.py --weights weights/yolov3-spp-ultralytics.pt --img-size 1280 --cfg cfg/yolov3-spp.cfg --source data/samples/

Result: average model inference time 78 ms, average overall time 124 ms

Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
image 1/10 data/samples/16004479832_a748d55f21_k.jpg: 896x1280 4 persons, 3 dogs, 1 backpacks, Done. (0.085s)
image 2/10 data/samples/17790319373_bd19b24cfc_k.jpg: 704x1280 16 persons, 10 cars, 1 motorcycles, 1 buss, 1 trucks, 1 umbrellas, 1 chairs, Done. (0.061s)
image 3/10 data/samples/18124840932_e42b3e377c_k.jpg: 896x1280 5 persons, 3 boats, 3 birds, Done. (0.075s)
image 4/10 data/samples/19064748793_bb942deea1_k.jpg: 896x1280 11 persons, 27 cars, 1 buss, 5 trucks, 7 traffic lights, 1 remotes, Done. (0.075s)
image 5/10 data/samples/24274813513_0cfd2ce6d0_k.jpg: 896x1280 14 persons, 2 cars, 1 trucks, 5 handbags, 7 ties, Done. (0.074s)
image 6/10 data/samples/33823288584_1d21cf0a26_k.jpg: 832x1280 27 persons, 10 bicycles, 2 backpacks, 3 handbags, 1 kites, 2 bottles, 1 cell phones, Done. (0.070s)
image 7/10 data/samples/33887522274_eebd074106_k.jpg: 960x1280 5 persons, 1 cars, 1 buss, Done. (0.079s)
image 8/10 data/samples/34501842524_3c858b3080_k.jpg: 896x1280 5 cars, 1 trucks, 1 stop signs, 1 benchs, Done. (0.075s)
image 9/10 data/samples/bus.jpg: 1280x960 4 persons, 1 bicycles, 1 ties, 1 cups, Done. (0.078s)
image 10/10 data/samples/zidane.jpg: 768x1280 2 ties, Done. (0.062s)
Results saved to /home/song/yolov3/output
Done. (1.236s) avg time (0.124s)

2. CenterNet

1. Longest side scaled to 640

Run: ~/CenterNet/src$

python demo.py ctdet --demo ../data/samples --load_model ../models/ctdet_coco_dla_2x.pth --input_h 448 --input_w 640

Result: average model inference time 27 ms, average overall time 50 ms

Creating model...
loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
torch.Size([1, 3, 448, 640])
/opt/conda/conda-bld/pytorch_1587428398394/work/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
tot 0.223s |load 0.005s |pre 0.010s |net 0.206s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 448, 640])
tot 0.079s |load 0.026s |pre 0.021s |net 0.029s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 448, 640])
tot 0.074s |load 0.023s |pre 0.018s |net 0.029s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 448, 640])
tot 0.044s |load 0.005s |pre 0.007s |net 0.029s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 448, 640])
tot 0.043s |load 0.005s |pre 0.006s |net 0.029s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 448, 640])
tot 0.044s |load 0.006s |pre 0.006s |net 0.029s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 448, 640])
tot 0.042s |load 0.004s |pre 0.007s |net 0.028s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 448, 640])
tot 0.047s |load 0.006s |pre 0.007s |net 0.032s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 448, 640])
tot 0.043s |load 0.009s |pre 0.008s |net 0.024s |dec 0.001s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 448, 640])
tot 0.040s |load 0.006s |pre 0.007s |net 0.024s |dec 0.001s |post 0.002s |merge 0.000s |

2. Longest side scaled to 1280

Run: ~/CenterNet/src$

python demo.py ctdet --demo ../data/samples --load_model ../models/ctdet_coco_dla_2x.pth --input_h 896 --input_w 1280

Result: average model inference time 78 ms, average overall time 115 ms

loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
torch.Size([1, 3, 896, 1280])
/opt/conda/conda-bld/pytorch_1587428398394/work/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
tot 0.302s |load 0.005s |pre 0.039s |net 0.254s |dec 0.003s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 896, 1280])
tot 0.137s |load 0.007s |pre 0.041s |net 0.085s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 896, 1280])
tot 0.114s |load 0.005s |pre 0.027s |net 0.077s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 896, 1280])
tot 0.160s |load 0.023s |pre 0.057s |net 0.076s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 896, 1280])
tot 0.114s |load 0.004s |pre 0.025s |net 0.080s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 896, 1280])
tot 0.113s |load 0.006s |pre 0.026s |net 0.076s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 896, 1280])
tot 0.111s |load 0.004s |pre 0.025s |net 0.077s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 896, 1280])
tot 0.126s |load 0.014s |pre 0.029s |net 0.079s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 896, 1280])
tot 0.125s |load 0.010s |pre 0.031s |net 0.080s |dec 0.002s |post 0.002s |merge 0.000s |
torch.Size([1, 3, 896, 1280])
tot 0.114s |load 0.006s |pre 0.026s |net 0.078s |dec 0.002s |post 0.002s |merge 0.000s |

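4. Summary of results

CenterNet vs YoloV3 speed: model inference time

| Model \ Size | 320 | 512 | 800 | 1024 |
|---|---|---|---|---|
| YoloV3-spp-ultralytics | 12 ms | 20 ms | 40 ms | 50 ms |
| CenterNet-DLA-34 | 16 ms | 20 ms | 41 ms | 53 ms |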

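CenterNet vs YoloV3 speed: model inference time / overall time

| Model \ Size | 640 model | 640 overall | 1280 model | 1280 overall |
|---|---|---|---|---|
| YoloV3-spp-ultralytics | 26 ms | 64 ms | 77 ms | 124 ms |
| CenterNet-DLA-34 | 27 ms | 50 ms | 78 ms | 115 ms |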

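CenterNet vs YoloV3 resources: model size / memory usage

| Model \ Resource | Model file size | Memory usage at size 1280 |
|---|---|---|
| YoloV3-spp-ultralytics | 252.3 MB (252,297,867 bytes) | 1.7 GB |
| CenterNet-DLA-34 | 80.9 MB (80,911,783 bytes) | 1.2 GB |

Source for the speed and resource figures: my own measurements. Readers are welcome to reproduce and challenge them.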

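CenterNet vs YoloV3 COCO accuracy

| Model \ Size | 512 |
|---|---|
| YoloV3 | 32.7 mAP |
| YoloV3-spp | 35.6 mAP |
| YoloV3-spp-ultralytics | 42.6 mAP |
| CenterNet-DLA-34 | 37.4 mAP |

Sources for the accuracy figures: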

1.https://github.com/ultralytics/yolov3

2.https://github.com/xingyizhou/CenterNet

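The conclusions are as follows.

As for my doubt, stated above as "I still have doubts about CenterNet's claimed speed-up over YoloV3", the results partly bear it out.

Looking at model inference time alone, CenterNet-DLA-34 is not faster than the YoloV3-spp version at any tested size: it ties at 512 and is about 1 to 4 ms slower elsewhere (roughly 2% to 6% slower at 800 and 1024, and about a third slower at 320), which does not quite match the paper. Once pre- and post-processing are included, however, CenterNet-DLA-34 is clearly faster overall at both tested sizes: 50 ms vs 64 ms at 640 (about 22% less) and 115 ms vs 124 ms at 1280 (about 7% less).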
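
The relative differences can be recomputed directly from the summary tables. The snippet below is a small arithmetic sketch; the helper `relative_change` is a name chosen for this example, and the millisecond values are copied from the two speed tables.

```python
def relative_change(baseline_ms, candidate_ms):
    """Signed change of candidate vs baseline, as a fraction of the baseline."""
    return (candidate_ms - baseline_ms) / baseline_ms


# (YoloV3-spp-ultralytics, CenterNet-DLA-34) in ms, copied from the tables.
model_only = {320: (12, 16), 512: (20, 20), 800: (40, 41), 1024: (50, 53)}
end_to_end = {640: (64, 50), 1280: (124, 115)}

model_change = {size: relative_change(*pair) for size, pair in model_only.items()}
overall_change = {size: relative_change(*pair) for size, pair in end_to_end.items()}

# Positive values mean CenterNet is slower, negative values mean it is faster.
for size, change in {**model_change, **overall_change}.items():
    print(f"{size:>5}: {change:+.1%}")
```

The sign convention makes the two findings easy to tell apart: the model-only entries are zero or positive, while the overall entries are negative.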

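In model size and memory usage, CenterNet-DLA-34 is clearly better than the YoloV3-spp version: the model file is about one third the size (roughly 32%), and GPU memory usage during inference at size 1280 drops to about 70%. I suspect this is an advantage of the anchor-free approach.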
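
As a sanity check on the resource table, the ratios can be recomputed, and the YoloV3 file size can be compared with the parameter count printed in the logs. This sketch assumes that weights are stored as 32-bit floats and that the file holds little else; under the same assumption it derives a rough parameter count for CenterNet-DLA-34, which is an estimate and not a figure reported by either project.

```python
BYTES_PER_FLOAT32 = 4

yolo_params = 6.29987e7            # from the "Model Summary" line in the logs
yolo_file_bytes = 252_297_867      # file size from the resource table
centernet_file_bytes = 80_911_783  # file size from the resource table

# Expected size of the YoloV3 weights if every parameter is a float32.
yolo_weight_bytes = yolo_params * BYTES_PER_FLOAT32

size_ratio = centernet_file_bytes / yolo_file_bytes
memory_ratio = 1.2 / 1.7  # GPU memory at size 1280, in GB, from the table

# Rough CenterNet parameter count under the same float32 assumption.
centernet_params_estimate = centernet_file_bytes / BYTES_PER_FLOAT32

print(f"size ratio:   {size_ratio:.1%}")
print(f"memory ratio: {memory_ratio:.1%}")
print(f"YoloV3 weights account for {yolo_weight_bytes / yolo_file_bytes:.1%} of its file")
print(f"estimated CenterNet parameters: {centernet_params_estimate / 1e6:.1f} M")
```

That the float32 weights account for nearly the whole YoloV3 file suggests file size is a fair proxy for parameter count in this comparison.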

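The table "CenterNet vs YoloV3 COCO accuracy" shows that, at the same input size, CenterNet improves clearly on the original YoloV3 (about 5 points of mAP) and gains about 2 points over YoloV3-spp, but is still about 5 points behind YoloV3-spp-ultralytics (the U-version YoloV3-spp). This of course assumes the published figures are accurate and reliable; I am inclined to believe them but cannot vouch for them. To sum up: CenterNet improves clearly on the original YoloV3, only slightly on the improved YoloV3-spp, and falls short of the U-version YoloV3-spp.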

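In summary, CenterNet is pioneering work: it unifies the keypoint-estimation and object-detection pipelines, and it is structurally simple and convenient to use. I am very fond of this network. Applied to real scenarios, its overall speed and accuracy improve on YoloV3 and even on YoloV3-spp; the one drawback is that deployment is somewhat harder (mainly because inference frameworks currently support DCN poorly, although there are workarounds).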

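With its simple structure, ease of use, speed, accuracy and low memory footprint, CenterNet can replace YoloV3 and holds certain advantages. YoloV4 has since been released, but in my view its accuracy gains come with greater overall complexity and somewhat longer model runtime, so having YoloV4 fully replace YoloV3 is not realistic. (If you are interested in a YoloV4 vs YoloV3 comparison, say so in the comments; if enough readers are interested, I can write a follow-up.)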

Finally, the conclusion of this post: CenterNet has clear advantages over YoloV3, and I recommend trying it as a replacement.
