A Practical Guide to the xdputil Tool

Submitted by judy

<font color="#FF8000">Author: Grace Sun, AMD engineer; Source: AMD Developer Community</font>

The Vitis AI Library includes the xdputil tool, a debugging aid for board-level development. Its source code is located at:

https://github.com/Xilinx/Vitis-AI/tree/master/src/vai_library/usefulto…

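xdputil is preinstalled in both the prebuilt official board images and the Vitis AI docker image. For a custom target board, follow the installation steps in the Vitis AI Library User Guide for your release, for example: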

https://docs.amd.com/r/en-US/ug1354-xilinx-ai-sdk/Step-3-Installing-the…

To run xdputil inside the docker environment, invoke /usr/bin/python3 -m xdputil. The usage summary printed by xdputil -h is shown below:
<center><img src="https://fpga.eetrend.com/files/2024-09/wen_zhang_/100584824-361982-1.pn…; alt=""></center>

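Most subcommands depend on DPU and device information, so they run only on the target board. Inside docker, the tool is mainly used to parse and inspect xmodel files.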

For the xdputil xmodel subcommand, pass -h to see its detailed usage.
<center><img src="https://fpga.eetrend.com/files/2024-09/wen_zhang_/100584824-361983-2.pn…; alt=""></center>

The examples below show concrete commands together with their output.

<strong>List the subgraphs of an xmodel, including the input/output tensors and the kernel.</strong>
<pre>
xdputil xmodel -l yolov6m_pt.xmodel

{
    "subgraphs":[
        {
            "index":0,
            "name":"subgraph_ModelNNDct__ModelNNDct_QuantStub_quant__input_1",
            "device":"USER"
        },
        {
            "index":1,
            "name":"subgraph_ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3",
            "device":"DPU",
            "fingerprint":"0x603000b56011861",
            "DPU Arch":"DPUCVDX8G_ISA3_C32B6",
            "workload":81415208000,
            "input_tensor":[
                {
                    "index":0,
                    "name":"ModelNNDct__ModelNNDct_QuantStub_quant__input_1_fix",
                    "shape":[
                        1,
                        640,
                        640,
                        3
                    ],
                    "fixpos":6
                }
            ],
            "output_tensor":[
                {
                    "index":0,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_1__inputs_7_fix",
                    "shape":[
                        1,
                        40,
                        40,
                        4
                    ],
                    "fixpos":3
                },
                {
                    "index":1,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_2__24287_fix",
                    "shape":[
                        1,
                        20,
                        20,
                        80
                    ],
                    "fixpos":4
                },
                {
                    "index":2,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_2__inputs_11_fix",
                    "shape":[
                        1,
                        20,
                        20,
                        4
                    ],
                    "fixpos":3
                },
                {
                    "index":3,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_0__24021_fix",
                    "shape":[
                        1,
                        80,
                        80,
                        80
                    ],
                    "fixpos":4
                },
                {
                    "index":4,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_0__inputs_3_fix",
                    "shape":[
                        1,
                        80,
                        80,
                        4
                    ],
                    "fixpos":4
                },
                {
                    "index":5,
                    "name":"ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_1__24154_fix",
                    "shape":[
                        1,
                        40,
                        40,
                        80
                    ],
                    "fixpos":4
                }
            ],
            "reg info":[
                {
                    "name":"REG_0",
                    "context type":"CONST",
                    "size":34244864
                },
                {
                    "name":"REG_1",
                    "context type":"WORKSPACE",
                    "size":37017600
                },
                {
                    "name":"REG_2",
                    "context type":"DATA_LOCAL_INPUT",
                    "size":1230784
                },
                {
                    "name":"REG_3",
                    "context type":"DATA_LOCAL_OUTPUT",
                    "size":705600
                }
            ],
            "instruction reg":213312
        },
        {
            "index":2,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_2__inputs_11_fix_",
            "device":"CPU"
        },
        {
            "index":3,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_1__inputs_7_fix_",
            "device":"CPU"
        },
        {
            "index":4,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_reg_preds__ModuleList_0__inputs_3_fix_",
            "device":"CPU"
        },
        {
            "index":5,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_2__24287_fix_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__inputs",
            "device":"CPU"
        },
        {
            "index":6,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_1__24154_fix_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__inputs_9",
            "device":"CPU"
        },
        {
            "index":7,
            "name":"subgraph_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__Conv2d__module__ModuleList_cls_preds__ModuleList_0__24021_fix_ModelNNDct__ModelNNDct_Detect__module__Detect_detect__inputs_5",
            "device":"CPU"
        }
    ]
}
</pre>
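Because this listing is JSON, it can be post-processed with a few lines of script. The following is a minimal Python sketch and not part of xdputil: it assumes the command output has been captured as text, the `LISTING` string is a shortened copy of the output above with abbreviated names, and the helper name `summarize_dpu_subgraphs` is hypothetical.

```python
import json
from math import prod

# Shortened copy of the `xdputil xmodel -l` output shown above.
# Names are abbreviated and the CPU subgraphs are omitted for brevity.
LISTING = """
{
  "subgraphs": [
    {"index": 0, "name": "subgraph_quant_input_1", "device": "USER"},
    {"index": 1, "name": "subgraph_stem_conv_input_3", "device": "DPU",
     "input_tensor": [
       {"index": 0, "name": "input_1_fix", "shape": [1, 640, 640, 3], "fixpos": 6}
     ],
     "output_tensor": [
       {"index": 0, "name": "reg_preds_1", "shape": [1, 40, 40, 4], "fixpos": 3},
       {"index": 1, "name": "cls_preds_2", "shape": [1, 20, 20, 80], "fixpos": 4},
       {"index": 2, "name": "reg_preds_2", "shape": [1, 20, 20, 4], "fixpos": 3},
       {"index": 3, "name": "cls_preds_0", "shape": [1, 80, 80, 80], "fixpos": 4},
       {"index": 4, "name": "reg_preds_0", "shape": [1, 80, 80, 4], "fixpos": 4},
       {"index": 5, "name": "cls_preds_1", "shape": [1, 40, 40, 80], "fixpos": 4}
     ],
     "reg info": [
       {"name": "REG_0", "context type": "CONST", "size": 34244864},
       {"name": "REG_1", "context type": "WORKSPACE", "size": 37017600},
       {"name": "REG_2", "context type": "DATA_LOCAL_INPUT", "size": 1230784},
       {"name": "REG_3", "context type": "DATA_LOCAL_OUTPUT", "size": 705600}
     ]
    }
  ]
}
"""


def summarize_dpu_subgraphs(listing_text):
    """Return one summary dict per DPU subgraph found in the listing."""
    summaries = []
    for sg in json.loads(listing_text)["subgraphs"]:
        if sg.get("device") != "DPU":
            continue  # USER and CPU subgraphs carry no tensor or register details
        summaries.append({
            "index": sg["index"],
            "input_shapes": [tuple(t["shape"]) for t in sg.get("input_tensor", [])],
            "output_elements": sum(prod(t["shape"]) for t in sg.get("output_tensor", [])),
            "reg_sizes": {r["context type"]: r["size"] for r in sg.get("reg info", [])},
        })
    return summaries


summary = summarize_dpu_subgraphs(LISTING)
for entry in summary:
    print(entry)
```

For this model, the six output tensors add up to 705600 elements, which equals the size reported for REG_3 (DATA_LOCAL_OUTPUT) in the listing.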

<strong>Convert an xmodel to other formats</strong>
<img src="https://fpga.eetrend.com/files/2024-09/wen_zhang_/100584824-361989-qita…; alt="">

Taking -t as an example: xdputil xmodel yolov6m_pt.xmodel -t yolov6_mt_xmodel.txt

The exported .txt file contains the detailed attributes of the input/output tensors, op nodes, and so on.

<strong>Show the details of a single operator in the xmodel. The op_name can be taken from the .txt file exported above.</strong>
<pre>
xdputil xmodel yolov6m_pt.xmodel --op ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3

xmodel: yolov6m_pt.xmodel
op_name: ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3
{
    "name" : "ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3",
    "type" : "conv2d-fix",
    "attrs" : {
        "workload" : 270336000,
        "device" : "DPU",
        "bias_term" : true,
        "workload_on_arch" : 635699200,
        "shift_hsigmoid" : -128,
        "nonlinear" : "RELU",
        "kernel" : [
            3,
            3
        ],
        "dilation" : [
            1,
            1
        ],
        "hsigmoid_in" : -128,
        "stride" : [
            2,
            2
        ],
        "out_dim" : 48,
        "channel_augmentation" : 1,
        "pad_mode" : "FLOOR",
        "shift_hswish" : -128,
        "pad" : [
            1,
            0,
            1,
            0
        ],
        "in_dim" : 3,
        "group" : 1
    },
    "inputs" : [
        {
            "index" : 0,
            "op_name" : "ModelNNDct__ModelNNDct_QuantStub_quant__input_1_upload_0",
            "tensor_name" : "ModelNNDct__ModelNNDct_QuantStub_quant__input_1_fix_upload_0",
            "shape" : [
                1,
                640,
                640,
                3
            ],
            "data_type" : "xint8"
        },
        {
            "index" : 1,
            "op_name" : "ModelNNDct___module_backbone_stem_conv_weight",
            "tensor_name" : "ModelNNDct___module_backbone_stem_conv_weight_fix",
            "shape" : [
                48,
                3,
                3,
                3
            ],
            "data_type" : "xint8"
        },
        {
            "index" : 2,
            "op_name" : "ModelNNDct___module_backbone_stem_conv_bias",
            "tensor_name" : "ModelNNDct___module_backbone_stem_conv_bias_fix",
            "shape" : [
                48
            ],
            "data_type" : "xint8"
        }
    ],
    "outputs" : {
        "op_name" : "ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__Conv2d_conv__input_3",
        "tensor_name" : "ModelNNDct__ModelNNDct_CSPBepBackbone__module__CSPBepBackbone_backbone__RealVGGBlock_stem__ReLU_relu__input_7_fix",
        "shape" : [
            1,
            320,
            320,
            48
        ],
        "data_type" : "xint8",
        "attrs" : {
            "round_mode" : "DPU_ROUND",
            "bit_width" : 8,
            "location" : 1,
            "reg_id" : 1,
            "fix_point" : 4,
            "if_signed" : true,
            "ddr_addr" : 11827200,
            "stride" : [
                4915200,
                15360,
                48,
                1
            ]
        }
    }
}
</pre>
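The attributes in this dump can be cross-checked against each other. The Python sketch below is illustrative only: it applies the standard convolution output-size formula with FLOOR rounding and computes dense row-major strides, then compares the results with the dumped output shape and stride. The helper names are hypothetical, and the order of the four pad values (two per spatial axis, before and after) is an assumption; because both axes use the same padding here, the order does not change the result.

```python
def conv_out_size(in_size, kernel, stride, pad_before, pad_after, dilation=1):
    """Output length along one axis for a convolution with FLOOR rounding."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + pad_before + pad_after - effective_kernel) // stride + 1


def dense_strides(shape):
    """Element strides of a densely packed row-major tensor."""
    strides = [1] * len(shape)
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return strides


# Values copied from the op dump above.
in_h = in_w = 640
out_dim = 48
pad = [1, 0, 1, 0]  # assumed layout: (before, after) for each spatial axis

out_w = conv_out_size(in_w, kernel=3, stride=2, pad_before=pad[0], pad_after=pad[1])
out_h = conv_out_size(in_h, kernel=3, stride=2, pad_before=pad[2], pad_after=pad[3])
out_shape = [1, out_h, out_w, out_dim]
out_strides = dense_strides(out_shape)

print(out_shape)    # compare with "shape" under "outputs"
print(out_strides)  # compare with "stride" under the output "attrs"
```

Both results agree with the dump: the output shape is [1, 320, 320, 48] and the strides are [4915200, 15360, 48, 1].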

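<strong>Show device information, including the DPU configuration, fingerprint, and runtime version. This gives a quick overview of the key DPU facts for the current board and helps debug runtime failures related to DPU compatibility.</strong>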

xdputil query
<center><img src="https://fpga.eetrend.com/files/2024-09/wen_zhang_/100584824-361984-3.pn…; alt=""></center>

<strong>Show the DPU register status</strong>
xdputil status
<center><img src="https://fpga.eetrend.com/files/2024-09/wen_zhang_/100584824-361985-4.pn…; alt=""></center>

<strong>Run a benchmark</strong>
xdputil benchmark <xmodel> [-i subgraph_index] <num_of_threads>

subgraph_index starts at 0; setting -i to -1 runs the whole graph. The subgraph_index values can be read from the output of xdputil xmodel -l.

If the first subgraph is a USER subgraph, -i 0 reports an error:

<pre>
{
    "index":0,
    "name":"subgraph_ModelNNDct__ModelNNDct_QuantStub_quant__input_1",
    "device":"USER"
},
</pre>
<center><img src="https://fpga.eetrend.com/files/2024-09/wen_zhang_/100584824-361986-5.pn…; alt=""></center>

With -i 1, the benchmark runs normally.
<center><img src="https://fpga.eetrend.com/files/2024-09/wen_zhang_/100584824-361987-6.pn…; alt=""></center>
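Picking the index by hand is easy to get wrong on models with many subgraphs. As a hedged illustration, the Python sketch below selects the first subgraph whose device is DPU from an xdputil xmodel -l style listing and assembles the benchmark command line in the form given above. It only builds the argument list and does not run anything; the function name `benchmark_command` and the shortened `LISTING` are hypothetical.

```python
import json
import shlex


def benchmark_command(listing_text, xmodel_path, num_threads=1):
    """Build an `xdputil benchmark` argument list for the first DPU subgraph."""
    subgraphs = json.loads(listing_text)["subgraphs"]
    dpu_indices = [sg["index"] for sg in subgraphs if sg.get("device") == "DPU"]
    if not dpu_indices:
        raise ValueError("no DPU subgraph found in the listing")
    # Form: xdputil benchmark <xmodel> [-i subgraph_index] <num_of_threads>
    return ["xdputil", "benchmark", xmodel_path,
            "-i", str(dpu_indices[0]), str(num_threads)]


# Shortened stand-in for the `xdputil xmodel -l` output of the model above.
LISTING = json.dumps({"subgraphs": [
    {"index": 0, "name": "subgraph_quant_input_1", "device": "USER"},
    {"index": 1, "name": "subgraph_stem_conv_input_3", "device": "DPU"},
    {"index": 2, "name": "subgraph_reg_preds_2", "device": "CPU"},
]})

cmd = benchmark_command(LISTING, "yolov6m_pt.xmodel", num_threads=1)
print(shlex.join(cmd))
```

On the target board, the resulting list could be passed to `subprocess.run`; for the listing above it selects index 1 and skips the USER subgraph at index 0.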

xdputil run helps debug incorrect DPU results by cross-checking reference values against the DPU inference output. UG1414 gives the detailed steps:
https://docs.amd.com/r/en-US/ug1414-vitis-ai/DPU-Debug-with-VART

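In summary, xdputil is simple to use and gives a more direct, in-depth view of the compiled model and of the current DPU. It is an effective debugging aid for problems such as a DPU that cannot be found, a fingerprint mismatch, or an excessive accuracy gap after quantization.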