convert_fx
- torch.ao.quantization.quantize_fx.convert_fx(graph_module, convert_custom_config=None, _remove_qconfig=True, qconfig_mapping=None, backend_config=None, keep_original_weights=False)
Convert a calibrated or trained model to a quantized model.
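- Parameters
graph_module (*) – A prepared and calibrated/trained model (GraphModule).
convert_custom_config (*) – Custom configurations for the convert function. See ConvertCustomConfig for more details.
_remove_qconfig (*) – Option to remove the qconfig attributes in the model after convert.
qconfig_mapping (*) –
Config specifying how to quantize the model.
The keys must include the keys of the qconfig_mapping passed to prepare_fx or prepare_qat_fx, with the same values or None. Additional keys can be specified, with their values set to None.
For each entry whose value is None, quantization of that entry in the model is skipped: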
    # qconfig_from_prepare: the qconfig that was passed to prepare_fx / prepare_qat_fx
    qconfig_mapping = (
        QConfigMapping()
        .set_global(qconfig_from_prepare)
        .set_object_type(torch.nn.functional.add, None)  # skip quantizing torch.nn.functional.add
        .set_object_type(torch.nn.functional.linear, qconfig_from_prepare)
        .set_module_name("foo.bar", None)  # skip quantizing module "foo.bar"
    )
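backend_config (BackendConfig) – A configuration for the backend that describes how operators should be quantized in the backend. This includes quantization mode support (static/dynamic/weight_only), dtype support (quint8/qint8 etc.), and observer placement for each operator and fused operator. See BackendConfig for more details.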
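The None-valued entries described for qconfig_mapping can be checked before the mapping is handed to convert_fx. The following is a minimal sketch under stated assumptions: default_qconfig stands in for whatever qconfig was actually used at prepare time, torch.add is substituted for the function in the example above, and the module name "foo.bar" is hypothetical. It only builds the mapping and lists the entries that would be skipped; it does not call convert_fx.

```python
import torch
from torch.ao.quantization import QConfigMapping, default_qconfig

# Assumption: stand-in for the qconfig that was passed to prepare_fx.
qconfig_from_prepare = default_qconfig

qconfig_mapping = (
    QConfigMapping()
    .set_global(qconfig_from_prepare)
    .set_object_type(torch.add, None)  # None: skip quantizing torch.add
    .set_object_type(torch.nn.functional.linear, qconfig_from_prepare)
    .set_module_name("foo.bar", None)  # None: skip the (hypothetical) module "foo.bar"
)

# Collect the entries whose value is None, i.e. the ones to be skipped.
skipped_object_types = [
    key for key, qconfig in qconfig_mapping.object_type_qconfigs.items()
    if qconfig is None
]
skipped_module_names = [
    name for name, qconfig in qconfig_mapping.module_name_qconfigs.items()
    if qconfig is None
]

print("skipped object types:", skipped_object_types)
print("skipped module names:", skipped_module_names)
```

Listing the None entries this way is a cheap sanity check that the convert-time mapping skips exactly what was intended while keeping every other key identical to the prepare-time mapping.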
- Returns
A quantized model (torch.nn.Module)
- Return type
GraphModule
Example:
    from torch.ao.quantization.quantize_fx import convert_fx

    # prepared_model: the model after prepare_fx/prepare_qat_fx and calibration/training
    # convert_fx converts a calibrated/trained model to a quantized model for the
    # target hardware. This includes converting the model first to a reference
    # quantized model, and then lowering the reference quantized model to a backend.
    # Currently, the supported backends are fbgemm (onednn) and qnnpack (xnnpack);
    # they share the same set of quantized operators, so the same lowering
    # procedure is used for both.
    #
    # backend_config defines the corresponding reference quantized module for
    # the weighted modules in the model, e.g. nn.Linear
    # TODO: add backend_config after we split the backend_config for fbgemm and qnnpack
    # e.g. backend_config = get_default_backend_config("fbgemm")
    quantized_model = convert_fx(prepared_model)
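The example above assumes that prepared_model already exists. As a rough end-to-end sketch of how it might be produced, the block below runs post-training static quantization on a toy model: prepare_fx, a short calibration loop, then convert_fx. TinyModel, the random calibration data, and the engine-selection line are illustrative assumptions, not part of the API. Passing get_native_backend_config() explicitly is also just for illustration, since convert_fx falls back to the native backend config when backend_config is None.

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.backend_config import get_native_backend_config
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx


class TinyModel(torch.nn.Module):
    """Hypothetical toy model used only for this sketch."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.linear(x))


# Assumption: pick whichever quantized engine this build of PyTorch supports.
supported = torch.backends.quantized.supported_engines
engine = "fbgemm" if "fbgemm" in supported else "qnnpack"
torch.backends.quantized.engine = engine

float_model = TinyModel().eval()  # post-training quantization expects eval mode
example_inputs = (torch.randn(1, 4),)
qconfig_mapping = get_default_qconfig_mapping(engine)

# Step 1: insert observers into the traced graph.
prepared_model = prepare_fx(float_model, qconfig_mapping, example_inputs)

# Step 2: calibrate with representative data (random tensors here).
with torch.no_grad():
    for _ in range(8):
        prepared_model(torch.randn(1, 4))

# Step 3: convert the calibrated model to a quantized model.
quantized_model = convert_fx(
    prepared_model, backend_config=get_native_backend_config()
)

output = quantized_model(torch.randn(1, 4))
print(type(quantized_model).__name__, tuple(output.shape))
```

In real use the calibration loop would iterate over a representative sample of the actual input data, since the observers derive the quantization parameters from what they see during that step.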