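Getting Started with ExecuTorch¶

This section describes the steps needed to convert a PyTorch model to ExecuTorch and run it. Using the framework typically involves the following steps:

Install the ExecuTorch Python package and runtime libraries.
Export the PyTorch model for the target hardware configuration.
Run the model on your development platform using the ExecuTorch runtime APIs.
Deploy the model to the target platform using the ExecuTorch runtime.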

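System Requirements¶

The following are required to install the ExecuTorch host libraries, which are needed to export models and run them from Python. Requirements for target end-user devices depend on the backend. See the appropriate backend documentation for more information.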

Python 3.10 - 3.12
g++ version 7 or higher, clang++ version 5 or higher, or another C++17-compatible toolchain.
Linux or macOS (Arm or x86).
- Windows is supported via WSL.
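
As a quick sanity check of the host requirements above, the following minimal sketch tests the Python version range and looks for a C++ compiler on PATH. The helper names are hypothetical and not part of ExecuTorch, and the sketch does not verify the compiler version or C++17 support.

```python
import shutil
import sys


def python_version_supported(version=sys.version_info) -> bool:
    """True when the interpreter falls in the 3.10 - 3.12 range listed above."""
    return (3, 10) <= tuple(version[:2]) <= (3, 12)


def find_cxx_compiler():
    """Return the path of the first C++ compiler found on PATH, or None."""
    for name in ("g++", "clang++"):
        path = shutil.which(name)
        if path is not None:
            return path
    return None


print("Python version OK:", python_version_supported())
print("C++ compiler:", find_cxx_compiler() or "not found")
```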

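Installation¶

To use ExecuTorch, you need to install both the Python package and the appropriate platform-specific runtime libraries. Pip is the recommended way to install the ExecuTorch Python package.

This package includes the dependencies needed to export a PyTorch model, as well as Python runtime bindings for model testing and evaluation. Consider installing ExecuTorch within a virtual environment, such as one provided by conda or venv.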

pip install executorch
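
To confirm that the package landed in the active environment, one option is to query the installed distribution metadata. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical, and the distribution name is assumed to match the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("executorch")
if version:
    print(f"executorch {version} is installed")
else:
    print("executorch is not installed in this environment")
```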

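To build the framework from source, see Building From Source. Backend delegates may require additional dependencies. See the appropriate backend documentation for more information.

Preparing the Model¶

Exporting is the process of converting a PyTorch model to the .pte file format used by the ExecuTorch runtime. This is done using Python APIs. PTE files for common models, such as Llama 3.2, can be found on HuggingFace under ExecuTorch Community. These models have already been exported and lowered for ExecuTorch, and can be deployed directly without going through the lowering process.

A complete example of exporting, lowering, and verifying MobileNet V2 is available as a Colab notebook.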

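Requirements¶

A PyTorch model.
Example model inputs, typically as PyTorch tensors. You should be able to successfully run the PyTorch model with these inputs.
One or more target hardware backends.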

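Choosing a Backend¶

ExecuTorch provides hardware acceleration for a wide variety of hardware. The most commonly used backends are XNNPACK (for Arm and x86 CPUs), Core ML (for iOS), Vulkan (for Android GPUs), and Qualcomm (for Qualcomm-powered Android phones).

For mobile use cases, consider using XNNPACK for Android and Core ML or XNNPACK for iOS as a first step. See Hardware Backends for more information.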

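Exporting¶

Exporting is done using Python APIs. ExecuTorch provides a high degree of customization during the export process, but the typical flow is as follows. This example uses the MobileNet V2 image classification model implementation in torchvision, but the process supports any export-compliant PyTorch model. For users working with Hugging Face models, a list of supported models is available in the huggingface/optimum-executorch repository.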

import torch
import torchvision.models as models
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge_transform_and_lower

model = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
sample_inputs = (torch.randn(1, 3, 224, 224), )

et_program = to_edge_transform_and_lower(
    torch.export.export(model, sample_inputs),
    partitioner=[XnnpackPartitioner()]
).to_executorch()

with open("model.pte", "wb") as f:
    f.write(et_program.buffer)

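If the model requires varying input sizes, you need to specify the varying dimensions and their bounds as part of the export call. See Model Export and Lowering for more information.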

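The hardware backend to target is controlled by the partitioner parameter of to_edge_transform_and_lower. In this example, XnnpackPartitioner is used to target mobile CPUs. See the backend-specific documentation for information on how to use each backend.

Quantization can also be done at this stage to reduce model size and runtime. Quantization is backend-specific. See the documentation for the target backend for a full description of supported quantization schemes.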
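
To give a rough sense of the size reduction, the sketch below estimates weight storage at different bit widths. It is back-of-the-envelope arithmetic only: the parameter count is an approximate figure assumed for MobileNet V2, the helper name is hypothetical, and the estimate ignores quantization parameters (scales, zero points) and any non-weight data in the .pte file.

```python
def weight_size_mb(num_params: int, bits_per_weight: int) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


# Assumption: roughly 3.5 million parameters (approximate, for illustration).
num_params = 3_500_000

fp32_mb = weight_size_mb(num_params, 32)
int8_mb = weight_size_mb(num_params, 8)
print(f"float32: {fp32_mb:.1f} MB, int8: {int8_mb:.1f} MB")
```

Actual savings depend on the quantization scheme the chosen backend supports and on which operators it quantizes.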

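Testing the Model¶

After successfully generating a .pte file, it is common to validate the model on the development platform using the Python runtime APIs. This can be used to evaluate model accuracy before running on-device.

For the MobileNet V2 model from torchvision used in this example, image inputs are expected as a normalized float32 tensor with dimensions (batch, channels, height, width). See torchvision.models.mobilenet_v2 for more information on the input and output tensor format for this model.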

import torch
from executorch.runtime import Runtime
from typing import List

runtime = Runtime.get()

input_tensor: torch.Tensor = torch.randn(1, 3, 224, 224)
program = runtime.load_program("model.pte")
method = program.load_method("forward")
output: List[torch.Tensor] = method.execute([input_tensor])
print("Run successfully via executorch")

from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
import torchvision.models as models

eager_reference_model = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
eager_reference_output = eager_reference_model(input_tensor)

print("Comparing against original PyTorch module")
print(torch.allclose(output[0], eager_reference_output, rtol=1e-3, atol=1e-5))
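
A single allclose result does not say how far apart the outputs are. As a complementary check, the sketch below reports the largest absolute difference and whether both outputs pick the same top-1 class. The helper compare_outputs is hypothetical and operates on plain sequences of floats, so the sample values here are made up for illustration.

```python
from typing import Sequence, Tuple


def compare_outputs(
    actual: Sequence[float], expected: Sequence[float]
) -> Tuple[float, bool]:
    """Return (largest absolute difference, whether the top-1 index matches)."""
    if len(actual) != len(expected):
        raise ValueError("outputs must have the same length")
    max_abs_diff = max(abs(a - e) for a, e in zip(actual, expected))
    top1_actual = max(range(len(actual)), key=actual.__getitem__)
    top1_expected = max(range(len(expected)), key=expected.__getitem__)
    return max_abs_diff, top1_actual == top1_expected


# Made-up class scores standing in for real model outputs.
diff, same_top1 = compare_outputs([0.10, 0.70, 0.20], [0.11, 0.69, 0.20])
print(f"max abs diff: {diff:.3f}, top-1 match: {same_top1}")
```

With the tensors from the example above, the inputs could be obtained with output[0].flatten().tolist() and eager_reference_output.flatten().tolist().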

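For complete examples of exporting and running a model, please refer to our examples GitHub repository.

Additionally, if you work with Hugging Face models, the huggingface/optimum-executorch library simplifies running these models end-to-end with ExecuTorch using familiar Hugging Face APIs. Visit the repository for specific examples and supported models.

Running on Device¶

ExecuTorch provides runtime APIs in Java, Objective-C, and C++.

Quick Links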

Android
iOS
C++

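Android¶

Installation¶

ExecuTorch provides Java bindings for Android, which can be used from both Java and Kotlin. To add the library to your app, add the following dependency to your Gradle build rule.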

// app/build.gradle.kts
dependencies {
  implementation("org.pytorch:executorch-android:0.6.0-rc3")
}

// See the latest available versions at https://mvnrepository.com/artifact/org.pytorch/executorch-android

Runtime APIs¶

Models can be loaded and run using the Module class:

import org.pytorch.executorch.EValue;
import org.pytorch.executorch.Module;
import org.pytorch.executorch.Tensor;

// …

Module model = Module.load("/path/to/model.pte");

Tensor input_tensor = Tensor.fromBlob(float_data, new long[] { 1, 3, height, width });
EValue input_evalue = EValue.from(input_tensor);
EValue[] output = model.forward(input_evalue);
float[] scores = output[0].toTensor().getDataAsFloatArray();

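For a full example of running a model on Android, see the DeepLabV3AndroidDemo. For more information on Android development, including building from source, a full description of the Java APIs, and information on using ExecuTorch from Android native code, see Using ExecuTorch on Android.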

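iOS¶

Installation¶

ExecuTorch supports both iOS and macOS via C++, and provides hardware backends for CoreML, MPS, and CPU. The iOS runtime library is provided as a collection of .xcframework targets and is made available as a Swift PM package.

To get started with Xcode, go to File > Add Package Dependencies. Paste the URL of the ExecuTorch repository into the search bar and select it. Make sure to change the branch name to the desired ExecuTorch version, in the format "swiftpm-" followed by the version (for example, "swiftpm-0.6.0"). The ExecuTorch dependency can also be added to the package file manually. See Using ExecuTorch on iOS for more information.

Runtime APIs¶

Models can be loaded and run from Objective-C using the C++ APIs.

For more information on iOS integration, including an API reference, logging setup, and building from source, see Using ExecuTorch on iOS.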

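C++¶

ExecuTorch provides C++ APIs, which can be used to target embedded or mobile devices. Compared to the other language bindings, the C++ APIs provide a greater level of control, allowing for advanced memory management, data loading, and platform integration.

Installation¶

CMake is the preferred build system for the ExecuTorch C++ runtime. To use it with CMake, clone the ExecuTorch repository as a subdirectory of your project, and use CMake's add_subdirectory("executorch") to include the dependency. The executorch target, as well as the kernel and backend targets, will then be available to link against. The runtime can also be built standalone to support diverse toolchains. See Using ExecuTorch with C++ for a detailed description of build integration, targets, and cross compilation.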

git clone -b release/0.6 https://github.com/pytorch/executorch.git

# CMakeLists.txt
add_subdirectory("executorch")
...
target_link_libraries(
  my_target
  PRIVATE executorch
          extension_module_static
          extension_tensor
          optimized_native_cpu_ops_lib
          xnnpack_backend)

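Runtime APIs¶

Both high-level and low-level C++ APIs are provided. The low-level APIs are platform independent, do not dynamically allocate memory, and are most suitable for resource-constrained embedded systems. The high-level APIs are convenience wrappers around the low-level APIs, and use dynamic memory allocation and standard library constructs to reduce verbosity.

ExecuTorch uses CMake for native builds. Integration is typically done by cloning the ExecuTorch repository and using CMake add_subdirectory to add the dependency.

Loading and running a model using the high-level API can be done as follows: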

#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>

using namespace ::executorch::extension;

// Load the model.
Module module("/path/to/model.pte");

// Create an input tensor.
float input[1 * 3 * 224 * 224];
auto tensor = from_blob(input, {1, 3, 224, 224});

// Perform an inference.
const auto result = module.forward(tensor);

if (result.ok()) {
  // Retrieve the output data.
  const auto output = result->at(0).toTensor().const_data_ptr<float>();
}

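For more information on the C++ APIs, see Running an ExecuTorch Model Using the Module Extension in C++ and Managing Tensor Memory in C++.

For complete examples of building and running a C++ application, please refer to our examples GitHub repository.

Next Steps¶

ExecuTorch provides a high degree of customizability to support diverse hardware targets. Depending on your use case, consider exploring one or more of the following pages:

Export and Lowering, for advanced model conversion options.
Backend Overview, for available backends and configuration options.
Using ExecuTorch on Android and Using ExecuTorch on iOS, for mobile runtime integration.
Using ExecuTorch with C++, for embedded and mobile native development.
Profiling and Debugging, for developer tooling and debugging.
API Reference, for a full description of available APIs.
Examples, for demo apps and example code.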
