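Torch-TensorRT Explained¶

Torch-TensorRT is a compiler for PyTorch models that targets NVIDIA GPUs via the TensorRT model optimization SDK. It aims to give PyTorch models better inference performance while retaining PyTorch's ease of use.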

Dynamo Frontend¶

The Dynamo frontend is the default frontend for Torch-TensorRT. It builds on PyTorch's dynamo compiler stack.

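`torch.compile` (Just-In-Time, JIT)¶

torch.compile is a JIT compiler stack, so compilation is deferred until first use. If conditions in the graph later change, the graph is recompiled automatically. This gives users the most runtime flexibility but limits the options for serialization.

Under the hood, torch.compile delegates to Torch-TensorRT the subgraphs it believes can be lowered to it. Torch-TensorRT further lowers these graphs into operations consisting only of Core ATen Operators or select "high-level ops" amenable to TensorRT acceleration. Each subgraph is then partitioned, based on operator support, into components that will run in PyTorch and components that will be compiled to TensorRT. TensorRT engines then replace the supported blocks, and the hybrid subgraph is returned to torch.compile to run on call.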
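The partitioning step described above can be illustrated with a toy sketch. This is not the Torch-TensorRT partitioner: it assumes the graph is a flat list of operator names, and both `SUPPORTED_OPS` and `partition` are hypothetical names introduced only for this illustration.

```python
from itertools import groupby

# Hypothetical set of operators that the accelerated backend can handle.
SUPPORTED_OPS = frozenset({"conv2d", "relu", "add", "linear"})


def partition(ops, supported=SUPPORTED_OPS):
    """Group consecutive ops into (target, [ops]) segments by operator support."""
    segments = []
    for is_supported, run in groupby(ops, key=lambda op: op in supported):
        target = "tensorrt" if is_supported else "pytorch"
        segments.append((target, list(run)))
    return segments


# A toy "graph": one unsupported op splits it into three segments.
graph = ["conv2d", "relu", "custom_nms", "linear", "add"]
for target, block in partition(graph):
    print(target, block)
```

Each run of supported operators would become one TensorRT engine, and the unsupported operator in between stays in PyTorch. A real partitioner works on a graph rather than a list and applies further criteria that this sketch leaves out.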

Accepted Formats¶

torch.fx GraphModule (torch.fx.GraphModule)
PyTorch Module (torch.nn.Module)

Returns¶

A wrapped function that triggers compilation on first call
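A minimal sketch of the JIT path, assuming a CUDA-capable machine with `torch` and `torch_tensorrt` installed. The helper name `compile_jit` is hypothetical. The backend string `"tensorrt"` is the one registered when `torch_tensorrt` is imported.

```python
def compile_jit(model, example_input):
    """Wrap `model` with torch.compile using the Torch-TensorRT backend."""
    # Imported lazily so this sketch can be loaded on machines without the GPU stack.
    import torch
    import torch_tensorrt  # noqa: F401  (importing registers the "tensorrt" backend)

    model = model.eval().cuda()

    # Returns immediately; nothing has been compiled yet.
    compiled = torch.compile(model, backend="tensorrt")

    # The first call triggers compilation of the supported subgraphs.
    with torch.no_grad():
        compiled(example_input.cuda())
    return compiled
```

Because compilation happens at call time, the first call is slow and later calls with the same input properties reuse the compiled graph. The returned wrapper cannot be serialized directly, which is the trade-off noted above.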

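`torch_tensorrt.dynamo.compile` (Ahead-of-Time, AOT)¶

torch_tensorrt.dynamo.compile is an AOT compiler: models are compiled in an explicit compilation phase, and the resulting artifacts can be serialized and reloaded later. Graphs go through the torch.export tracing system (torch.export.export) to be lowered into a graph consisting only of Core ATen Operators or select "high-level ops" amenable to TensorRT acceleration. Each subgraph is then partitioned, based on operator support, into components that will run in PyTorch and components that will be compiled to TensorRT. TensorRT engines then replace the supported blocks, and the hybrid subgraph is packaged into an ExportedProgram, which can be serialized and reloaded.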

Accepted Formats¶

torch.export.ExportedProgram (torch.export.ExportedProgram)
torch.fx GraphModule (torch.fx.GraphModule) (via torch.export.export)
PyTorch Module (torch.nn.Module) (via torch.export.export)

Returns¶

torch.fx.GraphModule (serializable as a torch.export.ExportedProgram)
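A minimal sketch of the AOT path, assuming a CUDA-capable machine with `torch` and `torch_tensorrt` installed and a model that torch.export.export can trace. The helper name `compile_aot` and the default file name `trt_model.ep` are hypothetical.

```python
def compile_aot(model, example_input, path="trt_model.ep"):
    """Export, compile, save, and reload a model through the AOT Dynamo path."""
    # Imported lazily so this sketch can be loaded on machines without the GPU stack.
    import torch
    import torch_tensorrt

    model = model.eval().cuda()
    inputs = [example_input.cuda()]

    # Explicit compilation phase: trace to an ExportedProgram, then compile it.
    exported = torch.export.export(model, tuple(inputs))
    trt_gm = torch_tensorrt.dynamo.compile(exported, inputs=inputs)

    # Serialize the compiled module and load it back as a callable module.
    torch_tensorrt.save(trt_gm, path, inputs=inputs)
    reloaded = torch.export.load(path).module()
    return trt_gm, reloaded
```

Unlike the JIT path, all compilation cost is paid up front, and the saved file can be loaded in a later process without recompiling.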

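Legacy Frontends¶

Because a number of compiler technologies have emerged in the PyTorch ecosystem over the years, Torch-TensorRT retains some legacy features that target them.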

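TorchScript (torch_tensorrt.ts.compile)¶

The TorchScript frontend was the original default frontend for Torch-TensorRT and targets models in the TorchScript format. The provided graph is partitioned into supported and unsupported blocks. Supported blocks are lowered to TensorRT, and unsupported blocks remain and run with LibTorch. The resulting graph is returned to the user as a ScriptModule that can be loaded and saved with the Torch-TensorRT PyTorch runtime extension.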

Accepted Formats¶

TorchScript Module (torch.jit.ScriptModule)
PyTorch Module (torch.nn.Module) (via torch.jit.script or torch.jit.trace)

Returns¶

TorchScript Module (torch.jit.ScriptModule)
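A minimal sketch of the TorchScript path, assuming a CUDA-capable machine with `torch` and `torch_tensorrt` installed and a scriptable model. The helper name `compile_torchscript`, the default input shape, and the file name `trt_model.ts` are hypothetical.

```python
def compile_torchscript(model, input_shape=(1, 3, 224, 224), path="trt_model.ts"):
    """Script a model, compile it with the TorchScript frontend, and round-trip it."""
    # Imported lazily so this sketch can be loaded on machines without the GPU stack.
    import torch
    import torch_tensorrt

    # Convert the nn.Module to a ScriptModule first.
    scripted = torch.jit.script(model.eval().cuda())

    # Supported blocks are lowered to TensorRT; the rest keeps running in LibTorch.
    trt_ts = torch_tensorrt.ts.compile(
        scripted,
        inputs=[torch_tensorrt.Input(input_shape)],
        enabled_precisions={torch.float},
    )

    # The result is an ordinary ScriptModule, so the TorchScript save/load calls apply.
    torch.jit.save(trt_ts, path)
    return torch.jit.load(path)
```

Loading the saved module in another process still requires `import torch_tensorrt` first, so that the runtime extension holding the TensorRT engine operators is registered.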

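FX Graph Modules (torch_tensorrt.fx.compile)¶

This frontend has been almost entirely replaced by the Dynamo frontend, which offers a superset of the features available through the FX frontend. The original FX frontend remains in the codebase for backward compatibility.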

Accepted Formats¶

torch.fx GraphModule (torch.fx.GraphModule)
PyTorch Module (torch.nn.Module) (via torch.fx.symbolic_trace)

Returns¶

torch.fx GraphModule (torch.fx.GraphModule)

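`torch_tensorrt.compile`¶

Because there are many different frontends and supported formats, we provide a convenience layer called torch_tensorrt.compile that gives users access to all the different compiler options. You select the compiler path by setting the ir option, which tells Torch-TensorRT to try to lower the provided model through a specific intermediate representation.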

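`ir` Options¶

torch_compile: Use the torch.compile system. Immediately returns a wrapped function that compiles on first call.
dynamo: Run the graph through the torch.export / torchdynamo stack. If the input module is a torch.nn.Module, it must be "export-traceable", because the module is traced with torch.export.export. Returns a torch.fx.GraphModule that can be run immediately or saved via torch.export.export or torch_tensorrt.save.
torchscript or ts: Run the graph through the TorchScript stack. If the input module is a torch.nn.Module, it must be "scriptable", because the module is compiled with torch.jit.script. Returns a torch.jit.ScriptModule that can be run immediately or saved via torch.jit.save or torch_tensorrt.save.
fx: Run the graph through the torch.fx stack. If the input module is a torch.nn.Module, it is traced with torch.fx.symbolic_trace and is subject to its limitations.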
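A minimal sketch of selecting a compiler path through the convenience layer, assuming `torch_tensorrt` is installed. The helper `compile_with_ir` and the `VALID_IRS` tuple are hypothetical. The tuple lists only the values described in this section, so the library may accept others.

```python
# The ir values described in this section (an assumption, not an exhaustive list).
VALID_IRS = ("torch_compile", "dynamo", "torchscript", "ts", "fx")


def compile_with_ir(model, example_inputs, ir="dynamo"):
    """Compile `model` through the path named by `ir` via torch_tensorrt.compile."""
    if ir not in VALID_IRS:
        raise ValueError(f"unknown ir {ir!r}; expected one of {VALID_IRS}")

    # Imported lazily so the validation above works without the GPU stack.
    import torch_tensorrt

    # The return type depends on `ir`: a wrapped function for "torch_compile",
    # a torch.fx.GraphModule for "dynamo" and "fx", a ScriptModule for "ts".
    return torch_tensorrt.compile(model, ir=ir, inputs=list(example_inputs))
```

A call such as `compile_with_ir(model, [example_input], ir="ts")` would route the model through the TorchScript frontend, so the caller should handle the result according to the return types listed above.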
