Introduction to PyTorch Tensors

Created On: Nov 30, 2021 | Last Updated: Jan 29, 2025 | Last Verified: Nov 05, 2024

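Tensors are the central data abstraction in PyTorch. This interactive notebook provides an in-depth introduction to the torch.Tensor class.

First things first, let's import the PyTorch module. We'll also add Python's math module to facilitate some of the examples.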

import torch
import math

Creating Tensors

The simplest way to create a tensor is with the torch.empty() call:

x = torch.empty(3, 4)
print(type(x))
print(x)

<class 'torch.Tensor'>
tensor([[9.7006e+34, 3.0663e-41, 2.5645e+27, 3.0663e-41],
        [1.1210e-43, 0.0000e+00, 8.9683e-44, 0.0000e+00],
        [7.6470e-19, 3.0670e-41, 4.6243e-44, 0.0000e+00]])

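Let's unpack what we just did:

- We created a tensor using one of the numerous factory methods attached to the torch module.
- The tensor itself is 2-dimensional, having 3 rows and 4 columns.
- The type of the object returned is torch.Tensor, which is an alias for torch.FloatTensor; by default, PyTorch tensors are populated with 32-bit floating point numbers. (More on data types below.)
- You will probably see some random-looking values when printing your tensor. The torch.empty() call allocates memory for the tensor, but does not initialize it with any values, so what you're seeing is whatever was in memory at the time of allocation.

A brief note about tensors and their number of dimensions, and terminology:

- You will sometimes see a 1-dimensional tensor called a vector.
- Likewise, a 2-dimensional tensor is often referred to as a matrix.
- Anything with more than two dimensions is generally just called a tensor.

More often than not, you'll want to initialize your tensor with some value. Common cases are all zeros, all ones, or random values, and the torch module provides factory methods for all of these: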

zeros = torch.zeros(2, 3)
print(zeros)

ones = torch.ones(2, 3)
print(ones)

torch.manual_seed(1729)
random = torch.rand(2, 3)
print(random)

tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[0.3126, 0.3791, 0.3087],
        [0.0736, 0.4216, 0.0691]])

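The factory methods all do just what you'd expect: we have a tensor full of zeros, another full of ones, and another with random values between 0 and 1.

Random Tensors and Seeding

Speaking of the random tensor, did you notice the call to torch.manual_seed() immediately preceding it? Initializing tensors with random values, such as a model's learning weights, is common, but there are times, especially in research settings, when you'll want some assurance of the reproducibility of your results. Manually setting your random number generator's seed is the way to do this. Let's look more closely: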

torch.manual_seed(1729)
random1 = torch.rand(2, 3)
print(random1)

random2 = torch.rand(2, 3)
print(random2)

torch.manual_seed(1729)
random3 = torch.rand(2, 3)
print(random3)

random4 = torch.rand(2, 3)
print(random4)

tensor([[0.3126, 0.3791, 0.3087],
        [0.0736, 0.4216, 0.0691]])
tensor([[0.2332, 0.4047, 0.2162],
        [0.9927, 0.4128, 0.5938]])
tensor([[0.3126, 0.3791, 0.3087],
        [0.0736, 0.4216, 0.0691]])
tensor([[0.2332, 0.4047, 0.2162],
        [0.9927, 0.4128, 0.5938]])

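What you should see above is that random1 and random3 carry identical values, as do random2 and random4. Manually setting the RNG's seed resets it, so that identical computations depending on random numbers should, in most settings, provide identical results.

For more information, see the PyTorch documentation on reproducibility.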
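
As a small, hedged sketch of the point above, the check below compares the tensors programmatically instead of by eye. The variable names are illustrative only; torch.equal() is used here as one convenient way to test that two tensors have the same shape and values.

```python
import torch

# Seed, then draw one random tensor.
torch.manual_seed(1729)
first = torch.rand(2, 3)

# Re-seed with the same value: the next draw repeats the first one.
torch.manual_seed(1729)
second = torch.rand(2, 3)

# Without re-seeding, the generator simply continues its sequence.
third = torch.rand(2, 3)

same_after_reseed = torch.equal(first, second)
differs_without_reseed = not torch.equal(second, third)
print(same_after_reseed, differs_without_reseed)
```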

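Tensor Shapes

Often, when you're performing operations on two or more tensors, they will need to be of the same shape, that is, having the same number of dimensions and the same number of cells in each dimension. For that, we have the torch.*_like() methods: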

x = torch.empty(2, 2, 3)
print(x.shape)
print(x)

empty_like_x = torch.empty_like(x)
print(empty_like_x.shape)
print(empty_like_x)

zeros_like_x = torch.zeros_like(x)
print(zeros_like_x.shape)
print(zeros_like_x)

ones_like_x = torch.ones_like(x)
print(ones_like_x.shape)
print(ones_like_x)

rand_like_x = torch.rand_like(x)
print(rand_like_x.shape)
print(rand_like_x)

torch.Size([2, 2, 3])
tensor([[[6.9458e-19, 3.0670e-41, 1.4013e-45],
         [0.0000e+00, 1.4013e-45, 0.0000e+00]],

        [[1.4013e-45, 0.0000e+00, 1.4013e-45],
         [0.0000e+00, 1.4013e-45, 0.0000e+00]]])
torch.Size([2, 2, 3])
tensor([[[1.0845e-17, 3.0670e-41, 1.4013e-45],
         [0.0000e+00, 1.4013e-45, 0.0000e+00]],

        [[1.4013e-45, 0.0000e+00, 1.4013e-45],
         [0.0000e+00, 1.4013e-45, 0.0000e+00]]])
torch.Size([2, 2, 3])
tensor([[[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]]])
torch.Size([2, 2, 3])
tensor([[[1., 1., 1.],
         [1., 1., 1.]],

        [[1., 1., 1.],
         [1., 1., 1.]]])
torch.Size([2, 2, 3])
tensor([[[0.6128, 0.1519, 0.0453],
         [0.5035, 0.9978, 0.3884]],

        [[0.6929, 0.1703, 0.1384],
         [0.4759, 0.7481, 0.0361]]])

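The first new thing in the code cell above is the use of the .shape property on a tensor. This property contains a torch.Size object (a tuple subclass) listing the extent of each dimension of the tensor; in our case, x is a three-dimensional tensor with shape 2 x 2 x 3.

Below that, we call the .empty_like(), .zeros_like(), .ones_like(), and .rand_like() methods. Using the .shape property, we can verify that each of these methods returns a tensor of identical dimensionality and extent.

The last way to create a tensor that we will cover is to specify its data directly from a Python collection: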

some_constants = torch.tensor([[3.1415926, 2.71828], [1.61803, 0.0072897]])
print(some_constants)

some_integers = torch.tensor((2, 3, 5, 7, 11, 13, 17, 19))
print(some_integers)

more_integers = torch.tensor(((2, 4, 6), [3, 6, 9]))
print(more_integers)

tensor([[3.1416, 2.7183],
        [1.6180, 0.0073]])
tensor([ 2,  3,  5,  7, 11, 13, 17, 19])
tensor([[2, 4, 6],
        [3, 6, 9]])

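Using torch.tensor() is the most straightforward way to create a tensor if you already have data in a Python tuple or list. As shown above, nesting the collections will result in a multi-dimensional tensor.

Note

torch.tensor() creates a copy of the data.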
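
The copy is easiest to observe when the source is a mutable buffer. The sketch below assumes a NumPy array as the source (an illustrative choice, not part of the example above) and contrasts torch.tensor() with torch.from_numpy(), which shares memory with the array instead of copying it.

```python
import numpy as np
import torch

source = np.array([1.0, 2.0, 3.0])

copied = torch.tensor(source)      # independent copy of the data
shared = torch.from_numpy(source)  # shares memory with the NumPy array

# Mutate the source after both tensors exist.
source[0] = 99.0

print(copied)  # first element is still 1.0
print(shared)  # first element is now 99.0
```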

Tensor Data Types

Setting the datatype of a tensor is possible in a couple of ways:

a = torch.ones((2, 3), dtype=torch.int16)
print(a)

b = torch.rand((2, 3), dtype=torch.float64) * 20.
print(b)

c = b.to(torch.int32)
print(c)

tensor([[1, 1, 1],
        [1, 1, 1]], dtype=torch.int16)
tensor([[ 0.9956,  1.4148,  5.8364],
        [11.2406, 11.2083, 11.6692]], dtype=torch.float64)
tensor([[ 0,  1,  5],
        [11, 11, 11]], dtype=torch.int32)

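The simplest way to set the underlying data type of a tensor is with an optional argument at creation time. In the first line of the cell above, we set dtype=torch.int16 for the tensor a. When we print a, we can see that it's full of 1 rather than 1., a subtle cue in the printed output that this is an integer type rather than floating point.

Another thing to notice about printing a is that, unlike when we left dtype as the default (32-bit floating point), printing the tensor also specifies its dtype.

You may have also noticed that we went from specifying the tensor's shape as a series of integer arguments to grouping those arguments in a tuple. This is not strictly necessary (PyTorch will take a series of initial, unlabeled integer arguments as a tensor shape), but when adding the optional arguments, it can make your intent more readable.

The other way to set the datatype is with the .to() method. In the cell above, we create a random floating point tensor b in the usual way. Following that, we create c by converting b to a 32-bit integer with the .to() method. Note that c contains all the same values as b, but truncated to integers.

For more information, see the data types documentation.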
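
To make "truncated" concrete, here is a minimal sketch with hand-picked values (chosen for illustration, not taken from the cell above). Converting a floating point tensor to an integer dtype drops the fractional part, so values move toward zero instead of being rounded to the nearest integer.

```python
import torch

values = torch.tensor([-1.7, -0.2, 0.9, 2.5])

# Conversion to an integer dtype discards the fractional part.
as_int = values.to(torch.int32)

print(as_int.tolist())  # [-1, 0, 0, 2]
print(as_int.dtype)     # torch.int32
```

If rounding is what you want, apply torch.round() before the conversion.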

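Math & Logic with PyTorch Tensors

Now that you know some of the ways to create a tensor... what can you do with them?

Let's look at basic arithmetic first, and how tensors interact with simple scalars: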

ones = torch.zeros(2, 2) + 1
twos = torch.ones(2, 2) * 2
threes = (torch.ones(2, 2) * 7 - 1) / 2
fours = twos ** 2
sqrt2s = twos ** 0.5

print(ones)
print(twos)
print(threes)
print(fours)
print(sqrt2s)

tensor([[1., 1.],
        [1., 1.]])
tensor([[2., 2.],
        [2., 2.]])
tensor([[3., 3.],
        [3., 3.]])
tensor([[4., 4.],
        [4., 4.]])
tensor([[1.4142, 1.4142],
        [1.4142, 1.4142]])

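As you can see above, arithmetic operations between tensors and scalars, such as addition, subtraction, multiplication, division, and exponentiation, are distributed over every element of the tensor. Because the output of such an operation will be a tensor, you can chain them together with the usual operator precedence rules, as in the line where we create threes.

Similar operations between two tensors also behave as you'd intuitively expect: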

powers2 = twos ** torch.tensor([[1, 2], [3, 4]])
print(powers2)

fives = ones + fours
print(fives)

dozens = threes * fours
print(dozens)

tensor([[ 2.,  4.],
        [ 8., 16.]])
tensor([[5., 5.],
        [5., 5.]])
tensor([[12., 12.],
        [12., 12.]])

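It's important to note here that all of the tensors in the previous code cell were of identical shape. What happens when we try to perform a binary operation on tensors of dissimilar shape?

Note

The following cell throws a run-time error. This is intentional.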

a = torch.rand(2, 3)
b = torch.rand(3, 2)

print(a * b)

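In the general case, you cannot operate on tensors of different shape this way, even in a case like the cell above, where the tensors have an identical number of elements.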
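
If you want to see the failure without halting the notebook, one option is to catch the exception. The sketch below is illustrative: the variable names are made up, and transposing b is just one way to bring the two shapes into agreement.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Element-wise multiplication of (2, 3) by (3, 2) is rejected.
try:
    a * b
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(error_message)

# Transposing b yields shape (2, 3), which matches a.
product = a * b.t()
print(product.shape)  # torch.Size([2, 3])
```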

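In Brief: Tensor Broadcasting

Note

If you are familiar with broadcasting semantics in NumPy ndarrays, you'll find the same rules apply here.

The exception to the same-shapes rule is tensor broadcasting. Here's an example: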

rand = torch.rand(2, 4)
doubled = rand * (torch.ones(1, 4) * 2)

print(rand)
print(doubled)

tensor([[0.6146, 0.5999, 0.5013, 0.9397],
        [0.8656, 0.5207, 0.6865, 0.3614]])
tensor([[1.2291, 1.1998, 1.0026, 1.8793],
        [1.7312, 1.0413, 1.3730, 0.7228]])

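What's the trick here? How is it we got to multiply a 2x4 tensor by a 1x4 tensor?

Broadcasting is a way to perform an operation between tensors that have similarities in their shapes. In the example above, the one-row, four-column tensor is multiplied by both rows of the two-row, four-column tensor.

This is an important operation in deep learning. The common example is multiplying a tensor of learning weights by a batch of input tensors, applying the operation to each instance in the batch separately, and returning a tensor of identical shape, just like our (2, 4) * (1, 4) example above returned a tensor of shape (2, 4).

The rules for broadcasting are:

- Each tensor must have at least one dimension; no empty tensors.
- Comparing the dimension sizes of the two tensors, going from last to first:
  - Each dimension must be equal, or
  - One of the dimensions must be of size 1, or
  - The dimension does not exist in one of the tensors

Tensors of identical shape, of course, are trivially "broadcastable", as you saw earlier.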
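
The rules above can be checked on shapes alone, without allocating any data. The sketch below uses torch.broadcast_shapes(), which is available in recent PyTorch releases (an assumption about your installed version); the specific shapes are illustrative.

```python
import torch

# Trailing dimensions match; the leading dimension is absent in the second shape.
shape_b = torch.broadcast_shapes((4, 3, 2), (3, 2))
# Last dimension is 1 in the second shape, so it stretches to 2.
shape_c = torch.broadcast_shapes((4, 3, 2), (3, 1))
print(shape_b)  # torch.Size([4, 3, 2])
print(shape_c)  # torch.Size([4, 3, 2])

# Comparing last-to-first: 2 vs 3 are unequal and neither is 1, so this fails.
try:
    torch.broadcast_shapes((4, 3, 2), (4, 3))
    compatible = True
except RuntimeError:
    compatible = False

print(compatible)  # False
```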

Here are some examples of situations that honor the above rules and allow broadcasting:

a =     torch.ones(4, 3, 2)

b = a * torch.rand(   3, 2) # 3rd & 2nd dims identical to a, dim 1 absent
print(b)

c = a * torch.rand(   3, 1) # 3rd dim = 1, 2nd dim identical to a
print(c)

d = a * torch.rand(   1, 2) # 3rd dim identical to a, 2nd dim = 1
print(d)

tensor([[[0.6493, 0.2633],
         [0.4762, 0.0548],
         [0.2024, 0.5731]],

        [[0.6493, 0.2633],
         [0.4762, 0.0548],
         [0.2024, 0.5731]],

        [[0.6493, 0.2633],
         [0.4762, 0.0548],
         [0.2024, 0.5731]],

        [[0.6493, 0.2633],
         [0.4762, 0.0548],
         [0.2024, 0.5731]]])
tensor([[[0.7191, 0.7191],
         [0.4067, 0.4067],
         [0.7301, 0.7301]],

        [[0.7191, 0.7191],
         [0.4067, 0.4067],
         [0.7301, 0.7301]],

        [[0.7191, 0.7191],
         [0.4067, 0.4067],
         [0.7301, 0.7301]],

        [[0.7191, 0.7191],
         [0.4067, 0.4067],
         [0.7301, 0.7301]]])
tensor([[[0.6276, 0.7357],
         [0.6276, 0.7357],
         [0.6276, 0.7357]],

        [[0.6276, 0.7357],
         [0.6276, 0.7357],
         [0.6276, 0.7357]],

        [[0.6276, 0.7357],
         [0.6276, 0.7357],
         [0.6276, 0.7357]],

        [[0.6276, 0.7357],
         [0.6276, 0.7357],
         [0.6276, 0.7357]]])

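Look closely at the values of each tensor above:

- The multiplication operation that created b was broadcast over every "layer" of a.
- For c, the operation was broadcast over every layer and row of a; every 3-element column is identical.
- For d, we switched it around: now every row is identical, across layers and columns.

For more information on broadcasting, see the PyTorch documentation on the topic.

Here are some examples of attempts at broadcasting that will fail:

Note

The following cell throws a run-time error. This is intentional.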

a =     torch.ones(4, 3, 2)

b = a * torch.rand(4, 3)    # dimensions must match last-to-first

c = a * torch.rand(   2, 3) # both 3rd & 2nd dims different

d = a * torch.rand((0, ))   # can't broadcast with an empty tensor

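More Math with Tensors

PyTorch tensors have over three hundred operations that can be performed on them.

Here is a small sample from some of the major categories of operations: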

# common functions
a = torch.rand(2, 4) * 2 - 1
print('Common functions:')
print(torch.abs(a))
print(torch.ceil(a))
print(torch.floor(a))
print(torch.clamp(a, -0.5, 0.5))

# trigonometric functions and their inverses
angles = torch.tensor([0, math.pi / 4, math.pi / 2, 3 * math.pi / 4])
sines = torch.sin(angles)
inverses = torch.asin(sines)
print('\nSine and arcsine:')
print(angles)
print(sines)
print(inverses)

# bitwise operations
print('\nBitwise XOR:')
b = torch.tensor([1, 5, 11])
c = torch.tensor([2, 7, 10])
print(torch.bitwise_xor(b, c))

# comparisons:
print('\nBroadcasted, element-wise equality comparison:')
d = torch.tensor([[1., 2.], [3., 4.]])
e = torch.ones(1, 2)  # many comparison ops support broadcasting!
print(torch.eq(d, e)) # returns a tensor of type bool

# reductions:
print('\nReduction ops:')
print(torch.max(d))        # returns a single-element tensor
print(torch.max(d).item()) # extracts the value from the returned tensor
print(torch.mean(d))       # average
print(torch.std(d))        # standard deviation
print(torch.prod(d))       # product of all numbers
print(torch.unique(torch.tensor([1, 2, 1, 2, 1, 2]))) # filter unique elements

# vector and linear algebra operations
v1 = torch.tensor([1., 0., 0.])         # x unit vector
v2 = torch.tensor([0., 1., 0.])         # y unit vector
m1 = torch.rand(2, 2)                   # random matrix
m2 = torch.tensor([[3., 0.], [0., 3.]]) # three times identity matrix

print('\nVectors & Matrices:')
print(torch.linalg.cross(v2, v1)) # negative of z unit vector (v1 x v2 == -v2 x v1)
print(m1)
m3 = torch.linalg.matmul(m1, m2)
print(m3)                  # 3 times m1
print(torch.linalg.svd(m3))       # singular value decomposition

Common functions:
tensor([[0.9238, 0.5724, 0.0791, 0.2629],
        [0.1986, 0.4439, 0.6434, 0.4776]])
tensor([[-0., -0., 1., -0.],
        [-0., 1., 1., -0.]])
tensor([[-1., -1.,  0., -1.],
        [-1.,  0.,  0., -1.]])
tensor([[-0.5000, -0.5000,  0.0791, -0.2629],
        [-0.1986,  0.4439,  0.5000, -0.4776]])

Sine and arcsine:
tensor([0.0000, 0.7854, 1.5708, 2.3562])
tensor([0.0000, 0.7071, 1.0000, 0.7071])
tensor([0.0000, 0.7854, 1.5708, 0.7854])

Bitwise XOR:
tensor([3, 2, 1])

Broadcasted, element-wise equality comparison:
tensor([[ True, False],
        [False, False]])

Reduction ops:
tensor(4.)
4.0
tensor(2.5000)
tensor(1.2910)
tensor(24.)
tensor([1, 2])

Vectors & Matrices:
tensor([ 0.,  0., -1.])
tensor([[0.7375, 0.8328],
        [0.8444, 0.2941]])
tensor([[2.2125, 2.4985],
        [2.5332, 0.8822]])
torch.return_types.linalg_svd(
U=tensor([[-0.7889, -0.6145],
        [-0.6145,  0.7889]]),
S=tensor([4.1498, 1.0548]),
Vh=tensor([[-0.7957, -0.6056],
        [ 0.6056, -0.7957]]))

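This is a small sample of operations. For more details and the full inventory of math functions, see the documentation. For more details and the full inventory of linear algebra operations, see the linear algebra documentation.

Altering Tensors in Place

Most binary operations on tensors will return a third, new tensor. When we say c = a * b (where a and b are tensors), the new tensor c will occupy a region of memory distinct from the other tensors.

There are times, though, when you may wish to alter a tensor in place, for example, if you're doing an element-wise computation where you can discard intermediate values. For this, most of the math functions have a version with an appended underscore (_) that will alter a tensor in place.

For example: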

a = torch.tensor([0, math.pi / 4, math.pi / 2, 3 * math.pi / 4])
print('a:')
print(a)
print(torch.sin(a))   # this operation creates a new tensor in memory
print(a)              # a has not changed

b = torch.tensor([0, math.pi / 4, math.pi / 2, 3 * math.pi / 4])
print('\nb:')
print(b)
print(torch.sin_(b))  # note the underscore
print(b)              # b has changed

a:
tensor([0.0000, 0.7854, 1.5708, 2.3562])
tensor([0.0000, 0.7071, 1.0000, 0.7071])
tensor([0.0000, 0.7854, 1.5708, 2.3562])

b:
tensor([0.0000, 0.7854, 1.5708, 2.3562])
tensor([0.0000, 0.7071, 1.0000, 0.7071])
tensor([0.0000, 0.7071, 1.0000, 0.7071])

For arithmetic operations, there are functions that behave similarly:

a = torch.ones(2, 2)
b = torch.rand(2, 2)

print('Before:')
print(a)
print(b)
print('\nAfter adding:')
print(a.add_(b))
print(a)
print(b)
print('\nAfter multiplying')
print(b.mul_(b))
print(b)

Before:
tensor([[1., 1.],
        [1., 1.]])
tensor([[0.3788, 0.4567],
        [0.0649, 0.6677]])

After adding:
tensor([[1.3788, 1.4567],
        [1.0649, 1.6677]])
tensor([[1.3788, 1.4567],
        [1.0649, 1.6677]])
tensor([[0.3788, 0.4567],
        [0.0649, 0.6677]])

After multiplying
tensor([[0.1435, 0.2086],
        [0.0042, 0.4459]])
tensor([[0.1435, 0.2086],
        [0.0042, 0.4459]])

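Note that these in-place arithmetic functions are methods on the torch.Tensor object, not attached to the torch module like many other functions (e.g., torch.sin()). As you can see from a.add_(b), the calling tensor is the one that gets changed in place.

There is another option for placing the result of a computation in an existing, allocated tensor. Many of the methods and functions we've seen so far, including creation methods, have an out argument that lets you specify a tensor to receive the output. If the out tensor has the correct shape and dtype, this can happen without a new memory allocation: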

a = torch.rand(2, 2)
b = torch.rand(2, 2)
c = torch.zeros(2, 2)
old_id = id(c)

print(c)
d = torch.matmul(a, b, out=c)
print(c)                # contents of c have changed

assert c is d           # test c & d are same object, not just containing equal values
assert id(c) == old_id  # make sure that our new c is the same object as the old one

torch.rand(2, 2, out=c) # works for creation too!
print(c)                # c has changed again
assert id(c) == old_id  # still the same object!

tensor([[0., 0.],
        [0., 0.]])
tensor([[0.3653, 0.8699],
        [0.2364, 0.3604]])
tensor([[0.0776, 0.4004],
        [0.9877, 0.0352]])

Copying Tensors

As with any object in Python, assigning a tensor to a variable makes the variable a label of the tensor, and does not copy it. For example:

a = torch.ones(2, 2)
b = a

a[0][1] = 561  # we change a...
print(b)       # ...and b is also altered

tensor([[  1., 561.],
        [  1.,   1.]])

But what if you want a separate copy of the data to work on? The clone() method is there for you:

a = torch.ones(2, 2)
b = a.clone()

assert b is not a      # different objects in memory...
print(torch.eq(a, b))  # ...but still with the same contents!

a[0][1] = 561          # a changes...
print(b)               # ...but b is still all ones

tensor([[True, True],
        [True, True]])
tensor([[1., 1.],
        [1., 1.]])

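There is an important thing to be aware of when using clone(). If your source tensor has autograd enabled, then so will the clone. This will be covered more deeply in the video on autograd, but if you want the light version of the details, continue on.

In many cases, this will be what you want. For example, if your model has multiple computation paths in its forward() method, and both the original tensor and its clone contribute to the model's output, then to enable model learning you want autograd turned on for both tensors. If your source tensor has autograd enabled (which it generally will if it's a set of learning weights or derived from a computation involving the weights), then you'll get the result you want.

On the other hand, if you're doing a computation where neither the original tensor nor its clone need to track gradients, then as long as the source tensor has autograd turned off, you're good to go.

There is a third case, though: imagine you're performing a computation in your model's forward() function, where gradients are turned on for everything by default, but you want to pull out some values mid-stream to generate some metrics. In this case, you don't want the cloned copy of your source tensor to track gradients; performance is improved with autograd's history tracking turned off. For this, you can use the .detach() method on the source tensor: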

a = torch.rand(2, 2, requires_grad=True) # turn on autograd
print(a)

b = a.clone()
print(b)

c = a.detach().clone()
print(c)

print(a)

tensor([[0.0905, 0.4485],
        [0.8740, 0.2526]], requires_grad=True)
tensor([[0.0905, 0.4485],
        [0.8740, 0.2526]], grad_fn=<CloneBackward0>)
tensor([[0.0905, 0.4485],
        [0.8740, 0.2526]])
tensor([[0.0905, 0.4485],
        [0.8740, 0.2526]], requires_grad=True)

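What's happening here?

- We create a with requires_grad=True turned on. We haven't covered this optional argument yet, but will during the unit on autograd.
- When we print a, it informs us that the property requires_grad=True; this means that autograd and computation history tracking are turned on.
- We clone a and label it b. When we print b, we can see that it's tracking its computation history: it has inherited a's autograd settings, and added to the computation history.
- We clone a into c, but we call detach() first.
- Printing c, we see no computation history, and no requires_grad=True.

The detach() method detaches the tensor from its computation history. It says, "do whatever comes next as if autograd was off." It does this without changing a: you can see that when we print a again at the end, it retains its requires_grad=True property.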
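
The same observations can be read off the tensors' attributes instead of their printed form. This is a minimal sketch that inspects requires_grad and grad_fn directly; the variable names mirror the cell above.

```python
import torch

a = torch.rand(2, 2, requires_grad=True)
b = a.clone()           # clone keeps tracking history
c = a.detach().clone()  # detach first, then clone: no history

print(b.requires_grad, b.grad_fn is not None)  # True True
print(c.requires_grad, c.grad_fn)              # False None
print(a.requires_grad)                         # True: a itself is unchanged
```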

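Moving to Accelerator

One of the major advantages of PyTorch is its robust acceleration on accelerators such as CUDA, MPS, MTIA, or XPU. So far, everything we've done has been on CPU. How do we move to the faster hardware?

First, we should check whether an accelerator is available, with the is_available() method.

Note

If you do not have an accelerator, the executable cells in this section will not execute any accelerator-related code.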

if torch.accelerator.is_available():
    print('We have an accelerator!')
else:
    print('Sorry, CPU only.')

We have an accelerator!

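Once we've determined that one or more accelerators is available, we need to put our data someplace where the accelerator can see it. Your CPU does computation on data in your computer's RAM. Your accelerator has dedicated memory attached to it. Whenever you want to perform a computation on a device, you must move all the data needed for that computation to memory accessible by that device. (Colloquially, "moving the data to memory accessible by the GPU" is shortened to "moving the data to the GPU".)

There are multiple ways to get your data onto your target device. You may do it at creation time: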

if torch.accelerator.is_available():
    gpu_rand = torch.rand(2, 2, device=torch.accelerator.current_accelerator())
    print(gpu_rand)
else:
    print('Sorry, CPU only.')

tensor([[0.3344, 0.2640],
        [0.2119, 0.0582]], device='cuda:0')

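By default, new tensors are created on the CPU, so we have to specify when we want to create our tensor on the accelerator with the optional device argument. You can see when we print the new tensor, PyTorch informs us which device it's on (if it's not on CPU).

You can query the number of accelerators with torch.accelerator.device_count(). If you have more than one accelerator, you can specify them by index; taking CUDA as an example: device='cuda:0', device='cuda:1', etc.

As a coding practice, specifying our devices everywhere with string constants is pretty fragile. In an ideal world, your code would perform robustly whether you're on CPU or accelerator hardware. You can do this by creating a device handle that can be passed to your tensors instead of a string: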

my_device = torch.accelerator.current_accelerator() if torch.accelerator.is_available() else torch.device('cpu')
print('Device: {}'.format(my_device))

x = torch.rand(2, 2, device=my_device)
print(x)

Device: cuda
tensor([[0.0024, 0.6778],
        [0.2441, 0.6812]], device='cuda:0')

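If you have an existing tensor living on one device, you can move it to another with the to() method. The following lines of code create a tensor on CPU, and move it to whichever device handle you acquired in the previous cell.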

y = torch.rand(2, 2)
y = y.to(my_device)

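It is important to know that in order to do computation involving two or more tensors, *all of the tensors must be on the same device*. The following code will throw a runtime error, regardless of whether you have an accelerator device available; taking CUDA as an example: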

x = torch.rand(2, 2)
y = torch.rand(2, 2, device='cuda')
z = x + y  # exception will be thrown
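
One way to avoid the mismatch is to move an operand onto the other tensor's device before combining them. The sketch below is device-agnostic and falls back to CPU; it uses torch.cuda.is_available() purely as an illustrative check that also works on PyTorch versions without the torch.accelerator module.

```python
import torch

# Pick CUDA when present, otherwise stay on CPU (illustrative choice).
my_device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

x = torch.rand(2, 2)                    # created on CPU
y = torch.rand(2, 2, device=my_device)  # created on the chosen device

# Move x to wherever y lives, then the addition is valid.
z = x.to(y.device) + y
print(z.device)
```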

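Manipulating Tensor Shapes

Sometimes, you'll need to change the shape of your tensor. Below, we'll look at a few common cases, and how to handle them.

Changing the Number of Dimensions

One case where you might need to change the number of dimensions is passing a single instance of input to your model. PyTorch models generally expect *batches* of input.

For example, imagine having a model that works on 3 x 226 x 226 images: a 226-pixel square with 3 color channels. When you load and transform it, you'll get a tensor of shape (3, 226, 226). Your model, though, is expecting input of shape (N, 3, 226, 226), where N is the number of images in the batch. So how do you make a batch of one?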

a = torch.rand(3, 226, 226)
b = a.unsqueeze(0)

print(a.shape)
print(b.shape)

torch.Size([3, 226, 226])
torch.Size([1, 3, 226, 226])

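The unsqueeze() method adds a dimension of extent 1. unsqueeze(0) adds it as a new zeroth dimension, and now you have a batch of one!

So if that's *un*squeezing, what do we mean by squeezing? We're taking advantage of the fact that any dimension of extent 1 *does not* change the number of elements in the tensor.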

c = torch.rand(1, 1, 1, 1, 1)
print(c)

tensor([[[[[0.2347]]]]])

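Continuing the example above, let's say the model's output is a 20-element vector for each input. You would then expect the output to have shape (N, 20), where N is the number of instances in the input batch. That means that for our single-input batch, we'll get an output of shape (1, 20).

What if you want to do some *non-batched* computation with that output, something that's just expecting a 20-element vector?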

a = torch.rand(1, 20)
print(a.shape)
print(a)

b = a.squeeze(0)
print(b.shape)
print(b)

c = torch.rand(2, 2)
print(c.shape)

d = c.squeeze(0)
print(d.shape)

torch.Size([1, 20])
tensor([[0.1899, 0.4067, 0.1519, 0.1506, 0.9585, 0.7756, 0.8973, 0.4929, 0.2367,
         0.8194, 0.4509, 0.2690, 0.8381, 0.8207, 0.6818, 0.5057, 0.9335, 0.9769,
         0.2792, 0.3277]])
torch.Size([20])
tensor([0.1899, 0.4067, 0.1519, 0.1506, 0.9585, 0.7756, 0.8973, 0.4929, 0.2367,
        0.8194, 0.4509, 0.2690, 0.8381, 0.8207, 0.6818, 0.5057, 0.9335, 0.9769,
        0.2792, 0.3277])
torch.Size([2, 2])
torch.Size([2, 2])

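You can see from the shapes that our 2-dimensional tensor is now 1-dimensional, and if you look closely at the output of the cell above you'll see that printing a shows an "extra" set of square brackets [] due to having an extra dimension.

You may only squeeze() dimensions of extent 1. See above where we try to squeeze a dimension of size 2 in c, and get back the same shape we started with. squeeze() can only remove, and unsqueeze() can only add, dimensions of extent 1, because anything else would change the number of elements in the tensor.

Another place you might use unsqueeze() is to ease broadcasting. Recall the example above where we had the following code: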

a = torch.ones(4, 3, 2)

c = a * torch.rand(   3, 1) # 3rd dim = 1, 2nd dim identical to a
print(c)

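The net effect of that was to broadcast the operation over dimensions 0 and 2, causing the random 3 x 1 tensor to be multiplied element-wise by every 3-element column in a.

What if the random vector had just been a 3-element vector? We'd lose the ability to do the broadcast, because the final dimensions would not match up according to the broadcasting rules. unsqueeze() comes to the rescue: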

a = torch.ones(4, 3, 2)
b = torch.rand(   3)     # trying to multiply a * b will give a runtime error
c = b.unsqueeze(1)       # change to a 2-dimensional tensor, adding new dim at the end
print(c.shape)
print(a * c)             # broadcasting works again!

torch.Size([3, 1])
tensor([[[0.1891, 0.1891],
         [0.3952, 0.3952],
         [0.9176, 0.9176]],

        [[0.1891, 0.1891],
         [0.3952, 0.3952],
         [0.9176, 0.9176]],

        [[0.1891, 0.1891],
         [0.3952, 0.3952],
         [0.9176, 0.9176]],

        [[0.1891, 0.1891],
         [0.3952, 0.3952],
         [0.9176, 0.9176]]])

The squeeze() and unsqueeze() methods also have in-place versions, squeeze_() and unsqueeze_():

batch_me = torch.rand(3, 226, 226)
print(batch_me.shape)
batch_me.unsqueeze_(0)
print(batch_me.shape)

torch.Size([3, 226, 226])
torch.Size([1, 3, 226, 226])

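Sometimes you'll want to change the shape of a tensor more radically, while still preserving the number of elements and their contents. One case where this happens is at the interface between a convolutional layer of a model and a linear layer of the model; this is common in image classification models. A convolution kernel will yield an output tensor of shape *features x width x height*, but the following linear layer expects a 1-dimensional input. reshape() will do this for you, provided that the dimensions you request yield the same number of elements as the input tensor has: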

output3d = torch.rand(6, 20, 20)
print(output3d.shape)

input1d = output3d.reshape(6 * 20 * 20)
print(input1d.shape)

# can also call it as a method on the torch module:
print(torch.reshape(output3d, (6 * 20 * 20,)).shape)

torch.Size([6, 20, 20])
torch.Size([2400])
torch.Size([2400])

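Note

The (6 * 20 * 20,) argument in the final line of the cell above is because PyTorch expects a tuple when specifying a tensor shape, but when the shape is the first argument of a method, it lets us cheat and just use a series of integers. Here, we had to add the parentheses and comma to convince the method that this is really a one-element tuple.

When it can, reshape() will return a view on the tensor to be changed, that is, a separate tensor object looking at the same underlying region of memory. This is important: it means any change made to the source tensor will be reflected in the view on that tensor, unless you clone() it.

There are conditions, beyond the scope of this introduction, where reshape() has to return a tensor carrying a copy of the data. For more information, see the documentation.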
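
As a hedged illustration of the view-versus-copy distinction, the sketch below reshapes a contiguous tensor (which can be returned as a view) and a transposed, non-contiguous one (which, in this case, cannot). Comparing data_ptr() values is just one way to check whether two tensors start at the same memory address; the names are illustrative.

```python
import torch

base = torch.arange(6).reshape(2, 3)

flat_view = base.reshape(6)      # contiguous source: a view on the same memory
flat_copy = base.t().reshape(6)  # transposed source: reshape has to copy here

shares_memory = flat_view.data_ptr() == base.data_ptr()
copy_is_separate = flat_copy.data_ptr() != base.data_ptr()
print(shares_memory, copy_is_separate)  # True True

# A change to the source shows up in the view, but not in the copy.
base[0, 0] = 100
print(flat_view[0].item())  # 100
print(flat_copy[0].item())  # 0
```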

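NumPy Bridge

In the section above on broadcasting, it was mentioned that PyTorch's broadcast semantics are compatible with NumPy's, but the kinship between PyTorch and NumPy goes even deeper than that.

If you have existing ML or scientific code with data stored in NumPy ndarrays, you may wish to express that same data as PyTorch tensors, whether to take advantage of PyTorch's GPU acceleration, or its efficient abstractions for building ML models. It's easy to switch between ndarrays and PyTorch tensors: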

import numpy as np

numpy_array = np.ones((2, 3))
print(numpy_array)

pytorch_tensor = torch.from_numpy(numpy_array)
print(pytorch_tensor)

[[1. 1. 1.]
 [1. 1. 1.]]
tensor([[1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)

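PyTorch creates a tensor of the same shape and containing the same data as the NumPy array, going so far as to keep NumPy's default 64-bit float data type.

The conversion can just as easily go the other way: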

pytorch_rand = torch.rand(2, 3)
print(pytorch_rand)

numpy_rand = pytorch_rand.numpy()
print(numpy_rand)

tensor([[0.8716, 0.2459, 0.3499],
        [0.2853, 0.9091, 0.5695]])
[[0.87163675 0.2458961  0.34993553]
 [0.2853077  0.90905803 0.5695162 ]]

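It is important to know that these converted objects are using the same underlying memory as their source objects, meaning that changes to one are reflected in the other: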

numpy_array[1, 1] = 23
print(pytorch_tensor)

pytorch_rand[1, 1] = 17
print(numpy_rand)

tensor([[ 1.,  1.,  1.],
        [ 1., 23.,  1.]], dtype=torch.float64)
[[ 0.87163675  0.2458961   0.34993553]
 [ 0.2853077  17.          0.5695162 ]]
