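Probability distributions - torch.distributions¶

The distributions package contains parameterizable probability distributions and sampling functions. This allows the construction of stochastic computation graphs and stochastic gradient estimators for optimization. This package generally follows the design of the TensorFlow Distributions package.

It is not possible to backpropagate directly through random samples. However, there are two main methods for creating surrogate functions that can be backpropagated through. These are the score function estimator/likelihood ratio estimator/REINFORCE and the pathwise derivative estimator. REINFORCE is commonly seen as the basis for policy gradient methods in reinforcement learning, and the pathwise derivative estimator is commonly seen in the reparameterization trick in variational autoencoders. The score function only requires the value of samples $f(x)$, whereas the pathwise derivative requires the derivative $f'(x)$. The next sections discuss these two methods in a reinforcement learning example. For more details see Gradient Estimation Using Stochastic Computation Graphs.

Score function¶

When the probability density function is differentiable with respect to its parameters, we only need sample() and log_prob() to implement REINFORCE:

$$\Delta\theta = \alpha r \frac{\partial\log p(a|\pi^\theta(s))}{\partial\theta}$$

where $\theta$ are the parameters, $\alpha$ is the learning rate, $r$ is the reward and $p(a|\pi^\theta(s))$ is the probability of taking action $a$ in state $s$ given policy $\pi^\theta$.

In practice we would sample an action from the output of a network, apply this action in an environment, and then use log_prob to construct an equivalent loss function. Note that we use a negative sign because optimizers use gradient descent, whilst the rule above assumes gradient ascent. With a categorical policy, the code for implementing REINFORCE would be as follows: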

probs = policy_network(state)
# Note that this is equivalent to what used to be called multinomial
m = Categorical(probs)
action = m.sample()
next_state, reward = env.step(action)
loss = -m.log_prob(action) * reward
loss.backward()
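
The snippet above depends on a policy network and an environment that are not defined here. A minimal runnable sketch of the same steps follows; the tiny linear policy and the `toy_env_step` reward function are hypothetical stand-ins chosen only for illustration.

```python
import torch
from torch.distributions import Categorical

torch.manual_seed(0)

# Hypothetical policy: maps a 4-dim state to probabilities over 3 actions.
policy_network = torch.nn.Sequential(
    torch.nn.Linear(4, 3), torch.nn.Softmax(dim=-1)
)

def toy_env_step(action):
    # Assumed toy reward (not differentiable): 1 for action 2, else 0.
    return 1.0 if action.item() == 2 else 0.0

state = torch.randn(4)
probs = policy_network(state)
m = Categorical(probs)
action = m.sample()                    # no gradient flows through the sample
reward = toy_env_step(action)
loss = -m.log_prob(action) * reward    # surrogate loss for gradient descent
loss.backward()

grad = policy_network[0].weight.grad   # gradient reached the policy weights
```

The reward is treated as a constant here; only log_prob carries the gradient back to the policy parameters.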

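Pathwise derivative¶

The other way to implement these stochastic/policy gradients is to use the reparameterization trick from the rsample() method, where the parameterized random variable is constructed via a parameterized deterministic function of a parameter-free random variable. The reparameterized sample therefore becomes differentiable. The code for implementing the pathwise derivative would be as follows: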

params = policy_network(state)
m = Normal(*params)
# Any distribution with .has_rsample == True could work based on the application
action = m.rsample()
next_state, reward = env.step(action)  # Assuming that reward is differentiable
loss = -reward
loss.backward()
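
As a runnable counterpart to the snippet above, the sketch below replaces the environment with an assumed differentiable reward, a quadratic that peaks at 3. The parameter names `loc` and `log_scale` are illustrative, not part of the library API.

```python
import torch
from torch.distributions import Normal

torch.manual_seed(0)

# Learnable parameters of a Gaussian policy (illustrative values).
loc = torch.tensor(0.0, requires_grad=True)
log_scale = torch.tensor(0.0, requires_grad=True)

m = Normal(loc, log_scale.exp())
action = m.rsample()            # action = loc + scale * eps, eps ~ N(0, 1)

# Assumed differentiable reward: largest when the action equals 3.
reward = -(action - 3.0) ** 2
loss = -reward
loss.backward()                 # gradients flow through the sample itself
```

Because the action is a deterministic function of loc, the scale and the noise, the gradient of the loss with respect to loc is simply 2 * (action - 3).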

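Distribution¶

class torch.distributions.distribution.Distribution(batch_shape=torch.Size([]), event_shape=torch.Size([]), validate_args=None)[source]¶

Bases: object

Distribution is the abstract base class for probability distributions.

property arg_constraints: Dict[str, Constraint]¶: Returns a dictionary from argument names to Constraint objects that should be satisfied by each argument of this distribution. Arguments that are not tensors need not appear in this dictionary.

property batch_shape: Size¶: Returns the shape over which parameters are batched.

cdf(value)[source]¶

Returns the cumulative density/mass function evaluated at value.

Parameters: value (Tensor) –
Return type: Tensor

entropy()[source]¶

Returns the entropy of the distribution, batched over batch_shape.

Returns: Tensor of shape batch_shape.
Return type: Tensor

enumerate_support(expand=True)[source]¶

Returns a tensor containing all values supported by a discrete distribution. The result enumerates over dimension 0, so the shape of the result is (cardinality,) + batch_shape + event_shape (where event_shape = () for univariate distributions).

Note that this enumerates over all batched tensors in lock-step [[0, 0], [1, 1], …]. With expand=False, enumeration happens along dimension 0, but the remaining batch dimensions are singleton dimensions, [[0], [1], …].

To iterate over the full Cartesian product use itertools.product(m.enumerate_support()).

Parameters: expand (bool) – whether to expand the support over the batch dimensions to match the distribution's batch_shape.
Returns: Tensor iterating over dimension 0.
Return type: Tensor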
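
To make the shape rule for enumerate_support concrete, here is a small sketch using a Bernoulli distribution with a batch of three parameters (the probabilities are arbitrary illustrative values).

```python
import torch
from torch.distributions import Bernoulli

m = Bernoulli(probs=torch.tensor([0.2, 0.5, 0.9]))   # batch_shape = (3,)

# expand=True: (cardinality,) + batch_shape = (2, 3), enumerated in lock-step.
full = m.enumerate_support()

# expand=False: batch dimensions are kept as singleton dimensions, (2, 1).
compact = m.enumerate_support(expand=False)
```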

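property event_shape: Size¶: Returns the shape of a single sample (without batching).

expand(batch_shape, _instance=None)[source]¶

Returns a new distribution instance (or populates an existing instance provided by a derived class) with batch dimensions expanded to batch_shape. This method calls expand on the distribution's parameters, so it does not allocate new memory for the expanded distribution instance. Additionally, it does not repeat any of the argument checking or parameter broadcasting done in __init__ when an instance is first created.

Parameters

batch_shape (torch.Size) – the desired expanded size.
_instance – new instance provided by subclasses that need to override .expand.

Returns

New distribution instance with batch dimensions expanded to batch_shape.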
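
A short sketch of expand on a Normal distribution, assuming a batch of size 1 that is widened to size 4; the comparison of data pointers illustrates the statement that no new parameter memory is allocated.

```python
import torch
from torch.distributions import Normal

base = Normal(torch.tensor([0.0]), torch.tensor([1.0]))   # batch_shape = (1,)
wide = base.expand(torch.Size([4]))                        # batch_shape = (4,)

samples = wide.sample()                                    # one draw per batch entry

# The expanded parameters are views onto the original storage.
shares_memory = wide.loc.data_ptr() == base.loc.data_ptr()
```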

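icdf(value)[source]¶

Returns the inverse cumulative density/mass function evaluated at value.

Parameters: value (Tensor) –
Return type: Tensor

log_prob(value)[source]¶

Returns the log of the probability density/mass function evaluated at value.

Parameters: value (Tensor) –
Return type: Tensor

property mean: Tensor¶: Returns the mean of the distribution.

property mode: Tensor¶: Returns the mode of the distribution.

perplexity()[source]¶

Returns the perplexity of the distribution, batched over batch_shape.

Returns: Tensor of shape batch_shape.
Return type: Tensor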
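
Perplexity is the exponential of the entropy. As a quick illustration, a uniform categorical distribution over four classes has entropy log(4) and therefore perplexity 4.

```python
import torch
from torch.distributions import Categorical

m = Categorical(probs=torch.full((4,), 0.25))
entropy = m.entropy()          # log(4) nats
perplexity = m.perplexity()    # exp(entropy)
```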

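rsample(sample_shape=torch.Size([]))[source]¶

Generates a `sample_shape` shaped reparameterized sample, or a `sample_shape` shaped batch of reparameterized samples if the distribution parameters are batched.

Return type: Tensor

sample(sample_shape=torch.Size([]))[source]¶

Generates a `sample_shape` shaped sample, or a `sample_shape` shaped batch of samples if the distribution parameters are batched.

Return type: Tensor

sample_n(n)[source]¶

Generates `n` samples, or `n` batches of samples if the distribution parameters are batched.

Return type: Tensor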
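
The shape of a draw is sample_shape + batch_shape + event_shape. The sketch below illustrates this with a batched MultivariateNormal; the sizes are arbitrary.

```python
import torch
from torch.distributions import MultivariateNormal

# loc has shape (3, 2): batch_shape = (3,), event_shape = (2,)
m = MultivariateNormal(torch.zeros(3, 2), torch.eye(2))

single = m.sample()                   # shape (3, 2)
many = m.sample(torch.Size([5]))      # shape (5, 3, 2)
reparam = m.rsample((5,))             # same shape, but differentiable
```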

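static set_default_validate_args(value)[source]¶

Sets whether validation is enabled or disabled.

The default behavior mimics Python's assert statement: validation is on by default, but is disabled if Python is run in optimized mode (via python -O). Validation may be expensive, so you may want to disable it once a model is working.

Parameters: value (bool) – Whether to enable validation.

property stddev: Tensor¶: Returns the standard deviation of the distribution.

property support: Optional[Any]¶: Returns a Constraint object representing this distribution's support.

property variance: Tensor¶: Returns the variance of the distribution.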

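ExponentialFamily¶

class torch.distributions.exp_family.ExponentialFamily(batch_shape=torch.Size([]), event_shape=torch.Size([]), validate_args=None)[source]¶

Bases: Distribution

ExponentialFamily is the abstract base class for probability distributions belonging to an exponential family, whose probability mass/density function has the form

$$p_{F}(x; \theta) = \exp(\langle t(x), \theta\rangle - F(\theta) + k(x))$$

where $\theta$ denotes the natural parameters, $t(x)$ denotes the sufficient statistic, $F(\theta)$ is the log normalizer function for a given family and $k(x)$ is the carrier measure.

Note

This class is an intermediary between the `Distribution` class and distributions that belong to an exponential family, mainly to check the correctness of the `.entropy()` and analytic KL divergence methods. We use this class to compute the entropy and KL divergence using the AD framework and Bregman divergences (courtesy of: Frank Nielsen and Richard Nock, Entropies and Cross-entropies of Exponential Families).

entropy()[source]¶: Method to compute the entropy using the Bregman divergence of the log normalizer.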

Bernoulli¶

class torch.distributions.bernoulli.Bernoulli(probs=None, logits=None, validate_args=None)[source]¶

Bases: ExponentialFamily

Creates a Bernoulli distribution parameterized by probs or logits (but not both).

Samples are binary (0 or 1). They take the value `1` with probability `p` and `0` with probability `1 - p`.

Example

>>> m = Bernoulli(torch.tensor([0.3]))
>>> m.sample()  # 30% chance 1; 70% chance 0
tensor([ 0.])

Parameters

probs (Number, Tensor) – the probability of sampling `1`
logits (Number, Tensor) – the log-odds of sampling `1`
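
The two parameterizations describe the same distribution when logits = log(p / (1 - p)). A brief sketch, using p = 0.3 as an arbitrary value:

```python
import torch
from torch.distributions import Bernoulli

p = torch.tensor([0.3])
from_probs = Bernoulli(probs=p)
from_logits = Bernoulli(logits=torch.log(p / (1 - p)))   # log-odds of p

# Both objects assign the same log-probability to the outcome 1.
lp_probs = from_probs.log_prob(torch.tensor([1.0]))
lp_logits = from_logits.log_prob(torch.tensor([1.0]))
```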

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}¶

entropy()[source]¶

enumerate_support(expand=True)[source]¶

expand(batch_shape, _instance=None)[source]¶

has_enumerate_support = True¶

log_prob(value)[source]¶

property logits¶

property mean¶

property mode¶

property param_shape¶

property probs¶

sample(sample_shape=torch.Size([]))[source]¶

support = Boolean()¶

property variance¶

Beta¶

class torch.distributions.beta.Beta(concentration1, concentration0, validate_args=None)[source]¶

Bases: ExponentialFamily

Beta distribution parameterized by concentration1 and concentration0.

Example

>>> m = Beta(torch.tensor([0.5]), torch.tensor([0.5]))
>>> m.sample()  # Beta distributed with concentration concentration1 and concentration0
tensor([ 0.1046])

Parameters

concentration1 (float or Tensor) – 1st concentration parameter of the distribution (often referred to as alpha)
concentration0 (float or Tensor) – 2nd concentration parameter of the distribution (often referred to as beta)

arg_constraints = {'concentration0': GreaterThan(lower_bound=0.0), 'concentration1': GreaterThan(lower_bound=0.0)}¶

property concentration0¶

property concentration1¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=())[source]¶

support = Interval(lower_bound=0.0, upper_bound=1.0)¶

property variance¶

Binomial¶

class torch.distributions.binomial.Binomial(total_count=1, probs=None, logits=None, validate_args=None)[source]¶

Bases: Distribution

Creates a Binomial distribution parameterized by total_count and either probs or logits (but not both). total_count must be broadcastable with probs/logits.

Example

>>> m = Binomial(100, torch.tensor([0 , .2, .8, 1]))
>>> x = m.sample()
tensor([   0.,   22.,   71.,  100.])

>>> m = Binomial(torch.tensor([[5.], [10.]]), torch.tensor([0.5, 0.8]))
>>> x = m.sample()
tensor([[ 4.,  5.],
        [ 7.,  6.]])

Parameters

total_count (int or Tensor) – number of Bernoulli trials
probs (Tensor) – event probabilities
logits (Tensor) – event log-odds

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0), 'total_count': IntegerGreaterThan(lower_bound=0)}¶

entropy()[source]¶

enumerate_support(expand=True)[source]¶

expand(batch_shape, _instance=None)[source]¶

has_enumerate_support = True¶

log_prob(value)[source]¶

property logits¶

property mean¶

property mode¶

property param_shape¶

property probs¶

sample(sample_shape=torch.Size([]))[source]¶

property support¶

property variance¶

Categorical¶

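class torch.distributions.categorical.Categorical(probs=None, logits=None, validate_args=None)[source]¶

Bases: Distribution

Creates a categorical distribution parameterized by either probs or logits (but not both).

Note

It is equivalent to the distribution that torch.multinomial() samples from.

Samples are integers from $\{0, \ldots, K-1\}$ where K is probs.size(-1).

If probs is 1-dimensional with length K, each element is the relative probability of sampling the class at that index.

If probs is N-dimensional, the first N-1 dimensions are treated as a batch of relative probability vectors.

Note

The probs argument must be non-negative, finite and have a non-zero sum, and it will be normalized to sum to 1 along the last dimension. probs will return this normalized value. The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

See also: torch.multinomial()

Example

>>> m = Categorical(torch.tensor([ 0.25, 0.25, 0.25, 0.25 ]))
>>> m.sample()  # equal probability of 0, 1, 2, 3
tensor(3)

Parameters

probs (Tensor) – event probabilities
logits (Tensor) – event log probabilities (unnormalized)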
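
The normalization described in the note can be observed directly. In this sketch the input weights are arbitrary and deliberately do not sum to 1.

```python
import math
import torch
from torch.distributions import Categorical

# Relative weights 1:1:2 are normalized to probabilities 0.25, 0.25, 0.5.
m = Categorical(probs=torch.tensor([1.0, 1.0, 2.0]))
normalized_probs = m.probs

# Equal unnormalized logits are shifted so that exp(logits) sums to 1.
n = Categorical(logits=torch.zeros(3))
normalized_logits = n.logits           # each entry equals -log(3)
```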

arg_constraints = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}¶

entropy()[source]¶

enumerate_support(expand=True)[source]¶

expand(batch_shape, _instance=None)[source]¶

has_enumerate_support = True¶

log_prob(value)[source]¶

property logits¶

property mean¶

property mode¶

property param_shape¶

property probs¶

sample(sample_shape=torch.Size([]))[source]¶

property support¶

property variance¶

Cauchy¶

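class torch.distributions.cauchy.Cauchy(loc, scale, validate_args=None)[source]¶

Bases: Distribution

Samples from a Cauchy (Lorentz) distribution. The distribution of the ratio of independent normally distributed random variables with means 0 follows a Cauchy distribution.

Example

>>> m = Cauchy(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Cauchy distribution with loc=0 and scale=1
tensor([ 2.3214])

Parameters

loc (float or Tensor) – mode or median of the distribution.
scale (float or Tensor) – half width at half maximum.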

arg_constraints = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}¶

cdf(value)[source]¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

icdf(value)[source]¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=torch.Size([]))[source]¶

support = Real()¶

property variance¶

Chi2¶

class torch.distributions.chi2.Chi2(df, validate_args=None)[source]¶

Bases: Gamma

Creates a Chi-squared distribution parameterized by the shape parameter df. This is exactly equivalent to Gamma(alpha=0.5*df, beta=0.5).

Example

>>> m = Chi2(torch.tensor([1.0]))
>>> m.sample()  # Chi2 distributed with shape df=1
tensor([ 0.1046])

Parameters: df (float or Tensor) – shape parameter of the distribution
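
The equivalence with the Gamma distribution can be checked numerically. In the sketch below, alpha and beta correspond to the concentration and rate arguments of Gamma; df = 3 and the evaluation points are arbitrary.

```python
import torch
from torch.distributions import Chi2, Gamma

df = torch.tensor([3.0])
chi2 = Chi2(df)
gamma = Gamma(0.5 * df, torch.tensor([0.5]))   # concentration=df/2, rate=1/2

x = torch.tensor([0.5, 1.0, 4.0])
lp_chi2 = chi2.log_prob(x)
lp_gamma = gamma.log_prob(x)
```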

arg_constraints = {'df': GreaterThan(lower_bound=0.0)}¶

property df¶

expand(batch_shape, _instance=None)[source]¶

ContinuousBernoulli¶

class torch.distributions.continuous_bernoulli.ContinuousBernoulli(probs=None, logits=None, lims=(0.499, 0.501), validate_args=None)[source]¶

Bases: ExponentialFamily

Creates a continuous Bernoulli distribution parameterized by probs or logits (but not both).

The distribution is supported in [0, 1] and parameterized by 'probs' (in (0,1)) or 'logits' (real-valued). Note that, unlike the Bernoulli, 'probs' does not correspond to a probability and 'logits' does not correspond to log-odds, but the same names are used due to the similarity with the Bernoulli. See [1] for more details.

Example

>>> m = ContinuousBernoulli(torch.tensor([0.3]))
>>> m.sample()
tensor([ 0.2538])

Parameters

probs (Number, Tensor) – (0,1) valued parameters
logits (Number, Tensor) – real valued parameters whose sigmoid matches 'probs'

[1] The continuous Bernoulli: fixing a pervasive error in variational autoencoders, Loaiza-Ganem G and Cunningham JP, NeurIPS 2019. https://arxiv.org/abs/1907.06845

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}¶

cdf(value)[source]¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

icdf(value)[source]¶

log_prob(value)[source]¶

property logits¶

property mean¶

property param_shape¶

property probs¶

rsample(sample_shape=torch.Size([]))[source]¶

sample(sample_shape=torch.Size([]))[source]¶

property stddev¶

support = Interval(lower_bound=0.0, upper_bound=1.0)¶

property variance¶

Dirichlet¶

class torch.distributions.dirichlet.Dirichlet(concentration, validate_args=None)[source]¶

Bases: ExponentialFamily

Creates a Dirichlet distribution parameterized by the concentration parameter concentration.

Example

>>> m = Dirichlet(torch.tensor([0.5, 0.5]))
>>> m.sample()  # Dirichlet distributed with concentration [0.5, 0.5]
tensor([ 0.1046,  0.8954])

Parameters: concentration (Tensor) – concentration parameter of the distribution (often referred to as alpha)

arg_constraints = {'concentration': IndependentConstraint(GreaterThan(lower_bound=0.0), 1)}¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=())[source]¶

support = Simplex()¶

property variance¶

Exponential¶

class torch.distributions.exponential.Exponential(rate, validate_args=None)[source]¶

Bases: ExponentialFamily

Creates an Exponential distribution parameterized by rate.

Example

>>> m = Exponential(torch.tensor([1.0]))
>>> m.sample()  # Exponential distributed with rate=1
tensor([ 0.1046])

Parameters: rate (float or Tensor) – rate = 1 / scale of the distribution

arg_constraints = {'rate': GreaterThan(lower_bound=0.0)}¶

cdf(value)[source]¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

icdf(value)[source]¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=torch.Size([]))[source]¶

property stddev¶

support = GreaterThanEq(lower_bound=0.0)¶

property variance¶

FisherSnedecor¶

class torch.distributions.fishersnedecor.FisherSnedecor(df1, df2, validate_args=None)[source]¶

Bases: Distribution

Creates a Fisher-Snedecor distribution parameterized by df1 and df2.

Example

>>> m = FisherSnedecor(torch.tensor([1.0]), torch.tensor([2.0]))
>>> m.sample()  # Fisher-Snedecor-distributed with df1=1 and df2=2
tensor([ 0.2453])

Parameters

df1 (float or Tensor) – degrees of freedom parameter 1
df2 (float or Tensor) – degrees of freedom parameter 2

arg_constraints = {'df1': GreaterThan(lower_bound=0.0), 'df2': GreaterThan(lower_bound=0.0)}¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=torch.Size([]))[source]¶

support = GreaterThan(lower_bound=0.0)¶

property variance¶

Gamma¶

class torch.distributions.gamma.Gamma(concentration, rate, validate_args=None)[source]¶

Bases: ExponentialFamily

Creates a Gamma distribution parameterized by the shape parameter concentration and by rate.

Example

>>> m = Gamma(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # Gamma distributed with concentration=1 and rate=1
tensor([ 0.1046])

Parameters

concentration (float or Tensor) – shape parameter of the distribution (often referred to as alpha)
rate (float or Tensor) – rate = 1 / scale of the distribution (often referred to as beta)

arg_constraints = {'concentration': GreaterThan(lower_bound=0.0), 'rate': GreaterThan(lower_bound=0.0)}¶

cdf(value)[source]¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=torch.Size([]))[source]¶

support = GreaterThanEq(lower_bound=0.0)¶

property variance¶

Geometric¶

class torch.distributions.geometric.Geometric(probs=None, logits=None, validate_args=None)[source]¶

Bases: Distribution

Creates a Geometric distribution parameterized by probs, where probs is the probability of success of Bernoulli trials.

$$P(X=k) = (1-p)^{k} p, \quad k = 0, 1, \ldots$$

Note

For torch.distributions.geometric.Geometric(), the $(k+1)$-th trial is the first success, hence it draws samples in $\{0, 1, \ldots\}$, whereas for torch.Tensor.geometric_() the k-th trial is the first success, hence it draws samples in $\{1, 2, \ldots\}$.

Example

>>> m = Geometric(torch.tensor([0.3]))
>>> m.sample()  # underlying Bernoulli has 30% chance 1; 70% chance 0
tensor([ 2.])

Parameters

probs (Number, Tensor) – the probability of sampling 1. Must be in the range (0, 1]
logits (Number, Tensor) – the log-odds of sampling 1.
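
The following sketch compares log_prob against the probability mass function stated above for k = 0, 1, 2, using p = 0.3 as in the example. It also shows that k = 0 lies in the support.

```python
import math
import torch
from torch.distributions import Geometric

p = 0.3
m = Geometric(probs=torch.tensor([p]))

k = torch.tensor([0.0, 1.0, 2.0])
log_pmf = m.log_prob(k)

# Reference values computed directly from P(X=k) = (1-p)^k * p.
expected = torch.tensor([math.log((1 - p) ** i * p) for i in range(3)])
```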

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

log_prob(value)[source]¶

property logits¶

property mean¶

property mode¶

property probs¶

sample(sample_shape=torch.Size([]))[source]¶

support = IntegerGreaterThan(lower_bound=0)¶

property variance¶

Gumbel¶

class torch.distributions.gumbel.Gumbel(loc, scale, validate_args=None)[source]¶

Bases: TransformedDistribution

Samples from a Gumbel distribution.

Example

>>> m = Gumbel(torch.tensor([1.0]), torch.tensor([2.0]))
>>> m.sample()  # sample from Gumbel distribution with loc=1, scale=2
tensor([ 1.0124])

Parameters

loc (float or Tensor) – location parameter of the distribution
scale (float or Tensor) – scale parameter of the distribution

arg_constraints: Dict[str, Constraint] = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

log_prob(value)[source]¶

property mean¶

property mode¶

property stddev¶

support = Real()¶

property variance¶

HalfCauchy¶

class torch.distributions.half_cauchy.HalfCauchy(scale, validate_args=None)[source]¶

Bases: TransformedDistribution

Creates a half-Cauchy distribution parameterized by scale, where:

X ~ Cauchy(0, scale)
Y = |X| ~ HalfCauchy(scale)

Example

>>> m = HalfCauchy(torch.tensor([1.0]))
>>> m.sample()  # half-cauchy distributed with scale=1
tensor([ 2.3214])

Parameters: scale (float or Tensor) – scale of the full Cauchy distribution

arg_constraints: Dict[str, Constraint] = {'scale': GreaterThan(lower_bound=0.0)}¶

cdf(value)[source]¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

icdf(prob)[source]¶

log_prob(value)[source]¶

property mean¶

property mode¶

property scale¶

support = GreaterThanEq(lower_bound=0.0)¶

property variance¶

HalfNormal¶

class torch.distributions.half_normal.HalfNormal(scale, validate_args=None)[source]¶

Bases: TransformedDistribution

Creates a half-normal distribution parameterized by scale, where:

X ~ Normal(0, scale)
Y = |X| ~ HalfNormal(scale)

Example

>>> m = HalfNormal(torch.tensor([1.0]))
>>> m.sample()  # half-normal distributed with scale=1
tensor([ 0.1046])

Parameters: scale (float or Tensor) – scale of the full Normal distribution

arg_constraints: Dict[str, Constraint] = {'scale': GreaterThan(lower_bound=0.0)}¶

cdf(value)[source]¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

icdf(prob)[source]¶

log_prob(value)[source]¶

property mean¶

property mode¶

property scale¶

support = GreaterThanEq(lower_bound=0.0)¶

property variance¶

Independent¶

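class torch.distributions.independent.Independent(base_distribution, reinterpreted_batch_ndims, validate_args=None)[source]¶

Bases: Distribution

Reinterprets some of the batch dimensions of a distribution as event dimensions.

This is mainly useful for changing the shape of the result of log_prob(). For example, to create a diagonal Normal distribution with the same shape as a Multivariate Normal distribution (so they are interchangeable), you can: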

>>> from torch.distributions.multivariate_normal import MultivariateNormal
>>> from torch.distributions.normal import Normal
>>> loc = torch.zeros(3)
>>> scale = torch.ones(3)
>>> mvn = MultivariateNormal(loc, scale_tril=torch.diag(scale))
>>> [mvn.batch_shape, mvn.event_shape]
[torch.Size([]), torch.Size([3])]
>>> normal = Normal(loc, scale)
>>> [normal.batch_shape, normal.event_shape]
[torch.Size([3]), torch.Size([])]
>>> diagn = Independent(normal, 1)
>>> [diagn.batch_shape, diagn.event_shape]
[torch.Size([]), torch.Size([3])]

Parameters

base_distribution (torch.distributions.distribution.Distribution) – a base distribution
reinterpreted_batch_ndims (int) – the number of batch dimensions to reinterpret as event dimensions
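
The effect on log_prob() can be seen by comparing the base distribution with its Independent wrapper. In this sketch the evaluation point x is arbitrary.

```python
import torch
from torch.distributions import Independent, Normal

normal = Normal(torch.zeros(3), torch.ones(3))   # batch_shape (3,), event_shape ()
diagn = Independent(normal, 1)                   # batch_shape (), event_shape (3,)

x = torch.tensor([0.5, -1.0, 2.0])
per_dim = normal.log_prob(x)     # one log-density per batch entry, shape (3,)
joint = diagn.log_prob(x)        # summed over the reinterpreted dim, shape ()
```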

arg_constraints: Dict[str, Constraint] = {}¶

entropy()[source]¶

enumerate_support(expand=True)[source]¶

expand(batch_shape, _instance=None)[source]¶

property has_enumerate_support¶

property has_rsample¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=torch.Size([]))[source]¶

sample(sample_shape=torch.Size([]))[source]¶

property support¶

property variance¶

InverseGamma¶

class torch.distributions.inverse_gamma.InverseGamma(concentration, rate, validate_args=None)[source]¶

Bases: TransformedDistribution

Creates an inverse gamma distribution parameterized by concentration and rate, where:

X ~ Gamma(concentration, rate)
Y = 1 / X ~ InverseGamma(concentration, rate)

Example

>>> m = InverseGamma(torch.tensor([2.0]), torch.tensor([3.0]))
>>> m.sample()
tensor([ 1.2953])

Parameters

concentration (float or Tensor) – shape parameter of the distribution (often referred to as alpha)
rate (float or Tensor) – rate = 1 / scale of the distribution (often referred to as beta)

arg_constraints: Dict[str, Constraint] = {'concentration': GreaterThan(lower_bound=0.0), 'rate': GreaterThan(lower_bound=0.0)}¶

property concentration¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

property mean¶

property mode¶

property rate¶

support = GreaterThan(lower_bound=0.0)¶

property variance¶

Kumaraswamy¶

class torch.distributions.kumaraswamy.Kumaraswamy(concentration1, concentration0, validate_args=None)[source]¶

Bases: TransformedDistribution

Samples from a Kumaraswamy distribution.

Example

>>> m = Kumaraswamy(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Kumaraswamy distribution with concentration alpha=1 and beta=1
tensor([ 0.1729])

Parameters

concentration1 (float or Tensor) – 1st concentration parameter of the distribution (often referred to as alpha)
concentration0 (float or Tensor) – 2nd concentration parameter of the distribution (often referred to as beta)

arg_constraints: Dict[str, Constraint] = {'concentration0': GreaterThan(lower_bound=0.0), 'concentration1': GreaterThan(lower_bound=0.0)}¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

property mean¶

property mode¶

support = Interval(lower_bound=0.0, upper_bound=1.0)¶

property variance¶

LKJCholesky¶

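class torch.distributions.lkj_cholesky.LKJCholesky(dim, concentration=1.0, validate_args=None)[source]¶

Bases: Distribution

LKJ distribution for the lower Cholesky factor of correlation matrices. The distribution is controlled by the concentration parameter $\eta$ so that the probability of the correlation matrix $M$ generated from a Cholesky factor is proportional to $\det(M)^{\eta - 1}$. Because of that, when concentration == 1, we have a uniform distribution over Cholesky factors of correlation matrices:

L ~ LKJCholesky(dim, concentration)
X = L @ L' ~ LKJCorr(dim, concentration)

Note that this distribution samples the Cholesky factor of correlation matrices and not the correlation matrices themselves, and thereby differs slightly from the derivations in [1] for the LKJCorr distribution. For sampling, this uses the Onion method from [1] Section 3.

Example

>>> l = LKJCholesky(3, 0.5)
>>> l.sample()  # l @ l.T is a sample of a correlation 3x3 matrix
tensor([[ 1.0000,  0.0000,  0.0000],
        [ 0.3516,  0.9361,  0.0000],
        [-0.1899,  0.4748,  0.8593]])

Parameters

dim (int) – dimension of the matrices
concentration (float or Tensor) – concentration/shape parameter of the distribution (often referred to as eta)

References

[1] Generating random correlation matrices based on vines and extended onion method (2009), Daniel Lewandowski, Dorota Kurowicka, Harry Joe. Journal of Multivariate Analysis. 100. 10.1016/j.jmva.2009.04.008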
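
As a sanity check on the relation X = L @ L', the sketch below draws one Cholesky factor and rebuilds the correlation matrix, which should be symmetric with a unit diagonal. The seed and concentration are arbitrary.

```python
import torch
from torch.distributions import LKJCholesky

torch.manual_seed(0)

dist = LKJCholesky(3, concentration=0.5)
L = dist.sample()        # lower-triangular Cholesky factor, shape (3, 3)
corr = L @ L.T           # implied correlation matrix
```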

arg_constraints = {'concentration': GreaterThan(lower_bound=0.0)}¶

expand(batch_shape, _instance=None)[source]¶

log_prob(value)[source]¶

sample(sample_shape=torch.Size([]))[source]¶

support = CorrCholesky()¶

Laplace¶

class torch.distributions.laplace.Laplace(loc, scale, validate_args=None)[source]¶

Bases: Distribution

Creates a Laplace distribution parameterized by loc and scale.

Example

>>> m = Laplace(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # Laplace distributed with loc=0, scale=1
tensor([ 0.1046])

Parameters

loc (float or Tensor) – mean of the distribution
scale (float or Tensor) – scale of the distribution

arg_constraints = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}¶

cdf(value)[source]¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

icdf(value)[source]¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=torch.Size([]))[source]¶

property stddev¶

support = Real()¶

property variance¶

LogNormal¶

class torch.distributions.log_normal.LogNormal(loc, scale, validate_args=None)[source]¶

Bases: TransformedDistribution

Creates a log-normal distribution parameterized by loc and scale, where:

X ~ Normal(loc, scale)
Y = exp(X) ~ LogNormal(loc, scale)

Example

>>> m = LogNormal(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # log-normal distributed with mean=0 and stddev=1
tensor([ 0.1046])

Parameters

loc (float or Tensor) – mean of the log of the distribution
scale (float or Tensor) – standard deviation of the log of the distribution
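
Because Y = exp(X), the log-normal density follows from the normal density by a change of variables: log p_Y(y) = log p_X(log y) - log y. A small numerical check, with arbitrary evaluation points:

```python
import math
import torch
from torch.distributions import LogNormal, Normal

loc, scale = torch.tensor([0.0]), torch.tensor([1.0])
log_normal = LogNormal(loc, scale)
normal = Normal(loc, scale)

y = torch.tensor([0.5, 1.0, 3.0])
lp = log_normal.log_prob(y)

# Change of variables: subtract log|dy/dx| = log(y).
lp_change_of_var = normal.log_prob(y.log()) - y.log()

# Mean of a log-normal: exp(loc + scale^2 / 2).
expected_mean = torch.tensor([math.exp(0.5)])
```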

arg_constraints: Dict[str, Constraint] = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

property loc¶

property mean¶

property mode¶

property scale¶

support = GreaterThan(lower_bound=0.0)¶

property variance¶

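LowRankMultivariateNormal¶

class torch.distributions.lowrank_multivariate_normal.LowRankMultivariateNormal(loc, cov_factor, cov_diag, validate_args=None)[source]¶

Bases: Distribution

Creates a multivariate normal distribution whose covariance matrix has a low-rank form parameterized by cov_factor and cov_diag:

covariance_matrix = cov_factor @ cov_factor.T + cov_diag

Example

>>> m = LowRankMultivariateNormal(torch.zeros(2), torch.tensor([[1.], [0.]]), torch.ones(2))
>>> m.sample()  # normally distributed with mean=`[0,0]`, cov_factor=`[[1],[0]]`, cov_diag=`[1,1]`
tensor([-0.2102, -0.5429])

Parameters

loc (Tensor) – mean of the distribution with shape batch_shape + event_shape
cov_factor (Tensor) – factor part of the low-rank form of the covariance matrix with shape batch_shape + event_shape + (rank,)
cov_diag (Tensor) – diagonal part of the low-rank form of the covariance matrix with shape batch_shape + event_shape

Note

The computation of the determinant and inverse of the covariance matrix is avoided when cov_factor.shape[1] << cov_factor.shape[0], thanks to the Woodbury matrix identity and the matrix determinant lemma. Thanks to these formulas, we only need to compute the determinant and inverse of the small "capacitance" matrix: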

capacitance = I + cov_factor.T @ inv(cov_diag) @ cov_factor
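
In the covariance formula, cov_diag is a vector that enters as a diagonal matrix. The sketch below builds the dense covariance explicitly and compares the result with an ordinary MultivariateNormal; the evaluation point x is arbitrary.

```python
import torch
from torch.distributions import LowRankMultivariateNormal, MultivariateNormal

loc = torch.zeros(2)
cov_factor = torch.tensor([[1.0], [0.0]])    # shape event_shape + (rank,) = (2, 1)
cov_diag = torch.ones(2)

m = LowRankMultivariateNormal(loc, cov_factor, cov_diag)

# Dense equivalent: cov_diag is placed on the diagonal.
dense = cov_factor @ cov_factor.T + torch.diag(cov_diag)
reference = MultivariateNormal(loc, covariance_matrix=dense)

x = torch.tensor([0.3, -0.7])
lp_lowrank = m.log_prob(x)
lp_dense = reference.log_prob(x)
```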

arg_constraints = {'cov_diag': IndependentConstraint(GreaterThan(lower_bound=0.0), 1), 'cov_factor': IndependentConstraint(Real(), 2), 'loc': IndependentConstraint(Real(), 1)}¶

property covariance_matrix¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

log_prob(value)[source]¶

property mean¶

property mode¶

property precision_matrix¶

rsample(sample_shape=torch.Size([]))[source]¶

property scale_tril¶

support = IndependentConstraint(Real(), 1)¶

property variance¶

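MixtureSameFamily¶

class torch.distributions.mixture_same_family.MixtureSameFamily(mixture_distribution, component_distribution, validate_args=None)[source]¶

Bases: Distribution

The MixtureSameFamily distribution implements a (batch of) mixture distribution where all components are from different parameterizations of the same distribution type. It is parameterized by a Categorical "selecting distribution" (over k components) and a component distribution, i.e., a Distribution with a rightmost batch shape (equal to [k]) which indexes each (batch of) component.

Example

>>> import torch
>>> import torch.distributions as D
>>> from torch.distributions import MixtureSameFamily
>>> # Construct Gaussian Mixture Model in 1D consisting of 5 equally
>>> # weighted normal distributions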
>>> mix = D.Categorical(torch.ones(5,))
>>> comp = D.Normal(torch.randn(5,), torch.rand(5,))
>>> gmm = MixtureSameFamily(mix, comp)

>>> # Construct Gaussian Mixture Model in 2D consisting of 5 equally
>>> # weighted bivariate normal distributions
>>> mix = D.Categorical(torch.ones(5,))
>>> comp = D.Independent(D.Normal(
...          torch.randn(5,2), torch.rand(5,2)), 1)
>>> gmm = MixtureSameFamily(mix, comp)

>>> # Construct a batch of 3 Gaussian Mixture Models in 2D each
>>> # consisting of 5 random weighted bivariate normal distributions
>>> mix = D.Categorical(torch.rand(3,5))
>>> comp = D.Independent(D.Normal(
...         torch.randn(3,5,2), torch.rand(3,5,2)), 1)
>>> gmm = MixtureSameFamily(mix, comp)

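Parameters

mixture_distribution – torch.distributions.Categorical-like instance. Manages the probability of selecting a component. The number of categories must match the rightmost batch dimension of the component_distribution. Must have either a scalar batch_shape or a batch_shape matching component_distribution.batch_shape[:-1].
component_distribution – torch.distributions.Distribution-like instance. The rightmost batch dimension indexes the components.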
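
The mixture density is the weighted sum of the component densities, so its log-density is a log-sum-exp over the components. A sketch with two Gaussian components follows; the weights and parameters are arbitrary.

```python
import torch
from torch.distributions import Categorical, MixtureSameFamily, Normal

mix = Categorical(torch.tensor([0.2, 0.8]))                       # k = 2 weights
comp = Normal(torch.tensor([-1.0, 2.0]), torch.tensor([0.5, 1.0]))  # batch_shape (2,)
gmm = MixtureSameFamily(mix, comp)

x = torch.tensor(0.3)
lp = gmm.log_prob(x)

# Manual computation: log sum_k w_k * N(x; mu_k, sigma_k).
manual = torch.logsumexp(mix.logits + comp.log_prob(x), dim=-1)
```

The mixture mean is the weighted average of the component means, here 0.2 * (-1) + 0.8 * 2 = 1.4.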

arg_constraints: Dict[str, Constraint] = {}¶

cdf(x)[source]¶

property component_distribution¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = False¶

log_prob(x)[source]¶

property mean¶

property mixture_distribution¶

sample(sample_shape=torch.Size([]))[source]¶

property support¶

property variance¶

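Multinomial¶

class torch.distributions.multinomial.Multinomial(total_count=1, probs=None, logits=None, validate_args=None)[source]¶

Bases: Distribution

Creates a Multinomial distribution parameterized by total_count and either probs or logits (but not both). The innermost dimension of probs indexes over categories. All other dimensions index over batches.

Note that total_count need not be specified if only log_prob() is called (see the example below).

Note

The probs argument must be non-negative, finite and have a non-zero sum, and it will be normalized to sum to 1 along the last dimension. probs will return this normalized value. The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

sample() requires a single shared total_count for all parameters and samples.
log_prob() allows a different total_count for each parameter and sample.

Example

>>> m = Multinomial(100, torch.tensor([ 1., 1., 1., 1.]))
>>> x = m.sample()  # equal probability of 0, 1, 2, 3
tensor([ 21.,  24.,  30.,  25.])

>>> Multinomial(probs=torch.tensor([1., 1., 1., 1.])).log_prob(x)
tensor([-4.1338])

Parameters

total_count (int) – number of trials
probs (Tensor) – event probabilities
logits (Tensor) – event log probabilities (unnormalized)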

arg_constraints = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

log_prob(value)[source]¶

property logits¶

property mean¶

property param_shape¶

property probs¶

sample(sample_shape=torch.Size([]))[source]¶

property support¶

total_count: int¶

property variance¶

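MultivariateNormal¶

class torch.distributions.multivariate_normal.MultivariateNormal(loc, covariance_matrix=None, precision_matrix=None, scale_tril=None, validate_args=None)[source]¶

Bases: Distribution

Creates a multivariate normal (also called Gaussian) distribution parameterized by a mean vector and a covariance matrix.

The multivariate normal distribution can be parameterized in terms of a positive definite covariance matrix $\mathbf{\Sigma}$, a positive definite precision matrix $\mathbf{\Sigma}^{-1}$, or a lower-triangular matrix $\mathbf{L}$ with positive-valued diagonal entries such that $\mathbf{\Sigma} = \mathbf{L}\mathbf{L}^\top$. This triangular matrix can be obtained via, e.g., the Cholesky decomposition of the covariance matrix.

Example

>>> m = MultivariateNormal(torch.zeros(2), torch.eye(2))
>>> m.sample()  # normally distributed with mean=`[0,0]` and covariance_matrix=`I`
tensor([-0.2102, -0.5429])

Parameters

loc (Tensor) – mean of the distribution
covariance_matrix (Tensor) – positive-definite covariance matrix
precision_matrix (Tensor) – positive-definite precision matrix
scale_tril (Tensor) – lower-triangular factor of the covariance matrix, with positive-valued diagonal

Note

Only one of covariance_matrix, precision_matrix or scale_tril can be specified.

Using scale_tril is more efficient: all computations internally are based on scale_tril. If covariance_matrix or precision_matrix is passed instead, it is only used to compute the corresponding lower-triangular matrix using a Cholesky decomposition.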
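
The three parameterizations describe the same distribution. The sketch below builds each one from the same (arbitrary) covariance matrix and compares the log-densities at one point.

```python
import torch
from torch.distributions import MultivariateNormal

loc = torch.zeros(2)
cov = torch.tensor([[2.0, 0.5],
                    [0.5, 1.0]])

by_cov = MultivariateNormal(loc, covariance_matrix=cov)
by_prec = MultivariateNormal(loc, precision_matrix=torch.linalg.inv(cov))
by_tril = MultivariateNormal(loc, scale_tril=torch.linalg.cholesky(cov))

x = torch.tensor([0.3, -0.7])
lp_cov = by_cov.log_prob(x)
lp_prec = by_prec.log_prob(x)
lp_tril = by_tril.log_prob(x)
```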

arg_constraints = {'covariance_matrix': PositiveDefinite(), 'loc': IndependentConstraint(Real(), 1), 'precision_matrix': PositiveDefinite(), 'scale_tril': LowerCholesky()}¶

property covariance_matrix¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

log_prob(value)[source]¶

property mean¶

property mode¶

property precision_matrix¶

rsample(sample_shape=torch.Size([]))[source]¶

property scale_tril¶

support = IndependentConstraint(Real(), 1)¶

property variance¶

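NegativeBinomial¶

class torch.distributions.negative_binomial.NegativeBinomial(total_count, probs=None, logits=None, validate_args=None)[source]¶

Bases: Distribution

Creates a Negative Binomial distribution, i.e. the distribution of the number of successful independent and identical Bernoulli trials before total_count failures are achieved. The probability of success of each Bernoulli trial is probs.

Parameters

total_count (float or Tensor) – non-negative number of negative Bernoulli trials to stop, although the distribution is still valid for real-valued counts
probs (Tensor) – event probabilities of success in the half-open interval [0, 1)
logits (Tensor) – event log-odds for probabilities of success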

arg_constraints = {'logits': Real(), 'probs': HalfOpenInterval(lower_bound=0.0, upper_bound=1.0), 'total_count': GreaterThanEq(lower_bound=0)}¶

expand(batch_shape, _instance=None)[source]¶

log_prob(value)[source]¶

property logits¶

property mean¶

property mode¶

property param_shape¶

property probs¶

sample(sample_shape=torch.Size([]))[source]¶

support = IntegerGreaterThan(lower_bound=0)¶

property variance¶

Normal¶

class torch.distributions.normal.Normal(loc, scale, validate_args=None)[source]¶

Bases: ExponentialFamily

Creates a normal (also called Gaussian) distribution parameterized by loc and scale.

Example

>>> m = Normal(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # normally distributed with loc=0 and scale=1
tensor([ 0.1046])

Parameters

loc (float or torch.Tensor) – mean of the distribution (often referred to as mu)
scale (float or torch.Tensor) – standard deviation of the distribution (often referred to as sigma)

arg_constraints = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}¶

cdf(value)[source]¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

icdf(value)[source]¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=torch.Size([]))[source]¶

sample(sample_shape=torch.Size([]))[source]¶

property stddev¶

support = Real()¶

property variance¶

OneHotCategorical¶

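class torch.distributions.one_hot_categorical.OneHotCategorical(probs=None, logits=None, validate_args=None)[source]¶

Bases: Distribution

Creates a one-hot categorical distribution parameterized by probs or logits.

Samples are one-hot coded vectors of size probs.size(-1).

Note

The probs argument must be non-negative, finite and have a non-zero sum, and it will be normalized to sum to 1 along the last dimension. probs will return this normalized value. The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

See also: torch.distributions.Categorical() for specifications of probs and logits.

Example

>>> m = OneHotCategorical(torch.tensor([ 0.25, 0.25, 0.25, 0.25 ]))
>>> m.sample()  # equal probability of 0, 1, 2, 3
tensor([ 0.,  0.,  0.,  1.])

Parameters

probs (Tensor) – event probabilities
logits (Tensor) – event log probabilities (unnormalized)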

arg_constraints = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}¶

entropy()[source]¶

enumerate_support(expand=True)[source]¶

expand(batch_shape, _instance=None)[source]¶

has_enumerate_support = True¶

log_prob(value)[source]¶

property logits¶

property mean¶

property mode¶

property param_shape¶

property probs¶

sample(sample_shape=torch.Size([]))[source]¶

support = OneHot()¶

property variance¶

Pareto¶

class torch.distributions.pareto.Pareto(scale, alpha, validate_args=None)[source]¶

Bases: TransformedDistribution

Samples from a Pareto Type 1 distribution.

Example

>>> m = Pareto(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Pareto distribution with scale=1 and alpha=1
tensor([ 1.5623])

Parameters

scale (float or Tensor) – scale parameter of the distribution
alpha (float or Tensor) – shape parameter of the distribution

arg_constraints: Dict[str, Constraint] = {'alpha': GreaterThan(lower_bound=0.0), 'scale': GreaterThan(lower_bound=0.0)}¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

property mean¶

property mode¶

property support¶

property variance¶

Poisson¶

class torch.distributions.poisson.Poisson(rate, validate_args=None)[source]¶

Bases: ExponentialFamily

Creates a Poisson distribution parameterized by rate, the rate parameter.

Samples are non-negative integers, with a probability mass function (PMF) given by

$$\mathrm{rate}^k \frac{e^{-\mathrm{rate}}}{k!}$$

Example

>>> m = Poisson(torch.tensor([4.0]))
>>> m.sample()
tensor([ 3.])

Parameters: rate (Number, Tensor) – the rate parameter
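
The PMF above can be compared with log_prob for a few counts. This sketch uses a floating-point rate of 4 and the arbitrary counts k = 0, 3, 7.

```python
import math
import torch
from torch.distributions import Poisson

rate = 4.0
m = Poisson(torch.tensor([rate]))

counts = (0, 3, 7)
k = torch.tensor([float(c) for c in counts])
log_pmf = m.log_prob(k)

# Reference: log(rate^k * exp(-rate) / k!) evaluated with the math module.
expected = torch.tensor(
    [math.log(rate ** c * math.exp(-rate) / math.factorial(c)) for c in counts]
)
```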

arg_constraints = {'rate': GreaterThanEq(lower_bound=0.0)}¶

expand(batch_shape, _instance=None)[source]¶

log_prob(value)[source]¶

property mean¶

property mode¶

sample(sample_shape=torch.Size([]))[source]¶

support = IntegerGreaterThan(lower_bound=0)¶

property variance¶

RelaxedBernoulli¶

class torch.distributions.relaxed_bernoulli.RelaxedBernoulli(temperature, probs=None, logits=None, validate_args=None)[source]¶

Bases: TransformedDistribution

Creates a RelaxedBernoulli distribution parameterized by temperature and either probs or logits (but not both). This is a relaxed version of the Bernoulli distribution, so its values lie in (0, 1) and its samples are reparameterizable.

Example

>>> m = RelaxedBernoulli(torch.tensor([2.2]),
...                      torch.tensor([0.1, 0.2, 0.3, 0.99]))
>>> m.sample()
tensor([ 0.2951,  0.3442,  0.8918,  0.9021])

Parameters

temperature (Tensor) – relaxation temperature
probs (Number, Tensor) – the probability of sampling `1`
logits (Number, Tensor) – the log-odds of sampling `1`

arg_constraints: Dict[str, Constraint] = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

property logits¶

property probs¶

support = Interval(lower_bound=0.0, upper_bound=1.0)¶

property temperature¶

LogitRelaxedBernoulli¶

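class torch.distributions.relaxed_bernoulli.LogitRelaxedBernoulli(temperature, probs=None, logits=None, validate_args=None)[source]¶

Bases: Distribution

Creates a LogitRelaxedBernoulli distribution parameterized by probs or logits (but not both), which is the logit of a RelaxedBernoulli distribution.

Samples are logits of values in (0, 1). See [1] for more details.

Parameters

temperature (Tensor) – relaxation temperature
probs (Number, Tensor) – the probability of sampling `1`
logits (Number, Tensor) – the log-odds of sampling `1`

[1] The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables (Maddison et al., 2017)

[2] Categorical Reparametrization with Gumbel-Softmax (Jang et al., 2017)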

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}¶

expand(batch_shape, _instance=None)[原始碼]¶

log_prob(value)[原始碼]¶

property logits¶

property param_shape¶

property probs¶

rsample(sample_shape=torch.Size([]))[source]¶

support = Real()¶

RelaxedOneHotCategorical¶

class torch.distributions.relaxed_categorical.RelaxedOneHotCategorical(temperature, probs=None, logits=None, validate_args=None)[source]¶

Bases: TransformedDistribution

Creates a RelaxedOneHotCategorical distribution parameterized by temperature and by either probs or logits. This is a relaxed version of the OneHotCategorical distribution, so its samples lie on the simplex and are reparameterizable.

Example

>>> m = RelaxedOneHotCategorical(torch.tensor([2.2]),
...                              torch.tensor([0.1, 0.2, 0.3, 0.4]))
>>> m.sample()
tensor([ 0.1294,  0.2324,  0.3859,  0.2523])

Parameters

temperature (Tensor) – relaxation temperature
probs (Tensor) – event probabilities
logits (Tensor) – unnormalized log probability for each event

arg_constraints: Dict[str, Constraint] = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

property logits¶

property probs¶

support = Simplex()¶

property temperature¶

StudentT¶

class torch.distributions.studentT.StudentT(df, loc=0.0, scale=1.0, validate_args=None)[source]¶

Bases: Distribution

Creates a Student's t-distribution parameterized by degrees of freedom df, mean loc, and scale scale.

Example

>>> m = StudentT(torch.tensor([2.0]))
>>> m.sample()  # Student's t-distributed with degrees of freedom=2
tensor([ 0.1046])

Parameters

df (float or Tensor) – degrees of freedom
loc (float or Tensor) – mean of the distribution
scale (float or Tensor) – scale of the distribution

arg_constraints = {'df': GreaterThan(lower_bound=0.0), 'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=torch.Size([]))[source]¶

support = Real()¶

property variance¶

TransformedDistribution¶

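class torch.distributions.transformed_distribution.TransformedDistribution(base_distribution, transforms, validate_args=None)[source]¶

Bases: Distribution

Extension of the Distribution class that applies a sequence of Transforms to a base distribution. Let f be the composition of the transforms applied:

X ~ BaseDistribution
Y = f(X) ~ TransformedDistribution(BaseDistribution, f)
log p(Y) = log p(X) + log |det (dX/dY)|

Note that the .event_shape of a TransformedDistribution is the maximum shape of its base distribution and its transforms, since transforms can introduce correlations among events.

An example of the usage of TransformedDistribution would be: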

# Building a Logistic Distribution
# X ~ Uniform(0, 1)
# f = a + b * logit(X)
# Y ~ f(X) ~ Logistic(a, b)
base_distribution = Uniform(0, 1)
transforms = [SigmoidTransform().inv, AffineTransform(loc=a, scale=b)]
logistic = TransformedDistribution(base_distribution, transforms)

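For more examples, please look at the implementations of Gumbel, HalfCauchy, HalfNormal, LogNormal, Pareto, Weibull, RelaxedBernoulli and RelaxedOneHotCategorical.

arg_constraints: Dict[str, Constraint] = {}¶

cdf(value)[source]¶: Computes the cumulative distribution function by inverting the transform(s) and computing the score of the base distribution.

expand(batch_shape, _instance=None)[source]¶

property has_rsample¶

icdf(value)[source]¶: Computes the inverse cumulative distribution function using the transform(s) and computing the score of the base distribution.

log_prob(value)[source]¶: Scores the sample by inverting the transform(s) and computing the score from the score of the base distribution and the log abs det Jacobian.

rsample(sample_shape=torch.Size([]))[source]¶: Generates a sample_shape shaped reparameterized sample, or a sample_shape shaped batch of reparameterized samples if the distribution parameters are batched. Samples first from the base distribution and then applies transform() for every transform in the list.

sample(sample_shape=torch.Size([]))[source]¶: Generates a sample_shape shaped sample, or a sample_shape shaped batch of samples if the distribution parameters are batched. Samples first from the base distribution and then applies transform() for every transform in the list.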

property support¶
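The Logistic construction above leaves a and b undefined. The following runnable sketch fills them in with arbitrary values (a = 1.0, b = 2.0, both assumptions) and compares log_prob() and cdf() against the closed-form Logistic density and CDF.

```python
import torch
from torch.distributions import TransformedDistribution, Uniform
from torch.distributions.transforms import AffineTransform, SigmoidTransform

a, b = 1.0, 2.0  # location and scale, chosen arbitrarily for illustration

# X ~ Uniform(0, 1), Y = a + b * logit(X) ~ Logistic(a, b)
base_distribution = Uniform(0.0, 1.0)
transforms = [SigmoidTransform().inv, AffineTransform(loc=a, scale=b)]
logistic = TransformedDistribution(base_distribution, transforms)

y = torch.tensor([-1.0, 1.0, 4.0])
log_density = logistic.log_prob(y)
cdf = logistic.cdf(y)

# Closed form: p(y) = exp(-z) / (b * (1 + exp(-z))^2) with z = (y - a) / b,
# and the CDF is sigmoid(z).
z = (y - a) / b
closed_form_log_density = (
    -z - torch.log(torch.tensor(b)) - 2 * torch.log1p(torch.exp(-z))
)
closed_form_cdf = torch.sigmoid(z)
```

The base Uniform(0, 1) has log density zero, so the whole log density comes from the log abs det Jacobian terms of the two transforms.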

Uniform¶

class torch.distributions.uniform.Uniform(low, high, validate_args=None)[source]¶

Bases: Distribution

Generates uniformly distributed random samples from the half-open interval [low, high).

Example

>>> m = Uniform(torch.tensor([0.0]), torch.tensor([5.0]))
>>> m.sample()  # uniformly distributed in the range [0.0, 5.0)
tensor([ 2.3418])

Parameters

low (float or Tensor) – lower range (inclusive).
high (float or Tensor) – upper range (exclusive).

arg_constraints = {'high': Dependent(), 'low': Dependent()}¶

cdf(value)[source]¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

icdf(value)[source]¶

log_prob(value)[source]¶

property mean¶

property mode¶

rsample(sample_shape=torch.Size([]))[source]¶

property stddev¶

property support¶

property variance¶

VonMises¶

class torch.distributions.von_mises.VonMises(loc, concentration, validate_args=None)[source]¶

Bases: Distribution

A circular von Mises distribution.

This implementation uses polar coordinates. The loc and value arguments can be any real number (to facilitate unconstrained optimization), but are interpreted as angles modulo 2 pi.

Example:

>>> m = VonMises(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # von Mises distributed with loc=1 and concentration=1
tensor([1.9777])

Parameters

loc (torch.Tensor) – an angle in radians.
concentration (torch.Tensor) – concentration parameter

arg_constraints = {'concentration': GreaterThan(lower_bound=0.0), 'loc': Real()}¶

expand(batch_shape)[source]¶

has_rsample = False¶

log_prob(value)[source]¶

property mean¶: The provided mean is the circular one.

property mode¶

sample(sample_shape=torch.Size([]))[source]¶

The sampling algorithm for the von Mises distribution is based on the following paper: D.J. Best and N.I. Fisher, "Efficient simulation of the von Mises distribution." Applied Statistics (1979): 152-157.

Sampling is always done in double precision internally. This avoids a hang in _rejection_sample() for small values of the concentration, which starts to happen in single precision at around 1e-4 (see issue #88443).

support = Real()¶

property variance¶: The provided variance is the circular one.
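Since value is interpreted as an angle modulo 2 pi, shifting it by a full turn should leave the log density unchanged. A minimal sketch of that property, with an arbitrarily chosen angle of 0.3 radians:

```python
import math

import torch
from torch.distributions import VonMises

m = VonMises(torch.tensor([1.0]), torch.tensor([1.0]))

theta = torch.tensor([0.3])

# Log density at theta and at the same angle shifted by one full turn.
log_p = m.log_prob(theta)
log_p_shifted = m.log_prob(theta + 2 * math.pi)
```

The two values agree up to floating-point rounding, which is why loc can be optimized without any constraint.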

Weibull¶

class torch.distributions.weibull.Weibull(scale, concentration, validate_args=None)[source]¶

Bases: TransformedDistribution

Samples from a two-parameter Weibull distribution.

Example

>>> m = Weibull(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Weibull distribution with scale=1, concentration=1
tensor([ 0.4784])

Parameters

scale (float or Tensor) – Scale parameter of the distribution (lambda).
concentration (float or Tensor) – Concentration parameter of the distribution (k/shape).

arg_constraints: Dict[str, Constraint] = {'concentration': GreaterThan(lower_bound=0.0), 'scale': GreaterThan(lower_bound=0.0)}¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

property mean¶

property mode¶

support = GreaterThan(lower_bound=0.0)¶

property variance¶

Wishart¶

class torch.distributions.wishart.Wishart(df, covariance_matrix=None, precision_matrix=None, scale_tril=None, validate_args=None)[source]¶

Bases: ExponentialFamily

Creates a Wishart distribution parameterized by a symmetric positive definite matrix $\Sigma$, or its Cholesky decomposition $\mathbf{\Sigma} = \mathbf{L}\mathbf{L}^\top$.

Example

>>> m = Wishart(torch.Tensor([2]), covariance_matrix=torch.eye(2))
>>> m.sample()  # Wishart distributed with mean=`df * I` and
>>>             # variance(x_ij)=`df` for i != j and variance(x_ij)=`2 * df` for i == j

Parameters

df (float or Tensor) – real-valued parameter larger than (dimension of the square matrix) - 1
covariance_matrix (Tensor) – positive-definite covariance matrix
precision_matrix (Tensor) – positive-definite precision matrix
scale_tril (Tensor) – lower-triangular factor of the covariance, with positive-valued diagonal

Note

Only one of covariance_matrix, precision_matrix, or scale_tril can be specified. Using scale_tril is more efficient: all internal computations are based on scale_tril. If covariance_matrix or precision_matrix is passed instead, it is only used to compute the corresponding lower-triangular matrix via a Cholesky decomposition. 'torch.distributions.LKJCholesky' is a restricted Wishart distribution. [1]

References

[1] Wang, Z., Wu, Y. and Chu, H., 2018. On equivalence of the LKJ distribution and the restricted Wishart distribution.
[2] Sawyer, S., 2007. Wishart Distributions and Inverse-Wishart Sampling.
[3] Anderson, T. W., 2003. An Introduction to Multivariate Statistical Analysis (3rd ed.).
[4] Odell, P. L. & Feiveson, A. H., 1966. A Numerical Procedure to Generate a Sample Covariance Matrix. JASA, 61(313):199-203.
[5] Ku, Y.-C. & Bloomfield, P., 2010. Generating Random Wishart Matrices with Fractional Degrees of Freedom in OX.

arg_constraints = {'covariance_matrix': PositiveDefinite(), 'df': GreaterThan(lower_bound=0), 'precision_matrix': PositiveDefinite(), 'scale_tril': LowerCholesky()}¶

property covariance_matrix¶

entropy()[source]¶

expand(batch_shape, _instance=None)[source]¶

has_rsample = True¶

log_prob(value)[source]¶

property mean¶

property mode¶

property precision_matrix¶

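rsample(sample_shape=torch.Size([]), max_try_correction=None)[source]¶: Warning

In some cases, the sampling algorithm based on the Bartlett decomposition may return singular matrix samples. By default, several attempts are made to correct singular samples, but singular matrix samples may still be returned in the end. Singular samples may yield -inf values in .log_prob(). In those cases, the user should validate the samples and either fix the value of df or adjust the max_try_correction argument of .rsample accordingly.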

property scale_tril¶

support = PositiveDefinite()¶

property variance¶

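KL Divergence¶

torch.distributions.kl.kl_divergence(p, q)[source]¶

Computes the Kullback-Leibler divergence $KL(p \| q)$ between two distributions.

KL(p \| q) = \int p(x) \log\frac {p(x)} {q(x)} \,dx

Parameters

p (Distribution) – A Distribution object.
q (Distribution) – A Distribution object.

Returns

A batch of KL divergences of shape batch_shape.

Return type

Tensor

Raises

NotImplementedError – If the distribution types have not been registered via register_kl().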

KL divergence is currently implemented for the following distribution pairs:

Bernoulli and Bernoulli
Bernoulli and Poisson
Beta and Beta
Beta and ContinuousBernoulli
Beta and Exponential
Beta and Gamma
Beta and Normal
Beta and Pareto
Beta and Uniform
Binomial and Binomial
Categorical and Categorical
Cauchy and Cauchy
ContinuousBernoulli and ContinuousBernoulli
ContinuousBernoulli and Exponential
ContinuousBernoulli and Normal
ContinuousBernoulli and Pareto
ContinuousBernoulli and Uniform
Dirichlet and Dirichlet
Exponential and Beta
Exponential and ContinuousBernoulli
Exponential and Exponential
Exponential and Gamma
Exponential and Gumbel
Exponential and Normal
Exponential and Pareto
Exponential and Uniform
ExponentialFamily and ExponentialFamily
Gamma and Beta
Gamma and ContinuousBernoulli
Gamma and Exponential
Gamma and Gamma
Gamma and Gumbel
Gamma and Normal
Gamma and Pareto
Gamma and Uniform
Geometric and Geometric
Gumbel and Beta
Gumbel and ContinuousBernoulli
Gumbel and Exponential
Gumbel and Gamma
Gumbel and Gumbel
Gumbel and Normal
Gumbel and Pareto
Gumbel and Uniform
HalfNormal and HalfNormal
Independent and Independent
Laplace and Beta
Laplace and ContinuousBernoulli
Laplace and Exponential
Laplace and Gamma
Laplace and Laplace
Laplace and Normal
Laplace and Pareto
Laplace and Uniform
LowRankMultivariateNormal and LowRankMultivariateNormal
LowRankMultivariateNormal and MultivariateNormal
MultivariateNormal and LowRankMultivariateNormal
MultivariateNormal and MultivariateNormal
Normal and Beta
Normal and ContinuousBernoulli
Normal and Exponential
Normal and Gamma
Normal and Gumbel
Normal and Laplace
Normal and Normal
Normal and Pareto
Normal and Uniform
OneHotCategorical and OneHotCategorical
Pareto and Beta
Pareto and ContinuousBernoulli
Pareto and Exponential
Pareto and Gamma
Pareto and Normal
Pareto and Pareto
Pareto and Uniform
Poisson and Bernoulli
Poisson and Binomial
Poisson and Poisson
TransformedDistribution and TransformedDistribution
Uniform and Beta
Uniform and ContinuousBernoulli
Uniform and Exponential
Uniform and Gamma
Uniform and Gumbel
Uniform and Normal
Uniform and Pareto
Uniform and Uniform
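As an illustration of kl_divergence() on one of the registered pairs, the sketch below compares the library result for two batched Normal distributions with the textbook closed form. The parameter values are arbitrary assumptions.

```python
import torch
from torch.distributions import Normal, kl_divergence

# Two batches of univariate normals with batch_shape (2,).
p = Normal(torch.tensor([0.0, 1.0]), torch.tensor([1.0, 2.0]))
q = Normal(torch.tensor([0.5, 0.0]), torch.tensor([1.5, 1.0]))

# One KL value per batch element.
kl = kl_divergence(p, q)

# Closed form for KL(N(mu_p, s_p^2) || N(mu_q, s_q^2)).
closed_form = (
    torch.log(q.scale / p.scale)
    + (p.scale ** 2 + (p.loc - q.loc) ** 2) / (2 * q.scale ** 2)
    - 0.5
)
```

The result has shape batch_shape, and KL(p || q) generally differs from KL(q || p).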

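torch.distributions.kl.register_kl(type_p, type_q)[source]¶

Decorator to register a pairwise function with kl_divergence(). Usage:

@register_kl(Normal, Normal)
def kl_normal_normal(p, q):
    # insert implementation here
    ...

Lookup returns the most specific (type, type) match, ordered by subclass. If the match is ambiguous, a RuntimeWarning is raised. For example, to resolve the ambiguous situation

@register_kl(BaseP, DerivedQ)
def kl_version1(p, q): ...
@register_kl(DerivedP, BaseQ)
def kl_version2(p, q): ...

you should register a third, most-specific implementation, e.g.:

register_kl(DerivedP, DerivedQ)(kl_version1)  # Break the tie.

Parameters

type_p (type) – A subclass of Distribution.
type_q (type) – A subclass of Distribution.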
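The subclass-ordered lookup can be observed with a user-defined subclass. In the sketch below, ShiftedNormal and kl_shifted_shifted are hypothetical names introduced only for illustration; the registered function records its calls so that the dispatch is visible.

```python
import torch
from torch.distributions import Normal, kl_divergence
from torch.distributions.kl import register_kl


class ShiftedNormal(Normal):
    """Hypothetical subclass, used only to demonstrate dispatch."""


calls = []  # records every time the custom implementation runs


@register_kl(ShiftedNormal, ShiftedNormal)
def kl_shifted_shifted(p, q):
    calls.append("shifted")
    var_ratio = (p.scale / q.scale) ** 2
    mean_term = ((p.loc - q.loc) / q.scale) ** 2
    return 0.5 * (var_ratio + mean_term - 1 - var_ratio.log())


# Most specific match: the custom (ShiftedNormal, ShiftedNormal) function.
kl_custom = kl_divergence(ShiftedNormal(0.0, 1.0), ShiftedNormal(1.0, 1.0))

# Plain Normal arguments still use the built-in (Normal, Normal) function.
kl_builtin = kl_divergence(Normal(0.0, 1.0), Normal(1.0, 1.0))
```

Both calls compute the same number here; only the first one goes through the custom function.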

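Transforms¶

class torch.distributions.transforms.AbsTransform(cache_size=0)[source]¶: Transform via the mapping $y = |x|$.

class torch.distributions.transforms.AffineTransform(loc, scale, event_dim=0, cache_size=0)[source]¶

Transform via the pointwise affine mapping $y = \text{loc} + \text{scale} \times x$.

Parameters

loc (Tensor or float) – Location parameter.
scale (Tensor or float) – Scale parameter.
event_dim (int) – Optional size of event_shape. This should be zero for univariate random variables, 1 for distributions over vectors, 2 for distributions over matrices, etc.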
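A small sketch of the three operations a transform exposes (forward call, inverse, and log abs det Jacobian), using AffineTransform with arbitrarily chosen loc=1.0 and scale=3.0:

```python
import torch
from torch.distributions.transforms import AffineTransform

t = AffineTransform(loc=1.0, scale=3.0)

x = torch.tensor([0.0, 1.0, 2.0])

# Forward mapping y = loc + scale * x, giving 1, 4, 7 for these inputs.
y = t(x)

# The inverse transform recovers x from y.
x_back = t.inv(y)

# For a pointwise affine map, log |dy/dx| = log |scale| for every element.
log_det = t.log_abs_det_jacobian(x, y)
```

With event_dim=0, as here, the Jacobian term is returned per element rather than summed over any dimension.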

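class torch.distributions.transforms.CatTransform(tseq, dim=0, lengths=None, cache_size=0)[source]¶

Transform functor that applies a sequence of transforms tseq component-wise to each submatrix at dim, of length lengths[dim], in a way compatible with torch.cat().

Example

import torch
from torch.distributions.transforms import CatTransform, ExpTransform, identity_transform

# torch.arange(1.0, 11.0) yields the ten values 1..10 (torch.range is deprecated)
x0 = torch.cat([torch.arange(1.0, 11.0), torch.arange(1.0, 11.0)], dim=0)
x = torch.cat([x0, x0], dim=0)
t0 = CatTransform([ExpTransform(), identity_transform], dim=0, lengths=[10, 10])
t = CatTransform([t0, t0], dim=0, lengths=[20, 20])
y = t(x)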

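class torch.distributions.transforms.ComposeTransform(parts, cache_size=0)[source]¶

Composes multiple transforms in a chain. The transforms being composed are responsible for caching.

Parameters

parts (list of Transform) – A list of transforms to compose.
cache_size (int) – Size of the cache. If zero, no caching is done. If one, the latest single value is cached. Only 0 and 1 are supported.

class torch.distributions.transforms.CorrCholeskyTransform(cache_size=0)[source]¶

Transforms an unconstrained real vector $x$ of length $D*(D-1)/2$ into the Cholesky factor of a D-dimensional correlation matrix. This Cholesky factor is a lower-triangular matrix with positive diagonal entries and unit Euclidean norm for each row. The transform proceeds as follows:

First, we convert x into a lower-triangular matrix in row order.

For each row $X_i$ of the lower-triangular part, we apply a signed version of class StickBreakingTransform to turn $X_i$ into a vector of unit Euclidean length, using the following steps:
- Scale into the interval $(-1, 1)$ domain: $r_i = \tanh(X_i)$.
- Transform into an unsigned domain: $z_i = r_i^2$.
- Apply $s_i = StickBreakingTransform(z_i)$.
- Transform back into the signed domain: $y_i = sign(r_i) * \sqrt{s_i}$.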
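The properties stated above (lower-triangular output, positive diagonal, unit-norm rows) can be checked numerically. This is a minimal sketch assuming D = 3, so the unconstrained input has D*(D-1)/2 = 3 entries.

```python
import torch
from torch.distributions.transforms import CorrCholeskyTransform

torch.manual_seed(0)

t = CorrCholeskyTransform()

# Unconstrained vector of length D * (D - 1) / 2 with D = 3.
x = torch.randn(3)

# Cholesky factor of a 3 x 3 correlation matrix.
L = t(x)

# Every row of L has unit Euclidean norm, so L @ L.T has a unit diagonal.
corr = L @ L.T
row_norms = L.norm(dim=-1)
```

Because each row has unit norm, corr is a valid correlation matrix without any further normalization.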

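class torch.distributions.transforms.CumulativeDistributionTransform(distribution, cache_size=0)[source]¶

Transform via the cumulative distribution function of a probability distribution.

Parameters: distribution (Distribution) – Distribution whose cumulative distribution function is used for the transformation.

Example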

# Construct a Gaussian copula from a multivariate normal.
base_dist = MultivariateNormal(
    loc=torch.zeros(2),
    scale_tril=LKJCholesky(2).sample(),
)
transform = CumulativeDistributionTransform(Normal(0, 1))
copula = TransformedDistribution(base_dist, [transform])

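class torch.distributions.transforms.ExpTransform(cache_size=0)[source]¶: Transform via the mapping $y = \exp(x)$.

class torch.distributions.transforms.IndependentTransform(base_transform, reinterpreted_batch_ndims, cache_size=0)[source]¶

Wrapper around another transform that treats reinterpreted_batch_ndims-many extra rightmost dimensions as dependent. This has no effect on the forward or backward transforms, but it sums out reinterpreted_batch_ndims-many of the rightmost dimensions in log_abs_det_jacobian().

Parameters

base_transform (Transform) – A base transform.
reinterpreted_batch_ndims (int) – The number of extra rightmost dimensions to treat as dependent.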
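The effect on log_abs_det_jacobian() is easiest to see by comparing shapes. A minimal sketch, assuming an ExpTransform base and an input of shape (4, 3) chosen for illustration:

```python
import torch
from torch.distributions.transforms import ExpTransform, IndependentTransform

torch.manual_seed(0)

base = ExpTransform()
t = IndependentTransform(base, reinterpreted_batch_ndims=1)

x = torch.randn(4, 3)
y = t(x)  # the forward pass is identical to base(x)

# The base transform reports one Jacobian term per element: shape (4, 3).
elementwise = base.log_abs_det_jacobian(x, y)

# The wrapper sums over the rightmost dimension: shape (4,).
summed = t.log_abs_det_jacobian(x, y)
```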

class torch.distributions.transforms.LowerCholeskyTransform(cache_size=0)[source]¶

Transform from unconstrained matrices to lower-triangular matrices with nonnegative diagonal entries.

This is useful for parameterizing positive definite matrices in terms of their Cholesky factorization.

class torch.distributions.transforms.PositiveDefiniteTransform(cache_size=0)[source]¶: Transform from unconstrained matrices to positive-definite matrices.

class torch.distributions.transforms.PowerTransform(exponent, cache_size=0)[source]¶: Transform via the mapping $y = x^{\text{exponent}}$.

class torch.distributions.transforms.ReshapeTransform(in_shape, out_shape, cache_size=0)[source]¶

Unit Jacobian transform to reshape the rightmost part of a tensor.

Note that in_shape and out_shape must have the same number of elements, just as for torch.Tensor.reshape().

Parameters

in_shape (torch.Size) – The input event shape.
out_shape (torch.Size) – The output event shape.

class torch.distributions.transforms.SigmoidTransform(cache_size=0)[source]¶: Transform via the mapping $y = \frac{1}{1 + \exp(-x)}$ and $x = \text{logit}(y)$.

class torch.distributions.transforms.SoftplusTransform(cache_size=0)[source]¶: Transform via the mapping $\text{Softplus}(x) = \log(1 + \exp(x))$. The implementation reverts to the linear function when $x > 20$.

class torch.distributions.transforms.TanhTransform(cache_size=0)[source]¶

Transform via the mapping $y = \tanh(x)$.

It is equivalent to ComposeTransform([AffineTransform(0., 2.), SigmoidTransform(), AffineTransform(-1., 2.)]). However, the composed form might not be numerically stable, so using TanhTransform is recommended instead.

Note that one should use cache_size=1 when NaN/Inf values are involved.
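The stated equivalence follows from the identity tanh(x) = 2 * sigmoid(2x) - 1. The sketch below compares the two forms on a small grid of moderate inputs (an arbitrary choice), where both are numerically well behaved.

```python
import torch
from torch.distributions.transforms import (
    AffineTransform,
    ComposeTransform,
    SigmoidTransform,
    TanhTransform,
)

tanh_t = TanhTransform()
composed = ComposeTransform(
    [AffineTransform(0.0, 2.0), SigmoidTransform(), AffineTransform(-1.0, 2.0)]
)

x = torch.linspace(-3.0, 3.0, 7)

# Forward values of both forms.
y_tanh = tanh_t(x)
y_composed = composed(x)

# Log abs det Jacobians of both forms.
ladj_tanh = tanh_t.log_abs_det_jacobian(x, y_tanh)
ladj_composed = composed.log_abs_det_jacobian(x, y_composed)
```

The two agree here; the recommendation to prefer TanhTransform concerns inputs of large magnitude, which this sketch does not exercise.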

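class torch.distributions.transforms.SoftmaxTransform(cache_size=0)[source]¶

Transform from unconstrained space to the simplex via $y = \exp(x)$ followed by normalization.

This is not bijective and cannot be used for HMC. However, it acts mostly coordinate-wise (except for the final normalization) and is thus appropriate for coordinate-wise optimization algorithms.

class torch.distributions.transforms.StackTransform(tseq, dim=0, cache_size=0)[source]¶

Transform functor that applies a sequence of transforms tseq component-wise to each submatrix at dim, in a way compatible with torch.stack().

Example

import torch
from torch.distributions.transforms import ExpTransform, StackTransform, identity_transform

# torch.arange(1.0, 11.0) yields the ten values 1..10 (torch.range is deprecated)
x = torch.stack([torch.arange(1.0, 11.0), torch.arange(1.0, 11.0)], dim=1)
t = StackTransform([ExpTransform(), identity_transform], dim=1)
y = t(x)

class torch.distributions.transforms.StickBreakingTransform(cache_size=0)[source]¶

Transform from unconstrained space to the simplex of one additional dimension via a stick-breaking process.

This transform arises as an iterated sigmoid transform in a stick-breaking construction of the Dirichlet distribution: the first logit is transformed via sigmoid into the first probability and the probability of everything else, and then the process recurses.

This is bijective and appropriate for use in HMC; however, it mixes coordinates together and is less appropriate for optimization.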
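A minimal sketch of the stick-breaking map, assuming a three-dimensional unconstrained input: the output gains one dimension, lies on the simplex, and can be inverted.

```python
import torch
from torch.distributions.transforms import StickBreakingTransform

torch.manual_seed(0)

t = StickBreakingTransform()

x = torch.randn(3)  # unconstrained, 3 coordinates

# Point on the simplex with one additional dimension (4 coordinates).
y = t(x)

# Bijectivity: the inverse recovers the unconstrained input.
x_back = t.inv(y)
```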

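class torch.distributions.transforms.Transform(cache_size=0)[source]¶

Abstract class for invertible transformations with computable log det Jacobians. They are primarily used in torch.distributions.TransformedDistribution.

Caching is useful for transforms whose inverses are either expensive or numerically unstable. Note that care must be taken with memoized values, since the autograd graph may be reversed. For example, the following works with or without caching:

y = t(x)
t.log_abs_det_jacobian(x, y).backward()  # x will receive gradients.

However, the following will error when caching, due to dependency reversal:

y = t(x)
z = t.inv(y)
grad(z.sum(), [y])  # error because z is x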
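The comment "z is x" in the snippet above can be made concrete. The sketch below uses ExpTransform as an arbitrary example and compares a cached and an uncached instance; the flag names are illustrative only.

```python
import torch
from torch.distributions.transforms import ExpTransform

torch.manual_seed(0)
x = torch.randn(3, requires_grad=True)

# With cache_size=1 the inverse call is served from the cache,
# so it returns the very same tensor object that was passed forward.
cached = ExpTransform(cache_size=1)
y = cached(x)
z = cached.inv(y)
cached_roundtrip_is_x = z is x

# With the default cache_size=0 the inverse is recomputed as log(y0),
# giving a new tensor that depends on y0 in the autograd graph.
uncached = ExpTransform()
y0 = uncached(x)
z0 = uncached.inv(y0)
uncached_roundtrip_is_x = z0 is x
```

In the cached case z does not depend on y in the autograd graph, which is why asking for the gradient of z with respect to y fails.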

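Derived classes should implement one or both of _call() or _inverse(). Derived classes that set bijective=True should also implement log_abs_det_jacobian().

Parameters

cache_size (int) – Size of the cache. If zero, no caching is done. If one, the latest single value is cached. Only 0 and 1 are supported.

Variables

domain (Constraint) – The constraint representing valid inputs to this transform.
codomain (Constraint) – The constraint representing valid outputs of this transform, which are inputs to the inverse transform.
bijective (bool) – Whether this transform is bijective. A transform t is bijective iff t.inv(t(x)) == x and t(t.inv(y)) == y for every x in the domain and y in the codomain. Transforms that are not bijective should at least maintain the weaker pseudoinverse properties t(t.inv(t(x))) == t(x) and t.inv(t(t.inv(y))) == t.inv(y).
sign (int or Tensor) – For bijective univariate transforms, this should be +1 or -1 depending on whether the transform is monotone increasing or decreasing.

property inv¶: Returns the inverse Transform of this transform. This should satisfy t.inv.inv is t.

property sign¶: Returns the sign of the determinant of the Jacobian, if applicable. In general this only makes sense for bijective transforms.

log_abs_det_jacobian(x, y)[source]¶: Computes the log det Jacobian log |dy/dx| given input and output.

forward_shape(shape)[source]¶: Infers the shape of the forward computation, given the input shape. Defaults to preserving shape.

inverse_shape(shape)[source]¶: Infers the shape of the inverse computation, given the output shape. Defaults to preserving shape.

Constraints¶

The following constraints are implemented: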

constraints.boolean
constraints.cat
constraints.corr_cholesky
constraints.dependent
constraints.greater_than(lower_bound)
constraints.greater_than_eq(lower_bound)
constraints.independent(constraint, reinterpreted_batch_ndims)
constraints.integer_interval(lower_bound, upper_bound)
constraints.interval(lower_bound, upper_bound)
constraints.less_than(upper_bound)
constraints.lower_cholesky
constraints.lower_triangular
constraints.multinomial
constraints.nonnegative
constraints.nonnegative_integer
constraints.one_hot
constraints.positive_integer
constraints.positive
constraints.positive_semidefinite
constraints.positive_definite
constraints.real_vector
constraints.real
constraints.simplex
constraints.symmetric
constraints.stack
constraints.square
constraints.unit_interval

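class torch.distributions.constraints.Constraint[source]¶

Abstract base class for constraints.

A constraint object represents a region over which a variable is valid, e.g. within which a variable can be optimized.

Variables

is_discrete (bool) – Whether the constrained space is discrete. Defaults to False.
event_dim (int) – Number of rightmost dimensions that together define an event. The check() method removes this many dimensions when computing validity.

check(value)[source]¶: Returns a byte tensor of shape sample_shape + batch_shape indicating whether each event in value satisfies this constraint.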
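The role of event_dim in check() shows up in the shape of the result. A minimal sketch with a hand-picked 2 x 2 tensor; in the PyTorch versions this sketch assumes, the returned mask is a boolean tensor.

```python
import torch
from torch.distributions import constraints

value = torch.tensor([[0.2, 0.8],
                      [0.5, 0.6]])

# unit_interval has event_dim 0: one result per element, shape (2, 2).
in_unit_interval = constraints.unit_interval.check(value)

# simplex has event_dim 1: each row is one event, so the result has shape (2,).
# The first row sums to 1; the second row sums to 1.1 and is rejected.
on_simplex = constraints.simplex.check(value)
```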

torch.distributions.constraints.cat¶: alias of _Cat

torch.distributions.constraints.dependent_property¶: alias of _DependentProperty

torch.distributions.constraints.greater_than¶: alias of _GreaterThan

torch.distributions.constraints.greater_than_eq¶: alias of _GreaterThanEq

torch.distributions.constraints.independent¶: alias of _IndependentConstraint

torch.distributions.constraints.integer_interval¶: alias of _IntegerInterval

torch.distributions.constraints.interval¶: alias of _Interval

torch.distributions.constraints.half_open_interval¶: alias of _HalfOpenInterval

torch.distributions.constraints.less_than¶: alias of _LessThan

torch.distributions.constraints.multinomial¶: alias of _Multinomial

torch.distributions.constraints.stack¶: alias of _Stack

Constraint Registry¶

PyTorch provides two global ConstraintRegistry objects that link Constraint objects to Transform objects. Both take a constraint as input and return a transform, but they offer different guarantees on bijectivity.

biject_to(constraint) looks up a bijective Transform from constraints.real to the given constraint. The returned transform is guaranteed to have .bijective = True and should implement .log_abs_det_jacobian().
transform_to(constraint) looks up a not-necessarily bijective Transform from constraints.real to the given constraint. The returned transform is not guaranteed to implement .log_abs_det_jacobian().

The transform_to() registry is useful for performing unconstrained optimization on constrained parameters of probability distributions, which are indicated by each distribution's .arg_constraints dict. These transforms often overparameterize a space in order to avoid rotation; they are thus more suitable for coordinate-wise optimization algorithms like Adam:

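import torch
from torch.distributions import Normal, transform_to

data = torch.randn(100)  # placeholder observations
loc = torch.zeros(100, requires_grad=True)
unconstrained = torch.zeros(100, requires_grad=True)
scale = transform_to(Normal.arg_constraints['scale'])(unconstrained)
loss = -Normal(loc, scale).log_prob(data).sum()

The biject_to() registry is useful for Hamiltonian Monte Carlo, where samples from a probability distribution with constrained .support are propagated in an unconstrained space, and algorithms are typically rotation invariant:

import torch
from torch.distributions import Exponential, biject_to

rate = torch.ones(100)  # placeholder rate parameters
dist = Exponential(rate)
unconstrained = torch.zeros(100, requires_grad=True)
sample = biject_to(dist.support)(unconstrained)
potential_energy = -dist.log_prob(sample).sum()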

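Note

An example where transform_to and biject_to differ is constraints.simplex: transform_to(constraints.simplex) returns a SoftmaxTransform that simply exponentiates and normalizes its inputs; this is a cheap and mostly coordinate-wise operation appropriate for algorithms like SVI. In contrast, biject_to(constraints.simplex) returns a StickBreakingTransform that bijects its input down to a one-fewer-dimensional space; this is a more expensive and less numerically stable transform, but it is needed for algorithms like HMC.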
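The difference described in this note can be observed directly. A minimal sketch, assuming a three-dimensional unconstrained input:

```python
import torch
from torch.distributions import biject_to, constraints, transform_to
from torch.distributions.transforms import SoftmaxTransform, StickBreakingTransform

torch.manual_seed(0)

soft = transform_to(constraints.simplex)   # a SoftmaxTransform
stick = biject_to(constraints.simplex)     # a StickBreakingTransform

x = torch.randn(3)

# Softmax keeps the dimension: 3 inputs map to 3 probabilities,
# so one degree of freedom is redundant (overparameterized).
p_soft = soft(x)

# Stick-breaking is bijective: 3 inputs map to 4 probabilities.
p_stick = stick(x)
```

Both outputs sum to one, but only the stick-breaking map reports bijective = True.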

The biject_to and transform_to objects can be extended with user-defined constraints and transforms using their .register() method, either as a function on singleton constraints:

transform_to.register(my_constraint, my_transform)

or as a decorator on parameterized constraints:

@transform_to.register(MyConstraintClass)
def my_factory(constraint):
    assert isinstance(constraint, MyConstraintClass)
    return MyTransform(constraint.param1, constraint.param2)

You can create your own registry by creating a new ConstraintRegistry object.

class torch.distributions.constraint_registry.ConstraintRegistry[source]¶

Registry to link constraints to transforms.

register(constraint, factory=None)[source]¶

Registers a Constraint subclass in this registry. Usage:

@my_registry.register(MyConstraintClass)
def construct_transform(constraint):
    assert isinstance(constraint, MyConstraintClass)
    return MyTransform(constraint.arg_constraints)

Parameters

constraint (subclass of Constraint) – A subclass of Constraint, or a singleton object of the desired class.
factory (Callable) – A callable that takes a constraint object as input and returns a Transform object.
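A self-contained sketch of a custom registry follows. The names my_registry and greater_than_factory are hypothetical, and the choice of an exp-then-shift transform for constraints.greater_than is an assumption made for illustration, not necessarily what the built-in registries use.

```python
import torch
from torch.distributions import constraints
from torch.distributions.constraint_registry import ConstraintRegistry
from torch.distributions.transforms import (
    AffineTransform,
    ComposeTransform,
    ExpTransform,
)

my_registry = ConstraintRegistry()


# constraints.greater_than is a parameterized constraint class, so the
# factory receives the concrete constraint object and reads its lower_bound.
@my_registry.register(constraints.greater_than)
def greater_than_factory(constraint):
    return ComposeTransform(
        [ExpTransform(), AffineTransform(constraint.lower_bound, 1.0)]
    )


# Calling the registry looks up the factory by the type of the constraint.
t = my_registry(constraints.greater_than(2.0))
y = t(torch.tensor([-1.0, 0.0, 1.0]))  # exp(x) + 2, always above 2
```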
