MarlGroupMapType

torchrl.envs.MarlGroupMapType(value, names=None, *, module=None, qualname=None, type=None, start=1)

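MARL group map type.

As a feature of torchrl multi-agent, you can control how the agents in your environment are grouped. You can group agents together (stacking their tensors) to take advantage of vectorization when they pass through the same neural network. You can split agents into different groups when they are heterogeneous or should be processed by different neural networks. To define a grouping, pass a group_map when constructing the environment.

Otherwise, you can choose one of the premade grouping strategies from this class.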

  • With group_map=MarlGroupMapType.ALL_IN_ONE_GROUP and agents ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts going into and out of your environment will look like this:

    >>> print(env.rand_action(env.reset()))
    TensorDict(
        fields={
            agents: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([4, 9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([4, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([4, 3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([4]))},
        batch_size=torch.Size([]))
    >>> print(env.group_map)
    {"agents": ["agent_0", "agent_1", "agent_2", "agent_3"]}
    
  • With group_map=MarlGroupMapType.ONE_GROUP_PER_AGENT and agents ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts going into and out of your environment will look like this:

    >>> print(env.rand_action(env.reset()))
    TensorDict(
        fields={
            agent_0: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([])),
            agent_1: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([])),
            agent_2: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([])),
            agent_3: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([]))},
        batch_size=torch.Size([]))
    >>> print(env.group_map)
    {"agent_0": ["agent_0"], "agent_1": ["agent_1"], "agent_2": ["agent_2"], "agent_3": ["agent_3"]}
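
The two group maps printed above, and a custom one, can be sketched in plain Python. This is an illustration only and does not call torchrl: the helper names (all_in_one_group, one_group_per_agent, validate_group_map) and the custom group names are hypothetical, and the rule that each agent belongs to exactly one group is an assumption about what a sensible group_map looks like.

```python
from typing import Dict, List

GroupMap = Dict[str, List[str]]


def all_in_one_group(agent_names: List[str]) -> GroupMap:
    """Mirror the ALL_IN_ONE_GROUP output shown above: one group named "agents"."""
    return {"agents": list(agent_names)}


def one_group_per_agent(agent_names: List[str]) -> GroupMap:
    """Mirror the ONE_GROUP_PER_AGENT output shown above: one group per agent."""
    return {name: [name] for name in agent_names}


def validate_group_map(group_map: GroupMap, agent_names: List[str]) -> None:
    """Raise ValueError unless every agent sits in exactly one non-empty group.

    Assumes agent names are unique.
    """
    seen: List[str] = []
    for group, members in group_map.items():
        if not members:
            raise ValueError(f"group {group!r} is empty")
        seen.extend(members)
    if sorted(seen) != sorted(agent_names):
        raise ValueError("each agent must appear in exactly one group")


agents = ["agent_0", "agent_1", "agent_2", "agent_3"]

# Hypothetical custom split: two agents share a network, two are handled alone.
custom_map = {
    "team": ["agent_0", "agent_1"],
    "scout": ["agent_2"],
    "keeper": ["agent_3"],
}

for group_map in (all_in_one_group(agents), one_group_per_agent(agents), custom_map):
    validate_group_map(group_map, agents)
    print(group_map)
```

A dictionary shaped like custom_map is the kind of value you would pass as group_map at environment construction. Agents listed under the same key would then have their tensors stacked along a leading dimension, as in the ALL_IN_ONE_GROUP output above.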
    
