vllm.model_executor.layers.mamba.mamba_utils
MambaStateShapeCalculator
Source code in vllm/model_executor/layers/mamba/mamba_utils.py
extra_groups_for_head_shards classmethod
Compute how many extra groups must be added, by replication, so that the groups can be sharded alongside the head shards.
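As a rough illustration of the idea, the sketch below shows how such a padding count could be computed. The parameter names ngroups and tp_size, and the exact formula, are assumptions for illustration, not necessarily the actual signature or implementation.

```python
def extra_groups_sketch(ngroups: int, tp_size: int) -> int:
    """Minimal sketch (assumed names): number of extra groups to add, by
    replication, so groups can be sharded together with the head shards."""
    # Groups already divide evenly across TP ranks: no replication needed.
    if ngroups % tp_size == 0:
        return 0
    # Otherwise pad with replicated groups so every rank owns whole groups.
    # Assumes ngroups < tp_size in this branch (the typical case, e.g.
    # ngroups == 1 with tp_size == 8 needs 7 extra copies).
    return tp_size - ngroups
```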
linear_attention_state_shape classmethod
linear_attention_state_shape(
    num_heads: int, tp_size: int, head_dim: int
) -> tuple[tuple[int, int, int], ...]
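A hedged usage sketch with illustrative values (num_heads=32, tp_size=4, head_dim=128) that are not taken from any particular model. Per the return annotation, the call yields a tuple of 3-D per-rank state shapes; the head-sharded interpretation in the comments is an assumption.

```python
from vllm.model_executor.layers.mamba.mamba_utils import MambaStateShapeCalculator

# Illustrative values only; heads are assumed to be sharded across TP ranks.
shapes = MambaStateShapeCalculator.linear_attention_state_shape(
    num_heads=32, tp_size=4, head_dim=128
)
for shape in shapes:
    # Each entry is a 3-D per-rank recurrent-state shape, presumably on the
    # order of (num_heads // tp_size, head_dim, head_dim).
    print(shape)
```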
mamba1_state_shape classmethod
mamba1_state_shape(
    tp_world_size: int,
    intermediate_size: int,
    state_size: int,
    conv_kernel: int,
    use_v1: bool = True,
) -> tuple[tuple[int, int], tuple[int, int]]
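A usage sketch with placeholder hyperparameters for a Mamba-1 style layer (illustrative only). Per the return annotation, the call returns two 2-D shapes; that these correspond to the causal-conv1d cache and the SSM (temporal) state for one TP rank is an assumption based on the Mamba-1 design.

```python
from vllm.model_executor.layers.mamba.mamba_utils import MambaStateShapeCalculator

# Placeholder hyperparameters for a Mamba-1 style layer (illustrative only).
conv_shape, temporal_shape = MambaStateShapeCalculator.mamba1_state_shape(
    tp_world_size=1,
    intermediate_size=4096,
    state_size=16,
    conv_kernel=4,
)
# Two 2-D per-rank shapes come back; presumably the convolution cache and
# the selective-SSM state, respectively.
print(conv_shape, temporal_shape)
```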
mamba2_state_shape classmethod
mamba2_state_shape(
    tp_world_size: int,
    intermediate_size: int,
    n_groups: int,
    num_heads: int,
    head_dim: int,
    state_size: int,
    conv_kernel: int,
    use_v1: bool = True,
) -> tuple[tuple[int, int], tuple[int, int, int]]
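A usage sketch with values roughly in the range of common Mamba-2 configurations; the numbers are illustrative and not tied to a specific checkpoint. Per the return annotation, the call returns a 2-D conv-state shape and a 3-D temporal-state shape for one TP rank; the comments on their meaning are assumptions.

```python
from vllm.model_executor.layers.mamba.mamba_utils import MambaStateShapeCalculator

# Illustrative Mamba-2 style hyperparameters (not from a specific model).
conv_shape, temporal_shape = MambaStateShapeCalculator.mamba2_state_shape(
    tp_world_size=2,
    intermediate_size=8192,
    n_groups=8,
    num_heads=128,
    head_dim=64,
    state_size=128,
    conv_kernel=4,
)
# Per the return annotation: a 2-D conv-cache shape and a 3-D SSM-state
# shape (presumably heads x head_dim x state_size) for a single TP rank.
print(conv_shape, temporal_shape)
```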