vllm.model_executor.pooling_metadata
PoolingMetadata
Metadata for pooling operations in the Pooler layer.
This class holds the information the Pooler layer needs to perform pooling: the sequence groups with their pooling parameters, the per-sequence data, and the prompt lengths.
Attributes:
Name | Type | Description |
---|---|---|
seq_groups | | List of (seq_ids, pooling_params). |
seq_data | | A mapping of sequence ID to additional sequence data. |
prompt_lens | | List of prompt lengths, one per prompt. |
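
As a rough illustration of how the three attributes relate, the sketch below builds a stand-in object with the same field names. It does not import vLLM: `SketchPoolingMetadata`, `SketchPoolingParams`, and the `normalize` field are hypothetical names made up for this example, and the concrete Python types are an assumption inferred from the descriptions in the table.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SketchPoolingParams:
    """Hypothetical stand-in for per-group pooling parameters."""
    normalize: bool = True  # made-up field, for illustration only


@dataclass
class SketchPoolingMetadata:
    """Stand-in that mirrors the three documented attribute names."""
    seq_groups: list[tuple[list[int], SketchPoolingParams]]
    seq_data: dict[int, Any]
    prompt_lens: list[int]


# Two prompts: sequence 0 has 3 tokens, sequence 1 has 5 tokens.
prompts = {0: [101, 2023, 102], 1: [101, 2003, 1037, 3231, 102]}

metadata = SketchPoolingMetadata(
    # One group per sequence, each paired with its own pooling params.
    seq_groups=[([seq_id], SketchPoolingParams()) for seq_id in prompts],
    # Sequence ID -> per-sequence data (here simply the token IDs).
    seq_data=prompts,
    # One length per prompt, in the same order as the prompts.
    prompt_lens=[len(tokens) for tokens in prompts.values()],
)

# Walk the groups and look up each sequence's data by its ID.
for seq_ids, params in metadata.seq_groups:
    for seq_id in seq_ids:
        print(seq_id, len(metadata.seq_data[seq_id]), params.normalize)
```

The point of the layout is that `seq_groups` carries the grouping and parameters, while `seq_data` is looked up by the sequence IDs found in each group.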
Source code in vllm/model_executor/pooling_metadata.py
PoolingTensors dataclass
Tensors for pooling.
from_pooling_metadata classmethod
from_pooling_metadata(
    pooling_metadata: PoolingMetadata,
    device: device,
) -> PoolingTensors
Create PoolingTensors from PoolingMetadata.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pooling_metadata | PoolingMetadata | PoolingMetadata instance to convert. | required |
device | device | Device to store the tensors. | required |
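
The classmethod-constructor pattern described above can be sketched without vLLM or PyTorch. Everything here is a stand-in: `SketchPoolingMetadata` and `SketchPoolingTensors` are hypothetical names, a tuple plus a device label replaces a real tensor, and the choice of `prompt_lens` as the converted field is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SketchPoolingMetadata:
    """Stand-in holding only the field this sketch reads."""
    prompt_lens: list[int]


@dataclass
class SketchPoolingTensors:
    """Stand-in: a tuple and a device label instead of a real tensor."""
    prompt_lens: tuple[int, ...]
    device: str

    @classmethod
    def from_pooling_metadata(
        cls, pooling_metadata: SketchPoolingMetadata, device: str
    ) -> "SketchPoolingTensors":
        # Copy the per-prompt lengths out of the metadata and tag them
        # with the target device, mirroring the documented parameters.
        return cls(
            prompt_lens=tuple(pooling_metadata.prompt_lens),
            device=device,
        )


tensors = SketchPoolingTensors.from_pooling_metadata(
    SketchPoolingMetadata(prompt_lens=[3, 5]), device="cpu"
)
print(tensors)
```

Keeping the conversion in a classmethod means callers only pass the metadata and a target device, and the details of building the tensor container stay in one place.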