vllm.entrypoints.openai.serving_embedding
EmbeddingMixin
Bases: OpenAIServing
_build_response
_build_response(
    ctx: ServeContext,
) -> Union[EmbeddingResponse, ErrorResponse]
_preprocess async
_preprocess(ctx: ServeContext) -> Optional[ErrorResponse]
OpenAIServingEmbedding
Bases: EmbeddingMixin
chat_template_content_format instance-attribute
chat_template_content_format: Final = (
    chat_template_content_format
)
__init__
__init__(
    engine_client: EngineClient,
    model_config: ModelConfig,
    models: OpenAIServingModels,
    *,
    request_logger: Optional[RequestLogger],
    chat_template: Optional[str],
    chat_template_content_format: ChatTemplateContentFormatOption,
) -> None
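For orientation, a minimal wiring sketch (not from the vLLM source): build_embedding_handler and its None defaults are illustrative assumptions, and in practice the API server's startup code constructs this handler.

from vllm.config import ModelConfig
from vllm.engine.protocol import EngineClient
from vllm.entrypoints.openai.serving_embedding import OpenAIServingEmbedding
from vllm.entrypoints.openai.serving_models import OpenAIServingModels

def build_embedding_handler(
    engine_client: EngineClient,
    model_config: ModelConfig,
    models: OpenAIServingModels,
) -> OpenAIServingEmbedding:
    # Keyword-only arguments mirror the signature above; None falls back
    # to the model's own chat template and disables request logging.
    return OpenAIServingEmbedding(
        engine_client,
        model_config,
        models,
        request_logger=None,
        chat_template=None,
        chat_template_content_format="auto",
    )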
_create_pooling_params
_create_pooling_params(
    ctx: ServeContext[EmbeddingRequest],
) -> Union[PoolingParams, ErrorResponse]
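A hedged sketch of what such a factory produces, assuming PoolingParams accepts the optional dimensions field that an embedding request can carry (make_pooling_params is a hypothetical helper, not the vLLM implementation):

from typing import Optional

from vllm.pooling_params import PoolingParams

def make_pooling_params(dimensions: Optional[int] = None) -> PoolingParams:
    # dimensions, when set on the request, asks the pooler to truncate
    # the returned vector (matryoshka-style); None keeps the full width.
    return PoolingParams(dimensions=dimensions)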
_validate_request
_validate_request(
    ctx: ServeContext[EmbeddingRequest],
) -> Optional[ErrorResponse]
create_embedding async
create_embedding(
    request: EmbeddingRequest,
    raw_request: Optional[Request] = None,
) -> Union[EmbeddingResponse, ErrorResponse]
Embedding API compatible with OpenAI's Embedding API.
See https://platform.openai.com/docs/api-reference/embeddings/create for the API specification.
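Because the endpoint is OpenAI-compatible, it can be exercised with the official openai client pointed at a running vLLM server; the model name and port below are illustrative.

from openai import OpenAI

# vllm serve <model> exposes /v1/embeddings on port 8000 by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.embeddings.create(
    model="intfloat/e5-mistral-7b-instruct",  # example embedding model
    input=["Hello, world!", "vLLM serves embeddings too."],
    encoding_format="float",  # or "base64"
)
print(len(resp.data), len(resp.data[0].embedding))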
_get_embedding
_get_embedding(
    output: EmbeddingOutput,
    encoding_format: Literal["float", "base64"],
) -> Union[list[float], str]
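A minimal sketch of the two encodings, assuming the base64 path packs little-endian float32 bytes as in the OpenAI wire format (hypothetical re-implementation for illustration, not the vLLM source):

import base64
from typing import Literal, Union

import numpy as np

def get_embedding(
    values: list[float],
    encoding_format: Literal["float", "base64"],
) -> Union[list[float], str]:
    if encoding_format == "float":
        return values
    # base64 path: pack as little-endian float32 ("<f4") and
    # base64-encode, matching OpenAI's compact embedding encoding.
    packed = np.asarray(values, dtype="<f4").tobytes()
    return base64.b64encode(packed).decode("utf-8")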