vllm-project

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Trusted Project Python
81.2k stars via trending-python