LLM Inference
LLM inference is the platform's service for managing and deploying large language models. The new module supports prefill-decode (PD) separation, which runs the compute-bound prefill phase and the memory-bandwidth-bound decode phase on separate worker pools so the two phases no longer contend for the same hardware, improving inference efficiency. The platform also strengthens core service deployment capabilities, including version management and manual and automatic scaling.
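To illustrate the idea behind PD separation, here is a minimal, hypothetical sketch in Python. None of these class names come from the platform; `PrefillWorker`, `DecodeWorker`, and `PDRouter` are illustrative only, and the "KV cache" handed between the two pools is modeled as a plain token list rather than real attention state.

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    max_new_tokens: int

class PrefillWorker:
    """Hypothetical compute-bound stage: processes the whole prompt in one pass."""
    def run(self, req: Request) -> list[str]:
        # Return the "KV cache" (modeled here as the tokenized prompt)
        # that must be transferred to a decode worker.
        return req.prompt.split()

class DecodeWorker:
    """Hypothetical memory-bound stage: emits output tokens one at a time."""
    def run(self, kv_cache: list[str], max_new_tokens: int) -> list[str]:
        # A real decoder would attend over kv_cache at every step;
        # here we just emit placeholder tokens.
        return [f"tok{i}" for i in range(max_new_tokens)]

class PDRouter:
    """Routes each request through separate prefill and decode pools."""
    def __init__(self) -> None:
        self.prefill_pool = [PrefillWorker()]
        self.decode_pool = [DecodeWorker()]

    def serve(self, req: Request) -> list[str]:
        kv = self.prefill_pool[0].run(req)                       # phase 1: prefill
        return self.decode_pool[0].run(kv, req.max_new_tokens)   # phase 2: decode

out = PDRouter().serve(Request("hello world", 3))
```

Because each pool can be sized and scaled independently, long prompts queued for prefill do not stall the token-by-token decoding of requests already in flight.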