Top LLM and Deep Learning Inference Engines - Curated List
- Published May 19, 2024
- Inference engines like DeepSpeed, FasterTransformer, and vLLM accelerate prediction generation from large language models (LLMs) by optimizing computation and memory usage during inference. These engines are particularly useful when models are deployed for real-time applications that require fast, efficient processing of large volumes of data.
⭐️ Contents ⭐️
1. FasterTransformer: github.com/NVIDIA/FasterTrans...
2. DeepSpeed: github.com/microsoft/DeepSpeed
3. TensorRT: github.com/NVIDIA/TensorRT
4. vLLM: github.com/vllm-project/vllm
5. OpenVINO™: github.com/openvinotoolkit/op...
6. Flash-Attention: github.com/Dao-AILab/flash-at...
7. TVM: github.com/apache/tvm
8. ONNX Runtime: github.com/microsoft/onnxruntime
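A minimal, engine-agnostic sketch of one memory/compute optimization that several of the engines above (e.g. vLLM, FasterTransformer) build on: KV caching. The function names and token counts below are illustrative, not from any listed project; the sketch just counts key/value projections to show why caching matters in autoregressive decoding.

```python
# Illustrative sketch: why KV caching cuts autoregressive inference cost.
# Without a cache, keys/values for the entire prefix are recomputed at
# every decoding step; with a cache, each position is projected once.

def decode_cost_without_cache(prompt_len: int, new_tokens: int) -> int:
    """K/V projections recomputed over the full prefix at each step."""
    total = 0
    for step in range(new_tokens):
        total += prompt_len + step + 1  # all positions seen so far
    return total

def decode_cost_with_cache(prompt_len: int, new_tokens: int) -> int:
    """Prefix K/V computed once; each step projects only the new token."""
    return prompt_len + new_tokens

print(decode_cost_without_cache(512, 128))  # 73792 projections
print(decode_cost_with_cache(512, 128))     # 640 projections
```

For a 512-token prompt generating 128 tokens, caching reduces K/V projection work by over 100x; the trade-off is the memory to hold the cache, which is what paged attention in vLLM manages.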
___________________________________________________________________________
🔔 Get our Newsletter and Featured Articles: abonia1.github.io/newsletter/
🔗 LinkedIn: /aboniasojasingarayar
🔗 Find me on GitHub: github.com/Abonia1
🔗 Medium Articles: /abonia