Top LLM and Deep Learning Inference Engines - Curated List

  • Published May 19, 2024
  • Inference engines such as DeepSpeed, FasterTransformer, and vLLM accelerate prediction generation from large language models (LLMs) by optimizing computation and memory usage during inference. They are particularly useful when models are deployed for real-time applications that demand fast, efficient processing of large volumes of data (a minimal usage sketch follows the list below).
    ⭐️ Contents ⭐️
    1. FasterTransformer: github.com/NVIDIA/FasterTrans...
    2. DeepSpeed: github.com/microsoft/DeepSpeed
    3. TensorRT: github.com/NVIDIA/TensorRT
    4. vLLM: github.com/vllm-project/vllm
    5. OpenVINO™: github.com/openvinotoolkit/op...
    6. Flash-Attention: github.com/Dao-AILab/flash-at...
    7. TVM: github.com/apache/tvm
    8. ONNX Runtime: github.com/microsoft/onnxruntime
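
    To illustrate what these engines look like in practice, here is a minimal, hedged sketch of offline batched inference with vLLM (item 4 above). It assumes vLLM is installed and uses `facebook/opt-125m` purely as a small placeholder checkpoint; swap in any model vLLM supports.

    ```python
    # Minimal vLLM sketch: offline batched generation.
    # Assumptions: vLLM is installed; "facebook/opt-125m" is a placeholder model.
    from vllm import LLM, SamplingParams

    prompts = [
        "What does an LLM inference engine optimize?",
        "Explain KV-cache paging in one sentence.",
    ]
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    # LLM() loads the weights and sets up vLLM's paged KV-cache memory manager.
    llm = LLM(model="facebook/opt-125m")

    # generate() batches the prompts and schedules them for continuous batching.
    outputs = llm.generate(prompts, sampling_params)

    for out in outputs:
        print(out.prompt, "->", out.outputs[0].text.strip())
    ```

    By contrast, ONNX Runtime (item 8) is a general-purpose graph runtime rather than an LLM-specific server. A sketch under the assumption that a model has already been exported to a hypothetical `model.onnx` file with a single float32 input:

    ```python
    # Minimal ONNX Runtime sketch.
    # Assumptions: "model.onnx" exists and expects one float32 tensor input.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    inp = session.get_inputs()[0]
    # Replace any dynamic (non-integer) dimensions with 1 for the dummy input.
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]

    dummy_input = np.random.rand(*shape).astype(np.float32)
    result = session.run(None, {inp.name: dummy_input})  # None = return all outputs
    print(result[0].shape)
    ```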
    ___________________________________________________________________________
    🔔 Get our Newsletter and Featured Articles: abonia1.github.io/newsletter/
    🔗 Linkedin: / aboniasojasingarayar
    🔗 Find me on GitHub: github.com/Abonia1
    🔗 Medium Articles: / abonia

Comments • 2

  • @MishelMichel, 9 days ago (+3)

    Very informative, and your voice is very clear, Dr.