Self-host the Mixtral-8x7B MoE LLM on Mac and across devices: a fully portable 2 MB AI inference app
- Published Jan 15, 2024
- The device used in this demo is a Mac with an M2-series chip and 64 GB of RAM. Come and try it out!
Tutorial: www.secondstate.io/articles/m...
Run it with a single command: www.secondstate.io/run-llm/
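
For reference, a minimal sketch of what that one-liner typically expands to, assuming the LlamaEdge tooling described in the tutorial. The model filename, download URLs, and prompt-template name here are assumptions; verify them against the linked pages.

```
# Download the quantized Mixtral model and the portable chat app
# (both URLs are assumptions; use the ones given in the tutorial).
curl -LO https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf
curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-chat.wasm

# Run the chat app; WasmEdge's GGML plugin loads the model natively.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf \
  llama-chat.wasm -p mistral-instruct
```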
That's impressive. Congrats! Would it be possible to use WasmEdge+Mixtral as a replacement for LM Studio and run Open Interpreter with the command 'interpreter --local'? So, instead of LM Studio, it would use your implementation. That would be awesome.
Yes, absolutely! This article may be helpful: www.secondstate.io/articles/mixtral-8-7b/. Please let us know if you have any questions.
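
For the Open Interpreter route, a sketch under the same assumptions: start the OpenAI-compatible API server that ships with LlamaEdge, then point the client at it instead of LM Studio. The server wasm name, default port 8080, and the client flag are assumptions to check against the article and your Open Interpreter version.

```
# Start an OpenAI-compatible server backed by the local Mixtral model.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf \
  llama-api-server.wasm -p mistral-instruct

# In another terminal, point Open Interpreter at the local endpoint
# (the --api_base flag name may vary by Open Interpreter version).
interpreter --api_base http://localhost:8080/v1
```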
Does the WASM module store the whole model checkpoint within its memory space, and if so, how can a 7B model fit within WASM's 4 GB maximum linear memory? How does the WASM module use CUDA, or in this case, the Apple Metal APIs? Great video!
Any insight?
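
A hedged note for context: in this setup the wasm module does not keep the weights in its linear memory at all. It calls the WASI-NN host API, and WasmEdge's native GGML (llama.cpp) plugin loads and runs the model in host memory, so the 4 GB wasm limit only bounds the app's own buffers, and Metal or CUDA acceleration happens inside that native plugin rather than in the wasm module. That is also why the portable app itself can stay around 2 MB. The plugin is installed alongside the runtime:

```
# Install WasmEdge together with the WASI-NN GGML plugin. The plugin is a
# native llama.cpp build, so Metal (on macOS) or CUDA (on NVIDIA GPUs)
# support comes from the host plugin, not from the wasm module itself.
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh \
  | bash -s -- --plugins wasi_nn-ggml
```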