Don't use a model much larger than about 4 GB or you'll be sitting there for a while waiting on responses.
The Pi, while capable of running tiny LLMs, doesn't have a GPU that's usable for this, so inference is all CPU-driven, which is horribly inefficient for the task.
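For reference, a minimal sketch of what that looks like in practice, using llama-cpp-python for CPU-only inference (the model filename and thread count here are assumptions; substitute any quantized GGUF under ~4 GB):

```python
# Minimal CPU-only sketch using llama-cpp-python (pip install llama-cpp-python).
# The model path is a placeholder -- any quantized GGUF under ~4 GB should fit
# in the Pi's RAM; anything bigger will swap or crawl on the CPU.
from llama_cpp import Llama

llm = Llama(
    model_path="./tinyllama-1.1b-chat.Q4_K_M.gguf",  # hypothetical filename
    n_ctx=2048,    # modest context window to keep memory use down
    n_threads=4,   # match the Pi's four cores; there is no usable GPU path
)

out = llm("Q: Why run small models on a Raspberry Pi? A:", max_tokens=64)
print(out["choices"][0]["text"])
```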
Good point!
@VincentStevenson Joining their point: the Pi's CPU maxes out with LLMs (with BitNet models as a possible future exception).
The “Super” Nvidia board will do, or the Hailo-10 if it comes to the Raspberry Pi, as the Hailo-8L did.
Wow, I will try it!