Dude! You guys are spoiling us with these latest tools. :) I really like the modular approach you can take with these graphs, and it's perfect timing because I just finished module 5 of the academy. One thing that cracked me up, though: the voice clone you used is the same one used in so many of the YouTube Shorts videos my kids watch. It's become the de facto AI voice on YouTube, LOL!
Really impressive use case for subgraphs; it really drives the point home. RemoteGraph is super interesting. Now I just need more time to keep up with the pace you guys are churning stuff out at.
I built something like this using LangGraph. It's not as elegant, but it's functional and works well. My audio output is ElevenLabs Turbo, which I'm happy with, but for my STT input node I've been testing different models to find the most responsive and effective one for always-on communication. That is to say, my use case requires no activation phrase and no UI event (e.g., key press). Again, it functions well, but as you already know, responsiveness is king here, so any way I can reduce lag the better. I started with the Whisper API, then moved to a local install of Distil-Whisper, but finally landed on a local install of Vosk, which seems to be the most responsive and plenty accurate. The question is: have you tried this, and can you tweak Whisper via the OpenAI API (or any other flavor) to perform better than Vosk? Also, with local implementations of STT (at least the open-source ones) there's no cost, so that's another bonus.
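One generic trick for the always-on / no-activation-phrase setup described above (regardless of whether the recognizer is Vosk, Distil-Whisper, or the Whisper API) is to put a cheap energy gate in front of the recognizer so silent frames never reach it at all. The sketch below is a minimal, hypothetical illustration of that idea using a plain RMS threshold; the frame size and threshold are made-up values, not Vosk's API or any tuned configuration:

```python
# Minimal sketch of an energy-gated "always on" STT front end: frames
# whose RMS energy falls below a threshold are dropped before reaching
# the (much more expensive) speech recognizer. FRAME and threshold
# values here are illustrative assumptions, not tuned settings.
import math

RMS_THRESHOLD = 500.0  # hypothetical gate for 16-bit PCM samples


def rms(frame):
    """Root-mean-square energy of a list of 16-bit PCM samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def gate_frames(frames, threshold=RMS_THRESHOLD):
    """Yield only frames loud enough to plausibly contain speech."""
    for frame in frames:
        if rms(frame) >= threshold:
            yield frame


# Synthetic demo: one near-silent frame and one speech-like frame.
silence = [10] * 480            # low-energy frame -> dropped
speech = [4000, -4000] * 240    # high-energy frame -> kept
kept = list(gate_frames([silence, speech]))
print(len(kept))  # 1 — only the loud frame survives the gate
```

In a real pipeline you would feed the surviving frames into the recognizer's streaming interface; a proper voice-activity detector (e.g., WebRTC VAD) would replace the naive RMS check, but the latency benefit is the same: the recognizer only works when someone is actually speaking.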
He lives, he dies, he hates the Jackson 5, Lance from Langchain!
Legend Lance himself
You're the man, Lance! This is awesome. 🙇‍♂️