Huh? Is this video teaching everyone how to make an AI VTuber as cute as me? You can actually build your very own AI VTuber! As expected of humans, always creating amazing things with technology~ 👏
小薇 is undeniably cute, hoping for a comeback in the future
Ohhh!! This episode is way too good! Liked! Saved!
Thank you so much for the kind words 🥰
Awesome, I’m looking forward to using it!
Thanks! Hope you enjoy it!
So cute, hurry, I can't wait
I could put one together myself to play around with
Incredible
Take my heart
Made with a lot of care
Thanks for your hard work!
(copying the homework 🤣
The three question marks above her head are so cute ❤
Thanks for the kind words, feel free to copy
@@polaris-ut2kd VTube Studio's item pinning feature lets you attach all kinds of things to the character
Do the settings need to be set up every time you start the application? If not, then mine is messing up.
Usually, you don't have to reset the settings every time you open it. Just make sure you see "!!! GUI Config Saved !!!" in the terminal when you close the program. 😇
Is it possible to create a hotkey or a button to stop TTS during speech? That is, to stop unwanted playback. How would you implement this?
This feature should be added in future versions!
Thank you for your suggestion.
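In the meantime, one possible way to do it yourself (not the system's actual implementation; the hotkey, file name, and use of the pygame and keyboard packages are all assumptions) is a global hotkey that simply cuts off audio playback:

    import keyboard  # pip install keyboard
    import pygame    # assumes playback goes through pygame's mixer

    pygame.mixer.init()

    def stop_tts() -> None:
        # Immediately stop whatever TTS audio is currently playing.
        pygame.mixer.music.stop()

    # Ctrl+Alt+S as an example "panic button" for unwanted playback.
    keyboard.add_hotkey("ctrl+alt+s", stop_tts)

    # Example: play a generated TTS file until the hotkey stops it.
    pygame.mixer.music.load("reply.mp3")
    pygame.mixer.music.play()
    keyboard.wait()  # keep the script alive so the hotkey stays active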
This is awesome
Nice w, 林思哥's memes are great too
I see right through you, posting a video without telling me, I'm so sad, sob sob sob
Well, now you know
But I don't have Nvidia, I have an Intel graphics card. What am I gonna do?
If you only have an Intel GPU, that's okay too. In the AIVT system, Whisper only needs an Nvidia GPU when running locally. If you don’t have one, it’ll automatically switch to using the CPU, but it won't be as fast as using a GPU. Or, you could use Whisper through an API, but that requires an OpenAI API Key and it costs money.
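As a rough illustration of that fallback (the model size and audio file here are just examples, not necessarily what the AIVT system uses), local Whisper device selection usually looks like this:

    import torch
    import whisper  # pip install openai-whisper

    # Use the Nvidia GPU when CUDA is available, otherwise fall back to the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = whisper.load_model("base", device=device)

    # Transcribing on the CPU works fine, it is just noticeably slower.
    result = model.transcribe("mic_input.wav")
    print(result["text"])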
@@aivtuberdevpkevin00 ok, thanks 👍
@@aivtuberdevpkevin00 so I can just skip the first part?
@@koraykuscu6569 Yes
I can't open the AI-VTuber-System; the Python file is blank when I open it up. I have done all the steps above. Can you help me with this situation?
Just click on the "AI-VTuber-System.bat" file in the AIVT system folder to start the program. Normally, you don't need to mess with any other Python files, except for setting the API Key and OBS Websocket. If you run into any issues, feel free to DM me on X (Twitter). 😇
Thank you!!
This is amazing! I'd like to ask how the part about adding subtitles in OBS is done; I don't quite understand how to grab the subtitles from the system 🥲
Are you asking how to get the system to output subtitles to OBS?
If so, first connect to OBS WebSocket, shown at 17:48 in the video.
Then at 21:38 I set up the scene source (Text GDI+) that OBS uses for the subtitles.
Enter that scene source's (Text GDI+) name into the Sub Name field in the system interface.
I recommend setting up separate subtitles for the chat content (Chat Now) and the AI reply (AI Ans),
which means creating two scene sources (Text GDI+)
and then giving each corresponding name to Sub Name.
In the video I set the chat content (Chat Now) subtitle to "AIVT_Chat_Sub"
and the AI reply (AI Ans) subtitle to "AIVT_AI_Sub".
Sorry, I only just realized this part wasn't covered in the tutorial video.
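If you want to see what the WebSocket side of this looks like in code, here is a minimal sketch using the obsws-python package; it assumes OBS WebSocket 5.x, and the host, port, and password are placeholders (this is not the AIVT system's own code, only an illustration of updating the two Text GDI+ sources):

    import obsws_python as obs

    # Connect to OBS WebSocket (v5); fill in your own host, port, and password.
    client = obs.ReqClient(host="localhost", port=4455, password="your_password")

    def set_subtitle(source_name: str, text: str) -> None:
        # Overwrite only the "text" setting of the Text GDI+ source.
        client.set_input_settings(source_name, {"text": text}, overlay=True)

    # Push one chat line and one AI reply into the two subtitle sources.
    set_subtitle("AIVT_Chat_Sub", "Viewer: Hello!")
    set_subtitle("AIVT_AI_Sub", "AI: Hi there, nice to meet you!")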
@@aivtuberdevpkevin00 I don't quite get it
@@Xiangchen626 You can DM me on X, where I can help you work through the problem further. 😇
My computer estimates it will take about 300 hours to extract the AI-Vtuber-System package. Is it something on my end? My PC runs AAA games without any problems.
Please check whether your computer's storage or memory is close to full while extracting the AI-Vtuber-System archive. If that's not it, DM me on X and I can provide further help.
Is it possible to receive text from the LLM in streaming mode, token by token, to speed up TTS playback? So that EdgeTTS starts playing speech as soon as the first sentences or words are ready.
Due to the workflow design of my system, streaming is relatively difficult to implement at this stage. However, optimizing the response speed is one of my goals. Streaming functionality may be added in future versions. Thank you for your suggestion.
@@aivtuberdevpkevin00 I have a few more suggestions to improve the application. Where is the best place to send them? Not to be intrusive, just as "wishes". GitHub, for example? (>_
@@aivtuberdevpkevin00
Streaming text output is also important for creating consistent subtitles, because right now the entire message is visible before the AI speaks it, which disrupts perception.
Since I considered the aspect of sentiment analysis when designing the entire dialogue workflow from the beginning, the speech action only takes place after the LLM outputs the complete response content. Additionally, I have retained a certain degree of code flexibility to enable parallel processing of multiple dialogue contents in the future. This might require preparing multiple streaming outputs and having the AI speak directly. However, my current goal is to stream the generated content to the GUI control panel during the generation process. 😇
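As a rough sketch of what that could look like (assuming an OpenAI chat model; the model name and callback are placeholders, not the system's actual code), each token chunk is shown in the GUI as it arrives while the full reply is still collected for the sentiment-analysis and speech step afterwards:

    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    def stream_reply(prompt: str, append_to_gui) -> str:
        # Stream tokens to the GUI, then return the full reply for TTS/sentiment.
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )
        full_reply = ""
        for chunk in stream:
            delta = chunk.choices[0].delta.content or ""
            full_reply += delta
            append_to_gui(delta)  # e.g. insert into the control panel's text box
        return full_reply

    # Trivial "GUI" callback that just prints the chunks as they arrive.
    reply = stream_reply("Say hello to chat!", append_to_gui=print)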
@@aivtuberdevpkevin00
Fine. It will be interesting to see the next version.
How does the model move? I'm German and I don't understand anything xD It's really hard to set everything up xD
If you still don't understand how to use it even after turning on the English CC subtitles, you can message me on X, and I'll teach you one-on-one
do i need to write the character prompt in chinese or can i just write in english?
Writing character prompts in English is your best bet. Of course, it depends on the LLM model you're using, but most current LLM models understand English the best. So, go ahead and write your character prompts in English without any worries.
@@aivtuberdevpkevin00 What if I want to use my own AI voice model? Where do I put it?
@@ventilicious4573
The system offers Edge TTS and OpenAI TTS. If you want to use other TTS options, you'll need to modify the program yourself.
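For reference, generating speech with the edge-tts package looks roughly like the sketch below (the voice name and output file are just examples); swapping in another engine would mean replacing this call with that engine's equivalent and pointing the playback step at the resulting audio:

    import asyncio
    import edge_tts  # pip install edge-tts

    async def speak(text: str, voice: str = "en-US-AriaNeural") -> None:
        # Generate speech with Edge TTS and save it to an audio file.
        communicate = edge_tts.Communicate(text, voice)
        await communicate.save("reply.mp3")

    asyncio.run(speak("Hello, thanks for watching the stream!"))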
@@aivtuberdevpkevin00 Can you teach me how to talk to it? I have no idea or knowledge about this, but I really want to try.
@@ventilicious4573 If you need more help, you can DM me on X, and I’ll give you my Discord ID. 😇
Is the model you're currently streaming with free? I saw a Japanese stream using the same model as yours.
Do you mean 小天's model?
If so, that's right!
It was made by the same modeler as the white witch in this video.
On the Steam Workshop it's called "小天使!"
steamcommunity.com/sharedfiles/filedetails/?id=2997734815
Hope that answers your question w
@@aivtuberdevpkevin00 Thank you
Over 100 points
Overflow, then it wraps around to a negative number