Compact AI
Joined 11 Mar 2016
Welcome to Compact AI! This channel explores AI and programming across a wide range of topics, including NLP, speech processing, computer vision, and other deep learning applications, with a particular emphasis on local and edge AI solutions.
I focus on maximizing performance on CPUs and low-powered GPUs, primarily using PyTorch and Python, among other tools.
For those who wish to support my work, visit my Patreon for additional resources, code samples, and more. Subscribe for updates as I explore different models, techniques, and ways to apply them in AI-powered projects.
DearPyGUI Local Streaming Chat UI: Code Walkthrough & ReactJS Frontend Comparison
What is the best option for building a local AI application GUI? In this video, I demonstrate how to build a local chat GUI app in Python using DearPyGUI and compare it to another app UI frontend I created with ReactJS.
After exploring both, I conclude that for a multipurpose AI application, a web-based frontend offers the most styling flexibility and access to a wide range of resources. The chatbot backend here is a llama.cpp server running a custom LLM.
Have I missed any better options for building a local UI for an AI-powered multipurpose application? Let me know your thoughts in the comments!
Topics Covered:
- DearPyGUI on high-resolution screens: Windows DPI scaling, automatic fullscreen maximization with proper scaling, and custom themes and fonts (code walkthrough).
- Showing an async chatbot streaming response in the DearPyGUI Python GUI, using the OpenAI API format with a llama.cpp backend server.
- A custom Python message queue with an async task that checks for new chunks of the streaming response from the server.
- Issues encountered when building with DearPyGUI.
- Comparison of the local Python DearPyGUI UI with my web ReactJS frontend demo UI.
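The streaming and message-queue topics above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the video: `fake_stream` is a hypothetical stand-in for the llama.cpp server response (no network is used), and all function names are illustrative. It parses OpenAI-style `data:` chunks, pushes the text into a queue, and lets an async task poll that queue the way a GUI update loop might.

```python
import asyncio
import json
import queue


def extract_delta(sse_line):
    """Return the text fragment in one OpenAI-style streaming line, or None."""
    if not sse_line.startswith("data:"):
        return None
    payload = sse_line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    choices = json.loads(payload).get("choices") or [{}]
    return choices[0].get("delta", {}).get("content")


async def fake_stream():
    # Hypothetical stand-in for the server's streamed HTTP response.
    for word in ("Hello", ", ", "world"):
        yield "data: " + json.dumps({"choices": [{"delta": {"content": word}}]})
        await asyncio.sleep(0)
    yield "data: [DONE]"


async def producer(chunks):
    """Read the stream and push each text fragment into the queue."""
    async for line in fake_stream():
        text = extract_delta(line)
        if text:
            chunks.put(text)
    chunks.put(None)  # sentinel: the response is complete


async def consumer(chunks, on_text):
    """Poll the queue without blocking; call on_text for each new chunk."""
    while True:
        try:
            item = chunks.get_nowait()
        except queue.Empty:
            await asyncio.sleep(0.01)  # yield so other tasks (or a GUI) can run
            continue
        if item is None:
            return
        on_text(item)


async def run_demo():
    chunks = queue.Queue()
    shown = []  # stands in for the chat widget's text
    await asyncio.gather(producer(chunks), consumer(chunks, shown.append))
    return "".join(shown)
```

In a real app, `on_text` would append to the chat widget instead of a list, and `asyncio.run(run_demo())` would be replaced by whatever event loop integration the GUI uses.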
Get Involved:
If you found this helpful, consider liking, subscribing, and sharing with others as I delve into more AI-related topics. Share your thoughts on the video in the comments! Let’s explore the global potential of AI together as a part of the worldwide technology and AI community.
For those who want to support my work, consider visiting my Patreon for additional resources and code: www.patreon.com/CompactAI
Chapters:
00:00 Introduction
01:29 DearPyGUI Chat App Code Walkthrough
09:44 ReactJS Frontend Streaming Chat UI
Links:
www.patreon.com/CompactAI
[TODO: Upload DearPyGUI Demo App Code in Patreon]
github.com/hoffstadt/DearPyGui
react.dev/
vite.dev/
Hashtags: #webui #chatbot #aiapplications #nlp #llama #chatgpt #reactjs #ui #uidesign #python #programming #ai #deeplearning #languagemodel
Language: English.
Views: 44
Videos
PyTorch: MobileClip & ONNX GTE Models for Text & Image Embeddings
Views: 43, 3 months ago
If we cache embeddings from multiple full sentences, partial phrases, and individual words in a vector database, we can potentially avoid a forward pass for each similar query. Specifically, with a model that generates high-quality embeddings, we might achieve the desired embeddings through operations like addition and subtraction, without needing a forward pass for every text (or other modalit...
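The idea of composing cached embeddings can be sketched as below. This is a toy illustration only: the three-dimensional vectors in `TOY_CACHE` are made up for the example and are not output from MobileCLIP or GTE, and the helper names are hypothetical. A plain dictionary stands in for the vector database.

```python
import math


def combine(cache, plus, minus=()):
    """Add and subtract cached vectors without running a model."""
    dim = len(next(iter(cache.values())))
    out = [0.0] * dim
    for name in plus:
        out = [o + v for o, v in zip(out, cache[name])]
    for name in minus:
        out = [o - v for o, v in zip(out, cache[name])]
    return out


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(cache, query, exclude=()):
    """Name of the cached entry most similar to the query vector."""
    candidates = [name for name in cache if name not in exclude]
    return max(candidates, key=lambda name: cosine(cache[name], query))


# Made-up vectors, chosen only so the arithmetic is easy to follow.
TOY_CACHE = {
    "king": [1.0, 1.0, 0.0],
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "queen": [1.0, 0.0, 1.0],
}
```

With these toy values, `combine(TOY_CACHE, ["king", "woman"], ["man"])` lands on the cached "queen" vector. Whether real model embeddings compose this cleanly depends on the model.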
How Good is the Parler TTS Model Pronunciation
Views: 101, 3 months ago
In this video I test the Parler TTS Mini v1.1 model with the same sentence I used to test OpenVoice V2 voice cloning. Can this one pass the speech check that so far only the VITS VCTK model has passed? Get Involved: If you found this helpful, consider liking, subscribing, and sharing with others as I delve into more AI-related topics. Share your thoughts on the video in the comments! Let’s explore the ...
PyTorch: Exploring VAE Latent Channels of TAESD for SDXL and FLUX
Views: 43, 3 months ago
In this video I use modified TAESD PyTorch code in a Jupyter Python notebook to explore what information is stored in each of its latent channels. The code is available on my Patreon. Topics Covered: - TAESD FLUX and SDXL latent encoding and decoding. - What happens if only certain latent channels are used for decoding. Get Involved: If you found this helpful, consider liking, subscribing...
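The "only certain latent channels" experiment can be illustrated with a small masking helper. This is a sketch under assumptions, not the notebook code: it uses plain nested lists (channels x height x width) so it runs without PyTorch, and `keep_channels` is a hypothetical name. With real tensors the same masking would be applied along the channel dimension before calling the decoder.

```python
def keep_channels(latent, keep):
    """Return a copy of `latent` with every channel not in `keep` zeroed.

    `latent` is a list of channels, each a 2D list of floats.
    """
    keep = set(keep)
    return [
        [row[:] for row in channel] if index in keep
        else [[0.0] * len(row) for row in channel]
        for index, channel in enumerate(latent)
    ]
```

Decoding the masked latent and comparing it with the full decode shows what the kept channels contribute on their own.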
ComfyUI Multiple VAE Comparison Workflow
Views: 156, 3 months ago
In this video I show a ComfyUI workflow for comparing all popular VAEs used in text-to-image generation, including FLUX, Stable Diffusion, etc. Based on this comparison, the FLUX VAE looks better than the SD 3.5 VAE. VAE Order of Image Outputs: 1. Original. 2. TAESD. 3. TAESDXL. 4. TAE1F. 5. Stable Diffusion 1.5 MSE VAE. 6. SDXL VAE. 7. FLUX VAE. 8. Stable Diffusion 3.5 VAE. Topics Covered: - Encoding, decoding an...
Make Vertical YouTube Shorts from Landscape Videos with Kdenlive Animation
Views: 45, 3 months ago
In this video I show how to use animation in Kdenlive to turn landscape videos into vertical YouTube Shorts style videos. While the video focuses on making Shorts from landscape programming videos, it also applies to non-programming videos. Topics Covered: - Placing keyframes for smooth and fast transition between video frames. - Merging multiple videos into a single video. - Using kde...
Voice Cloning with OpenVoice V2
Views: 472, 3 months ago
In this programming video I test voice cloning with OpenVoice V2 and explore its issues. Topics Covered: - Testing English text to speech with various predefined OpenVoice accents. - Comparing the cloned English speech voice with the Coqui TTS VCTK VITS model. - I additionally discuss the "Could not locate cudnn_ops_infer64_8.dll" OpenVoice V2 crash when running on GPU, a CTranslate2 issue with Faster...
Instructions for Running Coqui and Melo TTS in Intel GPU
Views: 105, 3 months ago
In this video I give my understanding of how to run the code shown in my previous tutorials on an Intel GPU. As I do not have an Intel GPU, I cannot guarantee that it will run correctly. Topics Covered: - PyTorch 2.5 Intel GPU support. - Converting existing CUDA GPU PyTorch code to run on an Intel GPU with minimal effort. Get Involved: If you found this helpful, consider liking, subscribing, and s...
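One low-effort way to convert CUDA code is to choose the device string in a single place and pass it everywhere. A minimal sketch, assuming PyTorch 2.5 or later where Intel GPUs use the `"xpu"` device; the helper names are hypothetical, and the snippet falls back to `"cpu"` if PyTorch is not installed.

```python
def pick_device(cuda_ok, xpu_ok):
    """Prefer CUDA, then Intel XPU, then CPU."""
    if cuda_ok:
        return "cuda"
    if xpu_ok:
        return "xpu"
    return "cpu"


def detect_device():
    """Query PyTorch for available backends; return 'cpu' if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    xpu = getattr(torch, "xpu", None)  # absent on older PyTorch builds
    return pick_device(torch.cuda.is_available(), bool(xpu and xpu.is_available()))
```

Code that previously hard-coded `"cuda"` would then use `model.to(detect_device())` and move its input tensors to the same device.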
Kdenlive: Customize Layout for Making Youtube Shorts
Views: 91, 3 months ago
This video covers setting up Kdenlive for vertical video editing, for example YouTube Shorts. Topics Covered: - Setting up the layout in Kdenlive to edit vertical video, specifically in the YouTube Shorts format and resolution. - Setting up a preset for vertical 1080x1920 videos. - Creating custom layouts and organizing them. Get Involved: If you found this helpful, consider liking, subscribing, and sha...
Pitch Control and Noise Reduction VST in OBS Studio
Views: 51, 3 months ago
In this video I show how to use ReaPlugs VST to reduce noise and to control pitch, making the voice lighter or heavier. The voice is AI generated. Topics Covered: - Reaper ReaPlugs VST with OBS Studio for noise removal. - Using the ReaJS VST pitch control to make the voice heavier or lighter. Get Involved: If you found this helpful, consider liking, subscribing, and sharing with others as I delve into more AI-re...
1 Hour Audio Transcribed in 1 Minute with 1GB VRAM - whisper.cpp
Views: 171, 3 months ago
In this video I transcribe a 1-hour-long video with whisper.cpp using the Whisper v3 large turbo q5 quantized model, which requires only 1GB of VRAM. My previous video shows a fast whisper.cpp build process. Model and code links are below. This video uses an AI-generated voice from the VCTK VITS model. Topics Covered: - Using yt-dlp tool to download 2 hou...
Build whisper.cpp Faster for Nvidia GPU
Views: 231, 3 months ago
In this video I show how to build whisper.cpp for a CUDA GPU locally on Windows using Visual Studio. In case anything is unclear, I also have very similar build videos for llama.cpp and stable-diffusion.cpp. Topics Covered: - Using the CMake GUI application to build whisper.cpp to run on GPU. - Speeding up the build with architecture-specific compilation. Get Involved: If you found this helpful, consider liking, ...
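Architecture-specific compilation usually comes down to setting CMake's `CMAKE_CUDA_ARCHITECTURES` to match your GPU's compute capability, so only one architecture is compiled. A small sketch of that mapping; the helper name is hypothetical, and the compute capability for a given card should be looked up on Nvidia's CUDA GPUs page.

```python
def cuda_arch_flag(compute_capabilities):
    """Build the CMake argument for one or more compute capabilities.

    "8.6" becomes 86, "7.5" becomes 75, and so on.
    """
    archs = []
    for capability in compute_capabilities:
        major, _, minor = str(capability).partition(".")
        archs.append(f"{int(major)}{int(minor or 0)}")
    return "-DCMAKE_CUDA_ARCHITECTURES=" + ";".join(archs)
```

In the CMake GUI, the same value goes into the `CMAKE_CUDA_ARCHITECTURES` entry (for example `86`) before generating the Visual Studio project.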
All 109 AI Text to Speech Voices of VITS VCTK Model
Views: 82, 3 months ago
All 109 AI text-to-speech English voices of VITS trained on the VCTK dataset. The model can also be used from Coqui TTS with Python. This video was made programmatically in Python by first generating the voice WAV files, then merging them with images while keeping track of their timestamps. Images are generated for each speaker at exact timestamps to make the video without any manual effort. The code is available i...
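Keeping track of timestamps when concatenating clips can be done with a running total of clip durations. A minimal sketch using only the standard library, not the actual code behind the video; the function names are hypothetical.

```python
import wave


def wav_duration(path):
    """Length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def start_times(durations):
    """Start timestamp of each clip when the clips are played back to back."""
    starts, elapsed = [], 0.0
    for duration in durations:
        starts.append(elapsed)
        elapsed += duration
    return starts
```

Each speaker's image would then be shown from its start time until the next one, for example `start_times([wav_duration(p) for p in wav_paths])`, where `wav_paths` is your own list of generated files.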
Python: Realtime Human Level Text to Speech with only 1GB VRAM
Views: 236, 3 months ago
In this video I show CPU and GPU inference with the MeloTTS and Coqui TTS libraries in Python. Both run the models in near real time on my hardware and synthesize speech in a few seconds with around 1GB of VRAM. For Coqui TTS I use the VITS model trained on the VCTK dataset, which has 109 speakers all indistinguishable from human speech. As I understand Mel...
Build GPU stable-diffusion.cpp Faster and Generate FLUX Images
Views: 106, 3 months ago
Build stable-diffusion.cpp fast with Nvidia CUDA GPU support. FLUX q3_k GGUF text-to-image generates good quality 1024x1024 images with less than 8GB VRAM. With flash attention it can generate 1024x1024 images under 6GB VRAM with FLUX q3_k (not shown in this video). Topics Covered: - Build stable-diffusion.cpp project with CMake GUI and Visual Studio. - Run FLUX im...
Python: 4, 8 bit Text to Speech, Speech to Text, Speech to Speech SeamlessM4T Comparison
Views: 66, 3 months ago
GPU Specific llama.cpp Compilation: Massively Reduce Build Times
Views: 46, 3 months ago
Build and Run llama.cpp Locally for Nvidia GPU
Views: 237, 3 months ago
Kdenlive: Insert, Cut Video at Exact Timestamps, Speedup Segment & Text Overlay
Views: 39, 3 months ago
I love the way you narrate your explanation
Does it depend on the Microsoft Visual Studio and toolkit version? I am not getting a few options in CMake after configuring. I have Visual Studio 17 2022.
Any example of live speech to live speech and text translation?
Cool, I am working on implementing tts on my local ai setup on Linux. One of the only videos I watched where someone has coqui up and running with good examples. Thanks for the video
I'm currently working with this model and just hearing a full comparison of all the voices in one place has been tremendously helpful. Thanks for putting in the work to help us all!
thx for video. It helps me a lot.
sounds like the red queen
Links: www.patreon.com/CompactAI arxiv.org/pdf/2311.17049 huggingface.co/apple/MobileCLIP-B-LT huggingface.co/Alibaba-NLP/gte-base-en-v1.5
Links: www.patreon.com/CompactAI huggingface.co/parler-tts/parler-tts-mini-v1.1
Links: www.patreon.com/CompactAI www.patreon.com/posts/code-for-pytorch-115158865 github.com/madebyollin/taesd
Links: www.patreon.com/CompactAI www.patreon.com/posts/workflow-for-114956480 github.com/comfyanonymous/ComfyUI
Links: www.patreon.com/CompactAI th-cam.com/video/d9IHp_ohrZQ/w-d-xo.html kdenlive.org
Can We run it on CPU based system?
Yes. Set the device variable to "cpu" in the code, and in se_extractor.py replace model = WhisperModel(model_size, device="cuda", compute_type="float16") with model = WhisperModel(model_size, device="cpu", compute_type="float32"). I pointed out issues with this model in the video; there are surely better options you can try.
@compactai Thanks man. It worked.
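The change in the reply above amounts to switching two keyword arguments. A minimal sketch with a hypothetical helper name, assuming `WhisperModel` comes from the faster-whisper package; the actual model call is left as a comment so the snippet runs without that package installed.

```python
def whisper_kwargs(use_gpu):
    """Keyword arguments for WhisperModel on GPU or CPU."""
    if use_gpu:
        return {"device": "cuda", "compute_type": "float16"}
    return {"device": "cpu", "compute_type": "float32"}


# Usage inside se_extractor.py (not executed here):
# model = WhisperModel(model_size, **whisper_kwargs(use_gpu=False))
```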
Links: www.patreon.com/CompactAI github.com/myshell-ai/OpenVoice
❤
Links: www.patreon.com/CompactAI pytorch.org/docs/stable/notes/get_start_xpu.html github.com/pytorch/pytorch/releases/tag/v2.5.0 github.com/vosen/ZLUDA
Links: www.patreon.com/CompactAI kdenlive.org
Can you tell which model of coqui supports polly mathew
I have only tested the vits model so far.
@compactai Do they have that voice? The Coqui GitHub repo shares stats of voices, and there were Polly voices there too.
Links: www.patreon.com/CompactAI www.reaper.fm/reaplugs/
Links: www.patreon.com/CompactAI www.patreon.com/posts/1-hour-audio-in-114479515 huggingface.co/ggerganov/whisper.cpp github.com/ggerganov/whisper.cpp
Links: www.patreon.com/CompactAI kdenlive.org/en/
Links: www.patreon.com/CompactAI github.com/ggerganov/llama.cpp developer.nvidia.com/cuda-toolkit desktop.github.com/download/ cmake.org/download/
Links: www.patreon.com/CompactAI cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html developer.nvidia.com/cuda-gpus github.com/ggerganov/llama.cpp
Links: www.patreon.com/CompactAI www.patreon.com/posts/jupyter-notebook-114201835 huggingface.co/facebook/seamless-m4t-medium
Links: www.patreon.com/c/CompactAI www.patreon.com/posts/model-links-and-114292004 github.com/leejet/stable-diffusion.cpp/tree/14206fd48832ab600d9db75f15acb5062ae2c296
Links: www.patreon.com/posts/114394468 www.patreon.com/CompactAI github.com/coqui-ai/TTS github.com/myshell-ai/MeloTTS
Links: www.patreon.com/CompactAI www.patreon.com/posts/114456591 huggingface.co/kakao-enterprise/vits-vctk huggingface.co/datasets/CSTR-Edinburgh/vctk
Links: www.patreon.com/CompactAI github.com/ggerganov/whisper.cpp cmake.org/ desktop.github.com/download/
Which one do you think is most human sounding?