DearPyGUI Local Streaming Chat UI: Code Walkthrough & ReactJS Frontend Comparison

What is the best option for building a GUI for a local AI application? In this video, I build a local chat GUI app in Python with DearPyGUI and compare it with a frontend I built in ReactJS.
After exploring both, I conclude that for a multipurpose AI application, a web-based frontend offers maximum styling flexibility and access to a wide range of resources. The chatbot backend is a llama.cpp server running a custom LLM.
Have I missed a better option for building a local UI for a multipurpose AI-powered application? Let me know your thoughts in the comments!
Topics Covered:
- DearPyGUI high-resolution screen support: window DPI scaling, automatic fullscreen maximization with proper scaling, and custom themes and fonts, with a code walkthrough.
- DearPyGUI async Python chatbot that shows the streaming response in the GUI, using the OpenAI API format with a llama.cpp backend server.
- Custom Python message queue with an async task that checks for new chunks of the streaming response from the server.
- Issues encountered while building with DearPyGUI.
- Comparison of the local Python DearPyGUI UI with my ReactJS web frontend demo UI.
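The message-queue pattern from the topics above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the video: `fake_stream` is a hypothetical stand-in for the streamed response from the llama.cpp server, and the `on_text` callback stands in for whatever call would update the DearPyGUI chat widget.

```python
import asyncio
import queue

# Hypothetical stand-in for a streaming response from the backend server.
async def fake_stream(text):
    for word in text.split(" "):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield word + " "

async def produce(chunks, text):
    """Push each streamed chunk onto the queue, then a None sentinel."""
    async for chunk in fake_stream(text):
        chunks.put(chunk)
    chunks.put(None)

async def poll_chunks(chunks, on_text, interval=0.01):
    """Check the queue for new chunks and pass the accumulated text to the UI callback."""
    shown = ""
    while True:
        try:
            chunk = chunks.get_nowait()
        except queue.Empty:
            await asyncio.sleep(interval)  # nothing new yet; check again shortly
            continue
        if chunk is None:  # sentinel: the stream has finished
            return shown
        shown += chunk
        on_text(shown)  # in a GUI this would update the chat text widget

async def chat_once(reply_text):
    chunks = queue.Queue()
    updates = []
    producer = asyncio.create_task(produce(chunks, reply_text))
    final = await poll_chunks(chunks, updates.append)
    await producer
    return final, updates

final, updates = asyncio.run(chat_once("hello from llama"))
```

Keeping the network reader and the UI updater on opposite sides of a queue means the GUI only ever sees complete chunks, and the polling interval can be tuned independently of how fast the server streams.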
Get Involved:
If you found this helpful, consider liking, subscribing, and sharing with others as I delve into more AI-related topics. Share your thoughts on the video in the comments! Let’s explore the global potential of AI together as a part of the worldwide technology and AI community.
For those who want to support my work, consider visiting my Patreon for additional resources and code: www.patreon.com/CompactAI
Chapters:
00:00 Introduction
01:29 DearPyGUI Chat App Code Walkthrough
09:44 ReactJS Frontend Streaming Chat UI
Links:
www.patreon.com/CompactAI
[TODO: Upload DearPyGUI Demo App Code in Patreon]
github.com/hoffstadt/DearPyGui
react.dev/
vite.dev/
Hashtags: #webui #chatbot #aiapplications #nlp #llama #chatgpt #reactjs #ui #uidesign #python #programming #ai #deeplearning #languagemodel
Language: English.

Views: 44

Videos

PyTorch: MobileClip & ONNX GTE Models for Text & Image Embeddings

7:00

Views: 43, 3 months ago

If we cache embeddings of multiple full sentences, partial phrases, and individual words in a vector database, we can potentially avoid a forward pass for each similar query. Specifically, with a model that generates high-quality embeddings, we might obtain the desired embeddings through operations like addition and subtraction, without needing a forward pass for every text (or other modalit...
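The idea above can be illustrated with a toy sketch. The vectors below are made-up three-dimensional values, not output from MobileCLIP or GTE, and `EmbeddingCache` is a hypothetical helper rather than a real vector database. Whether real model embeddings compose this cleanly is exactly the open question the description raises.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

class EmbeddingCache:
    """Toy in-memory cache mapping text to an embedding vector."""

    def __init__(self):
        self._store = {}

    def put(self, text, vector):
        self._store[text] = list(vector)

    def compose(self, add=(), subtract=()):
        """Build a vector by adding and subtracting cached embeddings (no forward pass)."""
        dim = len(next(iter(self._store.values())))
        out = [0.0] * dim
        for text in add:
            out = [a + b for a, b in zip(out, self._store[text])]
        for text in subtract:
            out = [a - b for a, b in zip(out, self._store[text])]
        return out

    def nearest(self, vector):
        """Return the cached text with the highest cosine similarity to the vector."""
        return max(self._store, key=lambda t: cosine(self._store[t], vector))

# Made-up embeddings, chosen only so the arithmetic is easy to follow.
cache = EmbeddingCache()
cache.put("king", [0.9, 0.8, 0.1])
cache.put("man", [0.1, 0.8, 0.0])
cache.put("woman", [0.1, 0.1, 0.8])
cache.put("queen", [0.9, 0.1, 0.9])

query = cache.compose(add=["king", "woman"], subtract=["man"])
best = cache.nearest(query)
```

With these toy values, the composed vector lands on the cached "queen" entry, so the lookup is answered from the cache alone.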

How Good is the Parler TTS Model Pronunciation

1:40

Views: 101, 3 months ago

In this video I test the Parler TTS Mini v1.1 model with the same sentence I used to test OpenVoice V2 voice cloning. Can it pass the speech check that so far only the VITS VCTK model has passed? Get Involved: If you found this helpful, consider liking, subscribing, and sharing with others as I delve into more AI-related topics. Share your thoughts on the video in the comments! Let’s explore the ...

PyTorch: Exploring VAE Latent Channels of TAESD for SDXL and FLUX

4:56

Views: 43, 3 months ago

In this video I use modified TAESD PyTorch code in a Jupyter notebook to explore what information is stored in each of its latent channels. The code is available on my Patreon. Topics Covered: - TAESD FLUX and SDXL latent encoding and decoding. - What happens if only certain latent channels are used for decoding. Get Involved: If you found this helpful, consider liking, subscribing...

ComfyUI Multiple VAE Comparison Workflow

5:51

Views: 156, 3 months ago

In this video I show a ComfyUI workflow for comparing the popular VAEs used in text-to-image generation, including FLUX and Stable Diffusion. In this comparison, the FLUX VAE looks better than the SD 3.5 VAE. Order of VAE image outputs: 1. Original. 2. TAESD. 3. TAESDXL. 4. TAE1F. 5. Stable Diffusion 1.5 MSE VAE. 6. SDXL VAE. 7. FLUX VAE. 8. Stable Diffusion 3.5 VAE. Topics Covered: - Encoding, decoding an...

Make Vertical YouTube Shorts from Landscape Videos with Kdenlive Animation

8:43

Views: 45, 3 months ago

In this video I show how to use animation in Kdenlive to turn landscape videos into vertical YouTube Shorts-style videos. While the video focuses on making Shorts from landscape programming videos, the approach also applies to non-programming videos. Topics Covered: - Placing keyframes for smooth and fast transitions between video frames. - Merging multiple videos into a single video. - Using kde...

5:03

Voice Cloning with OpenVoice V2

Views: 472, 3 months ago

In this programming video I test voice cloning with OpenVoice V2 and explore its issues. Topics Covered: - Testing English text to speech with various predefined OpenVoice accents. - Comparing the cloned English voice with the Coqui TTS VCTK VITS model. - I additionally discuss the "Could not locate cudnn_ops_infer64_8.dll" crash when running OpenVoice V2 on GPU, a CTranslate2 issue with Faster...

Instructions for Running Coqui and Melo TTS in Intel GPU

4:17

Views: 105, 3 months ago

In this video I give my understanding of how to run the code shown in my previous tutorials on an Intel GPU. As I do not have an Intel GPU, I cannot guarantee that it will run correctly. Topics Covered: - PyTorch 2.5 Intel GPU support. - Converting existing CUDA GPU PyTorch code to run on an Intel GPU with minimal effort. Get Involved: If you found this helpful, consider liking, subscribing, and s...
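A minimal sketch of the kind of change involved, assuming a PyTorch build that exposes `torch.xpu` for Intel GPUs. The helper names `pick_device` and `detect_device` are hypothetical and are not taken from the video; like the video itself, this has not been verified on Intel hardware.

```python
def pick_device(cuda_available, xpu_available):
    """Prefer CUDA, then Intel XPU, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if xpu_available:
        return "xpu"
    return "cpu"

def detect_device():
    """Ask PyTorch which backends are usable; return "cpu" if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # getattr guards against older PyTorch builds that lack the torch.xpu module.
    xpu = getattr(torch, "xpu", None)
    xpu_ok = bool(xpu is not None and xpu.is_available())
    return pick_device(torch.cuda.is_available(), xpu_ok)

device = detect_device()
# Code that hard-codes model.to("cuda") would instead use model.to(device).
```

Selecting the device string once and passing it everywhere keeps the rest of the script unchanged across CUDA, Intel GPU, and CPU runs.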

Kdenlive: Customize Layout for Making Youtube Shorts

2:40

Views: 91, 3 months ago

This video covers setting up Kdenlive for vertical video editing, for example YouTube Shorts. Topics Covered: - Setting up the Kdenlive layout to edit vertical video, specifically in YouTube Shorts format and resolution. - Setting up a preset for vertical 1080x1920 videos. - Creating custom layouts and organizing them. Get Involved: If you found this helpful, consider liking, subscribing, and sha...

Pitch Control and Noise Reduction VST in OBS Studio

2:31

Views: 51, 3 months ago

In this video I show how to use ReaPlugs VST plugins to reduce noise and to control pitch, making the voice lighter or heavier. The voice is AI generated. Topics Covered: - Reaper ReaPlugs VST with OBS Studio for noise removal. - Using the ReaJS VST pitch control to make the voice heavier or lighter. Get Involved: If you found this helpful, consider liking, subscribing, and sharing with others as I delve into more AI-re...

1 Hour Audio Transcribed in 1 Minute with 1GB VRAM - whisper.cpp

6:25

Views: 171, 3 months ago

In this video I transcribe a 1-hour-long video with whisper.cpp using the Whisper v3 large turbo q5 quantized model, which requires only 1GB of VRAM. My previous video shows the fast whisper.cpp build process. Model and code links are below. This video uses an AI-generated voice from the VCTK VITS model. Topics Covered: - Using yt-dlp tool to download 2 hou...

2:52

Build whisper.cpp Faster for Nvidia GPU

Views: 231, 3 months ago

In this video I show how to build whisper.cpp for a CUDA GPU locally on Windows using Visual Studio. In case anything is unclear, I also have very similar build videos for llama.cpp and stable-diffusion.cpp. Topics Covered: - Using the CMake GUI application to build whisper.cpp to run on GPU. - Speeding up the build with architecture-specific compilation. Get Involved: If you found this helpful, consider liking, ...

All 109 AI Text to Speech Voices of VITS VCTK Model

52:42

Views: 82, 3 months ago

All 109 AI text-to-speech English voices of VITS trained on the VCTK dataset. The model can also be used from Coqui TTS with Python. This video was made programmatically in Python: the code first generates the voice WAV files, then merges them with images while keeping track of their timestamps. An image is generated for each speaker at its exact timestamp, so the video is made without any manual effort. The code is available i...
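The timestamp bookkeeping described above might look roughly like this. It is a sketch under stated assumptions, not the author's code: the speaker names and clip durations are placeholders, and `build_timeline` and `fmt` are hypothetical helpers.

```python
def build_timeline(clips):
    """Return (speaker, start_seconds) pairs for clips played back to back, plus total length."""
    timeline = []
    start = 0.0
    for speaker, duration in clips:
        timeline.append((speaker, start))
        start += duration
    return timeline, start

def fmt(seconds):
    """Format a time in seconds as MM:SS, truncating fractions."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

# Placeholder clip durations in seconds; real values would come from the generated WAV files.
clips = [("speaker_a", 28.5), ("speaker_b", 31.0), ("speaker_c", 29.5)]
timeline, total = build_timeline(clips)
chapters = [f"{fmt(start)} {name}" for name, start in timeline]
```

Each start time marks where that speaker's image would be placed on the video track, and the same list can double as chapter markers.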

Python: Realtime Human Level Text to Speech with only 1GB VRAM

6:35

Views: 236, 3 months ago

In this video I show CPU and GPU inference with the MeloTTS and Coqui TTS libraries in Python. Both run the models in near real time on my hardware, synthesizing speech in a few seconds with around 1GB of VRAM. For Coqui TTS I use the VITS model trained on the VCTK dataset, which has 109 speakers, all indistinguishable from human speech. As I understand Mel...

Build GPU stable-diffusion.cpp Faster and Generate FLUX Images

4:47

Views: 106, 3 months ago

Build stable-diffusion.cpp quickly with Nvidia CUDA GPU support. FLUX q3_k GGUF text-to-image generates good-quality 1024x1024 images with less than 8GB of VRAM. With flash attention it can generate 1024x1024 images in under 6GB of VRAM with FLUX q3_k (not shown in this video). Topics Covered: - Build stable-diffusion.cpp project with CMake GUI and Visual Studio. - Run FLUX im...

Python: 4, 8 bit Text to Speech, Speech to Text, Speech to Speech SeamlessM4T Comparison

4:00

Views: 66, 3 months ago

GPU Specific llama.cpp Compilation: Massively Reduce Build Times

2:30

Views: 46, 3 months ago

Build and Run llama.cpp Locally for Nvidia GPU

3:49

Views: 237, 3 months ago

Kdenlive: Insert, Cut Video at Exact Timestamps, Speedup Segment & Text Overlay

3:19

Views: 39, 3 months ago

Comments

@ikaro75 5 days ago
I love the way you narrate your explanation
@sanketshende2792 14 days ago
Does it depend on the Microsoft Visual Studio and toolkit version? I am not getting a few options in CMake after configuring. I have Visual Studio 17 2022.
@keremardcl6759 a month ago
Any example of live speech to live speech and text translation?
@ShadowsandWhiskey a month ago
Cool, I am working on implementing tts on my local ai setup on Linux. One of the only videos I watched where someone has coqui up and running with good examples. Thanks for the video
@Dante58426 2 months ago
I'm currently working with this model and just hearing a full comparison of all the voices in one place has been tremendously helpful. Thanks for putting in the work to help us all!
@Shek0r 2 months ago
thx for video. It helps me a lot.
@timstevens3361 2 months ago
sounds like the red queen
@compactai 3 months ago
Links: www.patreon.com/CompactAI arxiv.org/pdf/2311.17049 huggingface.co/apple/MobileCLIP-B-LT huggingface.co/Alibaba-NLP/gte-base-en-v1.5
@compactai 3 months ago
Links: www.patreon.com/CompactAI huggingface.co/parler-tts/parler-tts-mini-v1.1
@compactai 3 months ago
Links: www.patreon.com/CompactAI www.patreon.com/posts/code-for-pytorch-115158865 github.com/madebyollin/taesd
@compactai 3 months ago
Links: www.patreon.com/CompactAI www.patreon.com/posts/workflow-for-114956480 github.com/comfyanonymous/ComfyUI
@compactai 3 months ago
Links: www.patreon.com/CompactAI th-cam.com/video/d9IHp_ohrZQ/w-d-xo.html kdenlive.org
@SuperWorld007 3 months ago
Can we run it on a CPU-based system?
@compactai 3 months ago
Yes. Set the device variable to "cpu" in the code, and in se_extractor.py replace model = WhisperModel(model_size, device="cuda", compute_type="float16") with model = WhisperModel(model_size, device="cpu", compute_type="float32"). I have pointed out issues with this model in the video. There are surely better options you could try.
@SuperWorld007 3 months ago
@compactai Thanks man. It worked.
@compactai 3 months ago
Links: www.patreon.com/CompactAI github.com/myshell-ai/OpenVoice
@jindown 3 months ago
❤
@compactai 3 months ago
Links: www.patreon.com/CompactAI pytorch.org/docs/stable/notes/get_start_xpu.html github.com/pytorch/pytorch/releases/tag/v2.5.0 github.com/vosen/ZLUDA
@compactai 3 months ago
Links: www.patreon.com/CompactAI kdenlive.org
@jindown 3 months ago
Can you tell which Coqui model supports Polly Matthew?
@compactai 3 months ago
I have only tested the vits model so far.
@jindown 3 months ago
@compactai Do they have that voice? The Coqui GitHub repo, which shares stats for voices, listed Polly voices too.
@compactai 3 months ago
Links: www.patreon.com/CompactAI www.reaper.fm/reaplugs/
@compactai 3 months ago
Links: www.patreon.com/CompactAI www.patreon.com/posts/1-hour-audio-in-114479515 huggingface.co/ggerganov/whisper.cpp github.com/ggerganov/whisper.cpp
@compactai 3 months ago
Links: www.patreon.com/CompactAI kdenlive.org/en/
@compactai 3 months ago
Links: www.patreon.com/CompactAI github.com/ggerganov/llama.cpp developer.nvidia.com/cuda-toolkit desktop.github.com/download/ cmake.org/download/
@compactai 3 months ago
Links: www.patreon.com/CompactAI cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html developer.nvidia.com/cuda-gpus github.com/ggerganov/llama.cpp
@compactai 3 months ago
Links: www.patreon.com/CompactAI www.patreon.com/posts/jupyter-notebook-114201835 huggingface.co/facebook/seamless-m4t-medium
@compactai 3 months ago
Links: www.patreon.com/c/CompactAI www.patreon.com/posts/model-links-and-114292004 github.com/leejet/stable-diffusion.cpp/tree/14206fd48832ab600d9db75f15acb5062ae2c296
@compactai 3 months ago
Links: www.patreon.com/posts/114394468 www.patreon.com/CompactAI github.com/coqui-ai/TTS github.com/myshell-ai/MeloTTS
@compactai 3 months ago
Links: www.patreon.com/CompactAI www.patreon.com/posts/114456591 huggingface.co/kakao-enterprise/vits-vctk huggingface.co/datasets/CSTR-Edinburgh/vctk
@compactai 3 months ago
Links: www.patreon.com/CompactAI github.com/ggerganov/whisper.cpp cmake.org/ desktop.github.com/download/
@perrymitchell7118 3 months ago
Which one do you think is most human sounding?

Compact AI
