Jordan Nanos
TWRTW Ep #8 ft. Anthony Placeres - ROI of AI? GPT-5 and Llama 4, TSMC yields, $4.5B chatbots, Russia
With backgrounds in the design and implementation of compute infrastructure from edge to cloud, sensor to tensor, Jordan Nanos and Hunter Almgren give their take on what’s new in enterprise technology - specifically, what they read this week.
Show notes:
x.com/tsarnick/status/1847746829490016578
x.com/deedydas/status/1848751769939284344?s=46
x.com/benhylak/status/1848765957008986416
x.com/character_ai/status/1849055407492497564?s=46
x.com/andrewcurran_/status/1849627640745058683?s=46
x.com/dylan522p/status/1849944315570864588
www.cnbc.com/2024/10/28/bret-taylors-ai-startup-sierra-valued-at-4point5-billion-in-funding.html
arstechnica.com/gadgets/2024/10/fake-restaurant-tips-on-reddit-a-reminder-of-google-ai-overviews-inherent-flaws/
Views: 82

Videos

TWRTW Ep #7 - Nobel Prizes, Blackwell ramp, Benioff on Copilot, NYT suing Perplexity, x86 Consortium
Views: 73 · 14 days ago

How to Pick a Large Language Model for Private AI -- A Brief Overview
Views: 796 · 28 days ago

TWRTW Ep #6 - Updates on AI Datacenters, Intel/Qualcomm, Llama3.2, NotebookLM, whats next for OpenAI
Views: 80 · 1 month ago

Exploring the Long Context Window of Llama-3.1-405B on NVIDIA Grace Hopper GH200 Superchip
Views: 471 · 1 month ago

NVIDIA isn't a bubble? Armchair analysis of Big Tech's GPU spend (TWRTW Ep #5)
Views: 53 · 1 month ago

TWRTW Ep #5 - Who Can Afford to Build Frontier LLM's?
Views: 83 · 1 month ago

TWRTW Ep #4 - Coding Assistants, SMCI shorts, NVIDIA DOJ Subpoena, OpenAI Lawsuits, Intel Spinoff
Views: 178 · 2 months ago

Demo and Code Review for Text-To-SQL with Open-WebUI
Views: 2.8K · 2 months ago

TWRTW Ep #3 - GenAI's Impact on Work/Privacy, Immersion Cooling, OpenStack is Back, Perplexity Ads
Views: 77 · 2 months ago

TWRTW Ep #2 - Nuclear Power, GPU Buildouts, Semi-Stateful Workloads, LLM security, GPT-5 Speculation
Views: 77 · 2 months ago

Using Llama-3.1-405B as a Coding Assistant with Continue.Dev, Ollama, and NVIDIA GH200 Superchip
Views: 1.5K · 2 months ago

Running Llama-3.1-405B with Ollama and Open-WebUI: Introduction to the DL384 Gen12 Server
Views: 331 · 2 months ago

TWRTW Ep. #1 - Intel issues, NVIDIA delays, JPY carry trades, Google antitrust
Views: 151 · 2 months ago

Building Customized Text-To-SQL Pipelines in Open WebUI
Views: 4.3K · 2 months ago

Simple Overview of Text to SQL Using Open-WebUI Pipelines
Views: 5K · 3 months ago

Overview of an Example LLM Inference Setup
Views: 3.1K · 3 months ago

Comments

  • @bittuk575 · 12 days ago

    I'm really curious how to integrate this particular pipeline (text_to_sql_pipeline) in Open WebUI and enable it. I successfully verified the pipeline connection, but in the 'Pipelines' section, when I upload text_to_sql_pipeline, it shows "No Valves". Could you please explain how you enabled it in your Open WebUI? I'm using a docker compose setup containing the Ollama, Open WebUI, and SearXNG services, and I run the pipelines container separately. Please guide me.

  • @MustRunTonyo · 15 days ago

    I used the Open WebUI standard pipeline, and it looks like I can't put more than one table in the DB_table field. That's too much of a downside! Did you come across a solution?

  • @thenatureofnurture6336 · 18 days ago

    It's turtles (nested dolls, a totem pole, a mortal coil, the collective unconscious etc.) all the way down... and it's a very long way down. Humans mistake their unconscious appetites for divine inspiration. You probably believe YOU thought that question. You are just the latest iteration, the most sophisticated variation that has passed The test. Love ya!

  • @KeesFluitman · 29 days ago

    I'm a little new to this area. You've added the XML file to the database yourself, and you're just querying it via the pipeline? I was wondering whether this would be a viable solution for an app that lets people easily find information stored in a database. But I guess that would be a huge security risk, since you basically give them direct database access.

    • @jordannanos · 29 days ago

      @@KeesFluitman No XML file, it's a CSV, but yes, it's direct access to the database. In a real environment this would run against a data warehouse, a lake, or some backup/export of the database.

    • @jordannanos · 29 days ago

      @@KeesFluitman There has to be a way for people to find information stored in a database today. Generally they either write SQL to find that information or ask an analyst to write the SQL for them and send back the results. This pipeline just simplifies that process a little.

  • @AaronBlox-h2t · 1 month ago

    Cool video.... Very useful info, thanks. Worth a sub.

  • @kosarajushreya6578 · 1 month ago

    Great video, thanks, it helps a lot. Can I also connect a Microsoft SQL Server database to Open WebUI through the pipeline?

    • @jordannanos · 1 month ago

      @@kosarajushreya6578 Yes, but you'll need to modify the pipeline code to do this. You'll need to know how to connect to your DB in Python.
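
    As a rough illustration of the change described above (not code from the example-pipelines repo), a SQL Server connection in Python via SQLAlchemy might look like the sketch below. The driver name, credentials, host, and table are placeholders.

      # Hypothetical SQL Server connection for a modified pipeline; placeholders throughout.
      from sqlalchemy import create_engine, text

      # SQLAlchemy URL for SQL Server via the pyodbc driver; adjust to your environment.
      engine = create_engine(
          "mssql+pyodbc://user:password@sqlserver-host:1433/mydb"
          "?driver=ODBC+Driver+18+for+SQL+Server"
      )

      # Quick sanity check before wiring the engine into the pipeline code.
      with engine.connect() as conn:
          rows = conn.execute(text("SELECT TOP 5 * FROM my_table")).fetchall()
          print(rows)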

  • @MrExtahsee · 1 month ago

    This is wild

  • @Earthvssuna · 1 month ago

    Thanks so much, I will try all of it! But first I'm curious how you set up vLLM for Open WebUI instead of Ollama... do you have any good installation docs for that?

    • @Earthvssuna · 1 month ago

      Maybe the whole setup is relevant... e.g. is Ollama on a Linux server? Is Open WebUI on Windows, or on the same Linux box in a container, etc.?

    • @jordannanos · 1 month ago

      @@Earthvssuna You can run vLLM as a docker container or a k8s deployment. For docker, use this doc: docs.vllm.ai/en/latest/serving/deploying_with_docker.html Once the model (typically one from Hugging Face, like Mistral or Llama) is running, it's an OpenAI-compatible endpoint. You can use the OpenAI Python client for custom apps, or just add it as an endpoint in Open WebUI on the admin settings page. If you're interested, I was considering making some more videos describing how to install docker, kubernetes, etc. on a GPU server?
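
    A minimal sketch of the "OpenAI Python client against a vLLM endpoint" idea mentioned above. The base URL, port, and model name are assumptions for a typical vLLM deployment, not values from the videos.

      # Querying a vLLM server through its OpenAI-compatible API (assumed defaults).
      from openai import OpenAI

      # vLLM's OpenAI-compatible server typically listens on port 8000; the API key is
      # only checked if the server was started with one.
      client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

      response = client.chat.completions.create(
          model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model the server was launched with
          messages=[{"role": "user", "content": "Write a SQL query that counts rows in a table."}],
      )
      print(response.choices[0].message.content)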

  • @SiD-hq2fo · 1 month ago

    insightful! thanks brotha

  • @jim02377 · 1 month ago

    This was a great intro. I think the information about Open WebUI pipes on the site is a bit vague. I would love to see more about how to use pipes for things like filtering user inputs or outputs, if pipes are the appropriate tool for that kind of thing. I work for a school district and would like to be able to do that so students can access local models.

    • @jordannanos · 1 month ago

      @@jim02377 I haven't played with filters, but that concept does exist as a type of pipeline: github.com/open-webui/pipelines/blob/main/examples/filters/detoxify_filter_pipeline.py
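
    For reference, the filter examples in that repo follow roughly the shape sketched below. This is a hedged, pass-through sketch rather than a copy of the detoxify example; check the exact hook names and valve fields against the linked file.

      # Rough sketch of an Open WebUI "filter" pipeline that screens user input.
      from typing import List, Optional
      from pydantic import BaseModel


      class Pipeline:
          class Valves(BaseModel):
              # Filters are attached to target pipelines/models; "*" means all of them.
              pipelines: List[str] = ["*"]
              priority: int = 0

          def __init__(self):
              self.type = "filter"
              self.name = "Simple Keyword Filter"
              self.valves = self.Valves()

          async def inlet(self, body: dict, user: Optional[dict] = None) -> dict:
              # Inspect the latest user message before it reaches the model.
              messages = body.get("messages", [])
              if messages and "blockedword" in messages[-1].get("content", "").lower():
                  raise Exception("Message rejected by filter.")
              return body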

  • @albionix · 1 month ago

    Hi Jordan, actually the document is in the Russian language, but it describes Kazakhstan cities/towns and credit requirements in the local currency. Still an impressive demo of local LLM capability.

    • @jordannanos · 1 month ago

      Thanks, good to know. When I opened it up in Word, the bottom left says “Kazakh”. Too late now to correct the video.

  • @azmat8250 · 1 month ago

    Great review, Jordan! Quick question: I have a pipeline that calls Replicate to generate an image based on the user_message (prompt) fed in from Open WebUI. However, when I get the response from Replicate, I'm having some issues displaying it back in Open WebUI. Do you know if the return type of the pipe function has to be a string in order for Open WebUI to render text? What does Open WebUI's interface expect back from the pipe function?

    • @jordannanos · 1 month ago

      @@azmat8250 I've only seen a string work when returning from a pipeline; even a list throws an error for me. However, it's all open source... if you look at the component used during a web search, or with audio input, it seems like you could create something custom. (See the sketch after this thread.)

    • @jordannanos · 1 month ago

      @@azmat8250 Looked into this, and v0.3.30 of open-webui has experimental support for image generation via OpenAI's API and a few others. It's not via a pipeline, but it may still be worth upgrading and checking out if you haven't seen it yet.

    • @azmat8250 · 1 month ago

      Thanks, @jordannanos. I'm on .30 but it seems like it's not working... at least for me. I'm still toying around with it. If I find something, I'll share it here.
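
    On the return-type question above: a minimal pipe that returns a plain string looks roughly like the sketch below. The signature follows the open-webui pipelines examples; the class name and body are illustrative, not taken from the example-pipelines repo.

      # Minimal pipeline whose pipe() returns a plain string for Open WebUI to render.
      from typing import Generator, Iterator, List, Union


      class Pipeline:
          def __init__(self):
              self.name = "Echo Pipeline"

          def pipe(
              self, user_message: str, model_id: str, messages: List[dict], body: dict
          ) -> Union[str, Generator, Iterator]:
              # Returning a str is the safe path: Open WebUI renders it as the assistant message.
              # (The examples also accept a generator of string chunks for streaming.)
              return f"You said: {user_message}"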

  • @geovannywing1648 · 1 month ago

    I installed Open WebUI locally in a docker container, but when I access it I don't see the option to upload the pipeline files. Is there a special config you're running?

    • @jordannanos · 1 month ago

      @@geovannywing1648 If you're using docker you'll also need to run the separate "pipelines" container from the open-webui project, make sure networking is set up correctly between the containers, and then establish a connection between the two.

    • @jordannanos · 1 month ago

      @@geovannywing1648 Docs here: docs.openwebui.com/pipelines/

  • @lucasl1047 · 1 month ago

    and the conclusion is…?

  • @firstland_fr · 1 month ago

    Do you think we can use a custom model with an API for RAG?

    • @jordannanos · 1 month ago

      @@firstland_fr Yes, I don't see why not. Ollama will work with any model in GGUF format (llama.cpp), and vLLM works with just about any transformers model from Hugging Face: docs.vllm.ai/en/latest/models/adding_model.html Both Ollama and vLLM are tested with this pipeline.

  • @renatopaschoalim1209 · 1 month ago

    Hey Jordan! Can I modify your pipelines to work with SQL Server?

    • @jordannanos · 1 month ago

      @@renatopaschoalim1209 Yes. It's tested with Postgres and MySQL; if you know how to connect to SQL Server with Python, you'll be able to use the pipeline.

  • @Mohsin.Siddique · 1 month ago

    Great video! Can you please tell me how to create/generate an API key for llama_index?

    • @jordannanos · 1 month ago

      @@Mohsin.Siddique llama-index is a Python package installed via pip; you don't need an API key. No API keys are required for this pipeline.

  • @RickySupriyadi · 1 month ago

    Mind sharing your code, please?

    • @jordannanos · 1 month ago

      @@RickySupriyadi Hi, the code is here: github.com/JordanNanos/example-pipelines and a video reviewing the code is here: th-cam.com/video/iLVyEgxGbg4/w-d-xo.html

    • @RickySupriyadi · 1 month ago

      @@jordannanos wow cool, thank you.

  • @Alex-os5co · 1 month ago

    What an awesome introduction to pipelines - thank you so much!

  • @gilkovary2753 · 2 months ago

    Hi, how do I install the Python libraries on the pipeline server?

    • @jordannanos · 2 months ago

      @@gilkovary2753 You'll need to docker exec or kubectl exec into the container called "pipelines", then run:

        pip install llama-cloud==0.0.13 llama-index==0.10.65 llama-index-agent-openai==0.2.9 \
          llama-index-cli==0.1.13 llama-index-core==0.10.66 llama-index-embeddings-openai==0.1.11 \
          llama-index-indices-managed-llama-cloud==0.2.7 llama-index-legacy==0.9.48.post2 \
          llama-index-llms-ollama==0.2.2 llama-index-llms-openai==0.1.29 \
          llama-index-llms-openai-like==0.1.3 llama-index-multi-modal-llms-openai==0.1.9 \
          llama-index-program-openai==0.1.7 llama-index-question-gen-openai==0.1.3 \
          llama-index-readers-file==0.1.33 llama-index-readers-llama-parse==0.1.6 \
          llama-parse==0.4.9 nltk==3.8.1

  • @netixc9733 · 2 months ago

    Thanks for sharing this awesome project! I tried running the 01_text_to_sql_pipeline_vLLM_llama.py file from your GitHub repo, but I'm having trouble uploading it in Open WebUI even though I've installed all the requirements. Do you have any idea what might be causing this issue? Thanks again!

    • @dj_hexa_official · 2 months ago

      Did you configure the pipeline correctly?

    • @netixc9733 · 2 months ago

      @@dj_hexa_official What do you mean by that?

    • @jordannanos · 2 months ago

      @@netixc9733 What error are you seeing? Run docker logs -f or kubectl logs -f against your pipelines container and it may report an error.

  • @johnkintree763 · 2 months ago

    Lovely demo of the synergy between language models and databases.

  • @JJaitley · 2 months ago

    First of all, great job Jordan. It would be really helpful if you could share the code on Git.

    • @jordannanos · 2 months ago

      Hi, the code is here: github.com/JordanNanos/example-pipelines and a video reviewing the code is here: th-cam.com/video/iLVyEgxGbg4/w-d-xo.html

  • @orafaelgf · 2 months ago

    Great video. Hoping to see more soon. Congrats.

    • @jordannanos · 2 months ago

      Hi, the code is here: github.com/JordanNanos/example-pipelines and a video reviewing the code is here: th-cam.com/video/iLVyEgxGbg4/w-d-xo.html

  • @dj_hexa_official · 2 months ago

    Jordan, super good job. I'm trying to integrate Open WebUI into my CRM system so that my employees can query the database through chat for things like product prices. Can this RAG pipeline work that way, for example? Thank you for your answer.

    • @jordannanos · 2 months ago

      Hi, I think if you've got a DB you should be able to query it, especially if you already know how to using Python. I posted another video. The code is here: github.com/JordanNanos/example-pipelines and a video reviewing the code is here: th-cam.com/video/iLVyEgxGbg4/w-d-xo.html

    • @dj_hexa_official · 2 months ago

      @@jordannanos Thanks a lot Jordan. Super cool.

  • @swarupdas8043 · 2 months ago

    Hi, could you link us to the source code of the pipeline?

    • @jordannanos · 2 months ago

      The code is here: github.com/JordanNanos/example-pipelines and a video reviewing the code is here: th-cam.com/video/iLVyEgxGbg4/w-d-xo.html

    • @RedCloudServices · 1 month ago

      Thanks Jordan. I have a single-GPU RunPod setup; would you recommend just adding a dockerized PostgreSQL to the existing pod? And is the Python code (using LangChain) stored in the pod's pipeline settings? This sort of reminds me of AWS serverless Lambda, but simpler.

    • @jordannanos · 1 month ago

      @@RedCloudServices If you'd like to save money, I would run Postgres in docker on the same VM you've already got; that will also simplify networking. Over time you might want to start/stop those services independently in the event of an upgrade to docker or to your VM, or you might want to scale them independently. In that case you might want a separate VM for your DB and a separate one for your UI, or you might consider running kubernetes. Yes, the Python code is all contained within the pipelines container, and it uses llama-index, not LangChain (though you could use LangChain too). Just a choice I made.

    • @jordannanos · 1 month ago

      @@RedCloudServices In other words, you'll need to pip install the packages that the pipeline depends on inside the pipelines container. Watch the other video I linked for more detail on how to do this.

    • @RedCloudServices · 1 month ago

      @@jordannanos Yep! Just watched it. I just learned Open WebUI does not allow vision-only models or multimodal LLMs like Gemini; I was hoping to set up a pipeline using a vision model 🤷‍♂️. It's also not clear how to edit or set up whatever vector DB it's using.

  • @martinsmuts2557 · 2 months ago

    Hi Jordan, thanks. I am missing the steps where you created the custom "Database Rag Pipeline with Display". From the Pipelines page you completed the database details and set the text-to-SQL model to Llama3, but where do you configure the connection between the pipeline valves and the "Database Rag Pipeline with Display" so it shows up as an option to be selected?

    • @jordannanos · 2 months ago

      @@martinsmuts2557 It's a single .py file that is uploaded to the pipelines container. I'll cover that in more detail in a future video.

    • @KunaalNaik · 2 months ago

      @@jordannanos Do create this video soon!

    • @jordannanos · 2 months ago

      @@KunaalNaik @martinsmuts2557 Just posted a video reviewing the code: th-cam.com/video/iLVyEgxGbg4/w-d-xo.html The repo is here: github.com/JordanNanos/example-pipelines
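
    To illustrate how a single .py file exposes a selectable name and configurable valves to the UI, here is a rough skeleton. The valve names (DB_HOST, DB_TABLE, TEXT_TO_SQL_MODEL) are placeholders, not the exact fields of the "Database Rag Pipeline with Display" code.

      # Rough skeleton of a pipelines .py file; names and valves are placeholders.
      from typing import Generator, Iterator, List, Union
      from pydantic import BaseModel


      class Pipeline:
          class Valves(BaseModel):
              # These fields become the editable "valves" on the Pipelines admin page.
              DB_HOST: str = "localhost"
              DB_PORT: int = 5432
              DB_NAME: str = "postgres"
              DB_TABLE: str = "my_table"
              TEXT_TO_SQL_MODEL: str = "llama3"

          def __init__(self):
              # This name is what shows up as a selectable model once the file is uploaded.
              self.name = "Database RAG Pipeline (sketch)"
              self.valves = self.Valves()

          def pipe(
              self, user_message: str, model_id: str, messages: List[dict], body: dict
          ) -> Union[str, Generator, Iterator]:
              # A real pipeline would generate and run SQL here; this stub just echoes config.
              return f"Would query {self.valves.DB_TABLE} on {self.valves.DB_HOST} for: {user_message}"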

  • @random_stuf_yt · 2 months ago

    hi

  • @ZiggyDaZigster · 2 months ago

    30k GC? 8 of them?

  • @EricWang-u4x · 2 months ago

    Thanks for sharing; it's really interesting to learn more about the pipeline projects related to Open WebUI.

  • @fakebizPrez · 2 months ago

    Sweet rig. Is that your daily driver? 😀😀

  • @KCM25NJL · 2 months ago

    The cost of such a setup is circa $500,000... amma get me 2 :)

  • @pipcountgps1 · 2 months ago

    Thank you Jordan! Great work; it's interesting to see how these new servers can really deliver performance. ARM / x86... it just works. Yours, Greg

  • @pipcountgps1 · 2 months ago

    Thank you Jordan.

  • @hasaniqbal3180 · 2 months ago

    Thank you for this. Can you share more info on the RAG pipeline along with code examples?

    • @jordannanos · 2 months ago

      Working on getting it to run on both vLLM and Ollama endpoints with Llama 3.1 and Mistral. The prompt uses LlamaIndex for text-to-SQL.

    • @jordannanos · 2 months ago

      Similar to this guide: docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo/ (see the sketch after this thread)

    • @jvannoyx4 · 2 months ago

      Great job, can't wait to see more.

    • @jordannanos · 2 months ago

      @@jvannoyx4 Hi, the code is here: github.com/JordanNanos/example-pipelines and a video reviewing the code is here: th-cam.com/video/iLVyEgxGbg4/w-d-xo.html
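
    A condensed sketch of the LlamaIndex text-to-SQL pattern from that guide, using an Ollama-served model. The connection string, table name, and model name are placeholders rather than the pipeline's actual configuration.

      # Condensed text-to-SQL flow with LlamaIndex (v0.10.x) and an Ollama-served model.
      from sqlalchemy import create_engine
      from llama_index.core import SQLDatabase, Settings
      from llama_index.core.query_engine import NLSQLTableQueryEngine
      from llama_index.llms.ollama import Ollama

      # LLM that translates natural language into SQL and summarizes the results.
      Settings.llm = Ollama(model="llama3.1", request_timeout=120.0)

      # Wrap an existing database; here a local Postgres instance with one table of interest.
      engine = create_engine("postgresql://user:password@localhost:5432/mydb")
      sql_database = SQLDatabase(engine, include_tables=["customers"])

      # Generates SQL for the listed tables, runs it, and answers in prose.
      query_engine = NLSQLTableQueryEngine(sql_database=sql_database, tables=["customers"])
      response = query_engine.query("How many customers signed up last month?")
      print(response)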

  • @interactivetech1 · 2 months ago

    Amazing video! I have a 4x A4000 GPU setup with 128GB, and I can only get the 405B 2_K model running, and it's really slow. Amazing how the GH100 chips offer great token/sec performance!

  • @niceshotapps1233 · 2 months ago

    - what are you using it for? - .... stuff

    • @0101-s7v · 2 months ago

      AI, apparently. (LLM = Large Language Model)

  • @rodrimora · 2 months ago

    I feel jealous of that 8x H100 server. I'm currently using a 4x 3090 at home. I actually use a pretty similar setup: vLLM for the full-precision models and exllama or llama.cpp for quantized models, plus Open WebUI as a frontend.

    • @MadeInJack · 2 months ago

      Why would you need more than that? Be glad for what you already have or you won't find happiness :)

    • @ricardocosta9336 · 2 months ago

      Bitch, I have a P40 and I'm over the moon. Being poor in ML is hard.

  • @nesdi6653 · 2 months ago

    Word

    • @nesdi6653 · 2 months ago

      Why not podman tho

  • @davidjanuski · 2 months ago

    Good discussion! Keep it up.

  • @FarhadOmid · 2 months ago

    Great work, Jordan! Gonna start scraping the parts together...

  • @starbrandX · 3 months ago

    I've been automating deployments with SkyPilot. It uses the cheapest spot instances and heals itself.

  • @peter102 · 3 months ago

    Nice video, saw the link from Twitter. My question is: is there a way to speed up the results after you ask it a question?

    • @jordannanos · 3 months ago

      Yes, working to improve the LLM response and SQL query time.