Nodematic Tutorials
United States
Joined Nov 19, 2020
At Nodematic Labs, our vision is to empower process improvement for the most complex software engineering value streams on the planet. To realize this vision, we build domain-tailored solutions that minimize “time to value”, maximize depth of insights, and facilitate concrete business outcomes. Our YouTube channel is focused on actionable, practical, and valuable technical tutorials, with particular emphasis on software delivery, open source, and cloud engineering.
BigQuery Streaming Simplified (The Two Best Options)
Learn how to stream data into Google BigQuery using two powerful methods: direct Python streaming and Pub/Sub integration. Perfect for data engineers, IoT developers, and anyone working with real-time data analytics!
What You'll Learn:
• How to set up BigQuery streaming using Python
• Understanding different stream types (Pending, Committed, Buffered)
• Working with the Storage Write API
• Implementing Pub/Sub for decoupled data streaming
• Handling permissions and IAM setup
• Best practices for production environments
Key Topics Covered:
- BigQuery data warehousing
- Real-time data streaming
- IoT data ingestion
- Python implementation
- Pub/Sub messaging
- Error handling
- Scalability considerations
- Production best practices
Free Trial - Our New Diagram Tool: softwaresim.com/pricing/ ("YouTube24" for 25% Off)
Demonstration Code and Diagram: github.com/nodematiclabs/bigquery-streaming
0:00 Conceptual Overview
1:20 BigQuery Dataset
2:37 Storage Write API Streaming
12:40 PubSub Streaming
#bigquery #googlecloud #dataengineering
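The Storage Write API portion of the video appends rows in bounded batches rather than one giant request. As a rough, hypothetical sketch of that client-side batching (the helper name and the 500-row limit are illustrative, not from the demo repo):

```python
def chunk_rows(rows, max_rows_per_append=500):
    """Split a list of row dicts into batches no larger than max_rows_per_append.

    BigQuery write paths work best with bounded request sizes, so callers
    append one batch at a time instead of sending everything at once.
    """
    if max_rows_per_append < 1:
        raise ValueError("max_rows_per_append must be >= 1")
    return [
        rows[i:i + max_rows_per_append]
        for i in range(0, len(rows), max_rows_per_append)
    ]

batches = chunk_rows([{"id": n} for n in range(1200)], max_rows_per_append=500)
print([len(b) for b in batches])  # three batches: 500, 500, 200
```

Each batch would then be sent as one append/insert call, with per-batch error handling and retries.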
Views: 3
Videos
Under the Hood: Cloud Composer and Kubernetes (Airflow)
48 views · 9 hours ago
Discover the inner workings of Google Cloud Composer and its Kubernetes foundation in this quick tutorial. Perfect for cloud engineers, DevOps professionals, and anyone working with workflow automation in Google Cloud Platform. Learn About: • Cloud Composer 2 basics • Critical environment configuration settings • Service account permissions and best practices • Kubernetes Autopilot cluster mana...
No Code Dataflow (Apache Beam) Pipeline Builder (Google Cloud)
56 views · 21 days ago
Learn how to build powerful data pipelines in Google Cloud Platform without writing a single line of code! This step-by-step tutorial shows you how to use Dataflow's Job Builder to create efficient data workflows, perfect for beginners and data professionals alike. What You'll Learn: • How to use Dataflow Job Builder • Creating data pipelines without coding • Setting up data transformations vis...
Hands-On Intro to the Cloud SQL Auth Proxy (Google Cloud)
65 views · 21 days ago
Learn how to securely connect to Google Cloud SQL instances using the Cloud SQL Auth Proxy in this comprehensive tutorial. Perfect for DevOps engineers, cloud architects, and developers working with Google Cloud Platform (GCP). Key Topics Covered: • Setting up a Cloud SQL instance • Enabling necessary GCP APIs • Using Cloud SQL Auth Proxy for secure connections • Connecting from outside VPC net...
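Once the Auth Proxy is running, applications connect to it as if the database were local; the proxy handles TLS and IAM authorization to the actual instance. A minimal sketch, assuming a MySQL instance and the SQLAlchemy/PyMySQL URL convention (the helper name is hypothetical):

```python
def local_proxy_dsn(user, password, database, port=3306):
    """Build a MySQL DSN that targets the Cloud SQL Auth Proxy's local listener.

    The application connects to 127.0.0.1; the proxy forwards the connection
    securely to the Cloud SQL instance it was started against.
    """
    return f"mysql+pymysql://{user}:{password}@127.0.0.1:{port}/{database}"

print(local_proxy_dsn("app", "secret", "orders"))
```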
Uplevel Your Database - Migrate to Spanner (Google Cloud)
53 views · 21 days ago
Learn how to migrate your MySQL database from Cloud SQL to Google Cloud Spanner in this comprehensive tutorial. Perfect for developers looking to scale their applications globally while maintaining strong consistency. Key Topics Covered: - Cloud SQL to Spanner migration process - Single-region vs multi-region setup - Schema and data migration strategies - Handling data type conversions - Cost o...
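The data type conversions mentioned above can be sketched as a lookup table. This hypothetical mapping covers a few common MySQL types and their GoogleSQL-dialect Spanner counterparts; it is illustrative, not exhaustive:

```python
# Hypothetical MySQL -> Spanner (GoogleSQL) type mapping, illustrating the
# kind of conversions a Cloud SQL to Spanner migration has to apply.
MYSQL_TO_SPANNER = {
    "INT": "INT64",
    "BIGINT": "INT64",
    "VARCHAR": "STRING(MAX)",
    "TEXT": "STRING(MAX)",
    "DATETIME": "TIMESTAMP",
    "TINYINT(1)": "BOOL",
    "DOUBLE": "FLOAT64",
    "BLOB": "BYTES(MAX)",
}

def to_spanner_type(mysql_type):
    """Look up the Spanner type for a MySQL column type (case-insensitive)."""
    key = mysql_type.upper()
    if key not in MYSQL_TO_SPANNER:
        raise KeyError(f"no mapping for MySQL type {mysql_type!r}")
    return MYSQL_TO_SPANNER[key]
```

Real migration tooling also has to handle precision, defaults, and unsupported types, which is where most manual schema work lands.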
Synthetic Streaming Data Made Easy (No Code Dataflow in GCP)
61 views · a month ago
In this step-by-step tutorial, discover how to create realistic streaming data without writing a single line of code! Perfect for data engineers, cloud architects, and anyone interested in building streaming data pipelines. 📚 What You'll Learn: • Setting up synthetic data generation using Dataflow templates • Creating and configuring PubSub topics and subscriptions • Implementing data schemas f...
CSV-to-BigQuery Pipelines in Google Cloud (No-Code Dataproc)
85 views · a month ago
Learn how to automatically transform CSV files into BigQuery tables using Dataproc! In this step-by-step tutorial, we'll show you how to: • Set up a Cloud Storage bucket for your CSV files • Create a BigQuery dataset • Use Dataproc's serverless batches feature • Configure your network for Private Google Access • Run a Dataproc job to convert CSVs to BigQuery tables Free Trial - Our New Diagram ...
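Before a CSV lands in BigQuery, the pipeline needs a target schema. A rough, hypothetical sketch of per-column type inference from a CSV sample (the real Dataproc job may rely on Spark's schema inference or BigQuery autodetect instead):

```python
import csv
import io

def infer_bigquery_schema(csv_text):
    """Infer a crude (name, type) schema from a CSV sample.

    Tries INTEGER, falls back to FLOAT, then STRING, per column.
    Hypothetical helper for illustration; not production-grade inference.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    schema = []
    for i, name in enumerate(header):
        col_type = "INTEGER"
        for row in data:
            value = row[i]
            try:
                int(value)
            except ValueError:
                try:
                    float(value)
                    col_type = "FLOAT"  # numeric but not integral
                except ValueError:
                    col_type = "STRING"  # non-numeric wins outright
                    break
        schema.append((name, col_type))
    return schema

print(infer_bigquery_schema("id,price,name\n1,2.5,a\n2,3,b"))
```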
Fine Tune and Deploy Llama 3.2 (GOLD STANDARD for Beginners)
487 views · a month ago
Learn how to fine-tune and deploy a production-ready large language model in this comprehensive tutorial, using Hugging Face, Axolotl, vLLM, and Llama 3.2! 🛠️ Tools & Technologies Used: - Llama 3.2 Base Model - Axolotl for Fine-tuning - QLoRA (Quantized Low-Rank Adaptation) - Hugging Face Libraries - vLLM for Production Deployment - Google Cloud Run - Docker 🎯 What You'll Learn: - How to prepar...
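Fine-tuning data has to be serialized into a prompt template before training. A minimal sketch of an Alpaca-style formatter (hypothetical; the video's Axolotl config may use a different template):

```python
def to_alpaca_record(instruction, response, system=""):
    """Format one training example in the Alpaca-style layout often used
    when fine-tuning instruction-following models.

    Returns a dict with a single "text" field, the shape many trainers expect.
    """
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    if system:
        # Optional system preamble ahead of the instruction block.
        prompt = f"{system}\n\n{prompt}"
    return {"text": prompt + response}

record = to_alpaca_record("Summarize: the sky is blue.", "The sky is blue.")
print(record["text"])
```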
From-Scratch Private Service Connect Intro (Google Cloud Networking)
100 views · a month ago
Learn how to set up and use Google Cloud's Private Service Connect in this step-by-step tutorial. Perfect for cloud architects, network engineers, and DevOps professionals looking to enhance their GCP skills. 🔹 What you'll learn: - Understanding Private Service Connect basics - Setting up a service producer environment - Creating a service consumer network - Configuring load balancers for Priva...
1 Minute to Save on Cloud Storage Costs (No Code Compression)
103 views · a month ago
Learn how to slash your cloud storage costs using a simple, no-code compression technique! In this tutorial, we'll show you how to: • Compress cloud storage files without writing complex scripts • Use Google Cloud Storage (GCS) and Dataflow for bulk file compression • Set up a compression job using pre-built templates • Compare file sizes before and after compression • Decompress files when nee...
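The bulk-compression template's effect can be reproduced in miniature with gzip: compress each object, then compare sizes. A small sketch (the sample data is illustrative):

```python
import gzip

def gzip_savings(data: bytes) -> float:
    """Return the fraction of bytes saved by gzip-compressing `data`."""
    compressed = gzip.compress(data)
    return 1 - len(compressed) / len(data)

# Repetitive text (like logs) compresses extremely well.
sample = b"repetitive log line\n" * 1000
print(f"{gzip_savings(sample):.0%} smaller")
```

Decompression is the mirror image (`gzip.decompress`), which is what the "decompress when needed" step in the video corresponds to.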
BigQuery: Scanning and Masking Sensitive Data (DLP)
101 views · a month ago
Learn how to identify and secure sensitive information in your BigQuery datasets with this comprehensive guide. Perfect for data engineers, analysts, and anyone working with cloud-based data warehouses. 🔑 Key topics covered: - Creating sample data with sensitive information - Uploading data to BigQuery - Using Sensitive Data Protection (formerly Data Loss Prevention) to scan for sensitive data ...
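As a toy stand-in for what Sensitive Data Protection's infoType detection and masking does at scale, here is a regex-based sketch for email addresses only (real DLP detection is far more robust than this single pattern):

```python
import re

# Rough email pattern; a hypothetical stand-in for DLP's EMAIL_ADDRESS infoType.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def mask_emails(text, mask_char="#"):
    """Replace each detected email with a same-length run of mask_char,
    mirroring DLP's character-masking transformation."""
    return EMAIL_RE.sub(lambda m: mask_char * len(m.group()), text)

print(mask_emails("contact alice@example.com for access"))
```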
Uppy and Tus: Resumable, Large File Uploads (Demo with Code)
191 views · a month ago
Learn how to implement a powerful resumable file upload system using the Tus protocol, Uppy frontend framework, and Kubernetes backend. Perfect for developers dealing with large file uploads and unstable network connections! 🔧 What you'll learn: - Setting up a Tus protocol server on Kubernetes - Configuring Uppy for client-side file uploads - Handling CORS and HTTPS setup - Implementing resumab...
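Tus makes uploads resumable by tracking a server-side offset that every PATCH must match; after an interruption, the client asks for the current offset and resumes from there. A minimal in-memory sketch of that bookkeeping (hypothetical class, not the real tusd server):

```python
class ResumableUpload:
    """Minimal model of the Tus upload flow's offset bookkeeping."""

    def __init__(self, total_size):
        self.total_size = total_size
        self.offset = 0
        self.data = bytearray()

    def patch(self, offset, chunk):
        # Tus rejects PATCHes whose offset doesn't match server state,
        # which is what makes interrupted uploads safely resumable.
        if offset != self.offset:
            raise ValueError(f"409 Conflict: expected offset {self.offset}")
        self.data.extend(chunk)
        self.offset += len(chunk)
        return self.offset  # new offset, echoed back to the client

    @property
    def complete(self):
        return self.offset == self.total_size
```

A client that crashes mid-upload simply re-reads the offset and PATCHes the remaining bytes, which is exactly the behavior Uppy drives from the browser.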
Hands-On Intro to Illuminate (New Google AI for Podcasts/Speech)
408 views · a month ago
Explore Google's latest AI research project, Illuminate, that transforms written content into engaging audio discussions. Learn how this powerful tool: • Converts research papers and books into podcast-like episodes • Improves upon traditional text-to-speech technology • Allows customization of length, tone, and explanation level • Makes complex topics more accessible and easier to understand T...
Fallback/Failover Service in Google Cloud Load Balancing
94 views · a month ago
Learn how to set up a custom error service for load balancing in Google Cloud Platform. This tutorial covers: - Setting up virtual machines for primary and error services - Creating unmanaged instance groups - Configuring a Global External Application Load Balancer - Implementing custom error handling with Apache - Testing and troubleshooting your setup Perfect for DevOps engineers, cloud archi...
One-Click Multimodal Llama 3.2 Gradio App (GCP PlaySpaces)
97 views · a month ago
Discover how to harness the power of AI with Google Cloud's latest innovation - PlaySpaces in Vertex AI. In this video, we'll dive into: • Vertex AI Model Garden: Access open-source and proprietary AI models • PlaySpaces: Run AI models using Cloud Run • Gradio: Create interactive AI applications • Step-by-step guide to launching a PlaySpace • Live demo: Using Llama 3.2 to interpret images Learn...
Research Game Changer: NotebookLM Introduction (Google AI)
208 views · a month ago
Fallback/Failover Load Balancer via Google Cloud DNS
197 views · 2 months ago
Free, Turnkey Llama 3.2 90B Multimodal API (Google Cloud AI)
415 views · 2 months ago
Controlling Egress (e.g., Domain Blocks) from Compute Engine
72 views · 2 months ago
Llama 3.2 Fine Tuning for Dummies (with 16k, 32k,... Context)
7K views · 2 months ago
Protect Admin Webpages (admin.php) with GCP Load Balancing
41 views · 2 months ago
Protect Your Apps: Managed SSO/OAuth Proxy (Google Cloud)
86 views · 2 months ago
LLM-Integrated Browsing in Firefox (Any OSS or Proprietary Model)
96 views · 2 months ago
Stable Diffusion XL: No Cost, No Login, No Setup on Hugging Face
432 views · 2 months ago
Setup a Load Balancer with Private VMs (Google Cloud)
147 views · 2 months ago
Easily Geoblock Traffic (Russia, China, Sanctions,...) in GCP
68 views · 2 months ago
Translating and Captioning Text in Images/Frames (AI Pipeline, Google Cloud)
90 views · 2 months ago
Simplified: OWASP ModSecurity CRS 3.3 in Google Cloud
159 views · 2 months ago
Custom Llama 3.1 API and User Interface (Serverless Google Cloud)
366 views · 2 months ago
Simplified: XSS and SQLI Protection (Google Cloud Armor)
96 views · 2 months ago
I had this error when I launched the job: "denied: Unauthenticated request. Unauthenticated requests do not have permission "artifactregistry.repositories.uploadArtifacts" on resource". When I saw the logs, I found this message: "Warning: The gcloud CLI is not authenticated (or it is not installed). Authenticate by adding the "google-github-actions/auth" step prior this one." So we must configure the authorization before the job and add the "google-github-actions/auth" action. For example:
  - id: auth
    uses: google-github-actions/auth@v2
    with:
      credentials_json: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS }}
  - name: install gcloud cli
    uses: google-github-actions/setup-gcloud@v2
    with:
      project_id: ${{ secrets.GOOGLE_PROJECT }}
Nice demo. Really helpful
I love the high quality of NotebookLM, what an amazing tool, but the one problem, a crucial one, is that only two voices are available. People are using them like crazy, so they're starting to sound oversaturated. That's why I'm searching for a free and unlimited method to change the voices while keeping the quality. I found an AI that does the job with the same quality, but it's a pity it's neither cheap nor unlimited. The search goes on...
They're playing music when they could be explaining what they're doing. Practically no one understands what they're doing. That lack of explanation led me to thumbs it down and stop watching. Hopefully I'll find something more useful.
Is it worth using Unsloth with Amazon SageMaker?
Hi, my name is Bongani from South Africa. Firstly, thank you for such an informative video. Short and straight to the point. I'm a non-technical co-founder in our startup, and I would like to ask you something that is somewhat technical. I'm in a region where the model has limited data on local languages. It's mainly good for detecting profanity. I would like to fix that by creating my own audio datasets, transcribing them, and then feeding those into the model to improve it. Is that something that is possible to do? I'm from a sound engineering background, now working in the telehealth space.
Very good overview video
Great tutorial! Just wish we could use checkpoints using the online version...
When I follow the tutorial I get an error: denied: Unauthenticated request. Unauthenticated requests do not have permission "artifactregistry.repositories.uploadArtifacts"
I am getting the same error as well :(
I'm grateful that I found this channel without the recommendations.
This is super awesome 😎😎... thorough and easy to follow... thanks a lot 👍🏿👍🏿
Thank you!
Hi, I got an error like: "Not found: Dataset peak-catbird-440802-b3:dataform was not found in location US". I have given all the permissions as you mentioned, and the dataset location is US in BigQuery.
Such an intuitive video... very disheartening to see so few views, likes, and subscribers 😑... hope you continue making such videos.
I have a folder with many PDF's and I would like to fine tune a model to summarize these PDF's and respond to questions in my website. Is there a way to do that using the example of this video?
You would first have to figure out the parsing logic to correctly extract the text and then put it in a summarizer. If all you want is a summary, there are many good models available on Hugging Face that you can use directly, OR just get a Gemini API key and use Gemini for it; it should do a decent job.
@ thank you
@@igorcastilhos also if you want to use the PDFs as context to answer questions then you probably need to parse and then put them into a vector store so that they can be retrieved when needed, this is called RAG
@@abdulsami5843 I'm using Ollama with the Web UI tool. Through it, I'm uploading some PDFs of attorneys' resolutions to the Knowledge collection, so that they can ask about them whenever they want. The main reason to use Ollama (llama3.2) instead of the OpenAI API is that it's free. But I'm having problems accessing the Web UI at localhost:3000 on our server from my machine; it doesn't show the models installed on the server machine.
@@abdulsami5843 Also, the RAG feature doesn't have very good documentation. In my case, we have a shared folder in Microsoft Windows (like C:/), and the attorneys and advocates will send new PDFs into that folder through the website. I wanted to use RAG for it, but it is very hard.
How can I download it as an Excel file to my computer?
How do I submit a project in Python? The main file (driver) main.py, the other files that main.py imports, and other project files like requirements.txt, configs.json, etc.
Fantastic tutorial. Can it recognize handwriting and extract it too? Google has amazing people.
Great video!
Could you link the actual notebook you are using somewhere please? Feedback wise, I'm 47, been learning forever, you have an awesome pace, quality wise, I'd pay for the content.
Thanks! Here's the notebook colab.research.google.com/drive/1vIrqH5uYDQwsJ4-OO3DErvuv4pBgVwk4?usp=sharing
thanks!
Great, thanks. How do you bring up the interface from UC? Do you have any video explaining that?
At the time of video creation, the interface wasn't available, but we'll try to cover this in a future video.
Very good.
Thanks for the good tutorial, ma'am. If you only save the QLoRA adapter, how can we use it in tools like LM Studio?
You will always need both the base model and the adapter layers to actually run/use the model - it's just a question of if you want to merge and store everything "together". I haven't tried LM Studio, but upon a quick look, I would suggest saving/publishing your model as merged weights (should be simpler to pull into LM Studio). The "GGUF Conversion" portion of the fine-tuning notebook might actually be best for the export, based on LM Studio's website "LM Studio supports any GGUF Llama, Mistral, Phi, Gemma, StarCoder, etc model on Hugging Face". Hope that helps!
@@nodematic Yes, I exported as GGUF and used the merged option, and it worked. Although I ran into the model overfitting to a small dataset, which made its behavior go weird in some cases. I'm working to expand my training data, and maybe add different system prompts into the data as well... The question is, how do I divide my data into training/test datasets and check the loss function on the test dataset? Does that Colab notebook support it, or do I need to figure it out myself? Thanks for the answer. 💙
The loss is reported after each step in the training (you'll see this in the training cell in the notebook). A good approach is to see where the loss starts to "level off" (decrease significantly slower), and use the model at that point. You could also consider reducing the LoRA alpha value to put less emphasis on the adapter layers (and increase the emphasis on the base model). The expanded training set is a good idea, especially if you have less than ~200 examples. There isn't a traditional training/test split in fine-tuning like you would in other AI/ML problems - partly because responses are difficult to score quantitatively and with precision. Instead, people will do a post-training quality step of RLHF to integrate feedback on which fine-tuned model answers were good and which were bad, on tests. There are also some advanced methods to limit overfitting, but it's well beyond the scope of most small model fine tuning.
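The "level off" heuristic described here can be sketched as a small function over the reported per-step losses (hypothetical thresholds; tune `window` and `min_drop` to your loss scale):

```python
def plateau_step(losses, window=3, min_drop=0.01):
    """Return the first step index where the average loss improvement over
    the previous `window` steps falls below `min_drop`, i.e. where training
    loss has "leveled off". Returns None if no plateau is found.
    """
    for i in range(window, len(losses)):
        avg_drop = (losses[i - window] - losses[i]) / window
        if avg_drop < min_drop:
            return i
    return None

print(plateau_step([2.0, 1.5, 1.2, 1.1, 1.09, 1.085, 1.083]))  # -> 6
```

In practice you would stop training (or pick the checkpoint) around the returned step rather than continuing to burn compute on marginal gains.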
Does this require a load balancer? Or can you region block my instances? I'm trying to stay within the free tier and block everything but US traffic.
The demonstrated setup requires a load balancer - you'd have to DIY something if you're routing traffic straight to your VM public IP.
Thanks for the informational video. I really liked the way you demonstrated the workflows and detailed steps of installing Milvus.
Nice tutorial. Got me subscribed! ❤ Buuuutt, I want a new, more detailed fine-tuning tutorial on Gemma 2 9B, especially for coding purposes.
Thanks. We'll add that to our video ideas.
@@nodematic can't wait for that to come out! 💯 keep it up! 💪🏼
Hello, I'm from Brazil. I'm new to AI. I would like to build an artificial intelligence to automate university work, as I have a lot of work and I can't keep up with it. I want an AI that can write papers like me using my texts. What adjustments or training should I do? Do I need to change a parameter?
In my mind I'm trying to use about 10 review texts of my articles. And 1 expanded summary. I want the AI to write like me without AI plagiarism detection.
ok cool. But how good is the forecast realistically?? Where are the tests and benchmarks?
I'd recommend the "Empirical Results" portion of the TimesFM research paper for some testing/benchmarking arxiv.org/pdf/2310.10688.
what an evolved version of 2010 notepad instruction. Loved it.
I got this when running "kubectl apply -f notebook.yaml":
  deployment.apps/jupyter configured
  persistentvolumeclaim/jupyter-pvc unchanged
  error: resource mapping not found for name: "bucket-data" namespace: "" from "notebook.yaml": no matches for kind "Dataset" in version "com.ie.ibm.hpsys/v1alpha1" ensure CRDs are installed first
Any idea why?
Could you create a tutorial on how to make a GCP project publicly accessible across Google Cloud? I have already created a GCP project, but I don't know how to make it possible for customers to find my project. Here is the list of applications I plan to host: Cloud Storage, Data Loss Prevention (DLP), Datastore, Firestore, Firebase, Pub/Sub, Pub/Sub Lite, compliance, Cloud Functions, BigQuery.
RAG on Roids! Better than what I made :)
You lost me within 60 seconds. How is this "for dummies"?
Thanks for making this. Workbench bills are tied to the VM uptime, so it shouldn't be $110 a month unless you work day and night every day of the month, which I hope not!
amazing tutorial, can you do one for https load balancer please!! thank you
Thanks! We'll add this to our videos list.
very nice
Just subscribed to PRO, this will save me $5k on internal hardware for POC and just getting things strung together, and give me access to better hardware!
thanks 🙂
Will this software building studio ever be available for public use?
Yes, it's available now at softwaresim.com/pricing/
Awesome. Easy to follow. Ty
I like the style of this video, dam!! Nice job
Please show us how to make embeddings based on vertex ai and how to deploy anything as an web app or android app from vertex. Thank you so much for the wonderful content.
Thanks for the suggestion - we'll add that to our video plans.
Agree. That would be a great piece of content to watch.
@@nodematic Thanks!
@@xXWillyxWonkaXx yup
Can you explain what an SQLX file is?
SQLX is essentially a templated version of SQL, which also includes some metadata (the details at the top). Most SQL scripts are valid SQLX scripts - it's sort of an extension of SQL.
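For illustration, a minimal SQLX file looks like ordinary SQL preceded by a config block (the table and column names here are hypothetical):

```sql
config {
  type: "table",
  description: "Daily order totals"
}

SELECT order_date, SUM(total) AS daily_total
FROM ${ref("orders")}
GROUP BY order_date
```

The `config` block carries the metadata, and `${ref("orders")}` is the templating part: Dataform resolves it to the fully qualified table name and uses it to build the dependency graph.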
This video is helpful, thanks a lot. Can you please post a video on Spark + Docker + Unity Catalog + Delta Lake (delta-spark), combining all of the above?
Interesting idea, thanks. We'll add it to our video ideas list.
Hello, can I execute Dataform workflows in Cloud Shell?
Yes, with the Dataform CLI (or SDKs). Something like "dataform run" when you have the CLI installed. The gcloud CLI does not appear to support Dataform.
Hey, hello. I'm a complete newbie at this, so I apologize in advance if this sounds silly. Anyway, I followed the tutorial step by step and I can't seem to obtain a CSV file, or even visualize it. What could be going wrong?
Hello, does using all these tools and services have a cost in Google Cloud, or are they free services?
No, deploying Llama 3.1 will have a significant cost. However, you can use the Llama 3.1 model-as-a-service preview (currently free) plus the Cloud Run and Artifact Registry free tiers to get a no-cost version of this architecture.
So TimesFM is used to forecast future time-series data based on current time-series data?
Exactly
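To make the task concrete, here is a deliberately naive baseline (not TimesFM itself) with the same shape: a history window in, a horizon of future points out:

```python
def naive_seasonal_forecast(history, horizon, season=7):
    """Forecast `horizon` future points by repeating the last full season.

    A simple baseline that shows the shape of the forecasting task a model
    like TimesFM addresses far more capably.
    """
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]

weekly = [10, 12, 15, 14, 13, 9, 8] * 4  # four weeks of daily values
print(naive_seasonal_forecast(weekly, horizon=3))  # -> [10, 12, 15]
```

Foundation models replace the "repeat the last season" rule with learned patterns, but the input/output contract is the same.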
Thank you for sharing this useful video!👍 I followed your steps today but I got the following error: "Training pipeline failed with error message: timestamp transformation is specified for column date, which does not contain timestamp data, stats: distinct_value_count: 20000 category_stats" Would you mind giving me a hand?🥺✨