How to Create an AI-Assisted Search Engine with Python and txtAI in Seconds! Easy Tutorial

  • Published on 20 Oct 2024

Comments • 49

  • @python-programming
    @python-programming  2 years ago +4

    Repository: github.com/wjbmattingly/youtube-txtai
    Help video for Anaconda and Environments: th-cam.com/video/mIB7IZFCE_k/w-d-xo.html (a little old but still very useful)

  • @khalifakhalifa610
    @khalifakhalifa610 2 years ago +3

    Please, we need more videos on txtai.
    Your channel became my favorite!!! Kudos!!

  • @python-programming
    @python-programming  2 years ago +2

    Also in this video, I reference the previous video. That video will go live next week instead.

  • @rickyS-D76
    @rickyS-D76 2 years ago +1

    Thanks for the great video on txtai, I just loved the way you explained it. I would like to see more of the txtai + Streamlit app videos that you mentioned at the end of this video.

    • @python-programming
      @python-programming  2 years ago +1

      Thanks! So glad you enjoyed it! I will be doing more in the near future!

  • @sarasharick5209
    @sarasharick5209 1 year ago +1

    Great video! I might have an opportunity to use this at work instead of some NER I was doing.

  • @satishchaudhary7875
    @satishchaudhary7875 2 years ago +1

    Such a wonderful tutorial on txtai. Please do more videos on txtai and paperai.

  • @Frank97006
    @Frank97006 1 year ago +2

    At 2:40 it says txtai needs Python 3.7. What is meant is Python 3.7 or higher.
    So there is no need to install Python 3.7.
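
The "3.7 or higher" reading above can be checked before installing anything. A minimal sketch using only the standard library; the helper name `meets_minimum` is hypothetical, and the (3, 7) floor is taken from this comment rather than from txtai's current packaging metadata, which may require a newer version:

```python
import sys

# Minimum version quoted in the comment above (an assumption for this sketch;
# newer txtai releases may require a later Python).
MINIMUM = (3, 7)


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor) so that 3.9.9 counts as >= 3.7.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(sys.version.split()[0], status)
```

Tuple comparison is used instead of string comparison because `"3.10" < "3.7"` when compared as text.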

  • @sacred1profane
    @sacred1profane 2 years ago +1

    Thanks for introducing txtai.
    BTW, the model works with Python 3.9.9 on Mac.

    • @python-programming
      @python-programming  2 years ago

      No problem! Thanks for that update on Mac! I only have Linux and Windows machines. Purchasing a Mac is on my to-do list.

    • @ojaskulkarni8138
      @ojaskulkarni8138 4 months ago

      @python-programming How did you get the data set?

  • @debgandharghosh3981
    @debgandharghosh3981 7 months ago

    This video is so helpful for people like me who are taking baby steps towards NLP. I would really love to see how to update a txtai model. The GitHub code for untitled.ipynb might be corrupt, as I couldn't see the code. It wasn't a big issue, though: I could write the code for drawing inferences myself after watching your video.

  • @JOHNSMITH-ve3rq
    @JOHNSMITH-ve3rq 1 year ago

    The big next thing is an appropriate UI for users, including showing them the source of each finding. Also, using a vector database that doesn't rely on memory, since keeping everything in memory won't work at scale. This is very cool though!

  • @umangternate
    @umangternate 5 months ago

    Installing txtai[all] failed for me with errors in the fasttext block (cannot build wheels for fasttext). With pip install txtai there was no error, but txtai is still not "detectable" in VS Code when importing. This is on Windows 11 with Visual Studio (C++ build tools) installed. The Python version is 3.11 and the virtual environment is on drive D: (physically separate from drive C: because I use an SSD for C: and an HDD for D:). What could be the problem? Thank you.
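
A common cause of a package installing cleanly but not being importable in an editor is that the editor runs a different interpreter than the one pip installed into. The sketch below is a generic diagnostic, not a confirmed fix for this particular setup; the helper name `describe_import` is hypothetical. It reports which interpreter is running and whether that interpreter can locate a given top-level module.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report the running interpreter and whether it can locate `module_name`.

    `module_name` should be a top-level package name such as "txtai".
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "module": module_name,
        "found": spec is not None,
        # Path of the module file, or None when the module cannot be located.
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_import("txtai").items():
        print(f"{key}: {value}")
```

Running it once with the interpreter selected in VS Code ("Python: Select Interpreter") and once in the terminal where `pip install txtai` ran would show whether the two `interpreter` paths differ.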

  • @Kalks95
    @Kalks95 6 months ago

    How did you generate the data file? Would I have to build my own data sets for this?

  • @theh1ve
    @theh1ve 2 years ago +2

    Hi, another awesome little video that not only shows a great use case but also how to get up and running with the code. Thank you. In answer to your questions: yes to all! Integration with Streamlit, absolutely, as this is how I would apply it. And understanding how to update the model would be great. Also, say I had two written texts broken down into smaller documents: could I return a tag with the results to see which text each document came from?

    • @python-programming
      @python-programming  2 years ago +1

      Thanks! Awesome! I will put that video together that shows how to do that. As luck would have it, I have just done this for another project. The nested index will also be included.
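
One way to get the tag asked about in this thread is to keep a source label next to each chunk and look it up from the ids a search returns. This is a minimal sketch in plain Python, not the approach from the promised video. It assumes results arrive as `(id, score)` pairs, the shape txtai returns when content storage is not enabled, and that each id is the chunk's position in the indexed list. The sample chunks, the scores, and the helper `tag_results` are made up for illustration.

```python
# Each chunk records which source text it was cut from (sample data, made up).
CHUNKS = [
    {"text": "Opening paragraph of the first text.", "source": "text_a"},
    {"text": "Second paragraph of the first text.", "source": "text_a"},
    {"text": "Opening paragraph of the second text.", "source": "text_b"},
]


def tag_results(results, chunks):
    """Attach source and text to (id, score) pairs, where id indexes into `chunks`."""
    return [
        {"source": chunks[uid]["source"], "text": chunks[uid]["text"], "score": score}
        for uid, score in results
    ]


if __name__ == "__main__":
    # Stand-in for the output of a search call; the scores are invented.
    fake_results = [(2, 0.81), (0, 0.47)]
    for hit in tag_results(fake_results, CHUNKS):
        print(hit["source"], round(hit["score"], 2), hit["text"])
```

Keeping the label outside the index means the lookup works with any search backend that returns ids.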

  • @hunaydahsaeid1609
    @hunaydahsaeid1609 2 years ago +2

    I want to build a system over a database of research papers to determine the originality of new research. Which file type is best, CSV or JSON, for applying machine learning algorithms to it?

    • @python-programming
      @python-programming  2 years ago +2

      Great question. The file really is not important. It comes down to preference and use case. Will it sit on the web? JSON may be better.

    • @hunaydahsaeid1609
      @hunaydahsaeid1609 2 years ago +2

      @python-programming Thank you. 😊

    • @hunaydahsaeid1609
      @hunaydahsaeid1609 2 years ago +1

      @python-programming Yes, I want it to sit on the web. It will contain abstracts and titles for the research papers, and it is supposed to help students find the research most similar to their proposed research. I am applying your lessons about the South Africa data set. You used JSON in the beginning. I'm still learning and haven't completed all your lessons yet.
      Your lessons helped me so much. I'm grateful for them 🙏
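
The point in this thread that the file type "really is not important" can be illustrated by loading the same records from either format into identical Python objects. A minimal standard-library sketch; the two sample papers and the helper names `load_csv` and `load_json` are made up for illustration:

```python
import csv
import io
import json

# The same two made-up records, serialized both ways.
CSV_TEXT = (
    "title,abstract\n"
    "Paper A,Semantic search for thesis archives\n"
    "Paper B,Topic models for historical letters\n"
)
JSON_TEXT = json.dumps(
    [
        {"title": "Paper A", "abstract": "Semantic search for thesis archives"},
        {"title": "Paper B", "abstract": "Topic models for historical letters"},
    ]
)


def load_csv(text):
    """Parse CSV text with a header row into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def load_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


if __name__ == "__main__":
    # Both loaders produce the same records, ready to hand to an indexer.
    print(load_csv(CSV_TEXT) == load_json(JSON_TEXT))
```

One practical difference: CSV values always load as strings, while JSON preserves numbers and nested fields, which may matter once records carry more than a title and an abstract.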

  • @venkatesanr9455
    @venkatesanr9455 2 years ago

    Thanks for the valuable videos. I have been working on semantic search that maps a text query to images or PDF documents as output. For unstructured images, PDFs, and other file types, I have tried OCR-based text extraction from images, text extraction from PDF files, BERT embeddings, and clustering the images.
    Any other suggestions or library-based approaches from your end would be helpful for semantic search on unstructured data, images, and PDFs. Is the open-source txtai helpful for QA between images and text?
    Kindly reply.

  • @AndrewPeverells
    @AndrewPeverells 2 years ago +1

    Great video as always, thank you for the tutorial! :)
    Just one quick question: I don't know anything about semantic search or txtai, but is it language independent? Or has it been trained on English?

    • @python-programming
      @python-programming  2 years ago +1

      Thanks! Oh, that is a great question. I did not speak about that. You will want to select a sentence transformer model for your language.

    • @python-programming
      @python-programming  2 years ago +1

      But it is language agnostic as a workflow.

    • @AndrewPeverells
      @AndrewPeverells 2 years ago +1

      @python-programming Cool, thank you very much! Do you know how many languages these models cover? Is there one for classical languages also?

    • @python-programming
      @python-programming  2 years ago +1

      @AndrewPeverells A good deal! What language(s) are you looking for? I can do some research and maybe demo that in a video for you.

    • @AndrewPeverells
      @AndrewPeverells 2 years ago +1

      @python-programming Oh well, that would be awesome! You don't have to though, it's fine :) I was just wondering whether there was a transformer model for Latin!
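
The advice in this thread, a language-agnostic workflow with a model chosen per language, can be sketched as a small lookup that builds the embeddings configuration. The mapping `MODELS` and the helpers `embeddings_config` and `build_embeddings` are hypothetical. The two model names are public sentence-transformers checkpoints, but whether either handles Latin well is not something this thread establishes. The sketch assumes txtai's `Embeddings` accepts a config dict with a `path` key, and it imports txtai lazily so the configuration part runs without it installed.

```python
# Hypothetical mapping from a language code to a sentence-transformers model.
MODELS = {
    "en": "sentence-transformers/all-MiniLM-L6-v2",
    "multilingual": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
}


def embeddings_config(language):
    """Return an embeddings config dict, falling back to the multilingual model."""
    path = MODELS.get(language, MODELS["multilingual"])
    return {"path": path}


def build_embeddings(language):
    """Create a txtai Embeddings instance for `language` (requires `pip install txtai`)."""
    from txtai.embeddings import Embeddings  # lazy import: only needed when called

    return Embeddings(embeddings_config(language))


if __name__ == "__main__":
    print(embeddings_config("en"))
    print(embeddings_config("la"))  # no dedicated entry, so the fallback is used
```

Only the model path changes between languages; the indexing and search steps stay the same.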

  • @ameybikram5781
    @ameybikram5781 1 year ago

    Is this library safe in terms of data breaches?

  • @MonoJunkie
    @MonoJunkie 1 year ago

    I don't think you mentioned whether txtAI sends my/your data off to the cloud somewhere for analysis or refers to any external APIs or providers which would be important if dealing with sensitive information? Or who/what is behind it and whether it is legitimate? Paid/free? Or if there are licensing restrictions?

    • @python-programming
      @python-programming  1 year ago +1

      Last I checked, it is all local. The creator is very pro open source.

    • @neuml
      @neuml 1 year ago +1

      Confirming that txtai is all local and doesn't send your data off to the cloud. You can download a model, disconnect your internet and everything will still work.

    • @python-programming
      @python-programming  1 year ago +1

      @neuml Thanks for responding!
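
The "download a model, disconnect your internet" behaviour described in this thread can be made explicit by asking the Hugging Face libraries not to make network calls. This is a sketch under two assumptions not stated in the thread: that the model was already downloaded into the local Hugging Face cache, and that it is loaded through the Hugging Face libraries. `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` are environment variables read by those libraries; the helper name `force_offline` is hypothetical.

```python
import os


def force_offline(env=None):
    """Set the Hugging Face offline flags in `env` (default: this process's environment)."""
    if env is None:
        env = os.environ
    env["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: no network requests
    env["TRANSFORMERS_OFFLINE"] = "1"  # transformers: use cached files only
    return env


if __name__ == "__main__":
    # Call this before importing any library that loads models.
    force_offline()
    print(os.environ["HF_HUB_OFFLINE"], os.environ["TRANSFORMERS_OFFLINE"])
```

With the flags set, a model that is missing from the cache should fail to load instead of being fetched, which makes accidental network access easier to notice.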

  • @techdiyer5290
    @techdiyer5290 11 months ago +1

    I'm looking to create a web scraper in Python that basically helps me find what I'm actually searching for. I want it to include something I've named table search. If anyone cares to ask, I'll explain what that is and how I'm thinking of making it work.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 year ago

    What makes txtai different from the alternatives?

    • @neuml
      @neuml 1 year ago

      There are a lot of great vector database options available. txtai strives to make it easy to get up and running fast. It has built-in vectorization, vector storage, hybrid search and an LLM workflow framework for retrieval augmented generation (RAG). Everything runs locally; no external APIs are required.

  • @Superdooperhero
    @Superdooperhero 2 years ago +1

    You're in South Africa? If you're in Cape Town I can show you around.

    • @python-programming
      @python-programming  2 years ago

      Thanks! I am actually and would have totally taken you up on that but we are leaving soon. Coming back next winter though!

  • @PabloPazosGutierrez
    @PabloPazosGutierrez 9 months ago

    It would be nice to search by phrases, not just a single word.

  • @kosemekars
    @kosemekars 2 years ago +2

    The GitHub link returns a 404.

    • @python-programming
      @python-programming  2 years ago +1

      Oh no! I will make it public when I get to a computer. Thanks for letting me know!

    • @python-programming
      @python-programming  2 years ago +1

      Fixed! Thanks again

  • @ravinkponjg
    @ravinkponjg 1 year ago +2

    Please make more interesting videos on txtai.

  • @hunaydahsaeid1609
    @hunaydahsaeid1609 2 years ago +1

    👋

    • @khalifakhalifa610
      @khalifakhalifa610 2 years ago

      Please more videos. You’re my favorite channel now!!!