That's really cool Cory. Thanks for sharing.
You're welcome! Let me know if there's anything else you'd be interested in seeing
Very cool! Thanks for exploring. Helps me jump in with a bit of background knowledge 😊
Thanks Tim!
So you scraped all of Pegasus' documentation, converted it into markdown files and added those to your database?
The docs were already in markdown, so I just grabbed them straight from GitHub. github.com/czue/pegasus-docs.
The app now lets you upload a zip, connect a repo, or build a bot from your Slack history.
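For anyone who wants to try the "grab the markdown straight from the repo" approach, here is a minimal sketch. It assumes the docs repo has already been cloned or unzipped to a local folder, and the `collect_markdown` helper is a hypothetical name for illustration. It is not the code from the video, which isn't public.

```python
from pathlib import Path


def collect_markdown(root):
    """Return (relative_path, text) pairs for every .md file under root.

    Hypothetical helper: assumes the docs are UTF-8 markdown files in a
    local checkout or unzipped archive.
    """
    root = Path(root)
    docs = []
    for path in sorted(root.rglob("*.md")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            docs.append((rel, path.read_text(encoding="utf-8")))
    return docs
```

Each pair could then be split into chunks and embedded by whatever vector store you're using; keeping the relative path around makes it easy to link answers back to their source page.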
@czue Nice! I'm trying to ingest all of a given app's documentation into a vector database so that I can query over it. I've tried scraping, but the problem is that BeautifulSoup scrapes each and every page, even the ones that aren't related to the documentation. I guess this isn't really a problem that LangChain could help me with.
@saadystic ah yeah, sounds like more of a preprocessing/filter step
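One way such a filter step might look, as a rough sketch: before fetching or ingesting a link, keep it only if it sits on the same host and under the docs path. The function names and the `docs_root` parameter are assumptions for illustration, and only the standard library is used; the `hrefs` list would come from whatever your scraper extracts from each page.

```python
from urllib.parse import urljoin, urlparse


def is_doc_url(url, docs_root):
    """True if url is on the same host as docs_root and under its path."""
    u, root = urlparse(url), urlparse(docs_root)
    root_path = root.path.rstrip("/") + "/"
    url_path = u.path.rstrip("/") + "/"
    return u.netloc == root.netloc and url_path.startswith(root_path)


def filter_doc_links(page_url, hrefs, docs_root):
    """Resolve hrefs against page_url and keep unique documentation links."""
    kept = []
    for href in hrefs:
        # Make the link absolute and drop any #fragment.
        absolute = urljoin(page_url, href).split("#")[0]
        if is_doc_url(absolute, docs_root) and absolute not in kept:
            kept.append(absolute)
    return kept
```

A path-prefix check like this only works when the docs live under a single URL prefix; sites that mix docs and marketing pages under one path would need a different rule.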
Would you please provide the source code for this?
gist.github.com/czue/8fe729720c56008c5ef459f6299ebf34 is ~what I worked on. Also, added to the description.
Edit: whoops that was for the wrong video. The source code for this is part of my commercial products (www.saaspegasus.com/ and scriv.ai/), so I can't easily provide it.
Great video, thanks for making it!
After watching your video at my regular 2x speed, I'm now looking for an AI that will censor "umm" 🤣🤣
Thank you for the video, great information.
Any chance we could get the script you put together?
Haha, yeah, sorry. Still working on my ability to speak. 😂
Will try to put together some sample code soon!
This would be more useful if you provided source code.
Yes, unfortunately I'm using it in my commercial products (www.saaspegasus.com/ and scriv.ai/), so I can't easily provide it. But would love to find time to extract it to its own project so I can share.