Scrape Financial Data from SEC Edgar with Python
- Published on 25 Jul 2024
- In this video, we'll see how you can scrape financial reporting data from SEC Edgar using Python and the Edgartools library. After watching this video, you should be able to access and scrape many types of corporate disclosure such as 10-K (annual reports), 10-Q (quarterly reports), 8-K (material events), 13F (portfolio holdings), and many more. All that data can easily be extracted as strings for text data and as pandas dataframes for structured data.
A written version of the tutorial is available here: vincent.codes.finance/posts/e...
👍 Please like if you found this video helpful, and subscribe to stay updated with my latest tutorials. 🔔
❤️ You can support this channel by buying me a ☕: buymeacoffee.com/codesfinance
🔖 Chapters:
00:00 Intro
00:22 SEC Edgar
01:22 Edgartools
02:06 Company data
03:40 Download 10-Q quarterly reports
07:18 Download 8-K and press releases
11:43 Get all filings
13:06 Download 13F holdings
14:27 Outro
🔗 Video links:
SEC Edgar: www.sec.gov/edgar/searchedgar...
edgartools: github.com/dgunning/edgartools
🐍 More Vincent Codes Finance:
- ✍🏻 Blog: vincent.codes.finance
- 🐦 X: / codesfinance
- 🧵 Threads: www.threads.net/@codesfinance
- 😺 GitHub: github.com/Vincent-Codes-Finance
- 📘 Facebook: / 61559283113665
- 👨💼 LinkedIn: / vincent-codes-finance
- 🎓 Academic website: www.vincentgregoire.com/
#sec #secedgar #financials #scraping #python #programming #code #nlp #opensource #pandas #edgartools #bigdata #research #researchtips #vscode #professor #datascience #dataanalytics #dataanalysis #financialrecords #financialreporting #annualreport #earningsreport #earnings - Science & Technology
This is an amazing video that demonstrates Edgartools. If I had the chance to like it 1000 times, I would. Thanks again. This series must go on...
Thanks for the kind words! I have a few more scraping-related videos planned in the near future.
Great instructions! Please share more content like this. Subscribed already. Thanks.
Thank you! Will keep that in mind. How to get good financial data is certainly something I want to cover going forward.
At 3:24 you stated that running `apple.financials` would yield the most recent data; however, the data scraped was from 2023.
Is there a way to ensure that the data provided when running this code is the most recent data available?
I should have been more precise. This gives you the financial data from the latest annual report (form 10-K). In the case of Apple, the latest financials available would be from the quarterly report issued earlier this month. You can get the latest quarterly financials using: `apple.get_filings(form="10-Q").latest().obj().financials`. Note that companies only file reports quarterly, so there is always some lag in the data.
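A minimal sketch of that call chain wrapped in a small helper, so the same code can fetch the latest 10-Q or 10-K. The helper name `latest_financials` is hypothetical, and the function only assumes that the object passed in supports the `get_filings(form=...).latest().obj().financials` chain quoted above; it has not been checked against a specific edgartools version.

```python
def latest_financials(company, form="10-Q"):
    """Return the financials from the most recent filing of the given form.

    `company` is assumed to be an edgartools Company, or any object that
    supports the get_filings(form=...).latest().obj().financials chain.
    """
    filings = company.get_filings(form=form)  # all filings of that form type
    filing = filings.latest()                 # the most recently filed one
    if filing is None:
        # Defensive check (assumption): no filing of this type was found.
        raise LookupError(f"no {form} filing found")
    return filing.obj().financials            # parsed report, then its financials


# Hypothetical usage (needs network access and the edgartools package):
#   from edgar import Company, set_identity
#   set_identity("Your Name your.email@example.com")  # SEC asks callers to identify themselves
#   apple = Company("AAPL")
#   quarterly = latest_financials(apple)              # latest 10-Q
#   annual = latest_financials(apple, form="10-K")    # latest 10-K
```

Comparing the two results is a quick way to see which report is actually the more recent one for a given company.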
Very good video! If I want to get the 10K filings for a given company ticker to use for RAG training in an LLM would the approach in the video suffice?
Yes, kind of. It will let you download all the filing information. But keep in mind that 10-Ks contain more than just the filing text. To get financials, you have to extract the XBRL information from the filing. Most filings also have attachments. Edgartools will let you download all of these. The hardest part for RAG is not the scraping; it is structuring all that data in a way that is useful for querying.
Congratulations, and thanks for sharing!!
Hi, nice video. I'm new to coding. How can I extract the cash flow, for example, for different dates? Let's say 2005. Thank you!
I haven't done this specifically, but you have to look at the filings that would have that information. For most companies, cash flow data would be available on a quarterly basis in the 10-Q form, as part of the XBRL (I'm not sure of the exact XBRL field). I would approach this by searching for all filings of type "10-Q" that I'm interested in (based on year and company), then retrieving the XBRL data for each filing, then extracting the cash flow column.
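The approach described above (pick the 10-Q filings for a year, then pull data from each one) could be sketched roughly as follows. The helper names `filings_in_year` and `extract_per_filing` are hypothetical, and the code assumes each filing object exposes a `.form` string and a `.filing_date` that is either a `datetime.date` or an ISO "YYYY-MM-DD" string; check this against the filing objects you actually get back.

```python
from datetime import date


def _year_of(filing_date):
    """Return the year of a date object or an ISO 'YYYY-MM-DD' string."""
    if isinstance(filing_date, str):
        return date.fromisoformat(filing_date).year
    return filing_date.year


def filings_in_year(filings, year, form="10-Q"):
    """Keep filings of the given form type that were filed during `year`."""
    return [
        f for f in filings
        if f.form == form and _year_of(f.filing_date) == year
    ]


def extract_per_filing(filings, year, extract, form="10-Q"):
    """Apply `extract` to each matching filing; return (filing_date, result) pairs."""
    return [(f.filing_date, extract(f)) for f in filings_in_year(filings, year, form)]


# Hypothetical usage with edgartools (needs network access):
#   company_filings = Company("AAPL").get_filings(form="10-Q")
#   results = extract_per_filing(company_filings, 2015, lambda f: f.obj().financials)
```

One caveat: XBRL tagging of SEC filings was only phased in from 2009, so for a year like 2005 the structured data is unlikely to exist and the statement would probably have to be parsed from the filing text instead.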
Damn this is so easy
That was my reaction too. Edgartools is just amazing when you need to get Edgar data in a usable format.
Is there a way to get the raw financial data in tables? Like get the income statement into a table that can be analyzed?
Yes, it is in the XBRL data. Financial data is in the "us-gaap" namespace. Each data point has a standardized name called "fact" and a value. When you call the get_facts() method on a company, you get all of them in an object that can easily be converted to a DataFrame. The main caveat I noticed is that the "value" may not be parsed properly (to int, float, decimal, etc) so you might have to do some extra work for that step.
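For the value-parsing caveat, a small standard-library helper along these lines might be enough. This is only a sketch: the function name `parse_value` is hypothetical, the input formats it handles (thousands separators, parenthesised negatives) are guesses at what raw values could look like, and the sample rows are made up for illustration rather than taken from a real filing.

```python
from decimal import Decimal, InvalidOperation


def parse_value(raw):
    """Convert a raw fact value to Decimal, or return None if it is not numeric."""
    if raw is None:
        return None
    if isinstance(raw, (int, Decimal)):
        return Decimal(raw)
    if isinstance(raw, float):
        return Decimal(str(raw))  # go through str to avoid binary float noise
    text = str(raw).strip().replace(",", "")  # drop thousands separators
    if text.startswith("(") and text.endswith(")"):
        text = "-" + text[1:-1]  # accounting-style negative, e.g. "(42)"
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None  # not a number (empty string, "n/a", free text, ...)
    return value if value.is_finite() else None  # reject NaN / Infinity


# Made-up sample rows, shaped like (fact name, raw value) pairs:
sample = [("Revenues", "1,234.50"), ("NetIncomeLoss", "(42)"), ("EntityName", "n/a")]
parsed = {name: parse_value(raw) for name, raw in sample}
```

Using `Decimal` rather than `float` keeps reported amounts exact, which matters if the values are later summed or compared across filings.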
@VincentCodesFinance OK, sounds good. I was able to use "company.financials.balance_sheet" to get company financials. I was wondering if it's possible to get historical financial info, as this function doesn't really work with filing data. Would I have to use get_facts() and create the income statement by pulling in specific items?
Thank you for making this video!
That's how I would try it. You can also have a look at the code that produces company.financials.balance_sheet, it might give you an idea on how to proceed.
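As a rough illustration of building a statement by pulling specific items out of the facts, the sketch below groups flat fact records by period. Everything about the record shape is an assumption: the keys `fact`, `period_end`, and `value` are hypothetical, the function name `build_statement` is made up, and the sample numbers are invented. The concept names are standard us-gaap element names, but a given company may report different ones.

```python
from collections import defaultdict

# us-gaap concepts to pull into the statement (a company may use other tags).
INCOME_ITEMS = ["Revenues", "CostOfRevenue", "NetIncomeLoss"]


def build_statement(facts, items):
    """Pivot flat fact records into {period_end: {item: value}}.

    Items missing for a period are filled with None so every row has
    the same columns.
    """
    by_period = defaultdict(dict)
    for row in facts:
        if row["fact"] in items:
            by_period[row["period_end"]][row["fact"]] = row["value"]
    return {
        period: {item: by_period[period].get(item) for item in items}
        for period in sorted(by_period)
    }


# Invented sample records, for illustration only:
facts = [
    {"fact": "Revenues", "period_end": "2022-12-31", "value": 100},
    {"fact": "NetIncomeLoss", "period_end": "2022-12-31", "value": 10},
    {"fact": "Revenues", "period_end": "2021-12-31", "value": 90},
    {"fact": "Assets", "period_end": "2022-12-31", "value": 500},
]
statement = build_statement(facts, INCOME_ITEMS)
```

The resulting nested dict can be passed to `pandas.DataFrame.from_dict(statement, orient="index")` to get one row per period.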