Excellent, thank you! A very clever strategy for large documents. However, I am a little at a loss in the search of a good embedding model for texts in Spanish. I am not sure whether the BGE models are the best option for these. Can you suggest one that could be integrated seamlessly within your code?
Hi, for Spanish language take a look at jinaai/jina-embeddings-v2-base-es . In your code simply replace the model_name variable and everything should work.
Source code
github.com/debugverse/debugverse-youtube/tree/main/summarize_huge_documents_kmeans
Why do you use the HuggingFaceBgeEmbeddings and not OllamaEmbeddings?
what if images of tables and equations are there in that case?
Excellent, thank you! A very clever strategy for large documents. However, I am a little at a loss in the search of a good embedding model for texts in Spanish. I am not sure whether the BGE models are the best option for these. Can you suggest one that could be integrated seamlessly within your code?
Hi, for Spanish language take a look at jinaai/jina-embeddings-v2-base-es . In your code simply replace the model_name variable and everything should work.
@@DebugVerseTutorials Thank you very much for your kind answer. I'll do that 😊🤗🤗
@@DebugVerseTutorials Hi, if I would to use the Ollama model, how can I know the exact name necessary to put in the model_name?