Semantic Word Maps and Clusters

Orange Data Mining

Views: 8,515

Published on 19 Nov 2024

Comments • 13

@lemndemiel5263 2 years ago ⁺²
Nice to see you (all) again - more videos! Thanks!
@SpiritTracker7 2 years ago
Thanks, great video. In which add-on can I find the t-SNE module?
@waeladel 1 year ago
Great tutorial. How can I find the sentences that match a selected cluster? Concordance only does one query at a time, and Corpus Viewer fetches the documents, not the sentences.
@OrangeDataMining 1 year ago ⁺¹
Good question. It cannot be done easily, mostly because of the data structure underneath. Technically, you could tokenize on sentences separately and use the search option to look for specific words in sentences (but also one query at a time). We will think about how to handle this. Thank you for the hint.
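The workaround described in this reply (split documents into sentences, then search the sentences for words) can be sketched outside Orange in plain Python. This is only an illustration under stated assumptions: the regex sentence splitter is deliberately naive, the function names `split_sentences` and `sentences_matching` are hypothetical, and the two sample documents are made up. It is not Orange's API. Unlike a one-query-at-a-time search, it checks a whole list of cluster words in one pass.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumption): break after ., ! or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def sentences_matching(documents, words):
    """Return (document index, sentence) pairs whose sentence contains any of the words."""
    wanted = {word.lower() for word in words}
    hits = []
    for doc_index, text in enumerate(documents):
        for sentence in split_sentences(text):
            tokens = set(re.findall(r"\w+", sentence.lower()))
            if tokens & wanted:
                hits.append((doc_index, sentence))
    return hits


# Made-up sample documents, for illustration only.
docs = [
    "The wolf met the girl in the forest. She carried a basket.",
    "A king had three sons. The youngest met a wolf.",
]

for doc_index, sentence in sentences_matching(docs, ["wolf", "forest"]):
    print(doc_index, sentence)
```

The list passed as `words` would be the characteristic words of the selected cluster, however they were obtained.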
@no8888one 1 year ago
Hello, and thank you for your great work! I have a problem with the "Document Embedding" widget, which keeps giving me an error when I run it after Corpus or Preprocess Text. I tried the grimm_tale dataset and many other datasets, and the error always appears. Could you advise, please?
@OrangeDataMining 1 year ago
Possibly an issue with your internet connection? Are you behind a firewall or on a proxy? Alternatively, post the error on our Github page and we will try to help.
@angelo.signore 8 months ago
Hi, great video, thanks a lot!
I have a question: I ran a search on Scopus and exported the .csv file with, e.g., 700 entries. I chose to include Author, Indexed Keywords, Title and Abstract of each article as columns.
Then I chose the following widgets (in sequence):
Preprocess Text
Document Embedding
t-SNE
However, if I choose Distances and then Hierarchical Clustering, the clusters do NOT show the "words" but the document type (Article, Review, etc.) or other fields, such as "Title" or "Abstract". The entire field is shown rather than the tokenized words, e.g. "Feedback control of water supply in an NFT growing system" or "Light And CO2 interaction on peanut grown in nutrient film technique", not the single words.
I hope it's clear what I mean.
Thanks for everything.
@OrangeDataMining 8 months ago
Document Embedding does not work on individual words. Instead, it returns document embeddings of a fixed vector size, which are not interpretable. For interpretable word features, you need Bag of Words. You can cluster on embeddings and then use BoW features only for explanation, but the two might not coincide. To explain individual clusters, you can use Box Plot or Word Enrichment. Alternatively, you can use Annotated Document Map after t-SNE, which will provide significant cluster words.
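The idea of clustering on embeddings and using bag-of-words features only for explanation can be illustrated with a small stand-alone sketch. Assumptions, stated loudly: the cluster labels are taken as given (they would come from whatever clustering was run on the embeddings), the four short documents are made up, and the score used here (share of cluster documents containing a word minus the share of all documents containing it) is a simple stand-in, not the statistic used by Orange's Word Enrichment or Box Plot. The function names are hypothetical.

```python
from collections import Counter
import re


def bag_of_words(text):
    """Lower-cased word counts for one document."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def characteristic_words(documents, labels, top_n=3):
    """Rank words per cluster by (share of cluster docs) - (share of all docs)."""
    total = Counter()        # number of documents containing each word
    per_cluster = {}         # the same count, restricted to each cluster
    sizes = Counter(labels)  # number of documents in each cluster
    for text, label in zip(documents, labels):
        words = set(bag_of_words(text))
        total.update(words)
        per_cluster.setdefault(label, Counter()).update(words)

    n_docs = len(documents)
    result = {}
    for label, counts in per_cluster.items():
        scores = {
            word: in_cluster / sizes[label] - total[word] / n_docs
            for word, in_cluster in counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda word: (-scores[word], word))
        result[label] = ranked[:top_n]
    return result


# Made-up documents; labels are assumed to come from clustering on embeddings.
documents = [
    "nutrient film technique for lettuce",
    "nutrient supply in film technique",
    "feedback control of water pumps",
    "feedback control for water supply",
]
labels = [0, 0, 1, 1]

print(characteristic_words(documents, labels))
```

Because the labels come from the embeddings while the explanation comes from word counts, the top words describe the clusters but did not produce them, which is why the two views may not coincide.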
@angelo.signore 8 months ago
@OrangeDataMining, thanks for your answer.
So, if I have understood correctly, I cannot apply the procedures from this video to the .csv file from Scopus, but I can by putting the entire documents (.pdf, .docx, etc.) in a folder.
Conversely, on the .csv file I can apply BoW after Preprocess Text, and then Box Plot, etc.?
I didn't find Annotated Document Map, but I did find Annotated Corpus Map, and I applied it after BoW.
@OrangeDataMining 8 months ago
@angelo.signore No, you can certainly apply the same procedure to a .csv file. Just use the Corpus widget and define your text field under "Text features".
Sorry, I meant Annotated Corpus Map, you are right.
@angelo.signore 8 months ago
@OrangeDataMining Yes, I did the procedure on the .csv file.
I chose File->Corpus and, under "Text features", Keyword, Title and Abstract.
Then Document Embedding->Preprocess Text->Distances, Hierarchical Clustering->Annotated Corpus Map.
The problem is that in Hierarchical Clustering I cannot find the term "Words" in the "Annotations" drop-down menu.
@OrangeDataMining 8 months ago
@angelo.signore No, Annotated Corpus Map follows t-SNE, not HC (Hierarchical Clustering). After HC, you should use Box Plot or Word Enrichment (but this requires BoW beforehand).
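The Distances and Hierarchical Clustering branch discussed in this thread can be sketched in plain Python to show what it produces: one cluster label per document, computed from fixed-size vectors, with no words involved. This is a minimal sketch under stated assumptions: cosine distance and average linkage are choices made here for illustration (the thread does not say which settings were used), the three-dimensional "embeddings" are made-up numbers, and the function names are hypothetical rather than Orange's API.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def average_linkage_clusters(vectors, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs.
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                dist = sum(
                    cosine_distance(vectors[a], vectors[b]) for a, b in pairs
                ) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    # Turn the list of clusters into one label per input vector.
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for index in members:
            labels[index] = label
    return labels


# Made-up stand-ins for document embeddings (real ones have many more dimensions).
embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
]
labels = average_linkage_clusters(embeddings, n_clusters=2)
print(labels)
```

The output contains only labels, which is why a separate bag-of-words step is needed before the clusters can be described in terms of words.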
