Thanks for this! This is a gold mine of awesomeness!!!
That was a really good presentation of the value of using Neo4j as your text-mining database. Many thanks!
# Computing Paradigmatic Similarity
### Represent each word by its context
### Compute Context Similarity
### Words with high context similarity have a paradigmatic relation.
That's the heart of the matter!
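As a concrete illustration of the third step, here is a minimal Cypher sketch (my own, not from the talk), assuming the :Word/:NEXT adjacency graph built further down this thread and using only each word's right-hand context for brevity:

// Sketch: paradigmatic similarity as Jaccard overlap of right-hand NEXT contexts
MATCH (w1:Word)-[:NEXT]->(ctx:Word)<-[:NEXT]-(w2:Word)
WHERE w1.name < w2.name
WITH w1, w2, count(DISTINCT ctx) AS shared
MATCH (w1)-[:NEXT]->(c1:Word)
WITH w1, w2, shared, count(DISTINCT c1) AS deg1
MATCH (w2)-[:NEXT]->(c2:Word)
WITH w1, w2, shared, deg1, count(DISTINCT c2) AS deg2
RETURN w1.name, w2.name,
       toFloat(shared) / (deg1 + deg2 - shared) AS jaccard
ORDER BY jaccard DESC LIMIT 10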
Great illustrations of the ideas make understanding so easy. Thank you very much.
Impressed by the NLP parts. This is exactly what I need.
What's the relationship between your approach and Markov chains?
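One way to see the connection: normalising the NEXT counts turns the word-adjacency graph into a first-order Markov chain. A minimal sketch, assuming the w.count and r.count properties maintained by the snippet further down this thread:

// Sketch: transition probabilities P(w2 | w1) from the stored counts
MATCH (w1:Word)-[r:NEXT]->(w2:Word)
RETURN w1.name, w2.name, toFloat(r.count) / w1.count AS p
ORDER BY p DESC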
11:28 That code has a problem: by counting both w1 and w2, each word is counted twice, except for the first and last words in the sentence.
I've been trying to fix this, but I can't find a perfect solution. (Removing the ON MATCH from the w1 case works well, but only for a single query; a second query will fail to count the first word if it appeared in the first sentence.)
I solved it this way...
// Word adjacency graph with word counts
WITH split(toLower("I like chicken sandwiches with cheese"), " ") AS text
UNWIND range(0,size(text)-2) AS i
MERGE (w1:Word {name: text[i]})
ON CREATE SET w1.count = 0
MERGE (w2:Word {name: text[i+1]})
ON CREATE SET w2.count = 0
MERGE (w1)-[r:NEXT]->(w2)
ON CREATE SET r.count = 1
ON MATCH SET r.count = r.count+1
WITH i, text
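// Each pass counts text[i], the left-hand word of the current bigram, exactly once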
MATCH (w:Word {name:text[i]}) SET w.count = w.count+1
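// The last word is never a left-hand word, so count it once, on the final iteration only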
WITH text, CASE WHEN i = size(text)-2 THEN 1 ELSE 0 END AS inc
MATCH (w:Word {name:text[size(text)-1]}) SET w.count = w.count+inc
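For what it's worth, a possibly simpler two-pass alternative (my sketch, not from the original comment): count every token first, then build the edges separately:

// Pass 1: count every token occurrence directly
WITH split(toLower("I like chicken sandwiches with cheese"), " ") AS text
UNWIND text AS token
MERGE (w:Word {name: token}) ON CREATE SET w.count = 0
SET w.count = w.count + 1

A second query can then MERGE the :NEXT relationships exactly as above, with no word-count bookkeeping in the loop.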
@orderscapeinc.6341 I've been trying this for larger corpora and the query performance is very slow. Any suggestions on how to speed it up?
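One likely culprit (an assumption about the setup, not a confirmed diagnosis): without a uniqueness constraint on :Word(name), every MERGE has to scan all :Word nodes. In Neo4j 4.4+ syntax:

// Assumed fix: back the MERGE key with an index via a uniqueness constraint
CREATE CONSTRAINT word_name IF NOT EXISTS
FOR (w:Word) REQUIRE w.name IS UNIQUE

Batching the input (for example with apoc.periodic.iterate) should also help on large corpora.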
This is fabulous. Can you provide access to your original slide deck and are there any updates on this topic?
Insightful!! One question, though: how can one extract keywords from an article before loading them into the graph? For example, with POS tagging or custom ontology tagging?
Hi! Thanks for your video. Unfortunately, when we tried the Cypher code presented at 16:51, it didn't work. Unless we made a mistake, could you contact us to help solve our problem? Thanks in advance.
I've been experimenting with the opinion mining for a while and... it works, but the results are kinda funky.
I took a bunch of Amazon reviews for the Kindle Paperwhite, and besides the number-one result being "this is a book", which was pretty funny, my first significant result was "This is a huge improvement".
So far so good, but then I rebuilt the graph with the stopword "a" removed, and now the first result is "This is not huge improvement"... and I was like "oh crap".
I don't know what to make of it. The only solution I can think of is to query positive reviews and negative reviews separately.
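A sketch of one possible mitigation (my assumption, not tested on that dataset): filter stopwords while building the graph, but deliberately keep negation words like "not" out of the stopword list:

// Sketch: strip stopwords at load time, but never drop negations
WITH split(toLower("this is not a huge improvement"), " ") AS raw,
     ["a", "an", "the"] AS stop   // note: "not" is deliberately excluded
WITH [t IN raw WHERE NOT t IN stop] AS text
UNWIND range(0, size(text)-2) AS i
MERGE (w1:Word {name: text[i]})
MERGE (w2:Word {name: text[i+1]})
MERGE (w1)-[r:NEXT]->(w2)
  ON CREATE SET r.count = 1
  ON MATCH SET r.count = r.count + 1

Querying positive and negative reviews separately, as suggested, is another sound option.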
Where do you get the dataset from?
Any way you can provide the original slide deck for this? Thanks
Great ideas, William. Any graph-based solutions, ideas, or papers for NLP Q&A tasks?
I have three papers; please go through my ResearchGate profile and see them. Ali Muttaleb. Thanks!
It is a really good explanation and a really good use of the technology, but where is the dataset?