You can do half queries using 'isl:*' , this will return everything that starts with 'isl'. If you want a multiword search, add in between the words i.e 'sea mon:*'. This will return the row with 'Sea Monster'
Oh awesome
Hey man, you seem to understand this very well... do you have any videos to share with us?
What impact does :* have on performance?
@sungatae he explained how to check in the video: EXPLAIN ANALYZE.
What if I want to find 'island' by searching just 'land'? I tried '*:land', but it doesn't work and generates an error message.
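To make the prefix syntax from this thread concrete, here is a minimal sketch. It assumes the 'english' configuration and uses inline sample text rather than the table from the video. The operator between the two words in the multi-word case is an assumption: `&` and `<->` are both valid, and to_tsquery raises a syntax error if nothing is placed between the terms.

```sql
-- Prefix match: 'isl:*' matches any lexeme that starts with 'isl'.
SELECT to_tsvector('english', 'Island')
       @@ to_tsquery('english', 'isl:*') AS prefix_match;

-- Multi-word input needs an explicit operator between the terms.
-- '&' requires both terms to appear somewhere in the document:
SELECT to_tsvector('english', 'Sea Monster')
       @@ to_tsquery('english', 'sea & mon:*') AS and_match;

-- '<->' (PostgreSQL 9.6+) requires the terms to be adjacent and in order:
SELECT to_tsvector('english', 'Sea Monster')
       @@ to_tsquery('english', 'sea <-> mon:*') AS phrase_match;
```

As for finding 'island' from 'land': tsquery only supports prefix matching, so there is no suffix or infix form, and '*:land' is a syntax error. Substring matching of that kind is usually handled with a different tool, such as the pg_trgm extension.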
Thank you for the detailed step-by-step session. PostgreSQL 12, released later, added generated columns. So we no longer need a trigger to keep the search column updated.
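For anyone who wants to try the generated-column approach, a minimal sketch follows. The table name `articles` and its columns are hypothetical; only `document_with_weights` is taken from the video. It requires PostgreSQL 12 or newer, and the text search configuration must be passed explicitly because a generated column expression has to be immutable.

```sql
-- Hypothetical table; the tsvector column is computed by PostgreSQL itself.
CREATE TABLE articles (
  id          serial PRIMARY KEY,
  title       text,
  description text,
  document_with_weights tsvector GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(description, '')), 'B')
  ) STORED
);

-- The GIN index is what makes @@ lookups fast.
CREATE INDEX articles_document_idx
  ON articles USING GIN (document_with_weights);
```

The column is recomputed on every INSERT and UPDATE of the row, so no trigger is needed as long as all the input columns live in the same table.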
So much clearer than the postgresql docs. My head was spinning reading all the technical details of how it works. THANKS
Works great! I've just tested it with 125k rows, and the results are close enough that I can skip running a search database like Elasticsearch!
This is the best and the simplest explanation of postgresql FTS I could find on internet
I searched for a full-text search explanation and just clicked on the first video on YouTube. And this was such luck!
Very clear and interesting explanation. Thank you a lot for doing this!
That was very helpful. I love how you started very basic and then got into the complex queries. Thanks so much!
Great video! Not sure if it was already answered, but regarding your last question of 'how to query for half a word' - this seems to be possible with 'to_tsquery' but not 'plainto_tsquery', you can use it like 'to_tsquery('blah:*')'. Greets, Bernd
thanks
Great explanation.. saved a lot of time that I wasted searching over the internet.
Postgres generated columns would be really useful for generating search documents as changes are made. It's also a bit simpler than triggers.
It is, as long as the input data for the search index lives on the same table. Once you join tables together to make a search index, a generated column doesn't work.
Ben...finally my request have been granted...👍u rock
I like how you followed start simple finish master approach. Thank you for your awesome contents once again!
I created a MATERIALIZED VIEW with a combined GIN index on it to avoid adding those additional columns.
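A rough sketch of what the materialized view approach could look like. The `authors` and `posts` tables and all index names are hypothetical, and the weights are just an illustration.

```sql
-- Hypothetical source tables.
CREATE TABLE authors (
  id   serial PRIMARY KEY,
  name text NOT NULL
);
CREATE TABLE posts (
  id        serial PRIMARY KEY,
  author_id integer NOT NULL REFERENCES authors(id),
  title     text,
  body      text
);

-- One search document per post, built from both tables.
CREATE MATERIALIZED VIEW post_search AS
SELECT p.id AS post_id,
       setweight(to_tsvector('english', coalesce(p.title, '')), 'A') ||
       setweight(to_tsvector('english', coalesce(p.body, '')), 'B') ||
       setweight(to_tsvector('english', a.name), 'C') AS document
FROM posts p
JOIN authors a ON a.id = p.author_id;

-- A unique index is required for REFRESH ... CONCURRENTLY.
CREATE UNIQUE INDEX post_search_post_id_idx ON post_search (post_id);
CREATE INDEX post_search_document_idx ON post_search USING GIN (document);

-- The view does not update itself; refresh it on a schedule or after writes.
REFRESH MATERIALIZED VIEW CONCURRENTLY post_search;
```

The trade-off is staleness: search results only reflect the data as of the last refresh.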
Ok, you know how people say that they want to do something and don't know how and in the very same day you upload the exact solution? You did it again! By the way at the moment I am using Redis's RediSearch module. It's super fast, I mean hyper fast. But I have some data in Hasura too, so for some data it's overkill to put in Redis. So will see your solution in action.
I've been meaning to try out the search module for Redis, that sounds sweet
I come from a T-SQL background, but codewars use Postgresql.
This was really interesting to learn, thanks for the upload
I was just looking up this topic, clicked on the video, and halfway through realised it's from Ben
Such a great tutorial! Thanks Ben!
Use plainto_tsquery instead of to_tsquery, because to_tsquery does not escape special characters like ', \, and !, so raw user input can inject operators into the query or break it.
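A small sketch of the difference, assuming the 'english' configuration. Note that websearch_to_tsquery is not mentioned in the video; it exists in PostgreSQL 11 and newer and is documented to never raise a syntax error on user input.

```sql
-- plainto_tsquery treats the input as plain words and ignores operators
-- and punctuation, so arbitrary user input is safe to pass in:
SELECT plainto_tsquery('english', 'sea & !monster''s (');

-- websearch_to_tsquery (PostgreSQL 11+) accepts search-engine style input:
-- quoted phrases, "or", and a leading minus for negation.
SELECT websearch_to_tsquery('english', '"sea monster" or kraken -island');

-- to_tsquery expects valid tsquery syntax; input like the line below
-- raises a syntax error because there is no operator between the words.
-- SELECT to_tsquery('english', 'sea monster');
```

Whichever function is used, the search string should still be passed as a bound parameter (for example :query) rather than concatenated into the SQL text.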
Great video, the docs on this are pretty dense so this was helpful. It still blows me away how good postgresql is and how much you get essentially for free.
I didn't expect to find a video from you on this topic. I am lucky!
i really appreciate your effort for explaining in a simple way this powerful function, thanks a lot! ;)
Very clear explanation. Thank You so much!
This was very helpful and the step-by-step approach really helped.
EXCELLENT LESSON, PERFECT, SUBLIME, THANKS THANKS BEN AWAD!!!! A HUUUUUUUUUUUUUUUUGE THANKS!!!!! :D
Right to the point with great examples. Thanks!
The tutorial was really great and has helped me get started. However in my use case I have multiple tables to search through and I have achieved it without indexing. How would you index columns from different tables for a single FTS query?
The solution is no longer as elegant, but you create a new table which combines the values you want to search on. Let's say you have a "users" table and a "blog_posts" table.
CREATE TABLE searchable_blog_posts (
  id SERIAL,
  user_id INTEGER,
  blog_post_id INTEGER,
  search_index tsvector
);
CREATE INDEX index_searchable_blog_posts_search_index ON searchable_blog_posts USING GIN(search_index);
Then you can populate this "searchable_blog_posts" table using the information you want to be searchable. Then you use triggers to update this `searchable_blog_posts` table whenever either a change is made to the "users" table or the "blog_posts" table.
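A sketch of how one of those triggers could look. The column names on `users` and `blog_posts` (name, title, body), the function name, and the trigger name are all hypothetical, and the lookup table is simplified to use blog_post_id as its primary key so that an upsert works. EXECUTE FUNCTION needs PostgreSQL 11 or newer; older versions use EXECUTE PROCEDURE.

```sql
-- Hypothetical source tables.
CREATE TABLE users (
  id   serial PRIMARY KEY,
  name text NOT NULL
);
CREATE TABLE blog_posts (
  id      serial PRIMARY KEY,
  user_id integer NOT NULL REFERENCES users(id),
  title   text,
  body    text
);

-- Search table, one row per blog post.
CREATE TABLE searchable_blog_posts (
  blog_post_id integer PRIMARY KEY REFERENCES blog_posts(id) ON DELETE CASCADE,
  user_id      integer NOT NULL,
  search_index tsvector
);
CREATE INDEX index_searchable_blog_posts_search_index
  ON searchable_blog_posts USING GIN (search_index);

-- Rebuild the search row whenever a blog post is inserted or updated.
CREATE FUNCTION refresh_searchable_blog_post() RETURNS trigger AS $$
BEGIN
  INSERT INTO searchable_blog_posts (blog_post_id, user_id, search_index)
  SELECT NEW.id,
         NEW.user_id,
         setweight(to_tsvector('english', coalesce(NEW.title, '')), 'A') ||
         setweight(to_tsvector('english', coalesce(NEW.body, '')), 'B') ||
         setweight(to_tsvector('english', u.name), 'C')
  FROM users u
  WHERE u.id = NEW.user_id
  ON CONFLICT (blog_post_id) DO UPDATE
    SET user_id      = EXCLUDED.user_id,
        search_index = EXCLUDED.search_index;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER blog_posts_search_refresh
AFTER INSERT OR UPDATE ON blog_posts
FOR EACH ROW EXECUTE FUNCTION refresh_searchable_blog_post();
```

This only covers changes to blog_posts. A second trigger on users would be needed to rebuild the affected rows when a user's name changes.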
💪🏻💪🏻 2 of the list ✅. Dude, nice video! Seems like the video will do great! congrats and 🙏🏻thank you
Amazing tutorial, good job. Just one small issue: your trigger does not update newly inserted rows. It should have an UPDATE statement in the trigger instead of a := assignment,
e.g.
BEGIN
  UPDATE table_name SET
    document_with_weights = setweight(.........)
  WHERE id = NEW.id;
  RETURN NEW;
END;
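For comparison, here is a sketch of the assignment style, which does cover new rows provided the trigger is declared BEFORE INSERT OR UPDATE. The `recipes` table, its columns, and the function and trigger names are hypothetical; whether the trigger in the video was declared this way is not something this sketch can confirm.

```sql
-- Hypothetical table with a plain tsvector column.
CREATE TABLE recipes (
  id          serial PRIMARY KEY,
  title       text,
  description text,
  document_with_weights tsvector
);

-- In a BEFORE row trigger, assigning to NEW changes the row that is
-- about to be written, so no separate UPDATE statement is needed.
CREATE FUNCTION recipes_tsvector_trigger() RETURNS trigger AS $$
BEGIN
  NEW.document_with_weights :=
    setweight(to_tsvector('english', coalesce(NEW.title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(NEW.description, '')), 'B');
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER recipes_tsvector_update
BEFORE INSERT OR UPDATE ON recipes
FOR EACH ROW EXECUTE FUNCTION recipes_tsvector_trigger();
```

The UPDATE-inside-the-trigger style is the one to use with an AFTER trigger, where changes to NEW are no longer written to the table.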
Great video again Ben!
Great video, super helpful. Wonder how this would scale with a really large database.
Great tutorial! Thanks.
Excellent tutorial here! Thanks.
You are the man.
Awesome! Great video Ben! subbed 👍
Welcome :)
Awesome tutorial, thank you
Awesome tutorial!!
This is a GREAT tutorial! Thanks.
Very useful, thank you.
Thanks for this. Now what about accented characters? Latin2 etc.?
Great tutorial!
Awesome!
I followed the video and got good results. I also solved the problem where the keyword "isl" should find "Island": just add "isl:*".
But I want to search the other way around:
when I enter the keyword "land", I want to get "island".
I tried prefixes like "*:keyword:*", but that doesn't work and a syntax error is printed.
What do I do about this problem?
Any help is appreciated,
thanks
what about performance when you re-create index on every update?
no idea
I would recommend against that. Indexing is costly. On a table with 1 million rows, you can expect indexing to take roughly 5 to 10 seconds. Also some indexing locks the table. Index your table with a delayed job that gets performed daily or hourly.
you rebuild the GIN/GiST index for that row when the row is updated. More info here www.postgresql.org/docs/9.1/textsearch-indexes.html
I noticed that using tsvector makes search 2 times slower compared to "and (like '%abc%' or like '%bcd%')". On top of this, tsvector performance is not stable: sometimes it takes, say, 6 seconds, sometimes 11 seconds to get the response.
great video!
At 15:18, the variable "document_with_weights" got the datatype "any", which might be available in this language, but let's say you had to do this with Java or C#: what datatype would you use for the variable?
Postgres will cache the data, so that benchmark clearly doesn't prove anything. The only real difference is the strategy the database plans to use. At 5:50 you showed 2 queries; the first is much slower than the second. However, look at the plan: it is the same (seq scan, which is the worst), but the result was cached, so it was faster. You need to rethink this.
If a table has a child table, how can text be searched in the parent table as well as the child table? Please help me. I'm stuck in my project.
hi Ben thanks for the video. Is there a way to do a partial match to a word with this type of search setup? So Noo would find Noodles?
I think so, but I haven't tried to do it myself
Can't you create a computed column for tsvector instead of creating trigger?
if we need to use search on 2 tables then how to build an index in this case.
pls help
Awesome video !
The issue that it won't work for "isl" but works well for "islands", because it's an English word variation, is really annoying when searching for names: searching for an artist "Martin" won't match "Martina".
If anyone has a pointer about a way to achieve that, I'm interested.
You can add a ':*' in the search query. Example with 'isla': to_tsquery('isla:*') will match all words prefixed with 'isla'. Note that plainto_tsquery ignores the :* operator, so use to_tsquery for this.
Thanks a lot
How do you add these indexes automatically on insert or update?
Did you use JavaScript to show the probable results on your website, or just Postgres? If you did, how did you do it? Can you help me?
Man, how do you connect Postgres full text search with your saffron website? Please please please tell me.
Can you do full text with incomplete words?
Just wondering how I can search across multiple tables? Do I need to concatenate all tables into a new one and do the full text search there?
how about creating a view of all those tables and having a tsvector column of all the required columns in that same view
Could you please help me to understand the difference between Full Text Search PostgreSQL and Elasticsearch?
PostgreSQL is a database that uses tables, while Elasticsearch stores JSON documents. The latter is focused on search queries, so it's faster at them. In my opinion, it is geared towards enterprise stuff.
nice video
Ben Awad, what u say about FoundationDB?
never tried it
can you make videos on elastic search?
sure
there is a directive available on Typeorm: @Index({ fulltext: true })
Not sure what it does and how it works.
I didn't know that existed, I'll have to go check it out
Did you ever get round to checking this out?
Nope, haven't done another project with fulltext search yet
A little late, but does anyone know how can i implement something like the initial example, that searches a bunch of keywords? The behavior I want is to get every result that matches at least one of the keywords, but ordered by number of matches (and if possible, still considering the given weights for the fields, as a match in the title is still more valuable than a match in the description)
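One possible way to approach this is to OR the keywords together and sort by ts_rank. The sketch below uses made-up inline rows instead of a real table so it can be run as-is. ts_rank is not a literal count of matched keywords: it scores how often the matching lexemes occur and takes the A/B weights into account, so treat the ordering as an approximation of "most matches first".

```sql
-- Made-up sample rows; in practice this would be the table from the video.
WITH docs(title, description) AS (
  VALUES
    ('Chicken curry',   'Spicy chicken served with rice'),
    ('Fried rice',      'Quick rice dish with egg'),
    ('Chocolate cake',  'Dessert with no chicken at all')
),
indexed AS (
  SELECT title,
         setweight(to_tsvector('english', title), 'A') ||
         setweight(to_tsvector('english', description), 'B') AS doc
  FROM docs
)
SELECT title,
       ts_rank(doc, query) AS rank
FROM indexed,
     to_tsquery('english', 'chicken | rice | curry') AS query
WHERE doc @@ query          -- matches at least one keyword
ORDER BY rank DESC;         -- title matches (weight A) count for more
```

If the default weighting is not what you want, ts_rank also accepts an optional array of four weights as its first argument.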
I love you. Is there anything you haven't covered?
Could you please tell me what IDE you've been using? Thx
vscode
MySQL vs PSQL please
Is this possible utilizing knex?
yeah this isn't specific to typeorm, just postgres
You're awesome (y)..
Which software do you use there?
It's JetBrains' DataGrip for the DB stuff and VS Code for the JS stuff
I would definitely recommend this video, which builds on what you've said: th-cam.com/video/c8IrUHV70KQ/w-d-xo.html
Hey Ben, do you know how these results compare to using typeorm find or findOne?
you'll want to use find or findOne first if that supports your use case
every recipe is from julie right now.
In the video that's Julie's account
Or you're looking at the sample cookbook which she made
What editor is that?
www.jetbrains.com/datagrip/
I'm a big fan of DataGrip. Just know that you can tweak your SQL syntax formatting. I'm not a big fan of their default editor formatting rules.
oh cool
@danieldosen5260 But can you configure the style you need? If not, please file a feature request in our tracker: youtrack.jetbrains.com/issues/DBE
First thank you for this great video!
But I'm still wondering how it works to search not just for one word. I want to search with more words as one coherent term, so that you compare for example the search term "highway robber battlefield" with the data in your data base. Does anyone have a solution for this?
you can use operators like &, | and !
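A short sketch of those operators applied to the example from the question, assuming the 'english' configuration. The extra terms 'bandit' and 'castle' are invented just to show | and !.

```sql
-- All three words must appear, in any order:
SELECT to_tsquery('english', 'highway & robber & battlefield');

-- Grouping, OR, and NOT can be combined:
SELECT to_tsquery('english', 'highway & (robber | bandit) & !castle');

-- plainto_tsquery builds the AND form from plain text:
SELECT plainto_tsquery('english', 'highway robber battlefield');

-- phraseto_tsquery (PostgreSQL 9.6+) requires the words to be adjacent
-- and in the given order, i.e. one coherent term:
SELECT phraseto_tsquery('english', 'highway robber battlefield');
```

Any of these can be used on the right-hand side of @@ in place of the query from the video.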
You did not consider time to create a new column.
Exactly what I'm facing right now. I assume you mean adding more columns down the road to search on, right?
trigrams
would be cool if you made a similar video but using pg_trgm instead of vectors
also sent a twitter dm about this
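For reference, a minimal pg_trgm sketch. It assumes the extension can be installed on your server, and the `places` table with its rows is made up. Trigram indexes help with the two cases that tsvector does not cover in this thread: matching inside a word ('land' in 'Island') and fuzzy name matching ('Martin' vs 'Martina').

```sql
-- pg_trgm ships with PostgreSQL as a contrib extension.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical table and sample rows.
CREATE TABLE places (
  id   serial PRIMARY KEY,
  name text NOT NULL
);
INSERT INTO places (name)
VALUES ('Island'), ('Sea Monster'), ('Martina'), ('Martin');

-- A trigram GIN index can serve LIKE / ILIKE with leading wildcards.
CREATE INDEX places_name_trgm_idx
  ON places USING GIN (name gin_trgm_ops);

-- Substring match anywhere in the value:
SELECT name FROM places WHERE name ILIKE '%land%';

-- Fuzzy match: % is the similarity operator, similarity() returns a score.
SELECT name, similarity(name, 'Martin') AS score
FROM places
WHERE name % 'Martin'
ORDER BY score DESC;
```

Unlike tsvector search, trigrams do no stemming or stop-word removal, so the two approaches are often used side by side.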
I opened an issue on the typeorm GitHub about how to do a fulltext query with MySQL and typeorm: github.com/typeorm/typeorm/issues/3191 I don't know if it is better than what you did, but I want to know your thoughts about it :)
I've never tried to do that with MySQL so I'm not sure how it compares
@bawad you can run it with Docker and try it
Can you cover synonym search with PostgreSQL or Elasticsearch in the next video?
Here an example.
medium.com/@lucasmagnum/elasticsearch-setting-up-a-synonyms-search-facea907ef92
Thank you for this video! But I want to get all records in the table and still use *doc..* @@ plainto_tsquery(:query). So what do I put in "query"?
Why do you need to use tsquery if you want all the records?
Hhhiiiii
There is a rather big mistake in the search query at around 17:00. You used English stemming for creating the vector but not in the search query. The problem is that, for example, the word 'Training' gets stemmed to 'train', but if a user then searches for 'training', the search won't find any results because they do not match. You have to use 'english' stemming in the query as well, like this: document_with_weights @@ plainto_tsquery('english', :query)
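The mismatch is easy to reproduce with inline text. One caveat: when no configuration is passed, plainto_tsquery uses the server's default_text_search_config, which may already be 'english' on some installations. The 'simple' configuration is used below as an assumption to force the non-stemming case.

```sql
-- The vector is stemmed ('train'), but the query is not ('training'):
SELECT to_tsvector('english', 'Training')
       @@ plainto_tsquery('simple', 'training') AS mismatched_config;  -- false

-- Using the same configuration on both sides stems both to 'train':
SELECT to_tsvector('english', 'Training')
       @@ plainto_tsquery('english', 'training') AS matching_config;   -- true

-- Check what an unqualified plainto_tsquery(:query) would use:
SHOW default_text_search_config;
```

Passing the configuration explicitly on both sides avoids depending on the server default.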
great video!!