38:50 - the problem that graph DBs are supposed to solve: "understand what you're looking for from your data"... "have to know what questions you're going to ask". I'm new to graph DB concepts and a long-time relational DBMS guy, but as I understand it, this is what a graph DB is best suited for: when you don't know every question you may need to answer in the future. Thinking this through, it implies to me that for a graph to deliver the analytical goods, you have to have all (or nearly all) possible relationships defined. Essentially, I think this would become like matrices, with the graph DB engine just pre-indexing everything so you CAN ask almost any question. I need to understand this part more.
One of the best graph DB presentations. You know what you are talking about.
Skip to 21:55 for where graph DBs have advantages
Or you could watch the presentation in the original context.
Awesome clarity of perspective. Thanks for providing a comprehensive overview of the lay of the land.
Nice presentation, but the example related to south Florida doctors appears weak. The anomaly looks to have nothing to do with relations, and all to do with the per-doctor bill rate, and even in the example, the visualization detected it.
Four years after this video, the graph DB landscape has changed quite a bit for sure. I find Apache AGE (a graph extension to PostgreSQL) quite good. It is Apache-licensed and uses the Cypher language (which is a simple graph language compared to others). It takes advantage of the PostgreSQL engine, so you get both a graph DB and a relational DB in one program and enjoy the extensibility of Postgres.
Great presentation with a lot of DOs and DONTs and a very honest and trustworthy content.
I think you are right that explicit operations on graphs have, oddly, not been at the public forefront of ML, hot topic that it is today, but of course probability graphs and Bayesian integration over graphs are fairly well studied. Then there is DeepWalk and its relatives in the more modern realm.
From 26:20 to 26:40 the goat becomes a goose. Evolution confirmed
Superb introduction. Thanks very much!
You can emulate graph databases in relational databases. I have an MS Access graph database.
It is hard for me NOT to see it that way. I tend to agree, and I think not a lot of businesses face river-crossing-style constrained situations. What is so wrong with recursive CTEs? It used to be performance, but not nowadays with cloud computing.
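For anyone wondering what the recursive CTE approach looks like in practice, here is a minimal sketch using Python's built-in `sqlite3` module. The `follows` table, its columns, and the sample names are made up for illustration; nothing here comes from the talk.

```python
import sqlite3

# Hypothetical edge list; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE follows (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO follows VALUES (?, ?)",
    [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("dave", "bob")],
)

# Recursive CTE: start from one node and repeatedly follow outgoing edges.
# UNION (rather than UNION ALL) discards rows that were already produced,
# so the cycle bob -> carol -> dave -> bob cannot loop forever.
REACHABLE_SQL = """
WITH RECURSIVE reachable(node) AS (
    SELECT dst FROM follows WHERE src = ?
    UNION
    SELECT f.dst FROM follows AS f JOIN reachable AS r ON f.src = r.node
)
SELECT node FROM reachable ORDER BY node
"""


def reachable_from(start):
    """Return every node reachable from `start` by following edges."""
    return [row[0] for row in conn.execute(REACHABLE_SQL, (start,))]


print(reachable_from("alice"))
```

This only shows that the traversal can be expressed in plain SQL; whether it performs well enough on a large graph depends on the engine and the indexes.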
Very good talk, thanks for sharing
My understanding is that graph DBs have pretty specific use cases where they shine. With a proper RDBMS and even bridge/link/weak entity tables, a lot of relationship needs are met without having to use graph databases.
I am not comfortable with the referential integrity and schema correctness (or lack thereof) in graph DBs. I am thinking: I could build a graph data model (or re-model) of the relational data model on top of it for specific complex queries. That is more like a graph data warehouse atop the typical relational data model, with all its established strengths, which overcome graph DBs' limitations like ternary relationships...
I still can't find good reasons to use a Graph DB as a backbone for applications.
The goat becoming a goose is not the only thing wrong with that graph. It says take goose when it meant to say take fox on the second trip, making it wrong and confusing. Somebody failed him.
ha i caught that too, i was like, what's going on here lol
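Since the slide seems to have confused a few people, here is a small sketch of the puzzle as a graph search, assuming the classic rules: the fox cannot be left alone with the goat, and the goat cannot be left alone with the carrots. Each state is a node, each boat trip is an edge, and breadth-first search finds a shortest plan. The state encoding and function names are my own, not from the talk.

```python
from collections import deque

ITEMS = ("fox", "goat", "carrots")


def is_safe(state):
    """A state is (farmer, fox, goat, carrots); 0 = start bank, 1 = far bank."""
    farmer, fox, goat, carrots = state
    if fox == goat != farmer:      # fox left alone with the goat
        return False
    if goat == carrots != farmer:  # goat left alone with the carrots
        return False
    return True


def moves(state):
    """Yield (cargo, next_state) for every possible boat trip."""
    farmer = state[0]
    other = 1 - farmer
    yield "nothing", (other,) + state[1:]
    for i, name in enumerate(ITEMS, start=1):
        if state[i] == farmer:  # the farmer can only carry what is on his bank
            nxt = list(state)
            nxt[0] = other
            nxt[i] = other
            yield name, tuple(nxt)


def solve():
    """Breadth-first search from everything on bank 0 to everything on bank 1."""
    start, goal = (0, 0, 0, 0), (1, 1, 1, 1)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for cargo, nxt in moves(state):
            if is_safe(nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [cargo]))
    return None


print(solve())
```

The plan it prints has seven crossings and starts by taking the goat across; the next loaded trip carries the fox or the carrots, never the goat again right away.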
Awesome presentation! Small question for you: can I use a graph database to organize and draw insights from unstructured data stored in disparate storage?
This is where a graph database will shine. It can serve as a bridge.
Great and insightful presentation, thanks a lot!
Very interesting and insightful. I imagined that the graph DB world would be much more mature than it's presented here. It's the end of 2021 now - has there been any significant maturation in this field since mid-2019 when this lecture was given?
Graph DBs are still a new field. OrientDB was killed by SAP... and there are still tons of players in the market. That being said, SQL also has a ton of databases... there are many more databases than the typical SQLite, MySQL, Postgres, SQL Server and Oracle... I wonder if graph databases will become like Linux: dozens of different flavors of everything.
Graph DBs are not just fun facts. They have a solid position on the market. But as he said, this is not a product for everyone. However, graph DBs are very flexible and you can do with them everything you can achieve with a relational DB. The question could be - but why?
For me there's only one player on the market: Neo4j. Others are just some attractions.
great presentation! thank you!
Great presentation, a friend of mine was telling me graph db is the best and it can do everything xD....
I’m very skeptical about this presentation…
A relational database has the relations (edges) in the tables, not in the foreign keys. A foreign key means that it represents the same object (node). My conclusion: he has no understanding of relational databases whatsoever. A relational database is really about relations (and not about entities, whatever he means by that). (From 5:44)
Later he speaks about three different types of graph databases…
About Graph Computing Engines he says (33:00): “This means you have to use something like a relational database to populate that up.” Aha, so a relational database can actually represent anything a graph needs.
About RDF triple stores (34:30): “If you have a table with one row of data with 4 columns you end up with 5 vertices in your graph.” Yes, that’s right, they are equivalent!
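To make the "one row, 4 columns, 5 vertices" arithmetic concrete, here is a minimal sketch that turns a single row into subject-predicate-object triples. The row identifier, column names and values are invented for illustration, and the count of five assumes the four column values are distinct.

```python
# Hypothetical row: one identifier plus four columns (all values made up).
ROW_ID = "person/1"
ROW = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "born": "1815",
    "country": "UK",
}


def row_to_triples(row_id, row):
    """One (subject, predicate, object) triple per column of the row."""
    return [(row_id, column, value) for column, value in row.items()]


def vertices(triples):
    """Every distinct subject or object becomes a vertex in the graph."""
    nodes = set()
    for subject, _predicate, obj in triples:
        nodes.add(subject)
        nodes.add(obj)
    return nodes


triples = row_to_triples(ROW_ID, ROW)
print(len(triples), "edges,", len(vertices(triples)), "vertices")
```

The row itself becomes one vertex and each column value becomes another, with the column name as the edge label, which is where the 4 + 1 comes from.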
About Labelled Property Graphs (36:10): “The distinctive difference, with a relational database edges can have rich meta data.”
You do that in relational databases also (it is not only possible, it is common). And even more, in relational databases you can also define edges between two edges and edges between a node and an edge, also with properties.
My conclusion is that a graph database is only a subset of a relational database, with no additional features. Anything you can represent and process in a graph database you can represent and process in a relational database, and not the other way around. That’s why relational database manufacturers implement (in different forms) graph databases on top of the relational database and Prolog manufacturers (Prolog is a relational programming language) implement graph databases on top of Prolog with persistence included.
If you (really) understand relational databases and/or Prolog and/or have used Object Role Modelling (which was called NIAM before) it is kind of obvious. In ORM you design your database as a graph and you implement it in a relational database (and if it is a simple database, with no relations between relations, you can also implement it in a graph database).
I’m always surprised by how little the people who are so enthusiastic about graph databases understand the relational model. And I don’t understand the hype. There are more mature (since the seventies!) and powerful solutions, languages and tools with more features (distributed databases and distributed queries for example).
And yes, a lot of the more complex problems require analysis of trees, networks or graphs. But with SQL (since 1974) and/or Prolog (since 1972) you can analyse them, generate .dot commands for Graphviz (short for Graph Visualization Software, since 1991) and visualise them. Old stuff, it works and is mature...
And some tips:
* Learn and understand the Relational Model
  * The (primary, alternate, foreign) keys are the references to objects (identifiers of the objects).
  * The rows represent relations (actually the combination of some binary relations or edges).
* Learn Object Role Modelling for modelling your domain
* Learn and use DOT / Graphviz to visualise your graphs (free)
  * Free and open source
  * OmniGraffle supports DOT / Graphviz
  * Check the library of your language for existing support
* Learn and use SQL (and try to stay away from the procedural additions like T-SQL and the like)
  * Learn to write recursive queries for analysis
  * Learn to generate DOT commands with SQL (for visualisation, you don’t need recursive queries)
  * Free and open source products available (PostgreSQL, MySQL, SQLite)
* If you know Prolog, use it (SWI-Prolog, free download)
  * Libraries for graphs, RDF database and packs for graphviz
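As a rough illustration of the "generate DOT commands with SQL" tip, the sketch below builds one DOT edge statement per row with plain string concatenation in SQL and wraps the result in a `digraph` block. The `likes` table and the names in it are made up, and SQLite via Python's `sqlite3` module is used only because it needs no setup.

```python
import sqlite3

# Hypothetical data: who likes whom (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE likes (person TEXT, liked TEXT)")
conn.executemany(
    "INSERT INTO likes VALUES (?, ?)",
    [("Ann", "Bob"), ("Bob", "Cid"), ("Cid", "Ann")],
)

# One DOT edge statement per row, built with SQL string concatenation (||).
# No recursion is needed: every row is already exactly one edge.
DOT_SQL = """
SELECT '  "' || person || '" -> "' || liked || '" [label="likes"];'
FROM likes
ORDER BY person, liked
"""


def to_dot():
    """Wrap the generated edge statements in a DOT digraph block."""
    lines = [row[0] for row in conn.execute(DOT_SQL)]
    return "digraph likes {\n" + "\n".join(lines) + "\n}"


print(to_dot())
```

The printed text can be saved to a file and rendered with Graphviz, for example `dot -Tpng likes.dot -o likes.png`.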
@Timber Wolf
* In a relational database every row in every table is a relation.
* If the database is in fifth normal form, the relations usually consist of one or more elementary relations.
* There are different kinds of elementary relations.
* The first kind of elementary relation is a relation between a primary key (a real world object) and a column value (data object). (In this case the column is called an attribute).
* This elementary relation is a functional relation, where the key values form the input, and the column value is the “output”.
* Examples: first name, last name, social security number, date of birth of a person where the combination of first and last name, social security number or a surrogate are the alternatives for the primary key (one of them is then the primary key, the others alternate keys).
* The second kind of elementary relation is between a primary key and a foreign key. This is a relation between two real world objects.
* An example is the relation between a person and the (biological) mother (also a person).
* This kind of elementary relation is also a functional relation, where the key values form the input, and each column value of the foreign key (or the foreign key as a whole) is the “output”.
* The third kind of elementary relation is a relation between two foreign keys.
* An example is one person likes another person.
* The primary key is formed by the two foreign keys or by a surrogate.
* In a triple store graph database all these elementary relations are edges (with surrogate keys for the objects).
* In a labeled-property graph database the first one is a property, the other two are edges.
* In a relational database all the elementary relations with the same primary key can be joined in one relation in the fifth normal form (it is also possible not to join the elementary relations; then every elementary relation becomes a table and you don’t need nulls).
* This means that usually in a relational database a lot of relations are joined and stored together. For a lot of cases there is no need for joining or following pointers…
* Another example is the relation: person A knows that person B likes person C.
* In fact there are two relations: person B likes person C, and person A knows that …
* Both are elementary relations of the third kind, where the second relation is a second order relation (a relation about a relation)
* Second and higher order relations are no problem for a triple store graph database or a relational database (actually very common practice), but are a problem for a labeled-property graph database (unless it is a hypernode or hypergraph database).
* I think a lot of misunderstanding comes from Entity-Relationship Modeling, a technique with its roots in the pre-relational era. It’s more focused on technical concepts than on conceptual modeling.
* As I posted earlier, better use Object Role Modeling. ORM instead of ERM: one character difference, but a world of difference.
* Long story very short: the entities are the relations (properties or edges), and the relationships (the keys) are the objects (or nodes).
* Actually you can do without primary and foreign keys (especially with static data and also with views) (primary and foreign keys “only” help keeping dynamic data consistent). Without primary and foreign keys all the information is still there.
* On the other hand, if you label the foreign key relationships you can describe the role the object plays in the relation (for example a person having the role of mother in the relation “person X has mother Y”). But still, the actual relation is in the row of the table and not in the foreign key relationship.
* A foreign key is “only” a reference. If you read in a book about ‘John F. Kennedy’, that’s a reference to a person, you can use that reference or key to look up information elsewhere.
* I hope I showed that keys represent real world objects, whether it is a primary, alternate or foreign key. And that a person primary key and a person foreign key with the same values represent the same person!
* I hope I showed that the rows are the relations (actually mostly a whole bunch of relations).
* If there are any questions left, let me know (for example about performance, I think that in a lot of cases RDBMS outperform (LPG) graph databases).
@Timber Wolf Do you blame ORM, or ERM? I do blame ERM.
I think too few people are educated in conceptual and database modeling. And I think ORM gives the right fundamentals and a step by step procedure to come to a conceptual model (actually a graph model!) which can be used to implement in whatever format, or to convert between formats, or to see what can’t be implemented in a certain format (as with second and higher order relations in LPG graph databases).
Cool stuff
excellent try to Unity and Q# Tutorial
That's great. I might look into it and take a closer look for my master's project, "Teaching Graph Databases"! Any recommendations, buys?!
goat != goose
The stated problem was modified by magically transforming the goat into a goose. Therefore it is OK to modify the basic assumptions. Why do the fox and the goat/goose have to go in the boat? They can swim. Tie them to the boat. Put the carrots in the boat and do it in one trip.
That is called BPR.
Why would you use the most useless database model in the WORLD (Twitter) to illustrate the positive aspects of the project?
Also, presentation is its strength, and presentation is the last step in real research - AFTER the problem is solved.
There's no gun
Pro tip: at 2x speed this takes only 30 minutes
60 / 2 = 30?!
Wow, thanks for the insight!
@@superscatboy If you actually took my comment seriously, as if I thought people wouldn't be able to do that simple math, you may need to check your sense of humor. I was joking about how slowly he talks; read between the lines
@@dabbopabblo At least you're not pissed off about it 😂
@@superscatboy I thought your comment was berating my intelligence but I see you were probably just adding to the joke lmao, sorry my bad. Maybe my sense of humor is what needs checking