Very helpful! Thanks for also explaining the two types of delete files.
Hi Alex,
Very good content and explanation.
[Notes Part-2]
Setting the table to COW or MOR: when to use which write mode?
Great explanation! Thank you for this video!
Thanks for putting this presentation together; it's a great overview.
It's not clear from the video: how do we specify position versus equality deletes?
There isn't a particular way to specify it for Spark; it just uses position deletes. The only situation where I think you can currently use equality deletes is Flink streaming, which you'd then clean up via compaction.
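To make that concrete, here is a minimal PySpark sketch of the part you do control: the catalog name demo and the table demo.db.orders are placeholders, and the session is assumed to already be configured with an Iceberg catalog and the Iceberg SQL extensions. Copy-on-write vs merge-on-read is set through table properties; whether a delete file is positional or equality-based is then decided by the engine doing the write.

    from pyspark.sql import SparkSession

    # Assumes an existing Spark session configured with an Iceberg catalog named "demo".
    spark = SparkSession.builder.getOrCreate()

    # Choose merge-on-read so deletes and updates produce delete files
    # instead of rewriting whole data files.
    spark.sql("""
        ALTER TABLE demo.db.orders SET TBLPROPERTIES (
            'write.delete.mode' = 'merge-on-read',
            'write.update.mode' = 'merge-on-read',
            'write.merge.mode'  = 'merge-on-read'
        )
    """)

    # With MOR enabled, a Spark DELETE writes position delete files; a Flink
    # streaming job writing to the same table would emit equality deletes instead.
    spark.sql("DELETE FROM demo.db.orders WHERE order_id = 42")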
Awesome video!!
At 3:18, when explaining the different delete formats, I have a question about the implementation:
since the delete mode only accepts MOR or COW, how exactly do I specify whether a delete operation uses an equality delete or a positional delete?
It’s mainly based on the engine. Most engines will use position deletes, but streaming platforms like Flink will use equality deletes to keep write latency to a minimum.
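For a mental picture of the difference, here is a rough, purely illustrative Python sketch of what the rows inside the two kinds of delete files look like (the paths, column names, and values are made up). Position deletes pin down the exact data file and row position, which means the writer has to know where the row lives; equality deletes only record the matching column values, which is why a low-latency writer like Flink favours them.

    # Position delete file: each row points at a specific data file and row position.
    position_delete_rows = [
        {"file_path": "s3://warehouse/db/orders/data/00000-1.parquet", "pos": 17},
        {"file_path": "s3://warehouse/db/orders/data/00000-1.parquet", "pos": 42},
    ]

    # Equality delete file: each row records the column values that identify the
    # deleted rows, so the writer never opens data files to find row positions.
    equality_delete_rows = [
        {"order_id": 42},
    ]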
Which version of Spark supports delete files?
Thanks Alex for the great explanation.
It's not clear to me what the delete files contain when an UPDATE statement is issued against the table.
Will the delete files contain the post-image of the rows, for example, or what will happen?
Thanks
For an update, the delete file will reference the deleted old version of the row. The new version of the row would be in a new data file.
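A sketch of how you could see this for yourself in PySpark, assuming a merge-on-read table named demo.db.orders (a placeholder), an Iceberg-enabled session, and an Iceberg version that exposes the data_files and delete_files metadata tables:

    from pyspark.sql import SparkSession

    # Assumes an existing Spark session configured with an Iceberg catalog named "demo".
    spark = SparkSession.builder.getOrCreate()

    # Update one row on a merge-on-read table.
    spark.sql("UPDATE demo.db.orders SET status = 'shipped' WHERE order_id = 42")

    # The delete file only references the old copy of the row (by data file path
    # and position); it does not hold the new values.
    spark.sql("SELECT content, file_path, record_count FROM demo.db.orders.delete_files").show(truncate=False)

    # The post-image (the updated row) lands in a brand-new data file.
    spark.sql("SELECT file_path, record_count FROM demo.db.orders.data_files").show(truncate=False)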
What is actually done is append-on-write, not copy-on-write, because the file is written elsewhere and only the pointer changes to the file with the new row.
1:40 Why do you say that, in Hive, updating a row would imply rewriting all the files composing the affected partition? Why isn't it just the Parquet file that contains the updated row? I mean, why would the other Parquet files in the partition have to be rewritten?
If you directly update the single file that's fine, but the Hive metastore tracks tables and partitions, not individual files. So if I run an update query against Hive, it isn't aware of the file that needs updating, just the partition, so it rewrites that partition and then swaps out the reference in the metastore to the location of the new version of the partition. - Alex
@Dremio Thanks!
Wow, I hadn't realised Hive was that inefficient! So if I update a single row, all the Parquet files composing the partition will be rewritten, even though only one Parquet file should be affected. Correct?
@galeop I wouldn't say it's inefficient; it just wasn't originally designed for the same reasons. Hive was mainly trying to figure out how to define a table for the SQL -> MapReduce functionality. A lot of the problems and bottlenecks didn't become apparent until later, which is why formats like Iceberg were invented.
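To picture the partition-rewrite pattern described above, here is a rough sketch of the classic Hive-era way to "update" a row, written as PySpark SQL against a hypothetical sales table partitioned by sale_date (the table and column names are all made up): you materialise a corrected copy of the partition and then overwrite the whole partition, after which the metastore points at the new data.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # There is no way to touch a single row in place, so first build a corrected
    # copy of the whole affected partition in a staging table.
    spark.sql("""
        CREATE TABLE sales_staging AS
        SELECT order_id,
               CASE WHEN order_id = 42 THEN 'shipped' ELSE status END AS status
        FROM sales
        WHERE sale_date = '2023-01-01'
    """)

    # Then overwrite the entire partition with the corrected copy.
    spark.sql("""
        INSERT OVERWRITE TABLE sales PARTITION (sale_date = '2023-01-01')
        SELECT order_id, status FROM sales_staging
    """)
    spark.sql("DROP TABLE sales_staging")

    # Every Parquet file under the sale_date=2023-01-01 directory gets rewritten,
    # even though only one row changed -- the cost that Iceberg's file-level
    # tracking and delete files were designed to avoid.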