Best tutorial ever on Iceberg and AWS services. Thank you very much for that.
Great job Johnny! I'm excited about the potential of Iceberg on AWS too!
Great vid. Please make one for Hudi.
Would be nice for Hudi
Added to the list!
Iceberg tables are better suited to the transformed or curated layer than to the raw data layer, am I right?
Thanks, that was fast and quite easy to understand. But if you added cross-links to your other videos, like the one about Glue, this would become even greater!
U are the AWS GOAT!!
Love it! Looking forward to more Apache Iceberg. Maybe in connection with Dremio
Great video! Just a heads up that the timestamps are in UTC, so most of us will have to do the offset calculation (UTC is 5 hours ahead of EST, and 4 hours ahead of EDT during daylight saving time). Maybe there's an easier way to specify that.
Also, I'm really curious about the distinction between Avro and Parquet. I noticed that Avro files were used for the metadata but Parquet files for the data. I heard Iceberg can accept Avro and was wondering if there are advantages to only using Avro.
Nice tutorial! I love how you share your knowledge! Thanks!
Around 26 minutes, after you queried the deleted data, it said it scanned 5.76 MB. That seems like a lot for just metadata!
Great video! It would be great to see one using streaming from Kinesis to Iceberg, like Kinesis + EMR + Glue catalog + Iceberg.
line 3:5: mismatched input 'SYSTEM_TIME'. Expecting: 'TIMESTAMP', 'VERSION'
I'm getting this error while running the timestamp query. Can you please tell me why?
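For what it's worth, the error message itself is the clue: the engine expects `TIMESTAMP` or `VERSION` after `FOR`, not `SYSTEM_TIME`. A sketch of the Athena-style time travel syntax, with a hypothetical table name and timestamp:

```sql
-- Hypothetical table/timestamp; Athena's Iceberg time travel uses
-- FOR TIMESTAMP AS OF rather than FOR SYSTEM_TIME AS OF.
SELECT *
FROM my_db.my_iceberg_table
FOR TIMESTAMP AS OF TIMESTAMP '2023-01-01 00:00:00 UTC'
```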
great explanations! love your videos!!! thanks! 🙂
What a fantastic video. Great learning :)
Johnny, does the speed come from the partition column we chose while creating the table? Like, if I partitioned by a different column instead of date and then ran the date-related queries, would it still be faster or not?
For a very large dataset (around 15 billion rows overall), will Iceberg give good performance if we use it for select/delete/update?
so useful!
Hello Johnny Chivers. Is there a way to create an Iceberg table from existing metadata and data using Athena or Glue?
thank you
Can we create an Iceberg table on S3 using a multi-Region access point?
Love your vids, really appreciate the work you do!
You are amazing❤
Is there any way to stop it creating random prefixes while inserting the partitioned data at 18:10?
I failed to create a nested y/m/d partition for an Iceberg table in Athena. How can I accomplish this?
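One hedged suggestion: Iceberg doesn't use nested year/month/day folders the way Hive-style tables do; its `day()` partition transform already encodes year and month. A sketch with hypothetical table, column, and bucket names:

```sql
-- Hypothetical names; Iceberg's day() transform covers year/month/day
-- in a single partition field, so nested y/m/d folders aren't needed.
CREATE TABLE my_db.events (
  id string,
  event_time timestamp
)
PARTITIONED BY (day(event_time))
LOCATION 's3://my-bucket/events/'
TBLPROPERTIES ('table_type' = 'ICEBERG');
```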
Great intro to Iceberg, Johnny. Quick question: as well as DELETE, can it support TRUNCATE? Deletes are fine for a relatively small number of rows (in traditional DBMSs this is also true), but on millions of rows DELETE takes forever compared with TRUNCATE. With Iceberg updating all those manifests as it deletes each row, would that not also be a bit of a bottleneck, or is that offset somewhat by the compute resources of AWS?
After running the SQL delete, can Iceberg still query with the time travel feature?
Yes, the snapshots are still present.
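One hedged way to verify this yourself: list the table's snapshots via the `$history` metadata table, then time travel to a pre-delete snapshot by id. Table name and snapshot id below are placeholders:

```sql
-- Hypothetical names; list snapshots first, note a snapshot_id
-- from before the delete, then query the table as of that version.
SELECT * FROM "my_db"."my_iceberg_table$history";

SELECT *
FROM my_db.my_iceberg_table
FOR VERSION AS OF 1234567890123456789
```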
Hi all, when creating an Iceberg table in Athena, I get "Exception encountered when executing query. This query ran against ...... database, unless qualified by the query. Please post the error message on our forum .....". Anyone know the solution?
After populating the Iceberg table, at 18:10, why does it create a folder with random chars before each partition folder? I'd like to have the partition folders right after the data folder.
Ideally, you should not have to deal with this yourself. The idea of Iceberg is that it handles things like that for you.
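As a possible lead: the random prefix looks like Iceberg's object-storage file layout, which hashes writes across S3 key prefixes for throughput. Iceberg exposes a table property for it; whether the engine you're writing with honours it is an assumption worth verifying:

```sql
-- Assumption: the writing engine honours Iceberg's
-- write.object-storage.enabled table property.
ALTER TABLE my_db.my_iceberg_table
SET TBLPROPERTIES ('write.object-storage.enabled' = 'false')
```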
Can you write me a snippet of code that moves an Iceberg column to a different column position? I cannot for the life of me get it to work based on the AWS documentation. Thanks.
Tried several variants similar to:
ALTER TABLE database.table_name CHANGE field1 string AFTER field2
ALTER TABLE database.table_name CHANGE field1 field1 string AFTER field2
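A hedged guess at what the engine wants: Athena's Iceberg DDL appears to use the `CHANGE COLUMN` form and to require both the old and new column names, even when the name is unchanged:

```sql
-- Sketch reusing the names above; repeat field1 as both old and
-- new name to keep it, and include the COLUMN keyword.
ALTER TABLE database.table_name
CHANGE COLUMN field1 field1 string AFTER field2
```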