Table Format Overview
Apache Iceberg is an open table format for huge analytic tables. It brings ACID transactions, schema evolution, hidden partitioning, and time travel to data lakes — without locking you into any single compute engine.
A raw data lake is just files (Parquet, ORC) sitting in S3/HDFS. There is no ACID, no schema tracking, no partition history. If you rename a partition column, all old queries break. If two writers write at the same time, you get corruption. Iceberg solves all of this.
Iceberg sits between your compute engine (Spark) and your storage (S3/HDFS). It maintains a metadata layer — a tree of files that describes every version of your table. Spark reads this metadata first, then fetches only the needed data files.
Iceberg vs Hive vs Delta vs Hudi
Understanding where Iceberg fits by comparing it against Hive partitioning (the old way), Delta Lake (Databricks), and Apache Hudi (Uber).
Hive partitioning stores data in directory paths like year=2024/month=01/. This has serious problems: queries must filter on exact partition columns, renaming a partition breaks everything, and listing millions of partitions kills the metastore. Iceberg eliminates all of this.
| Problem | Hive Partitioning | Apache Iceberg |
|---|---|---|
| Query must specify partition? | ✗ Yes — or full scan | ✓ Hidden — automatic pruning |
| Rename partition column? | ✗ Breaks queries | ✓ Partition evolution — no rewrite |
| Millions of partitions? | ✗ Metastore bottleneck | ✓ Manifest files — scalable |
| Time travel? | ✗ Not supported | ✓ Snapshot-based |
| Concurrent writes? | ✗ Manual locking | ✓ Optimistic concurrency |
| Feature | Apache Iceberg | Delta Lake | Apache Hudi |
|---|---|---|---|
| Created by | Netflix / Apple | Databricks | Uber |
| Multi-engine support | ✓ Excellent | Partial (OSS) | Good |
| Partition evolution | ✓ Best-in-class | Liquid clustering only | Limited |
| Hidden partitioning | ✓ Yes | ✗ No | ✗ No |
| Record-level upserts | ✓ MERGE | ✓ MERGE | ✓ Best for upserts |
| Primary use case | Large analytic tables | Databricks lakehouse | Near-real-time CDC |
| Cloud-native | ✓ AWS/GCP/Azure | ✓ AWS/GCP/Azure | ✓ AWS/GCP/Azure |
Metadata Tree & Files
Iceberg's metadata is a layered tree of files. Understanding this tree is the key to understanding how time travel, ACID, and partition pruning all work.
Every Iceberg table has a metadata.json file. It is the single entry point — it records the current schema, the partition spec, and a list of all snapshots. The catalog (Glue, Hive, REST) stores only the path to this file.
├── table-uuid
├── current-schema-id
├── partition-specs ← all partition definitions ever used
├── current-snapshot-id ← which snapshot is "now"
└── snapshots[] ← list of all snapshots
└── snapshot-id, manifest-list path, parent-id, timestamp
spark.read.format("iceberg").load("db.orders"), Spark asks the catalog for the metadata.json path, reads it, finds the current-snapshot-id, then follows the chain down to find which data files to read.
Each snapshot points to a manifest list (also called a snapshot file). The manifest list is an Avro file that lists all manifest files belonging to this snapshot, along with partition-level statistics (min/max values per partition) used for partition pruning.
A manifest file (Avro) contains the list of actual data files for one logical partition-group, along with column-level statistics (null count, min value, max value) for each file. This is how Iceberg does file-level pruning.
The actual Parquet / ORC / Avro data files in S3 or HDFS. These are immutable — Iceberg never modifies them. When you UPDATE a row, Iceberg writes a new file and updates the manifest to replace the old file reference. Old files are cleaned by expire_snapshots.
Snapshot Management
Every write to an Iceberg table creates a new snapshot. Snapshots are the foundation of ACID, time travel, and concurrent access in Iceberg.
A snapshot is an immutable record of the table state at a point in time. It has a unique snapshot-id, a timestamp, the operation that created it (append/overwrite/delete/replace), and a pointer to its manifest list. The metadata.json always points to the current snapshot.
Old snapshots accumulate over time and take up space (metadata + unreferenced data files). You call expire_snapshots to remove old snapshots and their associated data files. This is the Iceberg equivalent of Delta's VACUUM.
# Expire snapshots older than 7 days
from pyspark.sql import SparkSession
spark.sql("""
CALL spark_catalog.system.expire_snapshots(
table => 'db.orders',
older_than => TIMESTAMP '2024-01-01 00:00:00',
retain_last => 3
)
""")
# Or using the Python API
from pyiceberg.catalog import load_catalog
catalog = load_catalog("glue")
table = catalog.load_table("db.orders")
table.expire_snapshots().expire_older_than(1704067200000).commit()
-- View all snapshots of a table
SELECT * FROM db.orders.snapshots;
-- View history (human-readable)
SELECT * FROM db.orders.history;
-- View metadata files
SELECT * FROM db.orders.metadata_log_entries;
-- View data files in current snapshot
SELECT * FROM db.orders.files;
Time Travel Queries
Iceberg lets you read any historical version of a table — either by snapshot ID or by timestamp. This is essential for audits, debugging, and reproducing past reports.
Read the table as it was at a specific point in time. Iceberg finds the latest snapshot whose timestamp is ≤ the given timestamp and reads from that snapshot.
# Method 1: Using Spark SQL AS OF TIMESTAMP
spark.sql("""
SELECT * FROM db.orders
TIMESTAMP AS OF '2024-01-15 10:00:00'
""")
# Method 2: Using DataFrame API option
df = spark.read \
.format("iceberg") \
.option("as-of-timestamp", "1705312800000") \ # epoch milliseconds
.load("db.orders")
df.show()
Read the table at a specific snapshot ID. Get snapshot IDs from the snapshots metadata table.
# Find available snapshots first
spark.sql("SELECT snapshot_id, committed_at, operation FROM db.orders.snapshots").show()
# Read by snapshot ID — SQL
spark.sql("""
SELECT * FROM db.orders
VERSION AS OF 8537401032498282350
""")
# Read by snapshot ID — DataFrame API
df = spark.read \
.format("iceberg") \
.option("snapshot-id", "8537401032498282350") \
.load("db.orders")
df.show()
You can roll back a table to a previous snapshot. This updates metadata.json to point to the old snapshot as "current". The newer snapshots are still in history — rollback is reversible.
# Rollback to a specific snapshot ID
spark.sql("""
CALL spark_catalog.system.rollback_to_snapshot(
'db.orders',
8537401032498282350
)
""")
# Rollback to a timestamp
spark.sql("""
CALL spark_catalog.system.rollback_to_timestamp(
'db.orders',
TIMESTAMP '2024-01-15 09:00:00'
)
""")
expire_snapshots, those newer snapshots will be deleted and rollback will no longer be reversible.
Partition Evolution
Iceberg's killer feature: you can change how data is partitioned without rewriting old files. Old data keeps its old partition layout; new data uses the new layout. Queries work across both seamlessly.
In Hive, if you have 3 years of data partitioned by month and you want to switch to day because the data volume grew, you would need to rewrite all 3 years of data. With Iceberg, you just change the spec — old data stays as-is, new data uses the new spec.
-- Create table partitioned by month
CREATE TABLE db.orders (
order_id BIGINT,
customer_id BIGINT,
order_date DATE,
amount DOUBLE
) USING iceberg
PARTITIONED BY (months(order_date));
-- After 1 year, data grew — switch to day partitioning
-- Old data stays in month partitions, new data uses day partitions
ALTER TABLE db.orders
REPLACE PARTITION FIELD months(order_date)
WITH days(order_date);
-- Add a second partition field (combined partitioning)
ALTER TABLE db.orders
ADD PARTITION FIELD region;
-- Drop a partition field
ALTER TABLE db.orders
DROP PARTITION FIELD region;
spec-id stored in metadata.json. Each manifest file records which spec-id it was written with. At query time, Iceberg applies the correct pruning logic for each manifest based on its spec-id.
Hidden Partitioning & Partition Transforms
Iceberg's partitioning is invisible to the query author. It also supports powerful transforms — extract year, month, bucket, truncate — so you never have to add computed columns to your schema.
In Hive, you must filter on the exact partition column (e.g., WHERE year=2024 AND month=1). In Iceberg, you just write WHERE order_date = '2024-01-15' — Iceberg automatically derives which month/day partition to scan. The partition logic is hidden from the query.
year INT, month INTQuery must filter:
WHERE year=2024 AND month=1If you forget → full scan.
Partition defined as
months(order_date)Query just writes:
WHERE order_date BETWEEN '2024-01-01' AND '2024-01-31'Iceberg auto-prunes partitions.
Iceberg supports several transform functions that compute a partition value from a column value. These are applied at write time to determine which partition a row goes to, and at read time to prune partitions.
| Transform | Column Type | What It Does | Example |
|---|---|---|---|
identity | Any | Partition by exact value | identity(region) → partition per region value |
year | Date/Timestamp | Extract year | year(order_date) → 2024 |
month | Date/Timestamp | Extract year-month | month(order_date) → 2024-01 |
day | Date/Timestamp | Extract year-month-day | day(order_date) → 2024-01-15 |
hour | Timestamp | Extract year-month-day-hour | hour(event_ts) → 2024-01-15-10 |
bucket(N) | Any | Hash into N buckets | bucket(16, customer_id) → 0-15 |
truncate(W) | String/Int | Truncate to W chars/int | truncate(3, zip_code) → "900" from "90001" |
# Create table with bucket + day partitioning
spark.sql("""
CREATE TABLE db.events (
event_id BIGINT,
user_id BIGINT,
event_type STRING,
event_ts TIMESTAMP,
payload STRING
) USING iceberg
PARTITIONED BY (
bucket(16, user_id),
days(event_ts)
)
""")
# Query — no need to specify partition columns explicitly
# Iceberg auto-prunes buckets and day partitions
spark.sql("""
SELECT * FROM db.events
WHERE user_id = 12345
AND event_ts BETWEEN '2024-01-01' AND '2024-01-31'
""").show()
Schema Evolution
Iceberg supports safe schema evolution — add, rename, drop, or reorder columns — without rewriting data files. Old files are still readable after changes because Iceberg tracks columns by ID, not by name or position.
In Parquet, columns are referenced by position (column 0, column 1...). If you drop column 1 and add a new column, old files read the new column in the wrong position. Iceberg fixes this by assigning a unique ID to every column, stored in metadata. Old files and new files can have different column positions — Iceberg always maps by ID.
field-id. Parquet files also store field-id in their metadata (using Parquet's field ID feature). So when you rename column "price" to "unit_price", the field-id stays the same — old files still map correctly to the new name.
-- Add a new column
ALTER TABLE db.orders ADD COLUMN discount DOUBLE;
-- Add a nested column inside a struct
ALTER TABLE db.orders ADD COLUMN address.zip_code STRING;
-- Old files return NULL for the new column — safe!
SELECT order_id, discount FROM db.orders;
-- Old rows: discount = NULL, new rows: discount = actual value
-- Rename column (field-id stays the same internally)
ALTER TABLE db.orders RENAME COLUMN price TO unit_price;
-- Old files: column was written as "price" but has field-id=5
-- New files: column written as "unit_price" with same field-id=5
-- Queries using "unit_price" work on both old and new files!
-- Drop a column (data still exists in old files, just not read)
ALTER TABLE db.orders DROP COLUMN internal_notes;
-- Reorder columns (logical reordering, no data rewrite)
ALTER TABLE db.orders ALTER COLUMN discount
AFTER order_date;
-- Change column type (with compatible promotion)
-- INT → LONG is safe (widening)
ALTER TABLE db.orders ALTER COLUMN quantity
TYPE BIGINT;
-- View current schema
DESCRIBE TABLE db.orders;
INSERT, MERGE, UPDATE, DELETE
Iceberg supports full DML — you can insert, merge (upsert), update, and delete rows. These all work via copy-on-write or merge-on-read strategies, creating new snapshots without touching the original files.
# Append new rows to Iceberg table
df_new = spark.read.parquet("s3://bucket/new_orders/")
df_new.writeTo("db.orders").append()
# Or using SQL
spark.sql("""
INSERT INTO db.orders
SELECT order_id, customer_id, order_date, amount
FROM staging.new_orders
""")
# Overwrite using dynamic partition overwrite
spark.sql("""
INSERT OVERWRITE db.orders
SELECT * FROM staging.new_orders
WHERE order_date = '2024-01-15'
""")
MERGE INTO is the most important DML for CDC and SCD pipelines. It lets you insert new rows, update existing rows, and delete rows — all in one atomic operation.
-- Full upsert: update if exists, insert if not
MERGE INTO db.orders AS target
USING staging.orders_updates AS source
ON target.order_id = source.order_id
WHEN MATCHED AND source.op = 'DELETE' THEN DELETE
WHEN MATCHED AND source.op = 'UPDATE' THEN UPDATE SET
target.amount = source.amount,
target.updated_at = source.updated_at
WHEN NOT MATCHED AND source.op = 'INSERT' THEN INSERT *;
from pyspark.sql.functions import expr
# Create source updates DataFrame
source_df = spark.read.table("staging.orders_updates")
# Register as temp view for SQL MERGE
source_df.createOrReplaceTempView("source")
spark.sql("""
MERGE INTO db.orders t
USING source s
ON t.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
""")
-- Update rows matching a condition
UPDATE db.orders
SET status = 'CANCELLED'
WHERE order_date < '2023-01-01' AND amount = 0;
-- Delete rows matching a condition (GDPR right to erasure!)
DELETE FROM db.orders
WHERE customer_id = 99999;
-- Delete a full partition efficiently
DELETE FROM db.orders
WHERE order_date = '2020-01-01';
-- Iceberg skips rewriting other partitions!
expire_snapshots is run. This is great for read-heavy workloads.
TBLPROPERTIES ('format-version' = '2').
SparkCatalog & Read/Write
To use Iceberg with Spark, you configure a SparkCatalog, then read and write Iceberg tables using the familiar DataFrame API or SQL. Iceberg integrates through Spark's catalog plugin system.
from pyspark.sql import SparkSession
spark = SparkSession.builder \
.appName("IcebergDemo") \
.config("spark.jars.packages",
"org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0") \
# Register Iceberg catalog — name it "spark_catalog" to be default
.config("spark.sql.catalog.spark_catalog",
"org.apache.iceberg.spark.SparkSessionCatalog") \
.config("spark.sql.catalog.spark_catalog.type", "hive") \
# Or use a separate catalog name
.config("spark.sql.catalog.iceberg",
"org.apache.iceberg.spark.SparkCatalog") \
.config("spark.sql.catalog.iceberg.type", "hadoop") \
.config("spark.sql.catalog.iceberg.warehouse", "s3://my-bucket/warehouse") \
.getOrCreate()
# Enable Iceberg extensions for DML (MERGE, UPDATE, DELETE)
spark.conf.set("spark.sql.extensions",
"org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
from pyspark.sql.types import StructType, StructField, LongType, StringType, DoubleType, DateType
from pyspark.sql.functions import col, lit, current_date
# Create Iceberg table via SQL
spark.sql("""
CREATE TABLE IF NOT EXISTS iceberg.db.orders (
order_id BIGINT NOT NULL,
customer_id BIGINT,
order_date DATE,
region STRING,
amount DOUBLE
) USING iceberg
PARTITIONED BY (months(order_date), identity(region))
TBLPROPERTIES (
'write.metadata.delete-after-commit.enabled' = 'true',
'write.metadata.previous-versions-max' = '10'
)
""")
# Write DataFrame to Iceberg table
df = spark.createDataFrame([
(1, 101, "2024-01-15", "US", 250.0),
(2, 102, "2024-01-16", "EU", 180.0),
], ["order_id", "customer_id", "order_date", "region", "amount"])
# Append
df.writeTo("iceberg.db.orders").append()
# Overwrite by partition filter
df.writeTo("iceberg.db.orders") \
.overwritePartitions()
# Create and overwrite (replace) entire table
df.writeTo("iceberg.db.orders").createOrReplace()
# Standard read
df = spark.read.table("iceberg.db.orders")
df.show()
# SQL read
spark.sql("SELECT * FROM iceberg.db.orders WHERE region = 'US'").show()
# Incremental read — read only changes since a snapshot
spark.read \
.format("iceberg") \
.option("start-snapshot-id", "8537401032498282350") \
.option("end-snapshot-id", "9123456789012345678") \
.load("iceberg.db.orders") \
.show()
Iceberg REST Catalog & AWS Glue Catalog
The catalog is where Iceberg stores the pointer to each table's current metadata.json. There are multiple catalog implementations. REST Catalog is the modern, engine-agnostic standard; Glue Catalog is the go-to for AWS deployments.
| Catalog | Best For | Multi-Engine | Notes |
|---|---|---|---|
| Hadoop (file-based) | Local dev / testing | ✗ | No locking — not for production |
| Hive Metastore | On-prem / legacy | ✓ | Battle-tested but complex |
| AWS Glue | AWS deployments | ✓ | Serverless, integrates with Athena/EMR |
| REST Catalog | Modern / cloud-native | ✓ | Polaris, Nessie, Tabular — the future |
| JDBC | Database-backed | ✓ | PostgreSQL/MySQL as catalog |
spark = SparkSession.builder \
.appName("IcebergGlue") \
.config("spark.jars.packages",
"org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0,"
"software.amazon.awssdk:bundle:2.20.160") \
.config("spark.sql.extensions",
"org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions") \
# Glue Catalog
.config("spark.sql.catalog.glue_catalog",
"org.apache.iceberg.spark.SparkCatalog") \
.config("spark.sql.catalog.glue_catalog.warehouse",
"s3://my-bucket/warehouse") \
.config("spark.sql.catalog.glue_catalog.catalog-impl",
"org.apache.iceberg.aws.glue.GlueCatalog") \
.config("spark.sql.catalog.glue_catalog.io-impl",
"org.apache.iceberg.aws.s3.S3FileIO") \
.config("spark.sql.catalog.glue_catalog.aws.region", "us-east-1") \
.getOrCreate()
# Now use the glue_catalog prefix in table references
spark.sql("CREATE DATABASE IF NOT EXISTS glue_catalog.analytics")
spark.sql("SHOW TABLES IN glue_catalog.analytics").show()
spark = SparkSession.builder \
.config("spark.sql.catalog.rest",
"org.apache.iceberg.spark.SparkCatalog") \
.config("spark.sql.catalog.rest.type", "rest") \
.config("spark.sql.catalog.rest.uri",
"http://iceberg-rest-catalog:8181") \
.config("spark.sql.catalog.rest.warehouse",
"s3://my-bucket/warehouse") \
.config("spark.sql.catalog.rest.token", "my-auth-token") \
.getOrCreate()
# Same table API — just different catalog prefix
spark.sql("SELECT * FROM rest.analytics.orders").show()
Compaction & Table Maintenance
Over time, Iceberg tables accumulate small files (from streaming or many small inserts), stale snapshots, and old metadata. These maintenance procedures keep performance optimal.
Combines many small Parquet files into fewer, larger files. This is the Iceberg equivalent of Delta's OPTIMIZE. It also re-sorts data for better query performance (like ZORDER).
-- Basic compaction (merge small files)
CALL spark_catalog.system.rewrite_data_files('db.orders');
-- Compaction with target file size (256MB)
CALL spark_catalog.system.rewrite_data_files(
table => 'db.orders',
options => map(
'target-file-size-bytes', '268435456',
'min-file-size-bytes', '67108864'
)
);
-- Sort-order compaction (like ZORDER) — great for range queries
CALL spark_catalog.system.rewrite_data_files(
table => 'db.orders',
strategy => 'sort',
sort_order => 'order_date ASC NULLS LAST, customer_id ASC'
);
Removes old snapshots and the data files that are no longer referenced by any active snapshot. This is the Iceberg equivalent of Delta's VACUUM.
-- Expire snapshots older than 7 days, keep last 3
CALL spark_catalog.system.expire_snapshots(
table => 'db.orders',
older_than => TIMESTAMP '2024-01-08 00:00:00',
retain_last => 3
);
-- Also deletes orphan data files not referenced by kept snapshots
Over time, manifest files become fragmented (many tiny manifests from streaming appends). This procedure rewrites manifests to be larger and better organized, speeding up query planning.
-- Rewrite and merge small manifest files
CALL spark_catalog.system.rewrite_manifests('db.orders');
-- Use native Avro writer (avoids Spark job overhead)
CALL spark_catalog.system.rewrite_manifests(
table => 'db.orders',
use_caching => 'true'
);
Orphan files are data files in the warehouse directory that are not referenced by any snapshot (e.g., from failed writes). This removes them and recovers storage.
CALL spark_catalog.system.delete_orphan_files(
table => 'db.orders',
older_than => TIMESTAMP '2024-01-01 00:00:00'
);
Iceberg on AWS
AWS has first-class support for Apache Iceberg across S3, Glue Catalog, Athena, and EMR. This is the most common production setup for Iceberg in the cloud.
All Iceberg files (metadata JSON, manifest lists, manifests, data Parquet) live in S3. Iceberg uses S3's strong consistency (available since December 2020) for safe concurrent access. Use S3 object prefix naming wisely to avoid throttling on high-write tables.
# EMR Bootstrap — add Iceberg jars via spark-defaults.conf
# or pass --packages to spark-submit
spark = SparkSession.builder \
.appName("EMR_Iceberg") \
.config("spark.sql.extensions",
"org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions") \
.config("spark.sql.catalog.glue",
"org.apache.iceberg.spark.SparkCatalog") \
.config("spark.sql.catalog.glue.catalog-impl",
"org.apache.iceberg.aws.glue.GlueCatalog") \
.config("spark.sql.catalog.glue.warehouse",
"s3://my-data-lake/iceberg/") \
.config("spark.sql.catalog.glue.io-impl",
"org.apache.iceberg.aws.s3.S3FileIO") \
.config("spark.sql.catalog.glue.aws.region", "us-east-1") \
.getOrCreate()
# Create Glue database
spark.sql("CREATE DATABASE IF NOT EXISTS glue.analytics")
# Create Iceberg table visible in Glue + Athena
spark.sql("""
CREATE TABLE IF NOT EXISTS glue.analytics.sales (
sale_id BIGINT,
sale_date DATE,
product STRING,
revenue DOUBLE
) USING iceberg
PARTITIONED BY (months(sale_date))
""")
# Write data
df.writeTo("glue.analytics.sales").append()
Athena v3 supports Iceberg natively. Once a table is registered in Glue, Athena can query it directly with time travel support — no extra config needed.
-- Standard query
SELECT * FROM analytics.sales
WHERE sale_date >= DATE('2024-01-01');
-- Time travel in Athena
SELECT * FROM analytics.sales
FOR TIMESTAMP AS OF TIMESTAMP '2024-01-15 10:00:00 UTC';
-- MERGE in Athena (v3)
MERGE INTO analytics.sales t
USING updates u ON t.sale_id = u.sale_id
WHEN MATCHED THEN UPDATE SET revenue = u.revenue
WHEN NOT MATCHED THEN INSERT *;
EMR Serverless (no cluster management) supports Iceberg out of the box on EMR 6.10+. Just submit a Spark job — no cluster spinning needed.
aws emr-serverless start-job-run \
--application-id app-123 \
--execution-role-arn arn:aws:iam::123:role/emr-role \
--job-driver '{
"sparkSubmit": {
"entryPoint": "s3://my-bucket/scripts/iceberg_job.py",
"sparkSubmitParameters": "--conf spark.jars=/usr/share/aws/iceberg/lib/iceberg-spark3-runtime.jar"
}
}'
Knowledge Check
Test your understanding of Apache Iceberg with these FAANG-style interview questions.
months(order_date). Your data grew 10x and you now want daily partitions. What happens to old data when you run ALTER TABLE ... REPLACE PARTITION FIELD months(order_date) WITH days(order_date)?DELETE FROM db.orders WHERE customer_id = 99999. How does Iceberg handle this (default Copy-on-Write)?customer_id for faster lookup queries. Which procedure and strategy should you use?Cheat Sheet
Essential Iceberg commands and concepts at a glance.
spark.sql.extensions = IcebergSparkSessionExtensions
spark.sql.catalog.NAME = SparkCatalog
spark.sql.catalog.NAME.type = hadoop|hive|rest
spark.sql.catalog.NAME.warehouse = s3://...
USING iceberg
PARTITIONED BY (
months(order_date),
bucket(16, customer_id)
)
SELECT * FROM t
TIMESTAMP AS OF '2024-01-01'
-- by snapshot
SELECT * FROM t
VERSION AS OF 8537401...
ALTER TABLE t RENAME COLUMN price TO unit_price;
ALTER TABLE t DROP COLUMN old_col;
ALTER TABLE t ALTER COLUMN qty TYPE BIGINT;
REPLACE PARTITION FIELD months(d)
WITH days(d);
ALTER TABLE t ADD PARTITION FIELD region;
ALTER TABLE t DROP PARTITION FIELD region;
year(ts_col)
month(ts_col)
day(ts_col)
hour(ts_col)
bucket(N, col) — hash to N buckets
truncate(W, col) — prefix W chars
INSERT OVERWRITE db.t SELECT ...
UPDATE db.t SET col=val WHERE ...
DELETE FROM db.t WHERE ...
MERGE INTO db.t USING src ON ...
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
df.writeTo("db.t").overwritePartitions()
df.writeTo("db.t").createOrReplace()
df.write.format("iceberg")
.mode("append").save("db.t")
CALL sys.rewrite_manifests('db.t');
CALL sys.expire_snapshots(
'db.t', older_than=..., retain_last=3);
CALL sys.delete_orphan_files('db.t');
SELECT * FROM db.t.history;
SELECT * FROM db.t.files;
SELECT * FROM db.t.manifests;
SELECT * FROM db.t.partitions;
SELECT * FROM db.t.refs;
'db.t', 8537401032498282350);
CALL sys.rollback_to_timestamp(
'db.t', TIMESTAMP '2024-01-15 09:00:00');
aws.glue.GlueCatalog
io-impl =
aws.s3.S3FileIO
warehouse = s3://bucket/path
aws.region = us-east-1
| Feature | Iceberg | Delta Lake |
|---|---|---|
| Partition evolution (no rewrite) | ✓ Yes | ✗ No |
| Hidden partitioning | ✓ Yes | ✗ No |
| Multi-engine support | ✓ Excellent | Partial |
| Compaction command | rewrite_data_files | OPTIMIZE |
| Sort optimization | strategy='sort' | ZORDER BY |
| Snapshot cleanup | expire_snapshots | VACUUM |
| Orphan file cleanup | delete_orphan_files | VACUUM (includes this) |
| AWS native integration | ✓ Glue + Athena + EMR | Partial (S3 + EMR) |