PyIceberg

To learn PyIceberg, we need a catalog and object storage. In this article, I will use PostgreSQL for the catalog and MinIO as the S3-compatible object store. Both are set up using docker-compose.

docker-compose.yml
version: "3.9"
services:
  postgres:
    image: postgres
    container_name: pg_catalog
    environment:
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: password
      POSTGRES_DB: mycatalog
    restart: unless-stopped
    ports:
      - 5432:5432

  minio:
    image: quay.io/minio/minio:latest
    command: server /data --console-address ":9001"
    container_name: minio_s3
    # volumes:
    #   - ./data:/data
    environment:
      MINIO_ROOT_USER: admin
      MINIO_ROOT_PASSWORD: password
    restart: unless-stopped
    ports:
      - 9000:9000
      - 9001:9001

Create an access key and a secret key in the MinIO console at http://localhost:9001, then replace the placeholder values in the .env file with them.

Code

Connect S3
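The keys created in the MinIO console can be read from the environment and turned into the S3 properties that PyIceberg's file IO understands. This is a minimal sketch, assuming the .env values are already exported as environment variables. The variable names `MINIO_ENDPOINT`, `MINIO_ACCESS_KEY` and `MINIO_SECRET_KEY` and the helper `s3_properties` are hypothetical; only the `s3.*` property keys come from PyIceberg.

```python
import os


def s3_properties(environ=os.environ):
    """Build PyIceberg S3 settings for the local MinIO container.

    MINIO_ACCESS_KEY / MINIO_SECRET_KEY are assumed names for the values
    stored in the .env file; rename them to match your own file.
    """
    return {
        # Port 9000 is the S3 API port published in docker-compose.yml.
        "s3.endpoint": environ.get("MINIO_ENDPOINT", "http://localhost:9000"),
        "s3.access-key-id": environ["MINIO_ACCESS_KEY"],
        "s3.secret-access-key": environ["MINIO_SECRET_KEY"],
    }
```

A missing key raises `KeyError` immediately, which is usually easier to debug than an authentication failure later on.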

Connect Catalog & S3
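One way to wire both services together is PyIceberg's SQL catalog, pointed at the Postgres container, with the MinIO endpoint passed as S3 properties. The sketch below assumes SQLAlchemy and the psycopg2 driver are installed and that a bucket named `warehouse` already exists in MinIO. The bucket name, the catalog name `default` and the helper names are illustrative, not taken from the original code.

```python
def catalog_properties(access_key, secret_key):
    """Settings for a SQL catalog in Postgres with data files in MinIO."""
    return {
        "type": "sql",
        # User, password, port and database name come from docker-compose.yml.
        "uri": "postgresql+psycopg2://admin:password@localhost:5432/mycatalog",
        # Assumed bucket name; create it in the MinIO console first.
        "warehouse": "s3://warehouse",
        "s3.endpoint": "http://localhost:9000",
        "s3.access-key-id": access_key,
        "s3.secret-access-key": secret_key,
    }


def connect(access_key, secret_key):
    """Return a PyIceberg catalog built from the settings above."""
    # Imported here so that pyiceberg is only needed when actually connecting.
    from pyiceberg.catalog import load_catalog

    return load_catalog("default", **catalog_properties(access_key, secret_key))
```

Calling `connect(access_key, secret_key)` with the keys from the .env file returns the catalog object used in the following steps.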

Create Namespace
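A namespace groups tables inside the catalog. The sketch below creates one only if the catalog does not list it yet, so it can be re-run safely. The helper name `ensure_namespace` and the namespace `demo` are placeholders; `list_namespaces` and `create_namespace` are the catalog methods it relies on.

```python
def ensure_namespace(catalog, name="demo"):
    """Create the namespace unless the catalog already lists it.

    `catalog` is a PyIceberg catalog; `demo` is a placeholder name.
    Returns True when the namespace was created by this call.
    """
    # list_namespaces() yields identifiers as tuples, e.g. ("demo",).
    if (name,) in catalog.list_namespaces():
        return False
    catalog.create_namespace(name)
    return True
```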

Create Schema & Partition & Table
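A table needs a schema and, optionally, a partition spec. The following sketch defines three columns and partitions by day of a timestamp column. The table name `demo.events`, the column names and the helper `create_events_table` are all made up for illustration; the original article's schema is not shown here.

```python
TABLE_IDENTIFIER = "demo.events"  # hypothetical namespace.table name


def create_events_table(catalog):
    """Create a day-partitioned table; all column names are illustrative."""
    # Imported here so that pyiceberg is only needed when the function runs.
    from pyiceberg.partitioning import PartitionField, PartitionSpec
    from pyiceberg.schema import Schema
    from pyiceberg.transforms import DayTransform
    from pyiceberg.types import LongType, NestedField, StringType, TimestampType

    schema = Schema(
        NestedField(field_id=1, name="id", field_type=LongType(), required=True),
        NestedField(field_id=2, name="name", field_type=StringType(), required=False),
        NestedField(field_id=3, name="created_at", field_type=TimestampType(), required=False),
    )
    # Partition by day of created_at (source_id=3 refers to that field).
    partition_spec = PartitionSpec(
        PartitionField(source_id=3, field_id=1000, transform=DayTransform(), name="created_at_day")
    )
    return catalog.create_table(
        identifier=TABLE_IDENTIFIER,
        schema=schema,
        partition_spec=partition_spec,
    )
```

The namespace part of the identifier (`demo`) has to exist in the catalog before the table is created.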

Load Table

Catalog
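Loading through the catalog only needs the table identifier; the catalog looks up the current metadata location. A minimal sketch, assuming a table named `demo.events` exists (a placeholder) and that pyarrow is installed for `to_arrow()`; the helper `count_rows` is hypothetical.

```python
def count_rows(catalog, identifier="demo.events"):
    """Load a table through the catalog and count its rows.

    The identifier is a placeholder for whatever table was created earlier.
    """
    table = catalog.load_table(identifier)
    # scan() plans a read; to_arrow() materializes it as a pyarrow Table.
    return table.scan().to_arrow().num_rows
```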

Direct Metadata File
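A table can also be opened read-only without a catalog by pointing PyIceberg's `StaticTable` at a metadata JSON file. In this sketch the helper `load_static_table` and its argument names are hypothetical; the metadata path has to be copied from the bucket (for example via the MinIO console), and the `s3.*` settings must be passed explicitly because no catalog supplies them.

```python
def load_static_table(metadata_location, s3_properties):
    """Open a read-only table straight from a metadata JSON file.

    `metadata_location` is the full s3:// path of a *.metadata.json file;
    `s3_properties` holds the s3.endpoint / s3.access-key-id / ... settings.
    """
    if not metadata_location.endswith(".metadata.json"):
        raise ValueError(f"not an Iceberg metadata file: {metadata_location}")

    # Imported here so that pyiceberg is only needed when the function runs.
    from pyiceberg.table import StaticTable

    return StaticTable.from_metadata(metadata_location, properties=s3_properties)
```

The returned table can be scanned like a catalog-loaded one, but it cannot be written to.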

Schema Evolution (Update table)

update_schema: NotImplementedError
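Schema changes go through `table.update_schema()`, used as a context manager that commits on exit. Since the note above records a `NotImplementedError`, the sketch below catches that error and reports it instead of crashing. That the error comes from the catalog backend not supporting table commits is an assumption, and the helper `try_add_column` is hypothetical.

```python
def try_add_column(table, name, field_type):
    """Try to add an optional column; report whether the change was accepted.

    `field_type` is a PyIceberg type instance such as StringType().
    """
    try:
        # The context manager commits the schema change on exit.
        with table.update_schema() as update:
            update.add_column(name, field_type)
    except NotImplementedError:
        # Assumed cause: the catalog in use cannot commit table updates.
        return False
    return True
```

With a loaded table this would be called as `try_add_column(table, "note", StringType())`, with `StringType` imported from `pyiceberg.types`; the column name `note` is only an example.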

