diff --git a/docs/community/setting-up-a-dev-env.mdx b/docs/community/setting-up-a-dev-env.mdx index 345da741..8d63c984 100644 --- a/docs/community/setting-up-a-dev-env.mdx +++ b/docs/community/setting-up-a-dev-env.mdx @@ -43,8 +43,35 @@ sudo apt install openjdk-21-jdk +**Option A: Using `winget` (Recommended)** +```powershell +winget install Oracle.JDK.21 +``` + +**Option B: Manual Download** + Please follow the link to install Java 21 on Windows: [JDK 21](https://www.oracle.com/in/java/technologies/downloads/#jdk21-windows) +**Verify Installation:** +```powershell +java -version +``` + +:::caution JAVA_HOME Environment Variable +If `mvn` or other Java-based tools fail later with _"JAVA_HOME is not set"_, you need to set it manually: +```powershell +# Locate java.exe; JAVA_HOME is the JDK folder that contains bin\java.exe +Get-Command java | Select-Object -ExpandProperty Source + +# Set JAVA_HOME for the current session (adjust the path to your actual JDK folder) +$env:JAVA_HOME = "C:\Program Files\Java\jdk-21" + +# To set it permanently, use System Properties > Environment Variables +# or run (requires admin PowerShell): +[System.Environment]::SetEnvironmentVariable("JAVA_HOME", "C:\Program Files\Java\jdk-21", "Machine") +``` +::: + @@ -68,8 +95,24 @@ sudo snap install go --classic +**Option A: Using `winget` (Recommended)** +```powershell +winget install GoLang.Go +``` + +**Option B: Manual Download** + Please follow the link to install Golang 1.25 on Windows: [Go 1.25](https://go.dev/dl/) +**Verify Installation:** +```powershell +go version +``` + +:::tip +After installing Go via `winget`, you may need to **restart your terminal** (or open a new PowerShell window) for the `go` command to be available in your PATH. 
+::: + @@ -93,8 +136,20 @@ sudo apt-get install -y nodejs +**Option A: Using `winget` (Recommended)** +```powershell +winget install OpenJS.NodeJS.LTS +``` + +**Option B: Manual Download** + Please follow the link to install Node.js 22.19.0 on Windows: [Node.js 22.19.0](https://nodejs.org/en/download) +**Verify Installation:** +```powershell +node -v +``` + @@ -130,8 +185,29 @@ sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin +**Option A: Using `winget`** +```powershell +winget install Docker.DockerDesktop +``` + +**Option B: Manual Download** + Please follow the link to install Docker on Windows: [Docker Desktop](https://docs.docker.com/desktop/setup/install/windows-install/) +:::important WSL2 Backend +Docker Desktop for Windows uses **WSL2** (Windows Subsystem for Linux 2) as its default backend (Hyper-V is an alternative on Windows editions that support it). If WSL2 is not already installed, run the following in an **Administrator** PowerShell: +```powershell +wsl --install +``` +Restart your computer after installation. After restarting, launch Docker Desktop and ensure it starts without errors before proceeding. +::: + +**Verify Installation:** +```powershell +docker --version +docker compose version +``` + @@ -152,8 +228,20 @@ sudo apt install maven -y +**Option A: Using `winget` (Recommended)** +```powershell +winget install Apache.Maven +``` + +**Option B: Manual Download** + Please follow the link to install Maven on Windows: [Maven](https://maven.apache.org/download.cgi) +**Verify Installation:** +```powershell +mvn -v +``` + @@ -169,11 +257,34 @@ To better understand the workflow, let's walk through an example using **Postgre Use the following command to quickly spin up the source (Postgres/MongoDB/MySQL) and destination (Iceberg/Parquet Writer) services using Docker Compose. This will download the required docker-compose files and start the containers in the background. 
+ + ```bash sh -c 'curl -fsSL https://raw.githubusercontent.com/datazip-inc/olake-docs/master/docs/community/docker-compose.yml -o docker-compose.source.yml && \ curl -fsSL https://raw.githubusercontent.com/datazip-inc/olake/master/destination/iceberg/local-test/docker-compose.yml -o docker-compose.destination.yml && \ docker compose -f docker-compose.source.yml --profile postgres -f docker-compose.destination.yml up -d' ``` + + +```powershell +# Step 1: Download source docker-compose +Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-docs/master/docs/community/docker-compose.yml" -OutFile "docker-compose.source.yml" + +# Step 2: Download destination docker-compose (Iceberg) +Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake/master/destination/iceberg/local-test/docker-compose.yml" -OutFile "docker-compose.destination.yml" + +# Step 3: Start all source and destination containers +docker compose -f docker-compose.source.yml --profile postgres -f docker-compose.destination.yml up -d +``` + +:::tip Verify containers are running +After starting, verify all containers are healthy: +```powershell +docker compose -f docker-compose.source.yml --profile postgres -f docker-compose.destination.yml ps +``` +::: + + :::note @@ -226,9 +337,19 @@ You can swap sources and destinations just by changing the Docker profile and yo Clone the OLake repository and navigate to the project directory: + + ```bash git clone git@github.com:datazip-inc/olake.git && cd olake ``` + + +```powershell +git clone https://github.com/datazip-inc/olake.git +Set-Location olake +``` + + :::note - For contributing, you can [fork the repository on GitHub](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/about-forks) first, then clone your fork. 
@@ -368,9 +489,23 @@ You **must** review the **Discover** and **Sync** commands in the [Commands and Initially, you have to run the discover command which generates a `streams.json` file which contains all the possible streams. It requires the source name and source config path, with command type `discover`. + + ```bash ./build.sh driver-postgres discover --config $(pwd)/source.json ``` + + +```powershell +# On Windows, use 'go run' directly instead of build.sh +go run ./drivers/postgres/main.go discover --config "$PWD\source.json" +``` + + + +:::note Windows Users +The `go run` command compiles and runs the Go code directly. Unlike `./build.sh`, it does **not** automatically build the Iceberg Java JAR. If you're using Iceberg as your destination, you must [build the JAR manually](#6-debugging) before running sync commands. +::: The following video provides a comprehensive guide on the `streams.json` file structure and how to configure it for your use case: @@ -389,9 +524,19 @@ The sync command is used to sync data from the source to the destination. OLake For the first full refresh, run the sync command without the `--state` flag: + + ```bash ./build.sh driver-postgres sync --config $(pwd)/source.json --catalog $(pwd)/streams.json --destination $(pwd)/destination.json ``` + + +```powershell +# Run full refresh sync (compiles and executes in one step) +go run ./drivers/postgres/main.go sync --config "$PWD\source.json" --catalog "$PWD\streams.json" --destination "$PWD\destination.json" +``` + + After this initial sync completes, a `state.json` and `stats.json` file are automatically generated. The `state.json` file contains the necessary resume tokens and metadata that OLake uses for CDC (Change Data Capture) or Incremental sync operations. Essentially, it tells OLake from where to resume or start the next sync. 
@@ -407,9 +552,19 @@ Modifications or deletions to existing records in the source will **not** be ref Run the sync command with the `--state` flag enabled: + + ```bash ./build.sh driver-postgres sync --config $(pwd)/source.json --catalog $(pwd)/streams.json --destination $(pwd)/destination.json --state $(pwd)/state.json ``` + + +```powershell +# Run incremental sync with state tracking +go run ./drivers/postgres/main.go sync --config "$PWD\source.json" --catalog "$PWD\streams.json" --destination "$PWD\destination.json" --state "$PWD\state.json" +``` + + To learn how Incremental sync works in OLake, including how it tracks changes and resumes from previous sync points, watch the following video: @@ -425,9 +580,19 @@ This ensures that OLake continues from where it left off, only syncing new or ch Run the sync command with the `--state` flag enabled: + + ```bash ./build.sh driver-postgres sync --config $(pwd)/source.json --catalog $(pwd)/streams.json --destination $(pwd)/destination.json --state $(pwd)/state.json ``` + + +```powershell +# Run CDC sync with state tracking (captures inserts, updates, and deletes) +go run ./drivers/postgres/main.go sync --config "$PWD\source.json" --catalog "$PWD\streams.json" --destination "$PWD\destination.json" --state "$PWD\state.json" +``` + + To learn how CDC sync works in OLake, including how it tracks changes and resumes from previous sync points, watch the [comprehensive video tutorial](#setup-video-tutorial) above. @@ -439,6 +604,14 @@ To learn how CDC sync works in OLake, including how it tracks changes and resume After running the sync command, you can query your data using the Spark Iceberg service available at [`localhost:8888`](http://localhost:8888). + :::tip Windows Users + If port `8888` is already in use (e.g., by Jupyter or Anaconda), check with: + ```powershell + netstat -ano | findstr :8888 + ``` + You can stop the conflicting process or change the port mapping in `docker-compose.destination.yml`. 
+ ::: + For example, run the following SQL commands to explore your synced data: ```sql @@ -466,9 +639,18 @@ When running sync with state mode enabled, you can verify Change Data Capture/In Now run sync with state enabled: + + ```bash ./build.sh driver-postgres sync --config $(pwd)/source.json --catalog $(pwd)/streams.json --destination $(pwd)/destination.json --state $(pwd)/state.json ``` + + + ```powershell + go run ./drivers/postgres/main.go sync --config "$PWD\source.json" --catalog "$PWD\streams.json" --destination "$PWD\destination.json" --state "$PWD\state.json" + ``` + + After the sync is completed, execute the SQL query from the previous section on `localhost:8888`. You should see 2 additional rows with `op_type` indicating `"c"` for created. @@ -480,9 +662,18 @@ When running sync with state mode enabled, you can verify Change Data Capture/In Now run sync with state enabled: + + ```bash ./build.sh driver-postgres sync --config $(pwd)/source.json --catalog $(pwd)/streams.json --destination $(pwd)/destination.json --state $(pwd)/state.json ``` + + + ```powershell + go run ./drivers/postgres/main.go sync --config "$PWD\source.json" --catalog "$PWD\streams.json" --destination "$PWD\destination.json" --state "$PWD\state.json" + ``` + + After the sync is completed, execute the SQL query from the previous section on `localhost:8888`. You can notice that the `op_type` for this record will indicate `"u"` for updated. 
@@ -494,9 +685,18 @@ When running sync with state mode enabled, you can verify Change Data Capture/In Now run sync with state enabled: + + ```bash ./build.sh driver-postgres sync --config $(pwd)/source.json --catalog $(pwd)/streams.json --destination $(pwd)/destination.json --state $(pwd)/state.json ``` + + + ```powershell + go run ./drivers/postgres/main.go sync --config "$PWD\source.json" --catalog "$PWD\streams.json" --destination "$PWD\destination.json" --state "$PWD\state.json" + ``` + + After the sync is completed, execute the SQL query from the previous section on `localhost:8888`. You can notice that the `op_type` for these records will indicate `"d"` for deleted (with other fields as NONE). @@ -505,21 +705,47 @@ When running sync with state mode enabled, you can verify Change Data Capture/In While running the build command, you can add print statements to debug the flow. If you prefer using a debugger with **VSCode**, please follow the section below. If you don't want to run the sync commands after every change, you can use this debugger mode for the Go side of the code. -:::caution -If using Iceberg as destination, you need to generate the jar file for the Java side of code and move it to the correct location: +:::caution Iceberg JAR Required +If using Iceberg as destination, you need to generate the jar file for the Java side of code and move it to the correct location. **This step is mandatory for Windows users** since `go run` (used in place of `build.sh`) does not auto-generate the JAR. 1. **Generate the jar file** by running the Maven command in the Java writer directory: + + ```bash cd /[PATH_TO_OLAKE_CODE]/olake/destination/iceberg/olake-iceberg-java-writer mvn clean package -DskipTests ``` + + + ```powershell + # Navigate to the Java writer directory (adjust path to your setup) + Set-Location "$PWD\destination\iceberg\olake-iceberg-java-writer" + + # Build the JAR (requires Java 21 and Maven) + mvn clean package -DskipTests + ``` + + 2. 
**Move the generated jar file** from the target directory to the iceberg destination directory: + + ```bash cp /[PATH_TO_OLAKE_CODE]/olake/destination/iceberg/olake-iceberg-java-writer/target/olake-iceberg-java-writer-0.0.1-SNAPSHOT.jar /[PATH_TO_OLAKE_CODE]/olake/destination/iceberg/ ``` + + + ```powershell + # Copy the built JAR to the iceberg destination directory + Copy-Item "target\olake-iceberg-java-writer-0.0.1-SNAPSHOT.jar" "..\olake-iceberg-java-writer-0.0.1-SNAPSHOT.jar" + + # Return to the olake root directory + Set-Location ..\..\.. + ``` + + -Alternatively, you can generate the jar file by running the ./build.sh sync command once, which will automatically handle the jar generation process. +Alternatively, on macOS/Linux you can generate the jar file by running the `./build.sh sync` command once, which will automatically handle the jar generation process. ::: ### Steps to Debug @@ -817,6 +1043,8 @@ Alternatively, you can generate the jar file by running the ./build.sh sync comm Update `workspaceFolder` with the absolute path where the OLake project is located on your system. For example: + + ```json "program": "/Users/john/Desktop/projects/olake/drivers/mongodb/main.go", ... @@ -824,6 +1052,21 @@ Update `workspaceFolder` with the absolute path where the OLake project is locat "/Users/john/Desktop/projects/olake/drivers/mongodb/examples/source.json", ... ``` + + +```json +"program": "C:\\Users\\john\\Desktop\\projects\\olake\\drivers\\mongodb\\main.go", +... + "--config", + "C:\\Users\\john\\Desktop\\projects\\olake\\drivers\\mongodb\\examples\\source.json", +... +``` + + + +:::tip +The `${workspaceFolder}` variable in the `launch.json` config automatically resolves to the correct path on any OS. You only need to update paths if you're not opening the OLake folder as your VS Code workspace. +::: Now, set up debug points in the codebase and click "Launch Go Code". 
![VS Code debugger paused on a breakpoint in Go code for OLake, with locals, call stack, and debug console outputs visible](/img/docs/getting-started/debug.webp) diff --git a/docs/getting-started/playground.mdx b/docs/getting-started/playground.mdx index 445a0bab..62d7d4c4 100644 --- a/docs/getting-started/playground.mdx +++ b/docs/getting-started/playground.mdx @@ -96,23 +96,67 @@ Enable developers to experiment with an end-to-end, Iceberg-native lakehouse in - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d + ``` + + + ```powershell + # Download the compose file + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml" + + # Start the base OLake stack + docker compose -f docker-compose-v1.yml up -d + ``` + + + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d + ``` + + + ```powershell + # Download the compose file + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml" + + # Start the base OLake stack + docker compose -f docker-compose.yml up -d + ``` + + + + + + **Step 2: Clone the OLake repo and start services** + + The query engine compose files live in the [`olake` repository](https://github.com/datazip-inc/olake), not `olake-ui`. 
+ + + ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d + git clone https://github.com/datazip-inc/olake.git + cd olake/examples/trino-tablurarest-minio-mysql + docker compose up -d + ``` + + + ```powershell + git clone https://github.com/datazip-inc/olake.git + Set-Location olake/examples/trino-tablurarest-minio-mysql + docker compose up -d ``` - **Step 2: Navigate and start services** - - ```bash - cd examples/trino-tablurarest-minio-mysql - docker compose up -d - ``` + :::tip + If you already have the `olake` repo cloned, just `cd olake/examples/trino-tablurarest-minio-mysql` and run `docker compose up -d`. + ::: ### 2. Accessing Services @@ -259,14 +303,40 @@ Enable developers to experiment with an end-to-end, Iceberg-native lakehouse in **Step 2: Stop base OLake stack** - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down + ``` + + + ```powershell + # Download the compose file (overwrites any existing local copy) + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml" + + # Stop and remove all containers + docker compose -f docker-compose-v1.yml down + ``` + + - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down + ``` + + + ```powershell + # Download the compose file (overwrites any existing local copy) + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml" + + # Stop and remove all containers + docker compose -f 
docker-compose.yml down + ``` + + @@ -332,23 +402,67 @@ Enable developers to experiment with an end-to-end, Iceberg-native lakehouse in - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d + ``` + + + ```powershell + # Download the compose file + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml" + + # Start the base OLake stack + docker compose -f docker-compose-v1.yml up -d + ``` + + + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d + ``` + + + ```powershell + # Download the compose file + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml" + + # Start the base OLake stack + docker compose -f docker-compose.yml up -d + ``` + + + + + + **Step 2: Clone the OLake repo and start services** + + The query engine compose files live in the [`olake` repository](https://github.com/datazip-inc/olake), not `olake-ui`. 
+ + + ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d + git clone https://github.com/datazip-inc/olake.git + cd olake/examples/presto-tabularest-minio-mysql + docker compose up -d + ``` + + + ```powershell + git clone https://github.com/datazip-inc/olake.git + Set-Location olake/examples/presto-tabularest-minio-mysql + docker compose up -d ``` - **Step 2: Navigate and start services** - - ```bash - cd examples/presto-tabularest-minio-mysql - docker compose up -d - ``` + :::tip + If you already have the `olake` repo cloned, just `cd olake/examples/presto-tabularest-minio-mysql` and run `docker compose up -d`. + ::: ### 2. Accessing Services @@ -479,14 +593,40 @@ Enable developers to experiment with an end-to-end, Iceberg-native lakehouse in **Step 2: Stop base Olake stack** - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down + ``` + + + ```powershell + # Download the compose file (overwrites any existing local copy) + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml" + + # Stop and remove all containers + docker compose -f docker-compose-v1.yml down + ``` + + - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down + ``` + + + ```powershell + # Download the compose file (overwrites any existing local copy) + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml" + + # Stop and remove all containers + docker compose -f 
docker-compose.yml down + ``` + + @@ -552,23 +692,67 @@ Enable developers to experiment with an end-to-end, Iceberg-native lakehouse in - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d + ``` + + + ```powershell + # Download the compose file + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml" + + # Start the base OLake stack + docker compose -f docker-compose-v1.yml up -d + ``` + + + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d + ``` + + + ```powershell + # Download the compose file + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml" + + # Start the base OLake stack + docker compose -f docker-compose.yml up -d + ``` + + + + + + **Step 2: Clone the OLake repo and start services** + + The query engine compose files live in the [`olake` repository](https://github.com/datazip-inc/olake), not `olake-ui`. 
+ + + ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d + git clone https://github.com/datazip-inc/olake.git + cd olake/examples/spark-tablurarest-minio-mysql + docker compose up -d + ``` + + + ```powershell + git clone https://github.com/datazip-inc/olake.git + Set-Location olake/examples/spark-tablurarest-minio-mysql + docker compose up -d ``` - **Step 2: Navigate and start services** - - ```bash - cd examples/spark-tablurarest-minio-mysql - docker compose up -d - ``` + :::tip + If you already have the `olake` repo cloned, just `cd olake/examples/spark-tablurarest-minio-mysql` and run `docker compose up -d`. + ::: ### 2. Accessing Services @@ -708,14 +892,40 @@ Enable developers to experiment with an end-to-end, Iceberg-native lakehouse in **Step 2: Stop base OLake stack** - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down + ``` + + + ```powershell + # Download the compose file (overwrites any existing local copy) + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml" + + # Stop and remove all containers + docker compose -f docker-compose-v1.yml down + ``` + + - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down + ``` + + + ```powershell + # Download the compose file (overwrites any existing local copy) + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml" + + # Stop and remove all containers + docker compose -f 
docker-compose.yml down + ``` + + diff --git a/docs/getting-started/quickstart.mdx b/docs/getting-started/quickstart.mdx index 62900231..6698e5b2 100644 --- a/docs/getting-started/quickstart.mdx +++ b/docs/getting-started/quickstart.mdx @@ -13,11 +13,14 @@ This QuickStart guide helps get started with [OLake UI](/docs/install/olake-ui/) ## Prerequisites -The following requirements must be met before starting: -- [Docker](https://docs.docker.com/get-docker/) installed (Docker Desktop recommended) +- [Docker](https://docs.docker.com/get-docker/) installed and **running** (Docker Desktop recommended) - [Docker Compose](https://docs.docker.com/compose/) (included with Docker Desktop) -- At least 4GB RAM available for Docker -- Port 8000 available on the system +- At least 4GB RAM allocated to Docker +- Port `8000` available on your system + +:::tip +Make sure Docker Desktop is running before executing the commands below. You can verify by running `docker info` in your terminal. +::: ## Quick Start (Docker Compose) @@ -26,15 +29,41 @@ The fastest way to get OLake UI running is with a single command: - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d + ``` + + + ```powershell + # Download the compose file + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml" + + # Start all OLake services in the background + docker compose -f docker-compose-v1.yml up -d + ``` + + *This setup uses Postgres for both metadata and Temporal visibility.* - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d - ``` + + + ```bash + curl -sSL 
https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d + ``` + + + ```powershell + # Download the compose file + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml" + + # Start all OLake services in the background + docker compose -f docker-compose.yml up -d + ``` + + *This setup uses Elasticsearch for Temporal visibility.* @@ -67,15 +96,45 @@ To update OLake UI to the latest version, use the following commands based on yo - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down && \ - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down && \ + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d + ``` + + + ```powershell + # Download the latest compose file + Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml" + + # Stop existing containers, then start with updated images + docker compose -f docker-compose-v1.yml down + docker compose -f docker-compose-v1.yml pull + docker compose -f docker-compose-v1.yml up -d + ``` + + - ```bash - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down && \ - curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d - ``` + + + ```bash + curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down && \ + curl -sSL 
https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d
+  ```
+
+
+  ```powershell
+  # Download the latest compose file
+  Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml"
+
+  # Stop existing containers, then start with updated images
+  docker compose -f docker-compose.yml down
+  docker compose -f docker-compose.yml pull
+  docker compose -f docker-compose.yml up -d
+  ```
+
+
\ No newline at end of file
diff --git a/docs/install/docker-cli.mdx b/docs/install/docker-cli.mdx
index bc3bf994..f787c324 100644
--- a/docs/install/docker-cli.mdx
+++ b/docs/install/docker-cli.mdx
@@ -14,6 +14,28 @@ Docker CLI can be installed and executed using the official Docker images. Each
 - [Docker](https://docs.docker.com/get-docker/) installed and running on the host system
 - Internet access to pull Docker images
+:::note Windows Users
+In PowerShell, the backslash `\` is **not** a line continuation character. Replace `\` with a backtick `` ` `` or write the command on a single line.
+
+**Example — Linux/macOS:**
+```bash
+docker run --pull=always \
+  -v "/path/to/config:/mnt/config" \
+  olakego/source-postgres:latest \
+  discover \
+  --config /mnt/config/source.json
+```
+
+**Equivalent — Windows PowerShell:**
+```powershell
+docker run --pull=always `
+  -v "C:\path\to\config:/mnt/config" `
+  olakego/source-postgres:latest `
+  discover `
+  --config /mnt/config/source.json
+```
+:::
+
 In the subsequent commands, replace `[SOURCE-TYPE]` with the value corresponding to the required driver.
diff --git a/docs/install/olake-ui/index.mdx b/docs/install/olake-ui/index.mdx
index 76256d2a..b55ee51b 100644
--- a/docs/install/olake-ui/index.mdx
+++ b/docs/install/olake-ui/index.mdx
@@ -30,14 +30,17 @@ OLake UI provides a complete Docker Compose stack for running the replication sy
 ## Prerequisites
-The following requirements must be met before starting:
-- [Docker](https://docs.docker.com/get-docker/) installed (Docker Desktop recommended)
+- [Docker](https://docs.docker.com/get-docker/) installed and **running** (Docker Desktop recommended)
 - [Docker Compose](https://docs.docker.com/compose/) (included with Docker Desktop)
-- Port 8000 available on the system
+- Port `8000` available on your system
 - System Requirements:
   - Minimum: 8 vCPU, 16 GB RAM
   - Recommended: 16 vCPU, 32 GB RAM
+:::tip
+Verify Docker is running before proceeding: `docker info`. On Windows, ensure Docker Desktop is started and WSL2 integration is enabled.
+:::
+
 ## Quick Start
 ### One-Command Setup
@@ -46,15 +49,41 @@ The fastest way to get OLake UI running is with a single command:
-  ```bash
-  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d
-  ```
+
+
+  ```bash
+  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d
+  ```
+
+
+  ```powershell
+  # Download the compose file (PowerShell equivalent of curl)
+  Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml"
+
+  # Start all OLake services
+  docker compose -f docker-compose-v1.yml up -d
+  ```
+
+
 *This setup uses Postgres for both metadata and Temporal visibility, eliminating the need for Elasticsearch.*
-  ```bash
-  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d
-  ```
+
+
+  ```bash
+  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d
+  ```
+
+
+  ```powershell
+  # Download the compose file
+  Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml"
+
+  # Start all OLake services
+  docker compose -f docker-compose.yml up -d
+  ```
+
+
 *This setup uses Elasticsearch for Temporal visibility and Postgres for metadata. Use this if you have existing data in Elasticsearch.*
@@ -101,18 +130,52 @@ To update OLake UI to the latest version, use the following commands based on yo
-  ```bash
-  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down && \
-  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d
-  ```
+
+
+  ```bash
+  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - down && \
+  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d
+  ```
+
+
+  ```powershell
+  # Download the latest compose file
+  Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml"
+
+  # Stop existing containers
+  docker compose -f docker-compose-v1.yml down
+
+  # Pull latest images and restart
+  docker compose -f docker-compose-v1.yml pull
+  docker compose -f docker-compose-v1.yml up -d
+  ```
+
+
 **Note**: Your data and configurations will be preserved as they are stored in persistent volumes and the `olake-data` directory.
-  ```bash
-  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down && \
-  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d
-  ```
+
+
+  ```bash
+  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down && \
+  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d
+  ```
+
+
+  ```powershell
+  # Download the latest compose file
+  Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml"
+
+  # Stop existing containers
+  docker compose -f docker-compose.yml down
+
+  # Pull latest images and restart
+  docker compose -f docker-compose.yml pull
+  docker compose -f docker-compose.yml up -d
+  ```
+
+
 **Note**: Your data and configurations will be preserved as they are stored in persistent volumes and the `olake-data` directory.
@@ -120,13 +183,33 @@ To update OLake UI to the latest version, use the following commands based on yo
 To move from a Legacy setup (with Elasticsearch) to the new Postgres-only visibility setup, follow these steps:
 1. Stop and remove the existing legacy stack:
-  ```bash
-  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down
-  ```
+
+
+  ```bash
+  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - down
+  ```
+
+
+  ```powershell
+  Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml" -OutFile "docker-compose.yml"
+  docker compose -f docker-compose.yml down
+  ```
+
+
 2. Start the new stack:
-  ```bash
-  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d
-  ```
+
+
+  ```bash
+  curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml | docker compose -f - up -d
+  ```
+
+
+  ```powershell
+  Invoke-WebRequest -Uri "https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml" -OutFile "docker-compose-v1.yml"
+  docker compose -f docker-compose-v1.yml up -d
+  ```
+
+
 :::warning **Visibility Data Loss**
 When upgrading from the Legacy setup to the New setup, **existing job logs and workflow history** stored in Elasticsearch will **not** be visible in the new setup. Only new jobs and logs generated after the upgrade will be visible.
@@ -302,9 +385,59 @@ The retention period is specified in **hours**. Only job and sync history within
 ### Common Issues
 #### Port Conflicts
-If port binding errors occur:
-1. Check what's using the ports: `lsof -i :8000` (on macOS/Linux)
-2. Stop conflicting services or change ports in `docker-compose.yml`
+
+If Docker fails to start with a message like `Bind for 0.0.0.0:8000 failed: port is already allocated`, another process is using one of OLake's ports.
+
+**Step 1 — Find what's using the port**
+
+
+
+```bash
+lsof -i :8000
+```
+Example output:
+```
+COMMAND PID USER TYPE NODE NAME
+node 4521 john IPv4 TCP *:8000 (LISTEN)
+```
+The `PID` column shows the process ID.
+
+
+```powershell
+netstat -ano | findstr :8000
+```
+Example output:
+```
+TCP 0.0.0.0:8000 0.0.0.0:0 LISTENING 4521
+```
+The **last column** (`4521` in the example) is the Process ID (PID).
+
+To see which application that process belongs to, run the following (replace `4521` with the actual PID from your output):
+```powershell
+Get-Process -Id 4521
+```
+Example output:
+```
+Handles NPM(K) PM(K) WS(K) CPU(s) Id SI ProcessName
+------- ------ ----- ----- ------ -- -- -----------
+ 350 25 45000 52000 1.23 4521 1 node
+```
+The `ProcessName` column tells you what's running on that port (e.g. `node`, `python`, `nginx`).
+
+
+
+**Step 2 — Resolve the conflict**
+
+You have two options:
+
+- **Option A — Stop the conflicting process**: Close the application that's using the port (e.g. stop another Docker container, close a local dev server).
+
+- **Option B — Change OLake's port**: Open `docker-compose-v1.yml` (or `docker-compose.yml` for legacy) and change the host port mapping. For example, to use port `8001` instead of `8000`:
+  ```yaml
+  ports:
+    - "8001:8000" # host:container
+  ```
+  Then restart: `docker compose -f docker-compose-v1.yml up -d`
 #### Database Connection Issues
 - Ensure PostgreSQL container is healthy: `docker compose ps`