1 change: 1 addition & 0 deletions .gitignore
@@ -26,3 +26,4 @@ power_of_3-*.tar

**/.cubestore/*
**/model/*
TODO.md
8 changes: 0 additions & 8 deletions CHANGELOG.md
@@ -5,18 +5,10 @@ All notable changes to PowerOfThree will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.1.3] - 2024-12-24

### Added

- **Blocky Minecraft-Style Lifter**: Weightlifter character in completed snatch position
- Centered on barbell with arms extended to touch the bar
- Represents PowerOfThree successfully lifting heavy analytics workloads
- Displays on auto-generated cube compile output
- Built with Unicode block characters for consistent terminal rendering

- **ASCII Art Barbell Logo**: Olympic weightlifting barbell logo displaying on auto-generated cube output
- Left plate: Hexagon labeled "Ecto Macro Elixir" (representing Elixir/Ecto)
- Center bar: Realistic Olympic barbell with knurling pattern and collar clips
125 changes: 125 additions & 0 deletions CHANGELOG_v0.1.4.md
@@ -0,0 +1,125 @@
# Changelog

## [0.1.4] - 2025-12-26

### Added

#### Features
- **SQL Keyword Collision Detection** - Automatically detects and warns when `sql_table` names collide with SQL keywords (e.g., "order", "user", "group"). Provides actionable suggestions to use schema-qualified names (`public.order`) to prevent SQL errors.
- New functions: `is_sql_keyword?/1`, `is_schema_qualified?/1`, `validate_sql_table/2`
- Tracks 50+ SQL keywords and Cube.js reserved keywords
- Helpful warning messages with solutions
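
The collision check described above can be pictured with a minimal sketch. The module name `SqlTableCheck`, the abbreviated keyword list, and the `{:warn, message}` return shape are assumptions for illustration only; the actual `is_sql_keyword?/1`, `is_schema_qualified?/1`, and `validate_sql_table/2` in PowerOfThree track 50+ keywords and may differ in signature and behaviour.

```elixir
defmodule SqlTableCheck do
  # Hypothetical, abbreviated keyword list; the real one covers 50+ entries.
  @keywords MapSet.new(~w(order user group select table where from))

  # True when the bare table name matches a keyword, ignoring case.
  def sql_keyword?(name) when is_binary(name) do
    MapSet.member?(@keywords, String.downcase(name))
  end

  # True when the name carries a schema prefix such as "public.order".
  def schema_qualified?(name) when is_binary(name) do
    String.contains?(name, ".")
  end

  # Returns :ok, or {:warn, message} suggesting a schema-qualified name.
  def validate(name, default_schema \\ "public") do
    if sql_keyword?(name) and not schema_qualified?(name) do
      {:warn, "sql_table #{inspect(name)} is a SQL keyword; consider #{default_schema}.#{name}"}
    else
      :ok
    end
  end
end
```

In this sketch `SqlTableCheck.validate("order")` yields a warning that suggests `public.order`, while `SqlTableCheck.validate("public.order")` returns `:ok`.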

#### Testing
- **HTTP vs Arrow Performance Test Suite** (809 lines)
- 11 comprehensive test scenarios
- Query sizes from 200 to 50K rows
- Column widths from 2 to 8 columns
- Cache performance validation
- **Result:** Arrow IPC is 25-66x faster than HTTP API

- **Pre-aggregation Routing Tests** (399 lines)
- Validates query rewriting logic
- Tests granularity matching (day, month, year)
- Pre-aggregation selection verification

- **Real-world Cube Tests** (430 lines)
- Comprehensive tests for mandata_captate cube
- Time dimension query patterns
- Aggregation and filter combinations

- **SQL Keyword Safety Tests** (237 lines)
- Validates keyword collision detection
- Tests schema-qualified name handling
- Warning message verification

- **CubeStore Metastore Tests** (240 lines)
- Metastore integration validation
- Pre-aggregation discovery tests

- **Comprehensive Performance Tests** (376 lines)
- End-to-end performance benchmarking
- Query generation and execution timing
- Cache warm-up and iteration testing

**Total Test Code Added:** +2,491 lines (625% increase)

#### Documentation
- **cache_performance_impact.md** (251 lines)
- Documents dramatic Arrow IPC performance improvements
- Cache impact analysis: 3-89x speedup
- Arrow vs HTTP comparison: 25-66x faster
- Detailed benchmark tables for all test scenarios

- **PREAGG_GRANULARITY_IMPACT.md** (179 lines)
- Pre-aggregation granularity performance study
- Day vs month vs year granularity comparison
- Query routing logic documentation

- **LARGE_SCALE_TEST_RESULTS.md** (208 lines)
- 50K+ row query performance benchmarks
- Network overhead analysis
- Caching strategy recommendations

- **MANDATA_CAPTATE_TEST_RESULTS.md** (238 lines)
- Real-world cube query results
- Time dimension patterns
- Production query benchmarks

- **TEST_CLEANUP_SUMMARY.md** (182 lines)
- Test suite organization guide
- Test coverage summary
- Testing best practices

#### Presentations
- **v0.1.3-release-talk.md** (806 lines)
- Complete presentation deck for v0.1.3 release
- Architecture diagrams and performance comparisons
- Live demo scenarios

- **v0.1.3-talking-points.md** (701 lines)
- Detailed talking points and technical deep-dives
- Q&A preparation material

**Total Documentation Added:** +2,565 lines

### Changed
- Enhanced `lib/power_of_three.ex` with SQL keyword validation (+180 lines)
- Improved default value handling for auto-generation
- Enhanced test helper utilities
- Updated getting started guide

### Fixed
- Better handling of nil Ecto.Schema fields in auto-generation
- More sensible default values
- Enhanced auto-generation with `from` option

### Performance
**Arrow IPC vs HTTP API (with cache):**
- Small queries (200 rows): **25.5x faster** (2ms vs 51ms)
- Medium queries (1,827 rows): **66x faster** (1ms vs 66ms)
- Large queries (50K rows): **25x faster** (46ms vs 1,149ms)

**Cache Impact on Arrow IPC:**
- Average speedup: **30.6x faster**
- Best case: **89x faster** (89ms → 1ms)
- Range: 3-89x improvement across all query types
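
Each multiplier above is the slower timing divided by the faster one. A small sketch reproduces the figures from the listed milliseconds; the module name `Speedup` is illustrative and not part of PowerOfThree.

```elixir
defmodule Speedup do
  # Ratio of the slower timing to the faster one, rounded to one decimal place.
  def ratio(slow_ms, fast_ms) when fast_ms > 0 do
    Float.round(slow_ms / fast_ms, 1)
  end
end

# Timings (HTTP ms, Arrow IPC ms) copied from the list above.
for {label, http_ms, arrow_ms} <- [
      {"200 rows", 51, 2},
      {"1,827 rows", 66, 1},
      {"50K rows", 1149, 46}
    ] do
  IO.puts("#{label}: #{Speedup.ratio(http_ms, arrow_ms)}x")
end
```

The 50K-row case works out to 1,149 / 46 ≈ 24.98, which the changelog reports rounded as 25x.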

### Statistics
```
27 files changed
5,291 insertions(+)
104 deletions(-)
```

---

## [0.1.3] - 2024-12-XX

### Fixed
- Excluded ADBC dependency from hex.publish package
- Fixed test coverage configuration

---

For complete release notes, see [RELEASE_v0.1.4.md](./RELEASE_v0.1.4.md)
16 changes: 8 additions & 8 deletions CUBE_SERVICE_MANAGEMENT.md
@@ -5,7 +5,7 @@
The PowerOfThree `df/2` functionality requires three services to be running:
1. **PostgreSQL** - Data storage (port 7432)
2. **Cube API** - Cube.js server (port 4008)
3. **cubesqld** - Arrow Native protocol server (port 4445)
3. **cubesqld** - ADBC (Arrow Native) protocol server (port 8120)

All scripts are located in: `~/projects/learn_erl/cube/examples/recipes/arrow-ipc/`

@@ -40,7 +40,7 @@ cd ~/projects/learn_erl/cube/examples/recipes/arrow-ipc
```

**Features:**
- Provides Arrow Native protocol on port 4445
- Provides ADBC (Arrow Native) protocol on port 8120
- Provides PostgreSQL protocol on port 4444
- **Logs:** Output to terminal (stdout)

@@ -63,7 +63,7 @@ tail -f ~/projects/learn_erl/cube/examples/recipes/arrow-ipc/cubesqld.log
```bash
# If running in foreground: Ctrl+C
# If running in background:
kill $(lsof -ti:4445)
kill $(lsof -ti:8120)
```

### Stop Cube API
@@ -85,14 +85,14 @@ docker-compose down

```bash
# Check all services at once
lsof -i :7432,4008,4445 | grep LISTEN
lsof -i :7432,4008,8120 | grep LISTEN
```

Expected output:
```
postgres <pid> io 5u IPv4 ... TCP *:7432 (LISTEN)
node <pid> io 21u IPv4 ... TCP *:4008 (LISTEN)
cubesqld <pid> io 9u IPv4 ... TCP *:4445 (LISTEN)
cubesqld <pid> io 9u IPv4 ... TCP *:8120 (LISTEN)
```
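
The same check can be run from an IEx session instead of `lsof`. The following is a minimal sketch that only tests whether each port accepts a TCP connection; the module name `PortCheck` and the 500 ms timeout are assumptions, and a successful connection does not prove that the expected service is the one listening.

```elixir
defmodule PortCheck do
  # True when a TCP connection to host:port succeeds within timeout_ms.
  def open?(host, port, timeout_ms \\ 500) do
    opts = [:binary, active: false]

    case :gen_tcp.connect(String.to_charlist(host), port, opts, timeout_ms) do
      {:ok, socket} ->
        :gen_tcp.close(socket)
        true

      {:error, _reason} ->
        false
    end
  end
end

# Ports taken from the service list in this guide.
for {name, port} <- [{"PostgreSQL", 7432}, {"Cube API", 4008}, {"cubesqld", 8120}] do
  status = if PortCheck.open?("localhost", port), do: "listening", else: "not reachable"
  IO.puts("#{name} (#{port}): #{status}")
end
```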

---
@@ -133,7 +133,7 @@ Based on `~/projects/learn_erl/power-of-three-examples/config/config.exs`:
config :your_app, Adbc.CubePool,
pool_size: 10,
host: "localhost",
port: 4445, # Arrow Native protocol
port: 8120, # ADBC (Arrow Native) protocol
token: "test",
username: "username",
password: "password"
@@ -151,7 +151,7 @@ CUBEJS_DB_NAME=pot_examples_dev
CUBEJS_DB_USER=postgres
CUBEJS_DB_PASS=postgres
CUBEJS_DB_HOST=localhost
CUBEJS_ARROW_PORT=4445 # Arrow Native port
CUBEJS_ADBC_PORT=8120 # ADBC (Arrow Native) port
CUBESQL_CUBE_TOKEN=test # Authentication token
```

@@ -213,7 +213,7 @@ chmod +x ~/projects/learn_erl/cube/examples/recipes/arrow-ipc/start-all.sh
### Port Already in Use
```bash
# Find and kill process on specific port
lsof -ti:4445 | xargs kill -9
lsof -ti:8120 | xargs kill -9
```

### PostgreSQL Not Running
27 changes: 27 additions & 0 deletions CURRENT_FEATURES.md
@@ -0,0 +1,27 @@
✅ COMPLETED: Column aliasing feature

You can now control the names of columns in the returned DataFrame using keyword list syntax:

```elixir
{:ok, df} = Customer.df(
  columns: [
    mah_brand: Customer.Dimensions.brand(),
    mah_people: Customer.Measures.count()
  ],
  limit: 1
)
```

This produces a DataFrame with columns: ["mah_brand", "mah_people"] instead of the default names.

Features:
- ✅ Works with both HTTP and ADBC modes
- ✅ Supports all query options (WHERE, ORDER BY, LIMIT, OFFSET)
- ✅ Backward compatible - plain list syntax still works
- ✅ Comprehensive test coverage (5 HTTP tests)

Implementation details:
- Column refs are parsed to detect keyword list format
- Aliases are extracted and mapped to Cube member names
- DataFrame columns are renamed after query execution
- Works with both normalized names (HTTP) and full member names (ADBC)
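
The implementation steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: `AliasSketch` is a hypothetical module, Cube members are represented as plain strings such as `"customer.brand"` rather than the real column ref structs, and only the full-member-name case is covered.

```elixir
defmodule AliasSketch do
  # Splits a :columns option into Cube member names and a member-to-alias map.
  # A non-empty keyword list is treated as the aliased form; a plain list has no aliases.
  def split(columns) when is_list(columns) do
    if columns != [] and Keyword.keyword?(columns) do
      members = Keyword.values(columns)
      aliases = Map.new(columns, fn {name, member} -> {member, Atom.to_string(name)} end)
      {members, aliases}
    else
      {columns, %{}}
    end
  end

  # Renames result columns, keeping the original name where no alias exists.
  def rename(result_columns, aliases) do
    Enum.map(result_columns, &Map.get(aliases, &1, &1))
  end
end
```

Keeping the alias map separate from the member list means the query itself is built from unchanged member names, and renaming happens only on the returned column names.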
6 changes: 3 additions & 3 deletions IMPLEMENTATION_PLAN.md
@@ -73,7 +73,7 @@ Implement the TODO from `lib/power_of_three.ex:152-191`:
┌──────────────────────────────────────────────────────┐
│ ADBC Connection Pool (via CubeQuery) │
│ • Executes query against Cube (port 4445) │
│ • Executes query against Cube (port 8120) │
│ • Returns Adbc.Result │
└──────────────────────────────────────────────────────┘
@@ -483,7 +483,7 @@ Must be called before using df/1.

* `:pool_module` - Module implementing the connection pool
* `:host` - Cube server host (default: "localhost")
* `:port` - Cube Arrow Native port (default: 4445)
* `:port` - Cube ADBC port (default: 8120)
* `:token` - Authentication token (default: "test")

## Examples
@@ -495,7 +495,7 @@ Must be called before using df/1.
# Configure cube pool
cube_pool MyApp.CubePool,
host: "localhost",
port: 4445,
port: 8120,
token: System.get_env("CUBE_TOKEN")

schema "customer" do
22 changes: 11 additions & 11 deletions PHASE3_INTEGRATION_TEST_RESULTS.md
@@ -25,7 +25,7 @@ Phase 3 DataFrame functions have been successfully implemented and tested with l
|---------|------|--------|---------|
| PostgreSQL | 7432 | ✅ Running | Source database with customer data |
| Cube API | 4008 | ✅ Running | Cube.js semantic layer (HTTP/REST) |
| cubesqld | 4445 | ✅ Running | Arrow Native protocol server |
| cubesqld | 8120 | ✅ Running | ADBC (Arrow Native) protocol server |

### Configuration

@@ -38,7 +38,7 @@ Phase 3 DataFrame functions have been successfully implemented and tested with l
```elixir
[
host: "localhost",
port: 4445,
port: 8120,
token: "test",
driver_path: driver_path
]
@@ -233,9 +233,9 @@ end
```

**Connection Details:**
- Protocol: Arrow Native (via ADBC)
- Protocol: ADBC (Arrow Native)
- Driver: `libadbc_driver_cube.so`
- Connection established to `localhost:4445`
- Connection established to `localhost:8120`
- Authentication: Token-based (`token: "test"`)

**Verification:**
@@ -317,9 +317,9 @@ LIMIT 5

**Data Flow Verified:**
```
cubesqld:4445 → Cube API:4008 → PostgreSQL:7432
cubesqld:8120 → Cube API:4008 → PostgreSQL:7432
Arrow IPC format
ADBC (Arrow Native) format
Materialized Result
@@ -478,8 +478,8 @@ When Explorer is available, the result would be an `Explorer.DataFrame` instead
│ ADBC
┌─────────────────────────────────────────────────┐
│ cubesqld (localhost:4445) │
│ • Arrow Native protocol
│ cubesqld (localhost:8120) │
│ • ADBC (Arrow Native) protocol │
│ • Receives SQL via ADBC │
│ • Forwards to Cube API │
└────────────────────┬────────────────────────────┘
@@ -664,15 +664,15 @@ end)
columns: [...],
connection_opts: [
host: "localhost",
port: 4445,
port: 8120,
token: System.get_env("CUBE_TOKEN")
]
)

# Option 2: Reuse connection (recommended for multiple queries)
{:ok, conn} = PowerOfThree.CubeConnection.connect(
host: "localhost",
port: 4445,
port: 8120,
token: "my-token"
)

@@ -686,7 +686,7 @@ result2 = Customer.df!(columns: [...], connection: conn)
# config/config.exs
config :power_of_three, PowerOfThree.CubeConnection,
host: "localhost",
port: 4445,
port: 8120,
token: System.get_env("CUBE_TOKEN")

# Then queries will use this config by default: