Epee -- The Pipeline-First Database Query Language

Originally created by Saif Al-Din Ali as a compiler for a pseudocode-style language. Enhanced with a full-featured, pipeline-first database query engine, an interactive REPL, and a growing library of built-in functions.

Overview
Building and Running
Pipeline Queries
Data Types
Schema Operations
Data Manipulation
Pipeline Stages Reference
Expressions and Operators
CASE / WHEN Expressions
Aggregate Functions
Scalar Functions
Predicate Expressions
Joins
Grouping and Aggregation
Transactions
Variables and Control Flow
User-Defined Functions
Interactive REPL
Legacy Compiler Mode
Project Structure

Overview

Epee is a database query language built around the pipeline operator |>. Queries are written as a left-to-right chain of transformations instead of the nested, inside-out structure found in SQL. The engine runs entirely in-memory, supports transactions, and includes an interactive REPL for exploratory work.

SQL versus Epee

Traditional SQL:

SELECT name, salary * 12 AS annual_pay
  FROM employees
 WHERE department = 'Engineering' AND salary > 80000
 ORDER BY annual_pay DESC
 LIMIT 5;

The same query in Epee:

employees
    |> where(department == "Engineering" and salary > 80000.0)
    |> select(name, salary * 12 as annual_pay)
    |> orderby(annual_pay desc)
    |> limit(5)
    |> print;

Each stage reads left to right: start with the employees table, keep rows matching the condition, project two columns, sort them, take the first five, and print the result.

Building and Running

Requirements: a C++17 compiler (g++ or clang++) and GNU Make.

make            # build the binary
make clean      # remove object files and the binary
make test       # run all built-in test suites

Three execution modes:

./epee                      # start the interactive REPL
./epee queries.ep           # run a query file
./epee --compile program.ep # run the legacy compiler pipeline

Pipeline Queries

A pipeline begins with a table name followed by one or more stages separated by |>. Each stage transforms the result set from the previous stage.

table_name
    |> stage1(...)
    |> stage2(...)
    |> ...
    |> print;

The final print stage outputs the result as a formatted ASCII table. A pipeline that does not end with print still executes but produces no visible output (useful for update and delete pipelines).

Stages can be written on a single line or across multiple lines for readability:

employees |> where(salary > 50000.0) |> select(name, salary) |> print;

Data Types

Type	Literal examples	Notes
`int`	`42`, `-7`, `0`	32-bit signed integer
`double`	`3.14`, `-0.5`, `1e6`	64-bit floating point
`string`	`"hello"`, `"it's fine"`	Double-quoted, supports `\"`, `\\`
`bool`	`true`, `false`
`null`	`null`	The absence of a value

Schema Operations

CREATE TABLE

create table employees (
    id      int primary key,
    name    string not null,
    email   string unique,
    salary  double,
    active  bool
);

Column constraints: primary key, unique, not null.

DROP TABLE

drop table employees;

SHOW TABLES

show tables;

Displays every table with its column count and row count.

DESCRIBE

describe employees;

Shows the column names, types, nullability, primary key, and uniqueness for the named table.

Data Manipulation

INSERT

insert into employees values (1, "Alice", "alice@co.com", 95000.0, true);

Multiple rows:

insert into employees values
    (2, "Bob",   "bob@co.com",   78000.0, true),
    (3, "Carol", "carol@co.com", 62000.0, false);

Named columns:

insert into employees (id, name, salary) values (4, "Dave", 55000.0);

SELECT (SQL-style)

select name, salary from employees where salary > 70000.0 orderby salary desc limit 5;

select distinct department from employees;

select count(*), avg(salary) from employees;

UPDATE (SQL-style)

update employees set salary = salary * 1.1 where active == true;

DELETE (SQL-style)

delete from employees where active == false;

Pipeline equivalents

The same operations expressed as pipelines:

// SELECT
employees
    |> where(salary > 70000.0)
    |> select(name, salary)
    |> orderby(salary desc)
    |> take(5)
    |> print;

// UPDATE
employees
    |> where(active == true)
    |> update(salary = salary * 1.1);

// DELETE
employees
    |> where(active == false)
    |> delete;

Pipeline Stages Reference

Complete list of pipeline stages

Stage	Description
`where(condition)`	Keep rows satisfying the condition
`select(columns...)`	Project or compute columns (replaces column set)
`map(columns...)`	Add computed columns to existing column set
`orderby(col asc/desc)`	Sort by one or more columns
`limit(n)` / `take(n)`	Return at most n rows
`offset(n)` / `skip(n)`	Skip the first n rows
`groupby(columns...)`	Group rows (combine with a following `select`)
`having(condition)`	Filter groups after aggregation
`join(table on condition)`	Inner-join another table
`distinct`	Remove duplicate rows
`count`	Replace the result with a single count row
`update(col = expr, ...)`	Modify matching rows in the underlying table
`delete`	Remove matching rows from the underlying table
`print`	Output the result as a formatted table

select versus map

select replaces the column set:

products |> select(name, price) |> print;
// result has exactly two columns: name, price

map appends new computed columns to the existing set:

products |> map(price * 1.1 as new_price) |> print;
// result has all original columns plus new_price

take and skip

take(n) and skip(n) are aliases for limit(n) and offset(n).

employees |> orderby(salary desc) |> skip(10) |> take(5) |> print;

Expressions and Operators

Arithmetic

Operator	Meaning	Example
`+`	Addition	`price + tax`
`-`	Subtraction	`total - discount`
`*`	Multiplication	`qty * price`
`/`	Division	`total / count`
`%`	Modulo	`id % 2`

String concatenation: "hello" + " world" produces "hello world".

Comparison

Operator	Meaning
`==`	Equal
`<>` or `!=`	Not equal
`<`	Less than
`>`	Greater than
`<=`	Less than or equal
`>=`	Greater than or equal

Logical

Operator	Meaning
`and`	Logical AND (short-circuit)
`or`	Logical OR (short-circuit)
`not`	Logical negation

CASE / WHEN Expressions

Conditional expressions with multiple branches:

select
    name,
    case
        when salary > 100000.0 then "Senior"
        when salary > 60000.0  then "Mid"
        else "Junior"
    end as level
from employees;

Inside a pipeline:

employees
    |> select(
        name,
        case
            when salary > 100000.0 then "Senior"
            when salary > 60000.0  then "Mid"
            else "Junior"
        end as level
    )
    |> print;

The else clause is optional. If no when branch matches and there is no else, the expression evaluates to null.

Aggregate Functions

These functions operate over groups of rows (or the entire result set when there is no groupby).

Function	Description
`count(*)`	Number of rows
`count(col)`	Number of non-null values
`sum(col)`	Sum of numeric values
`avg(col)`	Average of numeric values
`min(col)`	Minimum value
`max(col)`	Maximum value

Example:

orders
    |> groupby(region)
    |> select(region, count(*) as total, avg(amount) as avg_amount)
    |> orderby(total desc)
    |> print;

Scalar Functions

String functions

Function	Description
`upper(s)`	Convert to uppercase
`lower(s)`	Convert to lowercase
`length(s)`	Number of characters
`substr(s, start)` or `substr(s, start, len)`	Extract substring
`left(s, n)`	First n characters
`right(s, n)`	Last n characters
`trim(s)`	Remove leading/trailing whitespace
`lpad(s, len, pad)`	Left-pad to target length
`rpad(s, len, pad)`	Right-pad to target length
`replace(s, old, new)`	Replace all occurrences
`reverse(s)`	Reverse the string
`repeat(s, n)`	Repeat the string n times
`concat(a, b, ...)`	Concatenate values

Example:

employees
    |> select(
        name,
        upper(name) as shouting,
        left(name, 1) as initial,
        lpad(name, 12, ".") as padded,
        reverse(name) as backwards
    )
    |> take(5)
    |> print;

Math functions

Function	Description
`abs(n)`	Absolute value
`round(n)`	Round to nearest integer
`floor(n)`	Round down
`ceil(n)`	Round up
`power(base, exp)`	Raise base to the power of exp
`sqrt(n)`	Square root
`log(n)` or `log(base, n)`	Natural log, or log with custom base
`pi()`	The constant 3.14159...
`random()`	Random double in the range [0.0, 1.0)

Example:

products
    |> select(name, price, round(sqrt(price)) as root, power(price, 2) as squared)
    |> take(5)
    |> print;

Utility functions

Function	Description
`coalesce(a, b, ...)`	Return the first non-null argument
`nullif(a, b)`	Return null if a equals b, otherwise return a
`typeof(expr)`	Return the type name as a string
`cast(expr, "type")`	Convert to the given type (int/double/string/bool)
`iif(cond, t, f)`	Inline if: return t when cond is true, else f
`now()`	Current timestamp as a string

Example:

contacts
    |> select(name, coalesce(phone, email, "N/A") as contact)
    |> print;

inventory
    |> select(
        product,
        quantity,
        iif(quantity > 100, "In Stock", "Low Stock") as availability,
        typeof(price) as price_type
    )
    |> print;

Predicate Expressions

BETWEEN / NOT BETWEEN

products |> where(price between 10.0 and 100.0) |> print;
products |> where(price not between 10.0 and 100.0) |> print;

IN / NOT IN

products |> where(category in ("Electronics", "Books")) |> print;
products |> where(category not in ("Electronics", "Books")) |> print;

LIKE / NOT LIKE

Pattern matching with % (any sequence of characters) and _ (single character). Matching is case-insensitive.

products |> where(name like "Laptop%") |> print;
products |> where(name not like "%Pro%") |> print;

IS NULL / IS NOT NULL

products |> where(description is null) |> print;
products |> where(description is not null) |> print;

Joins

Four join types are supported.

SQL-style joins

select employees.name, departments.name
    from employees
    inner join departments on employees.dept_id == departments.id;

select employees.name, departments.name
    from employees
    left join departments on employees.dept_id == departments.id;

Join types: inner join, left join, right join, cross join.

Pipeline joins

employees
    |> join(departments on employees.dept_id == departments.id)
    |> select(employees.name, departments.name, employees.salary)
    |> orderby(employees.salary desc)
    |> print;

The pipeline join defaults to inner join.

Grouping and Aggregation

Group rows and compute aggregates per group:

orders
    |> groupby(region)
    |> select(region, count(*) as num, sum(total) as revenue)
    |> orderby(revenue desc)
    |> print;

Use CASE inside a pipeline for dynamic bucketing:

products
    |> select(
        case
            when price > 500.0  then "Premium"
            when price > 100.0  then "Mid-range"
            else "Budget"
        end as tier,
        price
    )
    |> groupby(tier)
    |> select(tier, count(*) as items, avg(price) as avg_price)
    |> print;

SQL-style syntax:

select dept_id, count(*) as cnt, avg(salary) as avg_sal
    from employees
    groupby dept_id;

Transactions

Wrap a group of operations in a transaction. Use rollback to undo changes if something goes wrong.

begin;
    products |> where(category == "Electronics") |> update(price = price * 0.9);
commit;

begin;
    // something went wrong
rollback;

Variables and Control Flow

Variable declaration and assignment

int count;
count = 42;
print count;

string greeting;
greeting = "hello";
print greeting;

Supported types: int, double, string, bool.

If / else

int x;
x = 10;
if x > 5 then
    print "big";
else
    print "small";
fi;

While loops

int i;
i = 0;
while i < 5 do
    print i;
    i = i + 1;
od;

User-Defined Functions

Define reusable functions with typed parameters and return values. Recursion is supported.

def int factorial(int n)
    if n <= 1 then
        return 1;
    fi;
    return n * factorial(n - 1);
fed;

print factorial(5);

Functions can be called inside pipeline stages:

def double discount(double price, double pct)
    return price * (1.0 - pct / 100.0);
fed;

products
    |> select(name, price, discount(price, 15.0) as sale_price)
    |> print;

Interactive REPL

Start the REPL with ./epee (no arguments).

$ ./epee
    Epee v2.0 -- Pipeline-First Database Query Language
    Type 'help' for commands, 'exit' to quit.

epee> create table t (id int, name string);
Table 't' created.

epee> insert into t values (1, "Alice"), (2, "Bob");
Inserted 2 row(s).

epee> t |> print;
+----+-------+
| id | name  |
+----+-------+
|  1 | Alice |
|  2 | Bob   |
+----+-------+
2 row(s)

epee> exit

Commands: help, exit, quit.

Multi-line input is supported. The REPL accumulates lines until a semicolon is found.

Comments

// This is a line comment

/* This is a
   block comment */

Legacy Compiler Mode

The original compiler pipeline processes a pseudocode language through lexical analysis, syntax tree construction, semantic analysis, and three-address code generation. Activate it with:

./epee --compile program.ep

This mode is independent of the database engine.

Project Structure

Compiler/
  include/
    database/          -- database engine headers
      value.hpp        -- variant value type (int/double/string/bool/null)
      table.hpp        -- table, row, database, query result
      dbLexer.hpp      -- lexer token types and scanner
      dbParser.hpp     -- AST node types and recursive descent parser
      executor.hpp     -- query executor
      repl.hpp         -- interactive REPL
    lexicalAnalysis/   -- legacy compiler lexer
    syntaxAnalysis/    -- legacy compiler parser
    semanticAnalysis/  -- legacy compiler semantic analyzer
    intermediateCode/  -- three-address code generator
    tokens/            -- token type definitions
  src/
    main.cpp           -- entry point (REPL / file / legacy mode)
    database/          -- database engine implementation
    lexicalAnalysis/   -- legacy compiler implementation
    syntaxAnalysis/
    semanticAnalysis/
    intermediateCode/
    tokens/
  input/
    TestDB1.ep         -- basic table operations
    TestDB2.ep         -- pipeline queries
    TestDB3.ep         -- joins and aggregations
    TestDB4.ep         -- transactions, LIKE, BETWEEN, variables
    TestDB5.ep         -- CASE/WHEN, new functions, map/take/skip
    Test1-11.ep        -- legacy compiler tests
Makefile               -- build rules

Running Tests

make test

This executes TestDB1.ep through TestDB5.ep, covering table operations, pipeline queries, joins, aggregations, transactions, new functions, and CASE expressions.

Quick Reference

-- Schema
create table T (col type [constraints], ...);
drop table T;
show tables;
describe T;

-- Insert
insert into T values (...);

-- Query (SQL-style)
select cols from T where cond orderby col desc limit n;

-- Query (pipeline)
T |> where(cond) |> select(cols) |> orderby(col desc) |> take(n) |> print;

-- Computed columns
T |> map(price * 1.1 as new_price) |> print;

-- Grouping
T |> groupby(col) |> select(col, count(*) as n, avg(val) as mean) |> print;

-- CASE expression
T |> select(case when x > 0 then "pos" else "neg" end as sign) |> print;

-- Update / Delete
T |> where(cond) |> update(col = expr);
T |> where(cond) |> delete;

-- Transactions
begin; ... commit;

-- Variables and control flow
int x; x = 10;
if x > 5 then print x; fi;
while x > 0 do x = x - 1; od;

-- Functions
def int f(int n) return n * 2; fed;

Name		Name	Last commit message	Last commit date
Latest commit History 222 Commits
Compiler.xcodeproj		Compiler.xcodeproj
Compiler		Compiler
.gitattributes		.gitattributes
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

Epee -- The Pipeline-First Database Query Language

Table of Contents

Overview

SQL versus Epee

Building and Running

Pipeline Queries

Data Types

Schema Operations

CREATE TABLE

DROP TABLE

SHOW TABLES

DESCRIBE

Data Manipulation

INSERT

SELECT (SQL-style)

UPDATE (SQL-style)

DELETE (SQL-style)

Pipeline equivalents

Pipeline Stages Reference

Complete list of pipeline stages

select versus map

take and skip

Expressions and Operators

Arithmetic

Comparison

Logical

CASE / WHEN Expressions

Aggregate Functions

Scalar Functions

String functions

Math functions

Utility functions

Predicate Expressions

BETWEEN / NOT BETWEEN

IN / NOT IN

LIKE / NOT LIKE

IS NULL / IS NOT NULL

Joins

SQL-style joins

Pipeline joins

Grouping and Aggregation

Transactions

Variables and Control Flow

Variable declaration and assignment

If / else

While loops

User-Defined Functions

Interactive REPL

Comments

Legacy Compiler Mode

Project Structure

Running Tests

Quick Reference

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages