2 changes: 1 addition & 1 deletion docs/01-language.md
@@ -176,7 +176,7 @@ From highest (evaluated first) to lowest (evaluated last):
| 1 (highest) | function call, array index `[]`, parentheses `()` |
| 2 | unary minus `-`, logical not `!` |
| 3 | `*`, `/` |
| 4 | `+`, `-` |
| 4 | `+`, `-`, `++` |
| 5 | `==`, `!=`, `<`, `<=`, `>`, `>=` |
| 6 | `and` |
| 7 (lowest) | `or` |
2 changes: 2 additions & 0 deletions docs/03-ast.md
@@ -45,6 +45,8 @@ Expr::Literal(42) -- the integer 42
Expr::Ident("x") -- the variable x
Expr::Add(left, right) -- left + right
Expr::Mul(left, right) -- left * right
Expr::Len(expr) -- length of a string/array expression
Expr::Contains(a, b) -- membership/substring check expression
Expr::Call { name, args } -- a function call
Expr::Index { base, index } -- base[index]
```
15 changes: 14 additions & 1 deletion docs/04-parser.md
@@ -33,6 +33,15 @@ Here is roughly what happens, step by step:
The same recursive logic handles arbitrarily complex expressions like
`a * (b + c) - sqrt(d)`.

In addition to generic function calls, MiniC also parses two dedicated
expression forms with function-like syntax:

- `len(expr)`
- `contains(expr, expr)`

Even though they look like calls, they are mapped to dedicated AST nodes
(`Expr::Len` and `Expr::Contains`) instead of `Expr::Call`.
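
The dispatch described above can be sketched without nom. This is a minimal illustration, not the project's parser: `Atom`, `split_call`, and `parse_atom` are hypothetical names, and arguments are kept as raw text rather than parsed recursively.

```rust
/// Hypothetical, simplified stand-in for the real `Expr` AST.
#[derive(Debug, PartialEq)]
enum Atom {
    Len(String),
    Contains(String, String),
    Call { name: String, args: Vec<String> },
}

/// Splits `name(arg, ...)` into the name and its raw argument strings.
/// Assumption: arguments are flat text; nested calls are not handled.
fn split_call(input: &str) -> Option<(&str, Vec<String>)> {
    let input = input.trim();
    let open = input.find('(')?;
    let inner = input.strip_suffix(')')?.get(open + 1..)?;
    let name = input[..open].trim();
    if name.is_empty() {
        return None;
    }
    let args = if inner.trim().is_empty() {
        Vec::new()
    } else {
        inner.split(',').map(|a| a.trim().to_string()).collect()
    };
    Some((name, args))
}

/// Tries the dedicated forms first, then falls back to a generic call.
fn parse_atom(input: &str) -> Option<Atom> {
    let (name, mut args) = split_call(input)?;
    match (name, args.len()) {
        ("len", 1) => Some(Atom::Len(args.remove(0))),
        ("contains", 2) => {
            let item = args.pop()?;
            let container = args.pop()?;
            Some(Atom::Contains(container, item))
        }
        // Anything else, including `len` with the wrong arity, is a plain call.
        _ => Some(Atom::Call { name: name.to_string(), args }),
    }
}
```

With this sketch, `parse_atom("len(xs)")` yields `Atom::Len`, while `parse_atom("sqrt(d)")` yields `Atom::Call`.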

---

## What is a Parser Combinator?
@@ -103,13 +112,17 @@ expression
└── logical_and (and)
└── logical_not (!)
└── relational (== != < <= > >=)
└── additive (+ -)
└── additive (+ - ++)
└── multiplicative (* /)
└── unary (unary -)
└── primary (atoms + indexing)
└── atom
```

At the `atom` level, the parser tries `len(...)` and `contains(...)` before
the generic call parser, so a well-formed `len(...)` or `contains(...)`
becomes a dedicated expression node rather than `Expr::Call`.

When `additive` needs its right operand, it calls `multiplicative`. So `*`
always groups before `+` — naturally, without any precedence table.
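
To see how layering alone produces precedence, here is a small hand-written sketch (it does not use nom and is not the project's code). The names `tokenize`, `multiplicative`, `additive`, and `group` are illustrative, atoms are limited to bare identifiers or numbers, and the input is assumed to be non-empty and well-formed.

```rust
/// Splits source text into identifiers/numbers, `++`, and single-char operators.
fn tokenize(src: &str) -> Vec<String> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens: Vec<String> = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c.is_alphanumeric() {
            let start = i;
            while i < chars.len() && chars[i].is_alphanumeric() {
                i += 1;
            }
            tokens.push(chars[start..i].iter().collect::<String>());
        } else if c == '+' && chars.get(i + 1) == Some(&'+') {
            tokens.push("++".to_string());
            i += 2;
        } else {
            tokens.push(c.to_string());
            i += 1;
        }
    }
    tokens
}

/// multiplicative := atom (('*' | '/') atom)*
fn multiplicative(tokens: &[String], pos: &mut usize) -> String {
    let mut acc = tokens[*pos].clone();
    *pos += 1;
    while *pos + 1 < tokens.len() && (tokens[*pos] == "*" || tokens[*pos] == "/") {
        let op = &tokens[*pos];
        let rhs = &tokens[*pos + 1];
        acc = format!("({acc} {op} {rhs})");
        *pos += 2;
    }
    acc
}

/// additive := multiplicative (('+' | '-' | '++') multiplicative)*
/// Each operand is a full `multiplicative`, so `*` and `/` bind tighter.
fn additive(tokens: &[String], pos: &mut usize) -> String {
    let mut acc = multiplicative(tokens, pos);
    while *pos + 1 < tokens.len() && matches!(tokens[*pos].as_str(), "+" | "-" | "++") {
        let op = tokens[*pos].clone();
        *pos += 1;
        let rhs = multiplicative(tokens, pos);
        acc = format!("({acc} {op} {rhs})");
    }
    acc
}

/// Returns the expression with explicit parentheses showing the grouping.
fn group(src: &str) -> String {
    let tokens = tokenize(src);
    let mut pos = 0;
    additive(&tokens, &mut pos)
}
```

For example, `group("a + b * c")` returns `(a + (b * c))`, and `group("a ++ b ++ c")` returns `((a ++ b) ++ c)`, since `++` sits at the same level as `+` and `-` and associates to the left.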

14 changes: 14 additions & 0 deletions docs/05-type-checker.md
@@ -144,6 +144,20 @@ pass type checking.
`Type::Any` is never inferred for a variable or expression — it only appears
in the registry as a parameter type for built-in functions.

### `len` and `contains` as expression forms

`len` and `contains` are no longer validated through stdlib function
signatures. They are checked as dedicated expression nodes:

- `len(expr)`
: `expr` must be `str` or `array`, result type is `int`.
- `contains(container, item)`
: if `container` is `str`, `item` must be `str`; if `container` is
`array(T)`, `item` must be compatible with `T`; result type is `bool`.

As a result, type errors for these constructs are reported by their own
expression rules rather than by generic call-argument validation.
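
The two rules can be sketched in isolation as follows. This is a simplified illustration: the `Type` enum here is a cut-down stand-in, errors are plain strings, and compatibility is approximated by equality, whereas the real checker uses its own compatibility check. `check_len` and `check_contains` are hypothetical names.

```rust
/// Cut-down stand-in for the checker's type representation.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Int,
    Bool,
    Str,
    Array(Box<Type>),
}

/// `len(expr)`: operand must be `str` or `array`; result is `int`.
fn check_len(arg: &Type) -> Result<Type, String> {
    match arg {
        Type::Str | Type::Array(_) => Ok(Type::Int),
        other => Err(format!("len requires a string or array operand, got {other:?}")),
    }
}

/// `contains(container, item)`: result is `bool` when the item fits the container.
/// Assumption: "compatible" is modelled as plain equality in this sketch.
fn check_contains(container: &Type, item: &Type) -> Result<Type, String> {
    match container {
        Type::Str if *item == Type::Str => Ok(Type::Bool),
        Type::Str => Err("contains: string container requires string item".to_string()),
        Type::Array(elem) if **elem == *item => Ok(Type::Bool),
        Type::Array(_) => Err("contains: array item type mismatch".to_string()),
        _ => Err("contains requires a string or array container".to_string()),
    }
}
```

Under these assumptions, `check_contains(&Type::Array(Box::new(Type::Int)), &Type::Int)` succeeds with `Type::Bool`, while `check_contains(&Type::Str, &Type::Int)` is rejected.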

---

## Key Design Decision: Fail on the First Error
9 changes: 9 additions & 0 deletions docs/06-interpreter.md
@@ -97,6 +97,15 @@ executor `exec_stmt` handles each statement form:
| `return expr` | Evaluates `expr` and signals an early return |
| `f(args)` | Evaluates arguments, calls `f`, discards the return value |

In expression evaluation, MiniC also supports dedicated nodes for:

- `len(expr)`
: evaluates `expr` and returns `Int` with character count (`str`) or element
count (`array`).
- `contains(container, item)`
: evaluates both operands and returns `Bool` using substring semantics for
strings and membership semantics for arrays.
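
A standalone sketch of these runtime semantics is shown below. The `Value` enum is a reduced stand-in, errors are plain strings, and array membership uses derived equality in place of the interpreter's own comparison helper. `eval_len` and `eval_contains` are hypothetical names.

```rust
/// Reduced stand-in for the interpreter's runtime values.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Bool(bool),
    Str(String),
    Array(Vec<Value>),
}

/// `len`: counts characters (not bytes) for strings, elements for arrays.
fn eval_len(v: &Value) -> Result<Value, String> {
    match v {
        Value::Str(s) => Ok(Value::Int(s.chars().count() as i64)),
        Value::Array(elems) => Ok(Value::Int(elems.len() as i64)),
        other => Err(format!("len: expected string or array argument, got: {other:?}")),
    }
}

/// `contains`: substring test for strings, membership test for arrays.
fn eval_contains(container: &Value, item: &Value) -> Result<Value, String> {
    match (container, item) {
        (Value::Str(s), Value::Str(needle)) => Ok(Value::Bool(s.contains(needle.as_str()))),
        (Value::Str(_), _) => {
            Err("contains: string container requires string item".to_string())
        }
        // Assumption: derived `==` stands in for the interpreter's equality helper.
        (Value::Array(elems), _) => Ok(Value::Bool(elems.iter().any(|e| e == item))),
        (other, _) => Err(format!(
            "contains: expected string or array container, got: {other:?}"
        )),
    }
}
```

Counting with `chars()` means a string such as `"h\u{e9}llo"` has length 5 even though it occupies 6 bytes in UTF-8.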

### How `return` propagates

Statements do not normally produce values, but `return` must pass its value
4 changes: 4 additions & 0 deletions docs/07-stdlib.md
@@ -4,6 +4,10 @@ MiniC comes with a small set of built-in functions available to every
program. This document describes them from a user perspective and then
explains how they are implemented and how to add new ones.

Note: `len(...)` and `contains(...)` are core language expressions in the
parser/type-checker/interpreter pipeline. They are not registered as native
functions in `NativeRegistry`.

---

## Built-in Functions
53 changes: 53 additions & 0 deletions src/interpreter/eval_expr.rs
@@ -69,6 +69,8 @@ pub fn eval_expr(expr: &CheckedExpr, env: &mut Environment<Value>) -> Result<Val
Expr::Mul(l, r) => numeric_binop(eval_expr(l, env)?, eval_expr(r, env)?, |a, b| a * b, |a, b| a * b),
Expr::Div(l, r) => numeric_binop(eval_expr(l, env)?, eval_expr(r, env)?, |a, b| a / b, |a, b| a / b),

Expr::Concat(l, r) => string_binop(eval_expr(l, env)?, eval_expr(r, env)?, |a, b| a + &b),

Expr::Lt(l, r) => numeric_cmp(eval_expr(l, env)?, eval_expr(r, env)?, |a, b| a < b, |a, b| a < b),
Expr::Le(l, r) => numeric_cmp(eval_expr(l, env)?, eval_expr(r, env)?, |a, b| a <= b, |a, b| a <= b),
Expr::Gt(l, r) => numeric_cmp(eval_expr(l, env)?, eval_expr(r, env)?, |a, b| a > b, |a, b| a > b),
@@ -121,6 +123,43 @@ pub fn eval_expr(expr: &CheckedExpr, env: &mut Environment<Value>) -> Result<Val
Ok(Value::Array(vals?))
}

Expr::Len(arg) => {
let val = eval_expr(arg, env)?;
match val {
Value::Str(s) => Ok(Value::Int(s.chars().count() as i64)),
Value::Array(elems) => Ok(Value::Int(elems.len() as i64)),
v => Err(RuntimeError::new(format!(
"len: expected string or array argument, got: {}",
v
))),
}
}

Expr::Contains(container, item) => {
let container_val = eval_expr(container, env)?;
let item_val = eval_expr(item, env)?;
match container_val {
Value::Str(s) => {
if let Value::Str(item_str) = item_val {
Ok(Value::Bool(s.contains(&item_str)))
} else {
Err(RuntimeError::new("contains: string container requires string item"))
}
}
Value::Array(elems) => {
Ok(Value::Bool(elems.iter().any(|e| values_equal(e, &item_val))))
}
v => Err(RuntimeError::new(format!(
"contains: expected string or array container, got: {}",
v
))),
}
}

Expr::Index { base, index } => {
let base_val = eval_expr(base, env)?;
let idx_val = eval_expr(index, env)?;
@@ -209,6 +248,20 @@ fn numeric_binop(
}
}

fn string_binop(
lv: Value,
rv: Value,
concat_op: impl Fn(String, String) -> String,
) -> Result<Value, RuntimeError> {
match (lv, rv) {
(Value::Str(a), Value::Str(b)) => Ok(Value::Str(concat_op(a, b))),
(l, r) => Err(RuntimeError::new(format!(
"string concatenation requires Str operands, got: {} and {}",
l, r
))),
}
}

fn numeric_cmp(
lv: Value,
rv: Value,
3 changes: 3 additions & 0 deletions src/ir/ast.rs
@@ -89,6 +89,7 @@ pub enum Expr<Ty> {
Neg(Box<ExprD<Ty>>),
Add(Box<ExprD<Ty>>, Box<ExprD<Ty>>),
Sub(Box<ExprD<Ty>>, Box<ExprD<Ty>>),
Concat(Box<ExprD<Ty>>, Box<ExprD<Ty>>),
Mul(Box<ExprD<Ty>>, Box<ExprD<Ty>>),
Div(Box<ExprD<Ty>>, Box<ExprD<Ty>>),
Eq(Box<ExprD<Ty>>, Box<ExprD<Ty>>),
@@ -100,6 +101,8 @@ pub enum Expr<Ty> {
Not(Box<ExprD<Ty>>),
And(Box<ExprD<Ty>>, Box<ExprD<Ty>>),
Or(Box<ExprD<Ty>>, Box<ExprD<Ty>>),
Len(Box<ExprD<Ty>>),
Contains(Box<ExprD<Ty>>, Box<ExprD<Ty>>),
/// Function call: name(args)
Call {
name: String,
41 changes: 39 additions & 2 deletions src/parser/expressions.rs
@@ -43,7 +43,7 @@ use nom::{
character::complete::{char, multispace0},
combinator::map,
multi::separated_list0,
sequence::{delimited, pair, preceded, tuple},
sequence::{delimited, pair, preceded, tuple, separated_pair},
IResult,
};

@@ -65,11 +65,42 @@ pub fn parse_call(input: &str) -> IResult<&str, (String, Vec<UncheckedExpr>)> {
Ok((rest, (name.to_string(), args)))
}

/// Atom: literal, call, array literal, identifier, or parenthesized expression.
/// Parse length: `len ( expr )`. Returns the inner expr.
pub fn parse_len(input: &str) -> IResult<&str, UncheckedExpr> {
let (rest, _) = preceded(multispace0, tag("len"))(input)?;
let (rest, arg) = delimited(
preceded(multispace0, tag("(")),
preceded(multispace0, expression),
preceded(multispace0, tag(")")),
)(rest)?;
Ok((rest, arg))
}

/// Parse contains: `contains ( expr, expr )`. Returns (container, item).
pub fn parse_contains(input: &str) -> IResult<&str, (UncheckedExpr, UncheckedExpr)> {
let (rest, _) = preceded(multispace0, tag("contains"))(input)?;
let (rest, (container, item)) = delimited(
preceded(multispace0, tag("(")),
separated_pair(
preceded(multispace0, expression),
preceded(multispace0, tag(",")),
preceded(multispace0, expression),
),
preceded(multispace0, tag(")")),
)(rest)?;
Ok((rest, (container, item)))
}

/// Atom: literal, len, contains, call, array literal, identifier, or parenthesized expression.
fn atom(input: &str) -> IResult<&str, UncheckedExpr> {
alt((
map(literal, |l| wrap(Expr::Literal(l.into()))),
map(parse_len, |arg| wrap(Expr::Len(Box::new(arg)))),
map(parse_contains, |(container, item)| {
wrap(Expr::Contains(Box::new(container), Box::new(item)))
}),
map(parse_call, |(name, args)| wrap(Expr::Call { name, args })),

map(
delimited(
preceded(multispace0, char('[')),
@@ -160,6 +191,12 @@ fn additive(input: &str) -> IResult<&str, UncheckedExpr> {
rest = r;
continue;
}
let str_concat = tuple((multispace0, tag("++"), multispace0, multiplicative))(rest);
if let Ok((r, (_, _, _, e))) = str_concat {
acc = wrap(Expr::Concat(Box::new(acc), Box::new(e)));
rest = r;
continue;
}
break;
}
Ok((rest, acc))
49 changes: 49 additions & 0 deletions src/semantic/type_checker.rs
@@ -358,6 +358,10 @@ fn type_check_expr_inner(
Box::new(type_check_expr_to_typed(l, env)?),
Box::new(type_check_expr_to_typed(r, env)?),
)),
Expr::Concat(l, r) => Ok(Expr::Concat(
Box::new(type_check_expr_to_typed(l, env)?),
Box::new(type_check_expr_to_typed(r, env)?),
)),
Expr::Mul(l, r) => Ok(Expr::Mul(
Box::new(type_check_expr_to_typed(l, env)?),
Box::new(type_check_expr_to_typed(r, env)?),
@@ -399,6 +403,11 @@ fn type_check_expr_inner(
Box::new(type_check_expr_to_typed(l, env)?),
Box::new(type_check_expr_to_typed(r, env)?),
)),
Expr::Len(arg) => Ok(Expr::Len(Box::new(type_check_expr_to_typed(arg, env)?))),
Expr::Contains(container, item) => Ok(Expr::Contains(
Box::new(type_check_expr_to_typed(container, env)?),
Box::new(type_check_expr_to_typed(item, env)?),
)),
Expr::Call { name, args } => {
let args_checked: Result<Vec<_>, _> =
args.iter().map(|a| type_check_expr_to_typed(a, env)).collect();
@@ -446,6 +455,11 @@ fn type_check_expr(
let rt = type_check_expr(r, env)?;
numeric_binop_result(&lt, &rt)
}
Expr::Concat(l, r) => {
let lt = type_check_expr(l, env)?;
let rt = type_check_expr(r, env)?;
string_binop_result(&lt, &rt)
}
Expr::Eq(l, r) | Expr::Ne(l, r) => {
let lt = type_check_expr(l, env)?;
let rt = type_check_expr(r, env)?;
@@ -485,6 +499,34 @@ fn type_check_expr(
Err(TypeError::new("and/or require Bool operands"))
}
}
Expr::Len(arg) => {
let ty = type_check_expr(arg, env)?;
match ty {
Type::Str | Type::Array(_) => Ok(Type::Int),
_ => Err(TypeError::new("len requires a string or array operand")),
}
}
Expr::Contains(container, item) => {
let container_ty = type_check_expr(container, env)?;
let item_ty = type_check_expr(item, env)?;
match container_ty {
Type::Str => {
if item_ty == Type::Str {
Ok(Type::Bool)
} else {
Err(TypeError::new("contains: string container requires string item"))
}
}
Type::Array(elem_ty) => {
if types_compatible(&item_ty, &elem_ty) {
Ok(Type::Bool)
} else {
Err(TypeError::new("contains: array item type mismatch"))
}
}
_ => Err(TypeError::new("contains requires a string or array container")),
}
}
Expr::Call { name, args } => {
let args_checked: Result<Vec<_>, _> =
args.iter().map(|a| type_check_expr_to_typed(a, env)).collect();
@@ -565,6 +607,13 @@ fn numeric_binop_result(l: &Type, r: &Type) -> Result<Type, TypeError> {
}
}

fn string_binop_result(l: &Type, r: &Type) -> Result<Type, TypeError> {
match (l, r) {
(Type::Str, Type::Str) => Ok(Type::Str),
_ => Err(TypeError::new("string concatenation requires Str operands")),
}
}

fn is_numeric(ty: &Type) -> bool {
matches!(ty, Type::Int | Type::Float)
}
34 changes: 32 additions & 2 deletions src/stdlib/mod.rs
@@ -12,8 +12,10 @@
//! return type) with the Rust function that implements the behaviour.
//!
//! The default registry (via `NativeRegistry::default()`) registers:
//! `print`, `readInt`, `readFloat`, `readString` (IO), and `pow`, `sqrt`
//! (math). Implementations live in the [`io`] and [`math`] sub-modules.
//! `print`, `readInt`, `readFloat`, `readString` (IO), `pow`, `sqrt`
//! (math), and `substr`, `toUpper`, `toLower`, `strToInt`, `strToFloat`
//! (string). Implementations live in the [`io`], [`math`], and [`string`]
//! sub-modules.
//!
//! # Design Decisions
//!
@@ -55,6 +57,7 @@ use crate::interpreter::value::NativeFn;

pub mod io;
pub mod math;
pub mod string;

/// A registry entry: MiniC type signature + Rust implementation.
pub struct NativeEntry {
@@ -129,6 +132,33 @@ impl Default for NativeRegistry {
func: math::sqrt_fn,
});

// String
r.register("substr", NativeEntry {
params: vec![Type::Str, Type::Int, Type::Int],
return_type: Type::Str,
func: string::substr,
});
r.register("toUpper", NativeEntry {
params: vec![Type::Str],
return_type: Type::Str,
func: string::to_upper,
});
r.register("toLower", NativeEntry {
params: vec![Type::Str],
return_type: Type::Str,
func: string::to_lower,
});
r.register("strToInt", NativeEntry {
params: vec![Type::Str],
return_type: Type::Int,
func: string::str_to_int,
});
r.register("strToFloat", NativeEntry {
params: vec![Type::Str],
return_type: Type::Float,
func: string::str_to_float,
});

r
}
}