Product engineer building developer tools, TypeScript SDKs, APIs, and workflow infrastructure for complex product systems. Currently at Comet ML working on OPIK: open-source tooling for debugging, evaluating, and monitoring LLM applications.
I work across product surfaces, typed SDK behavior, API contracts, integrations, evaluation workflows, datasets, guardrails, and developer experience. The recurring challenge is making complex workflow behavior understandable, composable, and maintainable without forcing developers into one rigid application shape.
Barcelona · yaroslavboiko.com · LinkedIn · Email
| Project | What it proves | Stars |
|---|---|---|
| OPIK | Core contributor to Comet ML's open-source LLM evaluation and observability platform. Work spans product workflows, SDK surfaces, API boundaries, integrations, datasets, guardrails, and developer tooling. | |
| OPIK MCP | Public contribution work on Comet ML's MCP server for bringing OPIK workflows into assistant-enabled development environments. | |
| notion-mcp-server | MCP server for Notion pages, databases, and workspace workflows through typed tool interfaces. | |
| replicate-flux-mcp | Focused MCP server for Replicate Flux image-generation workflows with inspectable inputs and outputs. | |
| yaroslavboiko.com | Personal site and technical writing. Astro, Cloudflare Workers, typed content, and a Three.js scene that earns its bytes. | |
- Product engineering for expert workflows. I turn ambiguous workflows into reliable product surfaces, APIs, SDKs, and reusable abstractions.
- TypeScript SDKs and API design. Typed boundaries, compatibility tradeoffs, integration behavior, and debugging feedback loops.
- React product interfaces. Complex stateful workflows, evaluations, datasets, guardrails, playgrounds, and developer-facing UX.
- Open-source maintenance. Reviewable changes, stable interfaces, practical examples, and visible contribution history.
- MCP-native integrations. Not as a headline identity, but as a useful surface for connecting developer environments to real tools.
- Stop Using Claude Code on Defaults - five settings I changed in `~/.claude/settings.json` to save tokens and stop approving `ls` for the 400th time.
- Agentic UX Primitives - streaming, HITL gates, reasoning traces, confidence indicators: the frontend patterns behind products like Cursor and Claude.
- Context Engineering Ate Prompt Engineering - what's replacing prompt engineering, and how it separates AI-augmented developers from AI-dependent ones.
- The Vibe Coding Reckoning - AI coding tools changed the pace of software work, but production engineering still comes down to understanding, review, and ownership.
Full archive: yaroslavboiko.com/blog
- Specs before cleverness. I prefer clear tradeoffs, small interfaces, and written constraints over impressive-looking demos.
- SDKs over one-off adapters. If behavior crosses product boundaries, it needs typed contracts and examples people can trust.
- Evals as a habit. Agentic features need traces, offline evals, and reproducible feedback loops. "Looks good in chat" is not a gate.
- Stream everything that should feel alive. SSE, structured outputs, and human-in-the-loop gates where mistakes cost more than a click.
- Context engineering over prompt theater. Tight working sets, retrieval that earns its tokens, and prompts treated as code.
- TypeScript end to end. React and Node for product surfaces, Python when eval or ML tooling makes it the right tool.





