Tune extraction/scoring rules and add CLI examples to README#1

Open
devin-ai-integration[bot] wants to merge 3 commits into main from devin/1779967031-tune-extraction-rules
@devin-ai-integration

Summary

Improves summary extraction quality by expanding pattern matching rules and fixing several edge cases where filler text was preserved or the wrong clause was selected. Also updates the README with comprehensive CLI usage examples.

Extraction/scoring improvements:

  • Leading filler patterns: Added 11 new patterns to strip common conversational filler ("I just wanted to reach out", "I'm experiencing", "we are facing issues with", "just wanted to let you know", "we noticed that", etc.)
  • Root cause patterns: Added 12 new patterns for "not working/rendering/loading/showing/displaying/sending", "times/timing out", "crashes", "broken", "incorrect", "returns empty/404/500", "failing"
  • Issue keywords: Added 30+ new terms (crash, broken, webhook, api, email, sso, database, connection, 502/403/500, subscription, etc.)
  • Context-only patterns: Added 5 patterns to de-score greetings and sentiment ("hope you're doing well", "we've been using", "I love", etc.)
  • Clause splitting on "but": Clauses are now split on "but" in addition to "and", so "I love the product but the export is broken" correctly isolates "the export is broken"
  • Causal-tail candidates: When a segment contains "because/since/etc.", the text after the causal word is now scored as a separate candidate, preventing filler heads from winning
  • Inline score regex expansion: Now also boosts crash, broken, invalid, incorrect, and "timing/timed out" phrases
  • Compact summary improvements: Better stripping of linking verbs after user prefixes ("users are", "our team is/are/was/were/has/have")
  • Biome formatting fixes: Applied auto-formatting to pre-existing style issues
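The clause-splitting and causal-tail rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `splitClauses`, `candidates`, `score`, and `bestCandidate`, the short keyword and pattern lists, and the weights are all assumptions made for the example.

```typescript
// Hypothetical sketch: names, patterns, and weights are illustrative only.
const CAUSAL = /\b(?:because|since)\b/i;
const ISSUE_KEYWORDS = ["crash", "broken", "webhook", "timing out", "unable", "502"];
const CONTEXT_ONLY: RegExp[] = [/^i love\b/i, /^hope you'?re doing well\b/i, /^we'?ve been using\b/i];

// Split a sentence into clauses on "and" / "but" and drop trailing punctuation.
function splitClauses(sentence: string): string[] {
  return sentence
    .split(/\s*,?\s+(?:and|but)\s+/i)
    .map((clause) => clause.trim().replace(/[.!?]+$/, ""))
    .filter((clause) => clause.length > 0);
}

// Each clause is a candidate; text after a causal word is an extra candidate.
function candidates(sentence: string): string[] {
  const out: string[] = [];
  for (const clause of splitClauses(sentence)) {
    out.push(clause);
    const match = CAUSAL.exec(clause);
    if (match) {
      const tail = clause.slice(match.index + match[0].length).trim();
      if (tail.length > 0) out.push(tail);
    }
  }
  return out;
}

// Issue keywords raise the score; greeting or sentiment openers lower it.
function score(candidate: string): number {
  const lower = candidate.toLowerCase();
  let total = 0;
  for (const keyword of ISSUE_KEYWORDS) {
    if (lower.includes(keyword)) total += 2;
  }
  if (CONTEXT_ONLY.some((re) => re.test(candidate))) total -= 3;
  return total;
}

// Highest score wins; on a tie the shorter candidate wins, which favors
// the causal tail over the full clause that still carries the filler head.
function bestCandidate(sentence: string): string {
  const all = candidates(sentence);
  if (all.length === 0) return sentence.trim();
  return all.reduce((best, current) => {
    const bestScore = score(best);
    const currentScore = score(current);
    if (currentScore > bestScore) return current;
    if (currentScore === bestScore && current.length < best.length) return current;
    return best;
  });
}

console.log(bestCandidate("I love the product but the export feature is broken."));
```

In this sketch the tie-break on length is what keeps a filler head such as "I just wanted to reach out because ..." from winning when the head and the tail contain the same keywords; the real scoring rules may resolve this differently.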

Before → After examples:

| Input | Before | After |
| --- | --- | --- |
| "Hey there! ... I just wanted to reach out because our team is unable to upload files..." | "I just wanted to reach out" | "Unable to upload files larger than 10MB" |
| "I'm experiencing a critical bug where the app crashes..." | "Experiencing a critical bug where..." | "The app crashes whenever I open the settings page on mobile" |
| "I wanted to report that our webhook endpoints are returning 502 errors..." | "I wanted to report that our webhook endpoints..." | "Our webhook endpoints are returning 502 errors" |
| "We are facing issues with the database connection pool. Queries are timing out..." | "We are facing issues with the database connection pool" | "Queries are timing out during peak hours" |
| "I love the product but the export feature is broken." | "I love the product but the export feature is broken" | "The export feature is broken" |

README updates:

  • Added flag reference table for --summary, --customer, --platform
  • Added 7 example CLI invocations covering all 3 platform presets, noisy inputs, and fallback behavior

Review & Testing Checklist for Human

  • Run bun run preview with your own real support messages to verify extraction quality matches expectations
  • Verify the new leading filler patterns don't over-strip legitimate content (e.g. messages where "I wanted to report" IS the substantive content)
  • Check that the "but" clause splitting doesn't lose important context in edge cases (e.g. "tried X but Y happened" — verify Y is correctly selected over X)
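One way to probe the over-stripping concern in the checklist is a guard that refuses to strip when too little text would remain. The sketch below is an assumption for review purposes only: `stripLeadingFiller`, the four patterns, and the `minWords` threshold are hypothetical and do not come from this PR.

```typescript
// Hypothetical patterns; the PR's real list is longer and may differ.
const LEADING_FILLER: RegExp[] = [
  /^i just wanted to reach out(?: because)?\s+/i,
  /^i wanted to report that\s+/i,
  /^we noticed that\s+/i,
  /^just wanted to let you know(?: that)?\s+/i,
];

// Strip the first matching filler prefix, unless fewer than `minWords`
// words would be left, in which case the original text is kept.
function stripLeadingFiller(text: string, minWords = 3): string {
  const trimmed = text.trim();
  for (const pattern of LEADING_FILLER) {
    const rest = trimmed.replace(pattern, "");
    if (rest !== trimmed) {
      const wordCount = rest.split(/\s+/).filter(Boolean).length;
      if (wordCount < minWords) return trimmed; // guard against over-stripping
      return rest.charAt(0).toUpperCase() + rest.slice(1);
    }
  }
  return trimmed;
}

console.log(stripLeadingFiller("I wanted to report that our webhook endpoints are returning 502 errors"));
console.log(stripLeadingFiller("I wanted to report that bug"));
```

A word-count threshold is a blunt heuristic; it is shown here only to make the failure mode concrete when testing with real support messages.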

Notes

  • All 5 existing tests pass without modification
  • Lint, typecheck, and build all pass
  • The biome formatting changes (trailing commas, import sorting in cli.ts, etc.) are pre-existing issues auto-fixed by bun run lint:fix

Link to Devin session: https://app.devin.ai/sessions/881ab13a1c2a41db95dc38db66d9a681
Requested by: @warengonzaga

@devin-ai-integration
Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

warengonzaga and others added 3 commits May 28, 2026 11:56
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
devin-ai-integration[bot] force-pushed the devin/1779967031-tune-extraction-rules branch from 119cd93 to 3818a4f on May 28, 2026 11:56