
Develop #2

Open
oscarcitoz wants to merge 287 commits into oscarcitoz:master from neuro-publico:develop


@oscarcitoz
Owner

No description provided.

oscarcitoz and others added 30 commits March 10, 2025 13:40
generate images from
add logic json parser structure, copywriter generic.
add endpoint for generate images from api-key
fix scraper when the price string has more elements
StephanSrz and others added 30 commits March 24, 2026 16:52
Pass message_service from ScrapingFactory to AliexpressScraper, and add
fallback to IAScraper when AliExpress RapidAPI returns 429 rate limit errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
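The 429 fallback described in the commit above could look roughly like this. It is a minimal sketch, not the project's code: `RateLimitError`, the `scrape` method name, and the stub bodies are assumptions made for illustration; only the class names `AliexpressScraper` and `IAScraper` come from the commit message.

```python
import asyncio


class RateLimitError(Exception):
    """Hypothetical error raised when the upstream API answers HTTP 429."""


class AliexpressScraper:
    async def scrape(self, url: str) -> dict:
        # Stub: pretend the RapidAPI call was rejected with 429.
        raise RateLimitError("429 Too Many Requests")


class IAScraper:
    async def scrape(self, url: str) -> dict:
        # Stub: stands in for the AI-based scraper.
        return {"source": "ia", "url": url}


async def scrape_with_fallback(url: str) -> dict:
    try:
        return await AliexpressScraper().scrape(url)
    except RateLimitError:
        # Only rate-limit errors trigger the fallback; other errors propagate.
        return await IAScraper().scrape(url)


result = asyncio.run(scrape_with_fallback("https://example.com/item/1"))
```

Catching only the rate-limit error keeps unrelated failures visible instead of silently routing them to the fallback.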
- Add jemalloc to Dockerfile (returns freed memory to OS, unlike glibc)
- Add shared asyncio.Semaphore(15) to limit concurrent image processing
- Add explicit del/gc.collect() to free image buffers after use
- Fix PIL Images never being closed (~20MB leaked per compression)
- Cache output_buffer.getvalue() to avoid duplicate copies
- Add Image.MAX_IMAGE_PIXELS limit against decompression bombs
- Reuse shared Gemini session in google_image() instead of creating new
- Show current RSS (not just maxrss) in RequestTracker for jemalloc verification
- Remove unused `import requests` from image_client.py
- Add PYTHONUNBUFFERED=1 for reliable log flushing

Measured: 59 concurrent requests, 1327MB peak, 28 errors.
Expected: max 15 concurrent, ~400-600MB peak, 0 errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
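The shared-semaphore limit from the commit above can be illustrated with the standard library alone. This is a sketch under assumptions: `process_image` and the `peak`/`active` counters are hypothetical, and the sleep stands in for the real download and compression work. The limit of 15 and the 59 concurrent requests are the figures quoted in the commit message.

```python
import asyncio

MAX_CONCURRENT = 15  # value from the commit message
peak = 0             # highest number of tasks seen inside the semaphore
active = 0           # tasks currently inside the semaphore


async def process_image(sem: asyncio.Semaphore, i: int) -> int:
    global peak, active
    async with sem:
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.001)  # stands in for download + compression
        active -= 1
    return i


async def main() -> list[int]:
    sem = asyncio.Semaphore(MAX_CONCURRENT)  # created inside the running loop
    return await asyncio.gather(*(process_image(sem, i) for i in range(59)))


results = asyncio.run(main())
```

All 59 tasks complete, but no more than 15 are ever inside the guarded section at once.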
With semaphore at 15, 170+ coroutines queued up waiting, which starved
the event loop and caused liveness probe failures (health check couldn't
respond in 5s). Pod was killed with exit code 137 despite RSS being
only 734MB (well within 4GB limit).

With a 4GB pod: 50 concurrent * 30MB peak = 1.5GB + 400MB baseline = 1.9GB.
Safe within 4GB. The semaphore is now a safety net, not a bottleneck.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
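The memory estimate in the commit above, written out as a worked calculation. The per-request and baseline figures are the ones quoted in the message; treating 4GB as 4096MB is an assumption.

```python
concurrent_requests = 50
peak_per_request_mb = 30
baseline_mb = 400
pod_limit_mb = 4096  # assumption: "4GB" taken as 4096 MB

# 50 * 30 MB + 400 MB = 1900 MB, i.e. about 1.9 GB
estimated_peak_mb = concurrent_requests * peak_per_request_mb + baseline_mb
headroom_mb = pod_limit_mb - estimated_peak_mb
```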
The semaphore queued 170+ coroutines internally, which blocked the
event loop and caused health check timeouts (liveness probe failed).
Kubernetes killed the pod (exit code 137) even though RSS was only
734MB (well within 4GB limit).

With 4GB + jemalloc + memory cleanup, the pod can handle concurrent
requests without artificial limiting. Concurrency should be controlled
at the source (ecommerce SQS concurrency) not at the destination.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The httpx.AsyncClient() had no explicit timeout configured, so the httpx default of 5s applied to connects.
Under load (50+ concurrent requests), the agent-config service can't
respond in 5s, causing ConnectTimeout errors. This was responsible
for 10 of 11 errors in the last test.

Set connect=30s, total=60s to handle burst load.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
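A possible shape for the timeout change in the commit above. This is a sketch, not the project's code: `fetch_agent_config` and its `url` parameter are hypothetical. Note that httpx has no single "total" timeout; mapping "total=60s" to the default that covers read, write, and pool is an assumption about what the commit means.

```python
import httpx

# 60 s default for read, write and pool; 30 s to establish the connection.
TIMEOUT = httpx.Timeout(60.0, connect=30.0)


async def fetch_agent_config(url: str) -> dict:
    # Hypothetical helper: only the timeout values come from the commit message.
    async with httpx.AsyncClient(timeout=TIMEOUT) as client:
        response = await client.get(url)
        response.raise_for_status()
        return response.json()
```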
Temporary debug log to verify exactly what prompt (price, angle,
product info) reaches Gemini. Remove after verification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added explicit TEMPLATE vs PRODUCT DISTINCTION section to the system
prompt. Gemini must understand that the style reference image contains
EXAMPLE products that must be REPLACED with the real product photo.

Also reinforced pricing format rule to prevent currency/decimal changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d-price-format

feat: OOM prevention + prompt improvements for section images
COP and other Latin American currencies don't use decimal places for
whole amounts. $ 140.000,00 → $ 140.000. Only strips ,00 and .00
(zero decimals), preserves meaningful decimals like ,90 or .50.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Temporary log to verify which sections the AI recommends.
Logs response when agent_id contains "design".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The old .replace(",00","").replace(".00","") removed .00 from ANYWHERE
in the string, not just the end. $ 140.000,00 → $ 140.000 → $ 1400.

Fixed with regex that only strips ,00 or .00 at the END of the string.
$ 140.000,00 → $ 140.000 (correct)
$ 1.000.000,00 → $ 1.000.000 (correct)
$ 49,90 → $ 49,90 (preserved)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
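The bug and the fix from the commit above can be reproduced in a few lines. The function names and the exact regex are assumptions; the old `.replace()` chain and the sample prices are taken from the commit message.

```python
import re

# Matches ",00" or ".00" only when it is the last thing in the string.
_ZERO_DECIMALS = re.compile(r"[.,]00$")


def strip_zero_decimals_old(price: str) -> str:
    # Buggy version quoted in the commit: removes ",00" / ".00" anywhere.
    return price.replace(",00", "").replace(".00", "")


def strip_zero_decimals(price: str) -> str:
    # Anchored version: thousands separators in the middle are untouched.
    return _ZERO_DECIMALS.sub("", price)
```

With the old version, "$ 140.000,00" first loses ",00" and then loses the ".00" inside "140.000", which is how the price collapses to "$ 1400".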
- SectionImageRequest: new optional brand_colors field (list of hex)
- SYSTEM_PROMPT: instruction to use brand colors for design harmony
- _build_prompt: adds BRAND COLORS block with reference palette
  when colors are provided. Creative freedom preserved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
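One way the conditional BRAND COLORS block from the commit above might be assembled. The function name and the wording of the block are hypothetical; the commit only states that the block is added when colors are provided and that the field is a list of hex values.

```python
from typing import Optional


def build_brand_colors_block(brand_colors: Optional[list[str]]) -> str:
    """Return a prompt block for the palette, or "" when no colors are given."""
    if not brand_colors:
        return ""
    palette = ", ".join(color.upper() for color in brand_colors)
    # Block wording is illustrative, not the project's actual prompt text.
    return f"BRAND COLORS:\nReference palette: {palette}\n"


block = build_brand_colors_block(["#1a2b3c", "#ffcc00"])
```

Returning an empty string for a missing palette keeps the field optional and backward-compatible for callers that never send it.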
SYSTEM_PROMPT:
- Brand colors DEFINE the color identity, not just a reference
- Template's light/dark logic preserved but in brand tones
- Text must be adapted to the product (no copying template text)
- Priority: communicate persuasively from the sales angle
- Well-structured designs with a clean layout and clear visual hierarchy

BRAND COLORS block:
- Colors MUST determine overall tone (not just a suggestion)
- Template colors must be ADAPTED to brand tones
- All sections share consistent visual identity

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Logs agent_id, query, and response for both /handle-message and
/handle-message-json endpoints. Shows what design_structure_selector,
brand_style_selector, select_best_image, and all other agents
receive and return.

Temporary — remove after debugging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New system: every AI prompt and response is permanently stored in
a prompt_logs PostgreSQL table for debugging, auditing, and
quality improvement.

Architecture:
- asyncpg connection pool initialized in FastAPI lifespan
- Fire-and-forget logging via asyncio.create_task (never blocks requests)
- Fails silently if DB not configured (AUDIT_DB_HOST empty = disabled)
- Truncates long responses to 5KB to prevent bloat

What gets logged:
- Section image generation (prompt, s3_url, model, attempt, elapsed)
- Agent calls (query, response, agent_id)
- Errors and fallbacks with error messages

New files:
- app/db/audit_logger.py — pool init + log_prompt function
- app/db/__init__.py

Modified:
- main.py — lifespan handler for pool init/close
- requirements.txt — added asyncpg
- section_image_service.py — log success/fallback/error
- handle_controller.py — log all agent calls (replaced temp debug prints)

Requires: CREATE DATABASE prompt_logs + CREATE TABLE on RDS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
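The fire-and-forget pattern from the architecture notes above, sketched with the standard library only. An in-memory list stands in for the asyncpg pool, and every name except `log_prompt` is hypothetical. The 5KB truncation limit and the "disabled when not configured" behaviour follow the commit message.

```python
import asyncio
from typing import Optional

MAX_RESPONSE_BYTES = 5 * 1024   # "5KB" from the commit message
_sink: Optional[list] = None    # stand-in for the asyncpg pool; None = disabled
_pending: set = set()           # holds references so tasks are not garbage-collected


def truncate(text: str, limit: int = MAX_RESPONSE_BYTES) -> str:
    data = text.encode("utf-8")
    return text if len(data) <= limit else data[:limit].decode("utf-8", "ignore")


async def _write(row: dict) -> None:
    try:
        _sink.append(row)  # real code would run an INSERT through the pool
    except Exception:
        pass  # logging must never break the request


def log_prompt(log_type: str, prompt: str, response: str) -> None:
    if _sink is None:  # not configured: silently do nothing
        return
    row = {"log_type": log_type, "prompt": prompt, "response": truncate(response)}
    task = asyncio.create_task(_write(row))  # caller does not await the write
    _pending.add(task)
    task.add_done_callback(_pending.discard)


async def demo() -> list:
    global _sink
    log_prompt("agent_call", "q", "ignored")     # disabled: nothing scheduled
    _sink = []
    log_prompt("agent_call", "q", "x" * 10_000)  # enabled: scheduled in background
    await asyncio.gather(*list(_pending))        # only the demo waits for the write
    return _sink


rows = asyncio.run(demo())
```

Keeping a reference to each task in `_pending` matters because the event loop only holds weak references to tasks created with `asyncio.create_task`.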
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pipeline

New endpoint POST /edit-section-image (@require_auth, Bearer token):
- Routes through the same pipeline as creation but with EDIT_SYSTEM_PROMPT
- Uses Gemini's recommended edit patterns: "keep everything else exactly
  the same", "preserve identity", "targeted modification"

Changes:
- SectionImageRequest: new fields edit_mode, current_section_url,
  reference_image_url (all optional, backward-compatible)
- EDIT_SYSTEM_PROMPT: focused on modifying existing sections, not
  regenerating from scratch
- _build_prompt: branches on edit_mode for correct system prompt
- _collect_image_urls: edit mode sends current→reference→product,
  creation mode sends template→product (order matters for Gemini)

Edit images get same quality as creation: 2K resolution, High thinking,
CTA detection, brand colors, pricing, sales angle context.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
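The image ordering described for `_collect_image_urls` above can be sketched as follows. `product_image_url` and `template_image_url` are hypothetical field names; `edit_mode`, `current_section_url`, and `reference_image_url` are the fields named in the commit. Skipping absent optional images is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SectionImageRequest:
    product_image_url: str                    # hypothetical field name
    template_image_url: Optional[str] = None  # hypothetical field name
    edit_mode: bool = False
    current_section_url: Optional[str] = None
    reference_image_url: Optional[str] = None


def collect_image_urls(req: SectionImageRequest) -> list[str]:
    if req.edit_mode:
        # Edit mode: current section, then reference, then product.
        ordered = [req.current_section_url, req.reference_image_url, req.product_image_url]
    else:
        # Creation mode: template, then product.
        ordered = [req.template_image_url, req.product_image_url]
    return [url for url in ordered if url]  # skip optional images that are absent


create = collect_image_urls(SectionImageRequest("p.png", template_image_url="t.png"))
edit = collect_image_urls(
    SectionImageRequest("p.png", edit_mode=True, current_section_url="c.png",
                        reference_image_url="r.png")
)
```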
- Remove API key debug prints from auth_middleware (security)
- Remove [PROMPT-DEBUG] full prompt logging from section_image_service
- Fix audit_logger default DB name mismatch (prompt_logs → analytics)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: audit logging + edit mode + prompt improvements
- Run black formatter on handle_controller.py, aliexpress_scraper.py, section_image_service.py
- Add noqa: F821 to legitimate try/del/except NameError patterns in image_service.py and section_image_service.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fix: resolve lint and format CI failures
Adds POST /generate-section-image/async/api-key that returns 202 immediately
and generates the image in the background. When done, POSTs the result
(success with s3_url or error) to the provided callback_url.

This eliminates ReadTimeoutException when ecommerce-service dispatches
10-15 concurrent custom image requests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
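The accept-then-callback flow from the commit above, reduced to a framework-free sketch. `accept`, `generate_image`, and the injected `post` callable are hypothetical stand-ins for the FastAPI route, the model call, and the HTTP client; `generate_and_callback` is the name used later in this PR, and the S3 URL is a placeholder.

```python
import asyncio
from typing import Awaitable, Callable

Callback = Callable[[str, dict], Awaitable[None]]
_background: set = set()  # holds references to running background tasks


async def generate_image(payload: dict) -> str:
    await asyncio.sleep(0.01)         # stands in for the slow model call
    return "s3://bucket/section.png"  # placeholder URL


async def generate_and_callback(payload: dict, callback_url: str, post: Callback) -> None:
    try:
        body = {"status": "success", "s3_url": await generate_image(payload)}
    except Exception as exc:
        body = {"status": "error", "error": str(exc)}
    await post(callback_url, body)    # result is delivered to the caller's webhook


def accept(payload: dict, callback_url: str, post: Callback) -> int:
    task = asyncio.create_task(generate_and_callback(payload, callback_url, post))
    _background.add(task)
    task.add_done_callback(_background.discard)
    return 202  # returned before the image exists


async def demo():
    received = []

    async def fake_post(url: str, body: dict) -> None:
        received.append((url, body))

    status = accept({"section": "hero"}, "https://example.com/hook", fake_post)
    delivered_at_return = len(received)  # nothing has been posted yet
    await asyncio.gather(*list(_background))
    return status, delivered_at_return, received


status, delivered_at_return, received = asyncio.run(demo())
```

Because the response no longer waits for generation, the caller's read timeout stops depending on how long the image takes.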
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
4 tests covering: successful POST, retry on failure,
success on second attempt, and API key from config fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-webhook

feat: async webhook for custom image generation
Uses the existing MAX_CONCURRENT_IMAGE_REQUESTS semaphore (default 50)
to limit concurrent Gemini/OpenAI calls. Without this, all async tasks
hit the APIs simultaneously causing memory spikes and service hangs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-webhook

feat: async webhook for custom image generation
- post_callback now raises RuntimeError on failure instead of silent fail
- generate_and_callback logs callback success/error to prompt_logs table
- Allows debugging callback issues via analytics DB query:
  SELECT * FROM prompt_logs WHERE log_type = 'callback_result'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
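A sketch of a `post_callback` that retries and then raises, as the commit above describes. The injected `send` callable, the default of two attempts, and the error message format are assumptions; the commit only states that failure now raises `RuntimeError` instead of failing silently.

```python
import asyncio
from typing import Awaitable, Callable

Sender = Callable[[str, dict], Awaitable[int]]  # returns an HTTP status code


async def post_callback(url: str, body: dict, send: Sender, attempts: int = 2) -> None:
    last_error = "no attempts made"
    for _ in range(attempts):
        try:
            status = await send(url, body)
            if 200 <= status < 300:
                return
            last_error = f"HTTP {status}"
        except Exception as exc:
            last_error = repr(exc)
    # Surface the failure so the caller can log it instead of losing it.
    raise RuntimeError(f"callback to {url} failed after {attempts} attempts: {last_error}")


async def demo() -> tuple:
    calls = []

    async def flaky(url: str, body: dict) -> int:
        calls.append(url)
        return 500 if len(calls) == 1 else 200  # fails once, then succeeds

    async def down(url: str, body: dict) -> int:
        return 503                              # never succeeds

    await post_callback("https://example.com/hook", {"ok": True}, flaky)
    try:
        await post_callback("https://example.com/hook", {"ok": True}, down)
        raised = False
    except RuntimeError:
        raised = True
    return len(calls), raised


attempts_used, raised = asyncio.run(demo())
```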
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-webhook

fix: log callback results to analytics DB
6 participants