Volume 7: Event Pipeline + First-Come-First-Served Coupon Issuance - Sukhee Choi #299
SukheeChoi wants to merge 14 commits into Loopers-dev-lab:SukheeChoi from
Conversation
- Created 6 event records (Like/Order/ProductViewed)
- EventOutbox entity + Transactional Outbox pattern
- LikeFacade: removed incrementLikeCount → publishes event + saves to Outbox
- LikeCountEventListener: AFTER_COMMIT + async executor + TransactionTemplate
- CacheEvictionEventListener: receives events → invalidates cache
- AsyncConfig: eventExecutor (core=2, max=4, CallerRunsPolicy)
- LikeController: removed inline cache invalidation
- kafka.yml: acks=all, idempotence, lz4 compression, isolation.level=read_committed
- KafkaConfig: SINGLE_LISTENER ContainerFactory + DeadLetterPublishingRecoverer
- KafkaTopicConfig: catalog-events, order-events, coupon-issue-requests topics
- infra-compose.yml: enabled MySQL binlog + kafka-connect (Debezium) service
- register-debezium-connector.sh: script to register the Debezium MySQL connector
- RedisConfig: commandTimeout set to 500ms
- commerce-api: added kafka module dependency
- ProductMetrics entity: aggregation table for like, view, and sales counts
- EventHandled entity: processing history for idempotent consumption
- MetricsConsumer: BATCH-consumes catalog-events → UPSERTs product_metrics
- ProductViewKafkaPublisher: `@Async` AFTER_COMMIT → publishes to catalog-events
- ProductFacade: publishes ProductViewedEvent on product view
- OrderFacade: publishes events + saves to Outbox on order creation/cancellation
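The batch-consume-then-UPSERT idea in MetricsConsumer can be sketched without any Kafka or JPA machinery. The sketch below is an assumption, not the PR's actual code: it folds a polled batch into one delta per product, so a batch of N events costs at most one UPSERT per distinct productId rather than N statements. The event-type names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical event shape; the real payload comes from catalog-events.
record CatalogEvent(long productId, String type) {}

class MetricsBatchFolder {
    /** Folds a batch into per-product deltas: index 0 = likeDelta, index 1 = viewDelta. */
    static Map<Long, long[]> fold(List<CatalogEvent> batch) {
        Map<Long, long[]> deltas = new LinkedHashMap<>();
        for (CatalogEvent e : batch) {
            long[] d = deltas.computeIfAbsent(e.productId(), k -> new long[2]);
            switch (e.type()) {
                case "LIKE_CREATED" -> d[0]++;
                case "LIKE_REMOVED" -> d[0]--;
                case "PRODUCT_VIEWED" -> d[1]++;
                default -> { /* unknown event types are skipped */ }
            }
        }
        // Each entry would then back one statement along the lines of:
        //   INSERT INTO product_metrics (product_id, like_count, view_count)
        //   VALUES (?, ?, ?)
        //   ON DUPLICATE KEY UPDATE like_count = like_count + VALUES(like_count), ...
        return deltas;
    }
}
```

Folding before writing is what makes BATCH consumption cheaper than per-record handling when many events hit the same product.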
- CouponIssueRequest entity: tracks issuance-request state (PENDING → COMPLETED/REJECTED)
- CouponFacade: saves the request → publishes to Kafka (coupon-issue-requests)
- CouponController: POST /issue-request, GET /issue-requests/{id} endpoints
- Coupon: added maxIssuanceCount/issuedCount fields
- CouponIssue: added (coupon_id, member_id) UNIQUE constraint
- CouponIssueConsumer: concurrency-safe issuance based on a CAS UPDATE
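The CAS UPDATE that makes issuance concurrency-safe is, in the database, a conditional statement like `UPDATE coupon SET issued_count = issued_count + 1 WHERE id = ? AND issued_count < max_issuance_count`, where 0 affected rows means "sold out". Below is an in-memory analog (an assumption for illustration, not the PR's code) showing the same compare-and-swap semantics:

```java
import java.util.concurrent.atomic.AtomicLong;

// In-memory analog of the CAS UPDATE: tryIssue() returning false plays
// the role of "0 rows affected" → the request is marked REJECTED.
class CouponCounter {
    private final long maxIssuanceCount;
    private final AtomicLong issuedCount = new AtomicLong();

    CouponCounter(long maxIssuanceCount) {
        this.maxIssuanceCount = maxIssuanceCount;
    }

    /** Atomically claims one issuance slot; false when the coupon is sold out. */
    boolean tryIssue() {
        while (true) {
            long current = issuedCount.get();
            if (current >= maxIssuanceCount) {
                return false;                      // sold out: no slot left
            }
            if (issuedCount.compareAndSet(current, current + 1)) {
                return true;                       // claimed a slot
            }
            // CAS lost a race with another thread; retry with the fresh value
        }
    }

    long issued() {
        return issuedCount.get();
    }
}
```

The point of the pattern is that the check ("below max") and the increment are one atomic step, so no lock is held across the check and two concurrent requests can never both claim the last slot.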
- MetricsReconcileJobConfig: reconciles like_count + sales consistency (replaces LikeCountSync)
- OutboxCleanupJobConfig: purges event_outbox records older than 7 days
- EventHandledCleanupJobConfig: purges event_handled records older than 7 days
- Deleted LikeCountSyncJobConfig/Tasklet (merged into MetricsReconcile)
Removed unused code:
- ProductLikeStats/Repository: superseded by the event-based flow
- DemoKafkaConsumer: replaced by MetricsConsumer/CouponIssueConsumer
New tests:
- LikeCountEventListenerTest, CacheEvictionEventListenerTest
- ProductViewKafkaPublisherTest
- CouponIssueConcurrencyTest: verifies CAS UPDATE concurrency
Updated existing tests:
- Facade tests: reflect the event-publishing dependency
- LikeConcurrencyTest: added wait logic for the async listener
- SlidingWindowRateLimiterTest: widened timing margins (prevents flakiness)
- PaymentE2ETest: pre-creates test data
- CommerceBatchApplicationTest: job.enabled=false
- Batch E2E tests: added schema-batch-test.sql
- 08-event-pipeline.md: Outbox + Debezium CDC + Kafka event pipeline design
- 09-event-review.md: review notes on the event pipeline implementation
- 05-payment-resilience.md: updated design for the payment recovery batch
- blog/week7-event-pipeline-notes.md: weekly learning notes
- kafka.yml: explicitly set partition.assignment.strategy=CooperativeStickyAssignor
- KafkaConfig: set group.instance.id (Static Membership; avoids unnecessary rebalancing on deploys)
- SINGLE_MAX_POLL_INTERVAL_MS: 10 min → 3 min (5× margin over the worst case, detects stuck consumers 3× faster)
- Blog material: rebalancing-strategy analysis and the arithmetic behind these numbers
- kafka.yml: removed the global auto.offset.reset=latest (conflicted with the consumers' earliest)
- CouponIssueConsumer: removed the try-catch → exceptions now propagate to the DefaultErrorHandler
  - Before: exceptions were swallowed and the message unconditionally acked → messages could never reach the DLQ (at-most-once)
  - After: exceptions propagate → FixedBackOff retries (1s × 3) → message sent to the DLT topic (at-least-once)
- Blog material: offset-strategy analysis and the DLQ bug discovery/fix write-up
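The at-most-once → at-least-once fix above can be illustrated with a framework-free analog. This is an assumption for illustration, not Spring Kafka's actual classes: it mimics what DefaultErrorHandler with FixedBackOff(1000L, 3) does — retry the record, and only forward it to the dead-letter topic after the retries are exhausted, never silently acking a failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Simplified analog of DefaultErrorHandler + FixedBackOff + DLT publishing.
class RetryingHandler<T> {
    final List<T> deadLetters = new ArrayList<>();   // stands in for the <topic>.DLT
    private final int maxAttempts;
    private final Consumer<T> delegate;              // the business logic; may throw

    RetryingHandler(int maxAttempts, Consumer<T> delegate) {
        this.maxAttempts = maxAttempts;
        this.delegate = delegate;
    }

    void handle(T record) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                delegate.accept(record);
                return;                              // success → offset committed
            } catch (RuntimeException e) {
                // the real handler waits out a FixedBackOff delay here
            }
        }
        deadLetters.add(record);                     // retries exhausted → dead-letter
    }
}
```

The crucial contrast with the old code: a swallowed exception plus an unconditional ack loses the record forever, whereas propagating it lets the error handler own the retry/DLT decision.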
Abstracted the pattern where the Facade directly composed EventOutboxRepository + ObjectMapper + ApplicationEventPublisher behind a DomainEventPublisher interface.
- Added the DomainEventPublisher domain interface
- DomainEventPublisherImpl implementation (encapsulates Outbox + Spring Event)
- LikeFacade, OrderFacade dependencies reduced from 3 to 1
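A minimal sketch of that abstraction, with an assumed method signature (the PR's exact interface may differ): one `publish` call performs both legs — append to the outbox store (which Debezium later streams to Kafka) and fire the in-process event — so a facade depends on a single port instead of three collaborators.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Assumed shape of the domain port; names are illustrative.
interface DomainEventPublisher {
    void publish(String aggregateType, String aggregateId, String eventType, Object payload);
}

class InMemoryDomainEventPublisher implements DomainEventPublisher {
    record OutboxRow(String aggregateType, String aggregateId, String eventType, Object payload) {}

    final List<OutboxRow> outbox = new ArrayList<>();   // stands in for EventOutboxRepository
    private final Consumer<Object> inProcessEvents;     // stands in for ApplicationEventPublisher

    InMemoryDomainEventPublisher(Consumer<Object> inProcessEvents) {
        this.inProcessEvents = inProcessEvents;
    }

    @Override
    public void publish(String aggregateType, String aggregateId, String eventType, Object payload) {
        // Durable leg: in the real impl this row is written in the same DB
        // transaction as the business change, and Debezium relays it to Kafka.
        outbox.add(new OutboxRow(aggregateType, aggregateId, eventType, payload));
        // In-memory leg: listeners (cache eviction, async counters) react here.
        inProcessEvents.accept(payload);
    }
}
```

Collapsing both legs behind one interface is what lets the facade's dependency count drop from 3 to 1, and makes the publisher trivially fakeable in tests.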
Encapsulated the stock-restoration logic on payment failure (Redis INCR + DB increaseStock) in the Product domain Facade, removing direct cross-domain access.
📝 Walkthrough
This PR introduces an event architecture (outbox), domain-event publishing with asynchronous handling, first-come-first-served coupon issuance (request → Kafka → consumer processing), asynchronous like and product-view events, Kafka/Debezium integration, and the related batch, test, and configuration changes.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client
    participant Facade as CouponFacade
    participant DB as Database
    participant DomainPub as DomainEventPublisher
    participant Outbox as EventOutbox(DB)
    participant Kafka as KafkaBroker
    participant Consumer as CouponIssueConsumer
    participant EventHandled as EventHandledTable

    Client->>Facade: requestCouponIssue(couponId, memberId)
    activate Facade
    Facade->>DB: validate coupon exists & not expired
    Facade->>DB: save CouponIssueRequest (PENDING)
    Facade->>DomainPub: publish(aggregateType="coupon", payload)
    activate DomainPub
    DomainPub->>Outbox: save EventOutbox(JSON payload)
    DomainPub->>Facade: publish ApplicationEvent (in-memory)
    DomainPub->>Kafka: (Debezium via Outbox -> Kafka) -- out-of-band
    deactivate DomainPub
    Facade-->>Client: return request id
    deactivate Facade
    note over Kafka,Consumer: Debezium Connector streams Outbox -> topic coupon-issue-requests
    Kafka->>Consumer: deliver message (requestId, couponId, memberId)
    activate Consumer
    Consumer->>EventHandled: INSERT IGNORE event_id
    alt already handled
        Consumer-->>Kafka: ack and skip
    else new
        Consumer->>DB: atomic UPDATE coupon SET issued_count = issued_count + 1 WHERE issued_count < max OR max IS NULL
        alt update affected==0
            Consumer->>DB: update coupon_issue_request status = REJECTED(reason)
        else success
            Consumer->>DB: INSERT INTO coupon_issue (may fail by unique constraint)
            alt unique constraint violation
                Consumer->>DB: decrement issued_count (compensate)
                Consumer->>DB: update coupon_issue_request status = REJECTED(reason)
            else success
                Consumer->>DB: update coupon_issue_request status = COMPLETED, set completed_at
            end
        end
        Consumer-->>Kafka: acknowledge
    end
    deactivate Consumer
```
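The `INSERT IGNORE event_id` step in the diagram is the consumer's idempotency guard under at-least-once delivery. A framework-free analog (an assumption, not the PR's code): the real consumer does `INSERT IGNORE` into `event_handled` and skips when 0 rows were inserted; here `Set.add` returning `false` plays the same role for a redelivered message.

```java
import java.util.HashSet;
import java.util.Set;

// Analog of the event_handled table: add() returning false corresponds to
// INSERT IGNORE inserting 0 rows (duplicate delivery), so the business
// logic runs exactly once even though delivery is at-least-once.
class IdempotencyGuard {
    private final Set<String> handled = new HashSet<>();
    int processedCount = 0;

    void onMessage(String eventId, Runnable businessLogic) {
        if (!handled.add(eventId)) {
            return;                 // already handled → ack and skip
        }
        businessLogic.run();
        processedCount++;
    }
}
```

In the real pipeline the guard must be a durable table (not memory) and the insert must commit with the side effects, otherwise a crash between the two re-opens the duplicate window.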
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Added a payment_status_history table as a lightweight application of Event Sourcing. Solves the problem that state-change paths could not be traced in the 5-layer recovery structure.
- Added PaymentStatusHistory entity + repository
- Added a `@Transient` pendingTransitions transition list to PaymentModel
- PaymentRepositoryImpl.save() now records History automatically
- Added unit tests + a Fake repository
Applied state-transition history recording to all recovery paths: CALLBACK, POLLING, WAL_RECOVERY, and BATCH_RECOVERY.
- PaymentRecoveryService: delegates to the Facade + records CALLBACK/POLLING History
- WalRecoveryScheduler: records WAL_RECOVERY History
- PaymentRecoveryTasklet: refactored to a SELECT → History INSERT → Status UPDATE pattern
- Updated 7 existing tests for compatibility
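The pendingTransitions idea above can be sketched as follows — names and shapes are assumed for illustration, not the PR's actual entities. The model queues transitions in a transient list, and the repository's save() drains that list into history rows, so every recovery path records history the same way without each caller writing it by hand:

```java
import java.util.ArrayList;
import java.util.List;

class PaymentModel {
    record Transition(String from, String to, String via) {}

    private String status = "PENDING";
    // @Transient in the real JPA entity: not persisted as a column itself
    final List<Transition> pendingTransitions = new ArrayList<>();

    void transitionTo(String next, String via) {
        pendingTransitions.add(new Transition(status, next, via)); // record the path
        status = next;
    }

    String status() {
        return status;
    }
}

class PaymentRepository {
    // stands in for the payment_status_history table
    final List<PaymentModel.Transition> historyTable = new ArrayList<>();

    void save(PaymentModel model) {
        historyTable.addAll(model.pendingTransitions); // auto-record on save
        model.pendingTransitions.clear();
    }
}
```

Centralizing the history write in save() is what makes CALLBACK, POLLING, WAL_RECOVERY, and BATCH_RECOVERY all leave the same audit trail.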
- week7-decomposition-analysis.md: review of 4 Decomposition patterns + improvement notes
- week7-data-management-patterns.md: review of 8 Data Management patterns + how PaymentStatusHistory was derived
Actionable comments posted: 3
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/commerce-api/src/test/java/com/loopers/interfaces/api/payment/PaymentE2ETest.java (1)
124-132: ⚠️ Potential issue | 🟡 Minor — The request amount is hardcoded in two places, weakening regression detection for amount validation
From an operations standpoint, specifying the order-creation amount on Line 125 and the request amount on Line 131 separately means the test can produce false positives or misses when the payment-amount validation logic regresses. It is safer to manage the amount as a single variable injected into both the order creation and the request body. As an additional test, add a case that explicitly verifies BAD_REQUEST (or a domain error) when the order amount and the request amount differ.

Proposed change

```diff
-Order order = createTestOrder(5000);
+int amount = 5000;
+Order order = createTestOrder(amount);
 Map<String, Object> paymentRequest = Map.of(
         "orderId", order.getId(),
         "cardType", "SAMSUNG",
         "cardNo", "1234-5678-9012-3456",
-        "amount", 5000
+        "amount", amount
 );
```

As per coding guidelines: `**/*Test*.java`: check that unit tests cover boundary values, failure cases, and exception flows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/interfaces/api/payment/PaymentE2ETest.java` around lines 124 - 132, In PaymentE2ETest, avoid hardcoding the amount twice: introduce a single variable (e.g., int amount = 5000) used for createTestOrder(amount) and for the paymentRequest Map so both sides use the exact same value; update the Map creation in the test to reference that variable (instead of duplicate literal) and add a new test case that calls createTestOrder(amount) but sends a different request amount to assert the system returns BAD_REQUEST (or the domain error) to cover the mismatch/failure path.apps/commerce-api/src/test/java/com/loopers/application/product/ProductFacadeTest.java (1)
36-43: ⚠️ Potential issue | 🟠 Major — Test cases for the restoreStock() method are missing
ProductFacade.restoreStock() is called on payment failure and order cancellation to restore stock, but ProductFacadeTest currently has no tests for it. Broken stock consistency can lead to order-processing errors, so the following test cases should be added.
Required tests:
- Normal restoration: verify both Redis and the DB are updated correctly
- Restoration attempt for a nonexistent product: verify it is handled safely via ifPresent()
- Boundary case: verify how restoring a negative quantity is handled
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/product/ProductFacadeTest.java` around lines 36 - 43, Add unit tests for ProductFacade.restoreStock in ProductFacadeTest: create three test methods that use the existing setUp (productRepository, brandRepository, likeRepository, FakeProductCachePort, FakeStockReservationRedisRepository) to exercise ProductFacade.restoreStock; 1) normal recovery: seed a product and a reservation in FakeStockReservationRedisRepository, call productFacade.restoreStock(reservationId or payload) and assert DB stock in FakeProductRepository and Redis entries in FakeStockReservationRedisRepository were updated/removed as expected; 2) missing product: call restoreStock for a reservation referencing a non-existent product and assert no exception is thrown and repository state unchanged (verify ProductFacade handles Optional.ifPresent safely); 3) negative quantity boundary: attempt to restore a negative quantity and assert ProductFacade either rejects it or clamps/ignores it per domain rules (verify FakeProductRepository stock unchanged or validated behavior) and that no invalid DB/Redis state occurs. Ensure tests reference ProductFacade.restoreStock, FakeStockReservationRedisRepository, and FakeProductRepository when setting up and asserting results.
🟠 Major comments (22)
modules/redis/src/main/java/com/loopers/config/redis/RedisConfig.java-79-80 (1)
79-80: ⚠️ Potential issue | 🟠 Major — The hardcoded 500ms commandTimeout cannot be tuned dynamically in production and should be externalized
The fixed 500ms value cannot adapt to network latency or load changes and can lead to a surge of timeouts. Add a commandTimeout field to RedisProperties, or externalize it like other config classes (TossSandboxPgConfig, SimulatorFeignConfig) with a `@Value("${datasource.redis.command-timeout:500}")` annotation, keeping the default but making it adjustable per environment.

Add integration tests:
- verify the default (500ms) applies when command-timeout is unset
- verify the configured value applies when command-timeout is set
- verify timeout behavior and retries under boundary conditions (low/high latency)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/redis/src/main/java/com/loopers/config/redis/RedisConfig.java` around lines 79 - 80, Externalize the hardcoded 500ms command timeout in RedisConfig by wiring it from configuration: add a commandTimeout field to RedisProperties (or inject with `@Value`("${datasource.redis.command-timeout:500}") into RedisConfig) and replace the literal Duration.ofMillis(500) in LettuceClientConfiguration.LettuceClientConfigurationBuilder builder -> .commandTimeout(...) with Duration.ofMillis(commandTimeout). Keep the default at 500ms, validate parsing (long/int) and unit semantics, and update/ add integration tests to verify default behavior when unset, correct applied value when set, and timeout/retry behavior under low/high latency scenarios.apps/commerce-api/src/test/java/com/loopers/interfaces/api/payment/PaymentE2ETest.java-94-103 (1)
94-103: ⚠️ Potential issue | 🟠 Major — The hardcoded user ID (1L) can make the integration test break depending on the environment
Operationally, `Order.create(1L, ...)` on Line 98 risks immediate flakiness once an FK constraint or stricter user validation is introduced. It is safer to create a real user entity during test-data setup and inject its ID into createTestOrder. As additional tests, verify separately that order creation/payment fails when the user does not exist, and that the same scenario succeeds after the user is created.
As per coding guidelines: `**/*Test*.java`: check integration tests for isolation level, flakiness risk, and test-data setup/teardown strategy.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/interfaces/api/payment/PaymentE2ETest.java` around lines 94 - 103, The helper createTestOrder currently hardcodes user ID (Order.create(1L,...)), which can make tests flaky; modify createTestOrder to accept a userId parameter (or create and save a User inside the helper using userRepository) and pass that real user's ID into Order.create instead of 1L, ensuring the test creates and persists the User before creating the Order; update tests to include a negative case (attempt order/payment with a non-existent userId to assert failure) and a positive case (create User, call createTestOrder with that user's ID and assert success), touching createTestOrder, Order.create usage, and any tests that call createTestOrder so they pass the correct userId.apps/commerce-streamer/src/main/java/com/loopers/domain/metrics/ProductMetrics.java-10-18 (1)
10-18: 🛠️ Refactor suggestion | 🟠 Major — An identifier-based equals/hashCode implementation is needed
Without equals/hashCode on a JPA entity, collection operations on Set, Map, etc. can behave unexpectedly. Per the coding guidelines, implement them consistently based on the identifier (productId).

♻️ Add equals/hashCode

```diff
+@Override
+public boolean equals(Object o) {
+    if (this == o) return true;
+    if (o == null || getClass() != o.getClass()) return false;
+    ProductMetrics that = (ProductMetrics) o;
+    return productId != null && productId.equals(that.productId);
+}
+
+@Override
+public int hashCode() {
+    return getClass().hashCode();
+}
```

As per coding guidelines: "Design equals/hashCode consistently, either identifier-based or value-based."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-streamer/src/main/java/com/loopers/domain/metrics/ProductMetrics.java` around lines 10 - 18, The ProductMetrics entity lacks equals/hashCode causing collection issues; add overrides of equals(Object) and hashCode() in the ProductMetrics class that base equality solely on the identifier field productId (compare class/type and then Objects.equals(this.productId, other.productId)), and implement hashCode() as Objects.hash(productId) (ensure a defined behavior when productId is null per JPA lifecycle). Add these methods to the ProductMetrics class so Sets/Maps behave consistently.apps/commerce-api/src/main/java/com/loopers/infrastructure/kafka/KafkaTopicConfig.java-11-33 (1)
11-33: ⚠️ Potential issue | 🟠 Major — The replicas(1) setting risks data loss in production
Every topic in KafkaTopicConfig is hardcoded to replicas(1), and there is no per-profile configuration (dev, qa, prd). On broker failure the partition's messages are lost, leading to dropped order and coupon-issuance events. Production should use at least replicas(2), preferably replicas(3) or higher.
Since kafka.yml already has a per-profile structure, externalize the topic settings and split them per environment:

Suggested fix

```diff
+@Value("${kafka.topic.catalog-events.partitions:3}")
+private int catalogPartitions;
+
+@Value("${kafka.topic.catalog-events.replicas:1}")
+private short catalogReplicas;
+
 @Bean
 public NewTopic catalogEventsTopic() {
     return TopicBuilder.name("catalog-events")
-            .partitions(3)
-            .replicas(1)
+            .partitions(catalogPartitions)
+            .replicas(catalogReplicas)
             .build();
 }
```

Then add the following to the production profile in kafka.yml:

```yaml
---
spring.config.activate.on-profile: prd
spring:
  kafka:
    topic:
      catalog-events:
        partitions: 3
        replicas: 3
      order-events:
        partitions: 3
        replicas: 3
      coupon-issue-requests:
        partitions: 3
        replicas: 3
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/infrastructure/kafka/KafkaTopicConfig.java` around lines 11 - 33, The KafkaTopicConfig currently hardcodes replicas(1) in catalogEventsTopic(), orderEventsTopic(), and couponIssueRequestsTopic(), which is unsafe for production; change these beans to read partitions and replicas from external configuration (e.g., bind properties under spring.kafka.topic.<topicName>.partitions and .replicas using `@ConfigurationProperties` or `@Value`) and use those values when building topics instead of hardcoded numbers, then add profile-specific entries in kafka.yml (prd profile should set replicas: 3, partitions: 3) so operational environments use higher replica counts.apps/commerce-api/src/main/java/com/loopers/infrastructure/kafka/AsyncConfig.java-15-25 (1)
15-25: 🛠️ Refactor suggestion | 🟠 Major — Thread-pool sizing and graceful shutdown are not configured
From an operations standpoint, review the following:
- Pool size: with core 2 / max 4, an event spike can saturate the queue → CallerRunsPolicy fires → request threads stall
- No graceful shutdown: tasks left in the queue can be lost on application shutdown

Suggested fix:

Add graceful shutdown and monitoring

```diff
 @Bean("eventExecutor")
 public Executor eventExecutor() {
     ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
     executor.setCorePoolSize(2);
     executor.setMaxPoolSize(4);
     executor.setQueueCapacity(100);
     executor.setThreadNamePrefix("event-async-");
     executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
+    executor.setWaitForTasksToCompleteOnShutdown(true);
+    executor.setAwaitTerminationSeconds(30);
     executor.initialize();
     return executor;
 }
```

Additional testing: during load tests, add metrics that monitor pool-saturation frequency and how often CallerRunsPolicy fires.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/infrastructure/kafka/AsyncConfig.java` around lines 15 - 25, The ThreadPoolTaskExecutor bean eventExecutor uses fixed small core/max sizes and CallerRunsPolicy and lacks graceful shutdown and visibility; change it to read corePoolSize/maxPoolSize from configuration (inject an Environment or `@Value`), replace or wrap CallerRunsPolicy with a non-blocking handler that logs/exports a metric (or use ThreadPoolExecutor.AbortPolicy) to avoid stalling caller threads, and enable graceful shutdown by calling executor.setWaitForTasksToCompleteOnShutdown(true) and executor.setAwaitTerminationSeconds(30) (tune seconds via config); also consider exposing metrics when rejection occurs so monitoring can count CallerRuns/abort events.apps/commerce-api/src/test/java/com/loopers/application/order/OrderFacadeTest.java-57-59 (1)
57-59: ⚠️ Potential issue | 🟠 Major — Domain-event publication is not verified
DomainEventPublisher is passed as a no-op lambda, so ORDER_CREATED and ORDER_CANCELLED event publication is not verified at all. Operationally, a failure to publish events affects downstream systems such as metrics aggregation and stock synchronization, but the current tests cannot detect it.

Suggested fix:
- Replace DomainEventPublisher with a mock or spy and verify the calls
- Verify the aggregateType, eventType, and payload values passed on publication

Verification example

```java
// in setUp
DomainEventPublisher domainEventPublisher = mock(DomainEventPublisher.class);

// in the test
verify(domainEventPublisher).publish(
        eq("order"),
        anyString(),
        eq("ORDER_CREATED"),
        any(),
        any(OrderCreatedEvent.class)
);
```

Additional tests: add cases verifying that the correct event type and payload are published on order creation and cancellation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/order/OrderFacadeTest.java` around lines 57 - 59, The test currently passes a no-op DomainEventPublisher lambda into OrderFacade (constructed in the setUp where orderFacade = new OrderFacade(..., domainEventPublisher)), so ORDER_CREATED and ORDER_CANCELLED emissions are never asserted; replace that lambda with a mock or spy (mock(DomainEventPublisher.class)) provided to OrderFacade and add verifications in the relevant tests to assert publish(...) was called with the correct aggregateType (e.g., "order"), eventType ("ORDER_CREATED"/"ORDER_CANCELLED"), and payload type/content (e.g., instances or fields of OrderCreatedEvent/OrderCancelledEvent) using verify/eq/any matchers; add dedicated tests for order creation and cancellation that assert both the event type and payload values.apps/commerce-api/src/main/java/com/loopers/application/product/ProductFacade.java-127-134 (1)
127-134: ⚠️ Potential issue | 🟠 Major — restoreStock does not adequately guarantee Redis/DB consistency
The current implementation has the following problems:
- Inconsistency when the DB fails after the Redis increment: Redis is not subject to transaction rollback, so a DB save failure leaves only Redis incremented
- Silent failure when the product does not exist: with ifPresent, a missing product means only Redis is incremented while the DB is skipped — a source of stock inconsistency in production

Suggested fix:
- Throw an explicit exception or log a warning when the product does not exist
- Update Redis only after the DB update succeeds, or add Redis compensation logic on failure

Example fix

```diff
 @Transactional
 public void restoreStock(Long productId, int quantity) {
-    stockRedisRepository.increase(productId, quantity);
-    productRepository.findById(productId).ifPresent(product -> {
-        product.increaseStock(quantity);
-        productRepository.save(product);
-    });
+    Product product = productRepository.findById(productId)
+            .orElseThrow(() -> new CoreException(ErrorType.NOT_FOUND,
+                    "재고 복원 대상 상품을 찾을 수 없습니다: " + productId));
+    product.increaseStock(quantity);
+    productRepository.save(product);
+    stockRedisRepository.increase(productId, quantity);
 }
```

Additional tests:
- verify an exception is thrown when the product does not exist
- verify the Redis state when the DB save fails (integration test)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/application/product/ProductFacade.java` around lines 127 - 134, The restoreStock method can leave Redis and DB out of sync because it increases Redis first and silently ignores missing products; change the flow so you verify product existence with productRepository.findById(productId) first and throw a clear exception or log+fail if absent, then perform the DB update (call product.increaseStock(...) and productRepository.save(product)) within the `@Transactional` boundary, and only after a successful DB save call stockRedisRepository.increase(productId, quantity); alternatively, if you must increase Redis first, wrap the DB save in try/catch and on any failure call stockRedisRepository.decrease(productId, quantity) as a compensating action and rethrow the exception; add tests to assert exception on missing product and to verify Redis state is corrected when DB save fails.apps/commerce-batch/src/main/java/com/loopers/batch/job/eventhandledcleanup/step/EventHandledCleanupTasklet.java-21-25 (1)
21-25: 🛠️ Refactor suggestion | 🟠 Major — The retention period is hardcoded and not externalized
The 7-day retention period is hardcoded inside the query. Retention policy can differ per environment, and changing it requires a redeploy.

Suggested fix:
- Externalize it, e.g. via `@Value("${batch.event-handled.retention-days:7}")`

Externalization example

```diff
+@Value("${batch.event-handled.retention-days:7}")
+private int retentionDays;
+
 @Override
 public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
-    log.info("[EventHandledCleanup] event_handled 7일 이전 데이터 삭제 시작");
+    log.info("[EventHandledCleanup] deleting event_handled rows older than {} days", retentionDays);
     int deleted = entityManager.createNativeQuery(
-        "DELETE FROM event_handled WHERE created_at < DATE_SUB(NOW(), INTERVAL 7 DAY)"
+        "DELETE FROM event_handled WHERE created_at < DATE_SUB(NOW(), INTERVAL " + retentionDays + " DAY)"
     ).executeUpdate();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-batch/src/main/java/com/loopers/batch/job/eventhandledcleanup/step/EventHandledCleanupTasklet.java` around lines 21 - 25, The deletion query currently hardcodes a 7-day retention window; update EventHandledCleanupTasklet to inject the retention period from configuration (e.g., add a field like int retentionDays annotated with `@Value`("${batch.event-handled.retention-days:7}") in the class) and use that value when executing the delete via entityManager (build the native query to use a parameter or compute the cutoff timestamp in Java and pass it as a parameter to the native query instead of embedding "INTERVAL 7 DAY"). Ensure you reference the existing entityManager.createNativeQuery(...) call and replace the hardcoded interval with a parameterized form that uses the injected retentionDays.apps/commerce-batch/src/main/java/com/loopers/batch/job/eventhandledcleanup/step/EventHandledCleanupTasklet.java-22-24 (1)
22-24: ⚠️ Potential issue | 🟠 Major — A bulk DELETE risks table locks and performance degradation
Deleting all data older than 7 days with a single DELETE statement can cause the following in production:
- Long-held table locks on mass row deletion, delaying consumer processing
- Binary-log explosion, causing replication lag
- Transaction rollback-segment exhaustion

Suggested fix:
- Delete in batches (e.g. 1000 rows per iteration)
- Log a warning when the number of rows to delete exceeds a threshold

Chunked delete example

```diff
-int deleted = entityManager.createNativeQuery(
-    "DELETE FROM event_handled WHERE created_at < DATE_SUB(NOW(), INTERVAL 7 DAY)"
-).executeUpdate();
+int totalDeleted = 0;
+int batchSize = 1000;
+int deleted;
+do {
+    deleted = entityManager.createNativeQuery(
+        "DELETE FROM event_handled WHERE created_at < DATE_SUB(NOW(), INTERVAL 7 DAY) LIMIT " + batchSize
+    ).executeUpdate();
+    totalDeleted += deleted;
+    entityManager.flush();
+    entityManager.clear();
+} while (deleted == batchSize);
```

Additional testing: measure deletion time and lock impact with a large dataset (100k+ rows).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-batch/src/main/java/com/loopers/batch/job/eventhandledcleanup/step/EventHandledCleanupTasklet.java` around lines 22 - 24, The single DELETE in EventHandledCleanupTasklet (the entityManager.createNativeQuery call) can cause long table locks and replication issues for large volumes; change it to perform batched deletes (e.g., loop executing DELETE ... LIMIT 1000 via the same entityManager until executeUpdate returns 0), accumulate the total deleted count, call entityManager.flush()/clear() between batches if needed, and emit a warning log from EventHandledCleanupTasklet when the initial total-to-delete or accumulated deletions exceed a safe threshold (e.g., 10000) so operators are alerted.apps/commerce-batch/src/main/java/com/loopers/batch/job/outboxcleanup/step/OutboxCleanupTasklet.java-22-24 (1)
22-24: ⚠️ Potential issue | 🟠 Major — A single bulk DELETE disrupts both the batch and online writes
Lines 22-24 delete every outbox row older than the cutoff in one transaction. As volume grows, this leads to long locks, increased undo/redo, and a flood of CDC delete events, degrading both batch duration and online write latency. The fix is to accept the cutoff and batch size as job parameters and run LIMIT-based chunked deletes in a loop. As an additional test, add a batch integration test that inserts more event_outbox records than the chunk size and verifies that repeated runs delete everything without misses.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-batch/src/main/java/com/loopers/batch/job/outboxcleanup/step/OutboxCleanupTasklet.java` around lines 22 - 24, OutboxCleanupTasklet currently issues one large DELETE via entityManager.createNativeQuery which can cause long locks/CDC storms; change it to accept cutoff (cutoffTimestamp or cutoffHours) and batchSize as job parameters and replace the single DELETE with a loop that runs a native DELETE ... LIMIT :batchSize (or uses vendor-specific LIMIT/ROWNUM) using the same cutoff until executeUpdate() returns 0, committing between iterations to avoid long transactions; update the code paths in OutboxCleanupTasklet that build the query and read job parameters (e.g., the tasklet execute method) and add an integration test that inserts more than batchSize event_outbox rows and runs the tasklet repeatedly to assert all rows older than cutoff are deleted without missing any.apps/commerce-api/src/main/java/com/loopers/infrastructure/event/DomainEventPublisherImpl.java-27-31 (1)
27-31: ⚠️ Potential issue | 🟠 Major — Converting serialization failures into a generic RuntimeException undermines the API error contract
Lines 27-31 preserve the cause but change the exception type to a generic RuntimeException. Since this implementation is called directly on the commerce-api write path, deviating from the standard exception here can make client response formats and failure classification inconsistent. The fix is to wrap it in the project-standard exception (CoreException, etc.) and log only the aggregate/event identifiers plus a payload summary, keeping them separate from the user-facing message. As an additional test, verify that the facade/controller responds with the common error schema on serialization failure.
Based on learnings, "In the loop-pack-be-l2-vol3-java project, enforce unified error handling by routing errors through CoreException to ApiControllerAdvice to ensure a consistent response format."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/infrastructure/event/DomainEventPublisherImpl.java` around lines 27 - 31, The serializePayload method in DomainEventPublisherImpl currently wraps JsonProcessingException in a generic RuntimeException; change this to wrap and rethrow a project-standard exception (e.g., CoreException) while preserving the cause, and ensure the thrown CoreException carries a user-facing message (not raw payload) and that detailed context (aggregateId/eventId and a short payload summary) is logged separately (use objectMapper/Logging in serializePayload or the surrounding publish method). Update serializePayload to reference CoreException (preserve e as cause) and add logging that records only aggregate/event identifiers and a truncated payload summary. Add/update tests to assert that when serializePayload fails the controller/facade returns the unified error schema via ApiControllerAdvice.apps/commerce-api/src/main/java/com/loopers/domain/event/EventOutbox.java-10-12 (1)
10-12: ⚠️ Potential issue | 🟠 Major — Without a created_at index, the cleanup job will scan the entire table
The table definition on Line 11 has no created_at index. Since this PR's outbox cleanup performs range deletes on created_at, accumulated data will cause full scans and large locks, pushing up outbox insert/write latency as well. The fix is to add the same index in a migration and, in schema auto-ddl environments, also declare it via `@Table(indexes = ...)`. As an additional verification step, check after the migration that `EXPLAIN DELETE ... WHERE created_at < ?` uses a range scan.

Example fix

```diff
-@Table(name = "event_outbox")
+@Table(
+    name = "event_outbox",
+    indexes = {
+        @Index(name = "idx_event_outbox_created_at", columnList = "created_at")
+    }
+)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/domain/event/EventOutbox.java` around lines 10 - 12, the EventOutbox entity lacks a created_at index, so the cleanup job's range deletes cause full table scans and lock/latency problems; add `@Table(indexes = @Index(name = "idx_event_outbox_created_at", columnList = "created_at"))` to the EventOutbox entity's `@Table` declaration, add the equivalent index-creation SQL to the corresponding outbox DB migration file so both the schema auto-DDL environment and the manual-migration environment are covered, and after the change add a simple integration/DB verification test that runs EXPLAIN DELETE FROM event_outbox WHERE created_at < ? in a migrated environment to confirm a range scan (index use).
apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponFacade.java-169-171 (1)
169-171: ⚠️ Potential issue | 🟠 Major — The lookup signature alone cannot enforce owner verification
The current method takes only requestId, so once it is wired to a customer-facing endpoint, the application layer cannot prevent one user from reading another user's request status. Accept memberId as well and restrict the query with findByIdAndMemberId, or separate out an internal-only lookup, and add a test asserting FORBIDDEN is returned when querying another user's request.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponFacade.java` around lines 169 - 171, The getIssueRequest method in CouponFacade currently accepts only requestId, allowing unauthorized users to read others' requests; change its signature to accept the requesting memberId and replace the repository call to use a constrained lookup (e.g., couponIssueRequestRepository.findByIdAndMemberId(requestId, memberId)) or perform an existence check plus owner equality check inside getIssueRequest, and when the request exists but does not belong to memberId throw new CoreException(ErrorType.FORBIDDEN, "..."); also add a unit/integration test that calls getIssueRequest with a valid requestId but different memberId and asserts a FORBIDDEN CoreException is thrown.apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponFacade.java-162-163 (1)
162-163: ⚠️ Potential issue | 🟠 Major — Internal serialization errors should also be unified under the CoreException path
Throwing a raw RuntimeException here makes the API response format differ from other errors and fails to separate the user message from the log message. Wrap it in a server-error CoreException that preserves the cause, leave the detailed reason only in the logs, and pin the error-response schema with an API test to keep it consistent.
Based on learnings, "enforce unified error handling by routing errors through CoreException to ApiControllerAdvice to ensure a consistent response format."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponFacade.java` around lines 162 - 163, Replace the direct RuntimeException thrown in the JsonProcessingException handler inside CouponFacade (the catch block for JsonProcessingException e that currently throws new RuntimeException("쿠폰 발급 요청 페이로드 직렬화 실패", e)) with a CoreException so the error is routed through ApiControllerAdvice: log the detailed cause (e) at error/debug level and throw a new CoreException containing a user-facing message key or brief message (preserving the original cause internally) so the API response format stays consistent; ensure the thrown CoreException preserves the original exception as its cause and that ApiControllerAdvice will map it to the unified error schema.apps/commerce-api/src/main/java/com/loopers/interfaces/listener/LikeCountEventListener.java-33-57 (2)
33-57: ⚠️ Potential issue | 🟠 Major

Stale-cache problems can arise from the ordering of cache eviction and the asynchronous likeCount update.

CacheEvictionEventListener runs synchronously and evicts the cache immediately, but LikeCountEventListener merely queues work on eventExecutor and returns. If a client request arrives after eviction completes but before the async task starts, the old likeCount read from the DB is written back into the cache and pinned there for the 10-minute Redis TTL. The current tests run with a synchronous Executor and cannot detect this.

Improve the likeCount update and cache eviction in one of the following ways:
- Serialize the two operations in the same pipeline (specify explicit ordering with @Order), or
- Optionally evict the cache again after the async update completes.
Either way, add a mandatory integration test that reproduces the evict → read → async update → cache set sequence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/interfaces/listener/LikeCountEventListener.java` around lines 33 - 57, The current race between CacheEvictionEventListener and the asynchronous LikeCountEventListener causes stale caches because LikeCountEventListener (handleLikeCreated/handleLikeRemoved) queues work on eventExecutor and returns before the DB update completes; fix by either making like-count updates run in the same serialized event pipeline as cache eviction (remove the async queueing in LikeCountEventListener or add explicit ordering using `@Order` on the listener methods so incrementLikeCount/decrementLikeCount (productRepository) run before/after CacheEvictionEventListener as required), or keep async execution but perform cache eviction only after the async transactionTemplate.executeWithoutResult completes (i.e., have the async task call the cache-evict logic or trigger a follow-up evict event once productRepository.incrementLikeCount/decrementLikeCount succeed); then add an integration test that reproduces evict → read → async update → cache set sequence to assert no stale value is left in Redis.
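The race described above can be reproduced in a few lines. This single-threaded sketch uses illustrative names only (the real flow involves Redis and MySQL); it shows how an old value gets re-cached after eviction, and why a second evict after the async update lands fixes it:

```java
import java.util.HashMap;
import java.util.Map;

// Simulates the stale-cache sequence (evict → read → async update → re-cache)
// and the fix of evicting again once the async update completes.
public class StaleCacheDemo {
    static Map<String, Integer> db = new HashMap<>(Map.of("likeCount", 1));
    static Map<String, Integer> cache = new HashMap<>();

    // Cache-aside read: on miss, load from DB and populate the cache.
    static int read() {
        return cache.computeIfAbsent("likeCount", db::get);
    }

    public static void main(String[] args) {
        // Buggy ordering: eviction happens before the async DB update lands.
        cache.remove("likeCount");   // synchronous eviction
        int seen = read();           // client request re-caches the old value
        db.put("likeCount", 2);      // async update lands afterwards
        if (seen != 1 || cache.get("likeCount") != 1) throw new AssertionError();

        // Fix: evict again after the async update completes.
        cache.remove("likeCount");
        if (read() != 2) throw new AssertionError();
        System.out.println("after re-evict, cache serves " + cache.get("likeCount"));
    }
}
```

Serializing the two listeners with @Order avoids the second eviction entirely; the re-evict variant keeps the update asynchronous at the cost of one extra cache round trip.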
33-57: ⚠️ Potential issue | 🟠 Major

The asynchronous like-count update cannot be recovered after a failure.

The like-count update submitted to eventExecutor after AFTER_COMMIT is never persisted to EventOutbox, so its execution is not guaranteed. The current CallerRunsPolicy runs it synchronously on the caller thread when the queue is saturated, but a thread crash, DB exception, or server restart leaves the like count permanently wrong. Payment events, by contrast, are recovered every 5 seconds by OutboxPollerScheduler; like counts have no recovery mechanism.

Also, CacheEvictionEventListener evicts the cache synchronously while the like count is updated asynchronously afterwards, so the gap between eviction and the DB update can let a stale value be re-cached.

As with payment events, persist like events to EventOutbox and extend OutboxPollerScheduler with a reconciliation batch, or at minimum record failed like-count updates in a separate table so a batch job can reprocess them. Additionally, write tests for the executor-queue-saturation and DB-exception scenarios.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/interfaces/listener/LikeCountEventListener.java` around lines 33 - 57, LikeCountEventListener's async updates (handleLikeCreated/handleLikeRemoved) are best-effort and not durable; persist like change intents to the EventOutbox (same pattern as payments) in the AFTER_COMMIT phase instead of directly submitting to eventExecutor, then let OutboxPollerScheduler process those outbox like-events and call productRepository.incrementLikeCount/decrementLikeCount inside its transactional worker; alternatively, on immediate execution failures write a failure record to a dedicated like_update_failures table for batched reprocessing by OutboxPollerScheduler; also reconcile CacheEvictionEventListener ordering so cache eviction happens only after the durable outbox commit or after successful DB update to avoid stale re-population; add automated tests simulating executor queue saturation, DB exceptions and server restarts to verify outbox persistence and retry/recovery paths.

apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponFacade.java-152-166 (1)
152-166: ⚠️ Potential issue | 🟠 Major

A Kafka send failure leaves the request stuck in PENDING forever.

The return value of kafkaTemplate.send(...) is discarded, so broker rejections, timeouts, and network failures are never handled. The consumer updates the status to COMPLETED/REJECTED on the success path, but if the message never reaches Kafka the state transition never happens. In production the user just keeps polling the status, and there is no reprocessing point. Apply the outbox pattern that already exists in the codebase (PaymentOutbox, event_outbox table), or at minimum capture the ListenableFuture from send() and, in the failure callback, transition the request to REJECTED and record a retry entry. Also unify exceptions other than JsonProcessingException (e.g., failures when calling send after serialization) under CoreException to keep the facade's error-handling pattern consistent. In tests, verify that the request transitions to REJECTED in send() failure scenarios, including async callback failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponFacade.java` around lines 152 - 166, The coupon issue flow in CouponFacade currently ignores kafkaTemplate.send(...) result so broker/network failures leave CouponIssueRequest stuck PENDING; change CouponFacade to capture the send() ListenableFuture (from kafkaTemplate.send), attach success and failure callbacks that on success leave behavior as-is and on failure update the saved CouponIssueRequest status to REJECTED (via couponIssueRequestRepository.save or a repository method) and record a retry/outbox entry (reuse the existing outbox pattern or PaymentOutbox/event_outbox) so the failure is persisted for retry; also catch and rethrow non-JsonProcessing exceptions from serialization/send as the common CoreException type to match facade error handling and add tests that simulate send() async failure to assert the request transitions to REJECTED.

docker/register-debezium-connector.sh-19-21 (1)
19-21: ⚠️ Potential issue | 🟠 Major

Database credentials are hardcoded.

database.user: "root" and database.password: "root" are embedded directly in the script. Even for a development environment, credentials should be moved to environment variables; the script is under version control, so the credentials risk being exposed.

🔒 Suggested use of environment variables:

-    "database.user": "root",
-    "database.password": "root",
+    "database.user": "${DB_USER:-root}",
+    "database.password": "${DB_PASSWORD:-root}",

Add environment-variable references at the top of the script:

DB_USER="${DB_USER:-root}"
DB_PASSWORD="${DB_PASSWORD:-root}"

Then adjust the heredoc for variable substitution, or use envsubst, so the JSON picks up the variables.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker/register-debezium-connector.sh` around lines 19 - 21, The JSON in register-debezium-connector.sh hardcodes credentials ("database.user" and "database.password"); change it to read DB_USER and DB_PASSWORD environment variables (with safe defaults if desired) and update the heredoc/JSON generation to substitute those variables (e.g., use parameter expansion DB_USER="${DB_USER:-root}" and DB_PASSWORD="${DB_PASSWORD:-root}" at the top or require them to be set), then render the connector JSON via a variable-substituting heredoc or run envsubst before posting so "database.user" and "database.password" are populated from the environment rather than hardcoded.

apps/commerce-batch/src/main/java/com/loopers/batch/job/metricsreconcile/step/MetricsReconcileTasklet.java-21-51 (1)
21-51: ⚠️ Potential issue | 🟠 Major

There are cross-step data-integrity problems and performance issues on large tables.

Data integrity: step 1 uses INSERT ... ON DUPLICATE KEY UPDATE to set like_count while inserting view_count=0, sales_count=0, sales_amount=0. For a product with no existing product_metrics row, the row inserted in step 1 can later be overwritten with like_count=0 in step 3. If streaming updates occur between the two steps, the data diverges.

Performance: native queries that scan the entire likes and order_item tables run in a single transaction, risking long lock holds and timeouts on large data sets.

🔧 Suggested improvement:

-        // 1단계: likes 테이블 기준 → product_metrics.like_count 보정
-        log.info("[MetricsReconcile] 1단계: like_count 대사 시작");
-        int likeCorrected = entityManager.createNativeQuery(
-                "INSERT INTO product_metrics (product_id, like_count, view_count, sales_count, sales_amount, updated_at) "
-                        + "SELECT l.product_id, COUNT(*), 0, 0, 0, NOW(6) FROM likes l GROUP BY l.product_id "
-                        + "ON DUPLICATE KEY UPDATE like_count = VALUES(like_count), updated_at = NOW(6)"
-        ).executeUpdate();
+        // 1단계: likes 테이블 기준 → product_metrics.like_count 보정 (기존 값 유지)
+        log.info("[MetricsReconcile] 1단계: like_count 대사 시작");
+        int likeCorrected = entityManager.createNativeQuery(
+                "INSERT INTO product_metrics (product_id, like_count, view_count, sales_count, sales_amount, updated_at) "
+                        + "SELECT l.product_id, COUNT(*), "
+                        + "COALESCE((SELECT pm.view_count FROM product_metrics pm WHERE pm.product_id = l.product_id), 0), "
+                        + "COALESCE((SELECT pm.sales_count FROM product_metrics pm WHERE pm.product_id = l.product_id), 0), "
+                        + "COALESCE((SELECT pm.sales_amount FROM product_metrics pm WHERE pm.product_id = l.product_id), 0), "
+                        + "NOW(6) FROM likes l GROUP BY l.product_id "
+                        + "ON DUPLICATE KEY UPDATE like_count = VALUES(like_count), updated_at = NOW(6)"
+        ).executeUpdate();

Alternatively, refactor into a single consolidated query, or cap the batch size and process in pages.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-batch/src/main/java/com/loopers/batch/job/metricsreconcile/step/MetricsReconcileTasklet.java` around lines 21 - 51, The current MetricsReconcileTasklet runs three full-table native upserts that (a) insert zeros for unrelated metrics in step1/step3 causing later overwrites (e.g., like_count or sales_* being zeroed) and (b) scan huge tables in one transaction causing long locks/timeouts; fix by changing the upserts so each step only inserts/updates the relevant columns (e.g., in the Metric upsert SQLs used in MetricsReconcileTasklet do NOT set view_count/sales_count/sales_amount to 0 when inserting — insert minimal columns or use INSERT ... ON DUPLICATE KEY UPDATE that only updates like_count in the like step and only sales_count/sales_amount in the sales step), and convert the large-table operations (the native queries that aggregate likes and order_item) into chunked/paginated batch jobs (process product_id groups or date ranges in batches with a LIMIT-based loop inside MetricsReconcileTasklet) or combine aggregations into a single safe upsert that computes both likes and sales before writing to product_metrics to avoid intermediate zeroing and long transactions.

apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/MetricsConsumer.java-32-43 (1)
32-43: ⚠️ Potential issue | 🟠 Major

Swallowing failed records and then acking the whole batch permanently loses metric events.

Currently an individual record failure is only logged, processing continues, and the whole batch is committed at line 43. Transient DB errors and deserialization errors therefore never reach retry/DLQ, and events such as PRODUCT_VIEWED that are not recovered by the reconciliation batch can leave the actual aggregates permanently low. It is safer either to rethrow immediately when any record fails so the container error handler takes over, or to switch from batch to per-record consumption so only the failing record goes to the DLT. Also add an integration test confirming that, for a batch containing one corrupted record, the offset is not committed or the failing record moves to the DLT. As per coding guidelines, "check that exception flows are clear."
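The fail-fast alternative can be sketched without Kafka; `consume` and `acked` are stand-ins for the listener method and Spring Kafka's Acknowledgment, under the assumption that the container error handler reacts to the rethrown exception:

```java
import java.util.List;
import java.util.function.Consumer;

// Sketch: stop at the first failing record and let the exception propagate,
// so the batch offset is NOT committed and the container error handler can
// retry or route the record to the DLT.
public class FailFastBatchDemo {
    static boolean acked = false; // stand-in for Acknowledgment.acknowledge()

    static void consume(List<String> records, Consumer<String> processRecord) {
        for (String record : records) {
            processRecord.accept(record); // no try/catch — failures propagate
        }
        acked = true; // only reached when every record succeeded
    }

    public static void main(String[] args) {
        List<String> batch = List.of("ok-1", "broken", "ok-2");
        try {
            consume(batch, r -> {
                if (r.startsWith("broken")) throw new IllegalStateException("bad record: " + r);
            });
        } catch (IllegalStateException expected) {
            // the container error handler would take over here
        }
        if (acked) throw new AssertionError("offset must not be committed");
        System.out.println("acked=" + acked);
    }
}
```

The per-record variant instead wraps each iteration in its own transaction and ack, trading throughput for finer-grained DLT routing.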
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/MetricsConsumer.java` around lines 32 - 43, In consume(List<ConsumerRecord<String, byte[]>> records, Acknowledgment ack), individual record failures are only logged while the entire batch is acked, so failed records can be lost permanently. To fix this, stop absorbing exceptions in the try/catch around the processRecord call (or rethrow immediately on failure) so consume throws and the container's retry/DLQ flow takes over, or switch from batch to per-record processing and manage the transaction (TransactionTemplate) and ack per record so only failing records go to the DLT. Relevant symbols: MetricsConsumer.consume, processRecord, TransactionTemplate, Acknowledgment. After the change, add an integration test verifying that failed records are not offset-committed and move to the DLT.

apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/MetricsConsumer.java-52-71 (1)
52-71: ⚠️ Potential issue | 🟠 Major

Recording unsupported events in event_handled first forfeits the chance to reprocess them.

Lines 52-57 insert into event_handled first, and lines 64-70 merely log a warning for unsupported types. If producer schema drift or a missing header occurs in production, the event is pinned as handled without the aggregate ever being updated, and replaying after a code fix will not consume it again. It is safer to validate the eventType against the allowed set first and, for unsupported types, throw instead of recording in event_handled so the message goes to retry/DLQ. Also add a test that a missing or misspelled eventType does not create an event_handled row and falls into the retry path. As per coding guidelines, "check that exception flows are clear."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/MetricsConsumer.java` around lines 52 - 71, The current code inserts into event_handled before validating eventType, which can mark unsupported events as handled; change the flow in MetricsConsumer.java so you validate the incoming eventType (and presence/non-empty header) against supported values ("LIKE_CREATED", "LIKE_REMOVED", "PRODUCT_VIEWED", "ORDER_CREATED", "ORDER_CANCELLED") before calling the native INSERT IGNORE; for unsupported or missing eventType throw a specific exception (so the message goes to retry/DLQ) instead of logging and proceeding, and only after successful validation perform the INSERT and then call the appropriate handler methods (upsertLikeCount, upsertViewCount, upsertSalesMetrics); add unit/integration tests to assert that unsupported or missing eventType does not create an event_handled row and triggers retry behavior.

apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CouponIssueConsumer.java-71-87 (1)
71-87: ⚠️ Potential issue | 🟠 Major

Treating every failure of the coupon_issue INSERT as a duplicate issuance turns ordinary failures into permanent rejections.

Lines 79-87 roll back issued_count and end the request as REJECTED for any exception. When transient DB failures, lock timeouts, or even SQL errors fall into this branch, a retryable request is wrongly terminated as an already-issued coupon, and the counter compensation can be applied for a reason different from the actual cause. It is safer to narrow the catch so that only UNIQUE-constraint violations are compensated/rejected, while all other exceptions are rethrown with the cause preserved and handed to retry/DLQ. Also add separate tests verifying that a duplicate-issue exception ends in rejection while a generic DB exception is retried with no state change. As per coding guidelines, "check that exception flows are clear."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CouponIssueConsumer.java` around lines 71 - 87, The current catch around the coupon_issue INSERT treats any exception as a duplicate-issue case and rolls back issued_count and calls rejectRequest(requestId, "..."), which masks transient DB errors; change the error handling in the block performing the entityManager.createNativeQuery INSERT into discriminating exception handling: detect a UNIQUE/constraint-violation (e.g., inspect PersistenceException/ConstraintViolationException or SQLState/vendor error code thrown by the INSERT) and only in that case run the compensating UPDATE on coupon (issued_count--) and call rejectRequest; for all other exceptions rethrow the original exception (preserving cause) so it can be retried/handled by DLQ; update tests to cover duplicate-issue path (expect rejectRequest and issued_count decrement) and transient DB failure path (no state change, exception propagated for retry).
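A minimal sketch of the narrowed catch described above. The names (`issue`, `Insert`, the outcome enum) are hypothetical stand-ins, and it assumes the driver reports duplicates as SQLIntegrityConstraintViolationException; the real consumer goes through JPA, where the violation typically arrives wrapped in a PersistenceException and must be unwrapped first:

```java
import java.sql.SQLIntegrityConstraintViolationException;
import java.sql.SQLTransientException;

// Compensate only on duplicate-key violations; every other exception
// propagates with its cause intact so retry/DLQ handling can see it.
public class CouponIssueSketch {
    public enum Outcome { ISSUED, REJECTED_DUPLICATE }

    interface Insert { void run() throws Exception; }

    static Outcome issue(Insert insert) throws Exception {
        try {
            insert.run();
            return Outcome.ISSUED;
        } catch (SQLIntegrityConstraintViolationException dup) {
            // UNIQUE (coupon_id, member_id) violation: the only terminal branch —
            // here the real code decrements issued_count and marks REJECTED.
            return Outcome.REJECTED_DUPLICATE;
        }
        // Lock timeouts, transient DB errors, etc. are NOT caught above,
        // so the request stays PENDING and the message is retried.
    }

    public static void main(String[] args) throws Exception {
        if (issue(() -> {}) != Outcome.ISSUED) throw new AssertionError();
        if (issue(() -> { throw new SQLIntegrityConstraintViolationException("dup"); })
                != Outcome.REJECTED_DUPLICATE) throw new AssertionError();
        boolean rethrown = false;
        try {
            issue(() -> { throw new SQLTransientException("lock timeout"); });
        } catch (SQLTransientException e) {
            rethrown = true; // transient failure surfaced for retry, not rejected
        }
        if (!rethrown) throw new AssertionError();
        System.out.println("ok");
    }
}
```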
🟡 Minor comments (10)
apps/commerce-api/src/main/java/com/loopers/domain/event/OrderCancelledEvent.java-5-6 (1)
5-6: ⚠️ Potential issue | 🟡 Minor

As with OrderCreatedEvent, the immutability of List<OrderItemSnapshot> is not guaranteed.

Apply the same defensive copy proposed for OrderCreatedEvent. Since the two events share the same structure, handling both consistently keeps maintenance easy.

Suggested fix:

-public record OrderCancelledEvent(long orderId, long memberId, List<OrderItemSnapshot> items) {
-}
+public record OrderCancelledEvent(long orderId, long memberId, List<OrderItemSnapshot> items) {
+    public OrderCancelledEvent {
+        items = items == null ? List.of() : List.copyOf(items);
+    }
+}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/domain/event/OrderCancelledEvent.java` around lines 5 - 6, OrderCancelledEvent currently accepts a mutable List<OrderItemSnapshot> without defensive copying; add a canonical/compact constructor for the record OrderCancelledEvent that copies the items via List.copyOf(items) (and null-checks if you followed OrderCreatedEvent) so the stored list is immutable, and ensure any accessor or usage relies on that copied list to preserve immutability consistent with OrderCreatedEvent.

apps/commerce-api/src/main/java/com/loopers/domain/event/OrderCreatedEvent.java-5-6 (1)
5-6: ⚠️ Potential issue | 🟡 Minor

The immutability of List<OrderItemSnapshot> items is not guaranteed.

Java records are immutable by default, but the List reference can still be mutated externally. When the event is delivered asynchronously to multiple listeners, one listener mutating the list affects the others and causes state inconsistencies that are hard to debug.

Suggested fix — apply a defensive copy:

-public record OrderCreatedEvent(long orderId, long memberId, List<OrderItemSnapshot> items) {
-}
+public record OrderCreatedEvent(long orderId, long memberId, List<OrderItemSnapshot> items) {
+    public OrderCreatedEvent {
+        items = items == null ? List.of() : List.copyOf(items);
+    }
+}

Additional test: verify that mutating the original list after construction does not change the list inside the event.
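The compact-constructor fix can be verified in isolation. This sketch redeclares simplified versions of the records for the demo (the real ones live in the project's `com.loopers.domain.event` package):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the project's OrderItemSnapshot record.
record OrderItemSnapshot(long productId, int quantity, long productPrice) {}

record OrderCreatedEvent(long orderId, long memberId, List<OrderItemSnapshot> items) {
    // Compact constructor: List.copyOf produces an unmodifiable copy, so
    // later mutation of the caller's list cannot leak into the event.
    public OrderCreatedEvent {
        items = items == null ? List.of() : List.copyOf(items);
    }
}

public class DefensiveCopyDemo {
    public static void main(String[] args) {
        List<OrderItemSnapshot> source = new ArrayList<>();
        source.add(new OrderItemSnapshot(1L, 2, 10_000L));

        OrderCreatedEvent event = new OrderCreatedEvent(99L, 7L, source);
        source.add(new OrderItemSnapshot(2L, 1, 5_000L)); // mutate the original

        if (event.items().size() != 1) throw new AssertionError("copy leaked");
        System.out.println("event keeps " + event.items().size() + " item(s)");
    }
}
```

List.copyOf also rejects null elements and returns an unmodifiable list, so a listener calling `event.items().add(...)` fails fast with UnsupportedOperationException.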
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/domain/event/OrderCreatedEvent.java` around lines 5 - 6, OrderCreatedEvent's List<OrderItemSnapshot> items is exposed in an externally mutable state, so guarantee immutability by applying a defensive copy and making it unmodifiable (List.copyOf or Collections.unmodifiableList) in the record's canonical constructor, plus a null check (Objects.requireNonNull); concretely, convert the received items with List.copyOf(items) in OrderCreatedEvent's canonical constructor before assigning it to the record field, and add a unit test verifying that mutating the original list after construction does not change the event's internal list.

apps/commerce-api/src/main/java/com/loopers/application/product/ProductFacade.java-53-54 (1)
53-54: ⚠️ Potential issue | 🟡 Minor

ProductViewedEvent's memberId is always forced to 0L.

Since getProductDetailCached() does not receive the current user, the event is published with memberId=0L on both the cache-hit and cache-miss paths. This causes the following problems:

- Lost analytics data: per-user product-view patterns cannot be tracked, blocking future features such as personalized recommendations.
- Meaningless events: views served from the cache still publish a view event, but without user information its analytic value is zero.

Suggested fix:
- Extract the currently logged-in user in ProductController.getProduct() (via @AuthenticationPrincipal or an equivalent mechanism).
- Change the signature from getProductDetailCached(Long productId) to getProductDetailCached(Long productId, Long memberId).
- Pass the real user ID at the call sites.
- Tests: add scenarios for logged-in and anonymous users (decide the policy for anonymous users: 0L or null).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/application/product/ProductFacade.java` around lines 53 - 54, ProductViewedEvent is being published with memberId always set to 0L because getProductDetailCached(Long productId) has no user context; update the flow to capture and pass the real user ID: extract current user (e.g. via `@AuthenticationPrincipal`) in ProductController.getProduct(), change the service method signature to getProductDetailCached(Long productId, Long memberId) and update all call sites (both cache hit and miss paths) to pass the memberId into ProductViewedEvent instead of 0L (ensure ProductViewedEvent usage is updated), and add tests for logged-in and anonymous users to verify memberId handling (decide whether anonymous is 0L or null and codify it).

apps/commerce-api/src/main/java/com/loopers/domain/event/EventOutbox.java-40-47 (1)
40-47: ⚠️ Potential issue | 🟡 Minor

The outbox column constraints should be validated directly in the factory.

The factory at lines 40-47 performs no null/blank/length checks on aggregateType/aggregateId/eventType. As it stands, a malformed event only surfaces through a DB constraint at the tail end of the business transaction, delaying root-cause analysis. The fix is to validate null/blank/length for aggregateType (50), aggregateId (100), and eventType (50) explicitly in create() so it fails fast. Add boundary-value tests for the 50/51 and 100/101 character cases. As per coding guidelines, "**/domain/**/*.java: keep entity/value-object/DTO boundaries clear, and check immutability and encapsulation."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/domain/event/EventOutbox.java` around lines 40 - 47, The EventOutbox.create factory does not validate aggregateType/aggregateId/eventType and can allow values that will later violate DB constraints; update the EventOutbox.create(String aggregateType, String aggregateId, String eventType, String payload) method to validate each of aggregateType (non-null, non-blank, max length 50), aggregateId (non-null, non-blank, max length 100) and eventType (non-null, non-blank, max length 50) and throw a clear IllegalArgumentException on violation so failures surface early; add unit tests covering boundary lengths (50 vs 51 for aggregateType/eventType and 100 vs 101 for aggregateId) and blank/null cases to ensure the validations work.

apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java-44-49 (1)
44-49: ⚠️ Potential issue | 🟡 Minor

The factory should enforce identifier invariants up front.

Lines 44-49 create the entity even when couponId or memberId is null or invalid. Failures are then deferred to a DB exception at flush time, which makes root-cause classification difficult during async issue-request retries, and the entity cannot guarantee its own state. The fix is for create() to validate both identifiers immediately and fail fast. Add tests verifying that null/0/negative IDs are rejected before persistence. As per coding guidelines, "**/domain/**/*.java: keep entity/value-object/DTO boundaries clear, and check immutability and encapsulation."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java` around lines 44 - 49, The factory method CouponIssueRequest.create currently assigns couponId and memberId without validation; update CouponIssueRequest.create to validate couponId and memberId (reject null, zero, or negative) and throw an appropriate unchecked exception (e.g., IllegalArgumentException) immediately if invalid, keeping status set to CouponIssueRequestStatus.PENDING only after validation; add unit tests for CouponIssueRequest.create to assert that null, 0, and negative IDs are rejected before persistence.

apps/commerce-api/src/test/java/com/loopers/application/like/LikeFacadeTest.java-132-165 (1)
132-165: ⚠️ Potential issue | 🟡 Minor

This test verifies only half of the outbox contract.

Lines 132-165 check mainly published.event(), so the test passes even when the payload or removeLike's aggregate metadata is wrong. In production, the values serialized into the outbox and the aggregate information are the downstream contract, so this gap hides Kafka message schema regressions. The fix is to verify published.payload() type, productId, memberId, aggregateType, and aggregateId together for both LIKE_CREATED and LIKE_REMOVED. As a further test, confirm the outbox data has the expected shape even when event and payload are different objects.

Example assertions:

 assertThat(published.aggregateType()).isEqualTo("catalog");
 assertThat(published.aggregateId()).isEqualTo(String.valueOf(product.getId()));
 assertThat(published.eventType()).isEqualTo("LIKE_CREATED");
+assertThat(published.payload()).isInstanceOf(LikeCreatedEvent.class);
+LikeCreatedEvent payload = (LikeCreatedEvent) published.payload();
+assertThat(payload.productId()).isEqualTo(product.getId());
+assertThat(payload.memberId()).isEqualTo(1L);
 assertThat(published.event()).isInstanceOf(LikeCreatedEvent.class);

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/like/LikeFacadeTest.java` around lines 132 - 165, The tests currently only assert published.event() and miss verifying the outbox payload and aggregate metadata; update both addLike_publishesDomainEvent and removeLike_publishesDomainEvent to also assert published.payload() is the expected payload type and that payload.productId() and payload.memberId() match the product and member, and assert published.aggregateType() and published.aggregateId() equal the expected "catalog" and product.getId(); additionally add a case that constructs/returns a differing event vs payload to assert the outbox (published.payload()) contains the expected DTO shape even when published.event() is a different object so regressions in serialized outbox data are caught (refer to PublishedEvent, LikeCreatedEvent, LikeRemovedEvent, likeFacade.addLike/removeLike).

apps/commerce-api/src/test/java/com/loopers/infrastructure/event/DomainEventPublisherImplTest.java-66-69 (1)
66-69: ⚠️ Potential issue | 🟡 Minor

On serialization failure, the absence of side effects should also be pinned down.

The current verification relies only on the exception message, so a regression would be missed if the implementation later saved to the outbox or published the Spring event before failing. In production such partial persistence leads to duplicate consumption or phantom events, so the test should also assert that savedOutboxes and publishedEvents are both empty after the failure, fixing the atomicity. Add assertions that both collections contain zero entries.

Suggested change:

 assertThatThrownBy(() -> domainEventPublisher.publish("test", "1", "TEST", unserializable, new Object()))
         .isInstanceOf(RuntimeException.class)
         .hasMessageContaining("이벤트 페이로드 직렬화 실패");
+assertThat(savedOutboxes).isEmpty();
+assertThat(publishedEvents).isEmpty();

As per coding guidelines, "**/*Test*.java: check that unit tests cover boundary values, failure cases, and exception flows."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/infrastructure/event/DomainEventPublisherImplTest.java` around lines 66 - 69, The test currently only asserts the serialization exception message from domainEventPublisher.publish; extend it to also assert that there are no side-effects by verifying savedOutboxes and publishedEvents collections remain empty after the call. After calling domainEventPublisher.publish("test","1","TEST", unserializable, new Object()), add assertions that savedOutboxes.size() == 0 and publishedEvents.size() == 0 (or use assertThat(...).isEmpty()) to lock in atomic failure behavior and prevent regressions where outbox persistence or Spring events might be performed before the serialization error is thrown.

apps/commerce-streamer/src/main/java/com/loopers/domain/event/EventHandled.java-36-38 (1)
36-38: ⚠️ Potential issue | 🟡 Minor

Null state should be blocked at construction time.

The entity can be created even when eventId or eventType is null, so the error surfaces only at DB flush time, making consumer failures hard to diagnose. Validate immediately in the constructor and add a test that passing a null argument throws.

Suggested change:

 public EventHandled(String eventId, String eventType) {
-    this.eventId = eventId;
-    this.eventType = eventType;
+    this.eventId = Objects.requireNonNull(eventId, "eventId must not be null");
+    this.eventType = Objects.requireNonNull(eventType, "eventType must not be null");
 }

with import java.util.Objects; added. As per coding guidelines, "**/domain/**/*.java: ... check null handling ...".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-streamer/src/main/java/com/loopers/domain/event/EventHandled.java` around lines 36 - 38, In the EventHandled(String eventId, String eventType) constructor, proactively reject nulls by adding null checks (e.g., Objects.requireNonNull) for eventId and eventType and import java.util.Objects; update the constructor in class EventHandled to throw immediately on null inputs rather than deferring to DB flush, and add a unit test that asserts a NullPointerException (or the chosen exception) is thrown when passing null for either eventId or eventType.

apps/commerce-api/src/test/java/com/loopers/fake/FakeCouponIssueRequestRepository.java-13-20 (1)
13-20: ⚠️ Potential issue | 🟡 Minor

ID assignment in the fake repository is not concurrency-safe.

The sequence++ operation is not atomic, so parallel save calls can be assigned the same ID. For example, if threads A and B execute sequence++ at the same time, both may read the same value before incrementing, producing a duplicate ID, and store.put() then overwrites the earlier request. ConcurrentHashMap protects only the map itself, not the sequence variable. Since this causes nondeterministic behavior in concurrency test scenarios, use AtomicLong to make ID assignment atomic, and add a test verifying that all IDs are unique under parallel saves.

Suggested change:

 import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.atomic.AtomicLong;

-    private long sequence = 1L;
+    private final AtomicLong sequence = new AtomicLong(1L);

     if (request.getId() == null) {
-        setId(request, sequence++);
+        setId(request, sequence.getAndIncrement());
     }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/fake/FakeCouponIssueRequestRepository.java` around lines 13 - 20, The FakeCouponIssueRequestRepository currently uses a non-atomic long field named sequence and does sequence++ inside save(CouponIssueRequest), which can produce duplicate IDs under concurrent saves; change the sequence field to an AtomicLong (e.g., AtomicLong sequence = new AtomicLong(1L)) and replace the sequence++ usage in save(...) with an atomic operation (e.g., sequence.getAndIncrement() or incrementAndGet() as appropriate) when calling setId(...), ensuring IDs are assigned uniquely under concurrency; also add a concurrent test that performs parallel save(...) calls against FakeCouponIssueRequestRepository and asserts all generated IDs are unique and no entries were overwritten in store (ConcurrentHashMap).

docker/register-debezium-connector.sh-5-7 (1)
5-7: ⚠️ Potential issue | 🟡 Minor

The polling loop has no timeout and error handling is weak.

- The until loop has no timeout, so if Kafka Connect never starts the script waits forever.
- The curl command lacks the -f flag, so set -e does not trigger on HTTP errors (4xx, 5xx).
- If connector registration fails, the script can still exit successfully.

🔧 Add a timeout and error handling:

 echo "Waiting for Kafka Connect to be ready..."
+MAX_RETRIES=30
+RETRY_COUNT=0
-until curl -s http://localhost:8083/connectors > /dev/null 2>&1; do
+until curl -sf http://localhost:8083/connectors > /dev/null 2>&1; do
+  RETRY_COUNT=$((RETRY_COUNT + 1))
+  if [ $RETRY_COUNT -ge $MAX_RETRIES ]; then
+    echo "Error: Kafka Connect not available after $MAX_RETRIES attempts"
+    exit 1
+  fi
   sleep 2
 done

 echo "Registering Debezium MySQL Connector..."
-curl -X POST http://localhost:8083/connectors \
+curl -sf -X POST http://localhost:8083/connectors \
   -H "Content-Type: application/json" \
-  -d @- << 'EOF'
+  -d @- << 'EOF' || { echo "Failed to register connector"; exit 1; }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker/register-debezium-connector.sh` around lines 5 - 7, The polling loop "until curl -s http://localhost:8083/connectors > /dev/null 2>&1; do sleep 2; done" lacks a timeout and proper error handling; modify it to track elapsed time with a MAX_WAIT (e.g. seconds) and break with a non‑zero exit if the max is exceeded, change the curl invocation to fail on HTTP errors (use curl -fS or equivalent) and surface curl's stderr, and ensure subsequent connector registration commands check curl/registration exit codes (or enable set -e) so failures don't cause the script to exit successfully; in short, replace the current until loop with a timed retry loop using curl -fS and explicit exit-on-failure for connector registration.
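The bounded wait loop from the suggestion can be sketched in isolation. `PROBE` is a stand-in for the real `curl -sf http://localhost:8083/connectors` readiness check, defaulted to `true` here so the sketch runs standalone:

```shell
# Bounded readiness poll; PROBE stands in for the real curl check.
PROBE="${PROBE:-true}"
MAX_RETRIES="${MAX_RETRIES:-3}"

retry=0
until sh -c "$PROBE" > /dev/null 2>&1; do
  retry=$((retry + 1))
  if [ "$retry" -ge "$MAX_RETRIES" ]; then
    echo "Error: not ready after $MAX_RETRIES attempts" >&2
    exit 1
  fi
  sleep 1
done
echo "ready after $retry retries"
```

Running with `PROBE=false` exercises the timeout path: the loop gives up after MAX_RETRIES attempts and exits 1 instead of hanging forever.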
@Async("eventExecutor")
@TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT, fallbackExecution = true)
public void handle(ProductViewedEvent event) {
    try {
        String payload = objectMapper.writeValueAsString(Map.of(
                "eventId", UUID.randomUUID().toString(),
                "eventType", "PRODUCT_VIEWED",
                "productId", event.productId(),
                "memberId", event.memberId()
        ));
        kafkaTemplate.send("catalog-events", String.valueOf(event.productId()), payload);
    } catch (JsonProcessingException e) {
        log.warn("조회수 이벤트 Kafka 발행 실패 — productId={}", event.productId(), e);
    }
🧩 Analysis chain
🏁 Script executed:
cat -n apps/commerce-api/src/main/java/com/loopers/application/event/ProductViewKafkaPublisher.java
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 1887
🏁 Script executed:
# Search for eventExecutor configuration
rg "eventExecutor" --type java
# Check other kafkaTemplate.send usage patterns
rg "kafkaTemplate\.send" --type java -A 2
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 2072
🏁 Script executed:
# Read AsyncConfig
cat -n apps/commerce-api/src/main/java/com/loopers/infrastructure/kafka/AsyncConfig.java
# Search for tests
find . -name "*ProductViewKafkaPublisher*Test*" -o -name "*ProductViewKafkaPublisherTest*"
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 1257
🏁 Script executed:
cat -n ./apps/commerce-api/src/test/java/com/loopers/application/event/ProductViewKafkaPublisherTest.java
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 1945
Kafka send failures can be silently lost.

The ListenableFuture returned by kafkaTemplate.send() is never captured, so send failures go undetected. Serialization failures are swallowed with a warn log, and send failures such as Kafka broker errors or timeouts are ignored entirely. As a result, view events can be lost without operators noticing, and errors accumulate in downstream view-count aggregation and product rankings.

Apply the following:
- Attach a callback to the send() result to record failures as metrics or enqueue them in a failure queue.
- Handle exceptions the send() call itself can throw, beyond JsonProcessingException.
- Add integration tests covering the send-failure and serialization-failure scenarios (the current test covers only the happy path).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@apps/commerce-api/src/main/java/com/loopers/application/event/ProductViewKafkaPublisher.java`
around lines 25 - 38, The handle(ProductViewedEvent) method currently ignores
kafkaTemplate.send() result and only catches JsonProcessingException; capture
the send() ListenableFuture, attach a callback (onSuccess/onFailure) to log
failures, increment a metric and/or enqueue the failed payload to a
retry/failure queue, and also handle exceptions thrown synchronously by send()
by surrounding send with try/catch in addition to the callback; update
references to objectMapper.writeValueAsString and kafkaTemplate.send in
ProductViewKafkaPublisher to ensure both serialization and broker/send errors
are surfaced and recorded, and add integration tests that simulate
JsonProcessingException and Kafka send failures to assert metrics/queueing
behavior.
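The callback pattern this comment asks for can be sketched without a broker. In Spring Kafka 3.x, send() returns a CompletableFuture<SendResult>, so the minimal, self-contained simulation below uses a plain CompletableFuture as a stand-in; the names SendCallbackSketch, publish, and sendFailures are illustrative, not from the PR:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class SendCallbackSketch {
    // Hypothetical failure counter standing in for a real metrics registry.
    static final AtomicInteger sendFailures = new AtomicInteger();

    // Attach a completion callback so broker/timeout failures are observed,
    // mirroring kafkaTemplate.send(topic, key, payload).whenComplete(...).
    static CompletableFuture<String> publish(CompletableFuture<String> sendResult) {
        return sendResult.whenComplete((result, ex) -> {
            if (ex != null) {
                // Record a metric and/or enqueue the payload for retry here.
                sendFailures.incrementAndGet();
            }
        });
    }

    public static void main(String[] args) {
        publish(CompletableFuture.completedFuture("ok"));                       // success: no failure recorded
        publish(CompletableFuture.failedFuture(new RuntimeException("down")));  // failure: counter incremented
        System.out.println(sendFailures.get());
    }
}
```

Synchronous exceptions from the real send() (for example, serialization errors raised before a future exists) still need a surrounding try/catch, since no callback fires for them.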
// 8. Outbox INSERT + 이벤트 발행
List<OrderItemSnapshot> eventItems = order.getItems().stream()
    .map(item -> new OrderItemSnapshot(item.getProductId(), item.getQuantity(), item.getProductPrice()))
    .toList();

domainEventPublisher.publish("order", String.valueOf(order.getId()),
    "ORDER_CREATED",
    Map.of("orderId", order.getId(), "memberId", memberId, "items", eventItems),
    new OrderCreatedEvent(order.getId(), memberId, eventItems));
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# DomainEventPublisherImpl 구현 확인
ast-grep --pattern $'class DomainEventPublisherImpl {
$$$
publish($$$) {
$$$
}
$$$
}'
# Kafka 전송이 AFTER_COMMIT 리스너에서 발생하는지 확인
rg -n "@TransactionalEventListener" --type=java -A 5
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 4583
🏁 Script executed:
# Find DomainEventPublisher interface and implementations
rg -n "interface DomainEventPublisher|class.*DomainEventPublisher" --type=java -A 3
# Find OrderCreatedEvent and its listener
rg -n "class OrderCreatedEvent|@TransactionalEventListener.*OrderCreatedEvent" --type=java -A 5
# Check the publish method signature
rg -n "void publish\(" --type=java -B 2 -A 10
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 3992
🏁 Script executed:
# Find OrderCreatedEvent listener
rg -n "OrderCreatedEvent" --type=java -B 2 -A 8
# Check if there's a listener for OrderCreatedEvent
rg -n "@TransactionalEventListener.*Order|handle.*OrderCreatedEvent" --type=java -A 10
# Look at the full DomainEventPublisherImpl
cat apps/commerce-api/src/main/java/com/loopers/infrastructure/event/DomainEventPublisherImpl.java
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 4631
🏁 Script executed:
# Search for any listener that might handle OrderCreatedEvent
rg -n "OrderCreatedEvent" --type=java
# Get the complete createOrder method from OrderFacade
sed -n '90,128p' apps/commerce-api/src/main/java/com/loopers/application/order/OrderFacade.java
# Look for Kafka publishers in the codebase
rg -n "KafkaTemplate|kafkaTemplate|sendMessage" --type=java -B 2 -A 3 | head -50
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 7452
🏁 Script executed:
# Search for EventOutbox processing/polling
rg -n "EventOutbox|eventOutbox" --type=java | grep -i "process\|poll\|schedule\|publish" | head -20
# Look for scheduled methods that might process outbox
rg -n "@Scheduled|@Async" --type=java -A 5 | grep -i "outbox\|event" -B 2 -A 5
# Search for any processor or poller
rg -n "class.*Outbox.*|class.*Poller" --type=java
# Check test for DomainEventPublisher to understand expected behavior
cat apps/commerce-api/src/test/java/com/loopers/infrastructure/event/DomainEventPublisherImplTest.java
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 8951
🏁 Script executed:
# Search for any listener that handles Order-related events
rg -n "handle.*Order|Order.*Event" --type=java -B 2 -A 5
# Check if EventOutbox has any processor/consumer
rg -n "EventOutbox" --type=java | grep -v "test\|Repository\|domain\|event/DomainEventPublisher"
# Look for all `@TransactionalEventListener` handlers
rg -n "@TransactionalEventListener" --type=java -B 1 | grep "public void\|EventListener" | head -20
Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java
Length of output: 6468
The OrderCreatedEvent listener implementation is missing.
DomainEventPublisherImpl is implemented correctly, but no @TransactionalEventListener handler exists for OrderCreatedEvent. Events saved to EventOutbox are never actually processed, so order-creation events are lost.
The codebase's other domain events (ProductViewedEvent, LikeCreatedEvent) are processed after commit by @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT) handlers; order events need the same. As it stands, when an order is created, downstream systems (notifications, analytics, inventory integration, etc.) never receive the event.
Required implementation:
- Create an OrderCreatedEvent handler class
- Apply @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT)
- Publish to Kafka or run the required business logic
- Implement an OrderCancelledEvent handler with the same pattern
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@apps/commerce-api/src/main/java/com/loopers/application/order/OrderFacade.java`
around lines 117 - 126, The OrderCreatedEvent is never handled after outbox
persistence; create a new listener class (e.g., OrderEventListener) that defines
`@TransactionalEventListener`(phase = TransactionPhase.AFTER_COMMIT) methods to
handle OrderCreatedEvent and OrderCancelledEvent, map the payload to required
DTOs, and invoke the existing DomainEventPublisherImpl or Kafka publishing logic
to emit the event downstream (mirror the pattern used by
ProductViewedEvent/LikeCreatedEvent handlers); ensure methods subscribe to the
domain event types (OrderCreatedEvent, OrderCancelledEvent), run after commit,
and perform the same outbox-to-kafka publish flow or business logic as other
domain listeners.
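The after-commit semantics the missing listener needs can be illustrated framework-free. This is a minimal sketch, not the PR's code: a buffered event list stands in for Spring's transaction-bound event multicaster, and all names (AfterCommitSketch, publishEvent, commit, rollback) are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class AfterCommitSketch {
    // Events recorded during the "transaction", dispatched only on commit —
    // the behavior @TransactionalEventListener(phase = AFTER_COMMIT) provides.
    static final List<String> pending = new ArrayList<>();
    static final List<String> published = new ArrayList<>();

    static void publishEvent(String event) {
        pending.add(event); // buffered until the transaction outcome is known
    }

    static void commit() {
        published.addAll(pending); // flush buffered events to listeners
        pending.clear();
    }

    static void rollback() {
        pending.clear(); // rolled-back transactions publish nothing
    }

    public static void main(String[] args) {
        publishEvent("ORDER_CREATED:1");
        rollback();                       // tx 1 rolls back: event dropped
        publishEvent("ORDER_CREATED:2");
        commit();                         // tx 2 commits: event dispatched
        System.out.println(published);
    }
}
```

The real handler would subscribe to OrderCreatedEvent/OrderCancelledEvent and forward to Kafka, mirroring the ProductViewedEvent/LikeCreatedEvent listeners the review cites.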
@GetMapping("/api/v1/coupons/issue-requests/{requestId}")
public ApiResponse<CouponDto.CouponIssueRequestResponse> getIssueRequest(
    @AuthMember Member member,
    @PathVariable Long requestId
) {
    CouponIssueRequest request = couponFacade.getIssueRequest(requestId);
    return ApiResponse.success(CouponDto.CouponIssueRequestResponse.from(request));
}
A missing authorization check creates an IDOR vulnerability.
The getIssueRequest method injects @AuthMember Member member but never uses it. Any authenticated user can read another user's issue request by requestId, an Insecure Direct Object Reference (IDOR) vulnerability.
In production, a malicious user could enumerate requestId values to inspect other members' coupon issuance status.
🔒 Suggested authorization check
 @GetMapping("/api/v1/coupons/issue-requests/{requestId}")
 public ApiResponse<CouponDto.CouponIssueRequestResponse> getIssueRequest(
     @AuthMember Member member,
     @PathVariable Long requestId
 ) {
     CouponIssueRequest request = couponFacade.getIssueRequest(requestId);
+    if (!request.getMemberId().equals(member.getId())) {
+        throw new CoreException(ErrorType.FORBIDDEN, "본인의 발급 요청만 조회할 수 있습니다.");
+    }
     return ApiResponse.success(CouponDto.CouponIssueRequestResponse.from(request));
 }

Alternatively, accept memberId in the facade layer and validate ownership there:
-CouponIssueRequest request = couponFacade.getIssueRequest(requestId);
+CouponIssueRequest request = couponFacade.getIssueRequest(requestId, member.getId());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@apps/commerce-api/src/main/java/com/loopers/interfaces/api/coupon/CouponController.java`
around lines 43 - 50, The getIssueRequest endpoint
(CouponController.getIssueRequest) ignores the injected `@AuthMember` Member
leading to an IDOR; fix by enforcing authorization either in the controller or
facade: retrieve the CouponIssueRequest via
couponFacade.getIssueRequest(requestId) and verify its owner matches
member.getId() (e.g., compare CouponIssueRequest.getMemberId() to
member.getId()) and throw an access-denied/forbidden error if they differ, or
modify couponFacade.getIssueRequest to accept a memberId parameter and perform
the ownership check there before returning the
CouponIssueRequest/CouponDto.CouponIssueRequestResponse.
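The ownership check itself reduces to a single comparison. A self-contained sketch of the pattern the prompt describes, where IssueRequest and SecurityException are illustrative stand-ins for CouponIssueRequest and CoreException(ErrorType.FORBIDDEN):

```java
public class OwnershipCheckSketch {
    // Minimal stand-in for CouponIssueRequest: an id plus its owning member.
    record IssueRequest(long id, long memberId) {}

    // The fix the review suggests: reject reads of another member's request.
    static IssueRequest getIssueRequest(IssueRequest stored, long authenticatedMemberId) {
        if (stored.memberId() != authenticatedMemberId) {
            throw new SecurityException("Only the owner may view this issue request");
        }
        return stored;
    }

    public static void main(String[] args) {
        IssueRequest req = new IssueRequest(42L, 7L);
        System.out.println(getIssueRequest(req, 7L).id()); // owner: allowed
        try {
            getIssueRequest(req, 8L);                      // non-owner: rejected
        } catch (SecurityException e) {
            System.out.println("forbidden");
        }
    }
}
```

Doing this in the facade (passing member.getId() down) keeps the rule enforced for every caller, not just this controller.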
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentRecoveryService.java (1)
127-142: ⚠️ Potential issue | 🟠 Major
Error handling in restoreStock and restoreCoupon is asymmetric.
ProductFacade.restoreStock() silently ignores a product it cannot find (only the Redis stock is incremented), while CouponFacade.restoreCoupon() throws a CoreException when the coupon is not found. Behavior on recovery failure is unpredictable:
- When coupon restoration fails, the transaction rolls back, but the Redis stock has already been incremented (restoreStock on Line 133 runs first)
- Product restoration can partially succeed (Redis incremented, DB not updated), while coupon restoration fails entirely
When a data-inconsistency incident occurs in production, root-cause analysis becomes difficult.
🛡️ Suggested improvement
- Wrap each restoration step in try-catch and log/alert on failure:
 private void handlePaymentFailure(PaymentModel payment) {
     Order order = orderRepository.findById(payment.getOrderId()).orElse(null);
     if (order == null) return;

-    // 재고 복원 → ProductFacade 위임
-    for (OrderItem item : order.getItems()) {
-        productFacade.restoreStock(item.getProductId(), item.getQuantity());
+    // 재고 복원 — 개별 실패 허용
+    for (OrderItem item : order.getItems()) {
+        try {
+            productFacade.restoreStock(item.getProductId(), item.getQuantity());
+        } catch (Exception e) {
+            log.error("재고 복원 실패: productId={}, quantity={}, error={}",
+                item.getProductId(), item.getQuantity(), e.getMessage());
+        }
     }
     log.info("재고 복원 완료: orderId={}", order.getId());

-    // 쿠폰 복원 → CouponFacade 위임
     if (order.getCouponIssueId() != null) {
-        couponFacade.restoreCoupon(order.getCouponIssueId());
-        log.info("쿠폰 복원 완료: couponIssueId={}", order.getCouponIssueId());
+        try {
+            couponFacade.restoreCoupon(order.getCouponIssueId());
+            log.info("쿠폰 복원 완료: couponIssueId={}", order.getCouponIssueId());
+        } catch (Exception e) {
+            log.error("쿠폰 복원 실패: couponIssueId={}, error={}",
+                order.getCouponIssueId(), e.getMessage());
+        }
     }
 }
- Or make the error handling of ProductFacade.restoreStock and CouponFacade.restoreCoupon consistent
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentRecoveryService.java` around lines 127 - 142, handlePaymentFailure currently calls productFacade.restoreStock and couponFacade.restoreCoupon with different failure semantics causing inconsistent state; wrap each restore in its own try-catch inside handlePaymentFailure (around productFacade.restoreStock in the OrderItem loop and around couponFacade.restoreCoupon) so failures are handled symmetrically: catch exceptions, log detailed context (orderId, item.productId, item.quantity for restoreStock; orderId, couponIssueId for restoreCoupon), and send/trigger an alert/notification instead of letting one failure rollback earlier successful external-side effects (or rethrow only after compensating actions if you choose compensation). Ensure you reference handlePaymentFailure, productFacade.restoreStock, couponFacade.restoreCoupon and use order.getId()/item.getProductId()/order.getCouponIssueId() in the logs/alerts.
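The symmetric handling the fix describes, with each compensation step isolated in its own try-catch, can be sketched generically. This is a hypothetical illustration: runAll and the step lambdas stand in for restoreStock/restoreCoupon, and a returned failure list stands in for logging/alerting:

```java
import java.util.ArrayList;
import java.util.List;

public class CompensationSketch {
    // Run each compensation step independently so one failure does not
    // abort the others — the symmetric handling the review recommends.
    static List<String> runAll(List<Runnable> steps) {
        List<String> failures = new ArrayList<>();
        for (Runnable step : steps) {
            try {
                step.run();
            } catch (Exception e) {
                failures.add(e.getMessage()); // log/alert instead of propagating
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> failures = runAll(List.of(
            () -> {},                                              // stock restore: ok
            () -> { throw new RuntimeException("coupon missing"); } // coupon restore: fails, isolated
        ));
        System.out.println(failures);
    }
}
```

With this shape, a missing coupon no longer rolls back an already-applied Redis stock increment; both outcomes are recorded and can be reconciled later.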
🧹 Nitpick comments (12)
apps/commerce-api/src/main/java/com/loopers/infrastructure/scheduler/WalRecoveryScheduler.java (1)
82-86: fromStatus is recorded correctly, but an explicit variable is recommended for log consistency.
payment.getStatus() returns the status as of before the updateStatusConditionally call, so it is appropriate as fromStatus. Extracting it into a dedicated variable, however, makes the intent explicit and improves maintainability.
♻️ Suggested fix
 PaymentStatus targetStatus = "SUCCESS".equals(pgStatus) ? PaymentStatus.PAID : PaymentStatus.FAILED;
+PaymentStatus fromStatus = payment.getStatus();
 int affected = paymentRepository.updateStatusConditionally(payment.getId(), targetStatus, allowedStatuses);
 if (affected > 0) {
     historyRepository.save(PaymentStatusHistory.create(
-        payment.getId(), payment.getStatus(), targetStatus, "WAL_RECOVERY", null));
+        payment.getId(), fromStatus, targetStatus, "WAL_RECOVERY", null));
     log.info("WAL Recovery 성공: paymentId={}, newStatus={}", payment.getId(), targetStatus);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/infrastructure/scheduler/WalRecoveryScheduler.java` around lines 82 - 86, Extract the current status into a local variable (e.g., fromStatus = payment.getStatus()) before calling paymentRepository.updateStatusConditionally, then pass that fromStatus into historyRepository.save(PaymentStatusHistory.create(...)) and use fromStatus in the log.info call (replace payment.getStatus()) to ensure consistent and explicit from-status recording and logging around updateStatusConditionally, PaymentStatusHistory.create, and log.info.
apps/commerce-api/src/test/java/com/loopers/application/payment/CallbackMissFaultTest.java (1)
62-64: Using Clock.systemDefaultZone() calls for attention to test isolation.
A Clock.systemDefaultZone() is injected into CouponFacade; if the coupon logic ever performs time-based validation such as expiry checks, test results could vary with the moment of execution (flaky). The current test focuses on payment-callback recovery, so there is no immediate problem, but replacing it with Clock.fixed(...) will be more stable when coupon scenarios are extended.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/payment/CallbackMissFaultTest.java` around lines 62 - 64, The test injects a real Clock via Clock.systemDefaultZone() into CouponFacade which can make time-sensitive coupon logic flaky; replace that with a deterministic fixed clock (e.g., create a Clock.fixed(...) with a chosen Instant and ZoneId) and pass that fixed Clock into the CouponFacade constructor (the instantiation that currently uses Clock.systemDefaultZone()) so time-dependent coupon behavior is stable and reproducible across test runs.
apps/commerce-api/src/test/java/com/loopers/application/payment/PaymentCallbackTest.java (1)
65-69: Status-transition history should be verified after callback handling.
Recording PaymentStatusHistory when PaymentRecoveryService handles a callback is one of this PR's goals, yet the test never verifies that behavior. Asserting that history is written after SUCCESS/FAILED callbacks prevents missing-history gaps during incident analysis.
Example additional assertion:
// inside the callback_success_paidPaymentAndOrder test
FakePaymentStatusHistoryRepository historyRepo = ...; // extract as a field
List<PaymentStatusHistory> histories = historyRepo.findAllByPaymentId(payment.getId());
assertThat(histories).isNotEmpty();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/payment/PaymentCallbackTest.java` around lines 65 - 69, The test currently constructs PaymentRecoveryService but doesn't assert that PaymentStatusHistory entries are recorded; update the tests (e.g., callback_success_paidPaymentAndOrder and the FAILED case) to extract the FakePaymentStatusHistoryRepository instance used to create PaymentRecoveryService, call its finder (e.g., findAllByPaymentId(payment.getId()) or equivalent) after invoking PaymentRecoveryService callback handling, and assert the returned list is not empty (and optionally contains the expected status entries). Ensure you reference the existing FakePaymentStatusHistoryRepository instance used in the test setup and PaymentRecoveryService to locate where to add these assertions.
apps/commerce-api/src/test/java/com/loopers/application/payment/DbFailureFaultTest.java (1)
40-40: History-recording verification is missing for WAL recovery.
WalRecoveryScheduler now records a PaymentStatusHistory on successful recovery, but the test does not verify it. Since a FakePaymentStatusHistoryRepository is injected, asserting that a history entry was saved after recovery guarantees history traceability during production incidents.
Example additional test:
// Then: Payment → PAID + WAL file deleted + history recorded
FakePaymentStatusHistoryRepository historyRepo = ...; // extract as a field
assertThat(historyRepo.findAllByPaymentId(payment.getId())).hasSize(1);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/payment/DbFailureFaultTest.java` at line 40, The test currently constructs WalRecoveryScheduler with a FakePaymentStatusHistoryRepository but never asserts that a PaymentStatusHistory was recorded; update the test that initializes walRecovery (WalRecoveryScheduler) to hold the injected FakePaymentStatusHistoryRepository as a field (or local variable) and after invoking the recovery path assert that the fake repo contains the expected history entry for the recovered payment (use FakePaymentStatusHistoryRepository.findAllByPaymentId(payment.getId()) or equivalent) to verify a single PaymentStatusHistory was saved and WAL file removed.
apps/commerce-api/src/test/java/com/loopers/application/payment/ManualRecoveryTest.java (2)
50-55: The anonymous CouponIssueRequestRepository class is duplicated across several test files. The same stub pattern is repeated in ManualRecoveryTest, PaymentRecoveryServiceTest, GhostPaymentFaultTest, and CallbackDlqSchedulerTest. Because a change could be missed during maintenance, consolidating it into a FakeCouponIssueRequestRepository is recommended.
♻️ Suggested FakeCouponIssueRequestRepository extraction
Create apps/commerce-api/src/test/java/com/loopers/fake/FakeCouponIssueRequestRepository.java:
package com.loopers.fake;

import com.loopers.domain.coupon.CouponIssueRequest;
import com.loopers.domain.coupon.CouponIssueRequestRepository;
import java.util.Optional;

public class FakeCouponIssueRequestRepository implements CouponIssueRequestRepository {
    @Override
    public CouponIssueRequest save(CouponIssueRequest request) { return request; }

    @Override
    public Optional<CouponIssueRequest> findById(Long id) { return Optional.empty(); }
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/payment/ManualRecoveryTest.java` around lines 50 - 55, Multiple tests duplicate the same anonymous CouponIssueRequestRepository stub; extract it into a single FakeCouponIssueRequestRepository implementing CouponIssueRequestRepository with the same save(CouponIssueRequest) and findById(Long) behavior, then replace the inline anonymous implementations in ManualRecoveryTest (and other tests like PaymentRecoveryServiceTest, GhostPaymentFaultTest, CallbackDlqSchedulerTest) to construct and pass new FakeCouponIssueRequestRepository() where the local issueRequestRepository was used (e.g., the CouponFacade constructor call).
63-85: The manual-recovery API tests are missing failure paths and boundary cases. The current tests only verify the success transition and the already-final-state case. Adding tests for the following production scenarios is recommended:
- Calling with a nonexistent paymentId raises a CoreException
- Behavior when the PG lookup fails (timeout/error)
- Stock/coupon restoration when the PG returns a FAILED status
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/payment/ManualRecoveryTest.java` around lines 63 - 85, Add three new tests alongside manualConfirm_pgSuccess_transitionToPaid to cover missing failure and edge cases: (1) a test that calls recoveryService.manualConfirm with a non-existent paymentId and asserts it throws CoreException (use paymentRepository to ensure non-existence and assert exception type/message), (2) a test simulating PG lookup failure by configuring pgClient to throw or return an error/timeout when queried for the transaction (use the same payment creation flow and verify manualConfirm handles the PG error path appropriately — e.g., returns an error result or raises a defined exception), and (3) a test where pgClient.registerStatus returns a FAILED status and you verify recoveryService restores inventory/coupons by asserting Order/Stock/Coupon state changes and that PaymentModel transitions to the expected failed/recovered state (use PaymentModel.create/markPending, paymentRepository.save, recoveryService.manualConfirm, and assert on paymentRepository.findById and related repositories). Ensure each test uses unique tx id values and verifies both return value and persisted state changes.
apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentRecoveryService.java (2)
149-165: The @Scheduled method has no transaction boundary.
checkPendingPayments() iterates over multiple payments and calls pollPgStatus(). Inside pollPgStatus(), updateStatusConditionally and handlePaymentSuccess/handlePaymentFailure are invoked, yet each payment is not processed in its own transaction. A single failure can affect the whole polling run, and partial commit/rollback behavior is unpredictable.
♻️ Suggested per-payment transaction split
@Scheduled(fixedRate = 10_000)
public void checkPendingPayments() {
    List<PaymentModel> pendingPayments = paymentRepository.findAllByStatus(PaymentStatus.PENDING);
    List<PaymentModel> unknownPayments = paymentRepository.findAllByStatus(PaymentStatus.UNKNOWN);
    ZonedDateTime threshold = ZonedDateTime.now().minusSeconds(10);
    for (PaymentModel payment : pendingPayments) {
        if (payment.getCreatedAt() != null && payment.getCreatedAt().isBefore(threshold)) {
            pollPgStatusSafely(payment.getId());
        }
    }
    for (PaymentModel payment : unknownPayments) {
        pollPgStatusSafely(payment.getId());
    }
}

// 별도 @Transactional 메서드 또는 TransactionTemplate 사용
private void pollPgStatusSafely(Long paymentId) {
    try {
        transactionTemplate.executeWithoutResult(status -> {
            PaymentModel payment = paymentRepository.findById(paymentId).orElse(null);
            if (payment != null) {
                pollPgStatus(payment);
            }
        });
    } catch (Exception e) {
        log.warn("개별 폴링 실패: paymentId={}, error={}", paymentId, e.getMessage());
    }
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentRecoveryService.java` around lines 149 - 165, checkPendingPayments processes multiple payments in one scheduler invocation without per-payment transaction boundaries; change it to delegate each payment to a separate transactional execution so one payment's failure won't affect others. For example, replace direct calls to pollPgStatus(payment) with a safe delegating method (e.g., pollPgStatusSafely(paymentId)) that either is annotated `@Transactional` or uses TransactionTemplate to load the PaymentModel by id via paymentRepository and call pollPgStatus inside that transactional boundary; ensure you pass the paymentId (not the entity) to avoid self-invocation/proxy issues and catch/log exceptions per payment so failures are isolated and do not abort the whole scheduled run.
102-104: The timing of history recording can cause inconsistency when the transaction fails.
historyRepository.save() runs before handlePaymentSuccess()/handlePaymentFailure(). If the follow-up processing (order status change, stock/coupon restoration) throws, the history is recorded even though the actual processing failed. Consider moving history recording to after successful post-processing, or writing a separate history entry (with failure details in the reason) on failure.
♻️ Suggested history-recording timing adjustment
 if (affected == 0) {
     log.info("조건부 UPDATE 미적용 (이미 처리된 건): paymentId={}, currentStatus={}",
         payment.getId(), payment.getStatus());
     inbox.markProcessed();
     callbackInboxRepository.save(inbox);
     return;
 }

-// 상태 전이 이력 기록
-historyRepository.save(PaymentStatusHistory.create(
-    payment.getId(), payment.getStatus(), targetStatus, "CALLBACK", null));
-
 // 상태 전이 성공
 if (targetStatus == PaymentStatus.PAID) {
     handlePaymentSuccess(payment);
 } else {
     handlePaymentFailure(payment);
 }
+
+// 후처리 성공 후 이력 기록
+historyRepository.save(PaymentStatusHistory.create(
+    payment.getId(), payment.getStatus(), targetStatus, "CALLBACK", null));
+inbox.markProcessed();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentRecoveryService.java` around lines 102 - 104, The history save is happening before downstream processing which can leave inconsistent records if subsequent steps fail; update PaymentRecoveryService so that historyRepository.save(PaymentStatusHistory.create(...)) is performed after handlePaymentSuccess(payment, ...) and handlePaymentFailure(payment, ...) complete successfully, or alternatively wrap the handlers in a try/catch: on success call historyRepository.save(...) with the normal targetStatus, on exception call historyRepository.save(PaymentStatusHistory.create(..., targetStatus, "CALLBACK", failureReason)) and rethrow or handle accordingly; reference the existing historyRepository.save call, PaymentStatusHistory.create, and the handlePaymentSuccess/handlePaymentFailure methods to locate and modify the logic.
apps/commerce-api/src/test/java/com/loopers/application/payment/PaymentRecoveryServiceTest.java (2)
100-126: A recovery test for UNKNOWN-status payments is missing.
PaymentRecoveryService.checkPendingPayments() polls not only PENDING but also UNKNOWN payments (see Line 152-164). UNKNOWN payments are polled immediately without a threshold, so this path needs its own test; it is the critical path in the ghost-payment recovery scenario.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/payment/PaymentRecoveryServiceTest.java` around lines 100 - 126, Add a new test in PaymentRecoveryServiceTest that covers UNKNOWN-status polling: create and save an Order and a PaymentModel, set its status to UNKNOWN (use PaymentModel.markUnknown or equivalent) and set createdAt to now, register the PG client to return an appropriate PgPaymentStatusResponse for that transaction, invoke recoveryService.checkPendingPayments(), and assert the expected recovery behavior (e.g., status transitions, updates or cleanup) for UNKNOWN payments. Reference PaymentRecoveryService.checkPendingPayments() to ensure the test exercises the immediate (no-threshold) polling path for UNKNOWN states and use existing helpers like setCreatedAt(payment, ...) and pgClient.registerStatus(...) to mirror other tests.
64-67: Verification of PaymentStatusHistory persistence is missing.
There is no check that PaymentRecoveryService calls historyRepository.save() on a status transition. History recording is the core audit feature of the 5-layer recovery design, so its persistence should be verified.
♻️ Suggested history-persistence assertion
+private FakePaymentStatusHistoryRepository historyRepository;
+
 @BeforeEach
 void setUp() {
     // ... existing setup ...
+    historyRepository = new FakePaymentStatusHistoryRepository();
     recoveryService = new PaymentRecoveryService(
-        paymentRepository, new FakePaymentStatusHistoryRepository(),
+        paymentRepository, historyRepository,
         callbackInboxRepository, orderRepository,
         productFacade, couponFacade, pgRouter);
 }

 @Test
 void polling_pgSuccess_transitionToPaid() throws Exception {
     // ... existing test code ...
     assertThat(updated.getStatus()).isEqualTo(PaymentStatus.PAID);
+
+    // 상태 전이 이력 검증
+    List<PaymentStatusHistory> histories = historyRepository.findAllByPaymentId(payment.getId());
+    assertThat(histories).hasSize(1);
+    assertThat(histories.get(0).getToStatus()).isEqualTo(PaymentStatus.PAID);
+    assertThat(histories.get(0).getReason()).isEqualTo("POLLING");
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/payment/PaymentRecoveryServiceTest.java` around lines 64 - 67, The test is missing verification that PaymentRecoveryService records transitions by calling historyRepository.save(); update PaymentRecoveryServiceTest to assert this: replace or wrap FakePaymentStatusHistoryRepository with a Mockito mock or spy of PaymentStatusHistoryRepository (or inject a mock into the PaymentRecoveryService constructor), execute the recovery scenario as currently done, then verify via Mockito.verify(mockHistoryRepository).save(...) (or verify(..., times(...))). Reference PaymentRecoveryService, PaymentStatusHistoryRepository (or FakePaymentStatusHistoryRepository) and the test class PaymentRecoveryServiceTest when making the change.
apps/commerce-api/src/test/java/com/loopers/application/payment/GhostPaymentFaultTest.java (1)
81-118: The ghost-payment recovery test lacks verification that the status-transition history (PaymentStatusHistory) is saved. Ghost-payment recovery is an operationally critical scenario, and correct history recording should be verified so the recovery path can be traced; in particular, confirm that the reason is recorded as POLLING for the UNKNOWN → PAID transition.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/application/payment/GhostPaymentFaultTest.java` around lines 81 - 118, Add assertions in GhostPaymentFaultTest.ghostPayment_timeout_polling_recovery to verify a PaymentStatusHistory record was created when the payment transitioned from UNKNOWN to PAID: after recoveryService.checkPendingPayments() and after loading recoveredPayment (use result.paymentId() / recoveredPayment.getId()), query the payment's status history (via PaymentStatusHistoryRepository or paymentRepository method that returns histories) and assert there exists a history entry with fromStatus = PaymentStatus.UNKNOWN, toStatus = PaymentStatus.PAID and reason = "POLLING" (or enum/string constant used), e.g., check the last history entry matches these values.
apps/commerce-api/src/main/java/com/loopers/domain/payment/PaymentStatusHistory.java (1)
36-37: Confirm that the 50-character limit on the reason field accommodates every recovery-path string. The reason values currently in use (CALLBACK, POLLING, WAL_RECOVERY, BATCH_RECOVERY, etc.) are all within 50 characters, but truncation could occur if detailed information is added to reason later. Managing the values as constants or an enum is recommended.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/domain/payment/PaymentStatusHistory.java` around lines 36 - 37, The reason field in PaymentStatusHistory is a plain String with a 50-char column limit which risks truncation if future recovery details are added; replace it with a typed enum (e.g., PaymentRecoveryReason) and store it as a string (use `@Enumerated`(EnumType.STRING) on the reason property) or, if you must keep a String, increase the `@Column`(length=...) to a safe value and centralize valid values as constants; also add a DB migration to alter the column and update usages that set/rely on PaymentStatusHistory.reason to use the new enum or constants.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@apps/commerce-api/src/main/java/com/loopers/domain/payment/PaymentModel.java`:
- Around line 52-63: The pendingTransitions list is being cleared via
clearPendingTransitions() which loses transition history on transaction
rollback; replace this pattern by adding an atomic drain method (e.g.,
drainPendingTransitions()) on PaymentModel that returns and clears a snapshot of
pendingTransitions (List<StatusTransition>) so callers can persist the snapshot
and, on rollback, restore or retry from that snapshot; update callers (notably
PaymentRepositoryImpl) to use drainPendingTransitions() instead of
clearPendingTransitions(), and add a test that saves, forces a rollback, then
re-saves the same entity to assert payment_status_history contains the
transition only once.
In
`@apps/commerce-api/src/main/java/com/loopers/domain/payment/PaymentStatusHistory.java`:
- Around line 42-51: The create() factory in PaymentStatusHistory must validate
required fields to fail fast; add null checks for paymentId, fromStatus,
toStatus, and reason inside PaymentStatusHistory.create(...) and throw an
IllegalArgumentException (or a custom unchecked validation exception) with clear
messages like "paymentId must not be null" for each missing field so the domain
layer surfaces a descriptive error before JPA persistence.
In
`@apps/commerce-api/src/test/java/com/loopers/fake/FakePaymentStatusHistoryRepository.java`:
- Around line 30-35: The fake repository method findAllByPaymentId currently
returns unsorted results from store, causing tests that rely on chronological
order to flake; update FakePaymentStatusHistoryRepository.findAllByPaymentId to
mirror production by sorting the filtered PaymentStatusHistory entries by their
createdAt in ascending order before returning (i.e., after .filter(h ->
h.getPaymentId().equals(paymentId)), sort by h.getCreatedAt() ascending and then
collect to list) so the fake matches
PaymentStatusHistoryRepositoryImpl.findAllByPaymentId behavior.
- Around line 16-28: The save method in FakePaymentStatusHistoryRepository uses
a non-atomic sequence long (sequence++) which can cause ID collisions under
concurrent saves; replace the long sequence with an AtomicLong (e.g., sequence
-> AtomicLong sequence) and use sequence.getAndIncrement() when assigning id
inside save(PaymentStatusHistory), keeping the rest of setBaseEntityId(history,
id), setCreatedAtIfAbsent(history) and store.put(...) logic the same so IDs are
generated atomically and thread-safe.
In
`@apps/commerce-batch/src/main/java/com/loopers/batch/job/paymentrecovery/step/PaymentRecoveryTasklet.java`:
- Around line 63-67: The test schema is missing the payment_status_history DDL
referenced by the native query in PaymentRecoveryTasklet (the
createNativeQuery(...) calls that insert into payment_status_history), so add
the payment_status_history table definition to
apps/commerce-batch/src/test/resources/schema-batch-test.sql and update any
related test fixtures; then add at least one integration test that invokes the
PaymentRecoveryTasklet.execute() (or the enclosing job/step) to assert that
payments with status REQUESTED are updated to FAILED and a corresponding row is
inserted into payment_status_history (verify both status change and history
insert), and ensure similar additions/coverage for the other native-query sites
noted (around the other createNativeQuery calls at the same class).
- Around line 63-72: The insert into payment_status_history is happening before
the payments UPDATE which can produce false history if another process updates
the payment in between; change PaymentRecoveryTasklet so you perform the UPDATE
first (using the existing entityManager.createNativeQuery(... UPDATE
...).setParameter("ids", targetIds).executeUpdate() or by selecting FOR UPDATE)
and then insert history only for the IDs actually updated (derive the updated ID
set from the UPDATE result or by selecting rows locked/updated), or
alternatively lock candidate rows (SELECT ... FOR UPDATE) before making changes;
apply the same change to the other similar blocks (lines around 89-98 and
120-129) and add an integration test that runs the batch and a concurrent
callback to assert no false FAILED history entries are created.
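The ordering fix can be illustrated with an in-memory stand-in (all names here are hypothetical; the real tasklet would do this with the two native queries, deriving the history set from the rows the UPDATE actually touched):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UpdateFirstDemo {
    record History(long paymentId, String from, String to) {}

    public static void main(String[] args) {
        Map<Long, String> payments = new HashMap<>(Map.of(1L, "REQUESTED", 2L, "REQUESTED"));
        List<Long> targetIds = List.of(1L, 2L);

        // A concurrent callback finalizes payment 2 between candidate selection and the batch run.
        payments.put(2L, "PAID");

        // 1) UPDATE first, guarded by the status condition, collecting the IDs actually changed.
        List<Long> updated = new ArrayList<>();
        for (Long id : targetIds) {
            if ("REQUESTED".equals(payments.get(id))) {
                payments.put(id, "FAILED");
                updated.add(id);
            }
        }

        // 2) Insert history only for the confirmed-failed set.
        List<History> history = updated.stream()
                .map(id -> new History(id, "REQUESTED", "FAILED"))
                .toList();

        System.out.println(updated);          // [1]
        System.out.println(history.size());   // 1
        System.out.println(payments.get(2L)); // PAID — no false FAILED history for payment 2
    }
}
```

The key property is that the history rows are derived from the update's confirmed set, never from the pre-fetched candidate list.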
- Around line 87-105: The batch currently passes the pre-fetched targetIds into
restoreStockForFailedPayments() causing over-restoration and missing coupon
restores; change PaymentRecoveryTasklet to capture only the IDs actually updated
to FAILED (e.g., have the INSERT/UPDATE native queries return or select the
affected payment IDs instead of using the original targetIds), pass that precise
list to the shared failure-compensation component (create/use a common component
used by PaymentRecoveryService that restores both stock and coupons), update
restoreStockForFailedPayments() usage to accept only the confirmed-failed IDs
(or replace it with the new shared compensator), and add/instrument integration
tests covering the race where a PENDING becomes PAID mid-batch and cases with
couponIssueId; apply the same fix to the other similar block in this file that
mirrors this logic.
---
Outside diff comments:
In
`@apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentRecoveryService.java`:
- Around line 127-142: handlePaymentFailure currently calls
productFacade.restoreStock and couponFacade.restoreCoupon with different failure
semantics causing inconsistent state; wrap each restore in its own try-catch
inside handlePaymentFailure (around productFacade.restoreStock in the OrderItem
loop and around couponFacade.restoreCoupon) so failures are handled
symmetrically: catch exceptions, log detailed context (orderId, item.productId,
item.quantity for restoreStock; orderId, couponIssueId for restoreCoupon), and
send/trigger an alert/notification instead of letting one failure rollback
earlier successful external-side effects (or rethrow only after compensating
actions if you choose compensation). Ensure you reference handlePaymentFailure,
productFacade.restoreStock, couponFacade.restoreCoupon and use
order.getId()/item.getProductId()/order.getCouponIssueId() in the logs/alerts.
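The symmetric try-catch shape can be sketched as follows. The Restore interface is a hypothetical stand-in; in the real service the steps would be productFacade.restoreStock and couponFacade.restoreCoupon, and the println would be structured logging plus the project's alerting mechanism:

```java
import java.util.List;

public class CompensationDemo {
    interface Restore { void run() throws Exception; }

    // Attempt every compensation step; a failure in one must not skip the others.
    static int restoreAll(List<Restore> steps) {
        int failures = 0;
        for (Restore step : steps) {
            try {
                step.run();
            } catch (Exception e) {
                failures++;
                // Real code: log orderId/productId/quantity (or couponIssueId) and send an alert.
                System.out.println("compensation step failed: " + e.getMessage());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int failures = restoreAll(List.of(
                () -> { throw new IllegalStateException("stock restore failed"); },
                () -> System.out.println("coupon restored") // still runs despite the failure above
        ));
        System.out.println("failures=" + failures);
    }
}
```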
---
Nitpick comments:
In
`@apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentRecoveryService.java`:
- Around line 149-165: checkPendingPayments processes multiple payments in one
scheduler invocation without per-payment transaction boundaries; change it to
delegate each payment to a separate transactional execution so one payment's
failure won't affect others. For example, replace direct calls to
pollPgStatus(payment) with a safe delegating method (e.g.,
pollPgStatusSafely(paymentId)) that either is annotated `@Transactional` or uses
TransactionTemplate to load the PaymentModel by id via paymentRepository and
call pollPgStatus inside that transactional boundary; ensure you pass the
paymentId (not the entity) to avoid self-invocation/proxy issues and catch/log
exceptions per payment so failures are isolated and do not abort the whole
scheduled run.
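The per-payment isolation can be sketched like this; the runInNewTransaction helper stands in for a TransactionTemplate.execute(...) boundary, and all names are illustrative:

```java
import java.util.List;
import java.util.function.Consumer;

public class PerPaymentIsolationDemo {
    // Stand-in for TransactionTemplate.execute: each payment gets its own boundary.
    static void runInNewTransaction(Runnable work) { work.run(); }

    static void pollPgStatusSafely(long paymentId, Consumer<Long> pollPgStatus) {
        try {
            // Pass the id and load the entity *inside* the boundary
            // (avoids detached entities and self-invocation proxy issues).
            runInNewTransaction(() -> pollPgStatus.accept(paymentId));
        } catch (Exception e) {
            // One payment's failure is logged and must not abort the scheduled run.
            System.out.println("recovery failed for payment " + paymentId + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        Consumer<Long> pollPgStatus = id -> {
            if (id == 2L) throw new IllegalStateException("PG timeout");
            System.out.println("recovered payment " + id);
        };
        for (long id : List.of(1L, 2L, 3L)) {
            pollPgStatusSafely(id, pollPgStatus);
        }
    }
}
```

Payment 2 fails, but payments 1 and 3 are still processed, which is the behavior the comment asks for.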
- Around line 102-104: The history save is happening before downstream
processing which can leave inconsistent records if subsequent steps fail; update
PaymentRecoveryService so that
historyRepository.save(PaymentStatusHistory.create(...)) is performed after
handlePaymentSuccess(payment, ...) and handlePaymentFailure(payment, ...)
complete successfully, or alternatively wrap the handlers in a try/catch: on
success call historyRepository.save(...) with the normal targetStatus, on
exception call historyRepository.save(PaymentStatusHistory.create(...,
targetStatus, "CALLBACK", failureReason)) and rethrow or handle accordingly;
reference the existing historyRepository.save call, PaymentStatusHistory.create,
and the handlePaymentSuccess/handlePaymentFailure methods to locate and modify
the logic.
In
`@apps/commerce-api/src/main/java/com/loopers/domain/payment/PaymentStatusHistory.java`:
- Around line 36-37: The reason field in PaymentStatusHistory is a plain String
with a 50-char column limit which risks truncation if future recovery details
are added; replace it with a typed enum (e.g., PaymentRecoveryReason) and store
it as a string (use `@Enumerated`(EnumType.STRING) on the reason property) or, if
you must keep a String, increase the `@Column`(length=...) to a safe value and
centralize valid values as constants; also add a DB migration to alter the
column and update usages that set/rely on PaymentStatusHistory.reason to use the
new enum or constants.
In
`@apps/commerce-api/src/main/java/com/loopers/infrastructure/scheduler/WalRecoveryScheduler.java`:
- Around line 82-86: Extract the current status into a local variable (e.g.,
fromStatus = payment.getStatus()) before calling
paymentRepository.updateStatusConditionally, then pass that fromStatus into
historyRepository.save(PaymentStatusHistory.create(...)) and use fromStatus in
the log.info call (replace payment.getStatus()) to ensure consistent and
explicit from-status recording and logging around updateStatusConditionally,
PaymentStatusHistory.create, and log.info.
In
`@apps/commerce-api/src/test/java/com/loopers/application/payment/CallbackMissFaultTest.java`:
- Around line 62-64: The test injects a real Clock via Clock.systemDefaultZone()
into CouponFacade which can make time-sensitive coupon logic flaky; replace that
with a deterministic fixed clock (e.g., create a Clock.fixed(...) with a chosen
Instant and ZoneId) and pass that fixed Clock into the CouponFacade constructor
(the instantiation that currently uses Clock.systemDefaultZone()) so
time-dependent coupon behavior is stable and reproducible across test runs.
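A fixed clock can be built with the standard java.time API; every read returns the same instant, which is what makes time-dependent assertions reproducible (the chosen instant below is arbitrary):

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

public class FixedClockDemo {
    public static void main(String[] args) throws InterruptedException {
        Instant frozen = Instant.parse("2024-06-01T00:00:00Z");
        Clock clock = Clock.fixed(frozen, ZoneOffset.UTC);

        Instant first = Instant.now(clock);
        Thread.sleep(50);                    // wall time passes...
        Instant second = Instant.now(clock); // ...but the fixed clock does not move

        System.out.println(first.equals(frozen) && first.equals(second)); // true
    }
}
```

Passing such a Clock into the CouponFacade constructor in place of Clock.systemDefaultZone() makes the test deterministic across runs.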
In
`@apps/commerce-api/src/test/java/com/loopers/application/payment/DbFailureFaultTest.java`:
- Line 40: The test currently constructs WalRecoveryScheduler with a
FakePaymentStatusHistoryRepository but never asserts that a PaymentStatusHistory
was recorded; update the test that initializes walRecovery
(WalRecoveryScheduler) to hold the injected FakePaymentStatusHistoryRepository
as a field (or local variable) and after invoking the recovery path assert that
the fake repo contains the expected history entry for the recovered payment (use
FakePaymentStatusHistoryRepository.findAllByPaymentId(payment.getId()) or
equivalent) to verify a single PaymentStatusHistory was saved and WAL file
removed.
In
`@apps/commerce-api/src/test/java/com/loopers/application/payment/GhostPaymentFaultTest.java`:
- Around line 81-118: Add assertions in
GhostPaymentFaultTest.ghostPayment_timeout_polling_recovery to verify a
PaymentStatusHistory record was created when the payment transitioned from
UNKNOWN to PAID: after recoveryService.checkPendingPayments() and after loading
recoveredPayment (use result.paymentId() / recoveredPayment.getId()), query the
payment's status history (via PaymentStatusHistoryRepository or
paymentRepository method that returns histories) and assert there exists a
history entry with fromStatus = PaymentStatus.UNKNOWN, toStatus =
PaymentStatus.PAID and reason = "POLLING" (or enum/string constant used), e.g.,
check the last history entry matches these values.
In
`@apps/commerce-api/src/test/java/com/loopers/application/payment/ManualRecoveryTest.java`:
- Around line 50-55: Multiple tests duplicate the same anonymous
CouponIssueRequestRepository stub; extract it into a single
FakeCouponIssueRequestRepository implementing CouponIssueRequestRepository with
the same save(CouponIssueRequest) and findById(Long) behavior, then replace the
inline anonymous implementations in ManualRecoveryTest (and other tests like
PaymentRecoveryServiceTest, GhostPaymentFaultTest, CallbackDlqSchedulerTest) to
construct and pass new FakeCouponIssueRequestRepository() where the local
issueRequestRepository was used (e.g., the CouponFacade constructor call).
- Around line 63-85: Add three new tests alongside
manualConfirm_pgSuccess_transitionToPaid to cover missing failure and edge
cases: (1) a test that calls recoveryService.manualConfirm with a non-existent
paymentId and asserts it throws CoreException (use paymentRepository to ensure
non-existence and assert exception type/message), (2) a test simulating PG
lookup failure by configuring pgClient to throw or return an error/timeout when
queried for the transaction (use the same payment creation flow and verify
manualConfirm handles the PG error path appropriately — e.g., returns an error
result or raises a defined exception), and (3) a test where
pgClient.registerStatus returns a FAILED status and you verify recoveryService
restores inventory/coupons by asserting Order/Stock/Coupon state changes and
that PaymentModel transitions to the expected failed/recovered state (use
PaymentModel.create/markPending, paymentRepository.save,
recoveryService.manualConfirm, and assert on paymentRepository.findById and
related repositories). Ensure each test uses unique tx id values and verifies
both return value and persisted state changes.
In
`@apps/commerce-api/src/test/java/com/loopers/application/payment/PaymentCallbackTest.java`:
- Around line 65-69: The test currently constructs PaymentRecoveryService but
doesn't assert that PaymentStatusHistory entries are recorded; update the tests
(e.g., callback_success_paidPaymentAndOrder and the FAILED case) to extract the
FakePaymentStatusHistoryRepository instance used to create
PaymentRecoveryService, call its finder (e.g.,
findAllByPaymentId(payment.getId()) or equivalent) after invoking
PaymentRecoveryService callback handling, and assert the returned list is not
empty (and optionally contains the expected status entries). Ensure you
reference the existing FakePaymentStatusHistoryRepository instance used in the
test setup and PaymentRecoveryService to locate where to add these assertions.
In
`@apps/commerce-api/src/test/java/com/loopers/application/payment/PaymentRecoveryServiceTest.java`:
- Around line 100-126: Add a new test in PaymentRecoveryServiceTest that covers
UNKNOWN-status polling: create and save an Order and a PaymentModel, set its
status to UNKNOWN (use PaymentModel.markUnknown or equivalent) and set createdAt
to now, register the PG client to return an appropriate PgPaymentStatusResponse
for that transaction, invoke recoveryService.checkPendingPayments(), and assert
the expected recovery behavior (e.g., status transitions, updates or cleanup)
for UNKNOWN payments. Reference PaymentRecoveryService.checkPendingPayments() to
ensure the test exercises the immediate (no-threshold) polling path for UNKNOWN
states and use existing helpers like setCreatedAt(payment, ...) and
pgClient.registerStatus(...) to mirror other tests.
- Around line 64-67: The test is missing verification that
PaymentRecoveryService records transitions by calling historyRepository.save();
update PaymentRecoveryServiceTest to assert this: replace or wrap
FakePaymentStatusHistoryRepository with a Mockito mock or spy of
PaymentStatusHistoryRepository (or inject a mock into the PaymentRecoveryService
constructor), execute the recovery scenario as currently done, then verify via
Mockito.verify(mockHistoryRepository).save(...) (or verify(..., times(...))).
Reference PaymentRecoveryService, PaymentStatusHistoryRepository (or
FakePaymentStatusHistoryRepository) and the test class
PaymentRecoveryServiceTest when making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 51b89b64-826a-47de-b91b-452d8fbd53ef
⛔ Files ignored due to path filters (2)
- blog/week7-data-management-patterns.md is excluded by `!**/*.md` and included by `**`
- blog/week7-decomposition-analysis.md is excluded by `!**/*.md` and included by `**`
📒 Files selected for processing (19)
- apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentRecoveryService.java
- apps/commerce-api/src/main/java/com/loopers/domain/payment/PaymentModel.java
- apps/commerce-api/src/main/java/com/loopers/domain/payment/PaymentStatusHistory.java
- apps/commerce-api/src/main/java/com/loopers/domain/payment/PaymentStatusHistoryRepository.java
- apps/commerce-api/src/main/java/com/loopers/infrastructure/payment/PaymentRepositoryImpl.java
- apps/commerce-api/src/main/java/com/loopers/infrastructure/payment/PaymentStatusHistoryJpaRepository.java
- apps/commerce-api/src/main/java/com/loopers/infrastructure/payment/PaymentStatusHistoryRepositoryImpl.java
- apps/commerce-api/src/main/java/com/loopers/infrastructure/scheduler/WalRecoveryScheduler.java
- apps/commerce-api/src/test/java/com/loopers/application/payment/CallbackMissFaultTest.java
- apps/commerce-api/src/test/java/com/loopers/application/payment/DbFailureFaultTest.java
- apps/commerce-api/src/test/java/com/loopers/application/payment/GhostPaymentFaultTest.java
- apps/commerce-api/src/test/java/com/loopers/application/payment/ManualRecoveryTest.java
- apps/commerce-api/src/test/java/com/loopers/application/payment/PaymentCallbackTest.java
- apps/commerce-api/src/test/java/com/loopers/application/payment/PaymentRecoveryServiceTest.java
- apps/commerce-api/src/test/java/com/loopers/domain/payment/PaymentModelTest.java
- apps/commerce-api/src/test/java/com/loopers/domain/payment/PaymentStatusHistoryTest.java
- apps/commerce-api/src/test/java/com/loopers/fake/FakePaymentStatusHistoryRepository.java
- apps/commerce-api/src/test/java/com/loopers/infrastructure/scheduler/CallbackDlqSchedulerTest.java
- apps/commerce-batch/src/main/java/com/loopers/batch/job/paymentrecovery/step/PaymentRecoveryTasklet.java
✅ Files skipped from review due to trivial changes (1)
- apps/commerce-api/src/test/java/com/loopers/domain/payment/PaymentStatusHistoryTest.java
```java
@Transient
private final List<StatusTransition> pendingTransitions = new ArrayList<>();

public record StatusTransition(PaymentStatus from, PaymentStatus to, String reason, String detail) {}

public List<StatusTransition> getPendingTransitions() {
    return Collections.unmodifiableList(pendingTransitions);
}

public void clearPendingTransitions() {
    pendingTransitions.clear();
}
```
Consuming the transition queue with clear() makes it easy to lose history after a rollback.
apps/commerce-api/src/main/java/com/loopers/infrastructure/payment/PaymentRepositoryImpl.java empties this queue immediately after saving, so if the same transaction later rolls back, the payment_status_history rows are cancelled along with it while the entity no longer holds the transition data in memory. In production this means a retry leaves gaps in the payment status history, breaking audit trails and incident analysis. Instead of a plain clear(), it is safer to provide a drainPendingTransitions() that atomically takes a snapshot (paired with a restore strategy on rollback), or to handle the status change and the history persistence explicitly in the same transaction at the application-service layer. An integration test should re-save the same entity after a rollback and assert the transition is recorded exactly once.
```java
public static PaymentStatusHistory create(Long paymentId, PaymentStatus fromStatus,
                                          PaymentStatus toStatus, String reason, String detail) {
    PaymentStatusHistory h = new PaymentStatusHistory();
    h.paymentId = paymentId;
    h.fromStatus = fromStatus;
    h.toStatus = toStatus;
    h.reason = reason;
    h.detail = detail;
    return h;
}
```
The create() factory method is missing null validation for its required fields.
paymentId, fromStatus, toStatus, and reason are nullable = false in the DB schema, but the factory does not validate them. If null is passed, a ConstraintViolationException only surfaces at JPA persistence time, which makes debugging harder. The domain layer should validate early and provide clear error messages.
🛡️ Suggested change: add null validation
```diff
 public static PaymentStatusHistory create(Long paymentId, PaymentStatus fromStatus,
                                           PaymentStatus toStatus, String reason, String detail) {
+    if (paymentId == null) {
+        throw new IllegalArgumentException("paymentId must not be null");
+    }
+    if (fromStatus == null || toStatus == null) {
+        throw new IllegalArgumentException("fromStatus and toStatus must not be null");
+    }
+    if (reason == null || reason.isBlank()) {
+        throw new IllegalArgumentException("reason must not be blank");
+    }
     PaymentStatusHistory h = new PaymentStatusHistory();
     h.paymentId = paymentId;
     h.fromStatus = fromStatus;
     h.toStatus = toStatus;
     h.reason = reason;
     h.detail = detail;
     return h;
 }
```
```java
private final Map<Long, PaymentStatusHistory> store = new ConcurrentHashMap<>();
private long sequence = 1L;

@Override
public PaymentStatusHistory save(PaymentStatusHistory history) {
    if (history.getId() == null || history.getId() == 0L) {
        long id = sequence++;
        setBaseEntityId(history, id);
    }
    setCreatedAtIfAbsent(history);
    store.put(history.getId(), history);
    return history;
}
```
The sequence++ operation is not thread-safe.
Although a ConcurrentHashMap is used, sequence++ is not atomic, so concurrent saves can produce ID collisions. The probability is low in a test environment, but it can cause flaky tests when tests run in parallel.
🔒️ Suggested change: use AtomicLong
```diff
+import java.util.concurrent.atomic.AtomicLong;
+
 public class FakePaymentStatusHistoryRepository implements PaymentStatusHistoryRepository {
     private final Map<Long, PaymentStatusHistory> store = new ConcurrentHashMap<>();
-    private long sequence = 1L;
+    private final AtomicLong sequence = new AtomicLong(1L);

     @Override
     public PaymentStatusHistory save(PaymentStatusHistory history) {
         if (history.getId() == null || history.getId() == 0L) {
-            long id = sequence++;
+            long id = sequence.getAndIncrement();
             setBaseEntityId(history, id);
         }
```
```java
@Override
public List<PaymentStatusHistory> findAllByPaymentId(Long paymentId) {
    return store.values().stream()
            .filter(h -> h.getPaymentId().equals(paymentId))
            .toList();
}
```
findAllByPaymentId does not guarantee ordering, unlike the production implementation.
The production PaymentStatusHistoryRepositoryImpl.findAllByPaymentId() returns results sorted by createdAt in ascending order. The fake does not sort, so tests that depend on history order may fail or become flaky.
♻️ Suggested change: add sorting
```diff
 @Override
 public List<PaymentStatusHistory> findAllByPaymentId(Long paymentId) {
     return store.values().stream()
             .filter(h -> h.getPaymentId().equals(paymentId))
+            .sorted(Comparator.comparing(PaymentStatusHistory::getCreatedAt))
             .toList();
 }
```
```java
entityManager.createNativeQuery(
    "INSERT INTO payment_status_history (payment_id, from_status, to_status, reason, detail, created_at, updated_at) " +
    "SELECT id, 'REQUESTED', 'FAILED', 'BATCH_RECOVERY', '배치 복구: PG 호출 누락 (1분 초과)', NOW(), NOW() " +
    "FROM payments WHERE id IN :ids AND status = 'REQUESTED' AND deleted_at IS NULL"
).setParameter("ids", targetIds).executeUpdate();
```
The batch test schema and verification cases should land together with the new SQL path.
This change assumes a payment_status_history table, but apps/commerce-batch/src/test/resources/schema-batch-test.sql does not yet contain its DDL. As long as this path stays untested, a production batch SQL error will slip past CI and only surface on the first real run. Add the table to the test schema, and include at least one batch integration test verifying that execute() performs both the status update and the history insert.
Also applies to: 89-93, 120-124
```diff
+entityManager.createNativeQuery(
+    "INSERT INTO payment_status_history (payment_id, from_status, to_status, reason, detail, created_at, updated_at) " +
+    "SELECT id, 'REQUESTED', 'FAILED', 'BATCH_RECOVERY', '배치 복구: PG 호출 누락 (1분 초과)', NOW(), NOW() " +
+    "FROM payments WHERE id IN :ids AND status = 'REQUESTED' AND deleted_at IS NULL"
+).setParameter("ids", targetIds).executeUpdate();
+
 int count = entityManager.createNativeQuery(
     "UPDATE payments SET status = 'FAILED', failure_reason = '배치 복구: PG 호출 누락 (1분 초과)' " +
-    "WHERE status = 'REQUESTED' AND created_at < NOW() - INTERVAL 1 MINUTE AND deleted_at IS NULL"
-).executeUpdate();
-log.info("[PaymentRecovery] REQUESTED → FAILED: {}건", requestedCount);
+    "WHERE id IN :ids AND status = 'REQUESTED' AND deleted_at IS NULL"
+).setParameter("ids", targetIds).executeUpdate();
```
The history INSERT precedes the status UPDATE, so a race can leave false history.
Currently the candidate IDs are selected first, the history rows are inserted, and the status is updated afterwards. If a callback or another recovery path finalizes the same payment in between, the update can affect zero rows while payment_status_history already holds a FAILED transition, leaving operational data and the audit log inconsistent. It is safer to lock the target rows and record history only for the set actually transitioned to FAILED, or to reverse the order and derive the history rows from the IDs the update confirms. Add an integration test in which the batch and a callback process the same payment concurrently and assert that no false history remains.
Also applies to: 89-98, 120-129
```diff
 List<Long> targetIds = ids.stream().map(Number::longValue).toList();

+entityManager.createNativeQuery(
+    "INSERT INTO payment_status_history (payment_id, from_status, to_status, reason, detail, created_at, updated_at) " +
+    "SELECT id, 'PENDING', 'FAILED', 'BATCH_RECOVERY', '배치 복구: 콜백 미수신 (5분 초과)', NOW(), NOW() " +
+    "FROM payments WHERE id IN :ids AND status = 'PENDING' AND deleted_at IS NULL"
+).setParameter("ids", targetIds).executeUpdate();
+
-// 2. PENDING — 5분 이상 경과 → FAILED
-int pendingCount = entityManager.createNativeQuery(
+int count = entityManager.createNativeQuery(
     "UPDATE payments SET status = 'FAILED', failure_reason = '배치 복구: 콜백 미수신 (5분 초과)' " +
-    "WHERE status = 'PENDING' AND created_at < NOW() - INTERVAL 5 MINUTE AND deleted_at IS NULL"
-).executeUpdate();
-log.info("[PaymentRecovery] PENDING(5분+) → FAILED: {}건", pendingCount);
-
-// 3. UNKNOWN — 일괄 FAILED 처리 (PG 확인 불가 상태)
-int unknownCount = entityManager.createNativeQuery(
-    "UPDATE payments SET status = 'FAILED', failure_reason = '배치 복구: UNKNOWN 타임아웃' " +
-    "WHERE status = 'UNKNOWN' AND created_at < NOW() - INTERVAL 10 MINUTE AND deleted_at IS NULL"
-).executeUpdate();
-log.info("[PaymentRecovery] UNKNOWN(10분+) → FAILED: {}건", unknownCount);
-
-// 4. FAILED 전환된 결제건의 재고 복원 (PENDING → FAILED 건만, order_item 기반)
-if (pendingCount > 0) {
-    List<Object[]> failedPayments = entityManager.createNativeQuery(
-        "SELECT p.order_id FROM payments p " +
-        "WHERE p.status = 'FAILED' AND p.failure_reason LIKE '%콜백 미수신%' " +
-        "AND p.updated_at >= NOW() - INTERVAL 1 MINUTE AND p.deleted_at IS NULL"
-    ).getResultList();
-
-    for (Object[] row : failedPayments) {
-        Long orderId = ((Number) row[0]).longValue();
-        int restored = entityManager.createNativeQuery(
-            "UPDATE product p INNER JOIN order_item oi ON p.id = oi.product_id " +
-            "INNER JOIN orders o ON oi.order_id = o.id " +
-            "SET p.stock_quantity = p.stock_quantity + oi.quantity " +
-            "WHERE o.id = :orderId AND p.deleted_at IS NULL"
-        ).setParameter("orderId", orderId).executeUpdate();
-        log.info("[PaymentRecovery] 재고 복원: orderId={}, items={}", orderId, restored);
-    }
+    "WHERE id IN :ids AND status = 'PENDING' AND deleted_at IS NULL"
+).setParameter("ids", targetIds).executeUpdate();
+log.info("[PaymentRecovery] PENDING(5분+) → FAILED: {}건", count);
+
+// FAILED 전환된 결제건의 재고 복원
+if (count > 0) {
+    restoreStockForFailedPayments(targetIds);
 }
```
The batch-only failure compensation causes over-restoration and missing coupon restores at the same time.
Currently, whenever count > 0, the pre-fetched targetIds are passed wholesale to restoreStockForFailedPayments(), and since that method re-reads payments with no status condition, even payments that dropped out of the UPDATE become restoration targets. In production this means stock is restored again for orders that a callback already marked successful, and unlike apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentRecoveryService.java, coupon restoration is missing here, so the same failure transition leaves an order in a different state depending on which path handled it. It is safer to extract only the payment IDs that were actually updated to FAILED and have a failure-compensation component shared by the API and the batch restore both stock and coupons together. A batch integration test should be added covering the race where one PENDING candidate flips to PAID mid-run, and orders that carry a couponIssueId.
Also applies to: 136-149
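The suggested fix — compensating only the rows the UPDATE actually transitioned — can be reduced to a plain-Java sketch (the class, enum, and method names below are invented for illustration; the real code issues native SQL against MySQL):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical illustration of the reviewer's suggestion: derive the
// compensation targets from the rows the UPDATE actually touched, not from
// the pre-fetched candidate list.
public class ConfirmedFailedIds {

    enum Status { PENDING, PAID, FAILED }

    // Simulates "UPDATE payments SET status='FAILED' WHERE id IN :ids AND status='PENDING'"
    // and returns the IDs that were actually transitioned — the only safe
    // input for stock/coupon compensation.
    static List<Long> failConfirmedPending(Map<Long, Status> payments, List<Long> candidateIds) {
        List<Long> confirmed = candidateIds.stream()
                .filter(id -> payments.get(id) == Status.PENDING) // row still matches the WHERE clause
                .collect(Collectors.toList());
        confirmed.forEach(id -> payments.put(id, Status.FAILED)); // the UPDATE itself
        return confirmed;
    }

    public static void main(String[] args) {
        Map<Long, Status> payments = new ConcurrentHashMap<>(Map.of(
                1L, Status.PENDING,
                2L, Status.PAID,    // callback won the race before the batch ran
                3L, Status.PENDING));
        List<Long> confirmed = failConfirmedPending(payments, List.of(1L, 2L, 3L));
        // Only 1 and 3 were really failed by the batch; 2 must not be compensated.
        if (!confirmed.equals(List.of(1L, 3L))) throw new AssertionError(confirmed);
        System.out.println(confirmed); // [1, 3]
    }
}
```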
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@apps/commerce-batch/src/main/java/com/loopers/batch/job/paymentrecovery/step/PaymentRecoveryTasklet.java`
around lines 87 - 105, The batch currently passes the pre-fetched targetIds into
restoreStockForFailedPayments() causing over-restoration and missing coupon
restores; change PaymentRecoveryTasklet to capture only the IDs actually updated
to FAILED (e.g., have the INSERT/UPDATE native queries return or select the
affected payment IDs instead of using the original targetIds), pass that precise
list to the shared failure-compensation component (create/use a common component
used by PaymentRecoveryService that restores both stock and coupons), update
restoreStockForFailedPayments() usage to accept only the confirmed-failed IDs
(or replace it with the new shared compensator), and add/instrument integration
tests covering the race where a PENDING becomes PAID mid-batch and cases with
couponIssueId; apply the same fix to the other similar block in this file that
mirrors this logic.
📌 Summary
🧭 Context & Decision
Problem definition
Options and decisions
A. The ApplicationEvent boundary — "should this be split out as an event?"

Registering a like triggers three side tasks: (1) incrementing `Product.like_count`, (2) invalidating the Redis cache, and (3) aggregating `product_metrics`. The key judgment call was which of these belong inside the same transaction and where the event boundary should start.

Final decision: `@TransactionalEventListener(phase = AFTER_COMMIT)` — after the like INSERT commits, the like_count increment and the cache invalidation run asynchronously; `product_metrics` aggregation is handled by the Kafka consumer.

Trade-off: the like_count update may lag by a few milliseconds. But the like count does not need to be exactly real-time, and having the like itself roll back because aggregation failed would be the bigger problem. `LikeCountEventListener` runs best-effort; on failure, the Kafka consumer eventually reconciles the count.

B. Transactional Outbox + CDC vs direct Kafka publishing
When propagating events to Kafka, the core question was how to guarantee atomicity between the business transaction and message publishing.

Final decision: `EventOutbox` INSERT + Debezium CDC. `DomainEventPublisherImpl` INSERTs an `EventOutbox` row in the same transaction, and Debezium reads the MySQL binlog and publishes to the Kafka topic. The outbox table is stateless — no status column, INSERT only, and a cleanup batch deletes rows an hour later.

Trade-off: the Debezium connector adds an infrastructure dependency. But compared to polling, delivery latency is nearly zero, and with no state to manage in the outbox table, operations stay simple. Since the payment domain's `PaymentOutbox` already uses status-based polling, the two approaches can be compared side by side.

C. MetricsConsumer — BATCH listener + INSERT-first idempotency
When the consumer receives events, the core questions were how to filter out duplicate events, and whether to process in batches or one record at a time.

Final decision: BATCH listener + INSERT-first idempotency, with duplicates blocked by `event_handled` INSERT IGNORE:

- `event_handled` with `INSERT IGNORE` — skip if the eventId was already processed
- `product_metrics` with `ON DUPLICATE KEY UPDATE` — delta-based UPSERT

Trade-off: the `event_handled` table keeps growing. It is managed by a cleanup batch (`EventHandledCleanupTasklet`) after 7 days. Redelivery of the same event within 7 days remains idempotent, and redelivery after 7 days does not realistically occur.

D. First-come-first-served coupons — Redis atomic vs sequential Kafka processing
With 10,000 users hitting a 100-coupon limited drop at once, we had to choose how to issue exactly 100 coupons with no overrun. The candidates included Redis atomic operations, a DB CAS UPDATE guard (`WHERE issued_count < max_issuance_count`), and `event_handled` INSERT IGNORE + a DB UNIQUE constraint.

Final decision: Kafka + CAS UPDATE. The API only persists a `CouponIssueRequest` to the DB and publishes to Kafka. `CouponIssueConsumer` then proceeds sequentially:

1. `event_handled` INSERT IGNORE → blocks duplicate processing
2. `coupon` CAS UPDATE (`issued_count + 1 WHERE issued_count < max`) → quantity control
3. `coupon_issue` INSERT → UNIQUE(coupon_id, member_id) blocks duplicate issuance
4. `coupon_issue_request` status UPDATE → COMPLETED / REJECTED
CouponIssueRequest의 status를 polling하는 구조로 보완한다. Redis 방식 대비 응답 지연이 있지만, 이벤트 파이프라인 인프라를 재활용하고 Consumer 스케일아웃이 가능하다는 장점이 있다.🏗️ Design Overview
Part 1: ApplicationEvent boundary separation

| Event | Listener | Action |
| --- | --- | --- |
| `LikeCreatedEvent` | `LikeCountEventListener` | increment `Product.like_count` (async, best-effort) |
| `LikeRemovedEvent` | `LikeCountEventListener` | decrement `Product.like_count` (async, best-effort) |
| `LikeCreatedEvent` | `CacheEvictionEventListener` | Redis cache invalidation |
| `LikeRemovedEvent` | `CacheEvictionEventListener` | Redis cache invalidation |

Part 2: Kafka event pipeline

- `DomainEventPublisher` → `EventOutbox`
- `MetricsConsumer` → `product_metrics` UPSERT
- `OutboxCleanupTasklet`, `EventHandledCleanupTasklet`

Part 3: first-come-first-served coupon issuance

- `CouponFacade.requestCouponIssue()`
- `CouponIssueConsumer`
- `CouponIssueRequest`

Part 4: improvements based on the pattern review
🔍 MSA design patterns in use

Six MSA design patterns applied to the event pipeline:

- Dedicated topics (`catalog-events`, `order-events`); `MetricsConsumer`/`CouponIssueConsumer` subscribe independently
- `product_metrics` table — the Kafka consumer pre-aggregates via `ON DUPLICATE KEY UPDATE`; `MetricsReconcileJob` periodically corrects drift
- Redis reads routed via `ReadFrom.REPLICA_PREFERRED`
MetricsConsumer throughput

- `maxPollRecords=3,000`, `fetchMinBytes=1MB`, `fetchMaxWait=5s`

First-come-first-served coupon concurrency

- CAS UPDATE with `affected=0` → handled as REJECTED
- `issued_count` rollback on UNIQUE violation

PaymentStatusHistory INSERT load
🔁 Flow Diagrams

Overall event pipeline flow

```mermaid
flowchart LR
    subgraph "commerce-api (같은 TX)"
        LF[LikeFacade.addLike] -->|publish| DEP[DomainEventPublisher]
        DEP -->|INSERT| OB[(EventOutbox)]
        DEP -->|발행| AE[ApplicationEvent]
    end
    subgraph "ApplicationEvent (AFTER_COMMIT)"
        AE -->|async| LCE[LikeCountEventListener]
        LCE -->|증가| PC[Product.like_count]
        AE -->|listen| CEE[CacheEvictionEventListener]
        CEE -->|무효화| RD[(Redis Cache)]
    end
    subgraph "Debezium CDC"
        OB -->|binlog| CDC[Debezium Connector]
        CDC -->|발행| KF[Kafka Topic]
    end
    subgraph "commerce-streamer"
        KF -->|BATCH consume| MC[MetricsConsumer]
        MC -->|INSERT IGNORE| EH[(event_handled)]
        MC -->|UPSERT| PM[(product_metrics)]
    end
    style DEP fill:#e8f5e9
    style OB fill:#fff3e0
    style KF fill:#e3f2fd
```

First-come-first-served coupon issuance flow
```mermaid
sequenceDiagram
    autonumber
    participant User
    participant API as CouponFacade
    participant DB as coupon_issue_request
    participant Kafka as Kafka (coupon-issue-requests)
    participant Consumer as CouponIssueConsumer
    participant Coupon as coupon (CAS UPDATE)
    participant Issue as coupon_issue (UNIQUE)
    User->>API: POST /api/v1/coupons/{id}/issue-request
    API->>DB: INSERT (status=PENDING)
    API->>Kafka: send(couponId, requestId, memberId)
    API-->>User: 202 Accepted (requestId)
    Kafka->>Consumer: consume (SINGLE)
    Consumer->>Consumer: event_handled INSERT IGNORE (멱등)
    Consumer->>Coupon: UPDATE issued_count + 1 WHERE issued_count < max
    alt CAS 성공 (수량 남음)
        Consumer->>Issue: INSERT (coupon_id, member_id)
        alt UNIQUE 통과
            Consumer->>DB: UPDATE status = COMPLETED
        else UNIQUE 위반 (중복 발급)
            Consumer->>Coupon: issued_count - 1 (롤백)
            Consumer->>DB: UPDATE status = REJECTED
        end
    else CAS 실패 (수량 소진)
        Consumer->>DB: UPDATE status = REJECTED
    end
    User->>API: GET /api/v1/coupon-issue-requests/{requestId}
    API-->>User: status = COMPLETED / REJECTED
```

PaymentStatusHistory recording paths
```mermaid
sequenceDiagram
    autonumber
    participant Facade as PaymentFacade
    participant Entity as PaymentModel
    participant Repo as PaymentRepositoryImpl
    participant History as payment_status_history
    participant Recovery as PaymentRecoveryService
    participant Batch as PaymentRecoveryTasklet
    Note over Facade,History: 경로 1: Entity 메서드 — @Transient 자동 기록
    Facade->>Entity: markPending() → markPaid()
    Note over Entity: pendingTransitions에 2개 전이 누적
    Facade->>Repo: save(payment)
    Repo->>History: INSERT × 2 (누적된 전이 모두)
    Note over Recovery,History: 경로 2: JPQL — 명시적 기록
    Recovery->>Recovery: updateStatusConditionally()
    alt affected > 0
        Recovery->>History: INSERT (reason = CALLBACK / POLLING / WAL_RECOVERY)
    end
    Note over Batch,History: 경로 3: Native SQL — Companion INSERT
    Batch->>History: INSERT INTO ... SELECT (bulk)
    Batch->>Batch: UPDATE payments SET status
```

🔑 Key code changes
1. DomainEventPublisherImpl — dual publishing: Outbox + Spring event
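The dual-publish idea can be miniaturized in plain Java (all names below are invented for the sketch; the real implementation INSERTs a JPA `EventOutbox` entity and goes through Spring's `ApplicationEventPublisher`):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Plain-Java analogue of dual publishing: one publish() call both appends an
// outbox row and notifies in-process listeners, so the two channels always
// see the same event.
public class DualPublishSketch {

    record OutboxRow(String eventType, String payload) {}

    final List<OutboxRow> outbox = new ArrayList<>();           // stands in for the event_outbox table
    final List<Consumer<String>> listeners = new ArrayList<>(); // stands in for ApplicationEvent listeners

    void publish(String eventType, String payload) {
        outbox.add(new OutboxRow(eventType, payload)); // channel 1: outbox INSERT (→ CDC → Kafka)
        listeners.forEach(l -> l.accept(payload));     // channel 2: in-process event
    }

    public static void main(String[] args) {
        DualPublishSketch publisher = new DualPublishSketch();
        List<String> seen = new ArrayList<>();
        publisher.listeners.add(seen::add);
        publisher.publish("LikeCreatedEvent", "{\"productId\":1}");
        System.out.println(publisher.outbox.size() + " " + seen.size()); // 1 1
    }
}
```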
In the same transaction: (1) `EventOutbox` INSERT → Debezium CDC forwards it to Kafka; (2) `ApplicationEvent` publish → handled by sync/async listeners. The two channels operate at once. `LikeFacade` now depends only on `DomainEventPublisher` (down from 3 dependencies to 1):

2. LikeCountEventListener — AFTER_COMMIT + async best-effort
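A rough stand-in for the AFTER_COMMIT + async best-effort contract (names invented; Spring's `@TransactionalEventListener` and `@Async` are simulated here with a plain executor):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Simulation: the side effect runs on a separate executor after the "commit",
// and a listener failure never undoes the committed like.
public class BestEffortListenerSketch {

    /** "Commits" a like, runs a failing async side effect, returns surviving like count. */
    static int commitThenFailSideEffect() {
        List<String> likes = new CopyOnWriteArrayList<>();
        ExecutorService eventExecutor = Executors.newFixedThreadPool(2); // cf. core=2 in AsyncConfig
        likes.add("member:1/product:42");              // the like INSERT has committed

        // AFTER_COMMIT, async: best-effort count update that happens to fail
        Future<?> sideEffect = eventExecutor.submit(() -> {
            throw new IllegalStateException("like_count increment failed");
        });
        try {
            sideEffect.get();                          // observe the failure
        } catch (Exception swallowed) {
            // best-effort: the like row is already durable; the Kafka-side
            // aggregation reconciles the count later
        }
        eventExecutor.shutdown();
        return likes.size();
    }

    public static void main(String[] args) {
        System.out.println(commitThenFailSideEffect()); // 1 — the like survived the listener failure
    }
}
```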
`AFTER_COMMIT` — runs only after the like INSERT has committed. `eventExecutor` — runs asynchronously on a dedicated thread pool (core=2, max=4). Even if it fails, the like is already persisted and the Kafka consumer reconciles the count eventually.

3. MetricsConsumer — BATCH + INSERT-first idempotency
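An in-memory model of the INSERT-first idempotency (names invented for the sketch): `event_handled` becomes a set keyed by eventId, `product_metrics` a delta map, and a redelivered batch changes nothing because every eventId is skipped.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentBatchSketch {

    final Set<String> eventHandled = new HashSet<>();             // event_handled INSERT IGNORE
    final Map<Long, Long> likeCounts = new ConcurrentHashMap<>(); // product_metrics UPSERT target

    record LikeEvent(String eventId, long productId) {}

    void consumeBatch(List<LikeEvent> batch) {
        for (LikeEvent e : batch) {                       // per-record "transaction"
            if (!eventHandled.add(e.eventId())) continue; // duplicate eventId → skip
            likeCounts.merge(e.productId(), 1L, Long::sum); // ON DUPLICATE KEY UPDATE ... + delta
        }
    }

    public static void main(String[] args) {
        IdempotentBatchSketch consumer = new IdempotentBatchSketch();
        List<LikeEvent> batch = List.of(new LikeEvent("e1", 7L), new LikeEvent("e2", 7L));
        consumer.consumeBatch(batch);
        consumer.consumeBatch(batch); // redelivery after a crash before ack
        System.out.println(consumer.likeCounts.get(7L)); // 2, not 4
    }
}
```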
Receives 3,000 records per batch → per-record TX (one failure does not roll back the whole batch) → `event_handled` INSERT IGNORE blocks duplicates → `product_metrics` UPSERT.

4. CouponIssueConsumer — CAS UPDATE + UNIQUE double defense
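A concurrency sketch of the CAS quantity guard (invented names; `AtomicInteger.compareAndSet` plays the role of `UPDATE coupon SET issued_count = issued_count + 1 WHERE issued_count < max`):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class CasIssueSketch {

    /** Races `attempts` issue requests against a stock of `max`; returns successful issues. */
    static int issueConcurrently(int attempts, int max) {
        AtomicInteger issuedCount = new AtomicInteger(0); // coupon.issued_count
        AtomicInteger completed = new AtomicInteger(0);
        ExecutorService pool = Executors.newFixedThreadPool(16);
        CountDownLatch done = new CountDownLatch(attempts);
        for (int i = 0; i < attempts; i++) {
            pool.submit(() -> {
                // CAS loop: succeed only while issued_count < max
                while (true) {
                    int cur = issuedCount.get();
                    if (cur >= max) break;                       // affected = 0 → REJECTED
                    if (issuedCount.compareAndSet(cur, cur + 1)) {
                        completed.incrementAndGet();             // COMPLETED
                        break;
                    }
                }
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(issueConcurrently(10_000, 100)); // exactly 100, never oversold
    }
}
```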
SINGLE listener — first-come-first-served, so records are processed strictly one at a time. Three layers of defense: (1) `event_handled` idempotency, (2) CAS UPDATE quantity control, (3) UNIQUE constraint against duplicate issuance.

5. CouponFacade.requestCouponIssue() — API → Kafka publish
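Why `partitionKey = couponId` preserves per-coupon ordering can be shown with the keyed-routing arithmetic alone (the partition count and the hash below are illustrative, not the real topic configuration):

```java
// Kafka routes a keyed record deterministically from its key, so every
// request for the same coupon lands on the same partition and is consumed
// in order by the single consumer of that partition.
public class PartitionKeySketch {

    static int partitionFor(String couponId, int partitions) {
        // same idea as a default partitioner: deterministic hash of the key
        return Math.floorMod(couponId.hashCode(), partitions);
    }

    public static void main(String[] args) {
        int p1 = partitionFor("coupon-42", 3);
        int p2 = partitionFor("coupon-42", 3);
        // every request for coupon-42 maps to one partition → strict ordering
        System.out.println(p1 == p2); // true
    }
}
```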
The API only persists the issue request (PENDING) and publishes to Kafka; the actual issuance is done by `CouponIssueConsumer`. `partitionKey = couponId` — requests for the same coupon land on the same partition, so their order is preserved.

6. KafkaConfig — BATCH vs SINGLE listener + static membership
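A hedged sketch of what such a pair of container factories could look like in Spring Kafka (bean names, the `group.instance.id` value, and the bare-bones consumer factory are assumptions; the real KafkaConfig also wires a `DeadLetterPublishingRecoverer`, omitted here):

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ContainerProperties;

@Configuration
public class KafkaListenerConfigSketch {

    // BATCH factory: bulk aggregation (MetricsConsumer-style)
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> batchFactory(
            ConsumerFactory<String, String> consumerFactory) {
        var factory = new ConcurrentKafkaListenerContainerFactory<String, String>();
        factory.setConsumerFactory(consumerFactory);
        factory.setBatchListener(true);      // listener receives a List of records
        factory.setConcurrency(3);           // 3 consumer threads
        factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
        return factory;
    }

    // SINGLE factory: strict per-record ordering (CouponIssueConsumer-style)
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> singleFactory(
            ConsumerFactory<String, String> consumerFactory) {
        var factory = new ConcurrentKafkaListenerContainerFactory<String, String>();
        factory.setConsumerFactory(consumerFactory);
        factory.setConcurrency(1);           // one record at a time
        factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
        return factory;
    }

    // Static membership: a stable group.instance.id avoids rebalances on restart
    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "streamer-1");
        return new DefaultKafkaConsumerFactory<>(props);
    }
}
```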
MetricsConsumer: BATCH (3,000 records) + concurrency=3 → optimized for bulk aggregation. CouponIssueConsumer: SINGLE + concurrency=1 → per-record sequential processing guarantees quantity consistency. Both use `GROUP_INSTANCE_ID` for static membership — avoiding unnecessary rebalances on restart.

💬 Review points
Review point 1. Atomicity of the Kafka publish and the DB save for first-come-first-served coupons

`CouponFacade.requestCouponIssue()` INSERTs a `CouponIssueRequest` into the DB and then publishes via `kafkaTemplate.send()`. However, the Kafka send actually happens outside the `@Transactional` boundary, so if the publish fails after the TX commits, only a PENDING request remains in the DB and the consumer never processes it. Unlike the `EventOutbox` + Debezium CDC pattern, this path uses `kafkaTemplate.send()` directly. Should the outbox pattern be applied to first-come-first-served coupons as well, or is a compensation mechanism that periodically re-publishes PENDING requests more appropriate?

Review point 2. Combining per-record TX with manual ack in the BATCH listener
`MetricsConsumer` receives 3,000 records per batch, processes them in per-record transactions, and calls `ack.acknowledge()` at the end. If the consumer dies mid-batch, the entire un-acked batch is redelivered. Already-processed records are skipped via the `event_handled` INSERT IGNORE, so there is no data corruption, but reprocessing has a cost. In the per-record-ack vs batch-ack trade-off, is the current choice (batch ack) appropriate, or is periodic intermediate committing needed?
Review point 3. Safety of the issued_count rollback in CouponIssueConsumer

On a UNIQUE violation the consumer rolls back with `issued_count - 1`, but by that point another request may already have changed `issued_count` via a CAS UPDATE. With the SINGLE listener + concurrency=1, a single consumer processes requests sequentially, so it is safe today; but scaling the consumer out (raising concurrency) could introduce a race condition. Is this structure still safe when scaled out, or is it fine because requests for the same couponId land on the same partition and are handled by a single consumer?
📝 Design document

`docs/design/08-event-pipeline.md`

🔄 Summary of changed scope