[Week 7] Event-Driven Architecture, Kafka - 최호석 by ghtjr410 · Pull Request #291 · Loopers-dev-lab/loop-pack-be-l2-vol3-java

ghtjr410 · 2026-03-27T07:48:59Z

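While studying EDA and Kafka this week, I ran into a long list of open questions.
This description records how I worked through those questions and the design decisions that came out of them.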

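Environment

Item	Value
Server	EC2 large (2 vCPU, 8GB RAM), single instance
DB	MySQL 8.0, 45 allowed connections
Hikari Pool	30 (67% of the DB limit; the remaining 15 are headroom for DBeaver and incident response)
Tomcat Threads	40 (about Hikari x 1.3; splitting the TX keeps connection hold time short, so threads > connections is acceptable)
Kafka	Single broker, KRaft mode, replication-factor=1
App layout	commerce-api (Producer) + commerce-streamer (Consumer)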

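Traffic profile

API	Requests/sec	Via Outbox?	Additional work
Product view	~100	No	Kafka fire-and-forget (readOnly TX preserved)
Like	~15	No	Kafka fire-and-forget (temporarily inaccurate, fixed by reconciliation)
Payment approval	~4	Yes	Outbox INSERT (inside TX2)
Coupon issuance	Spiky	Yes	Outbox INSERT (same TX) + immediate publish in afterCommit + scheduler as a safety net
Total	~132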

Overall architecture

graph TB
    subgraph "commerce-api (Producer)"
        Controller --> Facade
        Facade -->|core logic| Service
        Facade -->|via Outbox| OutboxEventService

        Service -->|auxiliary logic| ApplicationEvent
        ApplicationEvent -->|"@Async + AFTER_COMMIT"| EventHandler

        EventHandler -->|fire-and-forget| Kafka

        OutboxEventService -->|same TX| DB[(outbox_events)]
        OutboxEventService -->|afterCommit, async| Kafka

        RelayScheduler -->|collect stale PENDING| DB
        RelayScheduler -->|synchronous publish| Kafka
    end

    subgraph "commerce-streamer (Consumer)"
        Kafka -->|batch delivery| Consumer
        Consumer --> IdempotentProcessor
        IdempotentProcessor -->|duplicate check| EventHandled[(event_handled)]
        IdempotentProcessor -->|business logic| MetricsService
        IdempotentProcessor -->|business logic| CouponProcessor
        IdempotentProcessor -->|audit log| EventLog[(event_log)]
    end

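Decision log

Structure of this document

Why separate (D1~D3): telling core from auxiliary logic, whether to split inside a monolith, internal vs. external delivery
How to publish (D4~D7): where the Outbox INSERT lives, publishing strategy v1→v3, AFTER_COMMIT, topic design
How to prevent duplicates (D8~D10): idempotency at three layers (Kafka, publishing, consuming)
What happens on failure (D11~D13): DLT + retries, eventual consistency, observability
Rationale for settings: Consumer / Producer / thread pool / server / resource pools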

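1. Why separate

D1. Rationale for splitting core vs. auxiliary logic

Criterion: "If this logic fails, must the core business operation still succeed?"

Split out (auxiliary)	Keep inline (core)
Like count and view count aggregation	Stock confirmation/release
Payment-completed logging	Order state transitions
User activity records	Coupon reservation

Auxiliary logic is split out with @TransactionalEventListener(AFTER_COMMIT) + @Async, so a failure in it cannot roll back the already-committed core transaction and stays isolated from the request.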

// LikeCountEventHandler.java
@TransactionalEventListener(phase = AFTER_COMMIT)
@Async
public void handleLiked(ProductLikedEvent event) {
    productService.incrementLikeCount(event.productId());
    productCacheManager.evictDetail(event.productId());
    publishToKafka("product.liked", event.productId(), event);
}

private void publishToKafka(String eventType, Long productId, Object event) {
    kafkaTemplate.send(CATALOG_EVENTS, String.valueOf(productId),
            outboxEventFactory.createPayload(eventType, event));
}

End-to-end flow for a like request, showing how the core part (likes INSERT) and the auxiliary part (aggregation, Kafka publish) are separated:

sequenceDiagram
    participant Client
    participant Facade
    participant DB
    participant EventHandler
    participant Kafka

    Client->>Facade: like request
    activate Facade
    Facade->>DB: INSERT into likes (source of truth)
    Note over Facade: TX commit
    Facade-->>Client: 200 OK
    deactivate Facade

    Note over Facade,EventHandler: AFTER_COMMIT + @Async (separate thread)
    EventHandler->>DB: products.like_count +1 (atomic UPDATE, served by the API)
    EventHandler->>DB: cache evict
    EventHandler-)Kafka: fire-and-forget (product.liked)
    Note over EventHandler: even on failure, the likes row is already stored

    Note over Kafka: received by commerce-streamer
    Kafka->>DB: product_metrics.like_count +1 (analytics, separate table)

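product_metrics is an INCREMENT-based aggregate fed by the fire-and-forget path, so drift accumulates whenever messages are lost or duplicated. It has no separate reconciliation because it is an approximate metric for analytics and dashboards; the accurate like count used for business decisions is products.like_count, which is covered by reconciliation.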

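D2. Splitting within the core: "not in a monolith"

I evaluated whether the follow-up work after a successful payment (stock confirmation, order state transition, coupon usage confirmation) could be split out into events.

What if we did split? It becomes possible once an intermediate state is introduced.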

graph LR
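    subgraph "Current: monolith (same DB, same TX)"
        A1[Deduct stock] --> A2[PG payment] --> A3[Confirm order, stock, coupon]
        A4[Facade orchestrates in a single TX]
    end

    subgraph "If split: MSA (separate servers and DBs)"
        B1[Stock 'reserved'] --> B2[PG payment] --> B3[Publish payment.completed event]
        B3 --> B4[Stock service: reserved → confirmed]
        B3 --> B5[Order service: state transition]
        B3 --> B6[Coupon service: confirm usage]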
    end

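Instead of deducting stock immediately, stock is put into an intermediate "reserved" state, and each service that receives the payment-success event confirms its own part independently. Because each service only needs to know its own DB, eventual consistency is achieved with Outbox + event handlers.

Conclusion: do not split for now.

The system is currently a monolith, so the Facade acts as the orchestrator and everything runs in the same TX on the same DB
Splitting a simple UPDATE within the same DB into a Kafka self-consume round trip would be over-engineering
The follow-up work is very unlikely to fail on its own (no network hop, same DB instance)
Consistency is already covered by the reconcilePending scheduler + the Payment state machine
The structure above only becomes necessary once servers and DBs are separated; the design is recorded in an ADR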

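D3. ApplicationEvent vs Outbox: internal vs. external delivery

Criterion: use ApplicationEvent when the event is handled inside the same JVM, and Kafka (via Outbox) when it must reach an external service.

Path	Delivery scope	Choice
Like/view count aggregation	Handled by a handler in the same JVM	ApplicationEvent (losses are repaired by the reconciliation batch, which runs daily at 02:00; up to 24 hours of inaccuracy is accepted)
User activity logging	Handled by a handler in the same JVM	ApplicationEvent
Payment completed → metrics aggregation	Consumed by commerce-streamer (external app)	Outbox → Kafka
Coupon issuance request	Consumed by commerce-streamer (external app)	Outbox → Kafka

How the two paths branch off from the Facade: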

graph LR
    subgraph "commerce-api JVM"
        A[Facade] -->|publishEvent| B[ApplicationEvent]
        B -->|AFTER_COMMIT + @Async| C[LikeCountHandler]
        B -->|AFTER_COMMIT + @Async| D[UserActivityHandler]
        C -.->|fire-and-forget| K[Kafka]

        A -->|saveAndPublish| E[OutboxEventService]
        E -->|same TX| F[(outbox_events)]
        E -->|afterCommit| K
    end

    subgraph "commerce-streamer JVM"
        K --> G[CatalogEventConsumer]
        K --> H[OrderEventConsumer]
        K --> I[CouponIssueConsumer]
    end

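2. How to publish

D4. Where the Outbox INSERT lives: called directly from the Facade

Inserting automatically from a @TransactionalEventListener(BEFORE_COMMIT) listener is also possible, but then reading the Facade does not reveal whether an Outbox event is published. I judged the indirection through an event listener to be unnecessary abstraction and chose to call outboxEventService.saveAndPublish() directly from the Facade. Reading the Facade makes it immediately clear that "an event goes out here".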

// PaymentFacade.java: called directly from the Facade
@Transactional
public PaymentInfo completePayment(...) {
    // business logic
    Payment payment = paymentService.confirm(...);
    orderService.payOrder(...);
    stockService.confirmStock(...);

    // Outbox INSERT joins this TX (atomic with the business changes); the Kafka publish runs after commit
    outboxEventService.saveAndPublish(
        "payment.completed", "payment", String.valueOf(payment.getId()),
        ORDER_EVENTS, new PaymentCompletedEvent(...)
    );
    return PaymentInfo.from(payment);
}

// OutboxEventService.java
@Transactional(propagation = Propagation.MANDATORY) // throws immediately when called without a TX
public void saveAndPublish(String eventType, String aggregateType, String aggregateId,
                           String topic, Object eventPayload) {
    OutboxEvent outboxEvent = outboxEventFactory.create(eventType, aggregateType, aggregateId, topic, eventPayload);
    outboxEventRepository.save(outboxEvent); // INSERT in the caller's TX

    afterCommit(() ->
            kafkaTemplate.send(outboxEvent.getTopic(), outboxEvent.getAggregateId(), outboxEvent.getPayload())
                    .whenComplete((result, ex) -> {
                        if (ex != null) {
                            log.warn("Immediate publish failed; the @Scheduled relay will retry: eventId={}", outboxEvent.getEventId(), ex);
                        } else {
                            outboxEventRepository.markPublishedByEventId(outboxEvent.getEventId());
                        }
                    }));
}
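
The `afterCommit(...)` helper used above is not shown in this PR. The ordering it relies on can be sketched without any framework: work registered during the TX runs only after a successful commit and is dropped on rollback. `MiniTx` and `AfterCommitDemo` are hypothetical names for illustration only; in Spring, a helper like this is typically built on transaction synchronization callbacks rather than hand-rolled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a transaction that supports after-commit callbacks. */
final class MiniTx {
    private final List<String> stagedRows = new ArrayList<>();       // rows written inside the TX
    private final List<Runnable> afterCommitHooks = new ArrayList<>();
    final List<String> committedRows = new ArrayList<>();            // rows visible after commit

    void insert(String row) { stagedRows.add(row); }

    /** Registers work that must run only if the TX commits. */
    void afterCommit(Runnable hook) { afterCommitHooks.add(hook); }

    void commit() {
        committedRows.addAll(stagedRows);
        stagedRows.clear();
        // Hooks run after the rows are committed, so a publish never precedes its outbox row.
        afterCommitHooks.forEach(Runnable::run);
        afterCommitHooks.clear();
    }

    void rollback() {
        stagedRows.clear();
        afterCommitHooks.clear(); // nothing is published for a rolled-back TX
    }
}

final class AfterCommitDemo {
    /** Returns the messages "published" for a TX that either commits or rolls back. */
    static List<String> run(boolean commit) {
        List<String> published = new ArrayList<>();
        MiniTx tx = new MiniTx();
        tx.insert("outbox:payment.completed");
        tx.afterCommit(() -> published.add("payment.completed"));
        if (commit) {
            tx.commit();
        } else {
            tx.rollback();
        }
        return published;
    }

    public static void main(String[] args) {
        System.out.println("commit   -> " + run(true));
        System.out.println("rollback -> " + run(false));
    }
}
```

The point of the sketch is the guarantee, not the mechanism: a publish can only happen for an Outbox row that is already durable.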

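D5. Publishing strategy: immediate publish + @Scheduled safety net

The design went through three iterations.

v1. Scheduler only (polling): the scheduler periodically collects PENDING events and publishes them. Simple to implement, but every event is delayed by up to one polling interval.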

sequenceDiagram
    participant Client
    participant Facade
    participant DB
    participant Scheduler
    participant Kafka

    Client->>Facade: payment request
    activate Facade
    Facade->>DB: business logic + Outbox INSERT (PENDING)
    Note over Facade: TX commit
    Facade-->>Client: 200 OK
    deactivate Facade

    Note over Client,Kafka: ⏳ up to ~1s delay (polling interval)

    loop @Scheduled(1s)
        Scheduler->>DB: query PENDING events
        Scheduler->>Kafka: send().get(5s), synchronous wait
        Kafka-->>Scheduler: ACK
        Scheduler->>DB: mark SENT
    end
    end

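// OutboxRelayScheduler.java (v1)
@Scheduled(fixedDelay = 1000)
public void relay() {
    List<OutboxEvent> events = outboxEventRepository.findPending(BATCH_SIZE);
    for (OutboxEvent event : events) {
        try {
            kafkaTemplate.send(event.getTopic(), event.getAggregateId(), event.getPayload())
                    .get(5, TimeUnit.SECONDS); // block until the broker ACKs
        } catch (Exception e) {
            // .get() throws checked exceptions; leave the event PENDING for the next run
            log.warn("Outbox relay failed: eventId={}", event.getEventId(), e);
            continue;
        }
        event.markSent();
        outboxEventRepository.save(event);
    }
}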

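v2. Immediate publish + self-consume

Publish asynchronously right after the TX commits, then have a Consumer in the same app receive the message back (self-consume) and mark it SENT. This removed the delay but was dropped for the following reasons.

A dedicated Consumer Group just for SENT marking has to be created (operational burden)
If the self-consume Consumer fails, SENT marking falls behind
SENT marking takes one more round trip through Kafka, an unnecessary network hop
With the whenComplete callback in v3, SENT can be marked as soon as Kafka ACKs, which is simpler and faster

v3. Immediate publish + @Scheduled safety net (final): publish asynchronously in afterCommit right after the TX commits, and mark SENT in the whenComplete callback. If the immediate publish fails, the event stays PENDING and the scheduler collects it on its 1-minute cycle.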

sequenceDiagram
    participant Client
    participant Facade
    participant DB
    participant Kafka
    participant Scheduler

    rect rgb(230, 245, 230)
        Note over Client,Kafka: main path (99%)
        Client->>Facade: payment request
        activate Facade
        Facade->>DB: business logic + Outbox INSERT (PENDING)
        Note over Facade: TX commit
        Facade-)Kafka: afterCommit → send(), async (does not wait for ACK)
        Facade-->>Client: 200 OK
        deactivate Facade
        Kafka-->>DB: whenComplete ACK → mark SENT
        Note over Kafka,DB: delay ~0s
    end

    rect rgb(255, 240, 230)
        Note over Scheduler,Kafka: safety-net path (when the immediate publish fails)
        loop @Scheduled(1 min)
            Scheduler->>DB: collect stale PENDING
            Scheduler->>Kafka: send().get(5s), synchronous
            Kafka-->>Scheduler: ACK
            Scheduler->>DB: SENT or retryCount++
        end
        Note over Scheduler: retryCount >= 10 → FAILED (operator intervention)
    end

// OutboxRelayScheduler.java (v3, safety net only)
@Scheduled(fixedDelay = 60000) // every 1 minute
public void compensatePendingEvents() {
    List<OutboxEvent> pendingEvents = outboxEventRepository.findPending(BATCH_SIZE);
    if (pendingEvents.isEmpty()) return;

    for (OutboxEvent event : pendingEvents) {
        try {
            kafkaTemplate.send(event.getTopic(), event.getAggregateId(), event.getPayload())
                    .get(5, TimeUnit.SECONDS); // no user is waiting on the scheduler, so blocking here is safe
            event.markSent();
        } catch (Exception e) {
            event.incrementRetryCount();
            if (event.getRetryCount() >= MAX_RETRY_COUNT) {
                event.markFailed();
                log.error("Outbox FAILED: eventId={}, retryCount={}", event.getEventId(), event.getRetryCount(), e);
            }
        }
        outboxEventRepository.save(event);
    }
}

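Normal-case delay: ~0s (down from up to ~1s in v1)
If the immediate publish fails, the event stays PENDING and the scheduler picks it up on its next 1-minute run
@Transactional(MANDATORY): throws immediately when called without a TX, which prevents the Outbox row from accidentally being saved outside the business TX

Trade-offs I am aware of: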

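whenComplete callback: the callback runs on the kafka-producer-network-thread, so if markPublishedByEventId() has to wait for a Hikari connection, every pending send() callback is held up behind it. At the current scale, SENT marking holds one of the 30 Hikari connections for a few milliseconds and returns it, so pool exhaustion is unlikely; and if marking fails, the event simply stays PENDING and the scheduler republishes it, so the event is not lost. I therefore kept this deliberately simple. At medium scale, the plan is to move marking to a separate async thread or switch to BlockingQueue + batch marking.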

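findPending timing window: if the scheduler collects a PENDING event while its immediate publish is still waiting for the ACK, the event is published twice for no reason. I added a WHERE created_at < NOW() - 10 seconds condition to the findPending query so that events whose immediate publish may still be in flight are skipped.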
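
The grace-period condition can be illustrated in memory. This is a minimal sketch, assuming a simplified row shape: `OutboxRow` and `findStalePending` are hypothetical names, and the actual filter lives in the findPending query, not in application code.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

final class StalePendingFilter {
    /** Hypothetical projection of an outbox row; only the fields the filter needs. */
    record OutboxRow(String eventId, String status, Instant createdAt) {}

    /** Keeps PENDING rows older than the grace period, mirroring created_at < NOW() - 10s. */
    static List<String> findStalePending(List<OutboxRow> rows, Instant now, Duration grace) {
        Instant cutoff = now.minus(grace);
        return rows.stream()
                .filter(r -> r.status().equals("PENDING"))
                .filter(r -> r.createdAt().isBefore(cutoff))
                .map(OutboxRow::eventId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2026-03-27T00:00:30Z");
        List<OutboxRow> rows = List.of(
                new OutboxRow("old-pending", "PENDING", now.minusSeconds(25)), // collected
                new OutboxRow("in-flight", "PENDING", now.minusSeconds(2)),    // skipped: may still be awaiting ACK
                new OutboxRow("done", "SENT", now.minusSeconds(25)));          // skipped: already sent
        System.out.println(findStalePending(rows, now, Duration.ofSeconds(10)));
    }
}
```

The grace period only narrows the window; a duplicate is still possible if an ACK takes longer than 10 seconds, which is why the Consumer-side idempotency check remains necessary.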

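This approach fits the current traffic (~132 rps), but it has scaling limits:

Calling kafkaTemplate.send() in afterCommit means every Tomcat request thread shares the Kafka Producer's internal record buffer and its single sender thread. As traffic grows to thousands of rps, contention on the Producer buffer increases (send() can block the request thread when the buffer is full), and the callbacks, which all run on the sender thread, become a bottleneck.
At that point, instead of publishing from the application, a CDC (Change Data Capture) tool such as Debezium should be considered. CDC reads the DB binlog directly, detects INSERTs into the Outbox table, and publishes them to Kafka, so publishing no longer consumes the application's thread pool or connection pool.

Approach	Suitable scale	Pros	Cons
afterCommit immediate publish (current)	up to a few hundred rps	Simple to implement, no extra infrastructure	Producer buffer/thread contention
CDC (Debezium)	thousands of rps and above	No load on app resources, scales independently	More infrastructure complexity (Debezium + Kafka Connect)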

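D6. Why AFTER_COMMIT

phase	Publisher TX when the listener throws	When it runs
Plain @EventListener (no TX phase)	Rolled back	Synchronously inside the TX
BEFORE_COMMIT	Rolled back	Just before commit
AFTER_COMMIT	Unaffected	After the commit completes

A failure in auxiliary logic (like aggregation, logging) must not roll back the core business operation (payment, order). AFTER_COMMIT isolates them completely.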

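D7. Event vs Command topics

Topic	Nature	Naming	Receivers
`catalog-events`	Event	Past tense (`product.liked`)	Any number of unspecified consumers
`order-events`	Event	Past tense (`payment.completed`)	Any number of unspecified consumers
`coupon-issue-requests`	Command	Request form	Coupon service (specific)

Coupon issuance is not "this happened" but "do this", so it is classified as a Command topic.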

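3. How to prevent duplicates

The publishing side and the consuming side use different strategies:

Publishing side (Producer): always send at least once (At-Least-Once). A duplicate is better than a loss.
Consuming side (Consumer): do not trust what the Producer sent. Assume the same message can arrive more than once and process it idempotently.
Kafka level: the idempotent producer removes duplicates caused by retries on the Producer → Broker hop, but that is as far as it goes; application-level duplicates are not covered.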

graph LR
    subgraph "Publishing side: At-Least-Once"
        O[(outbox_events)] -->|"PENDING → publish → SENT"| P[Producer]
        P -->|"republish on failure (same eventId)"| O
    end

    subgraph "Kafka level: idempotent producer (limited scope)"
        P -->|"PID + seq number"| B[Broker]
        Note1["enable.idempotence=true<br/>acks=all"]
    end

    subgraph "Consuming side: idempotent processing"
        B --> C[Consumer]
        C -->|"eventId duplicate check"| E[(event_handled)]
        Note2["Do not trust the Producer<br/>assume redelivery of the same message"]
    end

D8. Kafka-level idempotency: limits of the built-in guarantee

spring.kafka.producer:
  acks: all                        # ACK only after all in-sync replicas (ISR) have the record
  properties:
    enable.idempotence: true       # broker de-duplicates by PID + sequence number

With enable.idempotence=true, Kafka prevents duplicates on the Producer → Broker hop. Even if a network retry sends the same record twice, the Broker filters it out using the PID + sequence number.

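But this alone is not enough:

Segment	Covered by Kafka idempotency?	Example
Duplicate send on Producer → Broker	Yes	Resend after a network timeout
Application publishes the same event twice	No	Outbox scheduler runs again before SENT is marked
Consumer processes the same message twice	No	Partition reassigned by a rebalance before the offset commit

So Kafka-level idempotency stays on as a baseline, and the publishing level (D9) and the consuming level (D10) each add their own protection.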
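
The difference between the first two rows of the table can be shown with a deliberately simplified model of broker-side de-duplication. This is a sketch, not the Kafka protocol: `IdempotentLog` and `IdempotenceDemo` are hypothetical, and the real broker tracks sequences per producer and partition and handles batches and gaps, which are omitted here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of broker-side de-duplication; not the real Kafka protocol. */
final class IdempotentLog {
    private final Map<Long, Integer> lastSeqByProducer = new HashMap<>();
    private final List<String> log = new ArrayList<>();

    /** Appends the record unless this (producerId, sequence) was already stored. */
    boolean append(long producerId, int sequence, String value) {
        int last = lastSeqByProducer.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return false; // retry of a record the broker already has
        }
        lastSeqByProducer.put(producerId, sequence);
        log.add(value);
        return true;
    }

    int size() { return log.size(); }
}

final class IdempotenceDemo {
    /** A producer-level retry reuses the sequence number, so the record is stored once. */
    static int storedAfterNetworkRetry() {
        IdempotentLog broker = new IdempotentLog();
        broker.append(7L, 0, "payment.completed#abc");
        broker.append(7L, 0, "payment.completed#abc");
        return broker.size();
    }

    /** A second send() call from the application gets a new sequence, so it is stored twice. */
    static int storedAfterApplicationResend() {
        IdempotentLog broker = new IdempotentLog();
        broker.append(7L, 0, "payment.completed#abc");
        broker.append(7L, 1, "payment.completed#abc");
        return broker.size();
    }

    public static void main(String[] args) {
        System.out.println("network retry      -> stored " + storedAfterNetworkRetry());
        System.out.println("application resend -> stored " + storedAfterApplicationResend());
    }
}
```

From the broker's point of view, an Outbox republish is a brand-new record, which is why the eventId check on the Consumer side is still required.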

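Durability limit of replication-factor=1: with a single broker, losing that broker's data means the messages are lost at the Kafka level. Events still PENDING in the Outbox table are republished by the scheduler. Events already marked SENT are not picked up automatically, so recovering them means resetting their status so they are republished, and only while the rows still exist (SENT rows are cleaned up after 7 days). In production the plan is to move to replication-factor=3 for Kafka-level durability.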

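D9. Publishing level: At-Least-Once (always send at least once)

The status state machine on the Outbox table guarantees publishing without loss:

PENDING → SENT → (deleted after 7 days)
    ↓
  FAILED (retryCount >= 10)

eventId (UUID): unique identifier per event, generated inside the business TX
Immediate publish fails → stays PENDING → the scheduler republishes with the same eventId
SENT marking: an atomic UPDATE in its own TX via @Modifying + @Transactional
The same business event is never published under a different eventId
Result: duplicate publishes are possible but losses are not → the Consumer filters duplicates through idempotent processing
Cleanup: SENT records are deleted automatically after 7 days (OutboxCleanupScheduler, daily at 03:00). FAILED records are handled manually after an operator reviews them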
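
The state machine above can be sketched as a small class. `OutboxStateMachine` is a hypothetical illustration, not the PR's `OutboxEvent` entity; the only value taken from the text is the retry limit of 10.

```java
final class OutboxStateMachine {
    enum Status { PENDING, SENT, FAILED }

    static final int MAX_RETRY_COUNT = 10; // limit taken from the state diagram above

    private Status status = Status.PENDING;
    private int retryCount = 0;

    /** Applies the outcome of one publish attempt and returns the resulting status. */
    Status onPublishAttempt(boolean acked) {
        if (status != Status.PENDING) {
            return status; // SENT and FAILED are terminal for the relay
        }
        if (acked) {
            status = Status.SENT;
        } else if (++retryCount >= MAX_RETRY_COUNT) {
            status = Status.FAILED; // needs operator intervention from here on
        }
        return status;
    }

    int retryCount() { return retryCount; }

    public static void main(String[] args) {
        OutboxStateMachine event = new OutboxStateMachine();
        for (int attempt = 1; attempt <= MAX_RETRY_COUNT; attempt++) {
            System.out.println("attempt " + attempt + " -> " + event.onPublishAttempt(false));
        }
    }
}
```

Because only PENDING rows are ever retried, the relay cannot republish an event that was already acknowledged or already given up on.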

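D10. Consuming level: do not trust the Producer (idempotent processing)

Because publishing is At-Least-Once, the same message can arrive more than once. The Consumer assumes this and filters duplicates with the event_handled table.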

// IdempotentProcessor.java
@Transactional
public void process(String eventId, String eventType, String topic, String groupId, Runnable handler) {
    if (eventHandledRepository.existsByEventId(eventId)) {
        eventLogRepository.save(EventLog.skipped(eventId, eventType, topic, groupId));
        consumerMetrics.recordSkipped(topic, groupId, eventType);
        return; // already processed: SKIP
    }

    long start = System.currentTimeMillis();
    try {
        handler.run(); // run the business logic
        eventHandledRepository.save(EventHandled.of(eventId, eventType)); // same TX as the handler
        long duration = System.currentTimeMillis() - start;
        eventLogRepository.save(EventLog.processed(eventId, eventType, topic, groupId, duration));
        consumerMetrics.recordProcessed(topic, groupId, eventType, duration);
    } catch (Exception e) {
        // this row survives the rollback only if EventLog is saved in its own TX (e.g. REQUIRES_NEW)
        eventLogRepository.save(EventLog.failed(eventId, eventType, topic, groupId, e.getMessage()));
        consumerMetrics.recordFailed(topic, groupId, eventType);
        throw e; // re-throw → ErrorHandler → DLT
    }
}

// CatalogEventConsumer.java
@KafkaListener(topics = CATALOG_EVENTS, groupId = "metrics-aggregation",
               containerFactory = BATCH_LISTENER)
public void consume(List<ConsumerRecord<String, Map<String, Object>>> records, Acknowledgment ack) {
    for (int i = 0; i < records.size(); i++) {
        ConsumerRecord<String, Map<String, Object>> record = records.get(i);
        Map<String, Object> value = record.value();
        String eventId = (String) value.get("eventId");
        String eventType = (String) value.get("eventType");
        // The producer uses String.valueOf(productId) as the record key (see publishToKafka)
        Long productId = Long.valueOf(record.key());

        try {
            idempotentProcessor.process(eventId, eventType, CATALOG_EVENTS, GROUP_ID, () -> {
                switch (eventType) {
                    case "product.liked" -> metricsService.incrementLikeCount(productId, 1);
                    case "product.unliked" -> metricsService.incrementLikeCount(productId, -1);
                    case "product.viewed" -> metricsService.incrementViewCount(productId, 1);
                    default -> { } // unsupported event types are skipped
                }
            });
        } catch (Exception e) {
            // keep the cause so the error handler and logs see the original exception
            throw new BatchListenerFailedException("Consumer processing failed", e, i);
        }
    }
    ack.acknowledge(); // manual offset commit
}

On first receipt the business logic runs; when the same eventId is received again, it is skipped:

sequenceDiagram
    participant Kafka
    participant Consumer
    participant IdempotentProcessor
    participant DB

    Kafka->>Consumer: message received (eventId=abc)
    Consumer->>IdempotentProcessor: process(eventId=abc, handler)
    IdempotentProcessor->>DB: existsByEventId("abc")?
    DB-->>IdempotentProcessor: false (first time)
    IdempotentProcessor->>IdempotentProcessor: handler.run(), business logic
    IdempotentProcessor->>DB: event_handled INSERT (same TX)
    Consumer->>Kafka: ack.acknowledge()

    Note over Kafka,DB: when the same message is redelivered (rebalance, Outbox republish, etc.)

    Kafka->>Consumer: message redelivered (eventId=abc)
    Consumer->>IdempotentProcessor: process(eventId=abc, handler)
    IdempotentProcessor->>DB: existsByEventId("abc")?
    DB-->>IdempotentProcessor: true (already processed)
    Note over IdempotentProcessor: SKIP, handler is not run
    Consumer->>Kafka: ack.acknowledge()

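The business logic and the event_handled INSERT run in the same TX → atomic
There is no case where processing succeeds but no record is left (both commit or both roll back)
At-Least-Once publishing + idempotent consumption = effectively-once processing (each event's effect is applied once even if it is delivered several times)
Cleanup: event_handled records are deleted automatically after 7 days (EventHandledCleanupScheduler, daily at 04:00)

Caution: a manual republish of an Outbox FAILED event must be done within 7 days. After that, the matching event_handled record may already be deleted, so a duplicate can pass the idempotency check.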
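
The 7-day caveat can be reproduced with an in-memory stand-in for event_handled. This is a minimal sketch: `IdempotencyWindow` is a hypothetical class, and a counter stands in for the business side effect.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

final class IdempotencyWindow {
    private final Map<String, Instant> handledAt = new HashMap<>(); // stands in for event_handled
    private int sideEffects = 0;                                     // stands in for the business logic

    /** Runs the handler once per eventId; returns false when the event is skipped. */
    boolean process(String eventId, Instant now) {
        if (handledAt.containsKey(eventId)) {
            return false; // duplicate: SKIP
        }
        sideEffects++;
        handledAt.put(eventId, now);
        return true;
    }

    /** Deletes records older than the retention period, like the daily cleanup job. */
    void cleanup(Instant now, Duration retention) {
        Instant cutoff = now.minus(retention);
        handledAt.values().removeIf(handled -> handled.isBefore(cutoff));
    }

    int sideEffects() { return sideEffects; }

    public static void main(String[] args) {
        IdempotencyWindow window = new IdempotencyWindow();
        Instant day0 = Instant.parse("2026-03-01T00:00:00Z");
        System.out.println("first delivery processed: " + window.process("abc", day0));
        System.out.println("redelivery on day 1 processed: " + window.process("abc", day0.plus(Duration.ofDays(1))));
        Instant day8 = day0.plus(Duration.ofDays(8));
        window.cleanup(day8, Duration.ofDays(7));
        System.out.println("republish on day 8 processed: " + window.process("abc", day8));
    }
}
```

Within the retention window the duplicate is skipped; once cleanup has removed the record, the same eventId is applied a second time, which is exactly the case the caution above warns about.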

4. What happens on failure

D11. DLT + retry strategy

// KafkaConfig.java: DLT + error handling
@Bean
public DeadLetterPublishingRecoverer deadLetterPublishingRecoverer(KafkaTemplate<Object, Object> kafkaTemplate) {
    return new DeadLetterPublishingRecoverer(kafkaTemplate, (record, ex) ->
            new TopicPartition(record.topic() + ".DLT", -1));
}

@Bean
public CommonErrorHandler commonErrorHandler(DeadLetterPublishingRecoverer recoverer) {
    DefaultErrorHandler errorHandler = new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 2L));
    errorHandler.addNotRetryableExceptions(
            JsonParseException.class,      // malformed message → retrying is pointless
            JsonMappingException.class
    );
    return errorHandler;
}

Failure flow on the Consumer side:

sequenceDiagram
    participant Kafka
    participant Consumer
    participant ErrorHandler
    participant DLT

    Kafka->>Consumer: message received
    Consumer->>Consumer: handler.run() fails
    Note over Consumer: exception thrown

    Consumer->>ErrorHandler: exception propagated

    alt retryable exception (e.g. transient DB failure)
        ErrorHandler->>Consumer: 1st retry after 1s
        Consumer->>Consumer: handler.run() fails
        ErrorHandler->>Consumer: 2nd retry after 1s
        Consumer->>Consumer: handler.run() fails
        ErrorHandler->>DLT: 3 failed attempts → moved to .DLT topic
    else non-retryable exception (e.g. JsonParseException)
        ErrorHandler->>DLT: moved to .DLT topic immediately
        Note over ErrorHandler: retrying would give the same result, so retries are skipped
    end

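Partial failure while processing a batch:

The i-th record in the batch fails → BatchListenerFailedException(i) is thrown
ack.acknowledge() is never reached → the listener does not commit the batch
The ErrorHandler retries from the i-th record → on final failure that record goes to the DLT
Records 0 to (i-1): already recorded in event_handled, so they are skipped even if delivered again (per the Spring Kafka documentation, DefaultErrorHandler commits the offsets before the failed index and seeks back to the failed record, so they are normally not redelivered at all)
Records i+1 onward: not yet processed → redelivered and processed normally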

Publish failure flow on the Outbox side:

sequenceDiagram
    participant Facade
    participant DB
    participant Kafka
    participant Scheduler

    Facade->>DB: Outbox INSERT (PENDING)
    Note over Facade: TX commit
    Facade-)Kafka: afterCommit → send(), async
    Kafka--xFacade: publish failed (Kafka outage)
    Note over DB: stays PENDING

    loop @Scheduled(1 min), safety-net collection
        Scheduler->>DB: query PENDING
        Scheduler->>Kafka: send().get(5s)

        alt Kafka recovered
            Kafka-->>Scheduler: ACK
            Scheduler->>DB: mark SENT
        else still failing
            Scheduler->>DB: retryCount++
            Note over Scheduler: retryCount >= 10 → FAILED → operator intervention
        end
    end

Settings summary:

Side	Retries	On final failure
Consumer	FixedBackOff 1s x 2 retries	Moved to `.DLT` topic
Outbox	Scheduler every 1 minute, `.get(5s)`	retryCount >= 10 → FAILED

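D11-1. DLQ follow-up strategy (operational plan)

What is implemented today is automatic DLT topic creation plus automatic routing of failed messages through DeadLetterPublishingRecoverer. Tooling to inspect or republish messages accumulated in the DLT is not implemented yet. When DLT messages actually appear in production, the plan is to handle them as follows.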

graph TB
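    subgraph "Failure detection (implemented)"
        A[".DLT topic messages"] --> D{Classify cause}
        B["Outbox FAILED records"] --> D
        C["EventLog FAILED logs"] --> D
    end

    subgraph "Response by cause (operational plan)"
        D -->|"Malformed message<br/>(JsonParseException)"| E["Fix the Producer code,<br/>republish as a new message"]
        D -->|"Transient failure, now recovered<br/>(DB timeout, etc.)"| F["Republish the DLT message<br/>to the original topic"]
        D -->|"Business logic bug"| G["Deploy the fix,<br/>then reprocess the DLT message"]
        D -->|"Already corrected<br/>(Reconciliation, etc.)"| H["Discard the DLT message"]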
    end

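Cause	Response	Consumer idempotency
Malformed message	Fix the Producer code and publish a new message	Not applicable (new eventId)
Recovered transient failure	Republish the DLT message to the original topic	Yes: redelivered with the same eventId, skipped if already processed
Business logic bug	Deploy the fix, then reprocess the DLT message	Yes: processed normally if not in event_handled
Already corrected by reconciliation	Discard the DLT message	No processing needed

Even on DLT republish, the Consumer's IdempotentProcessor filters duplicates by eventId, so reprocessing is safe. An admin API for inspecting/republishing DLT messages and monitoring alerts will be implemented once the production environment is stable.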

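D12. Mechanisms for eventual consistency

Events alone cannot guarantee 100% consistency. Three complementary mechanisms are in place:

Layer	Mechanism	Role
Publishing	Outbox + RelayScheduler	Guarantees At-Least-Once publishing
Domain	Payment state machine + reconcilePending	Automatically corrects unresolved payments
Data	Reconciliation batch	Periodically syncs like counts from `COUNT(*)` on the `likes` table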

sequenceDiagram
    participant Scheduler
    participant DB

    Note over Scheduler,DB: Reconciliation batch (periodic)
    Scheduler->>DB: SELECT product_id, COUNT(*) FROM likes GROUP BY product_id
    DB-->>Scheduler: like counts based on the source data
    Scheduler->>DB: UPDATE products SET like_count = the COUNT(*) result
    Note over Scheduler,DB: corrects drift in the event-driven real-time aggregate from the source data

Reconciliation design principle: "Trust the event-driven real-time aggregate, but periodically compare it with the source data and correct the drift."
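
The correction step can be sketched in memory. This is an illustration under simplifying assumptions: `LikeCountReconciler` is a hypothetical class, the `likes` list holds one productId per like row, and the real batch does this with SQL rather than in application memory. Products with no like rows are reset to 0, a case the PR's reconciliation test also covers.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class LikeCountReconciler {
    /**
     * likes: one productId per row of the likes table (the source of truth).
     * counters: the possibly drifted products.like_count values.
     * Returns the corrected counters.
     */
    static Map<Long, Long> reconcile(List<Long> likes, Map<Long, Long> counters) {
        // Equivalent of SELECT product_id, COUNT(*) FROM likes GROUP BY product_id
        Map<Long, Long> truth = likes.stream()
                .collect(Collectors.groupingBy(id -> id, Collectors.counting()));
        Map<Long, Long> corrected = new HashMap<>();
        for (Long productId : counters.keySet()) {
            corrected.put(productId, truth.getOrDefault(productId, 0L));
        }
        return corrected;
    }

    public static void main(String[] args) {
        List<Long> likes = List.of(1L, 1L, 1L, 2L);
        Map<Long, Long> drifted = Map.of(1L, 5L, 2L, 1L, 3L, 4L);
        System.out.println(reconcile(likes, drifted));
    }
}
```

The counter is overwritten rather than adjusted, so the result does not depend on how the drift accumulated.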

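D13. Observability: three layers

Layer	Tool	Purpose
Audit trail	`EventLog` table	Per-event PROCESSED/SKIPPED/FAILED status, error message, processing time
Real-time monitoring	`ConsumerMetrics` (Micrometer)	`consumer.event.processed/skipped/failed` Counter + Timer
Final failure	DLT topic	Isolates messages that could not be processed after retries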

sequenceDiagram
    participant Consumer
    participant IdempotentProcessor
    participant DB
    participant Metrics

    Consumer->>IdempotentProcessor: process(eventId, handler)

    alt first-time processing
        IdempotentProcessor->>IdempotentProcessor: handler.run()
        IdempotentProcessor->>DB: event_handled INSERT
        IdempotentProcessor->>DB: EventLog (PROCESSED, duration)
        IdempotentProcessor->>Metrics: counter++ / timer record
    else duplicate delivery
        IdempotentProcessor->>DB: EventLog (SKIPPED)
        IdempotentProcessor->>Metrics: skipped counter++
    else processing failed
        IdempotentProcessor->>DB: EventLog (FAILED, errorMessage)
        IdempotentProcessor->>Metrics: failed counter++
        Note over IdempotentProcessor: exception re-thrown → ErrorHandler → DLT
    end

    Note over DB: EventLog save failures are isolated with try-catch<br/>no effect on the business TX

5. Rationale for settings

Kafka Consumer settings

// KafkaConfig.java
public static final int MAX_POLLING_SIZE = 500;                // batch size suited to ~132 rps
public static final int FETCH_MIN_BYTES = 1;                   // return as soon as a message arrives
public static final int FETCH_MAX_WAIT_MS = 1000;              // return after waiting at most 1s
public static final int SESSION_TIMEOUT_MS = 60 * 1000;        // 1 minute
public static final int HEARTBEAT_INTERVAL_MS = 20 * 1000;     // 20s (1/3 of session_timeout)
public static final int MAX_POLL_INTERVAL_MS = 2 * 60 * 1000;  // 2 minutes

// batch listener factory
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
factory.setConcurrency(3);
factory.setBatchListener(true);

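Setting	Value	Rationale
`max.poll.records`	500	Batch size suited to ~132 rps. Reduced from 3000: under DB load at 40ms per record, 3000 x 40ms = 120s, which equals `max.poll.interval.ms` and risks a rebalance. With 500 it is 20s, leaving a comfortable margin
`fetch.min.bytes`	1 (byte)	Return as soon as a message arrives. With the previous 1MB setting, current traffic almost always hit the `fetch.max.wait.ms` timeout, adding needless delay to every batch. Batch efficiency improves naturally as traffic grows
`fetch.max.wait.ms`	1s	Maximum wait when there are not enough messages. Reduced from 5s: having brought publish delay to ~0s with immediate publishing, adding 5s back on the Consumer side would be wasteful
`session.timeout.ms`	60s	Rebalance sensitivity. Too short and a GC pause triggers unnecessary rebalances
`heartbeat.interval.ms`	20s	1/3 of session.timeout (Kafka recommendation)
`max.poll.interval.ms`	2 minutes	Maximum time allowed to process a 500-record batch. Even at 40ms per record (DB load) that is 20s → comfortable margin
`ack-mode`	MANUAL	Explicit commit after processing completes → prevents message loss
`concurrency`	3	Matches the 3 partitions per topic (KafkaTopicConfig) 1:1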
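
The `max.poll.records` arithmetic from the table can be checked with a few lines. `PollBudget` is a hypothetical helper; the 40ms per record figure is the worst-case assumption stated in the table, not a measurement.

```java
final class PollBudget {
    /** Worst-case time to process one batch, in milliseconds. */
    static long batchMillis(int maxPollRecords, long perRecordMillis) {
        return maxPollRecords * perRecordMillis;
    }

    /** True when the batch finishes strictly before max.poll.interval.ms expires. */
    static boolean fitsWithin(int maxPollRecords, long perRecordMillis, long maxPollIntervalMs) {
        return batchMillis(maxPollRecords, perRecordMillis) < maxPollIntervalMs;
    }

    public static void main(String[] args) {
        long maxPollIntervalMs = 2 * 60 * 1000; // 2 minutes, as configured above
        System.out.println("3000 records: " + batchMillis(3000, 40) + " ms, fits=" + fitsWithin(3000, 40, maxPollIntervalMs));
        System.out.println(" 500 records: " + batchMillis(500, 40) + " ms, fits=" + fitsWithin(500, 40, maxPollIntervalMs));
    }
}
```

With 3000 records the worst case lands exactly on the 120s limit, leaving no margin, while 500 records use 20s of the same budget.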

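Kafka Producer settings

Setting	Value	Rationale
`acks`	all	Waits for all in-sync replicas. With the current replication-factor=1 this is effectively the same as acks=1, but it gives the stronger guarantee without a config change once production moves to replication-factor=3
`enable.idempotence`	true	Broker de-duplicates by PID + sequence number

Async thread pool settings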

// AsyncConfig.java
@Override
public Executor getAsyncExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(5);
    executor.setMaxPoolSize(10);
    executor.setQueueCapacity(100);
    executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
    executor.setWaitForTasksToCompleteOnShutdown(true);
    executor.setAwaitTerminationSeconds(30);
    executor.setThreadNamePrefix("event-handler-");
    executor.initialize();
    return executor;
}

@Override
public AsyncUncaughtExceptionHandler getAsyncUncaughtExceptionHandler() {
    return (ex, method, params) ->
            log.error("Unhandled exception in async handler: method={}", method.getName(), ex);
}

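Setting	Value	Rationale
`corePoolSize`	5	Likes at ~15 rps + view events. 5 threads are enough under normal load
`maxPoolSize`	10	Grows to 2x only after the queue is full. Capped so event handlers cannot take too many of the 30 Hikari connections
`queueCapacity`	100	Tasks queue once the core threads are busy. When the queue (100) is full and all 10 threads are busy, CallerRunsPolicy runs the task on the caller's thread
`rejectedExecutionHandler`	CallerRunsPolicy	When the pool is saturated, the caller (a Tomcat thread) runs the task itself instead of dropping it → no event loss, natural backpressure
`awaitTerminationSeconds`	30	Waits for in-flight tasks to finish on shutdown

CallerRunsPolicy trade-off: an AFTER_COMMIT listener is invoked after the TX commits but before the controller returns. Normally @Async submits the work to the async pool and returns immediately, but when CallerRunsPolicy kicks in, the Tomcat thread runs the auxiliary logic synchronously, which can delay the client response. At current traffic (likes ~15 rps, core 5), the queue filling up to 100 practically never happens; if it does, the system itself is in an abnormal state, and I judged that slowing down while still processing is better than dropping events (DiscardPolicy).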
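
The order in which a thread pool uses its core threads, queue, extra threads, and rejection handler can be observed with the plain JDK executor. This is a scaled-down sketch (core 1, max 2, queue 1 instead of 5 / 10 / 100) using `java.util.concurrent.ThreadPoolExecutor` directly; `CallerRunsDemo` is a hypothetical class, not the PR's AsyncConfig.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class CallerRunsDemo {
    /** Returns the name of the thread that executed the task submitted after saturation. */
    static String threadOfOverflowTask() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker busy until the demo is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        AtomicReference<String> ranOn = new AtomicReference<>();
        try {
            pool.execute(blocker); // 1st: occupies the single core thread
            pool.execute(blocker); // 2nd: waits in the queue (capacity 1)
            pool.execute(blocker); // 3rd: queue is full, so the pool grows to its max of 2 threads
            // 4th: both threads are busy and the queue is full, so the caller runs it synchronously
            pool.execute(() -> ranOn.set(Thread.currentThread().getName()));
        } finally {
            release.countDown();
            pool.shutdown();
        }
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ranOn.get();
    }

    public static void main(String[] args) {
        System.out.println("overflow task ran on: " + threadOfOverflowTask());
    }
}
```

The fourth task runs on the submitting thread, which in the real application would be the Tomcat request thread; that is the response delay described in the trade-off above.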

Server settings

# application.yml
server:
  shutdown: graceful              # finish in-flight requests before stopping
  tomcat:
    threads:
      max: 40                     # Hikari 30 x 1.3
      min-spare: 10

# jpa.yml
datasource:
  mysql-jpa:
    main:
      maximum-pool-size: 30       # 67% of the DB limit (45)
      minimum-idle: 20

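Setting	Value	Rationale
`shutdown`	graceful	Rejects new requests and waits for in-flight requests to finish before stopping. Together with `awaitTerminationSeconds`, prevents losing event-handler work
`threads.max`	40	About Hikari 30 x 1.3. Splitting the TX (auxiliary logic runs asynchronously) keeps connection hold time short, so threads > connections works
`maximum-pool-size`	30	67% of the DB limit (45). The remaining 15 are headroom for DBeaver, incident response, and monitoring
`minimum-idle`	20	Idle connections kept under normal load. Grows to 30 on traffic spikes

Resource pool design

MySQL allowed connections: 45
  ├─ Hikari Pool: 30 (67%), app only
  └─ Headroom: 15 for DBeaver, incident response, monitoring

Tomcat threads: 40 (about Hikari x 1.3)
  └─ Splitting the TX (auxiliary logic runs asynchronously) keeps connection
     hold time short, so threads > connections works

Async thread pool: core 5 / max 10 / queue 100
  └─ Dedicated to event handlers. Capped so it cannot take too much of the Hikari pool
     When the pool is saturated, CallerRunsPolicy runs the task on the Tomcat thread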

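Learning references

While working on this assignment I studied the basics of Kafka and messaging systems in separate repos.

messaging-lab: messaging system fundamentals (Producer/Consumer patterns, delivery guarantees, serialization/deserialization)
kafka-lab: hands-on Kafka (topics/partitions/consumer groups, batch processing, manual commits, error handling, DLT setup)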

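Review points

Review point 1: to overcome the limits of scheduled polling alone, I switched to immediate publish + safety-net polling

Problems with scheduled polling alone

Outbox INSERT (PENDING) → TX commit → scheduler polls every N seconds → send → SENT

The throughput ceiling is fixed at roughly batch size ÷ (polling interval + batch size × ACK time), and it can never exceed 1 ÷ ACK time because each send waits for its ACK.
Once traffic exceeds this ceiling, tuning is needed (larger batch, shorter polling interval), and both raise DB load.
Even in the normal case, each event is delayed by up to one polling interval.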
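
A rough model of that ceiling, as a sketch: one relay cycle costs the polling delay plus one sequential ACK wait per event. `PollingThroughput` is a hypothetical helper, and the batch size, polling interval, and ACK time below are illustrative assumptions, since BATCH_SIZE and measured ACK latency are not given in this PR.

```java
final class PollingThroughput {
    /**
     * Events per second a polling-only relay can sustain when every send waits for its ACK.
     * One cycle = the polling delay plus batchSize sequential ACK waits.
     */
    static double maxEventsPerSecond(int batchSize, double pollIntervalSec, double ackSec) {
        return batchSize / (pollIntervalSec + batchSize * ackSec);
    }

    public static void main(String[] args) {
        // Assumed values for illustration only: 1s polling delay, 5ms per ACK
        System.out.printf("batch 100:  %.1f events/s%n", maxEventsPerSecond(100, 1.0, 0.005));
        System.out.printf("batch 1000: %.1f events/s%n", maxEventsPerSecond(1000, 1.0, 0.005));
        System.out.printf("upper bound (1/ack): %.1f events/s%n", 1.0 / 0.005);
    }
}
```

Under these assumptions a batch of 100 sustains about 67 events/s, and growing the batch only approaches the 1 ÷ ACK time bound without reaching it, which is why tuning alone stops helping.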

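Improvement: immediate publish + safety-net polling

Outbox INSERT (PENDING) → TX commit → immediate send in afterCommit (main path)
                                    → on failure stays PENDING → collected by @Scheduled (safety-net path)

Because send() in afterCommit is non-blocking and returns immediately,
normal-case delay is close to zero and the throughput ceiling rises to what the Kafka broker can handle.
@Scheduled is only a safety net that collects the failed events.

Main path: immediate publish (OutboxEventService)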

@Transactional(propagation = Propagation.MANDATORY)
public void saveAndPublish(String eventType, String aggregateType, 
                           String aggregateId, String topic, Object payload) {
    OutboxEvent outbox = outboxEventFactory.create(eventType, aggregateType, aggregateId, topic, payload);
    outboxEventRepository.save(outbox);

    afterCommit(() ->
        kafkaTemplate.send(outbox.getTopic(), outbox.getAggregateId(), outbox.getPayload())
            .whenComplete((result, ex) -> {
                if (ex == null) {
                    outboxEventRepository.markPublishedByEventId(outbox.getEventId());
                } else {
                    log.warn("Immediate publish failed; the @Scheduled relay will retry: eventId={}",
                             outbox.getEventId(), ex);
                }
            }));
}

Safety-net path: scheduled polling (OutboxRelayScheduler)

@Scheduled(fixedDelay = 60000)
public void compensatePendingEvents() {
    List<OutboxEvent> pending = outboxEventRepository.findPending(BATCH_SIZE);
    for (OutboxEvent event : pending) {
        try {
            kafkaTemplate.send(event.getTopic(), event.getAggregateId(), event.getPayload())
                .get(5, TimeUnit.SECONDS);
            event.markSent();
        } catch (Exception e) {
            event.incrementRetryCount();
            if (event.getRetryCount() >= MAX_RETRY_COUNT) {
                event.markFailed();
            }
        }
        outboxEventRepository.save(event);
    }
}

Sequence diagram: main path (normal case)

sequenceDiagram
    participant Client
    participant Facade
    participant DB
    participant OutboxService
    participant Kafka
    
    Client->>Facade: API request
    activate Facade
    Facade->>DB: domain logic + Outbox INSERT (same TX)
    DB-->>Facade: TX COMMIT

    Note over OutboxService: afterCommit runs on the request thread right after commit
    OutboxService->>Kafka: send(), non-blocking
    Facade-->>Client: 200 OK
    deactivate Facade

    Kafka-->>OutboxService: ACK (whenComplete)
    OutboxService->>DB: markPublished (PENDING → SENT)

Sequence diagram: safety-net path (when the immediate publish fails)

sequenceDiagram
    participant Scheduler
    participant DB
    participant Kafka

    Note over Scheduler: @Scheduled runs every 60s
    Scheduler->>DB: SELECT * FROM outbox_events WHERE status = 'PENDING'
    DB-->>Scheduler: list of PENDING events

    loop for each event
        Scheduler->>Kafka: send().get(5s), synchronous blocking
        alt ACK received
            Kafka-->>Scheduler: ACK
            Scheduler->>DB: markSent (PENDING → SENT)
        else failure
            Scheduler->>DB: retryCount++ (FAILED once the limit is reached)
        end
    end

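Trade-offs I am aware of

The whenComplete callback calls the DB directly from the kafka-producer-network-thread.
While that thread is waiting for or holding a connection, all Kafka publishing from this producer can stall.

Even so, a failed marking cannot cause a loss, by construction:
stays PENDING → @Scheduled collects it with .get() → duplicate publish → Consumer idempotency absorbs it.
The role of the whenComplete marking is "an optimization that reduces unnecessary republishing by the safety-net poller".

Response strategy as traffic grows

Current      → call the DB directly in whenComplete. Simple and sufficient.
Medium scale → BlockingQueue + batch marking to protect the Kafka thread.
Large scale  → switch to CDC (Debezium). Hand the relay itself over to infrastructure.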
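
The medium-scale step could look roughly like the following sketch: the producer callback only enqueues the eventId, and a separate worker drains the queue and marks events in batches. `BatchSentMarker` is hypothetical and not part of this PR; the list of flushed batches stands in for a batched UPDATE against outbox_events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class BatchSentMarker {
    private final BlockingQueue<String> ackedEventIds = new LinkedBlockingQueue<>();
    final List<List<String>> flushedBatches = new ArrayList<>(); // stands in for batched UPDATEs

    /** Called from the producer callback: only enqueues, never touches the DB. */
    void onAck(String eventId) {
        ackedEventIds.offer(eventId);
    }

    /** Called from a separate worker or scheduler thread: marks up to maxBatch events at once. */
    int flush(int maxBatch) {
        List<String> batch = new ArrayList<>();
        ackedEventIds.drainTo(batch, maxBatch);
        if (!batch.isEmpty()) {
            // e.g. UPDATE outbox_events SET status = 'SENT' WHERE event_id IN (...)
            flushedBatches.add(batch);
        }
        return batch.size();
    }

    public static void main(String[] args) {
        BatchSentMarker marker = new BatchSentMarker();
        marker.onAck("a");
        marker.onAck("b");
        marker.onAck("c");
        System.out.println("flushed " + marker.flush(2) + ", then " + marker.flush(2));
        System.out.println(marker.flushedBatches);
    }
}
```

If the process dies with eventIds still in the queue, those events simply stay PENDING and are republished by the safety-net scheduler, so the no-loss property is unchanged.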

I tried this structure to overcome the limits of scheduled polling alone.
I would like to hear whether this strategy holds up in practice, and whether the scaling path is realistic.

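Purpose

Introduce Kafka and the Outbox pattern in an event-driven architecture to guarantee reliable delivery to external consumers, and make internal work asynchronous for performance (target environment: EC2 2 vCPU/8GB, about 132 rps).

Key changes

Outbox pattern (OutboxEvent, OutboxEventService, OutboxRelayScheduler): save the Outbox row inside the business TX → non-blocking Kafka send in afterCommit (success → SENT, failure → stays PENDING) → a 60s compensation scheduler polls PENDING and sends synchronously (.get(5s)), incrementing retryCount and moving to FAILED once the maximum is reached
Internal events: handled with ApplicationEvent + @TransactionalEventListener(AFTER_COMMIT) + @Async (likes, views, etc.)
Consumers: batch listener, manual ACK, IdempotentProcessor (event de-duplication), and DLT (DefaultErrorHandler + DeadLetterPublishingRecoverer)
Operational schedulers (Outbox cleanup after 7 days, event_handled/log cleanup, like-count sync, automatic expiry of orders left in CREATED for 10 minutes with stock/coupon restoration), plus related tests and tuning (thread/connection counts reduced)

Risks / caveats

When an event becomes FAILED after reaching MAX_RETRY_COUNT, manual intervention plus monitoring/reprocessing procedures are needed
In the batch consumer, one failing record causes the batch to be rolled back/reprocessed (poison-pill risk), and the DLT operating policy is incomplete
Reduced Tomcat/Hikari thread and connection counts may cause queuing and latency; this needs verification under real load
ApplicationEvent-based handlers are fire-and-forget and therefore less reliable (the acceptable loss range needs to be confirmed)

Testing / verification

Integration tests (OutboxEventServiceIntegrationTest, OutboxRelaySchedulerIntegrationTest, CouponAsyncApiE2ETest, CouponIssueConcurrencyTest, etc.) verify TX preservation → send → state transition, idempotency, and concurrency; Awaitility-based checks verify eventual consistency (like counts, etc.); monitoring thread pool/connection saturation, Kafka latency, and DLT rate under real traffic is recommended.

Question: Are alerting and reprocessing procedures (manual or automatic) already in place for FAILED Outbox events and DLT traffic? Also, if you have load test results showing whether the current settings (Tomcat 40, Hikari 30) are sufficient at the target rps (≈132), please share them.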

- Throws immediately when called from a context without a TX
- Prevents the Outbox INSERT from being called outside the business TX by mistake

- CouponIssueRequestTest (api): creation, isPending
- CouponIssueRequestTest (streamer): complete, reject, state transitions
- OutboxEventTest: creation, markSent, markFailed, incrementRetryCount
- EventLogTest: processed, skipped, failed, 500-character truncation
- OutboxEventFactoryTest: creation, UUID format, JSON serialization, failure

- LikeEventHandlerIntegrationTest: handler failure isolation (the like itself is guaranteed to succeed)
- OutboxEventServiceIntegrationTest: Outbox save, MANDATORY TX propagation, immediate publish
- OutboxRelaySchedulerIntegrationTest: .get() success/failure branches, FAILED after more than 10 attempts, empty queue

- IdempotentProcessorIntegrationTest: idempotent processing, duplicate skip, EventLog, exception rethrow
- MetricsServiceIntegrationTest: UPSERT of like/view/sales metrics

- CatalogEventConsumerTest: product.liked/viewed parsing, unsupported events skipped
- OrderEventConsumerTest: payment.completed parsing, payment.failed logged only, unsupported events skipped

- OutboxCleanupSchedulerIntegrationTest: SENT rows older than 7 days deleted, PENDING rows not deleted
- EventHandledCleanupSchedulerIntegrationTest: event_handled rows older than 7 days deleted

- Async issue request: 200 + PENDING, duplicate request returns the existing status, 404, 401
- Issue status lookup: PENDING confirmed, 404 when nothing was requested
- Kafka dependency removed with a KafkaTemplate @MockBean

@Modifying

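- Add a lightweight Coupon entity to streamer (issue-related fields only)
- Native query → JPQL @Modifying atomic UPDATE (expiry and deletion checks kept)
- Duplicate-issue check changed from DataIntegrityViolation to a prior lookup
- CouponIssueProcessorIntegrationTest: issue success / quantity exhausted / duplicate prevention / state verification

- 200 concurrent requests for 100 coupons → verify that exactly 100 are issued
- 20 threads running concurrently, synchronized with CountDownLatch
- Confirm that the atomic UPDATE (JPQL) prevents over-issuing

- reconcileLikeCountFromMetrics → reconcileLikeCountFromLikes
- UPDATE based on COUNT(*) of the likes table (product_metrics dependency removed)
- LikeCountReconciliationSchedulerIntegrationTest: mismatches synced, 0 when there are no likes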

coderabbitai · 2026-03-27T07:49:21Z

Walkthrough

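Converts the product like/view, payment, and coupon issue flows to an event-driven design and publishes to Kafka through the Outbox pattern. commerce-api adds event publishing, Outbox writes, schedulers, and an async API; commerce-streamer handles the Kafka consumers, idempotency, coupon issue processing, and metrics aggregation.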

Changes

Cohort / File(s)	Summary
Coupon async issue API & domain `apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponFacade.java`, `apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponIssueRequestInfo.java`, `apps/commerce-api/src/main/java/com/loopers/interfaces/api/coupon/CouponV1Controller.java`, `apps/commerce-api/src/main/java/com/loopers/interfaces/api/coupon/CouponV1Dto.java`, `apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java`, `apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequestRepository.java`, `apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueStatus.java`, `apps/commerce-api/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestJpaRepository.java`, `apps/commerce-api/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestRepositoryImpl.java`	Adds the async coupon issue endpoints (request and status lookup) and the domain entity and repositories that track the requests.
Outbox infrastructure (entity, factory, service, repository) `apps/commerce-api/src/main/java/com/loopers/support/outbox/OutboxEvent.java`, `.../OutboxEventStatus.java`, `.../OutboxEventFactory.java`, `.../OutboxEventJpaRepository.java`, `.../OutboxEventRepositoryImpl.java`, `.../OutboxEventService.java`	Implements creating the Outbox record inside the transaction and sending to Kafka after commit (SENT is marked in an async callback).
Outbox schedulers: cleanup and relay `apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxRelayScheduler.java`, `.../OutboxCleanupScheduler.java`	Adds schedulers that compensate for unsent events (relay) and delete old SENT records.
Event types, async handlers, publishers `apps/commerce-api/src/main/java/com/loopers/application/event/LikeCountEventHandler.java`, `.../ProductLikedEvent.java`, `.../ProductUnlikedEvent.java`, `.../ProductViewedEvent.java`, `.../PaymentCompletedEvent.java`, `.../PaymentFailedEvent.java`, `.../PaymentCanceledEvent.java`, `.../UserActivityEventHandler.java`, `apps/commerce-api/src/main/java/com/loopers/application/like/LikeFacade.java`, `apps/commerce-api/src/main/java/com/loopers/application/product/ProductFacade.java`	Adds like/unlike/view/payment event types and changes LikeFacade/ProductFacade/PaymentProcessor to publish ApplicationEvents. After the transaction, async handlers perform aggregation, cache invalidation, and Kafka sends.
Payment → Outbox integration `apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentProcessor.java`	Adds ApplicationEvent publishing and an OutboxEventService.saveAndPublish call whenever a payment state (completed/failed/canceled) occurs.
Like-count consistency (reconciliation) `apps/commerce-api/src/main/java/com/loopers/domain/product/ProductRepository.java`, `.../infrastructure/product/ProductJpaRepository.java`, `.../infrastructure/product/ProductRepositoryImpl.java`, `.../interfaces/scheduler/LikeCountReconciliationScheduler.java`	Adds a bulk like_count sync method based on the likes table and the scheduler that runs it.
Async config & Kafka topic registration `apps/commerce-api/src/main/java/com/loopers/support/config/AsyncConfig.java`, `apps/commerce-api/src/main/java/com/loopers/support/config/KafkaTopicConfig.java`, `apps/commerce-api/src/main/resources/application.yml`	Adds a ThreadPoolTaskExecutor for @Async, registers Kafka NewTopic beans (including DLT), and imports kafka.yml and lowers the Tomcat thread max in application.yml.
commerce-streamer: consumers, idempotency, coupon processing, metrics `apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CatalogEventConsumer.java`, `.../CouponIssueConsumer.java`, `.../OrderEventConsumer.java`, `apps/commerce-streamer/src/main/java/com/loopers/application/idempotent/IdempotentProcessor.java`, `.../coupon/CouponIssueProcessor.java`, `.../application/metrics/ConsumerMetrics.java`, `.../application/metrics/MetricsService.java`, `apps/commerce-streamer/src/main/java/com/loopers/domain/`, `apps/commerce-streamer/src/main/java/com/loopers/infrastructure/`	Adds Kafka batch consumers, idempotent processing, coupon issue logic, and the event log, idempotency, and metrics entities and repositories.
Kafka config, topic constants, module dependency `modules/kafka/src/main/java/com/loopers/confg/kafka/KafkaConfig.java`, `modules/kafka/src/main/java/com/loopers/confg/kafka/KafkaTopics.java`, `modules/kafka/src/main/resources/kafka.yml`, `apps/commerce-api/build.gradle.kts`	Adds DeadLetterPublishingRecoverer/DefaultErrorHandler, changes consumer tuning, adds Kafka topic constants, sets producer idempotence/acks, and adds the kafka module dependency to commerce-api.
Config changes: resource tuning `modules/jpa/src/main/resources/jpa.yml`, `apps/commerce-streamer/src/main/resources/application.yml`, `apps/commerce-api/src/main/resources/application.yml`	Reduces the JPA connection pool and Tomcat threads and adds the default Kafka consumer group and auto-offset-reset policy for commerce-streamer.
Tests: unit, integration, concurrency `apps/commerce-api/src/test/...`, `apps/commerce-streamer/src/test/...` (multiple files)	Adds tests for Outbox, coupon async E2E, consumers/idempotency, coupon concurrency, metrics, and the relay and cleanup schedulers, and updates existing tests to verify asynchronously with Awaitility.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(200,200,255,0.5)
    participant User as User
    participant API as commerce-api Controller
    participant Facade as CouponFacade
    participant Repo as CouponIssueRequestRepository
    participant OutboxSvc as OutboxEventService
    participant DB as Database
    participant Kafka as Kafka(COUPON_ISSUE_REQUESTS)
    end

    User->>API: POST /api/v1/coupons/{id}/issue-async
    API->>Facade: issueAsync(couponId, userId)
    Facade->>Repo: findByCouponIdAndUserId(couponId,userId)
    alt existing request
        Repo-->>Facade: existing request
        Facade-->>API: return existing status
    else new request
        Facade->>Repo: save(new CouponIssueRequest PENDING)
        Repo->>DB: INSERT coupon_issue_requests
        Facade->>OutboxSvc: saveAndPublish(eventType, aggregateType, aggregateId, topic, payload)
        OutboxSvc->>DB: INSERT outbox_events (PENDING) within transaction
        DB-->>OutboxSvc: transaction commit
        OutboxSvc->>Kafka: afterCommit -> kafka.send(...) (async)
        Kafka-->>OutboxSvc: send callback
        OutboxSvc->>DB: markPublishedByEventId(...) on success
        Facade-->>API: return created request info
    end

sequenceDiagram
    rect rgba(200,255,200,0.5)
    participant Kafka as Kafka(COUPON_ISSUE_REQUESTS)
    participant Consumer as CouponIssueConsumer
    participant Idem as IdempotentProcessor
    participant Processor as CouponIssueProcessor
    participant ReqRepo as CouponIssueRequestRepository
    participant CouponRepo as CouponRepository
    participant IssuedRepo as IssuedCouponJpaRepository
    participant DB as Database
    end

    Kafka->>Consumer: batch record(eventId,payload)
    Consumer->>Idem: process(eventId,eventType,topic,groupId, handler)
    Idem->>DB: existsByEventId(eventId)
    alt already handled
        Idem-->>Consumer: record SKIPPED, metrics
    else first time
        Idem->>Processor: handler.run() => process(eventId,couponId,userId)
        Processor->>ReqRepo: findByEventId(eventId)
        alt request missing or not PENDING
            Processor-->>Idem: no-op
        else
            Processor->>IssuedRepo: existsByCouponIdAndUserId(couponId,userId)
            alt duplicate exists
                Processor->>ReqRepo: reject(request,"이미 발급")
            else
                Processor->>CouponRepo: issueIfAvailable(couponId)
                alt issueIfAvailable == 0
                    Processor->>ReqRepo: reject(request,"수량 소진")
                else
                    Processor->>IssuedRepo: save(IssuedCoupon)
                    Processor->>ReqRepo: complete(request)
                end
            end
        end
        Idem->>DB: save(EventHandled), save(EventLog.processed)
        Idem-->>Consumer: processed
    end

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Loopers-dev-lab/loop-pack-be-l2-vol3-java#158: Strong code-level relation, with overlapping domain and API changes around async coupon issuing, CouponIssueRequest, and the Processor/Consumer.
Loopers-dev-lab/loop-pack-be-l2-vol3-java#286: Highly related; its Kafka/outbox infrastructure, KafkaTopics, and consumer/scheduler changes directly overlap with this PR.
Loopers-dev-lab/loop-pack-be-l2-vol3-java#220: Code-level relation through the changes to like-count handling in LikeFacade/ProductFacade and the reconciliation scheduler.

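Key operational findings

1. Insufficient Outbox event recovery and observability

Why it matters (operations): OutboxRelayScheduler retries sends, but the failure cause, attempt count, and latest error are not recorded, which makes root-cause analysis and automated response difficult.
Suggested fix: Add lastError and lastErrorAt fields and a failure-count column to the Outbox entity, and expose the related metrics (outbox.failed, outbox.retry_count) on the markFailed transition. Define a retention and alerting policy for FAILED events (e.g., a manual-review alert after 30 days of retention).
Additional test: Write an integration test verifying that, during a Kafka outage, OutboxRelayScheduler correctly increments retry, transitions to FAILED, and exposes the metrics after saveAndPublish is called.

2. Possible race condition in idempotent processing

Why it matters (operations): IdempotentProcessor's exists → save flow can process the same event twice when multiple instances handle it concurrently.
Suggested fix: Add a DB-level unique constraint on EventHandled.eventId, or use an INSERT … ON CONFLICT (upsert) strategy for an atomic check-and-save so that duplicates are blocked at the DB. Implement handling for unique-constraint violations inside the transaction (e.g., skip logic and logging).
Additional test: Write an E2E concurrency test verifying that only a single execution is allowed when the same eventId is processed concurrently by several threads/processes.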
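
The insert-first approach can be illustrated with a small stdlib-only model. This is a sketch under assumptions, not the PR's IdempotentProcessor: `InsertFirstIdempotency` is a hypothetical name, a concurrent set stands in for a unique constraint on event_handled, and removing the key on failure stands in for a transaction rollback.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Stdlib-only model: the Set plays the role of a unique constraint on event_handled. */
final class InsertFirstIdempotency {
    private final Set<String> handled = ConcurrentHashMap.newKeySet();

    /** Returns true if the handler ran, false if the event was a duplicate. */
    boolean process(String eventId, Runnable handler) {
        // add() is atomic: exactly one caller per eventId gets true, like a successful INSERT.
        if (!handled.add(eventId)) {
            return false; // duplicate key: treat as already processed
        }
        try {
            handler.run();
            return true;
        } catch (RuntimeException e) {
            handled.remove(eventId); // models the TX rollback so a retry can run the handler again
            throw e;
        }
    }
}
```

Compared with exists → run → save, the claim happens before the handler runs, so two concurrent deliveries cannot both pass the check. With a real database the same effect would presumably come from running the INSERT and the handler in one transaction and treating a duplicate-key error as "already processed".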

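3. Missing Kafka send-failure policy in the async handlers

Why it matters (operations): LikeCountEventHandler and UserActivityEventHandler only log Kafka send failures, so events can be lost and downstream data can diverge.
Suggested fix: Move important events to the Outbox pattern, or back them up to the Outbox on send failure so that the relay scheduler can compensate. Add standardized metrics and alerts on failure.
Additional test: Write an integration test verifying the scenario in which, on a Kafka send failure, the event is stored in the Outbox and OutboxRelayScheduler resends it and marks it published (markPublished).

4. Atomicity of coupon issuing and timing of the duplicate check

Why it matters (operations): If concurrent processing interleaves between CouponIssueProcessor's duplicate check (existsByCouponIdAndUserId) and the quantity decrement (issueIfAvailable), duplicate issuing or wrongful rejection can occur.
Suggested fix: Put an explicit DB unique constraint on issued_coupons, and decide on duplicates from the DB constraint and the affected rows returned by issueIfAvailable. If needed, attempt to save the issued coupon and treat a unique-constraint violation as REJECTED.
Additional test: Add a high-concurrency E2E test verifying that the number of completed issues never exceeds maxIssueCount and that REJECTED reasons are recorded distinctly as "이미 발급" (already issued) / "수량 소진" (quantity exhausted).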
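
A constraint-first issuing flow could be modeled as below, using only the standard library. `CouponIssueSketch` and its result names are hypothetical: a concurrent set stands in for the unique constraint on issued_coupons, and a compare-and-set loop stands in for a conditional `UPDATE ... WHERE issued < max`.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Stdlib-only model of constraint-first issuing; names and reasons are illustrative. */
final class CouponIssueSketch {
    enum Result { ISSUED, REJECTED_DUPLICATE, REJECTED_SOLD_OUT }

    private final int maxIssueCount;
    private final AtomicInteger issuedCount = new AtomicInteger();
    private final Set<Long> issuedUsers = ConcurrentHashMap.newKeySet(); // unique (coupon, user)

    CouponIssueSketch(int maxIssueCount) {
        this.maxIssueCount = maxIssueCount;
    }

    Result issue(long userId) {
        // 1. Claim the (coupon, user) slot first; a failed add() models a unique-key violation.
        if (!issuedUsers.add(userId)) {
            return Result.REJECTED_DUPLICATE;
        }
        // 2. Conditional increment, like UPDATE ... SET issued = issued + 1 WHERE issued < max.
        while (true) {
            int current = issuedCount.get();
            if (current >= maxIssueCount) {
                issuedUsers.remove(userId); // models rolling back the claim
                return Result.REJECTED_SOLD_OUT;
            }
            if (issuedCount.compareAndSet(current, current + 1)) {
                return Result.ISSUED;
            }
        }
    }

    int issuedCount() {
        return issuedCount.get();
    }
}
```

Neither step relies on a value read earlier, so an interleaving between the duplicate check and the decrement cannot produce a second coupon for the same user or push the count past the limit.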

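5. Batch consumer partial-failure policy needs verification

Why it matters (operations): When a single record in a batch fails, BatchListenerFailedException causes the whole batch to be reprocessed, which can add latency. Records that were already processed are protected by idempotency, but retries and delays can grow.
Suggested fix: Either allow partial failure or use DeadLetterPublishingRecoverer to send only the failing record to the DLT, make that policy explicit, and verify the retry backoff and count. Document the non-retryable exceptions and retry counts in the consumer configuration.
Additional test: Write an integration test verifying the scenario in which a record with a parsing error in a batch is sent to the DLT while the remaining records are processed normally.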
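
The policy described above (retry a record a bounded number of times, then dead-letter only that record) can be expressed as a small stdlib-only model. This is not Spring Kafka's API; `BatchDltSketch`, `Outcome`, and `consume` are hypothetical names, and records are plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Stdlib-only model of "dead-letter only the failing record"; not Spring Kafka's API. */
final class BatchDltSketch {
    /** Outcome of one batch: records handled normally and records routed to the dead-letter list. */
    static final class Outcome {
        final List<String> processed = new ArrayList<>();
        final List<String> deadLettered = new ArrayList<>();
    }

    static Outcome consume(List<String> batch, Consumer<String> handler, int maxAttempts) {
        Outcome outcome = new Outcome();
        for (String record : batch) {
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                try {
                    handler.accept(record);
                    ok = true;
                } catch (RuntimeException e) {
                    // swallow and retry until the attempts are exhausted
                }
            }
            if (ok) {
                outcome.processed.add(record);
            } else {
                outcome.deadLettered.add(record); // only this record is diverted
            }
        }
        return outcome;
    }
}
```

An integration test for the real consumer could assert the same two lists: the poison record ends up on the DLT topic and every other record in the batch is processed.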

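6. Monitoring and alerting need strengthening

Why it matters (operations): ConsumerMetrics and EventLog were added, but unless the key indicators (outbox.pending, outbox.failed, consumer.failed, idempotent.duplicate, etc.) are wired to dashboards and alerts, problems are hard to detect quickly.
Suggested fix: Collect the main counters and timers through a Prometheus/Grafana integration and set alert rules. Configure automatic alarms for when Outbox/Consumer thresholds are exceeded.
Additional test: Write an integration test that simulates an event-processing failure rate and checks that the alert is triggered.

7. Impact analysis needed for resource retuning (thread pools, DB connections)

Why it matters (operations): Shrinking the Tomcat and JPA pools (server.tomcat.threads.max, maximum-pool-size, etc.) limits concurrent capacity during traffic spikes and can cause request latency, consumer lag, and deadlocks.
Suggested fix: Use load testing to derive thread and connection settings that match production traffic, and introduce autoscaling or horizontal scaling if needed. Tune the async executor settings based on what the async handlers require.
Additional test: Run a load test that measures concurrent request/event throughput and observe response latency and consumer lag.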


Copilot

Pull request overview

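Switches Outbox-based event publishing from "scheduled polling only" to "immediate publish on afterCommit + fallback polling for failures", and adds the commerce-streamer (metrics/coupon) consumers that consume it, along with the related domain and infrastructure.

Changes:

commerce-api: introduces OutboxEvent (save/publish/fallback relay/cleanup) and adds payment/like event publishing paths
commerce-streamer: adds Kafka batch consumers + idempotent processing (EventHandled/EventLog) + metrics upsert + coupon issue processing
Kafka common module: producer idempotence/acks, consumer config corrections, DLT error handler added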

Reviewed changes

Copilot reviewed 95 out of 95 changed files in this pull request and generated 15 comments.

File	Description
modules/kafka/src/main/resources/kafka.yml	Updates Kafka producer/consumer settings (acks/idempotence, deserializer correction)
modules/kafka/src/main/java/com/loopers/confg/kafka/KafkaTopics.java	Adds topic constant definitions
modules/kafka/src/main/java/com/loopers/confg/kafka/KafkaConfig.java	Adds batch listener/error handler (DLT) configuration
apps/commerce-streamer/src/main/resources/application.yml	Adds app name, consumer group-id, and related settings
apps/commerce-streamer/src/main/java/com/loopers/interfaces/scheduler/EventHandledCleanupScheduler.java	Adds cleanup scheduler for handled/log records
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CatalogEventConsumer.java	Adds catalog-events batch consumer
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/OrderEventConsumer.java	Adds order-events batch consumer
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CouponIssueConsumer.java	Adds coupon-issue-requests batch consumer
apps/commerce-streamer/src/main/java/com/loopers/application/idempotent/IdempotentProcessor.java	Adds idempotent processing + EventHandled/EventLog recording
apps/commerce-streamer/src/main/java/com/loopers/application/metrics/MetricsService.java	Adds metrics increment service
apps/commerce-streamer/src/main/java/com/loopers/application/metrics/ConsumerMetrics.java	Adds Micrometer-based processed/failed/skipped metric recording
apps/commerce-streamer/src/main/java/com/loopers/application/coupon/CouponIssueProcessor.java	Adds coupon issue processing (duplicate/quantity/issued record)
apps/commerce-streamer/src/main/java/com/loopers/domain/idempotent/EventHandled.java	Adds idempotency marker entity
apps/commerce-streamer/src/main/java/com/loopers/domain/idempotent/EventHandledRepository.java	Adds idempotency repository interface
apps/commerce-streamer/src/main/java/com/loopers/domain/log/EventLog.java	Adds consumption log entity
apps/commerce-streamer/src/main/java/com/loopers/domain/log/EventLogRepository.java	Adds consumption log repository interface
apps/commerce-streamer/src/main/java/com/loopers/domain/log/EventLogStatus.java	Adds consumption log status enum
apps/commerce-streamer/src/main/java/com/loopers/domain/metrics/ProductMetrics.java	Adds product_metrics entity
apps/commerce-streamer/src/main/java/com/loopers/domain/metrics/ProductMetricsRepository.java	Adds metrics repository interface
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/Coupon.java	Adds lightweight Coupon entity for streamer
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponRepository.java	Adds Coupon repository interface
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java	Adds coupon issue request entity
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponIssueRequestRepository.java	Adds coupon issue request repository interface
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponIssueStatus.java	Adds coupon issue request status enum
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/IssuedCoupon.java	Adds issued_coupons entity
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/idempotent/EventHandledJpaRepository.java	Adds EventHandled JPA repository
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/idempotent/EventHandledRepositoryImpl.java	Adds EventHandled repository implementation
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/log/EventLogJpaRepository.java	Adds EventLog JPA repository
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/log/EventLogRepositoryImpl.java	Adds EventLog repository implementation
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/metrics/ProductMetricsJpaRepository.java	Adds product_metrics upsert query
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/metrics/ProductMetricsRepositoryImpl.java	Adds metrics repository implementation
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/CouponJpaRepository.java	Adds coupon atomic update query
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/CouponRepositoryImpl.java	Adds Coupon repository implementation
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestJpaRepository.java	Adds coupon issue request JPA repository
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestRepositoryImpl.java	Adds coupon issue request repository implementation
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/IssuedCouponJpaRepository.java	Adds issued_coupons JPA repository
apps/commerce-streamer/src/test/java/com/loopers/interfaces/scheduler/EventHandledCleanupSchedulerIntegrationTest.java	Adds integration test for the handled/log cleanup scheduler
apps/commerce-streamer/src/test/java/com/loopers/interfaces/consumer/CatalogEventConsumerTest.java	Adds catalog consumer unit test
apps/commerce-streamer/src/test/java/com/loopers/interfaces/consumer/OrderEventConsumerTest.java	Adds order consumer unit test
apps/commerce-streamer/src/test/java/com/loopers/domain/log/EventLogTest.java	Adds EventLog domain test
apps/commerce-streamer/src/test/java/com/loopers/domain/coupon/CouponIssueRequestTest.java	Adds CouponIssueRequest domain test
apps/commerce-streamer/src/test/java/com/loopers/concurrency/CouponIssueConcurrencyTest.java	Adds concurrent coupon issue integration test
apps/commerce-streamer/src/test/java/com/loopers/application/metrics/MetricsServiceIntegrationTest.java	Adds metrics service integration test (currently lacks assertions)
apps/commerce-streamer/src/test/java/com/loopers/application/idempotent/IdempotentProcessorIntegrationTest.java	Adds idempotent processing integration test (currently lacks assertions)
apps/commerce-streamer/src/test/java/com/loopers/application/coupon/CouponIssueProcessorIntegrationTest.java	Adds coupon issue processor integration test
apps/commerce-api/src/main/resources/application.yml	Adds kafka.yml import in commerce-api
apps/commerce-api/build.gradle.kts	Adds modules:kafka dependency to commerce-api
apps/commerce-api/src/main/java/com/loopers/support/outbox/OutboxEvent.java	Adds Outbox event entity
apps/commerce-api/src/main/java/com/loopers/support/outbox/OutboxEventStatus.java	Adds Outbox status enum
apps/commerce-api/src/main/java/com/loopers/support/outbox/OutboxEventRepository.java	Adds Outbox repository interface
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventFactory.java	Adds Outbox payload serialization factory
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventJpaRepository.java	Adds Outbox JPA repository (marking/deletion queries)
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventRepositoryImpl.java	Adds Outbox repository implementation
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventService.java	Adds immediate publish on afterCommit + marking on ACK
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxRelayScheduler.java	Adds PENDING fallback relay scheduler
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxCleanupScheduler.java	Adds SENT record cleanup scheduler
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/LikeCountReconciliationScheduler.java	Adds likeCount resync scheduler
apps/commerce-api/src/main/java/com/loopers/infrastructure/product/ProductRepositoryImpl.java	Adds reconcileLikeCountFromLikes delegation
apps/commerce-api/src/main/java/com/loopers/infrastructure/product/ProductJpaRepository.java	Adds native update query for likeCount reconcile
apps/commerce-api/src/main/java/com/loopers/domain/product/ProductRepository.java	Adds reconcileLikeCountFromLikes contract
apps/commerce-api/src/main/java/com/loopers/support/config/AsyncConfig.java	Adds `@Async` executor configuration
apps/commerce-api/src/main/java/com/loopers/support/config/KafkaTopicConfig.java	Adds topic/DLT topic creation beans
apps/commerce-api/src/main/java/com/loopers/application/like/LikeFacade.java	Switches from afterCommit cache eviction to domain event publishing
apps/commerce-api/src/main/java/com/loopers/application/product/ProductFacade.java	Adds product view event publishing
apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentProcessor.java	Adds payment result event publishing + Outbox publishing
apps/commerce-api/src/main/java/com/loopers/application/event/*Event.java	Adds Product/Payment/Coupon event records
apps/commerce-api/src/main/java/com/loopers/application/event/LikeCountEventHandler.java	Adds AFTER_COMMIT + `@Async` aggregation/cache/Kafka publishing handler
apps/commerce-api/src/main/java/com/loopers/application/event/UserActivityEventHandler.java	Adds user activity logging handler
apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponIssueRequestInfo.java	Adds coupon async request status DTO
apps/commerce-api/src/main/java/com/loopers/interfaces/api/coupon/CouponV1Dto.java	Adds async issue request response DTO
apps/commerce-api/src/main/java/com/loopers/interfaces/api/coupon/CouponV1Controller.java	Adds async issue request/status lookup endpoints
apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java	Adds coupon_issue_requests entity (uk_coupon_user)
apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequestRepository.java	Adds coupon issue request repository interface
apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueStatus.java	Adds coupon request status enum
apps/commerce-api/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestJpaRepository.java	Adds coupon request lookup JPA repository
apps/commerce-api/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestRepositoryImpl.java	Adds coupon request repository implementation
apps/commerce-api/src/test/java/com/loopers/support/outbox/OutboxEventTest.java	Adds OutboxEvent domain test
apps/commerce-api/src/test/java/com/loopers/infrastructure/outbox/OutboxEventFactoryTest.java	Adds OutboxEventFactory test
apps/commerce-api/src/test/java/com/loopers/infrastructure/outbox/OutboxEventServiceIntegrationTest.java	Adds OutboxEventService integration test
apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/OutboxRelaySchedulerIntegrationTest.java	Adds Outbox relay scheduler integration test (some assertions missing)
apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/OutboxCleanupSchedulerIntegrationTest.java	Adds Outbox cleanup scheduler integration test (currently lacks assertions)
apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/LikeCountReconciliationSchedulerIntegrationTest.java	Adds likeCount reconcile scheduler integration test
apps/commerce-api/src/test/java/com/loopers/interfaces/api/like/LikeApiE2ETest.java	Stabilizes the E2E test by waiting for async propagation (Awaitility)
apps/commerce-api/src/test/java/com/loopers/concurrency/LikeConcurrencyTest.java	Stabilizes the concurrency test by waiting for async propagation (Awaitility)
apps/commerce-api/src/test/java/com/loopers/application/event/LikeEventHandlerIntegrationTest.java	Adds event handler failure isolation integration test
apps/commerce-api/src/test/java/com/loopers/interfaces/api/coupon/CouponAsyncApiE2ETest.java	Adds coupon async issue E2E test
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/DemoKafkaConsumer.java	Removes demo consumer

Copilot · 2026-03-27T07:55:12Z

apps/commerce-api/src/main/java/com/loopers/application/event/LikeCountEventHandler.java

+    private void publishToKafka(String eventType, Long productId, Object event) {
+        try {
+            String payload = objectMapper.writeValueAsString(event);
+            kafkaTemplate.send(KafkaTopics.CATALOG_EVENTS, String.valueOf(productId), payload);
+        } catch (Exception e) {


publishToKafka() publishes the raw ProductLiked/UnlikedEvent JSON directly to catalog-events, but CatalogEventConsumer expects an envelope with eventId/eventType/payload. This contract mismatch will cause parsing failures (or empty eventType) in the streamer and prevent metrics updates. Consider publishing the same envelope format used by the consumers (and generating an eventId), or changing consumers to match the produced schema.

Copilot · 2026-03-27T07:55:13Z

.../commerce-streamer/src/main/java/com/loopers/application/idempotent/IdempotentProcessor.java

+    public boolean process(String eventId, String eventType, String topic, String groupId, Runnable handler) {
+        if (eventHandledRepository.existsByEventId(eventId)) {
+            log.debug("이미 처리된 이벤트 스킵: eventId={}, topic={}, groupId={}", eventId, topic, groupId);
+            eventLogRepository.save(EventLog.skipped(eventId, eventType, topic, groupId));
+            consumerMetrics.recordSkipped(topic, groupId, eventType);
+            return false;
+        }
+
+        long startTime = System.currentTimeMillis();
+        try {
+            handler.run();
+            eventHandledRepository.save(EventHandled.create(eventId, eventType));
+


Idempotency check is existsByEventId() followed by handler.run() and then save(EventHandled). This is race-prone: concurrent deliveries of the same event can both pass the existence check and run the handler twice, then one will fail on PK/unique constraint. Prefer an atomic approach (e.g., insert EventHandled first and treat duplicate-key as "already processed", or use a DB-level unique constraint on (eventId, groupId) and handle conflicts) so the handler runs at most once.

Copilot · 2026-03-27T07:55:13Z

...-api/src/test/java/com/loopers/interfaces/scheduler/OutboxRelaySchedulerIntegrationTest.java

+        @Test
+        void SENT된_이벤트의_sentAt이_설정된다() {
+            OutboxEvent event = savePendingEvent();
+            when(kafkaTemplate.send(any(), any(), any())).thenReturn(CompletableFuture.completedFuture(null));
+
+            outboxRelayScheduler.compensatePendingEvents();
+
+            // The row no longer shows up in a PENDING query: findPending returns PENDING rows only
+            // sentAt is checked indirectly, through the transition to SENT
+            assertThat(outboxEventRepository.findPending(10)).isEmpty();
+        }


SENT된_이벤트의_sentAt이_설정된다 test currently only checks that findPending() is empty, which doesn't validate sentAt at all (it would be empty even if sentAt is null but status changed). Consider asserting on the actual row (e.g., fetch by eventId/status or via JPA repository) and verify status=SENT and sentAt != null.

Copilot · 2026-03-27T07:55:13Z

modules/kafka/src/main/resources/kafka.yml

    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
-      retries: 3
+      acks: all
+      properties:
+        enable.idempotence: true


Producer value-serializer is JsonSerializer, but much of the code publishes pre-serialized JSON strings (e.g., OutboxEventService sends outboxEvent.getPayload()). JsonSerializer will serialize a String as a JSON string literal (quoting/escaping), so consumers that parse the bytes as a JSON object will fail. Either switch producer value-serializer to StringSerializer when sending JSON strings, or change publishers to send structured objects (envelope/DTO) and keep JsonSerializer end-to-end.

Copilot · 2026-03-27T07:55:13Z

apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventService.java

+        outboxEventRepository.save(outboxEvent);
+
+        afterCommit(() ->
+                kafkaTemplate.send(outboxEvent.getTopic(), outboxEvent.getAggregateId(), outboxEvent.getPayload())


The consumers in commerce-streamer expect an envelope JSON with fields like eventId, eventType, and a nested payload string, but the Outbox publisher currently sends only outboxEvent.getPayload() (the serialized domain event) as the Kafka value. That means eventId/eventType will be missing at the consumer and the switch will fall through, effectively dropping events. Consider sending a consistent envelope DTO (including eventId/eventType/payload) or sending the OutboxEvent itself, and align consumer parsing/tests accordingly.

Suggested change

kafkaTemplate.send(outboxEvent.getTopic(), outboxEvent.getAggregateId(), outboxEvent.getPayload())

kafkaTemplate.send(outboxEvent.getTopic(), outboxEvent.getAggregateId(), outboxEvent)

Copilot · 2026-03-27T07:55:15Z

apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CatalogEventConsumer.java

+    private void processRecord(ConsumerRecord<String, byte[]> record) throws Exception {
+        JsonNode node = objectMapper.readTree(record.value());
+        String eventId = node.path("eventId").asText();
+        String eventType = node.path("eventType").asText();
+        JsonNode payload = objectMapper.readTree(node.path("payload").asText());
+


processRecord() assumes the Kafka message is an envelope JSON containing eventId, eventType, and a stringified payload. With the current publishers (OutboxEventService/LikeCountEventHandler) sending different shapes (often just the payload JSON), node.path("eventId")/path("eventType") will be empty and the message will be skipped. Align the produced message schema with what the consumer parses, or adjust parsing to support the actual produced schema.
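
One way to align producer and consumer is to publish an explicit envelope carrying the three fields the consumer reads. The following is a minimal stdlib-only sketch: `EventEnvelope` is a hypothetical class, and the hand-written `quote` helper only exists to keep the example dependency-free (real code would presumably serialize with the project's ObjectMapper).

```java
/** Hypothetical envelope matching the fields the consumer reads: eventId, eventType, payload. */
final class EventEnvelope {
    final String eventId;
    final String eventType;
    final String payload; // already-serialized domain event JSON, carried as a string

    EventEnvelope(String eventId, String eventType, String payload) {
        this.eventId = eventId;
        this.eventType = eventType;
        this.payload = payload;
    }

    /** Minimal JSON string escaping; a JSON library would normally do this. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Produces {"eventId":...,"eventType":...,"payload":"<stringified JSON>"}. */
    String toJson() {
        return "{\"eventId\":" + quote(eventId)
                + ",\"eventType\":" + quote(eventType)
                + ",\"payload\":" + quote(payload) + "}";
    }
}
```

Because `payload` is emitted as a JSON string rather than a nested object, it matches the consumer's `objectMapper.readTree(node.path("payload").asText())` step shown in the diff.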

Copilot · 2026-03-27T07:55:15Z

...pi/src/test/java/com/loopers/interfaces/scheduler/OutboxCleanupSchedulerIntegrationTest.java

+        @Test
+        void SENT_상태이고_7일_경과한_레코드가_삭제된다() {
+            saveSentEvent(ZonedDateTime.now().minusDays(8));
+
+            outboxCleanupScheduler.cleanup();
+
+            // SENT + 8 days ago → deleted. This cannot be confirmed indirectly through a PENDING query, so add a new PENDING row and check
+            // cleanup deletes SENT rows only, so PENDING rows are unaffected
+        }
+
+        @Test
+        void SENT_상태이고_7일_미만인_레코드는_삭제되지_않는다() {
+            saveSentEvent(ZonedDateTime.now().minusDays(3));
+
+            outboxCleanupScheduler.cleanup();
+
+            // SENT 3 days ago → not deleted
+        }


This test doesn't assert anything about the cleanup result (only comments), so it can pass even if cleanup deletes the wrong rows or nothing at all. Add assertions by querying the outbox table (e.g., count/find by status+sentAt) to verify that SENT older than retention is removed and recent SENT remains.

Copilot · 2026-03-27T07:55:16Z

apps/commerce-streamer/src/main/java/com/loopers/domain/idempotent/EventHandled.java

+    @Id
+    @Column(name = "event_id", length = 36)
+    private String eventId;
+
+    @Column(name = "event_type", nullable = false, length = 100)
+    private String eventType;


idempotencyKey is built as GROUP_ID + ":" + eventId, but EventHandled.eventId is mapped with length=36. With UUID eventIds (36 chars) and non-trivial groupIds, this will exceed the column length and cause persistence errors/truncation at runtime. Either store eventId and groupId as separate columns (preferred) or increase the column length to accommodate the composed key.
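
The length problem is plain arithmetic and can be checked with a few lines of Java. In this sketch `IdempotencyKeySketch` and its helpers are hypothetical, and the group id used in the usage note is made up for illustration.

```java
/** Illustrates why a composed "groupId:eventId" key cannot fit a 36-character column. */
final class IdempotencyKeySketch {
    static final int EVENT_ID_COLUMN_LENGTH = 36; // from @Column(name = "event_id", length = 36)

    /** Mirrors the GROUP_ID + ":" + eventId composition described in the comment above. */
    static String composedKey(String groupId, String eventId) {
        return groupId + ":" + eventId;
    }

    static boolean fits(String key) {
        return key.length() <= EVENT_ID_COLUMN_LENGTH;
    }
}
```

A UUID string is exactly 36 characters, so it fits on its own, but even a one-character group id plus the colon brings the composed key to 38. With an illustrative group id such as `metrics-group` the key would be 50 characters, which is why separate eventId and groupId columns (or a longer column) are needed.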

Copilot · 2026-03-27T07:55:16Z

apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/OrderEventConsumer.java

+        switch (eventType) {
+            case "payment.completed" -> {
+                Long productId = payload.path("orderId").asLong();
+                BigDecimal amount = new BigDecimal(payload.path("amount").asText());
+                idempotentProcessor.process(idempotencyKey, eventType, TOPIC, GROUP_ID,
+                        () -> metricsService.incrementSales(productId, 1, amount));
+            }


The consumer parses payload.orderId but stores it in a variable named productId and uses it as the key for sales metrics. However the produced Payment* events (from PaymentProcessor) contain orderId and do not include productId, so this will aggregate sales by orderId (wrong dimension) or require a lookup. Either include productId in the event payload, or change the metrics model/consumer logic to derive productId correctly.

Copilot · 2026-03-27T07:55:16Z

modules/kafka/src/main/java/com/loopers/confg/kafka/KafkaTopics.java

+package com.loopers.confg.kafka;
+
+public final class KafkaTopics {
+
+    public static final String CATALOG_EVENTS = "catalog-events";
+    public static final String ORDER_EVENTS = "order-events";


This new package name com.loopers.confg.kafka looks like a typo/inconsistency with the existing com.loopers.config.* convention used by other modules (e.g. modules/jpa/src/main/java/com/loopers/config/jpa/JpaConfig.java). Consider renaming to com.loopers.config.kafka to keep package naming consistent across modules.

coderabbitai

Actionable comments posted: 2

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

apps/commerce-api/src/main/java/com/loopers/application/product/ProductFacade.java (1)
86-99: ⚠️ Potential issue | 🟠 Major

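View events are published only on a cache miss, so the activity log is undercounted.

On a cache hit, lines 88-90 return immediately and the event publication on line 97 is skipped. Operationally, the higher the cache hit rate, the further view events and metrics fall below real traffic, which degrades the accuracy of monitoring, recommendations, and analytics. Move the publication so the event is emitted on both cache hits and misses. As an additional test, verify that after cache warm-up, consecutive getActiveDetail calls publish one ProductViewedEvent per call.
Suggested fix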
 `@Transactional`(readOnly = true)
 public ProductInfo getActiveDetail(Long productId) {
     Optional<ProductInfo> cached = productCacheManager.getDetail(productId);
     if (cached.isPresent()) {
-        return cached.get();
+        ProductInfo info = cached.get();
+        eventPublisher.publishEvent(new ProductViewedEvent(null, productId));
+        return info;
     }

     Product product = productService.getActiveProduct(productId);
     Brand brand = brandService.getBrand(product.getBrandId());
     Stock stock = stockService.getStock(productId);
     ProductInfo info = ProductInfo.from(product, brand.getName(), stock.getQuantity());
     productCacheManager.putDetail(productId, info);
     eventPublisher.publishEvent(new ProductViewedEvent(null, productId));
     return info;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/application/product/ProductFacade.java`
around lines 86 - 99, The ProductViewedEvent is only published on cache miss
because getActiveDetail returns early on cache hit; update getActiveDetail so
eventPublisher.publishEvent(new ProductViewedEvent(null, productId)) is invoked
for both cache hits and misses (e.g., call publishEvent immediately after
obtaining cached.get() before returning, and keep the existing publish on miss
after putDetail), and add a unit/integration test that calls getActiveDetail
repeatedly after cache warming to assert ProductViewedEvent is published on each
call; refer to getActiveDetail, productCacheManager.getDetail,
productCacheManager.putDetail, eventPublisher.publishEvent, and
ProductViewedEvent to locate changes.

🟠 Major comments (19)

modules/kafka/src/main/resources/kafka.yml-17-19 (1)
17-19: ⚠️ Potential issue | 🟠 Major

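The compensation path's timeout (5 s) is shorter than the producer timeout (20 s), which creates a risk of duplicate publishing.

The mismatch between request.timeout.ms: 20000 (20 s) in kafka.yml and .get(5, TimeUnit.SECONDS) in OutboxRelayScheduler allows the following scenario:

The compensation path declares a timeout after 5 s → retryCount is incremented and a republish is attempted

The producer's first attempt is still in flight → its ACK arrives within 20 s

Result: the same message is written to Kafka twice, or the outbox status (PENDING/FAILED) diverges from the actual publish result

Suggested fix:

Raise the compensation path's timeout to at least delivery.timeout.ms or request.timeout.ms (recommended: 25 s or more)

Harden the state-transition logic so that only a single confirmed success ends the flow (prevents duplicate marking)

Required test:
Add an integration test that delays the Kafka broker response by 6-10 s and verifies that only one message is consumed and the outbox status consistently ends as SENT. The current tests do not cover delays longer than 5 s.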
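The race can be reproduced without a broker. The following is a minimal sketch, not the project's code: a CompletableFuture stands in for the result of kafkaTemplate.send(...), and the delays are scaled down to milliseconds. The class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class RelayTimeoutRace {

    /**
     * Returns true when the caller gave up waiting even though the original
     * send later succeeded, which is the window in which a resend duplicates.
     */
    static boolean callerGivesUpButSendSucceeds(long ackDelayMs, long waitMs) {
        // Stand-in for the send future: completes with "ack" after ackDelayMs.
        CompletableFuture<String> send = CompletableFuture.supplyAsync(
                () -> "ack",
                CompletableFuture.delayedExecutor(ackDelayMs, TimeUnit.MILLISECONDS));

        boolean timedOut = false;
        try {
            send.get(waitMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            timedOut = true; // the scheduler would bump retryCount and resend here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        }

        // get(timeout) does not cancel the future, so the first send still finishes.
        return timedOut && "ack".equals(send.join());
    }

    public static void main(String[] args) {
        // Scaled down: 300 ms plays the slow ACK, 50 ms plays the 5 s wait.
        System.out.println(callerGivesUpButSendSucceeds(300, 50));
        // With a wait longer than the ACK delay the race disappears.
        System.out.println(callerGivesUpButSendSucceeds(10, 1000));
    }
}
```

The point of the sketch is that a timed-out get() only stops the waiting; it says nothing about whether the underlying send eventually succeeds.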
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/kafka/src/main/resources/kafka.yml` around lines 17 - 19, Outbox
retry timeout is shorter than the Kafka producer timeout causing duplicate
publishes; update the OutboxRelayScheduler call that uses .get(5,
TimeUnit.SECONDS) to use a timeout >= the producer's
request.timeout.ms/delivery.timeout.ms (recommend >=25s) so it won't retry while
the producer is still waiting, and harden the outbox state-transition logic in
the OutboxRelay handling to only mark a message SENT on a confirmed single
successful publish (idempotency guard) to prevent duplicate SENT/FAILED flips;
adjust kafka.yml request.timeout.ms/delivery.timeout.ms to match the new
scheduler timeout and add an integration test that delays broker responses 6–10s
to assert only one publish and final state SENT.
apps/commerce-streamer/src/main/resources/application.yml-18-21 (1)
18-21: ⚠️ Potential issue | 🟠 Major

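Remove the hardcoded profiles.active: local and split the Kafka consumer settings per environment.

There are two problems.

Hardcoded profiles.active: local (lines 22-23): the profile is always set to local regardless of the deployment environment, so local settings are forced even in dev/qa/prd. Environment-specific configuration is ignored entirely.

Kafka consumer settings exist only in the base config (lines 20-21): group-id: commerce-streamer and auto-offset-reset: earliest apply to every environment, so when a broker is shared, deploying a new group-id can trigger a full topic replay, duplicate aggregation, and coupon reprocessing load.

Suggested fix:

Remove profiles.active: local, or drive it from an environment variable (spring.profiles.active)

Move auto-offset-reset: earliest into the local/test profile sections only

Split group-id per environment through an environment variable or profile-specific values (for dev/qa/prd: commerce-streamer-dev, commerce-streamer-qa, commerce-streamer-prd, and so on)

Add Kafka consumer settings to the per-profile sections (currently lines 36-58) so that each environment states its group-id and offset policy explicitly

Additional test:
Use a per-profile configuration binding test to verify that only local/test use earliest, and that dev/qa/prd have environment-specific group-ids and either latest or an explicit offset policy.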
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/commerce-streamer/src/main/resources/application.yml` around lines 18 -
21, Remove the hardcoded profiles.active: local and make spring.profiles.active
driven by environment variables; move kafka.consumer.group-id and
kafka.consumer.auto-offset-reset out of the base section and into each profile
block (e.g., local/test/dev/qa/prd) so local/test use auto-offset-reset:
earliest while dev/qa/prd use explicit group-ids like
commerce-streamer-dev/qa/prd and a safer offset policy (e.g., latest); update
the profile sections referenced in the file to include kafka.consumer entries
per environment and ensure group-id can also be templated via an environment
variable for overrides; finally add a simple binding/test to assert that
local/test resolve auto-offset-reset=earliest and that dev/qa/prd resolve
distinct group-ids and the intended offset policy.
apps/commerce-streamer/src/test/java/com/loopers/application/metrics/MetricsServiceIntegrationTest.java-39-83 (1)
39-83: ⚠️ Potential issue | 🟠 Major

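Integration tests without assertions cannot catch regressions

The tests currently only call the service and assert nothing, so CI can pass even when the aggregation logic is broken. Metric distortion in production would be hard to detect in advance. Each case should query the stored metric and explicitly verify like/view/sales_count and sales_amount. Boundary and failure cases such as delta=0, accumulated negative values, and invalid input should also be added.

As per coding guidelines **/*Test*.java: for unit tests, check that boundary values, failure cases, and exception flows are covered. For integration tests, check isolation level, flakiness potential, and the test data setup/cleanup strategy.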
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-streamer/src/test/java/com/loopers/application/metrics/MetricsServiceIntegrationTest.java`
around lines 39 - 83, Tests in MetricsServiceIntegrationTest currently call
metricsService.increment* methods but don't assert results; update each test
(e.g., the methods incrementLikeCount하면_like_count가_증가한다,
음수_delta면_like_count가_감소한다, 존재하지_않는_productId면_UPSERT로_레코드가_생성된다,
incrementViewCount하면_view_count가_증가한다,
incrementSales하면_sales_count와_sales_amount가_증가한다) to retrieve the persisted
metrics (via the same MetricsService or repository method) and assert expected
values for like_count, view_count, sales_count and sales_amount after each
operation; also add explicit tests for delta=0, cumulative negative deltas, and
invalid inputs (e.g., null productId or negative amounts) and ensure test
setup/cleanup (BeforeEach/AfterEach or test transaction rollback) so tests are
isolated and deterministic.
apps/commerce-api/src/main/java/com/loopers/application/event/PaymentCanceledEvent.java-4-5 (1)
4-5: ⚠️ Potential issue | 🟠 Major

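The payment-cancel event does not validate its required identifiers, which can lead to failures in the compensation flow

If the event can be created with a null paymentId/orderId/userId, downstream processing can fail repeatedly, delaying compensation and generating operational alert noise. The canonical constructor should enforce validation of the required fields. Also add a null-exception test for each field and a test for successful event creation.
Suggested fix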
 package com.loopers.application.event;
 
+import java.util.Objects;
+
 public record PaymentCanceledEvent(Long paymentId, Long orderId, Long userId) {
+    public PaymentCanceledEvent {
+        Objects.requireNonNull(paymentId, "paymentId must not be null");
+        Objects.requireNonNull(orderId, "orderId must not be null");
+        Objects.requireNonNull(userId, "userId must not be null");
+    }
 }
As per coding guidelines **/*.java: check null handling, defensive copying, and the soundness of the immutability implementation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/application/event/PaymentCanceledEvent.java`
around lines 4 - 5, The PaymentCanceledEvent record currently allows null
paymentId/orderId/userId which can break compensation flows; add a canonical
constructor in PaymentCanceledEvent that validates paymentId, orderId, and
userId are non-null (throwing NullPointerException or IllegalArgumentException
with clear messages) to enforce invariants, and add unit tests that assert each
field null causes an exception and that a fully populated event constructs
successfully; reference the record PaymentCanceledEvent and its canonical
constructor for the change and add tests covering null paymentId, null orderId,
null userId, and a valid instantiation.
apps/commerce-api/src/main/java/com/loopers/application/event/ProductViewedEvent.java-3-4 (1)
3-4: ⚠️ Potential issue | 🟠 Major

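Allowing null event identifiers can cause a backlog of malformed messages

Without validation at creation time, a null value that propagates to outbox/Kafka can cause repeated failures at the consumer, leading to a retry backlog or DLQ growth. The record's canonical constructor should fail fast with null checks (and positive-value checks if needed). Also add a null-input exception test for each field together with a successful-creation test.
Suggested fix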
 package com.loopers.application.event;
 
+import java.util.Objects;
+
 public record ProductViewedEvent(Long userId, Long productId) {
+    public ProductViewedEvent {
+        Objects.requireNonNull(userId, "userId must not be null");
+        Objects.requireNonNull(productId, "productId must not be null");
+    }
 }
As per coding guidelines **/*.java: check null handling, defensive copying, and the soundness of the immutability implementation.
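One thing worth checking before applying this suggestion: the ProductFacade example earlier in this review publishes `new ProductViewedEvent(null, productId)`. A minimal sketch, assuming that call site stays as shown, suggests it would start throwing once the non-null check on userId is in place. The record below copies the suggested fix (package-private here so the snippet is self-contained, Java 16+); `canConstruct` is a hypothetical helper.

```java
import java.util.Objects;

record ProductViewedEvent(Long userId, Long productId) {
    // Compact canonical constructor, as in the suggested fix.
    ProductViewedEvent {
        Objects.requireNonNull(userId, "userId must not be null");
        Objects.requireNonNull(productId, "productId must not be null");
    }
}

class ProductViewedEventCheck {
    // Hypothetical helper: reports whether construction succeeds for the given arguments.
    static boolean canConstruct(Long userId, Long productId) {
        try {
            new ProductViewedEvent(userId, productId);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("with userId:    " + canConstruct(7L, 42L));
        // Same shape as the ProductFacade call: userId is null.
        System.out.println("without userId: " + canConstruct(null, 42L));
    }
}
```

If anonymous views are intended, validating only productId (or modelling userId as optional) may be the safer variant; that is a design decision for the author rather than something this sketch can settle.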
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/application/event/ProductViewedEvent.java`
around lines 3 - 4, The record constructor of ProductViewedEvent does not
validate userId/productId, so null (or negative) values can propagate to
outbox/Kafka; in the canonical constructor of ProductViewedEvent, check that
userId and productId are non-null (and > 0 if needed) and throw an appropriate
exception (e.g., NullPointerException or IllegalArgumentException); the target
is record ProductViewedEvent(Long userId, Long productId), and after adding the
validation add unit tests for userId_null, productId_null, and a successful
construction case to verify both the failure and success behavior.
apps/commerce-api/src/main/java/com/loopers/application/event/ProductLikedEvent.java-4-5 (1)
4-5: ⚠️ Potential issue | 🟠 Major

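Missing payload integrity validation creates a risk of repeated consumer failures and retries

In the current structure an event with null identifiers can be created, so consumer failures can repeat in production and the load on the compensation path grows. Add null checks for the required fields in the canonical constructor to block this at creation time. Also add a null-input exception test and a normal-case test.
Suggested fix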
 package com.loopers.application.event;
 
+import java.util.Objects;
+
 public record ProductLikedEvent(Long userId, Long productId) {
+    public ProductLikedEvent {
+        Objects.requireNonNull(userId, "userId must not be null");
+        Objects.requireNonNull(productId, "productId must not be null");
+    }
 }
As per coding guidelines **/*.java: check null handling, defensive copying, and the soundness of the immutability implementation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/application/event/ProductLikedEvent.java`
around lines 4 - 5, ProductLikedEvent currently allows null userId/productId
which risks consumer failures; add a canonical constructor for the record
ProductLikedEvent(Long userId, Long productId) that validates both userId and
productId are non-null and throws a clear NullPointerException with context when
invalid, ensuring immutability is preserved; update unit tests to include a
null-input test for each field (expecting NPE) and a normal-case test asserting
successful construction and getters, and ensure any future mutable fields would
use defensive copies (noting Long is immutable).
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventFactory.java-18-24 (1)
18-24: ⚠️ Potential issue | 🟠 Major

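Input validation needs strengthening and exception types need unifying.

If eventPayload is null or not serializable, the current implementation can store the string "null" in the outbox and produce a poison message. It also uses IllegalArgumentException, which bypasses commerce-api's consistent error-handling path (CoreException). Validate eventType, aggregateType, aggregateId, topic, and eventPayload first, and wrap serialization failures as CoreException(ErrorType.INTERNAL_ERROR, "이벤트 직렬화 실패", cause) (message: "event serialization failed") with the cause preserved. Also verify with tests that a null payload is rejected, that a serialization failure raises CoreException, and that upper layers return a consistent error response.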
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventFactory.java`
around lines 18 - 24, Validate all inputs at the start of
OutboxEventFactory.create: ensure eventType, aggregateType, aggregateId, topic,
and eventPayload are non-null/non-empty (reject null payload explicitly rather
than allowing "null" string) and throw a CoreException with appropriate
ErrorType for invalid input; when calling
objectMapper.writeValueAsString(eventPayload) catch exceptions and rethrow as
new CoreException(ErrorType.INTERNAL_ERROR, "이벤트 직렬화 실패", cause) preserving the
original cause; update callers/tests to expect CoreException on
null/serialization failure and confirm the higher-level error handling path
returns the consistent response.
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/LikeCountReconciliationScheduler.java-21-25 (1)
21-25: ⚠️ Potential issue | 🟠 Major

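Cache invalidation and the success log should move to after the transaction commit.

The reconcile() method is marked @Transactional, but it runs evictAllLists() and the success log after the DB update and before the commit. This causes the following problems:

If another request arrives in the window between cache invalidation and commit, it reads likeCount from the DB before the update is visible and reloads that value into the cache.

Even if the DB transaction is rolled back, the cache eviction has already happened and the "sync complete" log is still written.

The cache and the DB end up inconsistent.

Use the existing TransactionHelper.afterCommit() to move cache invalidation and the success log to the post-commit phase: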
`@Scheduled`(cron = "0 0 2 * * *")
`@Transactional`
public void reconcile() {
    int updated = productRepository.reconcileLikeCountFromLikes();
    if (updated > 0) {
        TransactionHelper.afterCommit(() -> {
            productCacheManager.evictAllLists();
            log.info("likeCount reconciliation 완료: {}건 동기화", updated);
        });
    }
}
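In addition, add integration tests that verify the following:

When a cache reload happens during the DB update, the cache is correctly invalidated after commit

On transaction rollback, the cache is left intact and the success log is not written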
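The intended ordering can be illustrated with a small in-memory model. This is a sketch only: `Tx` is a hypothetical stand-in for the transaction plus TransactionHelper.afterCommit(), whose real implementation is not shown in this review, and the effect names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class AfterCommitSketch {

    /** Hypothetical stand-in for a transaction that supports after-commit hooks. */
    static final class Tx {
        private final List<Runnable> hooks = new ArrayList<>();

        void afterCommit(Runnable hook) { hooks.add(hook); }

        // Hooks run only once the commit has happened.
        void commit() { hooks.forEach(Runnable::run); hooks.clear(); }

        // On rollback the hooks are dropped without running.
        void rollback() { hooks.clear(); }
    }

    /** Returns the side effects observed for one reconcile run. */
    static List<String> reconcile(int updated, boolean commit) {
        List<String> effects = new ArrayList<>();
        Tx tx = new Tx();
        if (updated > 0) {
            tx.afterCommit(() -> {
                effects.add("evictAllLists");
                effects.add("log");
            });
        }
        if (commit) {
            tx.commit();
        } else {
            tx.rollback();
        }
        return effects;
    }

    public static void main(String[] args) {
        System.out.println("commit:   " + reconcile(3, true));
        System.out.println("rollback: " + reconcile(3, false));
    }
}
```

In this model, eviction and logging are observable only on the commit path, which is the behavior the two integration tests above are meant to pin down.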
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/LikeCountReconciliationScheduler.java`
around lines 21 - 25, The reconcile() method currently evicts caches and logs
success before the surrounding `@Transactional` commit, causing potential cache/DB
inconsistency; change it so that after calling
productRepository.reconcileLikeCountFromLikes() you only register a
TransactionHelper.afterCommit(() -> { productCacheManager.evictAllLists();
log.info(..., updated); }) when updated > 0, leaving the DB update inside the
transaction and deferring cache invalidation and success logging until after
commit; also add integration tests that simulate (1) concurrent cache reload
during DB update to assert cache is correctly evicted after commit, and (2) a
forced rollback to assert cache remains unchanged and the success log is not
emitted.
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxCleanupScheduler.java-23-29 (1)
23-29: ⚠️ Potential issue | 🟠 Major

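Risk of transaction timeout and lock contention on bulk deletes

deleteSentBefore() deletes every SENT record older than 7 days in a single transaction. With many records, locks on the table are held for a long time and can block other outbox operations (INSERT, UPDATE).

Operational view: if the outbox table grows sharply as traffic increases, the 3 AM cleanup job can run for tens of minutes, and event publication is delayed for that duration.
♻️ Suggested batch-delete approach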
     `@Scheduled`(cron = "0 0 3 * * *")
     `@Transactional`
     public void cleanup() {
         ZonedDateTime before = ZonedDateTime.now().minusDays(RETENTION_DAYS);
-        outboxEventRepository.deleteSentBefore(before);
-        log.info("Outbox cleanup 완료: {}일 이전 SENT 레코드 삭제", RETENTION_DAYS);
+        int totalDeleted = 0;
+        int deleted;
+        do {
+            deleted = outboxEventRepository.deleteSentBeforeBatch(before, 1000);
+            totalDeleted += deleted;
+        } while (deleted > 0);
+        log.info("Outbox cleanup 완료: {}일 이전 SENT 레코드 {}건 삭제", RETENTION_DAYS, totalDeleted);
     }
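The loop's behavior can be sketched against an in-memory table. Everything here is a stand-in: `deleteSentBeforeBatch` does not exist in the repository yet (it is the method this suggestion proposes), and each call is taken to model one short transaction. Note that the loop shortens lock hold time only if each batch commits separately.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class BatchCleanupSketch {
    /** In-memory stand-in for the outbox table: one sentAt timestamp per SENT row. */
    private final List<Instant> sentRows = new ArrayList<>();
    /** Number of batch calls issued, including the final empty one. */
    int batches = 0;

    void add(Instant sentAt) { sentRows.add(sentAt); }

    int size() { return sentRows.size(); }

    /** Deletes at most limit rows sent before the cutoff; models one short transaction. */
    int deleteSentBeforeBatch(Instant before, int limit) {
        int deleted = 0;
        Iterator<Instant> it = sentRows.iterator();
        while (it.hasNext() && deleted < limit) {
            if (it.next().isBefore(before)) {
                it.remove();
                deleted++;
            }
        }
        batches++;
        return deleted;
    }

    /** Same loop shape as the suggested cleanup(): repeat until a batch deletes nothing. */
    int cleanup(Instant before, int batchSize) {
        int total = 0;
        int deleted;
        do {
            deleted = deleteSentBeforeBatch(before, batchSize);
            total += deleted;
        } while (deleted > 0);
        return total;
    }

    public static void main(String[] args) {
        BatchCleanupSketch table = new BatchCleanupSketch();
        Instant cutoff = Instant.now();
        for (int i = 0; i < 2500; i++) table.add(cutoff.minusSeconds(3600)); // expired rows
        for (int i = 0; i < 10; i++) table.add(cutoff.plusSeconds(3600));    // rows to keep
        System.out.println("deleted=" + table.cleanup(cutoff, 1000)
                + ", remaining=" + table.size() + ", batches=" + table.batches);
    }
}
```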
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxCleanupScheduler.java`
around lines 23 - 29, The cleanup() job in OutboxCleanupScheduler currently
calls outboxEventRepository.deleteSentBefore(before) in one transaction which
can hold locks and time out for large datasets; change cleanup() to delete in
small batches: query IDs (or entities) older than 'before' using a paged finder
(e.g., findIdsSentBefore/ZonedDateTime) with a fixed batchSize, then in a loop
delete each batch and let each batch run in its own transaction (remove or avoid
the class-level `@Transactional` on cleanup and use a batch-scoped `@Transactional`
or TransactionTemplate/REQUIRES_NEW for each delete) so locks are short-lived
and you commit between batches; reference methods/classes:
OutboxCleanupScheduler.cleanup(), outboxEventRepository.deleteSentBefore(...)
and add/find a paged finder like outboxEventRepository.findIdsSentBefore(...)
and a batch delete like outboxEventRepository.deleteAllByIdInBatch(...).
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxRelayScheduler.java-39-55 (1)
39-55: ⚠️ Potential issue | 🟠 Major

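The compensating scheduler does not restrict itself to stale events, so it competes with the main publish path.

findPending(BATCH_SIZE) picks up every PENDING row immediately, so it can republish the newest rows whose afterCommit asynchronous send is still in flight, as well as rows another instance has already picked up. In production this increases duplicate publishes, and even a slightly delayed DB marking makes the compensation path resend the same event. It is safer to add a grace period to the query and fetch only stale rows through an atomic claim (PROCESSING, SKIP LOCKED, a lease, and so on). Also verify with integration tests the scenarios "the scheduler skips rows whose immediate publish is in flight" and "send happens exactly once even when two schedulers run concurrently".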
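A minimal in-memory sketch of "grace period plus atomic claim" follows. It is not the repository implementation: the `Row` type, the `createdAt` field, and the `PROCESSING` status are illustrative assumptions, and the compare-and-set stands in for an atomic `UPDATE ... WHERE status = 'PENDING'` (or `SELECT ... FOR UPDATE SKIP LOCKED`) in the database.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class StaleClaimSketch {
    enum Status { PENDING, PROCESSING, SENT }

    /** Hypothetical outbox row; only the fields needed for claiming are modelled. */
    static final class Row {
        final String eventId;
        final Instant createdAt;
        final AtomicReference<Status> status = new AtomicReference<>(Status.PENDING);

        Row(String eventId, Instant createdAt) {
            this.eventId = eventId;
            this.createdAt = createdAt;
        }
    }

    /** Claims up to limit PENDING rows that are older than the grace period. */
    static List<Row> claimStale(List<Row> rows, Instant now, Duration grace, int limit) {
        Instant cutoff = now.minus(grace);
        List<Row> claimed = new ArrayList<>();
        for (Row row : rows) {
            if (claimed.size() >= limit) {
                break;
            }
            // Fresh rows are left to the afterCommit path; the CAS lets only one claimer win.
            if (row.createdAt.isBefore(cutoff)
                    && row.status.compareAndSet(Status.PENDING, Status.PROCESSING)) {
                claimed.add(row);
            }
        }
        return claimed;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<Row> rows = List.of(
                new Row("old", now.minusSeconds(60)),
                new Row("fresh", now.minusSeconds(1)));
        System.out.println("first run claims:  " + claimStale(rows, now, Duration.ofSeconds(10), 100).size());
        System.out.println("second run claims: " + claimStale(rows, now, Duration.ofSeconds(10), 100).size());
    }
}
```

A real implementation would also need a way to release or expire a PROCESSING claim if the claimer dies, which this sketch leaves out.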
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxRelayScheduler.java`
around lines 39 - 55, compensatePendingEvents currently calls
outboxEventRepository.findPending(BATCH_SIZE) and can pick up in-flight or
concurrently-claimed rows; change the retrieval to only select stale, claimable
rows by adding a grace period and an atomic claim step (e.g., update status to
PROCESSING with a lease/processed_at and use SELECT ... FOR UPDATE SKIP LOCKED
or a WHERE status = 'PENDING' AND updated_at < now() - grace AND claimed_by IS
NULL RETURNING id to atomically mark), then operate only on claimed rows and
persist final status (SENT or FAILED) from the compensatePendingEvents loop;
update OutboxEventRepository methods (replace findPending) to implement the
claim-and-return behavior and add tests that simulate concurrent schedulers and
in-flight async sends to assert single-send semantics.
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxRelayScheduler.java-49-65 (1)
49-65: ⚠️ Potential issue | 🟠 Major

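Swallowing InterruptedException as an ordinary failure loses the shutdown signal.

If catch (Exception) also handles the InterruptedException from Future.get(), the scheduler thread's interrupt status is lost during deployment or shutdown, and a real shutdown request is misclassified as a Kafka send failure. Catch InterruptedException separately, restore the flag with Thread.currentThread().interrupt(), and break out of the loop; only the remaining exceptions should count toward retries. Also add a test verifying that when send().get() throws InterruptedException, the interrupt flag is restored and no further events are processed.
Suggested fix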
-            } catch (Exception e) {
+            } catch (InterruptedException e) {
+                Thread.currentThread().interrupt();
+                log.warn("Outbox relay interrupted");
+                break;
+            } catch (Exception e) {
                 event.incrementRetryCount();
                 if (event.getRetryCount() >= MAX_RETRY_COUNT) {
                     event.markFailed();
                     log.error("Outbox FAILED: eventId={}, retryCount={}", event.getEventId(), event.getRetryCount(), e);
                 }
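The same catch ordering can be exercised in isolation. This is a sketch with hypothetical names: each `Callable` stands in for one `kafkaTemplate.send(...).get(...)` call, and retry bookkeeping is reduced to a comment.

```java
import java.util.List;
import java.util.concurrent.Callable;

class InterruptAwareRelay {

    /** Runs each send in order and returns how many were attempted before stopping. */
    static int relay(List<Callable<Void>> sends) {
        int attempted = 0;
        for (Callable<Void> send : sends) {
            attempted++;
            try {
                send.call();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag for the caller
                break;                              // stop processing further events
            } catch (Exception e) {
                // Any other failure would go through the retry-count path.
            }
        }
        return attempted;
    }

    // Stand-in for a send that succeeds.
    static Callable<Void> ok() {
        return () -> null;
    }

    // Stand-in for a send interrupted by shutdown.
    static Callable<Void> interrupted() {
        return () -> { throw new InterruptedException("shutdown"); };
    }

    public static void main(String[] args) {
        int attempted = relay(List.of(ok(), interrupted(), ok()));
        // Thread.interrupted() reads and clears the flag that relay() restored.
        System.out.println("attempted=" + attempted + ", flagRestored=" + Thread.interrupted());
    }
}
```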
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxRelayScheduler.java`
around lines 49 - 65, The loop in OutboxRelayScheduler that calls
kafkaTemplate.send(...).get(...) currently catches all Exceptions and swallows
InterruptedException; change the try/catch so InterruptedException is caught
separately, call Thread.currentThread().interrupt(), break out of the
pendingEvents processing loop, and do not treat it as a retryable failure; keep
the existing catch(Exception) to increment retryCount, mark failed when
retryCount >= MAX_RETRY_COUNT, save via outboxEventRepository.save(event), and
increment failCount for other errors; also add a test that simulates
send().get() throwing InterruptedException and asserts the interrupt flag is
restored and no further events are processed.
apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/LikeCountReconciliationSchedulerIntegrationTest.java-64-78 (1)
64-78: ⚠️ Potential issue | 🟠 Major

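The test races with the asynchronous handler, so this integration test can become flaky

Operationally, flaky tests erode trust in CI and mask real regressions. The fix is to prepare the likes table synchronously instead of calling likeService.like(...), or to control the async executor in the test so that the state at reconcile() time is deterministic. As an additional test, include a stability check that running the same scenario repeatedly always yields the same result.
As per coding guidelines: for integration tests, check isolation level, flakiness potential, and the test data setup/cleanup strategy.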
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/LikeCountReconciliationSchedulerIntegrationTest.java`
around lines 64 - 78, The test is flaky due to async like handling racing with
scheduler.reconcile; instead of calling likeService.like(...) use a synchronous
preparation (insert rows directly into the likes table via LikeRepository or a
test utility) or stub/control the async executor used by LikeEventHandler so the
async increments won't run before you assert; ensure you explicitly set product
like count (productService.incrementLikeCount(...) or
productRepository.save(...)) to create the mismatch deterministically, then call
scheduler.reconcile() and assert, and add a small repeated-loop variant of the
test to verify stability across multiple runs.
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventService.java-37-38 (1)
37-38: ⚠️ Potential issue | 🟠 Major

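The transaction boundary declared in the infrastructure layer should move to the application layer

Operationally, fixing a propagation policy (MANDATORY) on an infrastructure service conflicts with the rollback/retry policies of the individual calling use cases and makes incident analysis harder. The fix is to remove @Transactional from this method and declare the transaction boundary at the calling ApplicationService use case. As additional tests, move the check that a call without a transaction fails into the application-layer tests, and strengthen the integration test that verifies Outbox INSERT + afterCommit publishing still works on the happy path.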
Based on learnings: Do not annotate infrastructure layer implementations with Transactional (even readOnly = true). Transaction management should be handled by the Application layer (ApplicationService).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventService.java`
around lines 37 - 38, Remove the infrastructure-level transaction boundary on
OutboxEventService by deleting the `@Transactional`(propagation =
Propagation.MANDATORY) annotation from the saveAndPublish method in class
OutboxEventService; ensure callers (ApplicationService use-cases) declare
transactional boundaries instead, move any tests that assert behavior when no
transaction is present from the infrastructure tests into the corresponding
application-layer unit/integration tests, and add/strengthen an integration test
at the application layer that verifies a successful path results in an Outbox
INSERT and an afterCommit publish/dispatch.
apps/commerce-api/src/main/java/com/loopers/application/event/LikeCountEventHandler.java-53-59 (1)
53-59: ⚠️ Potential issue | 🟠 Major

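The current code does not detect failures of asynchronous Kafka sends

kafkaTemplate.send() returns a CompletableFuture, but the current code never handles that future's completion. As a result, asynchronous send errors such as a failed broker ACK, a timeout, or a partition error are not recorded in logs or metrics, so message loss goes undetected.

Attach a whenComplete() callback to the kafkaTemplate.send() call on line 56 to handle asynchronous failures explicitly:
Suggested fix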
 private void publishToKafka(String eventType, Long productId, Object event) {
     try {
         String payload = objectMapper.writeValueAsString(event);
-        kafkaTemplate.send(KafkaTopics.CATALOG_EVENTS, String.valueOf(productId), payload);
+        kafkaTemplate.send(KafkaTopics.CATALOG_EVENTS, String.valueOf(productId), payload)
+                .whenComplete((result, ex) -> {
+                    if (ex != null) {
+                        log.warn("Kafka 직접 발행 실패 (fire-and-forget): eventType={}, productId={}",
+                                eventType, productId, ex);
+                    }
+                });
     } catch (Exception e) {
         log.warn("Kafka 직접 발행 실패 (fire-and-forget): eventType={}, productId={}", eventType, productId, e);
     }
 }
In tests, simulate a Kafka broker outage and verify that the warning log is written; add metric collection or retry logic if needed.
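Why the surrounding try/catch is not enough can be shown with plain CompletableFuture. This is a sketch under the assumption that the send result behaves like a standard CompletableFuture; an already-failed future stands in for a broker error, and the names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

class AsyncSendFailureSketch {

    /** Attaches a completion callback and returns the failure it observed, or null on success. */
    static Throwable observeFailure(CompletableFuture<String> sendResult) {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        sendResult
                .whenComplete((result, ex) -> {
                    if (ex != null) {
                        seen.set(ex); // a real handler would log or count the failure here
                    }
                })
                .exceptionally(ex -> null) // keep join() from rethrowing in this demo
                .join();
        return seen.get();
    }

    public static void main(String[] args) {
        CompletableFuture<String> failed =
                CompletableFuture.failedFuture(new IllegalStateException("broker down"));
        CompletableFuture<String> succeeded = CompletableFuture.completedFuture("ack");

        // Creating the failed future threw nothing, so a try/catch around the call sees no error.
        System.out.println("failed send observed:    " + observeFailure(failed));
        System.out.println("succeeded send observed: " + observeFailure(succeeded));
    }
}
```

The failure only becomes visible inside the callback, which is where logging or a metric increment would have to live in a fire-and-forget path.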
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/application/event/LikeCountEventHandler.java`
around lines 53 - 59, The publishToKafka method currently calls
kafkaTemplate.send(...) but ignores the returned CompletableFuture; attach a
whenComplete callback to the Future returned by
kafkaTemplate.send(KafkaTopics.CATALOG_EVENTS, String.valueOf(productId),
payload) inside publishToKafka to explicitly handle async failures and
successes, logging failures with full exception details (use log.warn or
log.error with eventType, productId and the throwable) and consider recording a
metric or triggering retry logic from that callback; ensure you keep the
existing try/catch for synchronous serialization errors
(objectMapper.writeValueAsString) while moving the async failure handling into
the whenComplete handler for the send result.
apps/commerce-api/src/test/java/com/loopers/application/event/LikeEventHandlerIntegrationTest.java-52-68 (1)
52-68: ⚠️ Potential issue | 🟠 Major

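The test does not actually verify the handler-failure scenario.

The test name is "LikeCountEventHandler가_예외를_던져도" ("even if LikeCountEventHandler throws an exception"), but the handler never actually throws. Because the @Async handler runs on a separate thread, the test currently verifies only the normal case. To verify failure isolation properly, mock the handler so that it throws, or inject a failing handler through a test profile.

Improvements

Inject LikeCountEventHandler with @MockBean and configure it to throw

Or provide a failing handler implementation in a test profile

Wait for the asynchronous work to finish before asserting state

The test name and what the test verifies do not match, so at minimum rename the test to "정상_좋아요_저장_테스트" ("normal like-save test") or implement the real failure scenario.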
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/test/java/com/loopers/application/event/LikeEventHandlerIntegrationTest.java`
around lines 52 - 68, The test names in LikeEventHandlerIntegrationTest claim to
verify failure isolation for LikeCountEventHandler but never trigger a handler
exception; update the test to actually simulate a handler failure by injecting a
failing handler (e.g., annotate LikeEventHandlerIntegrationTest with `@MockBean`
LikeCountEventHandler and configure the mock to throw an exception when handling
the event) or provide a test-profile failing implementation, invoke
likeFacade.like(...) as now, and then wait for asynchronous handlers to complete
before asserting likes exist via likeRepository.existsByUserIdAndProductId(...);
alternatively, if you prefer not to change behavior, rename the tests to reflect
they only verify normal persistence.
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CouponIssueConsumer.java-44-56 (1)
44-56: ⚠️ Potential issue | 🟠 Major

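There is no defensive code for missing JSON fields or wrong types.

asLong() and asText() return default values (0L, "") when a field is missing or its type does not match. If couponId or userId is 0L, coupon issuance can proceed with invalid data. Add required-field validation and, on failure, throw a clear exception so the record moves to the DLQ.
Suggested fix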
     private void processRecord(ConsumerRecord<String, byte[]> record) throws Exception {
         JsonNode node = objectMapper.readTree(record.value());
-        String eventId = node.path("eventId").asText();
-        String eventType = node.path("eventType").asText();
+        String eventId = getRequiredText(node, "eventId");
+        String eventType = getRequiredText(node, "eventType");
         JsonNode payload = objectMapper.readTree(node.path("payload").asText());

-        Long couponId = payload.path("couponId").asLong();
-        Long userId = payload.path("userId").asLong();
+        Long couponId = getRequiredLong(payload, "couponId");
+        Long userId = getRequiredLong(payload, "userId");
         String idempotencyKey = GROUP_ID + ":" + eventId;

         idempotentProcessor.process(idempotencyKey, eventType, TOPIC, GROUP_ID,
                 () -> couponIssueProcessor.process(eventId, couponId, userId));
     }
+
+    private String getRequiredText(JsonNode node, String field) {
+        JsonNode value = node.path(field);
+        if (value.isMissingNode() || value.isNull() || value.asText().isBlank()) {
+            throw new IllegalArgumentException("필수 필드 누락: " + field);
+        }
+        return value.asText();
+    }
+
+    private Long getRequiredLong(JsonNode node, String field) {
+        JsonNode value = node.path(field);
+        if (value.isMissingNode() || value.isNull() || !value.isNumber()) {
+            throw new IllegalArgumentException("필수 필드 누락 또는 잘못된 타입: " + field);
+        }
+        return value.asLong();
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CouponIssueConsumer.java`
around lines 44 - 56, In processRecord (CouponIssueConsumer.processRecord) add
defensive validation after parsing: verify that eventId and eventType are
non-empty and that payload contains positive couponId and userId (not 0L), and
if any check fails throw a clear exception (e.g., IllegalArgumentException with
a descriptive message) so the record goes to DLQ; ensure you still use
objectMapper for parsing but replace direct asLong()/asText() usage with
presence/type checks on JsonNode (for example checking
hasNonNull()/isNumber()/isTextual()) before calling idempotentProcessor.process
and couponIssueProcessor.process.
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java-53-62 (1)
53-62: ⚠️ Potential issue | 🟠 Major

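The state-transition methods do not validate the current state.

complete() and reject() do not check the current state, so a request that is already COMPLETED can be rejected again, or the reverse. This causes data-integrity problems.
Suggested fix: add state validation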
     public void complete() {
+        if (this.status != CouponIssueStatus.PENDING) {
+            throw new IllegalStateException("PENDING 상태에서만 완료 처리가 가능하다. 현재 상태: " + this.status);
+        }
         this.status = CouponIssueStatus.COMPLETED;
         this.processedAt = ZonedDateTime.now();
     }

     public void reject(String reason) {
+        if (this.status != CouponIssueStatus.PENDING) {
+            throw new IllegalStateException("PENDING 상태에서만 거절 처리가 가능하다. 현재 상태: " + this.status);
+        }
         this.status = CouponIssueStatus.REJECTED;
         this.rejectReason = reason;
         this.processedAt = ZonedDateTime.now();
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java`
around lines 53 - 62, Add state validation in CouponIssueRequest: in the
complete() and reject(String) methods check the current this.status before
changing it (using the CouponIssueStatus enum) and throw an
IllegalStateException (or a domain-specific exception) if the transition is
invalid (e.g., cannot complete if already REJECTED or COMPLETED, cannot reject
if already COMPLETED or REJECTED). Update complete() and reject() to only set
this.status, this.rejectReason, and this.processedAt when the current state
allows the transition; reference the methods complete(), reject(String), field
processedAt, field rejectReason, and enum CouponIssueStatus to implement the
checks.
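
The guard described above can be sketched as follows. This is a minimal illustration, not the PR's entity: `CouponIssueRequestSketch`, `IssueStatus`, and `requirePending` are hypothetical names, JPA annotations are omitted, and it assumes PENDING is the only state from which a transition is legal.

```java
import java.time.ZonedDateTime;

// Hypothetical stand-in for the CouponIssueStatus enum.
enum IssueStatus { PENDING, COMPLETED, REJECTED }

// Hypothetical, simplified stand-in for the CouponIssueRequest entity (no JPA mapping).
final class CouponIssueRequestSketch {
    private IssueStatus status = IssueStatus.PENDING;
    private String rejectReason;
    private ZonedDateTime processedAt;

    void complete() {
        requirePending("complete");
        this.status = IssueStatus.COMPLETED;
        this.processedAt = ZonedDateTime.now();
    }

    void reject(String reason) {
        requirePending("reject");
        this.status = IssueStatus.REJECTED;
        this.rejectReason = reason;
        this.processedAt = ZonedDateTime.now();
    }

    // Single guard shared by both transitions: only PENDING may move to a terminal state.
    private void requirePending(String action) {
        if (this.status != IssueStatus.PENDING) {
            throw new IllegalStateException(
                    "Cannot " + action + " a request in state " + this.status);
        }
    }

    IssueStatus status() { return status; }
    String rejectReason() { return rejectReason; }
    ZonedDateTime processedAt() { return processedAt; }
}
```

Because the guard runs before any field is assigned, a rejected transition leaves status, rejectReason, and processedAt untouched.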
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CatalogEventConsumer.java-44-61 (1)
44-61: ⚠️ Potential issue | 🟠 Major

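As in CouponIssueConsumer, JSON field validation is missing.

When a field is absent, asLong() returns 0L, so metrics can be updated against a wrong productId. In addition, the processRecord() logic is nearly identical to the one in CouponIssueConsumer, so extracting it into a shared utility class to remove the duplication is recommended.
Suggested fix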
     private void processRecord(ConsumerRecord<String, byte[]> record) throws Exception {
         JsonNode node = objectMapper.readTree(record.value());
         String eventId = node.path("eventId").asText();
         String eventType = node.path("eventType").asText();
         JsonNode payload = objectMapper.readTree(node.path("payload").asText());

+        if (eventId.isBlank()) {
+            throw new IllegalArgumentException("eventId is missing");
+        }
+
         String idempotencyKey = GROUP_ID + ":" + eventId;

         switch (eventType) {
-            case "product.liked" -> idempotentProcessor.process(idempotencyKey, eventType, TOPIC, GROUP_ID,
-                    () -> metricsService.incrementLikeCount(payload.path("productId").asLong(), 1));
+            case "product.liked" -> {
+                Long productId = getRequiredLong(payload, "productId");
+                idempotentProcessor.process(idempotencyKey, eventType, TOPIC, GROUP_ID,
+                        () -> metricsService.incrementLikeCount(productId, 1));
+            }
             // ... apply the same change to the remaining cases
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CatalogEventConsumer.java`
around lines 44 - 61, processRecord is missing validation for required JSON
fields so payload.path("productId").asLong() can silently return 0 and update
wrong metrics; update processRecord to explicitly verify presence and numeric
type of payload.productId (throw or skip and log when missing/invalid) before
calling idempotentProcessor.process and
metricsService.incrementLikeCount/incrementViewCount, and refactor the nearly
identical logic shared with CouponIssueConsumer into a shared utility (e.g., a
CatalogEventHandler or EventProcessor helper used by both processRecord
implementations) to remove duplication while preserving idempotency handling via
idempotentProcessor.process.
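
One possible shape for the shared validation helper is sketched below. To keep the example dependency-free, a plain `Map` stands in for the parsed Jackson payload; `RequiredFields`, `requiredLong`, and `requiredText` are hypothetical names and do not appear in the PR.

```java
import java.util.Map;

// Hypothetical shared helper; a plain Map stands in for the parsed JSON payload.
final class RequiredFields {
    private RequiredFields() {}

    // Returns the field as a long, or throws instead of silently defaulting to 0.
    static long requiredLong(Map<String, ?> payload, String field) {
        Object value = payload.get(field);
        if (value instanceof Long || value instanceof Integer) {
            return ((Number) value).longValue();
        }
        throw new IllegalArgumentException("Missing or non-integer field: " + field);
    }

    // Returns the field as non-blank text, or throws.
    static String requiredText(Map<String, ?> payload, String field) {
        Object value = payload.get(field);
        if (value instanceof String && !((String) value).isBlank()) {
            return (String) value;
        }
        throw new IllegalArgumentException("Missing or blank field: " + field);
    }
}
```

Both consumers could call the same helper before invoking idempotentProcessor.process, so a malformed record fails loudly rather than updating metrics for ID 0.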
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/OrderEventConsumer.java-54-55 (1)
54-55: ⚠️ Potential issue | 🟠 Major

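Using orderId as productId is a functional defect. Sales volume must be tracked per product, not per order.

The current code assigns the result of payload.path("orderId").asLong() to the productId variable and updates metrics with it. However, the event payload published by PaymentProcessor contains only the order ID, not a product ID. Because an order can contain several products, aggregating per-product sales statistics by order ID produces wrong results. The product IDs must be extracted from the order information and the metrics updated per product.

Also, new BigDecimal(payload.path("amount").asText()) throws NumberFormatException when the "amount" field is missing or malformed. If the exception is not handled, the whole batch fails. The code should verify that the amount field exists and is valid before converting it.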
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/OrderEventConsumer.java`
around lines 54 - 55, The code in OrderEventConsumer incorrectly assigns
payload.path("orderId").asLong() to productId and blindly constructs BigDecimal
from payload.path("amount").asText(), causing mis-aggregation and possible
NumberFormatException; update OrderEventConsumer to (1) obtain product IDs from
the order details rather than using orderId (e.g., parse payload.order.items or
fetch order by orderId and iterate its item.productId values) and update
per-product metrics for each productId found, and (2) validate the "amount"
field exists and is a valid numeric string before creating a BigDecimal (use a
safe parse/try-catch or Optional validation and skip/log invalid events rather
than letting NumberFormatException crash the batch), referencing the productId
and amount variables and the payload.path(...) usages to locate and replace the
incorrect logic.
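
The amount validation could look roughly like the sketch below. `AmountParser` and `parseAmount` are hypothetical names, and treating a negative amount as invalid is an assumption that the review does not state.

```java
import java.math.BigDecimal;
import java.util.Optional;

// Hypothetical helper for turning the raw "amount" text into a BigDecimal safely.
final class AmountParser {
    private AmountParser() {}

    // An empty Optional means the caller should skip, log, or dead-letter the event.
    static Optional<BigDecimal> parseAmount(String raw) {
        if (raw == null || raw.isBlank()) {
            return Optional.empty();
        }
        try {
            BigDecimal amount = new BigDecimal(raw.trim());
            // Assumption: a sales amount is never negative.
            return amount.signum() >= 0 ? Optional.of(amount) : Optional.empty();
        } catch (NumberFormatException e) {
            // Malformed input is reported as "absent" instead of failing the batch.
            return Optional.empty();
        }
    }
}
```

Returning an Optional keeps the decision about what to do with an invalid event (skip, log, or route to the DLT) in the consumer rather than in the parser.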

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8e172bfc-aa79-4fef-a47c-bf18a20e480b

📥 Commits

Reviewing files that changed from the base of the PR and between 58b3ce9 and 057d32c.

📒 Files selected for processing (95)

apps/commerce-api/build.gradle.kts
apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponFacade.java
apps/commerce-api/src/main/java/com/loopers/application/coupon/CouponIssueRequestInfo.java
apps/commerce-api/src/main/java/com/loopers/application/event/CouponIssueRequestedEvent.java
apps/commerce-api/src/main/java/com/loopers/application/event/LikeCountEventHandler.java
apps/commerce-api/src/main/java/com/loopers/application/event/PaymentCanceledEvent.java
apps/commerce-api/src/main/java/com/loopers/application/event/PaymentCompletedEvent.java
apps/commerce-api/src/main/java/com/loopers/application/event/PaymentFailedEvent.java
apps/commerce-api/src/main/java/com/loopers/application/event/ProductLikedEvent.java
apps/commerce-api/src/main/java/com/loopers/application/event/ProductUnlikedEvent.java
apps/commerce-api/src/main/java/com/loopers/application/event/ProductViewedEvent.java
apps/commerce-api/src/main/java/com/loopers/application/event/UserActivityEventHandler.java
apps/commerce-api/src/main/java/com/loopers/application/like/LikeFacade.java
apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentProcessor.java
apps/commerce-api/src/main/java/com/loopers/application/product/ProductFacade.java
apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java
apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequestRepository.java
apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueStatus.java
apps/commerce-api/src/main/java/com/loopers/domain/product/ProductRepository.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestJpaRepository.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestRepositoryImpl.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventFactory.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventJpaRepository.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventRepositoryImpl.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/outbox/OutboxEventService.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/product/ProductJpaRepository.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/product/ProductRepositoryImpl.java
apps/commerce-api/src/main/java/com/loopers/interfaces/api/coupon/CouponV1Controller.java
apps/commerce-api/src/main/java/com/loopers/interfaces/api/coupon/CouponV1Dto.java
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/LikeCountReconciliationScheduler.java
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxCleanupScheduler.java
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OutboxRelayScheduler.java
apps/commerce-api/src/main/java/com/loopers/support/config/AsyncConfig.java
apps/commerce-api/src/main/java/com/loopers/support/config/KafkaTopicConfig.java
apps/commerce-api/src/main/java/com/loopers/support/outbox/OutboxEvent.java
apps/commerce-api/src/main/java/com/loopers/support/outbox/OutboxEventRepository.java
apps/commerce-api/src/main/java/com/loopers/support/outbox/OutboxEventStatus.java
apps/commerce-api/src/main/resources/application.yml
apps/commerce-api/src/test/java/com/loopers/application/event/LikeEventHandlerIntegrationTest.java
apps/commerce-api/src/test/java/com/loopers/concurrency/LikeConcurrencyTest.java
apps/commerce-api/src/test/java/com/loopers/domain/coupon/CouponIssueRequestTest.java
apps/commerce-api/src/test/java/com/loopers/infrastructure/outbox/OutboxEventFactoryTest.java
apps/commerce-api/src/test/java/com/loopers/infrastructure/outbox/OutboxEventServiceIntegrationTest.java
apps/commerce-api/src/test/java/com/loopers/interfaces/api/coupon/CouponAsyncApiE2ETest.java
apps/commerce-api/src/test/java/com/loopers/interfaces/api/like/LikeApiE2ETest.java
apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/LikeCountReconciliationSchedulerIntegrationTest.java
apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/OutboxCleanupSchedulerIntegrationTest.java
apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/OutboxRelaySchedulerIntegrationTest.java
apps/commerce-api/src/test/java/com/loopers/support/outbox/OutboxEventTest.java
apps/commerce-streamer/src/main/java/com/loopers/application/coupon/CouponIssueProcessor.java
apps/commerce-streamer/src/main/java/com/loopers/application/idempotent/IdempotentProcessor.java
apps/commerce-streamer/src/main/java/com/loopers/application/metrics/ConsumerMetrics.java
apps/commerce-streamer/src/main/java/com/loopers/application/metrics/MetricsService.java
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/Coupon.java
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponIssueRequestRepository.java
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponIssueStatus.java
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/CouponRepository.java
apps/commerce-streamer/src/main/java/com/loopers/domain/coupon/IssuedCoupon.java
apps/commerce-streamer/src/main/java/com/loopers/domain/idempotent/EventHandled.java
apps/commerce-streamer/src/main/java/com/loopers/domain/idempotent/EventHandledRepository.java
apps/commerce-streamer/src/main/java/com/loopers/domain/log/EventLog.java
apps/commerce-streamer/src/main/java/com/loopers/domain/log/EventLogRepository.java
apps/commerce-streamer/src/main/java/com/loopers/domain/log/EventLogStatus.java
apps/commerce-streamer/src/main/java/com/loopers/domain/metrics/ProductMetrics.java
apps/commerce-streamer/src/main/java/com/loopers/domain/metrics/ProductMetricsRepository.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestJpaRepository.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/CouponIssueRequestRepositoryImpl.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/CouponJpaRepository.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/CouponRepositoryImpl.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/coupon/IssuedCouponJpaRepository.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/idempotent/EventHandledJpaRepository.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/idempotent/EventHandledRepositoryImpl.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/log/EventLogJpaRepository.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/log/EventLogRepositoryImpl.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/metrics/ProductMetricsJpaRepository.java
apps/commerce-streamer/src/main/java/com/loopers/infrastructure/metrics/ProductMetricsRepositoryImpl.java
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CatalogEventConsumer.java
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/CouponIssueConsumer.java
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/DemoKafkaConsumer.java
apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/OrderEventConsumer.java
apps/commerce-streamer/src/main/java/com/loopers/interfaces/scheduler/EventHandledCleanupScheduler.java
apps/commerce-streamer/src/main/resources/application.yml
apps/commerce-streamer/src/test/java/com/loopers/application/coupon/CouponIssueProcessorIntegrationTest.java
apps/commerce-streamer/src/test/java/com/loopers/application/idempotent/IdempotentProcessorIntegrationTest.java
apps/commerce-streamer/src/test/java/com/loopers/application/metrics/MetricsServiceIntegrationTest.java
apps/commerce-streamer/src/test/java/com/loopers/concurrency/CouponIssueConcurrencyTest.java
apps/commerce-streamer/src/test/java/com/loopers/domain/coupon/CouponIssueRequestTest.java
apps/commerce-streamer/src/test/java/com/loopers/domain/log/EventLogTest.java
apps/commerce-streamer/src/test/java/com/loopers/interfaces/consumer/CatalogEventConsumerTest.java
apps/commerce-streamer/src/test/java/com/loopers/interfaces/consumer/OrderEventConsumerTest.java
apps/commerce-streamer/src/test/java/com/loopers/interfaces/scheduler/EventHandledCleanupSchedulerIntegrationTest.java
modules/kafka/src/main/java/com/loopers/confg/kafka/KafkaConfig.java
modules/kafka/src/main/java/com/loopers/confg/kafka/KafkaTopics.java
modules/kafka/src/main/resources/kafka.yml

💤 Files with no reviewable changes (1)

apps/commerce-streamer/src/main/java/com/loopers/interfaces/consumer/DemoKafkaConsumer.java

coderabbitai · 2026-03-27T08:06:05Z

apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java

+    @Column(name = "reject_reason", length = 200)
+    private String rejectReason;
+
+    @Column(name = "created_at", nullable = false)
+    private ZonedDateTime createdAt;
+
+    @Column(name = "processed_at")
+    private ZonedDateTime processedAt;
+
+    protected CouponIssueRequest() {
+    }
+
+    private CouponIssueRequest(String eventId, Long couponId, Long userId) {
+        this.eventId = eventId;
+        this.couponId = couponId;
+        this.userId = userId;
+        this.status = CouponIssueStatus.PENDING;
+        this.createdAt = ZonedDateTime.now();
+    }
+
+    public static CouponIssueRequest create(String eventId, Long couponId, Long userId) {
+        return new CouponIssueRequest(eventId, couponId, userId);
+    }
+
+    public boolean isPending() {
+        return this.status == CouponIssueStatus.PENDING;
+    }


⚠️ Potential issue | 🔴 Critical

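There are no state-transition methods, so the PR does not compile in its current state.

This entity provides only create() and isPending(), yet CouponIssueProcessor in the same PR already calls request.reject(...) and request.complete(). As it stands the build breaks, and there is no path through which rejectReason and processedAt get updated. Add reject(String reason) and complete() to the entity so that status, reason, and processing time are updated in one place, and strengthen the state-transition unit tests as well. Also add tests verifying that status, rejectReason, and processedAt change as expected when reject/complete are called.

As per coding guidelines **/domain/**/*.java: keep the boundaries between entities, value objects, and DTOs clear, and check immutability and encapsulation.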

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/java/com/loopers/domain/coupon/CouponIssueRequest.java` around lines 40 - 66, Add state-transition methods to the CouponIssueRequest entity: implement public void reject(String reason) and public void complete() in class CouponIssueRequest so that reject(...) sets status = CouponIssueStatus.REJECTED, rejectReason = reason, and processedAt = ZonedDateTime.now(), and complete() sets status = CouponIssueStatus.COMPLETED, processedAt = ZonedDateTime.now(); keep fields encapsulated (no public setters) and use these methods where CouponIssueProcessor calls request.reject(...) and request.complete(); then add unit tests that construct a request via CouponIssueRequest.create(...), call reject("reason") and complete(), and assert status, rejectReason, and processedAt are updated as expected.

coderabbitai · 2026-03-27T08:06:05Z

apps/commerce-streamer/src/main/java/com/loopers/application/coupon/CouponIssueProcessor.java

+    public void process(String eventId, Long couponId, Long userId) {
+        CouponIssueRequest request = couponIssueRequestRepository.findByEventId(eventId)
+                .orElse(null);
+        if (request == null || !request.isPending()) {
+            return;
+        }
+
+        // Layer 1: duplicate-issue check (look up in advance instead of relying on the UK exception)
+        if (issuedCouponJpaRepository.existsByCouponIdAndUserId(couponId, userId)) {
+            request.reject("This coupon has already been issued");
+            couponIssueRequestRepository.save(request);
+            return;
+        }
+
+        // Layer 2: atomic UPDATE (quantity decrement + expiry/deletion check)
+        int affected = couponRepository.issueIfAvailable(couponId);
+        if (affected == 0) {
+            request.reject("The issuable quantity has been exhausted");
+            couponIssueRequestRepository.save(request);
+            return;
+        }
+
+        // Layer 3: create the issuance record
+        issuedCouponJpaRepository.save(IssuedCoupon.create(couponId, userId));


⚠️ Potential issue | 🔴 Critical

Processing is driven by the Kafka payload values rather than by the stored request.

Even after finding the CouponIssueRequest by eventId, the subsequent logic keeps using the couponId and userId parameters. If a serialization error or an incorrect republish makes the payload values diverge in production, the code deducts stock from a different coupon, issues it to a different user, and then marks the original request complete, which corrupts data. Once the lookup succeeds, it is safer to use only request.getCouponId()/request.getUserId(), or to fail immediately when the payload does not match. Also add a test verifying that when a mismatched payload arrives for the same eventId, either only the stored request values are used or the request is rejected.

Example fix

         CouponIssueRequest request = couponIssueRequestRepository.findByEventId(eventId)
                 .orElse(null);
         if (request == null || !request.isPending()) {
             return;
         }
+        Long requestedCouponId = request.getCouponId();
+        Long requestedUserId = request.getUserId();

         // Layer 1: duplicate-issue check (look up in advance instead of relying on the UK exception)
-        if (issuedCouponJpaRepository.existsByCouponIdAndUserId(couponId, userId)) {
+        if (issuedCouponJpaRepository.existsByCouponIdAndUserId(requestedCouponId, requestedUserId)) {
             request.reject("This coupon has already been issued");
             couponIssueRequestRepository.save(request);
             return;
         }

         // Layer 2: atomic UPDATE (quantity decrement + expiry/deletion check)
-        int affected = couponRepository.issueIfAvailable(couponId);
+        int affected = couponRepository.issueIfAvailable(requestedCouponId);
         if (affected == 0) {
             request.reject("The issuable quantity has been exhausted");
             couponIssueRequestRepository.save(request);
             return;
         }

         // Layer 3: create the issuance record
-        issuedCouponJpaRepository.save(IssuedCoupon.create(couponId, userId));
+        issuedCouponJpaRepository.save(IssuedCoupon.create(requestedCouponId, requestedUserId));

As per coding guidelines **/*Service*.java: check that responsibilities are organized per use case and that side effects are explicit.


🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/commerce-streamer/src/main/java/com/loopers/application/coupon/CouponIssueProcessor.java` around lines 23 - 46, The code in process(...) reads a saved CouponIssueRequest via couponIssueRequestRepository.findByEventId(eventId) but continues to use the incoming payload parameters couponId and userId, which risks data corruption if payload and stored request diverge; change process to derive couponId and userId from the retrieved request (use request.getCouponId() and request.getUserId()) after a successful find, and if the incoming payload differs from request values immediately reject the request via request.reject(...) and save it, then proceed to use only request-derived values when calling couponRepository.issueIfAvailable(...), issuedCouponJpaRepository.save(...), and any other downstream logic; ensure all references to the method parameters couponId/userId are replaced and add a unit test that simulates mismatched payload to verify the request is rejected.
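
The "reject on mismatch" option can be illustrated with a small comparison helper. This is only a sketch under assumptions: `StoredCouponRequest` is a hypothetical two-field view of the stored entity, and `PayloadGuard.matches` is an invented name.

```java
// Hypothetical minimal view of the stored request; the real entity has more fields.
record StoredCouponRequest(long couponId, long userId) {}

// Hypothetical guard comparing the Kafka payload against the stored request.
final class PayloadGuard {
    private PayloadGuard() {}

    // True only when both payload values are present and equal to the stored ones.
    static boolean matches(StoredCouponRequest stored, Long payloadCouponId, Long payloadUserId) {
        return payloadCouponId != null
                && payloadUserId != null
                && stored.couponId() == payloadCouponId
                && stored.userId() == payloadUserId;
    }
}
```

A processor could call this right after the lookup by eventId and reject the request when it returns false, then continue with the stored values only.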

- Hikari maximum-pool-size 40 → 30 (67% of the 45 connections the DB allows)
- Hikari minimum-idle 30 → 20
- Tomcat threads.max 200 → 40 (Hikari x 1.3)

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

apps/commerce-api/src/main/resources/application.yml (1)
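5-5: Pinning the thread cap in the shared YAML document weakens per-environment capacity control.

The setting on line 5 applies identically to local/test/dev/qa/prd, so environments with different traffic and DB limits risk under- or over-allocation. From an operations standpoint, this widens the tuning radius during an incident.
A safer configuration keeps a conservative default and explicitly overrides server.tomcat.threads.max only in prd (or a high-load profile).
As an additional test, compare tomcat.threads.busy, hikaricp.connections.pending, p95 latency, and 5xx/timeout rates across profiles under a 132 rps load.
Example fix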
 server:
   tomcat:
     threads:
-      max: 40 # Hikari 30 x 1.3; TX separation keeps connection hold time short, so threads > connections is OK
+      max: 20 # default (local/test)

 ---
 spring:
   config:
     activate:
       on-profile: prd
+server:
+  tomcat:
+    threads:
+      max: 40
As per coding guidelines **/application*.yml: changes to settings that affect operations, such as timeouts, connection pools, and logging levels, require a rationale and a stated scope of impact.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/commerce-api/src/main/resources/application.yml` at line 5, The current
fixed "max: 40" value in application.yml hardcodes thread/connection limits
across all environments; change this to a conservative default (or remove the
hard override) in the common application.yml and move the higher,
environment-specific override into a profile file (e.g., application-prod.yml)
using the server.tomcat.threads.max property (and corresponding Hikari pool
setting referenced by "max: 40") so only prod/high-load profiles increase
threads/pool size; after changing, verify under a 132 RPS load that
server.tomcat.threads.busy, hikaricp.connections.pending, p95 latency and
5xx/timeouts behave acceptably across profiles and adjust the prod override
accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/commerce-api/src/main/resources/application.yml`:
- Line 24: The dev/qa/prd profile blocks in kafka.yml are empty causing reliance
on the root ${BOOTSTRAP_SERVERS} env var; update the kafka profile sections
(dev, qa, prd) to include explicit bootstrap-servers defaults for each
environment (or alternatively implement an explicit env-var validation that
fails fast if BOOTSTRAP_SERVERS is missing) by editing the kafka.yml entries
referenced as bootstrap-servers and ${BOOTSTRAP_SERVERS}; also add integration
tests that verify (a) the application fails to start when BOOTSTRAP_SERVERS is
unset for env-var-only mode and (b) Kafka producer/consumer beans initialize
successfully when the profile-specific bootstrap-servers (or env var) are set.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0664e462-06ea-4230-a5d5-f01253419e9f

📥 Commits

Reviewing files that changed from the base of the PR and between 057d32c and a130ebf.

⛔ Files ignored due to path filters (1)

blog.md is excluded by !**/*.md and included by **

📒 Files selected for processing (3)

apps/commerce-api/src/main/resources/application.yml
apps/commerce-streamer/src/main/resources/application.yml
modules/jpa/src/main/resources/jpa.yml

✅ Files skipped from review due to trivial changes (1)

modules/jpa/src/main/resources/jpa.yml

🚧 Files skipped from review as they are similar to previous changes (1)

apps/commerce-streamer/src/main/resources/application.yml

coderabbitai · 2026-03-27T13:25:46Z

apps/commerce-api/src/main/resources/application.yml

    import:
      - jpa.yml
      - redis.yml
+      - kafka.yml


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check which profile scope the kafka import applies to in application.yml
rg -n -C2 'spring:|config:|import:|kafka\.yml|on-profile' apps/commerce-api/src/main/resources/application.yml

# Check whether kafka.yml defines a default for the required environment variable
rg -n -C2 'bootstrap-servers|BOOTSTRAP_SERVERS' modules/kafka/src/main/resources/kafka.yml

Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java

Length of output: 978

🏁 Script executed:

cat -n modules/kafka/src/main/resources/kafka.yml

Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java

Length of output: 1664

🏁 Script executed:

# Check if dev/qa/prd kafka bootstrap-servers are defined in other config files
find . -name "*.yml" -o -name "*.yaml" | xargs rg -l "bootstrap-servers|BOOTSTRAP_SERVERS" | head -20

# Also check for environment-specific application files
fd -e yml -e yaml "application.*" apps/commerce-api/src/main/resources/

Repository: Loopers-dev-lab/loop-pack-be-l2-vol3-java

Length of output: 182

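The dev/qa/prd profile sections define no bootstrap-servers, so those environments depend entirely on an environment variable.

Looking at the per-profile structure of kafka.yml, the local and test environments default to localhost:19092, but the dev, qa, and prd profile sections (lines 40-46) are completely empty. As a result, the dev/qa/prd environments rely solely on the ${BOOTSTRAP_SERVERS} environment variable in the root section, and if that variable is missing at deployment time the application fails to start.

From an operations standpoint, the Kafka bootstrap server address for each environment should be defined explicitly. Improve this in one of the following ways:

Recommended: specify a bootstrap-servers default in each of the dev/qa/prd profile sections (the development Kafka endpoint for dev, and the corresponding environment's endpoint for qa/prd).

If only the environment variable is used, make validation of the BOOTSTRAP_SERVERS environment variable mandatory in the deployment gateway for dev/qa/prd deployments.

In addition, add integration tests for dev/qa/prd that verify (1) that startup fails when BOOTSTRAP_SERVERS is unset, and (2) that the Kafka producer/consumer beans initialize successfully once the variable is set.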

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/main/resources/application.yml` at line 24, The dev/qa/prd profile blocks in kafka.yml are empty causing reliance on the root ${BOOTSTRAP_SERVERS} env var; update the kafka profile sections (dev, qa, prd) to include explicit bootstrap-servers defaults for each environment (or alternatively implement an explicit env-var validation that fails fast if BOOTSTRAP_SERVERS is missing) by editing the kafka.yml entries referenced as bootstrap-servers and ${BOOTSTRAP_SERVERS}; also add integration tests that verify (a) the application fails to start when BOOTSTRAP_SERVERS is unset for env-var-only mode and (b) Kafka producer/consumer beans initialize successfully when the profile-specific bootstrap-servers (or env var) are set.
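
The fail-fast alternative mentioned in the prompt might be sketched as below. `KafkaBootstrapCheck` and `requireBootstrapServers` are hypothetical names; how such a check would be wired into Spring startup is left open here.

```java
import java.util.Map;

// Hypothetical startup check: fail fast when BOOTSTRAP_SERVERS is missing or blank.
final class KafkaBootstrapCheck {
    private KafkaBootstrapCheck() {}

    // Takes the environment as a Map so the check can be tested without real env vars.
    static String requireBootstrapServers(Map<String, String> env) {
        String value = env.get("BOOTSTRAP_SERVERS");
        if (value == null || value.isBlank()) {
            throw new IllegalStateException(
                    "BOOTSTRAP_SERVERS must be set for the active profile");
        }
        return value.trim();
    }
}
```

In an application it could be invoked early as `KafkaBootstrapCheck.requireBootstrapServers(System.getenv())`, so that a missing variable produces one explicit error message at startup.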

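- Add a created_at < NOW() - 10 seconds condition to the findPending query to avoid needlessly re-collecting events whose immediate publish is still in progress
- State the response-latency trade-off of CallerRunsPolicy
- State the trade-off of the whenComplete callback execution context
- Document the cleanup strategy, the durability limits of replication-factor=1, and the partial-failure flow of BatchListenerFailedException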

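- max.poll.records 3000 → 500 (secures rebalancing margin)
- fetch.min.bytes 1MB → 1 byte (removes an unnecessary 5-second wait)
- fetch.max.wait.ms 5 s → 1 s (minimizes consumption latency)
- Fix the like sequence diagram (make the products vs product_metrics distinction explicit)
- Correct the coupon issuance path description (immediate publish + scheduler as backstop)
- Add a timing caveat between event_handled deletion and Outbox FAILED republishing
- State that up to 24 hours of inaccuracy is tolerated when a fire-and-forget message is lost
- Describe the flow after index i+1 for BatchListenerFailedException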

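- Add a five-section navigation table of contents at the top of the decision record
- Precede each diagram/code block with a one-line summary of what it shows
- Split the D5 trade-off into a blockquote for visual emphasis
- Split the D10 cleanup caveat into a separate blockquote
- Structure the batch partial-failure flow as a numbered list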

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

apps/commerce-streamer/src/main/java/com/loopers/interfaces/scheduler/EventHandledCleanupScheduler.java (1)
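14-18: Declaring @EnableScheduling once, on a configuration class, is recommended.

Declaring @EnableScheduling on every scheduler component creates duplication. Spring ignores the duplicates, but by convention declaring it once on a @Configuration class makes the point where scheduling is enabled explicit, which helps maintenance.
♻️ Suggested fix

Create a separate SchedulingConfig class, or move @EnableScheduling to an existing config class: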
`@Configuration`
`@EnableScheduling`
public class SchedulingConfig {
}
Then remove @EnableScheduling from this class:
 `@Slf4j`
 `@Component`
-@EnableScheduling
 `@RequiredArgsConstructor`
 public class EventHandledCleanupScheduler {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-streamer/src/main/java/com/loopers/interfaces/scheduler/EventHandledCleanupScheduler.java`
around lines 14 - 18, The EventHandledCleanupScheduler class currently carries
`@EnableScheduling` which should be declared only once in a configuration class;
remove the `@EnableScheduling` annotation from the EventHandledCleanupScheduler
class and create (or update) a configuration class such as SchedulingConfig
annotated with `@Configuration` and `@EnableScheduling` to centralize scheduling
setup so scheduling is enabled in one place for the application.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 574720d4-cd93-4e40-9b27-7d557c2359e0

📥 Commits

Reviewing files that changed from the base of the PR and between 21886fc and 7fe545c.

⛔ Files ignored due to path filters (1)

PR.md is excluded by !**/*.md and included by **

📒 Files selected for processing (2)

apps/commerce-streamer/src/main/java/com/loopers/interfaces/scheduler/EventHandledCleanupScheduler.java
modules/kafka/src/main/java/com/loopers/confg/kafka/KafkaConfig.java

🚧 Files skipped from review as they are similar to previous changes (1)

modules/kafka/src/main/java/com/loopers/confg/kafka/KafkaConfig.java

coderabbitai · 2026-03-27T13:57:39Z

...ce-streamer/src/main/java/com/loopers/interfaces/scheduler/EventHandledCleanupScheduler.java

+    @Scheduled(cron = "0 0 4 * * *")
+    @Transactional
+    public void cleanup() {
+        ZonedDateTime before = ZonedDateTime.now().minusDays(RETENTION_DAYS);
+        eventHandledRepository.deleteHandledBefore(before);
+        eventLogRepository.deleteLogsBefore(before);
+        log.info("event_handled + event_log cleanup 완료: {}일 이전 레코드 삭제", RETENTION_DAYS);
+    }


⚠️ Potential issue | 🟠 Major

Deleting a large number of rows in a single transaction raises performance concerns, and the job lacks observability.

From an operational standpoint, the following problems are a concern:

Bulk-delete performance: deleting seven days' worth of records in a single transaction can cause table lock contention, undo log growth, and MySQL replication lag. With a traffic profile of roughly 132 rps, millions of rows or more can accumulate over 7 days.

Insufficient observability: the number of deleted records is not logged, so it is hard to tell whether the job completed normally. The JPA repository's deleteByHandledAtBefore returns an int, but the current implementation does not use it.

No explicit time zone: ZonedDateTime.now() uses the system default time zone, so results can differ between environments.

🛠️ Suggested fix

+    private static final int BATCH_SIZE = 10000;
+
     @Scheduled(cron = "0 0 4 * * *")
-    @Transactional
     public void cleanup() {
-        ZonedDateTime before = ZonedDateTime.now().minusDays(RETENTION_DAYS);
-        eventHandledRepository.deleteHandledBefore(before);
-        eventLogRepository.deleteLogsBefore(before);
-        log.info("event_handled + event_log cleanup 완료: {}일 이전 레코드 삭제", RETENTION_DAYS);
+        ZonedDateTime before = ZonedDateTime.now(ZoneOffset.UTC).minusDays(RETENTION_DAYS);
+
+        int totalHandled = deleteInBatches(() -> eventHandledRepository.deleteHandledBefore(before, BATCH_SIZE));
+        int totalLogs = deleteInBatches(() -> eventLogRepository.deleteLogsBefore(before, BATCH_SIZE));
+
+        log.info("event_handled + event_log cleanup 완료: cutoff={}, event_handled={}건, event_log={}건 삭제",
+                before, totalHandled, totalLogs);
+    }
+
+    private int deleteInBatches(IntSupplier deleteBatch) {
+        int total = 0;
+        int deleted;
+        do {
+            deleted = deleteBatch.getAsInt();
+            total += deleted;
+        } while (deleted == BATCH_SIZE);
+        return total;
     }

The repository methods also need to be changed to support batched deletes and return the deleted row count:

@Modifying
@Transactional
@Query("DELETE FROM EventHandled e WHERE e.id IN (SELECT e2.id FROM EventHandled e2 WHERE e2.handledAt < :before ORDER BY e2.id LIMIT :limit)")
int deleteHandledBefore(@Param("before") ZonedDateTime before, @Param("limit") int limit);

Additional tests:

Measure delete performance and DB lock wait time with at least 1 million rows of test data

Check for deadlocks when INSERTs run concurrently with the delete

Add replication lag monitoring
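The batching loop suggested above can be exercised without a database. The following is a minimal sketch, assuming a hypothetical in-memory `FakeTable` in place of the real repositories; only the `deleteInBatches` loop shape comes from the suggestion, and the batch size is passed in as a parameter here purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.IntSupplier;

public class BatchedDeleteSketch {

    /** Calls deleteBatch until a batch comes back smaller than batchSize; returns the total. */
    static int deleteInBatches(IntSupplier deleteBatch, int batchSize) {
        int total = 0;
        int deleted;
        do {
            deleted = deleteBatch.getAsInt();
            total += deleted;
        } while (deleted == batchSize);
        return total;
    }

    /** Hypothetical in-memory stand-in for a table of rows past the retention cutoff. */
    static final class FakeTable {
        private final Deque<Integer> rows = new ArrayDeque<>();
        int calls = 0;

        FakeTable(int rowCount) {
            for (int i = 0; i < rowCount; i++) {
                rows.add(i);
            }
        }

        /** Removes at most limit rows and reports how many were removed. */
        int deleteUpTo(int limit) {
            calls++;
            int removed = 0;
            while (removed < limit && !rows.isEmpty()) {
                rows.poll();
                removed++;
            }
            return removed;
        }
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable(25);
        int total = deleteInBatches(() -> table.deleteUpTo(10), 10);
        System.out.println("deleted=" + total + ", calls=" + table.calls);
    }
}
```

Stopping when a batch is smaller than the batch size saves one round trip compared with looping until the count is 0, except when the row count is an exact multiple of the batch size. One caveat, to my knowledge: MySQL rejects LIMIT inside an IN subquery, so the suggested repository query may need to become a native DELETE ... ORDER BY id LIMIT statement; this is worth verifying against the actual database version.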

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/commerce-streamer/src/main/java/com/loopers/interfaces/scheduler/EventHandledCleanupScheduler.java` around lines 30 - 37, The cleanup method in EventHandledCleanupScheduler currently deletes all old rows in one transaction and uses system default timezone; change it to perform batched deletes by repeatedly calling a new repository method that accepts (ZonedDateTime before, int limit) and returns int (e.g., eventHandledRepository.deleteHandledBefore(before, batchSize)) until the returned count is 0, summing and logging total deleted rows per table; do the same for eventLogRepository (add deleteLogsBefore(before, limit) returning int). Remove the class-level `@Transactional` on cleanup (or ensure each batch delete runs in its own `@Transactional` scope) and replace ZonedDateTime.now() with ZonedDateTime.now(ZoneId.of("UTC") or a configurable ZoneId) and include per-batch and final totals in log messages; ensure repository methods are annotated `@Modifying` and `@Transactional` as needed.

coderabbitai · 2026-03-27T13:57:39Z

...ce-streamer/src/main/java/com/loopers/interfaces/scheduler/EventHandledCleanupScheduler.java

+    @Scheduled(cron = "0 0 4 * * *")
+    @Transactional
+    public void cleanup() {
+        ZonedDateTime before = ZonedDateTime.now().minusDays(RETENTION_DAYS);


🛠️ Refactor suggestion | 🟠 Major

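Use an explicit time zone instead of ZonedDateTime.now().

Relying on the system default time zone means behavior can differ between local development, CI, and production servers. A deletion cutoff that shifts with the deployment environment's time zone setting can lead to an operational incident.

🛠️ Fix

- ZonedDateTime before = ZonedDateTime.now().minusDays(RETENTION_DAYS);
+ ZonedDateTime before = ZonedDateTime.now(ZoneOffset.UTC).minusDays(RETENTION_DAYS);

Alternatively, explicitly use a ZoneId that matches the time zone of the timestamps stored in the DB.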
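One way to make the cutoff both explicit and testable is to pass a `java.time.Clock`. This is a sketch under assumptions, not the PR's code: the `cutoff` helper and the class name are hypothetical, and the 7-day value mirrors the RETENTION_DAYS mentioned in the commit notes.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class RetentionCutoffSketch {

    // Assumed to match the scheduler's RETENTION_DAYS constant.
    static final int RETENTION_DAYS = 7;

    /** Computes the deletion cutoff from the given clock, so the zone is never implicit. */
    static ZonedDateTime cutoff(Clock clock) {
        return ZonedDateTime.now(clock).minusDays(RETENTION_DAYS);
    }

    public static void main(String[] args) {
        // A fixed clock makes the result deterministic in tests.
        Clock fixed = Clock.fixed(Instant.parse("2026-03-27T04:00:00Z"), ZoneOffset.UTC);
        System.out.println(cutoff(fixed));

        // In production code the clock would be an explicit UTC (or DB-aligned) clock.
        System.out.println(cutoff(Clock.systemUTC()));
    }
}
```

With a Clock parameter, the zone choice lives in one place (where the clock is created) rather than at every call to now().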

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

ZonedDateTime before = ZonedDateTime.now().minusDays(RETENTION_DAYS);

ZonedDateTime before = ZonedDateTime.now(ZoneOffset.UTC).minusDays(RETENTION_DAYS);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/commerce-streamer/src/main/java/com/loopers/interfaces/scheduler/EventHandledCleanupScheduler.java` at line 33, The code uses ZonedDateTime.now() in EventHandledCleanupScheduler to compute "before" which relies on the system default TZ; change this to use an explicit ZoneId (e.g., ZoneId.of("UTC") or a configured/database-aligned ZoneId) so the retention cutoff is deterministic across environments: replace ZonedDateTime.now() with ZonedDateTime.now(yourZoneId) when computing "before" (used with RETENTION_DAYS) and ensure ZoneId is obtained from configuration or a constant matching the DB timezone.

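Fixes the issue where reserved stock stays locked permanently when an order is created but payment never proceeds. Uses an atomic UPDATE for concurrency control and adds the Order.hasCoupon() fact-providing method.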

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (5)

apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OrderExpirationScheduler.java (2)
22-26: Move the expiration threshold (10 minutes) into external configuration.

The scheduler interval (order.expiration.interval-ms) is configurable, but the expiration threshold (10 minutes) is hard-coded. Adjusting the threshold to match traffic characteristics in production would require a redeploy.
♻️ Suggested fix
+import org.springframework.beans.factory.annotation.Value;
+
 @Slf4j
 @Component
 @RequiredArgsConstructor
 public class OrderExpirationScheduler {

     private final OrderService orderService;
     private final OrderFacade orderFacade;

+    @Value("${order.expiration.threshold-minutes:10}")
+    private int expirationThresholdMinutes;
+
     @Scheduled(fixedDelayString = "${order.expiration.interval-ms:60000}")
     public void expireCreatedOrders() {
         List<Order> expired = orderService.findCreatedOlderThanWithItems(
-                ZonedDateTime.now().minusMinutes(10));
+                ZonedDateTime.now().minusMinutes(expirationThresholdMinutes));
         if (expired.isEmpty()) return;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OrderExpirationScheduler.java`
around lines 22 - 26, The hardcoded 10-minute expiration in
OrderExpirationScheduler.expireCreatedOrders should be configurable: add a
property (e.g., order.expiration.minutes or order.expiration.duration with a
sensible default of 10) and read it into the scheduler (via `@Value` or
`@ConfigurationProperties`) instead of ZonedDateTime.now().minusMinutes(10); then
pass the computed ZonedDateTime to orderService.findCreatedOlderThanWithItems so
operators can change the threshold without redeploying.
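The default-with-override behavior of the suggested property can be illustrated outside Spring. The sketch below is an assumption-laden stand-in: it reads the same key, order.expiration.threshold-minutes, from a plain `java.util.Properties` object, and the positive-value check is an addition for illustration that the review does not ask for.

```java
import java.time.Duration;
import java.util.Properties;

public class ExpirationThresholdSketch {

    // Property key taken from the suggested @Value expression.
    static final String KEY = "order.expiration.threshold-minutes";

    /** Resolves the threshold, falling back to 10 minutes when the key is absent. */
    static Duration threshold(Properties props) {
        long minutes = Long.parseLong(props.getProperty(KEY, "10"));
        if (minutes <= 0) {
            // Hypothetical guard: a zero or negative threshold would expire fresh orders.
            throw new IllegalArgumentException(KEY + " must be positive, got " + minutes);
        }
        return Duration.ofMinutes(minutes);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(threshold(props));   // default applies

        props.setProperty(KEY, "15");
        System.out.println(threshold(props));   // override applies
    }
}
```

Resolving to a Duration rather than a raw int keeps the unit attached to the value when it is later subtracted from the current time.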
23-26: Consider capping the batch size for when a large number of orders are due to expire.

After an outage recovery or an order surge, thousands to tens of thousands of stale CREATED orders can accumulate. Because the current implementation loads all of them in one query, this leads to:

Heap memory pressure (every Order and its OrderItems are loaded)

Longer scheduler run times

Delays to subsequent scheduled runs

Recommended actions:

Add a LIMIT to the repository query (for example, 100 rows at a time)

When more rows remain than the limit, either wait for the next scheduled run or re-run immediately
♻️ Batch size limit example

Repository change:
@Query("SELECT DISTINCT o FROM Order o JOIN FETCH o.orderItems "
     + "WHERE o.status = :status AND o.createdAt < :threshold "
     + "LIMIT :limit")
List<Order> findAllByStatusAndCreatedAtBeforeWithItems(
    @Param("status") OrderStatus status,
    @Param("threshold") ZonedDateTime threshold,
    @Param("limit") int limit);
Scheduler change:
private static final int BATCH_SIZE = 100;

@Scheduled(fixedDelayString = "${order.expiration.interval-ms:60000}")
public void expireCreatedOrders() {
    List<Order> expired = orderService.findCreatedOlderThanWithItems(
            ZonedDateTime.now().minusMinutes(expirationThresholdMinutes), BATCH_SIZE);
    // ... existing logic
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OrderExpirationScheduler.java`
around lines 23 - 26, The expireCreatedOrders scheduler currently loads all
CREATED orders older than the threshold at once; modify it to process in bounded
batches to avoid OOM/long runs: change
orderService.findCreatedOlderThanWithItems to accept a batch size (e.g., int
batchSize) and update the repository query to limit results (e.g., LIMIT/:size
or Pageable), then in OrderExpirationScheduler.expireCreatedOrders loop: fetch a
batch, process/expire them, and if the returned list size == batchSize either
immediately fetch the next batch or exit and let the next scheduled run
continue; introduce a BATCH_SIZE constant and ensure the scheduler uses it when
calling the service.
apps/commerce-api/src/main/java/com/loopers/infrastructure/order/OrderJpaRepository.java (1)
54-58: Adding clearAutomatically = true to @Modifying is recommended.

In the current expireOrder flow, the Order is not loaded into the L1 cache before the bulk UPDATE, so there is no problem today. However, if a future change adds a flow that loads the Order earlier in the same transaction, getOrder could return stale data after the bulk UPDATE.

Adding clearAutomatically = true defensively makes the persistence context clear automatically after the bulk UPDATE, so subsequent SELECTs always fetch the latest data from the DB.
♻️ Suggested fix
-    @Modifying
+    @Modifying(clearAutomatically = true)
     @Query("UPDATE Order o SET o.status = :newStatus WHERE o.id = :id AND o.status = :currentStatus")
     int updateStatusIfCurrent(@Param("id") Long id,
                               @Param("newStatus") OrderStatus newStatus,
                               @Param("currentStatus") OrderStatus currentStatus);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/main/java/com/loopers/infrastructure/order/OrderJpaRepository.java`
around lines 54 - 58, To prevent stale persistence-context data after the newly
added bulk UPDATE query, set clearAutomatically = true on the `@Modifying`
annotation declared on OrderJpaRepository's updateStatusIfCurrent method; the persistence context is then cleared automatically after the bulk
UPDATE runs, so later getOrder calls and queries in the same transaction always read the latest state from the DB.
apps/commerce-api/src/test/java/com/loopers/application/order/OrderServiceIntegrationTest.java (1)
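197-212: The expiration query test does not verify the key boundary values.

Every case here queries freshly created orders while only shifting now() by ± n minutes, so the test never checks the boundary of the actual requirement that only CREATED orders older than 10 minutes are expiration targets. In production, changing a single comparison operator could expire valid orders en masse, so it is safer to fix the Clock or extract createdAt into a controllable fixture and explicitly verify the 9:59, 10:00, and 10:01 cases. Additional tests should also cover the exact 10-minute mark and confirm that PAID/CANCELED orders are excluded. As per coding guidelines, "Check that unit tests include boundary values, failure cases, and exception flows."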
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/test/java/com/loopers/application/order/OrderServiceIntegrationTest.java`
around lines 197 - 212, The test 기준시간_이전에_생성된_CREATED_주문만_조회된다() doesn't
exercise the 10-minute boundary or state exclusions; update it to control
creation timestamps (use a fixed Clock or allow injecting createdAt via the test
fixture) and add explicit cases for createdAt = now - 9m59s, now - 10m00s, now -
10m01s when calling orderService.findCreatedOlderThanWithItems, asserting only
>10min are returned; also add cases where orders have status PAID or CANCELED to
verify they are excluded from expiration. Locate usages of
OrderService.createOrder / OrderCommand.Create and findCreatedOlderThanWithItems
to implement the controllable timestamps and new assertions.
apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/OrderExpirationSchedulerTest.java (1)
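101-115: The stock fixture in the PAID scenario differs from the real production invariant.

In the real payment-completion path, apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentProcessor.java Line 33-37, reserved stock is confirmed before the order is changed to PAID. This test, however, leaves only reserved stock in place and flips the status to PAID with order.pay(), so it verifies a combination that never occurs in the real service. In that state, the test would not catch expiration logic that incorrectly touches confirmed stock. It is safer to build the PAID fixture through the real success path, or at least align the confirmed state, and then add a test that checks the confirmed stock is left unchanged. As per coding guidelines, "Check isolation level, flakiness risk, and test data setup/cleanup strategy for integration tests."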
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/OrderExpirationSchedulerTest.java`
around lines 101 - 115, The test creates a PAID order by calling order.pay()
without following the real success path used in PaymentProcessor (which first
confirms reserved stock then marks the order PAID), so the fixture allows an
impossible combination and can miss bugs; update the test to mirror the real
flow by invoking the stock confirmation step (e.g., call the service/method used
to confirm reserved stock such as stockService.confirm(...) or the equivalent
confirmation method used by PaymentProcessor) before marking the order PAID (or
persist a Order in the confirmed state if your domain has such a method), then
call orderFacade.expireOrder(order.getId()) and keep/assert that
confirmed/finalized quantities remain unchanged (in addition to
reserved/available checks) to ensure the expiry logic does not touch confirmed
stock (reference symbols: PaymentProcessor, stockService.reserve,
stockService.confirm or confirm method, order.pay, orderFacade.expireOrder).



ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 071ce678-f308-40d0-9610-bd9e7f357489

📥 Commits

Reviewing files that changed from the base of the PR and between 7fe545c and fb0d229.

📒 Files selected for processing (11)

apps/commerce-api/src/main/java/com/loopers/application/order/OrderFacade.java
apps/commerce-api/src/main/java/com/loopers/application/order/OrderService.java
apps/commerce-api/src/main/java/com/loopers/application/payment/PaymentProcessor.java
apps/commerce-api/src/main/java/com/loopers/domain/order/Order.java
apps/commerce-api/src/main/java/com/loopers/domain/order/OrderRepository.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/order/OrderJpaRepository.java
apps/commerce-api/src/main/java/com/loopers/infrastructure/order/OrderRepositoryImpl.java
apps/commerce-api/src/main/java/com/loopers/interfaces/scheduler/OrderExpirationScheduler.java
apps/commerce-api/src/test/java/com/loopers/application/order/OrderServiceIntegrationTest.java
apps/commerce-api/src/test/java/com/loopers/domain/order/OrderTest.java
apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/OrderExpirationSchedulerTest.java

coderabbitai · 2026-03-30T16:34:40Z

...ommerce-api/src/test/java/com/loopers/interfaces/scheduler/OrderExpirationSchedulerTest.java

+            try {
+                orderFacade.expireOrder(order.getId());
+            } catch (Exception ignored) {
+            }
+
+            // The TX rollback also restores the order status
+            Order found = orderRepository.findByIdWithItems(order.getId()).orElseThrow();
+            assertThat(found.getStatus()).isEqualTo(OrderStatus.CREATED);


⚠️ Potential issue | 🟡 Minor

Exception propagation is not verified, so the rollback test can produce a false positive.

Swallowing the exception with catch (Exception ignored) means this test can still pass even if expireOrder() stops throwing. In production, a compensation failure that is silently buried delays incident detection, so first pin down the exception itself with assertThrows or assertThatThrownBy, then check the rolled-back order status. Additional tests should also verify the actual exception type and that stock/coupon side effects were not applied. As per coding guidelines, "Check that unit tests include boundary values, failure cases, and exception flows."

Patch example

 import static org.assertj.core.api.Assertions.assertThat;
 import static org.junit.jupiter.api.Assertions.assertAll;
+import static org.junit.jupiter.api.Assertions.assertThrows;
 ...
-            try {
-                orderFacade.expireOrder(order.getId());
-            } catch (Exception ignored) {
-            }
+            assertThrows(Exception.class, () -> orderFacade.expireOrder(order.getId()));

             // The TX rollback also restores the order status
             Order found = orderRepository.findByIdWithItems(order.getId()).orElseThrow();
             assertThat(found.getStatus()).isEqualTo(OrderStatus.CREATED);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/commerce-api/src/test/java/com/loopers/interfaces/scheduler/OrderExpirationSchedulerTest.java` around lines 127 - 134, The test currently swallows exceptions from orderFacade.expireOrder(order.getId()) which can mask failure and yield a false-positive rollback assertion; update OrderExpirationSchedulerTest to remove the empty catch and instead assert that the expected exception is thrown (using assertThrows or assertThatThrownBy against orderFacade.expireOrder(...)), then after asserting the exception verify rollback by reloading the Order via orderRepository.findByIdWithItems(order.getId()) and asserting found.getStatus() == OrderStatus.CREATED; optionally also assert the specific exception type and that inventory/coupon side-effects were not applied.
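The false positive described in this comment can be reproduced in a few lines of plain Java. The sketch below is illustrative only: `swallowingCheck` imitates the empty catch block in the test, and `throwingCheck` is a hypothetical minimal stand-in for what assertThrows enforces, not JUnit's implementation.

```java
public class SwallowedExceptionSketch {

    /** An action that may throw, like orderFacade.expireOrder(...) in the test. */
    interface Action {
        void run() throws Exception;
    }

    /** The pattern under review: any exception is swallowed, so this passes either way. */
    static boolean swallowingCheck(Action action) {
        try {
            action.run();
        } catch (Exception ignored) {
            // Intentionally empty, as in the original test.
        }
        return true;
    }

    /** Minimal stand-in for assertThrows: passes only if the action actually throws. */
    static boolean throwingCheck(Action action) {
        try {
            action.run();
        } catch (Exception expected) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Action noLongerThrows = () -> { };
        System.out.println(swallowingCheck(noLongerThrows)); // still passes
        System.out.println(throwingCheck(noLongerThrows));   // detects the regression
    }
}
```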

ghtjr410 added 25 commits March 27, 2026 14:07

feat: separate secondary logic using ApplicationEvent

7994f6a

feat: implement the Transactional Outbox Pattern

4aa948b

feat: idempotent Kafka Consumer processing + metrics aggregation

646f4e0

feat: Kafka-based first-come-first-served coupon issuance

c5ed60d

fix: add an AsyncUncaughtExceptionHandler safety net to AsyncConfig

a68c233

feat: DLQ setup + Consumer Group separation + batch processing

ff29345

feat: likeCount Reconciliation scheduler + async handling in LikeApiE2ETest

d897d3a

refactor: switch Outbox Relay to immediate publish + Scheduled fallback + self-consume

bb79d3e

refactor: move Outbox INSERT from a BEFORE_COMMIT listener to a direct Facade call

075e27d

feat: add Consumer operational visibility (EventLog + ConsumerMetrics)

3492930

fix: mark Outbox SENT via Consumer self-consume + fix Javadoc

396d98f

fix: switch OutboxRelayScheduler to synchronous .get(), remove backoff

e2394f1

refactor: centralize topic constants + apply TransactionHelper

d32c0f8

refactor: remove self-consume, unify SENT marking on Kafka ACK

d8e5194

refactor: simplify SENT marking with @Modifying + @Transactional

9838ef9

feat: add @Transactional(MANDATORY) to OutboxEventService

cd9fc59

- Throws immediately when called from a context without a TX
- Prevents Outbox INSERT from being called outside the business TX by mistake

test: add 2 commerce-streamer integration tests

9367061

- IdempotentProcessorIntegrationTest: idempotent processing, duplicate skip, EventLog, exception rethrow
- MetricsServiceIntegrationTest: UPSERT of like/view/sales metrics

test: add Consumer parsing/branching unit tests + streamer integration tests

aac48c3

- CatalogEventConsumerTest: parses product.liked/viewed, skips unsupported events
- OrderEventConsumerTest: parses payment.completed, only logs payment.failed, skips unsupported events

test: add 2 scheduler integration tests

6755f9a

- OutboxCleanupSchedulerIntegrationTest: deletes SENT rows older than 7 days, keeps PENDING rows
- EventHandledCleanupSchedulerIntegrationTest: deletes event_handled rows older than 7 days

test: add CouponAsyncApiE2ETest

58ba3f2

- Async issue request: 200 + PENDING, existing status on duplicate request, 404, 401
- Issue status lookup: confirms PENDING, 404 when never requested
- Removes the Kafka dependency by using @MockBean for KafkaTemplate

test: add CouponIssueConcurrencyTest

c591b89

- 200 concurrent requests against a quantity of 100 → verifies exactly 100 are issued
- 20 threads run concurrently, synchronized with CountDownLatch
- Confirms the atomic UPDATE (JPQL) prevents over-issuance
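The shape of that concurrency test can be sketched without a database. In the sketch below, a compare-and-set loop on an AtomicInteger stands in for the atomic UPDATE; the `CouponStock` class, the SQL in the comment, and the 10-second timeout are all assumptions for illustration and are not taken from the PR's code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CouponConcurrencySketch {

    /** In-memory stand-in for an atomic "UPDATE ... SET issued = issued + 1 WHERE issued < total". */
    static final class CouponStock {
        private final int total;
        private final AtomicInteger issued = new AtomicInteger();

        CouponStock(int total) {
            this.total = total;
        }

        /** Issues one coupon if any remain; the check and increment happen atomically. */
        boolean tryIssue() {
            while (true) {
                int current = issued.get();
                if (current >= total) {
                    return false;
                }
                if (issued.compareAndSet(current, current + 1)) {
                    return true;
                }
            }
        }

        int issued() {
            return issued.get();
        }
    }

    /** Fires the given number of requests from a fixed pool and returns how many coupons were issued. */
    static int run(int total, int requests, int threads) {
        CouponStock stock = new CouponStock(total);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(requests);
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                try {
                    start.await();          // hold workers so they begin together
                    stock.tryIssue();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        start.countDown();
        try {
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdownNow();
        }
        return stock.issued();
    }

    public static void main(String[] args) {
        System.out.println("issued=" + run(100, 200, 20));
    }
}
```

The key property is that the "is there stock left" check and the increment are a single atomic step; a separate read followed by a write would let two threads both pass the check on the last coupon.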

Copilot AI review requested due to automatic review settings March 27, 2026 07:49

Copilot started reviewing on behalf of ghtjr410 March 27, 2026 07:49

ghtjr410 changed the title ~~Volume 7~~ [7주차] 이벤트 기반 아키텍처, 카프카 Mar 27, 2026

Copilot AI reviewed Mar 27, 2026

coderabbitai bot reviewed Mar 27, 2026

ghtjr410 changed the title ~~[7주차] 이벤트 기반 아키텍처, 카프카~~ [7주차] 이벤트 기반 아키텍처, 카프카 - 최호석 Mar 27, 2026

chore: tune resource pool settings for the EC2 large (2 vCPU, 8GB) environment

a130ebf

- Hikari maximum-pool-size 40 → 30 (67% of the 45 connections the DB allows)
- Hikari minimum-idle 30 → 20
- Tomcat threads.max 200 → 40 (Hikari x 1.3)

coderabbitai bot reviewed Mar 27, 2026

ghtjr410 added 5 commits March 27, 2026 22:31

docs: add rationale for the absence of product_metrics Reconciliation + comment on the 7-day RETENTION_DAYS SLA

9db92dc

chore: delete PR.md, blog.md

cb01af5

coderabbitai bot reviewed Mar 27, 2026

coderabbitai bot reviewed Mar 30, 2026

	kafkaTemplate.send(outboxEvent.getTopic(), outboxEvent.getAggregateId(), outboxEvent.getPayload())
	kafkaTemplate.send(outboxEvent.getTopic(), outboxEvent.getAggregateId(), outboxEvent)

	ZonedDateTime before = ZonedDateTime.now().minusDays(RETENTION_DAYS);
	ZonedDateTime before = ZonedDateTime.now(ZoneOffset.UTC).minusDays(RETENTION_DAYS);

Conversation

ghtjr410 commented Mar 27, 2026 • edited by coderabbitai bot

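Environment

Traffic profile

Overall architecture

Decision records

Structure of this document

1. Why separate

D1. Rationale for separating core vs. secondary logic

D2. Whether to separate within the core: "not in a monolith"

D3. ApplicationEvent vs Outbox: criteria for internal vs. external delivery

2. How to publish

D4. Where to INSERT into the Outbox: direct call from the Facade

D5. Publishing strategy: immediate publish + Scheduled fallback

D6. Why AFTER_COMMIT was chosen

D7. Event vs Command topics

3. How to prevent duplicates

D8. Kafka-level idempotency: the limits of built-in Exactly-Once

D9. Publish level: At-Least-Once (always send at least once)

D10. Consume level: don't trust the Producer (idempotent processing)

4. What happens on failure

D11. DLT + retry strategy

D11-1. DLQ follow-up handling strategy (operational plan)

D12. Mechanisms that guarantee eventual consistency

D13. Observability: a 3-layer structure

5. Rationale for configuration values

Kafka Consumer settings

Kafka Producer settings

Async thread pool settings

Server settings

Resource pool design

Learning references

Review points

Review point 1: To overcome the limits of scheduled polling alone, I switched to immediate publish + fallback polling

Problems with scheduled polling alone

Improvement: immediate publish + fallback polling

Known trade-offs

Strategy for handling traffic growth

Purpose

Key changes

Risks/caveats

Testing/verification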

coderabbitai bot commented Mar 27, 2026 • edited

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

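Key findings from an operations perspective

1. Outbox event recovery and observability are insufficient

2. Possible race condition in idempotent processing

3. No policy for Kafka send failures in async handlers

4. Atomicity of coupon issuance and timing of the duplicate check

5. Partial-failure policy of the Kafka batch consumer needs verification

6. Monitoring and alerting need strengthening

7. Impact of resource retuning (thread pools, DB connections) needs analysis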

Copilot AI left a comment

Pull request overview

Reviewed changes