SweetCodey
left a comment
There was a problem hiding this comment.
Reviewed until API Design and left feedback
| * **Track order status:** Users should be able to view order status and track shipment progress. | ||
| * **View recommendations:** Users should receive personalized recommendations based on browsing and order history. | ||
|
|
||
| ### Non-Functional Requirements |
There was a problem hiding this comment.
Can we also mention the numbers here for availability and scalability?
|
|
||
| ```json | ||
| { | ||
| "recommendations": [ |
There was a problem hiding this comment.
Love the reason parameter. Very thoughtful!
|
|
||
| This API allows users to search for products using keywords and apply filters and sorting. | ||
|
|
||
|  |
| "reviewCount": 2547, | ||
| "imageUrl": "https://cdn.shopping.com/images/prod_12345.jpg", | ||
| "inStock": true, | ||
| "availableQuantity": 145 |
There was a problem hiding this comment.
Do we want to send available quantity back to the client?
There was a problem hiding this comment.
Yes. This will be helpful if we want to show messages on search results such as "Only a few left" or "Only 10 left".
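To make the use case concrete, here is a minimal sketch of how a client might turn `inStock` and `availableQuantity` into such a message. The function name and the threshold of 10 are assumptions for illustration, not part of the PR.

```python
from typing import Optional

# Assumed cutoff below which an urgency message is shown (hypothetical value).
LOW_STOCK_THRESHOLD = 10


def stock_message(in_stock: bool, available_quantity: int) -> Optional[str]:
    """Return a short label for a search result, or None when stock is plentiful."""
    if not in_stock or available_quantity <= 0:
        return "Out of stock"
    if available_quantity <= LOW_STOCK_THRESHOLD:
        return f"Only {available_quantity} left"
    return None
```

With the sample response above (`availableQuantity` of 145), no message would be shown.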
| The server returns a list of products matching the search criteria along with pagination metadata. | ||
|
|
||
| ```json | ||
| { |
There was a problem hiding this comment.
Thank you for including filters and pagination there.
|
|
||
| **HTTP Method & Endpoint** | ||
|
|
||
| We use `POST` method because adding an item to the cart creates a new entry in our system. The endpoint would be `/v1/cart/{cartId}/items`. |
There was a problem hiding this comment.
It would be great if you could add one more line explaining why you went for `/v1/cart/{cartId}/items` instead of `/v1/cart/{cartId}`. Mention that you are making changes to the items in the cart and not to the cart itself.
| **HTTP Method & Endpoint** | ||
|
|
||
| We use `PATCH` method for this API because it updates only a specific field (quantity) of an existing | ||
| item without replacing the entire item. The endpoint would be `/v1/cart/{cartId}/items/{itemId}` |
There was a problem hiding this comment.
Something to think about: why did we choose `/v1/cart/{cartId}/items/{itemId}` and not `/v1/cart/{cartId}/items`? We can always pass the item ID in the body.
There was a problem hiding this comment.
I preferred `itemId` in the URL because:
- As per REST principles, the URL should identify the resource. Since an item is treated as a separate resource, we put `itemId` in the URL.
- It also makes access logs easier to trace.
Please let me know if you think otherwise.
|
|
||
| #### Remove from Cart | ||
|
|
||
| This API allows users to remove an item from the cart. |
There was a problem hiding this comment.
This feels like updating the cart rather than deleting. Can we drop the delete API and keep only the update-cart API design?
Same for the HLD for deleting an item. We can simply say that it is updating the cart.
|
|
||
| This API allows users to place an order with items from their cart. | ||
|
|
||
|  |
There was a problem hiding this comment.
The image seems to be wrong. Can you verify it once?
There was a problem hiding this comment.
Sure, will update.
| * **Inventory Service** - Checks item availability and temporarily reserves item for the customer's order. | ||
| * **Fulfillment Service** - Handles the actual delivery of items in the customer's order. | ||
| * **Address Service** - Provides the address of the customer to ship order |
There was a problem hiding this comment.
Just a quick thought: Why do we have separate services for these? Can we combine some of these?
There was a problem hiding this comment.
I kept them separate because each of them has unique responsibilities and use cases.
- Inventory Service: it is about stock, which will be handled by a different team.
- Fulfillment Service: it is not customer-facing (except for tracking), is mostly linked to external logistics, and is updated asynchronously.
- Address Service: though it looks like plain CRUD, it also covers validation and suggestions. Addresses are also not always tied to ordering; customers can add or modify an address from the accounts page.
There was a problem hiding this comment.
Sounds good. It's the address service that I felt had too much granularity for the design but it's fine to proceed.
|
|
||
|  | ||
|
|
||
| ### Recommendations Flow |
There was a problem hiding this comment.
Overall feedback:
- If the batch runs once nightly but the TTL is 4 hours, recommendations expire ~20 hours before the next batch. For most of the day, users hit cache misses and fall back to trending products, thereby defeating the purpose of personalization.
- For a platform with, say, 100M users, pre-computing and storing recommendations for every user (including dormant ones) is wasteful. A better approach is to compute on-demand for active users and cache the result, or run the batch only for recently active users.
There was a problem hiding this comment.
- Agreed. It was a typo; it should be 24 hours.
- I agree that offline computation for all 100M users wastes resources. But I disagree on on-demand computation for everyone, because it is costly. Running the batch only for recently active users makes sense.
Below is my proposal. Please let me know what you think.
- Hot user pool: based on weighted usage activity, we pre-compute recommendations offline.
- For all other users, we compute on demand. Since these users are less active than hot users, the on-demand load is not that heavy.
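A rough sketch of this hybrid proposal, assuming a simple in-memory cache with the corrected 24-hour TTL. `RecommendationCache`, `run_nightly_batch`, and `get_recommendations` are hypothetical names, and `compute` stands in for whatever model produces the recommendations.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional, Tuple

TTL_SECONDS = 24 * 60 * 60  # the corrected 24-hour TTL from this thread


class RecommendationCache:
    """In-memory stand-in for the recommendation cache (assumption)."""

    def __init__(self, now: Callable[[], float] = time.time) -> None:
        self._now = now
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def put(self, user_id: str, recs: List[str]) -> None:
        # Store the expiry timestamp alongside the recommendations.
        self._entries[user_id] = (self._now() + TTL_SECONDS, recs)

    def get(self, user_id: str) -> Optional[List[str]]:
        entry = self._entries.get(user_id)
        if entry is None or entry[0] <= self._now():
            return None  # missing or expired
        return entry[1]


def run_nightly_batch(cache: RecommendationCache, hot_users: Iterable[str],
                      compute: Callable[[str], List[str]]) -> None:
    """Pre-compute offline, but only for the hot user pool."""
    for user_id in hot_users:
        cache.put(user_id, compute(user_id))


def get_recommendations(cache: RecommendationCache, user_id: str,
                        compute: Callable[[str], List[str]]) -> List[str]:
    """Serve from cache; compute on demand (and cache) for everyone else."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    recs = compute(user_id)
    cache.put(user_id, recs)
    return recs
```

Hot users are always served from the pre-computed entries, while a less active user pays the compute cost at most once per TTL window.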
There was a problem hiding this comment.
Wonderful. Love the hybrid approach.
| 4. On cache miss, the **Recommendation Service** falls back to location-based trending products fetched from **Trending Product Cache**. Trending data is cached because it changes infrequently and is read frequently, allowing low-latency access. | ||
| 5. The recommendations are returned to the user's device and displayed on the page. | ||
|
|
||
| ### Product Search Flow |
There was a problem hiding this comment.
The Product Search flow is solid overall, but the main issue is that the search service is doing too much synchronously in a single request.
- User activity logging is in the critical path. Step 10 is a database write that happens on every search query. If that DB is slow or down, your search latency spikes or the request fails entirely, all for a non-critical side effect. This should be fired asynchronously to a message queue like Kafka.
- The search index isn't being used to its full potential. Product name, price, image URL, rating, and stock status should already live in the Elasticsearch document itself, updated via an event-driven pipeline. Instead, every search query round-trips to the Product Catalog Cache and Inventory Cache, turning a single-hop read into a fan-out across three systems.
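A minimal sketch of the first point, using an in-process `queue.Queue` as a stand-in for Kafka (an assumption made only to keep the example self-contained). The function names and event shape are hypothetical.

```python
import queue
from typing import Dict, List

# Bounded in-process queue standing in for a Kafka topic (assumption).
activity_queue: "queue.Queue[Dict[str, str]]" = queue.Queue(maxsize=1000)


def log_activity_async(event: Dict[str, str]) -> bool:
    """Enqueue without blocking; drop the event if the queue is full."""
    try:
        activity_queue.put_nowait(event)
        return True
    except queue.Full:
        return False  # losing an activity event is acceptable; failing search is not


def search(query: str, index: Dict[str, List[str]]) -> List[str]:
    """Return results from the index; logging never blocks or fails the request."""
    results = index.get(query, [])
    log_activity_async({"type": "search", "query": query})
    return results


def drain(batch_size: int = 100) -> List[Dict[str, str]]:
    """Consumer side: pull up to batch_size events for a bulk write elsewhere."""
    events: List[Dict[str, str]] = []
    while len(events) < batch_size:
        try:
            events.append(activity_queue.get_nowait())
        except queue.Empty:
            break
    return events
```

The search path only does a non-blocking enqueue, so a slow or unavailable activity store can no longer affect search latency.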
There was a problem hiding this comment.
Agreed on both points. Will correct it.
| 9. The search service returns paginated results to the API Gateway, which sends them back to the user's device. | ||
| 10. The user's search query and clicked products are logged to the **User Activity Database** for future recommendation generation. | ||
|
|
||
| ### Cart Management Flow |
There was a problem hiding this comment.
Cart Management flow seems over-decomposed. The four operations (view, add, update, remove) all follow the same pattern: request hits API Gateway, routes to Cart Service, reads/writes the Cart Database, optionally checks inventory. Having four separate diagrams and walkthroughs for what is essentially CRUD on a single service adds bulk without adding insight. I'd consolidate this into one diagram showing the Cart Service, its dependencies (Cart DB, Inventory Service, Product Catalog Cache), and a brief table or list explaining how each operation flows through it.
There was a problem hiding this comment.
Makes sense. Will update it.
| 3. The **Cart Service** deletes the specified item from the **Cart Database**. | ||
| 4. The **Cart Service** returns a success response with the updated cart item count. | ||
|
|
||
| ### Checkout Flow |
There was a problem hiding this comment.
Checkout flow feedback:
- The payment happens before order creation. If the system crashes after charging the customer but before persisting the order, you've taken their money with no record. Suggested fix: Add a step before payment where the Order Service creates the order in a `PENDING_PAYMENT` state. After payment succeeds, update it to `CONFIRMED`. After payment fails, update it to `FAILED`.
- In the text, the order is created twice. Step 3 of "Payment Processing & Order Creation" creates the order, and then step 2 of "Post-Order Flow" creates it again. Suggested fix: Remove step 2 from "Post-Order Flow." If you adopt the fix from point 1, order creation moves even earlier, before payment, and this duplication naturally goes away.
- No TTL on inventory reservations. If a user abandons checkout or a service crashes after reserving, those items stay locked forever. Suggested fix: Add one line after the reservation step: "Reservations are set with a TTL (e.g., 5-10 minutes). A background job periodically releases expired reservations back to available stock." No need to show that in the diagram though. Text only is fine.
- "Atomic operation" is a little misleading. The flow spans multiple independent microservices. You can't have a true ACID transaction across them. Suggested fix: Replace "atomic operation" with "coordinated operation" and add a brief note: "If any step fails, the system executes compensation actions (e.g., releasing reserved inventory, issuing refunds) to maintain consistency. This follows the Saga pattern." No need to dive deeper into the Saga pattern though.
There was a problem hiding this comment.
- Valid point.
- Will fix it.
- Sure.
- Yes, I agree. Readers might associate "atomic operation" with a database operation.
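For reference, the first and fourth points of the checkout feedback can be sketched together as below. This is an illustration under stated assumptions, not the PR's implementation: `checkout`, `reserve`, `release`, and `charge` are hypothetical callables standing in for the Inventory and Payment services, and `orders` is a plain dictionary standing in for the Order Database.

```python
from enum import Enum
from typing import Callable, Dict


class OrderStatus(Enum):
    PENDING_PAYMENT = "PENDING_PAYMENT"
    CONFIRMED = "CONFIRMED"
    FAILED = "FAILED"


def checkout(order_id: str,
             orders: Dict[str, OrderStatus],
             reserve: Callable[[str], None],
             release: Callable[[str], None],
             charge: Callable[[str], bool]) -> OrderStatus:
    """Coordinated (not atomic) checkout with a compensation step."""
    reserve(order_id)  # hold inventory for this order
    # Persist the order BEFORE charging, so a crash never leaves a charge with no record.
    orders[order_id] = OrderStatus.PENDING_PAYMENT
    if charge(order_id):
        orders[order_id] = OrderStatus.CONFIRMED
    else:
        # Compensation action: give the reserved stock back.
        release(order_id)
        orders[order_id] = OrderStatus.FAILED
    return orders[order_id]
```

A crash between the two status writes leaves the order in `PENDING_PAYMENT`, which a reconciliation job could later resolve against the payment provider.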
| During checkout, the inventory service blocks the items being checked out so that others cannot order the same item. There are multiple approaches | ||
| available, such as optimistic and pessimistic locking, to reserve items. To learn more about inventory reservation, refer to the section [Inventory Consistency](#inventory-consistency). | ||
|
|
||
| ### Order Tracking Flow |
There was a problem hiding this comment.
Order Tracking has a synchronous dependency on the Fulfillment Service. Every "Track Order" request calls the Fulfillment Service in real time, but tracking info only updates a few times a day. If the Fulfillment Service is down, the entire tracking page breaks, even though the data it shows rarely changes. Suggested fix: Have the Fulfillment Service push status updates to the Order Service via events (you already have the Order Events Stream). The Order Service stores the latest tracking state alongside the order, making tracking a single read from one service with no cross-service call.
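A small sketch of the suggested fix, assuming events carry `orderId`, `status`, and an ISO 8601 UTC `updatedAt` string (this event shape and the class and method names are assumptions, not taken from the PR).

```python
from typing import Dict


class OrderService:
    """Keeps the latest tracking state next to the order (in-memory stand-in)."""

    def __init__(self) -> None:
        self._tracking: Dict[str, Dict[str, str]] = {}

    def on_fulfillment_event(self, event: Dict[str, str]) -> None:
        """Handle a status event pushed by the Fulfillment Service."""
        current = self._tracking.get(event["orderId"])
        # Ignore stale or duplicate events; same-format ISO 8601 UTC strings sort chronologically.
        if current is not None and event["updatedAt"] <= current["updatedAt"]:
            return
        self._tracking[event["orderId"]] = {
            "status": event["status"],
            "updatedAt": event["updatedAt"],
        }

    def track_order(self, order_id: str) -> Dict[str, str]:
        """Single local read; no call to the Fulfillment Service."""
        return self._tracking.get(order_id, {"status": "UNKNOWN", "updatedAt": ""})
```

With this shape, a Fulfillment Service outage only delays new status updates; the tracking page keeps serving the last known state.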
There was a problem hiding this comment.
Nice observation. Will update it
| ### Database Selection | ||
| We cannot make a "single database" choice for the entire online shopping system. Each functionality of the system will have a specific database choice based on the requirement. | ||
|
|
||
| | Guideline | Recommendation | |
There was a problem hiding this comment.
These guidelines are deliberately simplified to help form a mental model; they are not absolute rules. Could you add one line at the top: "These are general guidelines to build intuition, not absolute rules. The right choice always depends on your specific access patterns, scale, and consistency requirements."
|
|
||
| Based on the above guidelines, we made the database choices for our online shopping | ||
| service. | ||
| <table> |
There was a problem hiding this comment.
User Activity Database and Fulfillment Database are used in the design but missing from the Database Selection table. Can you add them too?
There was a problem hiding this comment.
Sure, will add it
| * Indexing: `cartId`, `userId` | ||
|
|
||
| #### Order Schema | ||
|  |
There was a problem hiding this comment.
Love the interrelation shown here in the diagram.
| 1. Customer A's checkout reads inventory. Available quantity is 1 and the version number is 10. | ||
| 2. At the same time, Customer B's checkout reads inventory. Available quantity is 1 and the version number is 10. | ||
| 3. Customer A's payment succeeds first. The system tries to update inventory using the below query. The query succeeds because the version is still 10. Version is incremented to 11. | ||
| ```sql | ||
| UPDATE inventory SET available_quantity = 0, version = 11 WHERE product_id = 'prod_12345' AND version = 10 | ||
| ``` | ||
| 4. Customer B's payment succeeds. The system tries to update inventory using the same query. This fails because version is now 11 and there is no entry with version 10. The database returns "0 rows updated". | ||
| ```sql | ||
| UPDATE inventory SET available_quantity = 0, version = 11 WHERE product_id = 'prod_12345' AND version = 10 | ||
| ``` | ||
| 5. The system detects this conflict and retries Customer B's checkout. During the retry, we get the available quantity as 0 and the version number as 11. | ||
| 6. Since the available quantity is 0, Customer B receives an "Out of Stock" error. | ||
|
|
There was a problem hiding this comment.
The optimistic locking example charges the customer before validating inventory. Both customers' payments succeed first, and the inventory conflict is only discovered afterward. That means Customer B gets charged and needs a refund. Under contention this will happen often, so we would see a lot of refunds.
Suggested fix: Reorder the example so the inventory update (version check) happens before payment processing.
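A runnable sketch of the reordered flow, using an in-memory SQLite table as a stand-in for the inventory store (an assumption; the PR does not name the database). The helper names are hypothetical, and `charge` stands in for the payment call. The version check runs first, and payment is only attempted for the customer whose update wins.

```python
import sqlite3
from typing import Callable, Tuple


def make_inventory() -> sqlite3.Connection:
    """Create the sample row from the example: quantity 1, version 10."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (product_id TEXT PRIMARY KEY, "
                 "available_quantity INTEGER NOT NULL, version INTEGER NOT NULL)")
    conn.execute("INSERT INTO inventory VALUES ('prod_12345', 1, 10)")
    conn.commit()
    return conn


def read_inventory(conn: sqlite3.Connection, product_id: str) -> Tuple[int, int]:
    row = conn.execute("SELECT available_quantity, version FROM inventory "
                       "WHERE product_id = ?", (product_id,)).fetchone()
    return row[0], row[1]


def try_reserve(conn: sqlite3.Connection, product_id: str, version: int) -> bool:
    """Decrement stock only if the version we read is still current."""
    cur = conn.execute(
        "UPDATE inventory SET available_quantity = available_quantity - 1, "
        "version = version + 1 "
        "WHERE product_id = ? AND version = ? AND available_quantity > 0",
        (product_id, version))
    conn.commit()
    return cur.rowcount == 1  # 0 rows updated means another checkout won


def checkout(conn: sqlite3.Connection, product_id: str,
             snapshot: Tuple[int, int], charge: Callable[[], None]) -> str:
    quantity, version = snapshot
    if quantity > 0 and try_reserve(conn, product_id, version):
        charge()  # payment happens only after the reservation succeeded
        return "ORDERED"
    # Conflict or no stock: re-read once and retry.
    quantity, version = read_inventory(conn, product_id)
    if quantity <= 0:
        return "OUT_OF_STOCK"
    if try_reserve(conn, product_id, version):
        charge()
        return "ORDERED"
    return "RETRY_LATER"
```

In the two-customer scenario, both read quantity 1 at version 10; Customer A's update succeeds and is charged, while Customer B's update matches no rows, the retry sees quantity 0, and no payment is ever attempted for B.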
| Optimistic locking follows a `detect conflicts and retry` approach. So, unlike pessimistic locking, we don't block all transactions in favor of | ||
| one transaction. When a conflict is detected, the other transactions retry to resolve it. | ||
|
|
||
| ### Recommendation Engine |
There was a problem hiding this comment.
Item-based collaborative filtering is more common in production e-commerce than user-based, because item similarities are more stable and cheaper to compute at scale. Consider adding a brief mention alongside the existing user-based explanation so readers have the complete picture.
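For readers who want a concrete picture, here is a minimal item-based sketch in plain Python, assuming binary purchase data and cosine similarity between items (one common choice; the PR does not specify a similarity measure). The function names and data shape are hypothetical.

```python
import math
from typing import Dict, List, Set


def item_similarity(purchases: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """Cosine similarity between items, from a user -> purchased-items map."""
    buyers: Dict[str, Set[str]] = {}
    for user, items in purchases.items():
        for item in items:
            buyers.setdefault(item, set()).add(user)
    sims: Dict[str, Dict[str, float]] = {item: {} for item in buyers}
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            overlap = len(buyers[a] & buyers[b])  # users who bought both items
            if overlap:
                sims[a][b] = overlap / math.sqrt(len(buyers[a]) * len(buyers[b]))
    return sims


def recommend(user: str, purchases: Dict[str, Set[str]],
              sims: Dict[str, Dict[str, float]], k: int = 3) -> List[str]:
    """Score unseen items by their similarity to items the user already bought."""
    owned = purchases.get(user, set())
    scores: Dict[str, float] = {}
    for item in owned:
        for other, score in sims.get(item, {}).items():
            if other not in owned:
                scores[other] = scores.get(other, 0.0) + score
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]
```

The similarity table depends only on items, so it can be computed offline and reused across users, which is part of why this variant tends to be cheaper at scale.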