Force-pushed 227d614 to f514e4e
Pull request overview
This PR implements event-driven resource checking to prevent the use of outdated resource check results. Instead of relying solely on periodic timer-based checks, the system now triggers resource availability checks immediately when services connect or disconnect.
Key Changes:
- Replaced polling-based resource waiting with an event-driven channel-based system
- Added immediate resource check triggering on service lifecycle events (connection/disconnection)
- Upgraded Go version from 1.22 to 1.23 to support new language features (iter.Seq)
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| monitor_resources_test.go | Updated test to remove outdated retry logic and added new test implementation for event-driven resource checking |
| monitor_resources.go | Refactored monitoring to use select statement with timer and pause/resume channel, added broadcast functions for resource change events |
| management_api.go | Updated to dereference resourcesAvailable map directly instead of through pointer |
| main_test.go | Added new test case configuration for validating that outdated resource check results are not used |
| main.go | Extended ResourceManager with new mutexes and channels for event-driven coordination, updated resource reservation to use channel-based waiting instead of sleep polling |
| go.mod | Upgraded Go version to 1.23 |
| .github/workflows/tests.yml | Updated CI workflow to use Go 1.23 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
monitor_resources_test.go
Outdated
```go
time.Sleep(3500 * time.Millisecond)

serviceOneHealthCheckResponse := getHealthcheckResponse(t, serviceTwoHealthCheckAddress)
```
The variable is named "serviceOneHealthCheckResponse" but it's actually fetching the response from serviceTwoHealthCheckAddress. This is confusing and should be renamed to "serviceTwoHealthCheckResponse" for clarity.
main.go
Outdated
```go
rm.storeRunningServiceNoLock(name, runningService)
if runningService.activeConnections == 0 {
	log.Printf("[%s] All connections closed, sending resourceChange event", name)
	rm.broadcastResourceChanges(maps.Keys(serviceConfigByName[name].ResourceRequirements))
```
Potential deadlock: broadcastResourceChanges is called while serviceMutex is held (line 113). This method acquires resourceChangeByResourceMutex, which could lead to lock ordering issues if another goroutine tries to acquire these locks in a different order.
monitor_resources_test.go
Outdated
```go
//time.Sleep(1 * time.Second)
//statusResponse = getStatusFromManagementAPI(t, managementApiAddress)
//assertPortsAreClosed(t, []string{serviceOneHealthCheckAddress, serviceTwoHealthCheckAddress})
//verifyServiceStatus(t, statusResponse, serviceOneName, true, map[string]int{resourceName: 11})
//verifyServiceStatus(t, statusResponse, serviceTwoName, false, map[string]int{resourceName: 0})
//verifyTotalResourcesAvailable(t, statusResponse, map[string]int{resourceName: 11})
//verifyTotalResourceUsage(t, statusResponse, map[string]int{resourceName: 10})
```
This commented-out code block (lines 141-147) should be removed if it's not needed. Leaving large blocks of commented code reduces code maintainability and can cause confusion about whether it should be used or not.
Suggested change:
```diff
-//time.Sleep(1 * time.Second)
-//statusResponse = getStatusFromManagementAPI(t, managementApiAddress)
-//assertPortsAreClosed(t, []string{serviceOneHealthCheckAddress, serviceTwoHealthCheckAddress})
-//verifyServiceStatus(t, statusResponse, serviceOneName, true, map[string]int{resourceName: 11})
-//verifyServiceStatus(t, statusResponse, serviceTwoName, false, map[string]int{resourceName: 0})
-//verifyTotalResourcesAvailable(t, statusResponse, map[string]int{resourceName: 11})
-//verifyTotalResourceUsage(t, statusResponse, map[string]int{resourceName: 10})
```
```go
	handleNotEnoughResource(requestingService, outputError, true, resource, currentlyAvailableAmount, resourceRequirements[resource])
	return &resource
}
//todo: maxwaittime
```
TODO comment without a clear action plan or issue reference. Consider either implementing this or adding a reference to a tracking issue, or removing the comment if it's not needed.
Suggested change:
```diff
-//todo: maxwaittime
+// NOTE: no maximum wait time is enforced here; callers are expected to wait until resources become available.
```
main.go
Outdated
```go
runningService.stderrWriter.FinalFlush()
releaseResourcesWhenServiceMutexIsLocked(service.ResourceRequirements)
delete(resourceManager.runningServices, service.Name)
resourceManager.broadcastResourceChanges(maps.Keys(service.ResourceRequirements))
```
Potential deadlock: broadcastResourceChanges is called while serviceMutex is held (line 1458). This is the same issue as line 113 - calling a method that acquires resourceChangeByResourceMutex while holding serviceMutex can lead to lock ordering issues and potential deadlocks.
```go
if !ok {
	panic("pauseResumeChan closed unexpectedly, crashing to avoid infinite loop")
}
timer.Stop()
```
Timer is stopped (line 28) but not drained. After calling Stop(), if the timer has already fired, the channel may still contain a value. This should be drained to avoid a race condition. Consider using a select with default case to drain the channel after stopping.
Suggested change:
```diff
-timer.Stop()
+if !timer.Stop() {
+	select {
+	case <-timer.C:
+	default:
+	}
+}
```
```go
	resourceManager.resourceChangeByResourceMutex.Unlock()
}

func UnpauseResourceAvailabilityMonitoring(resourceName string) {
```
The function name is inconsistent with the package's naming. If this is meant to be package-private, rename it to "unpauseResourceAvailabilityMonitoring" (lowercase first letter); if it is intentionally exported, make sure the PascalCase name follows the same pattern as the package's other public functions.
Suggested change:
```diff
-func UnpauseResourceAvailabilityMonitoring(resourceName string) {
+func unpauseResourceAvailabilityMonitoring(resourceName string) {
```
Force-pushed 95af14b to 5549e36
Force-pushed 9e58013 to 96f4f99
Force-pushed 96f4f99 to c494ecf
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
README.md
Outdated
```markdown
- `listen_port`: Port the service listens on
- `is_running`: Whether the service is currently running
- `status`: stopped, waiting_for_resources, starting or running.
- `active_connections`: Number of active connections to the service
```
The documentation field 'active_connections' is outdated and no longer exists in the API. According to the changes in management_api.go, this should document 'waiting_connections' and 'proxied_connections' as separate fields instead.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
```go
func (rm ResourceManager) broadcastResourceChanges(resources iter.Seq[string], recheckNeeded bool) {
	for resource := range resources {
		rm.broadcastResourceChangeWhenResourceChangeByResourceMutexIsLocked(resource, recheckNeeded)
	}
}
```
The function broadcastResourceChanges calls broadcastResourceChangeWhenResourceChangeByResourceMutexIsLocked which, based on its name, expects resourceChangeByResourceMutex to already be locked. However, this function does not acquire the mutex before calling it. This function is called from incrementConnection (line 119 in main.go) and cleanUpStoppedServiceWhenServiceMutexIsLocked (line 1527 in main.go), both of which only hold serviceMutex. This will cause race conditions when accessing resourceChangeByResourceChans. Add mutex locking: acquire resourceChangeByResourceMutex at the start of this function and release it after the loop completes.
main.go
Outdated
```go
if interrupted {
	return nil, fmt.Errorf("interrupt signal was received")
}
```
This is a duplicate check of the one on lines 745-746. The second check is redundant and can be removed to simplify the code.
Suggested change:
```diff
-if interrupted {
-	return nil, fmt.Errorf("interrupt signal was received")
-}
```
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.
main.go
Outdated
```diff
 }
 resourceManager.serviceMutex.Lock()
-cleanUpStoppedServiceWhenServiceMutexIsLocked(&serviceConfig, runningService, false)
+cleanUpStoppedServiceWhenServiceMutexIsLocked(&serviceConfig, runningService, true)
```
The third parameter 'true' is unclear without context. Consider extracting this to a named constant like 'shouldReleaseResources' to improve code readability.
Suggested change:
```diff
-cleanUpStoppedServiceWhenServiceMutexIsLocked(&serviceConfig, runningService, true)
+shouldReleaseResources := true
+cleanUpStoppedServiceWhenServiceMutexIsLocked(&serviceConfig, runningService, shouldReleaseResources)
```
```go
	handleNotEnoughResource(requestingService, outputError, true, resource, currentlyAvailableAmount, resourceRequirements[resource])
	return &resource
}
//todo: maxwaittime
```
Todo comment should be properly formatted with a capital 'T' and proper spacing: "TODO: max wait time".
Suggested change:
```diff
-//todo: maxwaittime
+// TODO: max wait time
```
main.go
Outdated
```go
	stopService(*findServiceConfigByName(earliestLastUsedService))
	continue
}
//TODO: add resource amounts
```
This TODO comment is vague. It should specify what resource amounts need to be added and where (e.g., "TODO: add resource amounts to log message").
Suggested change:
```diff
-//TODO: add resource amounts
+// TODO: Add requested and currently available amounts for *missingResource to the following log message.
```
config.go
Outdated
```diff
 Amount                    int    `json:"Amount"`
 CheckCommand              string `json:"CheckCommand"`
-CheckIntervalMilliseconds uint   `json:"CheckIntervalMilliseconds"`
+CheckIntervalMilliseconds uint   `json:"CheckWhenNotEnoughIntervalMilliseconds"`
```
The struct field name 'CheckIntervalMilliseconds' doesn't match the JSON tag 'CheckWhenNotEnoughIntervalMilliseconds'. This will cause incorrect unmarshaling. The field name should be 'CheckWhenNotEnoughIntervalMilliseconds' to match the JSON tag.
```go
verifyTotalResourceUsage(t, statusResponse, map[string]int{resourceName: 0})
verifyServiceStatus(t, statusResponse, serviceOneName, ServiceStateStopped, 0, 0, map[string]int{resourceName: 0})
verifyServiceStatus(t, statusResponse, serviceTwoName, ServiceStateStopped, 0, 0, map[string]int{resourceName: 0})
verifyResourceUsage(t, statusResponse, map[string]int{resourceName: 0}, map[string]int{resourceName: 1}, map[string]int{resourceName: 0}, map[string]int{resourceName: 0})
```
Multiple consecutive map[string]int parameters make this call hard to read and error-prone. Consider using a struct with named fields (e.g., ExpectedResourceUsage{Reserved: ..., Free: ..., Used: ..., Total: ...}) to improve clarity.
Force-pushed 90a451a to e4a6c17
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated no new comments.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 11 comments.
```go
"healthcheck timed out, not starting another healthcheck command: not enough time left for another HealthcheckInterval(%v) in StartupTimeout(%v)",
sleepDuration,
timeout,
```
The error message format string references variable names instead of using formatting placeholders. The message should format the duration values properly instead of printing "%dms" and "%s" literally.
```go
		}
	}
}
if !enoughOfResource && (!firstCheckNeeded || resourceConfig.CheckCommand == "") {
```
In the logic at line 1056, the condition checks (!firstCheckNeeded || resourceConfig.CheckCommand == "") which means resources without CheckCommand will immediately return as not enough resource without proper error logging when firstCheckNeeded is true. This seems inconsistent - resources without CheckCommand should be handled the same way regardless of the firstCheckNeeded flag.
Suggested change:
```diff
-if !enoughOfResource && (!firstCheckNeeded || resourceConfig.CheckCommand == "") {
+if resourceConfig.CheckCommand == "" {
+	if !enoughOfResource {
+		handleNotEnoughResource(requestingService, outputError, false, resource, currentlyAvailableAmount, requiredAmount)
+		return &resource
+	}
+} else if !firstCheckNeeded && !enoughOfResource {
```
```go
reservedAmount, ok := resourceManager.resourcesReserved[resource]
if !ok {
	log.Printf(
		"[%s] ERROR: Resource \"%s\" is missing from the list of the reserved resources. This shouldn't be happening",
		requestingService,
		resource,
	)
	reservedAmount = 0
}
enoughOfResource := requiredAmount <= currentlyAvailableAmount-reservedAmount
return currentlyAvailableAmount, enoughOfResource
```
There is a potential race condition in accessing resourceManager.resourcesReserved at line 1130. This field is accessed without holding serviceMutex, but it's modified in other places while holding serviceMutex. This could lead to inconsistent reads or race detector warnings. The access should be protected by acquiring serviceMutex first.
```go
//This is not optimal since it is blocked until the first resource change, which could take longer than the consequent ones.
//But in a scenario when there are enough resources, we need to wait for them all anyway, and if it's not enough, well, we're still waiting.
```
The comment "This is not optimal since it is blocked until the first resource change" describes a known limitation but doesn't explain the impact or why this approach was chosen. Consider expanding this comment to explain the tradeoffs and whether this is acceptable for the current use case.
Suggested change:
```diff
-//This is not optimal since it is blocked until the first resource change, which could take longer than the consequent ones.
-//But in a scenario when there are enough resources, we need to wait for them all anyway, and if it's not enough, well, we're still waiting.
+// This is not optimal since it is blocked until the first resource change for each
+// resource, which could take longer than some of the subsequent changes. We accept
+// this coarse-grained blocking because:
+//   * this path is only used for resources with a CheckCommand, and the number of
+//     such resources per service is expected to be small;
+//   * keeping the logic as a simple "wait for the first change, then re-evaluate"
+//     avoids more complex, concurrent coordination and reduces the chance of subtle
+//     race conditions in the resource manager; and
+//   * in the common case where enough resources eventually become available, the
+//     service must wait for all required resources anyway, so the additional
+//     latency from this blocking approach is acceptable for our current workloads.
```
```go
	}
}
time.Sleep(100 * time.Millisecond) //Give lmp time to catch up
//bug in the actual code here?
```
The comment at line 198 has a potential issue. It states "bug in the actual code here?" which is unclear. This comment should either be clarified with specific details about what bug is suspected, or it should be removed if the bug was already addressed.
Suggested change:
```diff
-//bug in the actual code here?
```
```diff
 }

-func findFirstMissingResourceWhenServiceMutexIsLocked(resourceRequirements map[string]int, requestingService string, outputError bool) *string {
+func findFirstMissingResourceWhenServiceMutexIsLocked(resourceRequirements map[string]int, requestingService string, outputError bool, firstCheckNeeded bool) *string {
```
The variable name 'firstCheckNeeded' is confusing and doesn't clearly convey its purpose. Based on the usage, it appears to control whether to wait for the first check command result. A clearer name would be 'shouldWaitForCheckCommand' or 'needsInitialCheckCommand'.
```go
	free = resourceManager.resourcesAvailable[resourceName]
}
```
There is a potential race condition when accessing resourceManager.resourcesAvailable at line 69. This field should be protected by resourcesAvailableMutex according to the comment in the struct definition, but here it's accessed while holding serviceMutex. Either acquire resourcesAvailableMutex before reading, or ensure the locking strategy is documented and consistent.
```go
func (rm ResourceManager) broadcastFirstChangeIfMutexIsLocked(resourceName string) {
	resourceChangeByResourceChans, ok := rm.checkCommandFirstChangeByResourceChans[resourceName]
	if !ok {
		return //map is not initialized for resources without CheckCommand, so it being missing is ok
	}
	sendSignalToChannels(resourceChangeByResourceChans, resourceName, "checkCommandFirstChangeByResourceChans", struct{}{})
}

func (rm ResourceManager) broadcastResourceChangeWhenResourceChangeByResourceMutexIsLocked(resourceName string, recheckNeeded bool) {
	serviceChannels, ok := rm.resourceChangeByResourceChans[resourceName]
	if !ok {
		log.Printf("[Resource Monitor][%s] ERROR: resourceChangeByResourceChans map is not initialized", resourceName)
		return
	}
	sendSignalToChannels(serviceChannels, resourceName, "resourceChangeByResourceChans", recheckNeeded)
}
```
These functions use a receiver value type instead of a pointer receiver. Since they operate on and modify state in the ResourceManager (via locking and channel operations), they should use pointer receivers (*ResourceManager) for consistency with other methods like storeRunningService and incrementConnection.
```go
resourceManager.resourceChangeByResourceMutex.Lock()
if len(resourceManager.resourceChangeByResourceChans[resourceName]) > 0 || len(resourceManager.checkCommandFirstChangeByResourceChans[resourceName]) > 0 {
	timer.Reset(checkInterval)
}
resourceManager.resourceChangeByResourceMutex.Unlock()
```
The condition checks if there are items in the channels by checking the length, but channels can become empty between the check and the timer reset. This could lead to the timer not being reset when it should be. Consider using a more robust signaling mechanism or ensuring the check and reset are properly synchronized.
```go
resourceManager.resourceChangeByResourceMutex.Lock()
//This might be not necessary if there is no CheckCommand
for _, resourceToCheck := range resourceManager.checkCommandFirstChangeByResourceChans {
	delete(resourceToCheck, requestingService)
}
resourceManager.resourceChangeByResourceMutex.Unlock()
```
The comment at line 1085 states "This might be not necessary if there is no CheckCommand" which indicates uncertainty about the code's behavior. Either the cleanup is necessary or it isn't - this should be clarified and the comment updated to explain when and why this cleanup is needed.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 8 comments.
```go
runningService.manageMutex.Lock()
resourceManager.serviceMutex.Lock()
_, ok := resourceManager.runningServices[serviceConfig.Name]
if ok {
	resourceManager.serviceMutex.Unlock()
	log.Printf("[%s] ERROR: Trying to start a service while it is already present in the list of running services", serviceConfig.Name)
	return nil, fmt.Errorf("service already started")
}
```
This early-return path leaks runningService.manageMutex (it’s locked on line 619 and never unlocked if ok is true), which can deadlock future operations on this RunningService. Unlock runningService.manageMutex before returning (and consider creating/locking the mutex only after confirming the service isn’t already present).
```go
runningService.stdoutWriter.FinalFlush()
runningService.stderrWriter.FinalFlush()
```
stdoutWriter/stderrWriter can be nil (e.g., if runServiceCommand fails and returns nil writers, or if cleanup happens before they’re set). Calling FinalFlush() will panic. Guard these calls with nil checks (or ensure writers are always initialized to non-nil no-op implementations).
Suggested change:
```diff
-runningService.stdoutWriter.FinalFlush()
-runningService.stderrWriter.FinalFlush()
+if runningService.stdoutWriter != nil {
+	runningService.stdoutWriter.FinalFlush()
+}
+if runningService.stderrWriter != nil {
+	runningService.stderrWriter.FinalFlush()
+}
```
```go
currentlyAvailableAmount,
	enoughOfResource =
	isEnoughResourceForServiceWithCheckCommandThatRan(
		resource,
		config.ResourcesAvailable[resource].Amount,
		currentlyAvailableAmount,
		resourceManager.resourcesReserved[resource],
		requiredAmount,
	)
} else {
	log.Printf(
		"[%s] Not enough %s to start. Total: %d, Reserved by running services: %d, Required: %d",
		requestingService,
		resource,
		config.ResourcesAvailable[resource].Amount,
		resourceManager.resourcesInUse[resource],
		requiredAmount,
	)
```
This assignment is not valid Go syntax as written and will not compile. It should be a single multi-assign statement (e.g., currentlyAvailableAmount, enoughOfResource = ...) with correct formatting/indentation.
Suggested change:
```diff
-currentlyAvailableAmount,
-	enoughOfResource =
-	isEnoughResourceForServiceWithCheckCommandThatRan(
-		resource,
-		config.ResourcesAvailable[resource].Amount,
-		currentlyAvailableAmount,
-		resourceManager.resourcesReserved[resource],
-		requiredAmount,
-	)
+currentlyAvailableAmount, enoughOfResource = isEnoughResourceForServiceWithCheckCommandThatRan(
+	resource,
+	requestingService,
+	requiredAmount,
+)
 } else {
 	log.Printf(
 		"[%s] Not enough %s to start. Total: %d, Reserved by running services: %d, Required: %d",
 		requestingService,
 		resource,
 		config.ResourcesAvailable[resource].Amount,
 		resourceManager.resourcesInUse[resource],
 		requiredAmount,
 	)
```
```go
//This is not optimal since it is blocked until the first resource change, which could take longer than the consequent ones.
//But in a scenario when there are enough resources, we need to wait for them all anyway, and if it's not enough, well, we're still waiting.
for resource, changeChan := range firstChangeChanByResource {
	<-changeChan
	currentlyAvailableAmount, enoughOfResource := isEnoughResourceForServiceWithCheckCommandThatRan(
```
This blocks indefinitely waiting for changeChan. If the check command fails (execution/parsing error), checkResourceAvailabilityWithKnownCommand returns early and never broadcasts the ‘first change’ signal, so this can deadlock a request forever. Add a timeout/cancellation path (e.g., reuse the existing max-wait timer) and/or broadcast a first-change signal even on check errors so waiters can re-evaluate or fail fast.
```javascript
} else if (sortOrder === "proxied_connections") {
	// proxied_connections is no longer a valid sort option but may exist in localStorage; this migration code can be removed in the future.
```
The previous UI stored the sort key as active_connections (per the removed `<option value="active_connections">`). Migrating proxied_connections won’t help users upgrading from the immediate prior version. Update the migration to handle active_connections (and any other previously valid values) so the UI doesn’t fall back incorrectly.
Suggested change:
```diff
-} else if (sortOrder === "proxied_connections") {
-	// proxied_connections is no longer a valid sort option but may exist in localStorage; this migration code can be removed in the future.
+} else if (
+	sortOrder === "proxied_connections" ||
+	sortOrder === "active_connections"
+) {
+	// proxied_connections and active_connections are no longer valid sort options but may exist in localStorage; this migration code can be removed in the future.
```
```go
assert.Equal(t, expectedReserved[resource], resourceInfo.ReservedByStartingServices, "Resource %s reserved by starting services", resource)
assert.Equal(t, expectedFree[resource], resourceInfo.Free, "Resource %s free", resource)
assert.Equal(t, expectedUsed[resource], resourceInfo.InUse, "Resource %s used", resource)
assert.Equal(t, expectedTotal[resource], resourceInfo.Total, "Resource %s total available", resource)
```
Indexing expectedFree/expectedUsed/expectedTotal by key without checking presence can silently treat missing keys as 0, producing misleading passes/failures if maps are mis-specified but the lengths happen to match. Prefer asserting key existence per resource (e.g., value, ok := expectedFree[resource] + assert.True(t, ok, ...)) before comparing.
Suggested change:
```diff
-assert.Equal(t, expectedReserved[resource], resourceInfo.ReservedByStartingServices, "Resource %s reserved by starting services", resource)
-assert.Equal(t, expectedFree[resource], resourceInfo.Free, "Resource %s free", resource)
-assert.Equal(t, expectedUsed[resource], resourceInfo.InUse, "Resource %s used", resource)
-assert.Equal(t, expectedTotal[resource], resourceInfo.Total, "Resource %s total available", resource)
+// Ensure all expected maps contain this resource key before comparison.
+reservedExpected, ok := expectedReserved[resource]
+assert.True(t, ok, "Expected reserved resources to contain key %s", resource)
+freeExpected, ok := expectedFree[resource]
+assert.True(t, ok, "Expected free resources to contain key %s", resource)
+usedExpected, ok := expectedUsed[resource]
+assert.True(t, ok, "Expected used resources to contain key %s", resource)
+totalExpected, ok := expectedTotal[resource]
+assert.True(t, ok, "Expected total resources to contain key %s", resource)
+assert.Equal(t, reservedExpected, resourceInfo.ReservedByStartingServices, "Resource %s reserved by starting services", resource)
+assert.Equal(t, freeExpected, resourceInfo.Free, "Resource %s free", resource)
+assert.Equal(t, usedExpected, resourceInfo.InUse, "Resource %s used", resource)
+assert.Equal(t, totalExpected, resourceInfo.Total, "Resource %s total available", resource)
```
```markdown
With this configuration, Qwen and Automatic1111 can run at the same time. Assuming they do, a request for Gemma will unload the one least recently used. If they are currently in use, a request to Gemma will have to wait for one of the other services to free up.

`ResourcesAvailable` can include any resource metrics, CPU cores, multiple VRAM values for multiple GPUs, etc. These values are not checked against actual usage.

`CheckCommand` is a shell command that will ran any time a service requiring this resource needs to be started. If there is not enough resources, it will run every `CheckWhenNotEnoughIntervalMilliseconds` and any time a service using this resource stops until the value returned satisfies the amount of resource required.
```
Fix grammar: 'will ran' should be 'will run'.
Suggested change:
```diff
-`CheckCommand` is a shell command that will ran any time a service requiring this resource needs to be started. If there is not enough resources, it will run every `CheckWhenNotEnoughIntervalMilliseconds` and any time a service using this resource stops until the value returned satisfies the amount of resource required.
+`CheckCommand` is a shell command that will run any time a service requiring this resource needs to be started. If there is not enough resources, it will run every `CheckWhenNotEnoughIntervalMilliseconds` and any time a service using this resource stops until the value returned satisfies the amount of resource required.
```
```diff
@@ -0,0 +1,2 @@
+todos in the code
+make tests stable
```
Committing a generic todo.txt at repo root tends to become stale and isn’t actionable (no owners, links, or context). Consider moving these items into tracked issues, or expand this file with concrete references (file/line or issue IDs) so it remains useful.
Suggested change:
```diff
-make tests stable
+[OWNER: qa-team] Stabilize flaky integration tests (see ISSUE-123 in tracker for details).
```
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Force-pushed 962d9ef to 8998f5f