VSB-TUO/ORCID Authority fix - do NOT call context.abort() on a locally created Context#1261
Conversation
…indexing Bundle.getBitstreams() creates a defensive copy (new ArrayList) which triggers lazy loading of the bitstreams collection. When the Hibernate session is already closed (e.g. during CLI database migration reindexing), this causes a LazyInitializationException. Wrap the call in try-catch to gracefully skip bitstream uncaching when the session is unavailable.
Bundle.getBitstreams() creates a defensive copy via new ArrayList<>(this.bitstreams) which iterates the Hibernate proxy and triggers lazy loading BEFORE Hibernate.isInitialized() can check the proxy state. All other getters (Item.getBundles, DSpaceObject.getHandles, etc.) return the raw Hibernate proxy, so isInitialized() works correctly for them. Guard with session.contains(bundle) instead: if the bundle is still managed by the session, lazy loading works normally; if detached (e.g. after session.clear() during batch processing), we skip - there is nothing to evict from a closed session.
WorkflowCurationIT.curationTest fails when CreateMissingIdentifiersIT runs first, because it pollutes the shared named plugin cache in LegacyPluginServiceImpl. Port upstream fix from DSpace#11907 (dspace-7_x backport): add clearNamedPluginClasses() call at test start to reset plugin state. See: DSpace#8533
…xing Root cause: SimpleORCIDAuthority.resolveLocalLabel() created a new READ_ONLY Context for DB fallback queries during CLI reindexing. Since Hibernate uses thread-local sessions, this new Context shared the same Session as the outer indexing Context. When resolveLocalLabel() called context.abort() in its finally block, abort() -> closeDBConnection() -> rollback() killed the shared transaction. Hibernate then cleared the persistence context, detaching ALL managed entities. When control returned to DSpaceObjectIndexFactoryImpl .buildDocument(), the call to dso.getHandle() triggered lazy loading on the now-detached entity, causing LazyInitializationException: no Session. Fix: Remove context.abort() from resolveLocalLabel(). Since only read operations are performed, no cleanup is needed. The session/transaction lifecycle is managed by the caller's Context. Also add RuntimeException safety net in SolrServiceImpl.updateIndex() so a single problematic item cannot kill the entire reindex process.
There was a problem hiding this comment.
Pull request overview
This PR aims to prevent ORCID authority label resolution from breaking caller operations by avoiding Context.abort() side effects on thread-local Hibernate Sessions, while also improving test isolation and indexing robustness.
Changes:
- Update
SimpleORCIDAuthorityto avoid aborting a locally createdContextduring local label fallback. - Adjust Hibernate entity eviction for
Bundleto avoid unsafe access toBundle.getBitstreams()when detached. - Make Solr indexing more resilient by catching per-object
RuntimeExceptionfailures duringupdateIndex(). - Stabilize
WorkflowCurationITby clearing the named-plugin cache prior to running the workflow curation test.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| dspace-api/src/test/java/org/dspace/workflow/WorkflowCurationIT.java | Clears plugin cache to reduce cross-test pollution. |
| dspace-api/src/main/java/org/dspace/discovery/SolrServiceImpl.java | Logs and continues on per-object runtime failures during indexing. |
| dspace-api/src/main/java/org/dspace/core/HibernateDBConnection.java | Avoids calling Bundle.getBitstreams() when the Bundle is detached. |
| dspace-api/src/main/java/org/dspace/content/authority/SimpleORCIDAuthority.java | Changes Context lifecycle during local label fallback to avoid aborting. |
Comments suppressed due to low confidence (1)
dspace-api/src/main/java/org/dspace/content/authority/SimpleORCIDAuthority.java:164
- The SimpleORCIDAuthority behavior change around locally-created Contexts is covered by unit tests in this module, but there is no test asserting the new lifecycle behavior (i.e., whether a locally-created Context is closed/aborted, and under what conditions). Please add/update unit tests in SimpleORCIDAuthorityTest to cover the Context lifecycle for the fallback lookup (both when a request Context exists and when it does not), as this is a regression-prone area tied to thread-local Hibernate session management.
private String resolveLocalLabel(String key, String locale) {
Context requestContext = ContextUtil.obtainCurrentRequestContext();
Context context = requestContext;
try {
if (context == null) {
context = createReadOnlyContext();
}
if (context == null) {
return key;
}
return queryLabel(context, key, locale);
| @@ -166,11 +165,16 @@ private String resolveLocalLabel(String key, String locale) { | |||
| } catch (Exception e) { | |||
| log.error("Error resolving local label for authority key '{}'", key, e); | |||
| return key; | |||
| } finally { | |||
| if (createdContext && context != null) { | |||
| context.abort(); | |||
| } | |||
| } | |||
| // We intentionally do NOT call context.abort() on a locally created Context. | |||
| // During CLI operations (e.g. reindexing), the new Context shares the Hibernate | |||
| // session (thread-local) with the caller's Context. Calling abort() triggers | |||
| // closeDBConnection() → rollback(), which kills the shared transaction and causes | |||
| // Hibernate to clear the persistence context — detaching ALL managed entities. | |||
| // This leads to LazyInitializationException when the caller later accesses | |||
| // lazy-loaded properties (e.g. DSpaceObject.handles). | |||
| // Since we only performed read operations, no cleanup is needed. | |||
| // The session/transaction lifecycle is managed by the caller's Context. | |||
There was a problem hiding this comment.
resolveLocalLabel() now creates a new read-only Context when there is no current request Context, but never closes it. Because Context/DBConnection are backed by a thread-local Hibernate Session, leaving this Context unclosed can leak an open Session/transaction when no caller Context exists, and it can also trigger Context.finalize() later (which may call abort() and close the shared Session while the caller is still using it). Please reintroduce deterministic cleanup for locally-created Contexts (e.g., only create/close a Context when you can guarantee it owns its Session, or use an isolated Session/DataSource-based lookup for the fallback query so it can be safely closed).
| // We intentionally do NOT call context.abort() on a locally created Context. | ||
| // During CLI operations (e.g. reindexing), the new Context shares the Hibernate | ||
| // session (thread-local) with the caller's Context. Calling abort() triggers | ||
| // closeDBConnection() → rollback(), which kills the shared transaction and causes | ||
| // Hibernate to clear the persistence context — detaching ALL managed entities. | ||
| // This leads to LazyInitializationException when the caller later accesses | ||
| // lazy-loaded properties (e.g. DSpaceObject.handles). | ||
| // Since we only performed read operations, no cleanup is needed. | ||
| // The session/transaction lifecycle is managed by the caller's Context. |
There was a problem hiding this comment.
The new comment claims abort() triggers "closeDBConnection() → rollback()", but Context.abort() explicitly skips rollback for READ_ONLY contexts (it only closes the DB connection). This makes the rationale misleading, and it also contradicts the statement "no cleanup is needed" (a locally-created Context still needs its Session/transaction lifecycle handled somehow). Please update the comment to accurately describe what happens for READ_ONLY Contexts and what cleanup strategy is being used.
| // We intentionally do NOT call context.abort() on a locally created Context. | |
| // During CLI operations (e.g. reindexing), the new Context shares the Hibernate | |
| // session (thread-local) with the caller's Context. Calling abort() triggers | |
| // closeDBConnection() → rollback(), which kills the shared transaction and causes | |
| // Hibernate to clear the persistence context — detaching ALL managed entities. | |
| // This leads to LazyInitializationException when the caller later accesses | |
| // lazy-loaded properties (e.g. DSpaceObject.handles). | |
| // Since we only performed read operations, no cleanup is needed. | |
| // The session/transaction lifecycle is managed by the caller's Context. | |
| // We intentionally do NOT call context.abort() here, even when we created a | |
| // local READ_ONLY Context. In CLI-style flows (e.g. reindexing) a newly created | |
| // Context may share the underlying, thread-local Hibernate Session/connection | |
| // with an outer caller Context. For READ_ONLY modes, Context.abort() skips any | |
| // rollback but still closes the DB connection backing that shared Session. | |
| // Closing that connection while the outer Context is still in use can break the | |
| // caller's transaction and leave managed entities in an inconsistent state, | |
| // leading to LazyInitializationException when lazy properties are accessed later. | |
| // We only perform read operations here; the owning infrastructure (web request / | |
| // CLI command) is responsible for managing the Session/transaction lifecycle and | |
| // ultimately closing the associated Context(s). |
| try { | ||
| indexContent(context, indexableObject, force); | ||
| } catch (RuntimeException e) { | ||
| log.error("Failed to index object {} (type={}): {}", | ||
| indexableObject.getUniqueIndexID(), indexableObject.getType(), | ||
| e.getMessage(), e); | ||
| } |
There was a problem hiding this comment.
Catching and swallowing RuntimeException here changes updateIndex() behavior: previously a RuntimeException would bubble up and fail the indexing run, but now the method will continue and still commit at the end, with no way for callers (e.g. IndexClient) to detect partial/failed indexing. Consider tracking failures and surfacing them (e.g., rethrow at end, return a status, or at least log a summary and skip the final commit when failures occurred) so operational tooling can reliably detect unsuccessful indexing runs.
| @@ -46,6 +48,8 @@ public class WorkflowCurationIT | |||
| extends AbstractIntegrationTestWithDatabase { | |||
| @Inject | |||
| private ItemService itemService; | |||
| @Autowired | |||
| private LegacyPluginServiceImpl legacyPluginService; | |||
|
|
|||
| /** | |||
| * Basic smoke test of a curation task attached to a workflow step. | |||
| @@ -57,6 +61,11 @@ public void curationTest() | |||
| throws Exception { | |||
| context.turnOffAuthorisationSystem(); | |||
|
|
|||
| // Reset the named plugin cache to avoid pollution from other tests | |||
| // (e.g. CreateMissingIdentifiersIT) that may have run before this one. | |||
| // See https://github.com/DSpace/DSpace/issues/8533 | |||
| legacyPluginService.clearNamedPluginClasses(); | |||
There was a problem hiding this comment.
This test reaches into the concrete LegacyPluginServiceImpl via @Autowired just to clear the named-plugin cache. Elsewhere in tests (e.g. CreateMissingIdentifiersIT / CuratorTest) this is done via CoreServiceFactory.getInstance().getPluginService().clearNamedPluginClasses(), which avoids depending on the implementation class and Spring wiring. Please switch to using the PluginService API (or CoreServiceFactory) here to keep the test less brittle and aligned with existing test patterns.
Problem description
Analysis
(Write here, if there is needed describe some specific problem. Erase it, when it is not needed.)
Problems
(Write here, if some unexpected problems occur during solving issues. Erase it, when it is not needed.)
Manual Testing (if applicable)
Copilot review