Migration lep#520
Conversation
| except (CDSMigrationException, ValidationError, InvalidRelationValue) as e: | ||
|
|
||
| exc = ManualImportRequired( | ||
| message=str(e), |
There was a problem hiding this comment.
When we catch any kind of migration errors I get "" for str(e), so we lose the error message.
For example if we get report number already exists error it's ManualImportRequired and when we catch we're losing it's message

with the message:

maybe we can do
message=getattr(e, "message", None) or str(e)or maybe we can update the CDSMigrationException class so str(e) can keep the error message?
There was a problem hiding this comment.
I've fixed this elsewhere in the logging, it shouldn't happen anymore
| ) | ||
| for experiment in experiments: | ||
| if experiment.lower().strip() == "not applicable": | ||
| if experiment.lower().strip() in ["not applicable", "xx"]: |
There was a problem hiding this comment.
Can we also add select: here so i can remove it in my PR?
https://github.com/CERNDocumentServer/cds-migrator-kit/pull/521/changes#diff-49ab885d21545801b8fba4c7acb6cdff45760cdb4fbca9eff8d79dfd8592dc88R475
or we can add it with my PR :)
There was a problem hiding this comment.
your PR adds it already so someone will need to rebase
| def _match_affiliation(self, affiliation_name, json_entry): | ||
| """Match an affiliation against `CDSMigrationAffiliationMapping` db table.""" | ||
| if is_ror(affiliation_name): | ||
| ror = normalize_ror(affiliation_name) | ||
| name = AffiliationsMetadata.query.filter_by(pid=ror).one_or_none() | ||
| if name is None: | ||
| raise ManualImportRequired( | ||
| message="Affiliation {ror} does not exist in the AffiliationMetadata table".format(ror=ror), | ||
| field="validation", | ||
| stage="transform", | ||
| description="Add this affiliation", | ||
| recid=json_entry["recid"], |
There was a problem hiding this comment.
since only recid is used in json_entry wouldnt it be better if we just pass the recid?
There was a problem hiding this comment.
usually yes but I am not sure if it will make a big difference here
| if subject_value.lower() == "xx": | ||
| raise IgnoreKey("subjects") |
There was a problem hiding this comment.
Can we do like
if subject_value.lower() in ["xx", "other subjects"]:So maybe i can remove this?
https://github.com/CERNDocumentServer/cds-migrator-kit/pull/522/changes#diff-fea67b5537f89757fd7489fe0cd10225987b0d7ff4b4e8cd6e961057bef8879bR158-R160
There was a problem hiding this comment.
your PR will add this so someone will need to rebase. In general, I wouldn't change a PR because another feature needs something, because it is hard to draw a line where to stop :)
| def date(self, key, value): | ||
| """Translates dates.""" | ||
| dates = self.get("dates", []) | ||
| valid = value.get("a") |
There was a problem hiding this comment.
Do we need a date validation?
There was a problem hiding this comment.
we have the validation on the load step
|
|
||
|
|
||
| class ResearchModel(CdsOverdo): | ||
| """Translation model for MoUs.""" |
There was a problem hiding this comment.
the docstring? No :)
|
|
||
|
|
||
| class ResearchCommitteeModel(CdsOverdo): | ||
| """Translation model for MoUs.""" |
| "8564_8", # file id | ||
| "8564_s", # bibdoc id | ||
| "8564_x", # icon thumbnails sizes | ||
| # "8564_y", # file description - done by files dump |
There was a problem hiding this comment.
this is ignored for almost all models, why not here?
There was a problem hiding this comment.
it seems my answer for this was not posted somehow...
because for the research collection we use this field to determine open access fields, we can't ignore it
| if subject.get("subject", "") in ["xx"]: | ||
| del subject | ||
|
|
||
| subjects(json_entry) |
There was a problem hiding this comment.
Since we already check in the rule, you think it's needed? Or if we need should we also check select: other subjects to delete?
There was a problem hiding this comment.
yes we can move the whole clean up here. Let's move it after merging your PR, when we have all the values to drop?
* chore(tests): fix testing cases, formatting
it is best to review commit by commit
PR introduces some slight performance improvements
needs: CERNDocumentServer/cds-rdm#795