feat(organization): complete restructuring of models and data normalization #1267
Pull request overview
This PR restructures the organizations module to support the coexistence of raw and standardized (canonical) data, adding source metadata and new abstractions for organizational hierarchy and roles.
Changes:
- Updates the `Organization` model to include source metadata (`source`) and an external identifier (`external_id`), and adjusts the modeling of institution types.
- Introduces new models for hierarchy (`OrganizationalLevel`) and for raw data (`RawOrganization`), with linking/normalization mechanisms.
- Replaces `inst_type` with new choice lists (`SOURCE_CHOICES` and `ORGANIZATION_ROLES`).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| organization/models.py | Restructures the models for canonical/raw data, adds hierarchy and roles, and changes fields/constraints and the creation/lookup/validation logic. |
| organization/choices.py | Defines new choices for data source and organizational roles. |
| class Organization(BaseOrganization, CommonControlField, ClusterableModel): | ||
| """ | ||
| Representa organizações/instituições no sistema atual (versão 2) | ||
|
|
||
| name = remove_extra_spaces(name) | ||
| acronym = remove_extra_spaces(acronym) | ||
| Usado em: | ||
| - researcher.models: NewResearcher.affiliation (ForeignKey) |
Copilot
AI
Feb 1, 2026
Organization.create_or_update was removed from the model, but it is still called in multiple places (e.g., core_settings/tasks.py:65, researcher/tasks.py:66, organization/tasks.py:71, organization/tests.py:61). This will raise AttributeError at runtime and break existing tasks/tests. Either restore a backwards-compatible create_or_update (and any required fields/args) or update all call sites in this PR to the new API.
| blank=True, | ||
| related_name="organizations", | ||
| help_text=_("Types of institution (university, hospital, etc.)"), | ||
| ) |
Copilot
AI
Feb 1, 2026
This change renames the Organization institution-type relationship to institution_type, but the codebase still uses institution_type_scielo in tasks/tests (e.g., organization/tasks.py:77, organization/tests.py:67). Without updating those references (or providing a compatibility alias), imports and runtime code will fail with attribute/keyword errors.
| ) | |
| ) | |
| # Backwards-compatible alias for legacy code using `institution_type_scielo` | |
| institution_type_scielo = institution_type |
| class Meta: | ||
| verbose_name = _("Organization") | ||
| verbose_name_plural = _("Organizations") | ||
| unique_together = [("name", "location", "external_id", "source")] # NOVO | ||
| indexes = [ | ||
| models.Index(fields=["name"]), | ||
| models.Index(fields=["source"]), | ||
| models.Index(fields=["external_id"]), | ||
| ] |
Copilot
AI
Feb 1, 2026
Model fields/constraints were significantly changed (removed institution_type_mec, added source/external_id, renamed M2M), but there is no new migration in organization/migrations/ to reflect these schema changes. This will cause migration drift and runtime errors when Django loads the app. Add the appropriate schema/data migrations as part of this PR.
organization/models.py
Outdated
| # organizational_level só pode existir se organization existir | ||
| if self.organizational_level.exists() and not self.organization.exists(): | ||
| raise ValidationError( | ||
| { | ||
| "organizational_level": _( | ||
| "Cannot set organizational_level without organization" | ||
| ) | ||
| } | ||
| ) |
Copilot
AI
Feb 1, 2026
RawOrganization.clean() treats organizational_level and organization as querysets (.exists()), but they are ForeignKey fields. This will raise AttributeError during model validation/admin save. Use _id checks (or direct truthiness) for FK presence instead of .exists().
organization/models.py
Outdated
| # organizational_level deve pertencer à organization | ||
| if self.organizational_level.exists() and self.organization.exists(): | ||
| org_ids = set(self.organization.values_list("id", flat=True)) | ||
| for org_level in self.organizational_level.all(): | ||
| if org_level.organization_id not in org_ids: | ||
| raise ValidationError( | ||
| { | ||
| "organizational_level": _( | ||
| "Organizational level must belong to the selected organization" | ||
| ) | ||
| } | ||
| ) |
Copilot
AI
Feb 1, 2026
The second validation block also treats FKs as querysets (values_list, .all()). Since organizational_level is a single FK, validate by comparing self.organizational_level.organization_id against self.organization_id (when both are set) instead of iterating.
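A minimal sketch of the FK-based checks suggested in the two comments above. It uses plain dataclasses instead of the real Django models so that it runs on its own; `RawOrg`, `Level`, and `validate_links` are hypothetical stand-ins, and only the `organization_id` and `organizational_level` attribute names come from the review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Level:
    """Stand-in for OrganizationalLevel: each level belongs to one organization."""
    id: int
    organization_id: int


@dataclass
class RawOrg:
    """Stand-in for RawOrganization with two optional single-valued links."""
    organization_id: Optional[int] = None
    organizational_level: Optional[Level] = None


def validate_links(raw: RawOrg) -> None:
    """Mirror the intended clean() rules using plain attribute checks."""
    level = raw.organizational_level
    if level is None:
        return  # nothing to validate without a level
    if raw.organization_id is None:
        raise ValueError("Cannot set organizational_level without organization")
    if level.organization_id != raw.organization_id:
        raise ValueError(
            "Organizational level must belong to the selected organization"
        )
```

In the actual model, the same comparisons would presumably live in `clean()` and raise `ValidationError` keyed by field name instead of `ValueError`.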
organization/models.py
Outdated
| @property | ||
| def is_matched(self): | ||
| """Indica se está vinculado a Organization.""" | ||
| return self.organization.exists() |
Copilot
AI
Feb 1, 2026
is_matched uses self.organization.exists(), but organization is a ForeignKey, so .exists() is invalid. Use bool(self.organization_id) (or self.organization is not None) instead.
| original=clean_xml_tag_content(original), | ||
| name=clean_xml_tag_content(name), | ||
| normalized_name=clean_xml_tag_content(normalized_name), | ||
| acronym=clean_xml_tag_content(acronym), | ||
| country=clean_xml_tag_content(country), | ||
| state=clean_xml_tag_content(state), | ||
| city=clean_xml_tag_content(city), | ||
| level_1=clean_xml_tag_content(level_1), | ||
| level_2=clean_xml_tag_content(level_2), | ||
| level_3=clean_xml_tag_content(level_3), |
Copilot
AI
Feb 1, 2026
clean_xml_tag_content() defaults to assert_string=True, which returns None for values containing punctuation/digits (e.g., 'Univ. de São Paulo'). Using it for raw fields like original, name, and location/levels can silently null out valid inputs and undermine the goal of storing raw data. Consider calling it with assert_string=False (or using a less strict normalizer) for these fields.
| original=clean_xml_tag_content(original), | |
| name=clean_xml_tag_content(name), | |
| normalized_name=clean_xml_tag_content(normalized_name), | |
| acronym=clean_xml_tag_content(acronym), | |
| country=clean_xml_tag_content(country), | |
| state=clean_xml_tag_content(state), | |
| city=clean_xml_tag_content(city), | |
| level_1=clean_xml_tag_content(level_1), | |
| level_2=clean_xml_tag_content(level_2), | |
| level_3=clean_xml_tag_content(level_3), | |
| original=clean_xml_tag_content(original, assert_string=False), | |
| name=clean_xml_tag_content(name, assert_string=False), | |
| normalized_name=clean_xml_tag_content(normalized_name), | |
| acronym=clean_xml_tag_content(acronym), | |
| country=clean_xml_tag_content(country, assert_string=False), | |
| state=clean_xml_tag_content(state, assert_string=False), | |
| city=clean_xml_tag_content(city, assert_string=False), | |
| level_1=clean_xml_tag_content(level_1, assert_string=False), | |
| level_2=clean_xml_tag_content(level_2, assert_string=False), | |
| level_3=clean_xml_tag_content(level_3, assert_string=False), |
organization/models.py
Outdated
| return RawOrganization.objects.get(**params) | ||
| except RawOrganization.MultipleObjectsReturned: | ||
| return RawOrganization.objects.filter(**params).first() |
Copilot
AI
Feb 1, 2026
OrganizationalLevel.get is querying RawOrganization.objects instead of cls.objects. This returns the wrong model type and also means cls.DoesNotExist / cls.MultipleObjectsReturned handling in callers won’t work correctly. Change the query/exception handling to use cls (OrganizationalLevel) here.
| return RawOrganization.objects.get(**params) | |
| except RawOrganization.MultipleObjectsReturned: | |
| return RawOrganization.objects.filter(**params).first() | |
| return cls.objects.get(**params) | |
| except cls.MultipleObjectsReturned: | |
| return cls.objects.filter(**params).first() |
| city=None, | ||
| level_1=None, | ||
| level_2=None, | ||
| level_3=None, | ||
| organization=None, | ||
| organizational_level=None, | ||
| match_status=None, |
Copilot
AI
Feb 1, 2026
RawOrganization.create() accepts organization and organizational_level parameters but never assigns them to the created object. Either remove these unused parameters or set the corresponding FK fields during creation (and let clean() enforce consistency).
| Fluxo: | ||
| 1. Importação: preenche name, acronym, country, state, city, level_*, source | ||
| 2. Normalização: vincula organization e organizational_level | ||
|
|
||
| Usado em: | ||
| - journal.models: ? | ||
| - article.models: ? | ||
|
|
||
| Exemplo: | ||
| raw = RawOrganization.create_or_update( | ||
| user=user, | ||
| name="Univ. de São Paulo", | ||
| acronym="USP", | ||
| country="Brasil", | ||
| state="SP", | ||
| city="São Paulo", | ||
| level_1="Faculdade de Medicina", | ||
| level_2="Departamento de Cardiologia", | ||
| source="article_xml", | ||
| ) |
Copilot
AI
Feb 1, 2026
The RawOrganization docstring/example uses a source= argument and describes tracking source data, but the model currently has no source field and create_or_update does not accept source. Either add the field/API to match the intended behavior or update the docstring/PR description to reflect the implemented model.
4e7e3b3 to
5c32746
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 14 comments.
| name=None, | ||
| normalized_name=None, | ||
| acronym=None, | ||
| match_status=None, | ||
| ): |
Copilot
AI
Feb 2, 2026
RawOrganization.create_or_update() allows name=None, but on the creation path (except DoesNotExist) it calls create(), which requires name and will raise ValueError. To avoid runtime failures, make name required in create_or_update when the record has to be created (or derive name from original when it is not provided).
| except FileNotFoundError: | ||
| pass # Ou trate conforme a necessidade do sistema |
Copilot
AI
Feb 2, 2026
OrganizationInstitutionType.load() silently ignores FileNotFoundError. This can hide deploy/configuration errors and leave the database with no types loaded. Consider at least logging the error (or re-raising it in non-production environments) to make diagnosis easier.
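One way to surface the missing file instead of swallowing it is sketched below. The function name, the `strict` flag, and the one-name-per-row CSV layout are assumptions made for illustration; the real `load()` signature and file format are not shown in this review.

```python
import csv
import logging

logger = logging.getLogger(__name__)


def load_institution_types(path, strict=False):
    """Read institution type names from a CSV file, one per row.

    Returns the list of names. A missing file is logged; with strict=True
    the FileNotFoundError is re-raised instead of being swallowed.
    """
    try:
        with open(path, newline="", encoding="utf-8") as fp:
            return [row[0].strip() for row in csv.reader(fp) if row and row[0].strip()]
    except FileNotFoundError:
        # Record the full traceback so a broken deploy is visible in the logs.
        logger.exception("Institution type file not found: %s", path)
        if strict:
            raise
        return []
```

A caller could pass `strict=True` outside production so that a missing data file fails loudly during development and CI.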
| class Organization(BaseOrganization, CommonControlField, ClusterableModel): | ||
| """ | ||
| Representa organizações/instituições no sistema atual (versão 2) |
Copilot
AI
Feb 2, 2026
The Organization class no longer has the create_or_update method, but it is still called elsewhere in the code (e.g., core_settings/tasks.py:65, researcher/tasks.py:66, organization/tasks.py:71). This will cause AttributeError at runtime. Reintroduce a compatible create_or_update wrapper (or update all callers and tests in the same PR) before removing the old method.
|
|
||
| level_1 = clean_xml_tag_content(level_1) | ||
| level_2 = clean_xml_tag_content(level_2) | ||
| level_3 = clean_xml_tag_content(level_3) | ||
|
|
Copilot
AI
Feb 2, 2026
In OrganizationalLevel.get(), clean_xml_tag_content() can turn level_* into None (punctuation/numbers), so the query may end up filtering only by organization and return an arbitrary record when there are multiple levels. Consider not using assert_string=True for these fields and/or raising an error when normalization results in None.
| level_1 = clean_xml_tag_content(level_1) | |
| level_2 = clean_xml_tag_content(level_2) | |
| level_3 = clean_xml_tag_content(level_3) | |
| # Preserve original values to detect invalid normalization | |
| original_level_1 = level_1 | |
| original_level_2 = level_2 | |
| original_level_3 = level_3 | |
| level_1 = clean_xml_tag_content(level_1) | |
| level_2 = clean_xml_tag_content(level_2) | |
| level_3 = clean_xml_tag_content(level_3) | |
| # Se um nível foi fornecido, mas a normalização o tornou None, | |
| # isso indica um valor inválido (ex.: apenas pontuação/números). | |
| if original_level_1 not in (None, "") and level_1 is None: | |
| raise OrganizationLevelGetError( | |
| f"Valor inválido para level_1 após normalização: {original_level_1!r}" | |
| ) | |
| if original_level_2 not in (None, "") and level_2 is None: | |
| raise OrganizationLevelGetError( | |
| f"Valor inválido para level_2 após normalização: {original_level_2!r}" | |
| ) | |
| if original_level_3 not in (None, "") and level_3 is None: | |
| raise OrganizationLevelGetError( | |
| f"Valor inválido para level_3 após normalização: {original_level_3!r}" | |
| ) |
| original = clean_xml_tag_content(original) | ||
| params = {"original": original} |
Copilot
AI
Feb 2, 2026
RawOrganization.get() applies clean_xml_tag_content() to original. Because that function can return None for common values (punctuation, numbers, abbreviations), the lookup may end up querying original=None and return incorrect records. For original, use only whitespace/HTML cleanup (without the alpha-only validation), or validate and reject inputs that become None.
| original = clean_xml_tag_content(original) | |
| params = {"original": original} | |
| cleaned_original = clean_xml_tag_content(original) | |
| if cleaned_original is None: | |
| raise ValueError( | |
| "RawOrganization.get received an invalid original value after cleaning" | |
| ) | |
| params = {"original": cleaned_original} |
| Fluxo: | ||
| 1. Importação: preenche name, acronym, country, state, city, level_*, source | ||
| 2. Normalização: vincula organization e organizational_level | ||
|
|
Copilot
AI
Feb 2, 2026
RawOrganization documents/imports source (and the example passes source="article_xml"), but the model has no source field and create_or_update() does not accept that parameter either. This prevents tracking the origin of the raw data; add the field (and adjust constraints/indexes) or remove the reference from the flow/example.
| level_1 = clean_xml_tag_content(level_1) | ||
| level_2 = clean_xml_tag_content(level_2) | ||
| level_3 = clean_xml_tag_content(level_3) | ||
|
|
Copilot
AI
Feb 2, 2026
clean_xml_tag_content() (with assert_string=True) returns None for strings with punctuation/numbers (e.g., "Dept. 2"), which can make create_or_update() store NULL/lose data and also make get() ambiguous (filtering only by organization). For organizational levels, use a sanitizer that preserves common characters (e.g., only remove_extra_spaces) or call clean_xml_tag_content(..., assert_string=False) and validate explicitly when the result is invalid.
| level_1 = clean_xml_tag_content(level_1) | |
| level_2 = clean_xml_tag_content(level_2) | |
| level_3 = clean_xml_tag_content(level_3) | |
| cleaned_level_1 = clean_xml_tag_content(level_1, assert_string=False) | |
| if cleaned_level_1 is None and level_1 not in (None, ""): | |
| raise ValidationError( | |
| _("Invalid value for Organization Level 1: %(value)r"), | |
| params={"value": level_1}, | |
| ) | |
| level_1 = cleaned_level_1 | |
| cleaned_level_2 = clean_xml_tag_content(level_2, assert_string=False) | |
| if cleaned_level_2 is None and level_2 not in (None, ""): | |
| raise ValidationError( | |
| _("Invalid value for Organization Level 2: %(value)r"), | |
| params={"value": level_2}, | |
| ) | |
| level_2 = cleaned_level_2 | |
| cleaned_level_3 = clean_xml_tag_content(level_3, assert_string=False) | |
| if cleaned_level_3 is None and level_3 not in (None, ""): | |
| raise ValidationError( | |
| _("Invalid value for Organization Level 3: %(value)r"), | |
| params={"value": level_3}, | |
| ) | |
| level_3 = cleaned_level_3 |
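The three near-identical checks in the suggestion above could be folded into one helper. The sketch below is only an illustration: `clean_levels` and `collapse_spaces` are hypothetical names, `collapse_spaces` stands in for whichever cleaner the project settles on, and `ValueError` stands in for the `ValidationError` used in the suggestion.

```python
def clean_levels(levels, cleaner):
    """Clean each named level; reject values that the cleaner wipes out.

    `levels` maps a field name to its raw value; `cleaner` is any callable
    that returns the cleaned string or None.
    """
    cleaned = {}
    for field, raw in levels.items():
        provided = raw not in (None, "")
        value = cleaner(raw) if provided else None
        if provided and value is None:
            # The caller supplied something, but cleaning left nothing usable.
            raise ValueError(f"Invalid value for {field}: {raw!r}")
        cleaned[field] = value
    return cleaned


def collapse_spaces(text):
    """Hypothetical cleaner: collapse whitespace, return None if nothing is left."""
    text = " ".join(text.split())
    return text or None
```

For example, `clean_levels({"level_1": "Faculdade de Medicina", "level_2": None}, collapse_spaces)` returns the cleaned mapping, while a whitespace-only `level_1` raises instead of silently widening the query.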
| Fluxo: | ||
| 1. Importação: preenche name, acronym, country, state, city, level_*, source | ||
| 2. Normalização: vincula organization e organizational_level | ||
|
|
Copilot
AI
Feb 2, 2026
The docstring/flow mentions the source field during import (and the example passes source="article_xml"), but the RawOrganization model has no source field and create_or_update() does not accept that parameter either. This prevents tracking the origin of the raw data and contradicts the goal of the PR; add the field (and adjust unique_together/indexes as needed) or remove the reference from the flow/example.
| if not original: | ||
| raise ValueError("RawOrganization.get requires original") | ||
|
|
||
| original = clean_xml_tag_content(original) |
Copilot
AI
Feb 2, 2026
In RawOrganization.get(), clean_xml_tag_content() can return None for common values (punctuation, numbers, abbreviations). This can turn the lookup into original=None and return incorrect records. For original, prefer cleaning only HTML/whitespace (without the alpha-only validation), or validate and reject inputs that become None.
| original = clean_xml_tag_content(original) | |
| cleaned_original = clean_xml_tag_content(original) | |
| if cleaned_original is None: | |
| raise ValueError( | |
| "RawOrganization.get received an invalid 'original' value after cleaning" | |
| ) | |
| original = cleaned_original |
| import csv | ||
|
|
||
| from django.apps import apps | ||
| from django.core.exceptions import ValidationError |
Copilot
AI
Feb 2, 2026
Import of 'ValidationError' is not used.
| from django.core.exceptions import ValidationError |
5c32746 to
d2c0c12
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.
| import csv | ||
|
|
||
| from django.apps import apps | ||
| from django.core.exceptions import ValidationError |
Copilot
AI
Feb 2, 2026
Unused import: ValidationError is imported but not referenced in this module. Please remove it (or add the validation logic that was intended) to keep the module clean and avoid lint/test failures in stricter environments.
| from django.core.exceptions import ValidationError |
| null=True, | ||
| blank=True, |
Copilot
AI
Feb 2, 2026
OrganizationInstitutionType.name is nullable/blank, but get/create_or_update require a non-empty name and the model is declared unique by (name, source). Allowing NULL/blank here makes it easy to create invalid rows (e.g., via admin) and undermines the intended uniqueness. Consider making name non-null/non-blank.
| null=True, | |
| blank=True, |
| null=True, | ||
| blank=True, |
Copilot
AI
Feb 2, 2026
RawOrganization.original is declared null=True/blank=True, but RawOrganization.get/create_or_update require original and Meta.unique_together includes it. Leaving this nullable allows invalid/duplicate rows that your API can’t address. Consider making original non-null/non-blank to match the intended invariant.
| null=True, | |
| blank=True, | |
| null=False, | |
| blank=False, |
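The concern about nullable columns inside a uniqueness constraint can be checked with a small standalone experiment. The sketch below uses an in-memory SQLite database and a simplified two-column table, both chosen only for illustration; the project's actual database and schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_org (original TEXT, country TEXT, UNIQUE (original, country))"
)

# Two rows with NULL in a unique column are both accepted,
# because NULL values are treated as distinct from each other.
conn.execute("INSERT INTO raw_org VALUES (NULL, 'BR')")
conn.execute("INSERT INTO raw_org VALUES (NULL, 'BR')")
null_rows = conn.execute(
    "SELECT COUNT(*) FROM raw_org WHERE original IS NULL"
).fetchone()[0]

# Two rows with the same non-NULL values violate the constraint.
conn.execute("INSERT INTO raw_org VALUES ('USP', 'BR')")
try:
    conn.execute("INSERT INTO raw_org VALUES ('USP', 'BR')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Here the duplicate NULL rows are stored while the duplicate non-NULL row is rejected, which is why a non-null `original` is needed for the constraint to protect every row.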
| original=clean_xml_tag_content(original), | ||
| name=clean_xml_tag_content(name), | ||
| normalized_name=clean_xml_tag_content(normalized_name), | ||
| acronym=clean_xml_tag_content(acronym), | ||
| country=clean_xml_tag_content(country), |
Copilot
AI
Feb 2, 2026
RawOrganization is described as storing data "como vieram da fonte" ("as they came from the source"), but create() runs clean_xml_tag_content() on original/name/etc. Since clean_xml_tag_content(assert_string=True) returns None for punctuation/numbers (e.g., "Univ."), this can silently drop raw inputs. If you need to preserve raw text, use a non-destructive normalizer (or call clean_xml_tag_content(..., assert_string=False)).
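As a rough idea of what a non-destructive normalizer might look like, the sketch below strips inline markup and extra whitespace but keeps punctuation and digits. `normalize_raw_text` is a hypothetical helper, not part of the project, and it does not reproduce whatever else `clean_xml_tag_content` does.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")


def normalize_raw_text(value):
    """Strip markup and extra whitespace while keeping punctuation and digits.

    Returns None only for None or for input that is empty after cleaning.
    """
    if value is None:
        return None
    text = _TAG_RE.sub(" ", str(value))  # drop inline tags such as <i>...</i>
    text = html.unescape(text)           # decode entities such as &amp;
    text = " ".join(text.split())        # collapse runs of whitespace
    return text or None
```

With this approach, inputs such as "Univ. de São Paulo" or "Dept. 2" keep their punctuation and digits instead of being replaced by None.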
| institution_type = models.ManyToManyField( | ||
| "OrganizationInstitutionType", | ||
| verbose_name=_("Institution Type (SciELO)"), | ||
| verbose_name=_("Institution Type"), | ||
| blank=True, | ||
| related_name="organizations", |
Copilot
AI
Feb 2, 2026
This refactor removes/renames Organization fields (e.g., institution_type_mec / institution_type_scielo / is_official), but the rest of the repo still references them (e.g., researcher/tasks.py:66-74, organization/tests.py:75-81). As-is, those code paths will break due to missing attributes. Consider providing backward-compatible aliases/migration path or updating dependents in the same PR.
| if save: | ||
| # antes de adicionar, salva o objeto principal para garantir integridade | ||
| self.save() |
Copilot
AI
Feb 2, 2026
add_institution_type can fail when the Organization instance hasn't been saved yet: calling self.institution_type.add(...) requires a primary key, but if save=False and self.pk is None it will raise ValueError. Consider saving when self.pk is missing (or raising a clearer error / forcing save=True).
| if save: | |
| # antes de adicionar, salva o objeto principal para garantir integridade | |
| self.save() | |
| """ | |
| Add an OrganizationInstitutionType to this Organization, ensuring that | |
| the Organization instance has been saved before creating the M2M relation. | |
| """ | |
| if self.pk is None: | |
| if save: | |
| # antes de adicionar, salva o objeto principal para garantir integridade | |
| self.save() | |
| else: | |
| raise ValueError( | |
| "Organization instance must be saved before adding an " | |
| "institution type. Save the instance first or call " | |
| "add_institution_type(..., save=True)." | |
| ) |
| Q(location__country__name__icontains=search_term) | ||
| | Q(name__icontains=search_term) | ||
| | Q(source=search_term) | ||
| | Q(external_id=search_term) | ||
| ) |
Copilot
AI
Feb 2, 2026
For autocomplete, source/external_id are matched with exact equality (Q(source=search_term) / Q(external_id=search_term)). Other autocomplete filters in the repo use partial matching with __icontains (e.g., article/models.py:1746), which makes the UI much more usable. Consider switching these to __icontains for consistency.
| Exemplo: | ||
| raw = RawOrganization.create_or_update( | ||
| user=user, | ||
| name="Univ. de São Paulo", | ||
| acronym="USP", |
Copilot
AI
Feb 2, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring example is out of sync with the implementation: it calls RawOrganization.create_or_update(user=..., name=..., acronym=..., country=..., ... , source=...), but create_or_update currently requires positional args (original, country, state, city, level_1, level_2, level_3) and RawOrganization has no source field. Please update the example/docs (or adjust the API/model) so it reflects the actual callable signature and stored fields.
| null=True, | ||
| blank=True, |
Copilot
AI
Feb 2, 2026
OrganizationInstitutionType.source is also nullable/blank even though it participates in the unique_together constraint and is shown in `__str__`. If source is part of identity, consider making it non-null/non-blank (and/or restricting it with choices) to prevent rows like " (None)" and ambiguous uniqueness.
| null=True, | |
| blank=True, | |
| null=False, | |
| blank=False, |
| except IntegrityError: | ||
| return cls.get(name=name) | ||
| return cls.get( | ||
| original=original, | ||
| country=country, | ||
| state=state, |
Copilot
AI
Feb 2, 2026
Bug in IntegrityError fallback: create() cleans country/state/city/level_* before saving, but on IntegrityError it calls get() with the uncleaned variables. Because get() matches those fields exactly (and only cleans 'original'), this can fail to find the existing row and raise DoesNotExist. Fix by passing cleaned values (or cleaning all fields inside get()).
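The fix described above amounts to cleaning the inputs once and reusing the cleaned values on both the create path and the fallback lookup. The sketch below shows that shape with an in-memory stand-in: `Store`, `DuplicateKey`, `clean`, and `create_or_get` are hypothetical names, and `DuplicateKey` plays the role of `django.db.IntegrityError`.

```python
class DuplicateKey(Exception):
    """Stand-in for django.db.IntegrityError in this sketch."""


class Store:
    """In-memory table keyed by (original, country), mimicking a unique constraint."""

    def __init__(self):
        self.rows = {}

    def create(self, original, country):
        key = (original, country)
        if key in self.rows:
            raise DuplicateKey(key)
        self.rows[key] = {"original": original, "country": country}
        return self.rows[key]

    def get(self, original, country):
        # A KeyError here plays the role of DoesNotExist.
        return self.rows[(original, country)]


def clean(value):
    """Hypothetical cleaner: collapse whitespace."""
    return " ".join(value.split())


def create_or_get(store, original, country):
    # Clean once, then use the same cleaned values on both paths.
    original, country = clean(original), clean(country)
    try:
        return store.create(original, country)
    except DuplicateKey:
        return store.get(original, country)
```

Because the lookup receives exactly the values that were stored, a second call with differently spaced input finds the existing row instead of failing.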
Description:
🚀 What was done?
This PR implements a deep structural change in the organizations module to support the coexistence of raw data and standardized (canonical) data. The architecture was migrated to a more flexible model that makes it possible to track the origin of the information and maintain the hierarchy of organizational levels.
🛠 Main changes:
1. `choices.py`
   - `inst_type` (static).
   - `SOURCE_CHOICES`: identifies the source of the data (`ROR`, `MEC`, `Legacy`, etc.).
   - `ORGANIZATION_ROLES`: defines roles such as `Funder`, `Sponsor`, `Publisher`, among others.
2. `models.py` (Data Architecture)
   - `BaseOrganization` (Abstract): base field for name and acronym.
   - `Organization`: main standardized model. Now includes support for external IDs (e.g., ROR ID), logos, and institution types linked to multiple sources.
   - `RawOrganization`: new model for storing data exactly as extracted (XML/CSV). Allows manual or automatic linking to a standardized `Organization` via `match_status`.
   - `OrganizationalLevel`: implements hierarchical levels (School > Department > Laboratory) that must be linked to a parent organization.
   - `BaseOrganizationRole`: base class for assigning roles to organizations in different contexts (Collections, Journals, Articles), with date-range support (`BaseDateRange`).