APP-356 Convert POTNTL_DUP_INV_SUM (Potential Duplicate Investigations)#3131
krista-skylight wants to merge 63 commits into `main` from
Conversation
…into kc/convert_POTNTL_DUP_INV_SUM merge branches
…into kc/convert_POTNTL_DUP_INV_SUM merge main
…into kc/convert_POTNTL_DUP_INV_SUM merge from main
…into kc/convert_POTNTL_DUP_INV_SUM merge with main
…into kc/convert_POTNTL_DUP_INV_SUM merge changes from main
…into kc/convert_POTNTL_DUP_INV_SUM merge from main
JordanGuinn
left a comment
This is looking good!! 🔥
| library_name: str = Field(min_length=1) | ||
| data_source_name: str = Field(min_length=1) | ||
| subset_query: str = Field(min_length=1) | ||
| days_value: int | None = None # Specific to potntl_dup_inv_sum |
(thought, nb): Probably not worth addressing at this moment, but if there end up being multiple reports that require unique properties like this, maybe we end up sketching out a custom_props Object field or similar to capture them all? Just in the spirit of minimizing the amount of library-specific bits on the ReportSpec model.
Yea def a good idea to revisit as we work through more translations!
| sas: | ||
| image: ghcr.io/cdcent/nbs7-sas-linux:v1.0.4 | ||
| platform: linux/amd64 | ||
| container_name: "${COMPOSE_PROJECT_NAME}-sas" |
(q, nb): Any particular reason for this addition?
I do not know enough to understand why but when I was troubleshooting issues with @mcmcgrath13 doing sas translations, she suggested making this change to get around an error message that I can no longer recall lol.
Mary, do you think I should push this for everyone or leave it as something only I modify?
I think it's fine to push - makes working with the sas container easier via compose
| db_tables = [t['table_name'] for t in schema['tables']] | ||
| fk_tables = schema['config']['nbs']['fk_tables'] | ||
| fk_tables = schema['config'].get('nbs', {}).get('fk_tables', []) |
(q, nb): What's this change for?
So if no fk tables are specified (not always relevant), we default to an empty list.
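A quick standalone sketch of that fallback behavior (the sample schemas here are made up, not the library's actual config):

```python
# Two sample configs: one with fk_tables specified, one without the 'nbs' block.
with_fk = {'config': {'nbs': {'fk_tables': ['patient', 'investigation']}}}
without = {'config': {}}

def fk_tables(schema):
    # Chained .get calls so a missing 'nbs' block or 'fk_tables' key
    # falls back to an empty list instead of raising KeyError.
    return schema['config'].get('nbs', {}).get('fk_tables', [])

assert fk_tables(with_fk) == ['patient', 'investigation']
assert fk_tables(without) == []
```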
| ] | ||
| def test_execute_report_with_days_value(self, snapshot): | ||
| """Test with a specific days value (e.g., 365 days).""" |
(q, nb): Can/should we do any actual date comparisons of the underlying results returned by the report lib, to ensure they actually match the days value provided?
No, unfortunately, the report doesn't retain data about the two event dates for the potential duplicate or how close in days they are to each other. That's why I did a comparison between 3650 and 30 days to check that 3650 has more potential duplicates than 30.
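That 3650-vs-30 comparison could be sketched roughly like this, with a stub standing in for the real library call (the function name and return shape are hypothetical, not the actual report-execution API):

```python
def run_report(days_value):
    # Stub for the real report execution; in the actual test this would
    # call the library's entry point with the given days_value and
    # return the potential-duplicate rows it found.
    candidates_per_month = 1
    return list(range(days_value // 30 * candidates_per_month))

wide = run_report(days_value=3650)
narrow = run_report(days_value=30)
# A wider lookback window should surface at least as many potential duplicates.
assert len(wide) >= len(narrow)
```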
| - column_name: ILLNESS_ONSET_DATE | ||
| type: string | ||
| data: None | ||
| null_percentage: 1.0 |
(q, nb): why include the column, but have everything be null?
I was just trying to mimic everything I saw in the original table as much as possible, since I wasn't sure what downstream effects empty columns might have on this library or on others using this table.
| data: f"{fake.date_between(start_date='-1y', end_date='today').strftime('%Y-%m-%d')} 12:00:00.000" | ||
| null_percentage: 0.05 | ||
| - column_name: EVENT_DATE_TYPE |
should this be linked off of the non-nullness of EVENT_DATE? (similar to how in phdc the state is dependent on state cd)
oooh great point! I just modified accordingly
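The dependency being suggested can be sketched in plain Python (column names from the diff, generation logic hypothetical): generate EVENT_DATE first, then derive EVENT_DATE_TYPE so it is null exactly when EVENT_DATE is null.

```python
import random

# Hypothetical sketch of dependent null generation: EVENT_DATE_TYPE
# should be null if and only if EVENT_DATE is null.
random.seed(0)
rows = []
for _ in range(100):
    # ~5% null rate on EVENT_DATE, mirroring the null_percentage above.
    event_date = None if random.random() < 0.05 else "2024-01-01 12:00:00.000"
    event_date_type = None if event_date is None else "ONSET"
    rows.append((event_date, event_date_type))

# The dependency holds for every generated row.
assert all((d is None) == (t is None) for d, t in rows)
```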
| # KLUDGE: NULL writing is not always correct | ||
| result = result.replace(' nan,', ' NULL,') | ||
| result = result.replace('nan', ' NULL') |
(q, nb): Why did we need to add this one? There's a risk that a valid part of a string containing `nan` will now be turned into `NULL`. Is it for the opening-paren case of `(nan,`?
I remember adding some of these to get `get_faker_sql` to not break for my data, but now that I'm trying it again without that line, it still works, so I have removed it!
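If a bare-substring replacement like this is ever needed again, a word-boundary regex is a safer sketch (hypothetical, not part of this PR) since it leaves `nan` alone inside legitimate string values:

```python
import re

# '\bnan\b' only matches 'nan' as a whole token, so 'banana' is untouched
# while the bare nan values still become NULL.
result = "(nan, 'banana', nan)"
safe = re.sub(r'\bnan\b', 'NULL', result)
assert safe == "(NULL, 'banana', NULL)"
```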
Co-authored-by: Mary McGrath <m.c.mcgrath13@gmail.com>
…Cgov/NEDSS-Modernization into kc/convert_POTNTL_DUP_INV_SUM pull upstream
…tntl_dup_inv_sum/test_execute_report_with_days_value/snapshot.yml
…s_potntl_dup_inv_sum/test_execute_report_with_days_value/snapshot.yml
❌ The last analysis has failed.
Description
Please include a summary of the changes and any key information a reviewer may need.
Tickets
Checklist for adding a library:
- `apps/modernization-api/src/main/resources/db/changelog/report/execution/03_ODSE_Data_Report_Library_Init.sql`
- `apps/modernization-api/src/main/resources/db/report/execution/libraries` named `<your library name>.sql`
- `apps/modernization-api/src/main/resources/db/report-execution-changelog.yml`. It should be added to the latest `changeSet` since the last release; this could require a new `changeSet` if there isn't one since the last release
- `apps/report-execution/src/libraries` named in lowercase, but generally following the naming convention of SAS (needs to be human recognizable as the same library)
  - the `execute` function is the required method and its signature will (someday) be checked for validity
- `apps/report-execution/tests/libraries/<your_library_here>` `py` file following the conventions established for other libraries
  - `subset_sql` can be assumed to be well tested by the modernization-api, so focus on any logic and additional joins/queries/analysis that is added in the library