Skip to content

Hente folkemengdefiler#59

Open
coh-ssb wants to merge 23 commits into
mainfrom
hente_folkemengdefiler
Open

Hente folkemengdefiler#59
coh-ssb wants to merge 23 commits into
mainfrom
hente_folkemengdefiler

Conversation

@coh-ssb

@coh-ssb coh-ssb commented May 4, 2026

Copy link
Copy Markdown
Collaborator

Sjekke hvordan det ser ut så langt.

@coh-ssb coh-ssb requested a review from aosthus June 26, 2026 11:00

@aosthus aosthus left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all a job well done.

I have some comments and suggestions. Most of them are only suggestions that are already commented on here.

In addition to the individual comments, I also have some general comments about some of the files. For starters, I do now have enough knowledge about unit tests that I can really code review the tests/*.py-files in a meaningful way. You might want to hear from others on this regard.

Also, it is very good to have example-notebooks as you are providing here. I would however highly recommend to rewrite the comments in these notebooks to comply with universal design guidelines. This means not using headings unless they are actual headings, and using normal text when it is actual normal text. I would be happy to help you with this if this is confusing.

As I have some comments, I am waiting a little bit with the approving of this pull request. I hope that some of the suggestions I have provided can be answered one way or the other. Feel free to ask me for some time to look at it together. After a response on my comments, I will reevaluate my review, and if I am satisfied with the answers (one way or the other), I would approve the PR. As it stands now, there are still, as I have suggested, some questions I feel need answering before approving.

Comment on lines 26 to 38
def _round_half_up(values: pd.Series, decimals: int = 0) -> pd.Series:
"""Kommersiell avrunding (0.5 -> 1, -0.5 -> -1), også for desimaler.

Fungerer på numpy-arrays eller pandas-Serier av tall.
"""
factor = 10**decimals
# Cast to float to avoid issues with Int64 etc.
v = pd.Series(pd.to_numeric(values, errors="coerce").astype(float))
arr = v.to_numpy(dtype=float)
# round half away from zero
rounded = np.sign(arr) * np.floor(np.abs(arr) * factor + 0.5) / factor

return pd.Series(rounded, index=v.index, name=v.name)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function might not work as intended. For the most part, it seems to round commercially, but there are cases where this is not rounding commercially. It does round correctly many of the times, but not always. See example below, for a fictional dataset.

This will correctly round:

df = pd.DataFrame({'col1': ['1', '2', '3', '4', '5'], 'var1': [1.5, 2.5, 3.5, -1.5, -2.5], 'var2': [0.125, 0.575, 1.005, 1.275, 2.445]})

_round_half_up(df['var1'])
0    2.0
1    3.0
2    4.0
3   -2.0
4   -3.0
Name: var1, dtype: float64

This will not correctly round:

df = pd.DataFrame({'col1': ['1', '2', '3', '4', '5'], 'var1': [1.5, 2.5, 3.5, -1.5, -2.5], 'var2': [0.145, 0.575, 1.005, 1.275, 2.445]})

_round_half_up(df['var2'], 2)
0    0.13 <-- Correcttly rounds, but... 
1    0.57 <-- Not correctly rounded, as well as the ones below
2    1.00
3    1.27
4    2.44
Name: var2, dtype: float64

It might be unusual that some of the .5-points round one way, and the others round another way. Consider if this function is needed, or if the used can round data later on in the process using dapla-statbank-client towards the statbank, or if the used is expected to know how to round data themselves. If it is needed, consider the suggested code below as a suggestion to how to rewrite the _round_half_up()-function:

Suggested change
from decimal import Decimal, ROUND_HALF_UP
def _round_half_up(values: pd.Series, decimals: int = 0) -> pd.Series:
q = Decimal(1).scaleb(-decimals)
def r(x):
if pd.isna(x):
return np.nan
return float(Decimal(str(x)).quantize(q, rounding=ROUND_HALF_UP))
return pd.to_numeric(values, errors="coerce").map(r)

Comment thread src/ssb_kostra_python/hente_data_folkemengde.py
Comment thread src/ssb_kostra_python/hente_data_folkemengde.py
Comment thread src/ssb_kostra_python/summere_til_aldersgrupperinger.py
Comment thread src/ssb_kostra_python/summere_til_aldersgrupperinger.py

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that there is a hente_data_folkemengde()-function, is it possible to rewrite this example notebook so that it uses that function?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Må legge inn endringer.

Comment thread examples/hente_folkemengdefil_kommuner_eksempel.py
Comment thread examples/hente_folkemengdefil_fylkeskommuner_eksempel.py
Comment thread examples/hente_folkemengdefil_bydeler_eksempel.py
Comment thread examples/aggregere_kjonn_eksempel.py

@aosthus aosthus left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all a job well done.

I have some comments and suggestions. Most of them are only suggestions that are already commented on here.

In addition to the individual comments, I also have some general comments about some of the files. For starters, I do now have enough knowledge about unit tests that I can really code review the tests/*.py-files in a meaningful way. You might want to hear from others on this regard.

Also, it is very good to have example-notebooks as you are providing here. I would however highly recommend to rewrite the comments in these notebooks to comply with universal design guidelines. This means not using headings unless they are actual headings, and using normal text when it is actual normal text. I would be happy to help you with this if this is confusing.

As I have some comments, I am waiting a little bit with the approving of this pull request. I hope that some of the suggestions I have provided can be answered one way or the other. Feel free to ask me for some time to look at it together. After a response on my comments, I will reevaluate my review, and if I am satisfied with the answers (one way or the other), I would approve the PR. As it stands now, there are still, as I have suggested, some questions I feel need answering before approving.

@sonarqubecloud

sonarqubecloud Bot commented Jul 2, 2026

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants