Add R-parity Python functions and datasets #420
Open
JoaoCarabetta wants to merge 4 commits into
Introduce cached parquet downloads, filtering, multi-format output (sf/arrow/duckdb relation), and shared read_geobr_v2/hybrid helpers to align Python with the R v2.0.0 data path. Co-authored-by: Cursor <cursoragent@cursor.com>
Port cep_to_state, remove_islands, and read_* wrappers for capitals, favelas, polling places, and quilombola lands. Co-authored-by: Cursor <cursoragent@cursor.com>
Upgrade deprecated GitHub Actions, use astral-sh/setup-uv cross-platform, and skip network-dependent list_geobr test while testing filters via read_geobr_v2. Co-authored-by: Cursor <cursoragent@cursor.com>
AppVeyor is not required for Python (GitHub Actions Python-CMD-check covers all platforms). Path filters skip builds when only python-package or .github change. Co-authored-by: Cursor <cursoragent@cursor.com>
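The first commit mentions cached parquet downloads controlled by a `cache` flag. A minimal sketch of how such a cache-or-download step might behave is shown here; the helper name `cached_download`, its parameters, and the injected `fetch` callable are hypothetical and are not taken from the geobr code base.

```python
from pathlib import Path
from typing import Callable


def cached_download(
    url: str,
    cache_dir: Path,
    fetch: Callable[[str], bytes],
    cache: bool = True,
) -> Path:
    """Return a local path for `url`, downloading only on a cache miss.

    Hypothetical helper: `fetch` stands in for whatever HTTP client is used.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Name the local file after the last path segment of the URL.
    target = cache_dir / url.rsplit("/", 1)[-1]
    # Reuse the file already on disk unless caching is disabled.
    if cache and target.exists():
        return target
    target.write_bytes(fetch(url))
    return target
```

Passing the download function in as an argument keeps the sketch testable without network access, which fits the PR's choice to skip network-dependent tests.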
camilagb (Collaborator) reviewed on May 27, 2026 and left a comment:
One function that is missing is grid_state_correspondence_table, which returns the data in https://github.com/ipeaGIT/geobr/blob/v1.9.1/python-package/geobr/data/grid_state_correspondence_table.csv. Maybe include it in a new file?
from geobr import __path__ as geobr_directory
import pandas as pd

# Path to the CSV bundled inside the installed geobr package.
grid_file_path = geobr_directory[0] + "/data/grid_state_correspondence_table.csv"
# Read every column as text so the codes are not converted to numbers.
dtypes = {"name_state": str, "abbrev_state": str, "code_grid": str}


def get_grid_state_table() -> pd.DataFrame:
    grid_state_correspondence_table = pd.read_csv(
        grid_file_path, encoding="latin-1", dtype=dtypes
    )
    return grid_state_correspondence_table
camilagb reviewed on May 28, 2026
Comment on lines +41 to +45
    output: str = "sf",
    show_progress: bool = True,
    cache: bool = True,
    verbose: bool = False,
    year: int = 2010,
Suggested change
-    output: str = "sf",
-    show_progress: bool = True,
-    cache: bool = True,
-    verbose: bool = False,
-    year: int = 2010,
+    year: int,
+    output: str = "gpd",
+    show_progress: bool = True,
+    cache: bool = True,
+    verbose: bool = False
    Parameters
    ----------
    output : str
        ``"sf"`` for GeoDataFrame (default), ``"duckdb"``, or ``"arrow"``.
Suggested change
-        ``"sf"`` for GeoDataFrame (default), ``"duckdb"``, or ``"arrow"``.
+        ``"gpd"`` for GeoDataFrame (default), ``"duckdb"``, or ``"arrow"``.
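The suggestion renames the GeoDataFrame option from `"sf"` to `"gpd"`. One way to keep the accepted values in a single place is a small validator, sketched below; the names `VALID_OUTPUTS` and `validate_output` are hypothetical and are not part of geobr.

```python
# Assumed set of accepted values, taken from the suggested docstring.
VALID_OUTPUTS = ("gpd", "duckdb", "arrow")


def validate_output(output: str) -> str:
    """Return `output` unchanged if it is a supported format, else raise."""
    if output not in VALID_OUTPUTS:
        raise ValueError(
            f"output must be one of {VALID_OUTPUTS}, got {output!r}"
        )
    return output
```

With a check like this, a caller still passing the old `"sf"` value would get an explicit error listing the supported options instead of an unexpected return type.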
Comment on lines +59 to +85
    if output == "sf":
        gdf = read_municipal_seat(year=year, verbose=verbose)
        gdf = gdf[gdf["code_muni"].isin(codes)]
        return gdf.sort_values("code_muni").reset_index(drop=True)

    from geobr.utils import read_geobr_v2

    results = []
    for code in codes:
        part = read_geobr_v2(
            geography="municipalseat",
            year=year,
            code=code,
            simplified=True,
            output=output,
            show_progress=show_progress,
            cache=cache,
            verbose=False,
        )
        results.append(part)
    if output == "sf":
        import geopandas as gpd

        return gpd.GeoDataFrame(
            pd.concat(results, ignore_index=True)
        ).sort_values("code_muni").reset_index(drop=True)
    return results
This change also requires updating the read_municipal_seat function and using the filter_by_code proposed in #418.
Suggested change
-    if output == "sf":
-        gdf = read_municipal_seat(year=year, verbose=verbose)
-        gdf = gdf[gdf["code_muni"].isin(codes)]
-        return gdf.sort_values("code_muni").reset_index(drop=True)
-    from geobr.utils import read_geobr_v2
-    results = []
-    for code in codes:
-        part = read_geobr_v2(
-            geography="municipalseat",
-            year=year,
-            code=code,
-            simplified=True,
-            output=output,
-            show_progress=show_progress,
-            cache=cache,
-            verbose=False,
-        )
-        results.append(part)
-    if output == "sf":
-        import geopandas as gpd
-        return gpd.GeoDataFrame(
-            pd.concat(results, ignore_index=True)
-        ).sort_values("code_muni").reset_index(drop=True)
-    return results
+    return read_municipal_seat(
+        year=year,
+        code_muni=codes,
+        output=output,
+        show_progress=show_progress,
+        cache=cache,
+        verbose=verbose
+    )
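The suggested change replaces the per-code loop with a single call that receives the whole list of codes. The single-pass filtering this implies can be sketched on plain records as follows; `filter_by_codes` and the sample codes are hypothetical illustrations and do not reproduce the filter_by_code function from #418, whose signature is not shown in this conversation.

```python
from typing import Any, Dict, Iterable, List, Sequence


def filter_by_codes(
    records: Sequence[Dict[str, Any]],
    codes: Iterable[Any],
    key: str = "code_muni",
) -> List[Dict[str, Any]]:
    """Keep records whose `key` value is in `codes`, sorted by `key`."""
    wanted = set(codes)  # set membership makes each lookup O(1)
    kept = [row for row in records if row[key] in wanted]
    return sorted(kept, key=lambda row: row[key])


# Made-up sample rows, for illustration only.
rows = [
    {"code_muni": 3, "name": "C"},
    {"code_muni": 1, "name": "A"},
    {"code_muni": 2, "name": "B"},
]
selected = filter_by_codes(rows, [3, 1])
```

Filtering once over the full table avoids one read per code and returns a single sorted result regardless of the requested output format.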
def read_polling_places(
    year: int,
    code_muni: str = "all",
    output: str = "sf",
Suggested change
-    output: str = "sf",
+    output: str = "gpd",
    year: int,
    code_muni: str = "all",
    simplified: bool = True,
    output: str = "sf",
Suggested change
-    output: str = "sf",
+    output: str = "gpd",
    date: int,
    code_state: str = "all",
    simplified: bool = True,
    output: str = "sf",
Suggested change
-    output: str = "sf",
+    output: str = "gpd",
Summary
cep_to_state, remove_islands, read_capitals, read_favela, read_polling_places, read_quilombola_land
br_offcoast.parquet for island clipping and export new symbols from __init__.py
Test plan
pytest -m "not network")
Depends on #418
Made with Cursor