Path optimization docs by lucas-diedrich · Pull Request #64 · MannLabs/py-lmd

lucas-diedrich · 2026-01-17T12:50:17Z

Add docstrings to path optimization methods.

…ted imports, fix docstring

lucas-diedrich · 2026-01-17T12:51:15Z

Moved optional umap import to module level so that the recursive function call does not involve the overhead of repetitive imports.

Copilot

Pull request overview

This pull request adds comprehensive docstrings and type hints to path optimization functions in the lmd.path module, improving code documentation and type safety. The PR also includes code cleanup by removing debug print statements and reorganizing the UMAP dependency check.

Changes:

Added type hints to all path optimization functions (calc_len, tsp_hilbert_solve, _get_closest, _tsp_greedy_solve, _get_nodes, tsp_greedy_solve)
Added detailed docstrings with Args and Returns sections for all functions
Moved UMAP import check to the beginning of _tsp_greedy_solve for earlier failure detection
Removed debug print statements and unnecessary code (e.g., p = p no-op)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-17T12:54:30Z

+    """Calculate the length of a path based on a list of coordinates

    Args:
        data (np.array): Array of shape `(N, 2)` containing a list of coordinates


The type annotation in the Args section uses np.array, but the function signature uses np.ndarray. These should be consistent. Use np.ndarray in the docstring to match the type hint.

Suggested change

data (np.array): Array of shape `(N, 2)` containing a list of coordinates

data (np.ndarray): Array of shape `(N, 2)` containing a list of coordinates

Copilot · 2026-01-17T12:54:30Z

        data (np.array): Array of shape `(N, 2)` containing a list of coordinates

+    Returns:
+        Length of path


The Returns section is incomplete. It should specify the return type more formally, such as "float: The total length of the path".

Suggested change

Length of path

float: The total length of the path.

Copilot · 2026-01-17T12:54:31Z

+        p: Iterations to use in constructing the Hilbert curve.
+
+    Returns:
+        Ordered indices of data of the shape `(N,)` according to their position along the Hilbert curve.


The Returns section format is inconsistent with the other functions. The return type and description should be on the same line or follow a consistent pattern. Consider formatting it as "np.ndarray: Ordered indices of data..." to match documentation conventions.

Suggested change

Ordered indices of data of the shape `(N,)` according to their position along the Hilbert curve.

np.ndarray: Ordered indices of data of the shape `(N,)` according to their position along the Hilbert curve.

Copilot · 2026-01-17T12:54:31Z

+    Returns:
+        `return_sorted=True`: Array of sorted nodes
+        `return_sorted=False`: Ordered indices of nodes


The Returns section format is inconsistent. It describes two different return cases using the parameter name (return_sorted) rather than clearly indicating the return type. Consider reformatting this as:
"Returns:
np.ndarray: When return_sorted is False, returns ordered indices of nodes.
When return_sorted is True, returns array of sorted nodes."

Copilot · 2026-01-17T12:54:31Z

+        T: First element in `choices` that is not in `used`
+        None: If all nearest neighbors have been visited, or `-1` in choices.


The Returns section of the docstring lists two separate items (T and None) on different lines, which doesn't align with typical docstring conventions. Consider reformatting as:
"Returns:
T | None: First element in choices that is not in used, or None if all nearest neighbors have been visited or if -1 is encountered in choices."

Suggested change

T: First element in `choices` that is not in `used`

None: If all nearest neighbors have been visited, or `-1` in choices.

T | None: First element in `choices` that is not in `used`, or None if all nearest neighbors have been visited or if `-1` is encountered in choices.

Copilot · 2026-01-17T12:54:31Z

+        k: K-Nearest neighbors selection
+
+    Returns:
+        Ordered indices of data of the shape `(N,)`.


The Returns section states it returns "Ordered indices of data of the shape (N,)", but examining the function implementation shows it actually returns the sorted data array (coordinates), not indices. At line 175, it concatenates data[nodes] and the recursive result, which are coordinate arrays. This docstring should be corrected to indicate it returns sorted coordinates, not indices.

Suggested change

Ordered indices of data of the shape `(N,)`.

np.ndarray: Array of shape `(N, 2)` containing the input coordinates ordered

along the greedy TSP path.

Copilot · 2026-01-17T12:54:31Z

-# TODO: Add docstrings
 # return the first element not present in a list
-def _get_closest(used, choices, world_size):
+def _get_closest(used: list[T], choices: list[T], world_size: Any) -> Union[T, None]:


The type hint for the choices parameter indicates list[T], but the function is called with a numpy array (see line 160 where knn_index[current_node] is passed). While the function works due to duck typing, the type hint should be more accurate. Consider using list[T] | np.ndarray or a more general sequence type to reflect actual usage.

Suggested change

def _get_closest(used: list[T], choices: list[T], world_size: Any) -> Union[T, None]:

def _get_closest(used: list[T], choices: Union[list[T], np.ndarray], world_size: Any) -> Union[T, None]:

Copilot · 2026-01-17T12:54:32Z

-def tsp_greedy_solve(node_list, k=100, return_sorted=False):
+def tsp_greedy_solve(
+    node_list: np.ndarray, k: int = 100, return_sorted: bool = False
+) -> Union[np.ndarray, tuple[np.ndarray, np.ndarray]]:


The return type annotation indicates tuple[np.ndarray, np.ndarray] as one possibility, but the function never returns a tuple. When return_sorted=False, it returns the result of _get_nodes which is a list[int]. The return type should be np.ndarray | list[int] instead.

Suggested change

) -> Union[np.ndarray, tuple[np.ndarray, np.ndarray]]:

) -> Union[np.ndarray, list[int]]:

Copilot · 2026-01-17T12:54:32Z

+    typically yields a shorter traversal path than a random ordering, but does
+    not guarantee an optimal Traveling Salesperson solution.
+
+


There's an extra blank line in the docstring. Remove one of the blank lines to maintain consistent formatting with other docstrings in the file.

Suggested change

Copilot · 2026-01-17T12:54:32Z

 # TODO: Add type hints
 # TODO: Add docstrings


The TODOs indicate that type hints have been added, but the TODOs are still present in the code. These TODO comments should be removed since the tasks have been completed.

Suggested change

# TODO: Add type hints

# TODO: Add docstrings

sophiamaedler · 2026-01-19T10:12:47Z

+    """Calculate the length of a path based on a list of coordinates

    Args:
        data (np.array): Array of shape `(N, 2)` containing a list of coordinates


Lukas Heumos gave me the best practice advice that if type hints are used in the function declaration they should not also be included in the docstrings as this is redundant information that makes it harder to read and also to maintain up to date.

sophiamaedler · 2026-01-19T10:15:59Z

+def tsp_greedy_solve(
+    node_list: np.ndarray, k: int = 100, return_sorted: bool = False
+) -> Union[np.ndarray, tuple[np.ndarray, np.ndarray]]:
    """Find an approximation of the closest path through a list of coordinates


Suggested change

"""Find an approximation of the closest path through a list of coordinates

"""Find an approximation of the shortest path through a list of coordinates

I think shortest is more accurate than closest. I'm not sure why that was used in the docs in the first place.

lucas-diedrich added 10 commits January 17, 2026 13:14

[Doc] Add docstring to tsp_hilbert_solve

2a5f459

[Doc] Add docstring

1f29e30

[Doc] Add docstring to _get_closest

e5fef03

Remove todos

1f0152a

[Doc] Add docstrings

444f131

Remove debugging print statement

afc093a

Formatting

030d826

Move umap import to module top level to reduce overhead through repea…

bdb5bc8

…ted imports, fix docstring

[Doc] Add docs

03b8fe3

Update module-level docstring

cde720e

lucas-diedrich self-assigned this Jan 17, 2026

lucas-diedrich added the documentation Improvements or additions to documentation label Jan 17, 2026

lucas-diedrich requested a review from Copilot January 17, 2026 12:50

Copilot started reviewing on behalf of lucas-diedrich January 17, 2026 12:50 View session

[Fix] Adapt type hints to python 3.9

8de37ad

Copilot AI reviewed Jan 17, 2026

View reviewed changes

sophiamaedler reviewed Jan 19, 2026

View reviewed changes

	data (np.array): Array of shape `(N, 2)` containing a list of coordinates
	data (np.ndarray): Array of shape `(N, 2)` containing a list of coordinates

	Ordered indices of data of the shape `(N,)` according to their position along the Hilbert curve.
	np.ndarray: Ordered indices of data of the shape `(N,)` according to their position along the Hilbert curve.

		T: First element in `choices` that is not in `used`
		None: If all nearest neighbors have been visited, or `-1` in choices.

	T: First element in `choices` that is not in `used`
	None: If all nearest neighbors have been visited, or `-1` in choices.
	T \| None: First element in `choices` that is not in `used`, or None if all nearest neighbors have been visited or if `-1` is encountered in choices.

	Ordered indices of data of the shape `(N,)`.
	np.ndarray: Array of shape `(N, 2)` containing the input coordinates ordered
	along the greedy TSP path.

	def _get_closest(used: list[T], choices: list[T], world_size: Any) -> Union[T, None]:
	def _get_closest(used: list[T], choices: Union[list[T], np.ndarray], world_size: Any) -> Union[T, None]:

	) -> Union[np.ndarray, tuple[np.ndarray, np.ndarray]]:
	) -> Union[np.ndarray, list[int]]:

		typically yields a shorter traversal path than a random ordering, but does
		not guarantee an optimal Traveling Salesperson solution.

	"""Find an approximation of the closest path through a list of coordinates
	"""Find an approximation of the shortest path through a list of coordinates

Conversation

lucas-diedrich commented Jan 17, 2026

Uh oh!

lucas-diedrich commented Jan 17, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

sophiamaedler Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

sophiamaedler Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

sophiamaedler Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants