Description
Add a validate_dataset_name() helper that enforces the dataset naming rules from the manual (lowercase letters, digits, and hyphens only; no spaces, underscores, or other special characters). Apply it in the upload endpoint. Extend the search endpoint to accept a comma-separated datasets parameter for multi-dataset scoping.
Technical Notes
Valid examples: "fast-food", "kitchen-equipment", "industry-reports"
Invalid examples: "Fast Food" (spaces + uppercase), "doc_2024" (underscores), "reports!" (special chars)
A simple regex is sufficient:
import re

def validate_dataset_name(name: str) -> str:
    """Return the name unchanged if it is valid; raise ValueError otherwise."""
    if not name:
        raise ValueError("Dataset name cannot be empty")
    # fullmatch, unlike match with '$', also rejects a trailing newline
    if not re.fullmatch(r'[a-z0-9]+(-[a-z0-9]+)*', name):
        raise ValueError(
            f"Invalid dataset name '{name}'. "
            "Use lowercase letters, numbers, and hyphens only (e.g. 'fast-food')."
        )
    return name
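To make the edge cases concrete, here is a minimal table-style check of the rules. The helper is restated with re.fullmatch so the snippet runs on its own, and is_valid() is a hypothetical wrapper used only for illustration. The names "doc-2024", "-fast", "food-", "fast--food" and the trailing-newline case are illustrative additions, not examples from the manual.

```python
import re

_DATASET_NAME_RE = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def validate_dataset_name(name: str) -> str:
    """Return the name unchanged if it is valid; raise ValueError otherwise."""
    if not name:
        raise ValueError("Dataset name cannot be empty")
    # fullmatch requires the entire string to match the pattern
    if not _DATASET_NAME_RE.fullmatch(name):
        raise ValueError(f"Invalid dataset name '{name}'.")
    return name


def is_valid(name: str) -> bool:
    """Hypothetical wrapper so the checks below read as a table."""
    try:
        validate_dataset_name(name)
    except ValueError:
        return False
    return True


VALID = ["fast-food", "kitchen-equipment", "industry-reports", "doc-2024"]
INVALID = [
    "",             # empty string
    "Fast Food",    # spaces + uppercase
    "doc_2024",     # underscore
    "reports!",     # special character
    "-fast",        # leading hyphen
    "food-",        # trailing hyphen
    "fast--food",   # consecutive hyphens
    "fast-food\n",  # trailing newline
]

assert all(is_valid(n) for n in VALID)
assert not any(is_valid(n) for n in INVALID)
```

The trailing-newline case is the one that separates the two regex styles: with re.match and a '$' anchor, "fast-food\n" would be accepted, because '$' also matches just before a final newline.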
Acceptance Criteria
- validate_dataset_name(name: str) -> str helper exists (suggest app/utils/validation.py)
- Raises ValueError for names containing uppercase, spaces, underscores, or special characters other than hyphens
- Raises ValueError for empty string input
- Applied in POST /documents/upload: invalid dataset names return HTTP 422 with a clear message
- GET /documents/search accepts datasets: Optional[str] as a comma-separated string (e.g. ?datasets=fast-food,equipment) and converts it to list[str] before passing to search_knowledge_graph()
- Unit tests for validate_dataset_name() covering: valid names, uppercase, spaces, underscores, empty string, leading/trailing hyphens
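The endpoint-side behaviour in the acceptance criteria could be sketched without committing to a web framework, roughly as follows. check_upload_dataset() and parse_datasets_param() are hypothetical names, and the 200 success code, the whitespace stripping, and the skipping of empty entries are assumptions rather than requirements from this issue.

```python
import re
from typing import Optional


def validate_dataset_name(name: str) -> str:
    """Compact restatement of the helper so this snippet runs on its own."""
    if not name:
        raise ValueError("Dataset name cannot be empty")
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        raise ValueError(f"Invalid dataset name '{name}'.")
    return name


def check_upload_dataset(dataset: str) -> tuple[int, str]:
    """Hypothetical helper: map a validation failure to HTTP 422."""
    try:
        # 200 is an assumed success code; the issue only specifies 422
        return 200, validate_dataset_name(dataset)
    except ValueError as exc:
        return 422, str(exc)


def parse_datasets_param(datasets: Optional[str]) -> Optional[list[str]]:
    """Convert 'fast-food,equipment' into ['fast-food', 'equipment']."""
    if datasets is None:
        return None  # assumed to mean "no dataset scoping"
    # Assumption: surrounding whitespace and empty entries are ignored
    return [part.strip() for part in datasets.split(",") if part.strip()]


print(check_upload_dataset("fast-food"))            # (200, 'fast-food')
print(check_upload_dataset("Fast Food")[0])         # 422
print(parse_datasets_param("fast-food,equipment"))  # ['fast-food', 'equipment']
```

Whether the search endpoint should also run each parsed entry through validate_dataset_name() is not stated here; if it should, the list comprehension is the natural place to do it.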