Extract "Inspection Resulted in Closure" Feature from Restaurant Inspection Data #14

@eclee25

The purpose of this issue is to continue the data cleaning that was started in the dc_doh_hackathon repository. The routine should then be modified so that it can be rerun on the restaurant inspection dataset each time the dataset is updated.

From the issue_19 folder:

Take a look at "Issue 19 Walkthrough.ipynb" for the data-related conclusions reached at the hackathon. Continue with this script, or start a new one, to develop a routine that identifies inspections that resulted in closures (see the original full GitHub issue text below). Make sure the final script can be run from the command line with three arguments:

  1. The input restaurant inspection data file (...inspection_summary_data.csv)
  2. The file that maps inspection_id to census block (The output of issue Geocode Restaurant Inspections and Map to Census Blocks #6)
  3. The output filename
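As a rough illustration of the three-argument interface described above, a command-line entry point could be sketched with Python's standard `argparse` module. The function name `parse_args` and the argument names `inspection_csv`, `census_block_csv`, and `output_csv` are assumptions made for this sketch; the issue only fixes the order and meaning of the three arguments.

```python
import argparse


def parse_args(argv=None):
    """Parse the three positional arguments requested in this issue.

    All argument names below are hypothetical; only their order and
    meaning come from the issue text.
    """
    parser = argparse.ArgumentParser(
        description="Extract the 'inspection resulted in closure' feature."
    )
    # 1. The restaurant inspection summary data file.
    parser.add_argument("inspection_csv",
                        help="input restaurant inspection summary CSV")
    # 2. The file mapping inspection_id to census block (output of issue #6).
    parser.add_argument("census_block_csv",
                        help="CSV mapping inspection_id to census block")
    # 3. Where to write the resulting feature file.
    parser.add_argument("output_csv", help="output CSV filename")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)
```

A script built this way would call `parse_args()` at startup and read the three paths from the returned namespace. Passing an explicit list, such as `parse_args(["in.csv", "blocks.csv", "out.csv"])`, is convenient for testing.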

Please also provide a README.md that describes the script and how to run it.

You can model the solution after the files here or here

Place all of your files in the scripts/feature_engineering/extract_restaurant_inspection_features/ folder

For reference, here is the original issue description from the dc_doh_hackathon:

issue_19

Start with the DC DOH Food Service Establishment Inspection report data in the /Data Sets/Restaurant Inspections/ folder in Dropbox.

Develop a script to extract the number of food establishment inspections that resulted in (temporary) closure of the establishment. More details on violations can be found here

Input:
CSV files with inspection summary and violation details

Output:
A CSV file with

  • One row for each combination of establishment type, risk category, year, week, and census block
  • The following columns:

      • feature_id: The ID for the feature, in this case "restaurant_inspection_closures"
      • feature_type: The establishment_type from the restaurant data set
      • feature_subtype: The risk_category, from 1 to 5
      • year: The ISO-8601 year of the feature value
      • week: The ISO-8601 week number of the feature value
      • census_block_2010: The 2010 Census Block of the feature value
      • value: The value of the feature, i.e. the number of inspections that resulted in closure at establishments with the given type and risk category in the specified year, week, and census block
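The output layout above could be produced with a grouping step along the following lines. This is a minimal sketch using only the standard library, not the routine from the walkthrough notebook. The input keys `inspection_id`, `inspection_date`, and `resulted_in_closure` are assumptions (the real column names, and the rule for deciding that an inspection resulted in closure, are not given in this issue), and skipping inspections with no census block mapping is a choice made for this sketch.

```python
from collections import Counter
from datetime import date

FEATURE_ID = "restaurant_inspection_closures"


def closure_counts(inspections, block_by_inspection_id):
    """Count closure inspections per type, risk category, ISO week, and block.

    inspections: iterable of dicts with hypothetical keys inspection_id,
        inspection_date (YYYY-MM-DD), establishment_type, risk_category,
        and resulted_in_closure (bool, assumed to be computed upstream).
    block_by_inspection_id: dict mapping inspection_id to a 2010 census block.
    """
    counts = Counter()
    for row in inspections:
        if not row["resulted_in_closure"]:
            continue
        block = block_by_inspection_id.get(row["inspection_id"])
        if block is None:
            continue  # assumption: unmapped inspections are skipped
        # isocalendar() yields the ISO-8601 year and week, not calendar ones.
        iso_year, iso_week, _ = date.fromisoformat(
            row["inspection_date"]).isocalendar()
        key = (row["establishment_type"], row["risk_category"],
               iso_year, iso_week, block)
        counts[key] += 1
    # One output row per observed combination, in the requested column order.
    return [
        {"feature_id": FEATURE_ID, "feature_type": etype,
         "feature_subtype": risk, "year": year, "week": week,
         "census_block_2010": blk, "value": n}
        for (etype, risk, year, week, blk), n in sorted(counts.items())
    ]
```

The returned rows can be written out with `csv.DictWriter`. Note that the ISO-8601 year can differ from the calendar year near year boundaries: 2017-01-01 falls in ISO week 52 of 2016, which is why the year and week are taken together from `isocalendar()`.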
