Developed and tested on Windows 10 inside a venv using Python 3.7.7 and pip 19.2.3.

- Set up your environment by installing the requirements using pip.

      pip install -r requirements.txt

- Copy the config.example file to config.py.

      cp config.example config.py

- Run fetch_raw.py to retrieve the data for the PubMed articles defined in the qrel files from the 2017 CLEF eHealth Lab.

      python fetch_raw.py

- Run insert_docs.py to insert the fetched article data from PubMed into a local database.

      python insert_docs.py

- Run fetch_validity.py to check the database against the original qrel files.

      python fetch_validity.py

- Run clean_docs.py to preprocess the articles and store them as a feature matrix.

      python clean_docs.py

- Run run_experiments.py to determine the performance for the baseline (using all data) and the selected datasets.

      python run_experiments.py

- Run result_analysis.py to create plots and significance tests.

      python result_analysis.py
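
The script steps above must run in the listed order, since each one consumes the output of the previous one. As a convenience, they could be chained from a single Python file. The following is a minimal sketch, not part of the repository: the names `PIPELINE_STEPS`, `build_commands`, and `run_pipeline` are hypothetical, and it assumes the environment is already set up, config.py exists, and the scripts are in the current working directory.

```python
import subprocess
import sys

# Script names and their order are taken from the steps above.
PIPELINE_STEPS = [
    "fetch_raw.py",
    "insert_docs.py",
    "fetch_validity.py",
    "clean_docs.py",
    "run_experiments.py",
    "result_analysis.py",
]


def build_commands(python=sys.executable):
    """Return one argv list per pipeline step, in execution order."""
    return [[python, script] for script in PIPELINE_STEPS]


def run_pipeline(runner=subprocess.run, python=sys.executable):
    """Run each step in order and stop at the first one that fails.

    `runner` is injectable (hypothetical design choice) so the sequencing
    can be checked without executing the real scripts.
    Returns the list of scripts that completed.
    """
    completed = []
    for command in build_commands(python):
        # With check=True, subprocess.run raises CalledProcessError on a
        # non-zero exit code, which aborts the remaining steps.
        runner(command, check=True)
        completed.append(command[1])
    return completed
```

Calling `run_pipeline()` from the repository root would execute the six scripts with the interpreter of the active venv (`sys.executable`), so a failure in an early step such as fetch_raw.py prevents later steps from running on incomplete data.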