This repository contains the code for the large language model (LLM) assertion pipeline. Detailed instructions are coming soon!
- Run `run_umls_synonym_ner.py` and `run_dataset_ner.py` to build NER datasets. We recommend targeted NER prompts rather than broad NER prompts for the NER dataset pull.
- (Optional, highly recommended) Run `run_ner_cosine_similarity.py` followed by `run_llm_filter_cosine_sim_ner_output.py` to filter the NER outputs and remove low-yield named entities. It is also helpful to review the filtered NER outputs and remove any that are not related to your target entity.
- Run `run_extraction.py` to build the target matcher and extract high-yield text from clinical notes.
- Run `run_llm_assertion.py` to generate LLM assertions.
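
Until the detailed instructions are available, the order of the steps above can be sketched as a small driver script. This is only an illustration: the script names come from the list above, but the helper names (`build_commands`, `run_pipeline`), the assumption that each script runs without command-line arguments, and the assumption that all scripts sit in the same directory are hypothetical and may not match the actual code.

```python
import subprocess
import sys
from pathlib import Path

# (script name, required?) in the order listed above.
# ASSUMPTION: each script can be run with no command-line arguments.
PIPELINE_STEPS = [
    ("run_umls_synonym_ner.py", True),
    ("run_dataset_ner.py", True),
    ("run_ner_cosine_similarity.py", False),             # optional filtering step
    ("run_llm_filter_cosine_sim_ner_output.py", False),  # optional filtering step
    ("run_extraction.py", True),
    ("run_llm_assertion.py", True),
]


def build_commands(include_optional=True, python=sys.executable):
    """Return the ordered list of commands, skipping optional steps if asked."""
    return [
        [python, script]
        for script, required in PIPELINE_STEPS
        if required or include_optional
    ]


def run_pipeline(repo_dir=".", include_optional=True):
    """Run each step in order and stop at the first failure."""
    repo = Path(repo_dir)
    for cmd in build_commands(include_optional):
        script = repo / cmd[1]
        if not script.exists():
            raise FileNotFoundError(f"Expected pipeline script not found: {script}")
        print(f"Running {cmd[1]} ...")
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(cmd, cwd=repo, check=True)


# Example (hypothetical): run every step from the repository root.
# run_pipeline(".")
```

Passing `include_optional=False` skips the two filtering scripts. Because reviewing the filtered NER outputs by hand is recommended, it may be preferable to run the steps one at a time rather than end to end.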