Epilepsy data mapping

Goal: Determine whether patient-reported data (from the KAND and REN datasets) corresponds to the gold standard phenotype descriptions in HPO and Mondo.

Action item: Map KAND and REN data to CPONT

@dolson Parse the datasets into diseases and phenotypes: review the data shared in those datasets, identify the low-hanging fruit (entries that are clearly a disease or a phenotype), and parse those into a new file.
If the data has already been mapped to OMOP, we can look up the OBO term using the OMOP2OBO mappings from C-Path or Tiffany. @ehartley can write a Python script to pull out the OBO IDs that map to OMOP.
@dolson If the data has not already been mapped to OMOP, it will need to be manually curated by looking up HPO and Mondo IDs/labels and adding them to the spreadsheet. (We can also look into automated text mining methods for this. Any text-mined results will need to be manually reviewed.)
If a concept does not map to an existing Mondo or HPO term, please make a note so that we can request that the term be added.
We need to identify which HPO and Mondo terms are missing from CPONT and add them to CPONT.
Ultimately, we will want to compare individual patient data to the gold standard phenotypes in Monarch. Note that in the REN datasets, patients may not have reported their specific rare epilepsy subtype. We may be able to compare patient-reported phenotypes to gold standard phenotypes to identify the disease subtype.
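
A rough sketch of what the OMOP lookup script could look like, in case it helps scope the work. This is only an illustration under assumptions: the column names (omop_concept_id, obo_id, obo_label), the inline sample rows, and the function names are all hypothetical, and the IDs are placeholders rather than real HPO/Mondo terms. The actual OMOP2OBO mapping files may be laid out differently. It shows three steps from the list above: loading the mapping table, splitting OMOP concepts into mapped ones and ones that need manual curation, and listing mapped terms that are not yet in CPONT.

```python
import csv
import io

# Hypothetical stand-in for an OMOP2OBO mapping file. Column names and
# IDs are placeholders; the real files may use a different layout.
MAPPING_CSV = """omop_concept_id,obo_id,obo_label
1001,HP:9999901,placeholder phenotype
1002,MONDO:9999902,placeholder disease
"""


def load_omop_to_obo(handle):
    """Read mapping rows into {omop_concept_id: [(obo_id, obo_label), ...]}."""
    mapping = {}
    for row in csv.DictReader(handle):
        key = row["omop_concept_id"]
        mapping.setdefault(key, []).append((row["obo_id"], row["obo_label"]))
    return mapping


def map_concepts(omop_ids, mapping):
    """Split OMOP concept IDs into mapped OBO IDs and unmapped IDs.

    Unmapped IDs are the ones that would go to manual curation.
    """
    mapped = {}
    unmapped = []
    for concept_id in omop_ids:
        key = str(concept_id)
        hits = mapping.get(key)
        if hits:
            mapped[key] = [obo_id for obo_id, _label in hits]
        else:
            unmapped.append(key)
    return mapped, unmapped


def missing_from_cpont(obo_ids, cpont_ids):
    """Return the OBO IDs that are not present in the set of CPONT term IDs."""
    return sorted(set(obo_ids) - set(cpont_ids))


if __name__ == "__main__":
    mapping = load_omop_to_obo(io.StringIO(MAPPING_CSV))
    mapped, unmapped = map_concepts(["1001", "1002", "1003"], mapping)
    all_obo_ids = [obo_id for ids in mapped.values() for obo_id in ids]
    print("mapped:", mapped)
    print("needs manual curation:", unmapped)
    print("missing from CPONT:", missing_from_cpont(all_obo_ids, {"HP:9999901"}))
```

In practice the inline sample would be replaced by opening the real mapping file, and the CPONT term set would come from however CPONT is distributed. Both are left open here.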

Edited Aug 30, 2023 by Daniel Olson