ChemBioHTP/EnzyExtract2

data/external/

EnzyExtract 1 (see scripts/alarms/alarm_hallucination.py)

  • scientific notation
    • rows: 20890, PMIDs: 3802
    • The kcat/Km method: parse kcat, Km, and kcat/Km. If the reported kcat/Km does not match kcat divided by Km, something likely went wrong with the scientific notation.
      • rows: 8692, PMIDs: 1511
  • hallucination > 0.5
    • rows: 5720, PMIDs: 806
  • repetition
    • rows: 9630, PMIDs: 784
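
The kcat/Km method above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/alarms: the function names, the row layout (`pmid`, `kcat`, `km`, `kcat_km`), and the half-order-of-magnitude tolerance are all assumptions, and the values are assumed to be positive and already in consistent units.

```python
import math

def is_mismatch(kcat, km, kcat_km, tol_log10=0.5):
    """True when the reported kcat/Km is more than tol_log10 orders of
    magnitude away from kcat / Km. tol_log10 is a hypothetical tolerance."""
    return abs(math.log10(kcat / km) - math.log10(kcat_km)) > tol_log10

def summarize_alarm(rows):
    """Count flagged rows and the distinct PMIDs they come from."""
    flagged = [r for r in rows if is_mismatch(r["kcat"], r["km"], r["kcat_km"])]
    return len(flagged), len({r["pmid"] for r in flagged})

# Made-up example rows, not data from the repository.
rows = [
    {"pmid": 1, "kcat": 2.0, "km": 1e-4, "kcat_km": 2e4},   # consistent
    {"pmid": 1, "kcat": 3.0, "km": 1e-3, "kcat_km": 3e-3},  # exponent likely wrong
    {"pmid": 2, "kcat": 5.0, "km": 1e-5, "kcat_km": 5e-5},  # exponent likely wrong
]
print(summarize_alarm(rows))
```

Comparing on a log scale keeps rounding differences between reported and recomputed values from raising the alarm, while a sign error in an exponent shows up as a gap of several orders of magnitude.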

EnzyExtract 2

  • too many sigfigs (see scripts/alarms/step2_alarm_sigfig.py)
    • 111 PMIDs, of which most (via manual inspection) look like they are actually correct
  • out of distribution: use the intersection of BRENDA+EnzyExtract as "super-reliable". Consider points that are out of distribution.
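
One way a "too many sigfigs" check could work is sketched below. It is an assumption, not a description of step2_alarm_sigfig.py: the function names and the `max_sigfigs=5` cutoff are hypothetical, and the parser only handles plain decimals and `e` notation (not forms like `4 x 10^5`).

```python
import re

def count_sigfigs(text):
    """Count significant figures in a string such as '0.00450' or '1.23e-4'.

    Trailing zeros in an integer without a decimal point ('1200') are
    treated as not significant, which is one common convention.
    """
    mantissa = re.split(r"[eE]", text.strip())[0].lstrip("+-")
    if "." in mantissa:
        # Leading zeros never count; trailing zeros after the point do.
        digits = mantissa.replace(".", "").lstrip("0")
    else:
        digits = mantissa.lstrip("0").rstrip("0")
    return len(digits)

def too_many_sigfigs(text, max_sigfigs=5):
    # max_sigfigs is a hypothetical threshold, not the one used in the repo.
    return count_sigfigs(text) > max_sigfigs

print(count_sigfigs("0.00450"), too_many_sigfigs("12.345678"))
```

Working on the extracted string rather than a parsed float matters here, because converting to float discards trailing zeros and with them the precision the value was reported at.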

BRENDA

  • make the BRENDA/EnzyExtract correlation plot, coloring each point by the alarms listed above
  • BRENDA/EnzyExtract correlation plot (see scripts/alarms/alarm_correlation.py)
    • at kcat_diff > 1.1: rows: 4510, PMIDs: 1592
  • out of distribution: use the intersection of BRENDA+EnzyExtract as "super-reliable". Consider points that are out of distribution.
  1. Use grand_biblio to add DOIs to BRENDA
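
The kcat_diff cutoff above can be read as a fold difference between the BRENDA and EnzyExtract values. The sketch below works under that assumption; how alarm_correlation.py actually computes kcat_diff is not shown here, and both function names are hypothetical.

```python
def fold_difference(a, b):
    """Symmetric fold difference between two positive values (always >= 1)."""
    if a <= 0 or b <= 0:
        raise ValueError("fold difference needs positive values")
    return max(a, b) / min(a, b)

def kcat_disagrees(kcat_brenda, kcat_extracted, threshold=1.1):
    # threshold=1.1 mirrors the kcat_diff > 1.1 cutoff quoted above.
    return fold_difference(kcat_brenda, kcat_extracted) > threshold

print(kcat_disagrees(10.0, 10.5))    # within 1.1-fold
print(kcat_disagrees(10.0, 1000.0))  # 100-fold apart
```

Taking the larger value over the smaller makes the measure symmetric, so it does not matter which database reports the higher kcat.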

EnzyExtract

  • scientific notation, 3802 PMIDs
  • scientific notation with kcat, km, and kcat/km: 1511 PMIDs
    • can give the LLM the "calculation" tool
    1. Verify that the kcat and Km values match what is provided in the image.
    2. Use the calculation tool to ensure that the purported kcat/Km indeed equals kcat divided by Km.
    3. If not, try flipping the signs of all exponents (for instance, 4 x 10^5 to 4 x 10^-5) and try again.
  • hallucination threshold > 0.5: 806 PMIDs
  • repetition threshold > 0.5: 784 PMIDs
  • too many sigfigs: 111 PMIDs, of which most (via manual inspection) look like they are actually correct
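
The check-then-flip procedure in the numbered steps above could look roughly like this outside an LLM tool call. It is a sketch under assumptions: each value is represented as a hypothetical `(mantissa, exponent)` pair with a positive mantissa, units are already consistent, and the tolerance and return labels are illustrative.

```python
import math

def to_float(value):
    mantissa, exponent = value
    return mantissa * 10.0 ** exponent

def is_consistent(kcat, km, kcat_km, tol_log10=0.5):
    """True when kcat / Km is within tol_log10 orders of magnitude of kcat/Km."""
    computed = to_float(kcat) / to_float(km)
    return abs(math.log10(computed) - math.log10(to_float(kcat_km))) <= tol_log10

def flip(value):
    """Flip the sign of the exponent, e.g. 4 x 10^5 becomes 4 x 10^-5."""
    mantissa, exponent = value
    return (mantissa, -exponent)

def check_with_flip(kcat, km, kcat_km):
    """Return 'ok', 'ok_after_flip', or 'mismatch'."""
    if is_consistent(kcat, km, kcat_km):
        return "ok"
    # Retry with the signs of all exponents flipped.
    if is_consistent(flip(kcat), flip(km), flip(kcat_km)):
        return "ok_after_flip"
    return "mismatch"

# Made-up values: Km and kcat/Km were read with the wrong exponent sign.
print(check_with_flip((2.0, 0), (4.0, 4), (5.0, -3)))
```

A result of `ok_after_flip` suggests the exponent signs were lost during extraction, while `mismatch` means the row needs a closer look.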

BRENDA

  • BRENDA/EnzyExtract correlation plot: 1592 PMIDs kcat differs more than 1.1-fold
  • out of distribution: use the intersection of BRENDA+EnzyExtract as "super-reliable". Consider points that are out of distribution.
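
The out-of-distribution idea is not specified further here, so the following is only one possible reading: fit a simple log-scale distribution to the "super-reliable" values and flag points far from it. The z-score approach, the `z_cutoff=3.0` default, and the function names are all assumptions.

```python
import math
import statistics

def fit_log_distribution(reliable_values):
    """Mean and sample standard deviation of log10 of the reliable values."""
    logs = [math.log10(v) for v in reliable_values]
    return statistics.mean(logs), statistics.stdev(logs)

def is_out_of_distribution(value, mean, std, z_cutoff=3.0):
    # z_cutoff is a hypothetical threshold in standard deviations.
    return abs(math.log10(value) - mean) > z_cutoff * std

# Made-up "super-reliable" values, not data from BRENDA or EnzyExtract.
mean, std = fit_log_distribution([1.0, 10.0, 100.0])
print(is_out_of_distribution(10.0, mean, std))
print(is_out_of_distribution(1e6, mean, std))
```

Kinetic parameters span many orders of magnitude, which is why the sketch works in log10 space rather than on raw values.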

Flags

  • abbreviated substrate
