Abstract

Protein-protein interactions (PPIs) are often highly specific to the conditions in which they occur, including cell type, developmental stage, and disease state. However, most large PPI databases remain agnostic to these key nuances, reporting interactions without indicating the biological context. This gap limits researchers' ability to understand how molecular networks vary across biological systems. Much of this context-specific information resides in the scientific literature, where it is difficult to access at scale. Natural language processing (NLP) offers a powerful solution by mining PPIs directly from publications while retaining the surrounding contextual information, such as the cell type in which the interaction was observed.

Our project, COMPILE (Context-aware Mapping of Protein Interactions from Literature Evidence), will develop a user-friendly web platform to make this information accessible. Using context-aware NLP pipelines, we will extract protein-protein interactions from the literature, annotate them with detailed biological context, and link each interaction to the exact supporting sentence in the source paper. The results will be stored in an interactive knowledge graph, where proteins are nodes, interactions are edges, and contextual attributes are embedded as metadata. By combining literature mining with rich graphical visualization, COMPILE will allow researchers to easily identify PPIs relevant to their biological system of interest and explore protein networks across different cellular contexts. This tool will bridge the gap between unstructured text and actionable, evidence-linked PPI data, accelerating hypothesis generation in molecular biology.
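The graph model described above (proteins as nodes, interactions as edges, context and supporting sentences as edge metadata) can be sketched in plain Python. This is only an illustrative in-memory sketch, not the project's Neo4j schema: the class names `Evidence`, `Interaction`, and `PPIGraph`, their fields, and the placeholder proteins, PMIDs, and sentences are all assumptions made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    """One supporting sentence for an interaction (hypothetical fields)."""
    pmid: str
    sentence: str
    cell_type: Optional[str] = None


@dataclass
class Interaction:
    """An undirected edge between two proteins; evidence is edge metadata."""
    protein_a: str
    protein_b: str
    evidence: list[Evidence] = field(default_factory=list)


class PPIGraph:
    """Toy knowledge graph: proteins are nodes, interactions are edges."""

    def __init__(self) -> None:
        # Keyed by an unordered protein pair so A-B and B-A share one edge.
        self._edges: dict[frozenset[str], Interaction] = {}

    def add_evidence(self, a: str, b: str, ev: Evidence) -> None:
        key = frozenset((a, b))
        if key not in self._edges:
            first, second = sorted((a, b))
            self._edges[key] = Interaction(first, second)
        self._edges[key].evidence.append(ev)

    def interactions_in(self, cell_type: str) -> list[Interaction]:
        """Edges with at least one evidence sentence from the given cell type."""
        return [
            edge for edge in self._edges.values()
            if any(ev.cell_type == cell_type for ev in edge.evidence)
        ]

    def partners(self, protein: str, cell_type: Optional[str] = None) -> set[str]:
        """Interaction partners of a protein, optionally filtered by cell type."""
        found: set[str] = set()
        for edge in self._edges.values():
            if protein not in (edge.protein_a, edge.protein_b):
                continue
            if cell_type is not None and not any(
                ev.cell_type == cell_type for ev in edge.evidence
            ):
                continue
            found.add(edge.protein_b if edge.protein_a == protein else edge.protein_a)
        return found


# Placeholder data only: these proteins, PMIDs, and sentences are invented.
graph = PPIGraph()
graph.add_evidence("ProteinB", "ProteinA", Evidence(
    "PMID_PLACEHOLDER_1", "ProteinA binds ProteinB in type-X cells.", "type-X"))
graph.add_evidence("ProteinA", "ProteinB", Evidence(
    "PMID_PLACEHOLDER_2", "ProteinA and ProteinB interact in type-Y cells.", "type-Y"))
graph.add_evidence("ProteinA", "ProteinC", Evidence(
    "PMID_PLACEHOLDER_3", "ProteinC associates with ProteinA in type-Y cells.", "type-Y"))

print(sorted(graph.partners("ProteinA", cell_type="type-X")))
```

Keeping each evidence sentence on the edge, rather than collapsing it into a single flag, is what would let a user trace an interaction back to its source sentence and filter the network by cellular context.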

Environment Setup

python3.12 -m venv env
source env/bin/activate

pip install scispacy
pip install spacy
pip install neo4j
pip install indra
pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.4/en_ner_jnlpba_md-0.5.4.tar.gz
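After installation, a quick check that the packages resolve inside the virtual environment can save debugging time later. The sketch below is a suggestion rather than part of the repository; the helper name `missing_modules` is hypothetical, and the list assumes each package is importable under the top-level module name shown.

```python
from importlib.util import find_spec

# Top-level module names assumed to correspond to the packages installed above.
REQUIRED_MODULES = ["spacy", "scispacy", "neo4j", "indra", "en_ner_jnlpba_md"]


def missing_modules(names):
    """Return the module names that cannot be found on the current import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it with the environment activated; any name it reports as missing can be reinstalled with the corresponding pip command.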


hackbio-ca/compile-protein-interactions

Folders and files

Latest commit

History

Repository files navigation

Abstract

Environment Setup

About

Resources

License

Code of conduct

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 5

Uh oh!

Languages

Packages