MuseGraph

The official implementation of Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining (TPAMI 2025).

Environment Setup

LLaMA-Factory v0.8.0
Python 3.7.15

Quick Start

Dataset Generation

After downloading the required dataset files, you can use the dataset generation scripts in the data directory (e.g., nc_imdb.ipynb) to prepare the corresponding datasets.
For CoT-based instruction generation, please refer to the Prompt Template section.

After generating and mixing multiple datasets, you can configure and register them in dataset_info.json under the llama-factory directory, for example:

  "train_nc_IMDB": {
    "file_name": "train_nc_IMDB.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
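As a minimal sketch of how such an entry could be added programmatically, the snippet below writes records with the instruction, input, and output fields and merges the matching entry into dataset_info.json. The helper name register_dataset, the directory argument, and the sample record are hypothetical and only illustrate the mapping shown above; the notebooks in the data directory remain the reference for the actual data format.

```python
import json
from pathlib import Path


def register_dataset(data_dir, name, records):
    """Write `records` to <name>.json and register it in dataset_info.json.

    Hypothetical helper: the function name and directory layout are
    assumptions, not part of the original repository.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # Every record must carry the three fields named in the "columns" mapping.
    required = {"instruction", "input", "output"}
    for record in records:
        missing = required - set(record)
        if missing:
            raise ValueError(f"record is missing fields: {sorted(missing)}")

    file_name = f"{name}.json"
    with open(data_dir / file_name, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)

    # Load the existing registry if present so other entries are preserved.
    info_path = data_dir / "dataset_info.json"
    info = {}
    if info_path.exists():
        with open(info_path, encoding="utf-8") as f:
            info = json.load(f)

    info[name] = {
        "file_name": file_name,
        "columns": {
            "prompt": "instruction",
            "query": "input",
            "response": "output",
        },
    }
    with open(info_path, "w", encoding="utf-8") as f:
        json.dump(info, f, ensure_ascii=False, indent=2)
    return info[name]
```

Calling register_dataset("path/to/data", "train_nc_IMDB", records) with a list of such records would produce an entry equivalent to the one shown above, where the path is a placeholder for wherever dataset_info.json lives in your setup.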

Train, Test, and Evaluate

The script lora_process.py provides an end-to-end pipeline for model training, testing, and evaluation:

python src/lora_process.py

Data Download

The datasets used in this project can be accessed from the following links:

Prompt Template

IMDB

Type	Prompt
Task-specific Instruction	Input: Given the target MOVIE with the compact graph description in the IMDB dataset, what the following categories does this MOVIE belong to: {Category List}. This MOVIE may have one or more categories. Directly give the answer of this MOVIE's categories. The compact graph description of this MOVIE is listed as follows: Title: {Title of MOVIE} Ego Graph Nodes: {Ego Graph Node List} One-hop Neighbors: {1-hop Neighbor List} Random Walks: {Random Walk Paths}. Output: {Ground-truth Category List}.
Querying GPT-4	I have a question as below: {Task-specific Instruction Input} and the answer is {Task-specific Instruction Output}. Imagine that you have made the correct choice and proceed with step-by-step reasoning. Your reasoning needs to incorporate Ego Graph Nodes, One-hop Neighbors, and Random Walks in the given compact graph description.
CoT-based Instruction	Input: Given the target MOVIE with the compact graph description in the IMDB dataset, what the following categories does this MOVIE belong to: {Category List}. This MOVIE may have one or more categories. Please think about the categorization in a step-by-step manner and avoid making false associations. Then provide your reasoning. Using the following format: Answer: {Answer}; Reasoning: {Reason}. The compact graph description of this MOVIE is listed as follows: Title: {Title of MOVIE} Ego Graph Nodes: {Ego Graph Node List} One-hop Neighbors: {1-hop Neighbor List} Random Walks: {Random Walk Paths}. Output: Answer: {Ground-truth Category List}; Reasoning: {Generated by GPT-4}.
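The templates above can be filled in with plain string formatting. The following is a minimal sketch, not the repository's own code: the function names, the separators used to serialize node lists and random walks, and the choice to place the whole prompt in the instruction field with an empty input are all assumptions made for illustration. The template wording is copied from the table, and the sample movie data consists of placeholder values.

```python
# Template text copied from the "Task-specific Instruction" row.
TASK_TEMPLATE = (
    "Given the target MOVIE with the compact graph description in the IMDB dataset, "
    "what the following categories does this MOVIE belong to: {categories}. "
    "This MOVIE may have one or more categories. "
    "Directly give the answer of this MOVIE's categories. "
    "The compact graph description of this MOVIE is listed as follows: "
    "Title: {title} Ego Graph Nodes: {ego_nodes} "
    "One-hop Neighbors: {neighbors} Random Walks: {walks}."
)

# Template text copied from the "Querying GPT-4" row.
GPT4_QUERY_TEMPLATE = (
    "I have a question as below: {question} and the answer is {answer}. "
    "Imagine that you have made the correct choice and proceed with step-by-step reasoning. "
    "Your reasoning needs to incorporate Ego Graph Nodes, One-hop Neighbors, "
    "and Random Walks in the given compact graph description."
)


def build_task_record(title, categories, ego_nodes, neighbors, walks, labels):
    """Fill the task-specific template (hypothetical helper).

    Assumption: lists are joined with ", " and each random walk is
    rendered as "a -> b -> c", with walks separated by "; ".
    """
    prompt = TASK_TEMPLATE.format(
        categories=", ".join(categories),
        title=title,
        ego_nodes=", ".join(ego_nodes),
        neighbors=", ".join(neighbors),
        walks="; ".join(" -> ".join(walk) for walk in walks),
    )
    # Assumption: the full prompt goes in "instruction" and "input" stays empty.
    return {"instruction": prompt, "input": "", "output": ", ".join(labels)}


def build_gpt4_query(record):
    """Wrap a task record into the query used to elicit reasoning (hypothetical helper)."""
    return GPT4_QUERY_TEMPLATE.format(
        question=record["instruction"], answer=record["output"]
    )


# Placeholder data, for illustration only.
record = build_task_record(
    title="Example Movie",
    categories=["Action", "Comedy", "Drama"],
    ego_nodes=["Director A", "Actor B"],
    neighbors=["Director A"],
    walks=[["Example Movie", "Director A", "Another Movie"]],
    labels=["Drama"],
)
print(build_gpt4_query(record))
```

The reasoning returned for such a query would then fill the {Generated by GPT-4} slot of the CoT-based instruction output.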

Acknowledgements

We sincerely thank the following open-source repositories for their valuable codebases and contributions, which greatly helped this project:
