How to evaluate on MuSiQue #5

@1190201205

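I want to evaluate on the musique_ans_v1.0_dev.jsonl file from the MuSiQue dataset. How should I convert it into the format this project expects?
Is the right order to run prepare_retriever.sh first, then deploy vllm, and finally run run_evalutation.sh?

When running prepare_retriever.sh, I changed the parameters to the ones below.
I reduced dense_config.faiss_config.batch_size and dense_config.batch_size,
but with two RTX 4090s it still runs out of GPU memory. How should I adjust the settings?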
DEVICE_ID='[4,5]'
ENCODER_PATH='/Model/bge-large-en-v1.5'
data_path='data/musique_ans_v1.0_dev.jsonl'

python -m flexrag.entrypoints.prepare_index \
    retriever_type=dense \
    corpus_path=[$data_path] \
    saving_fields=[id,question,answer,paragraphs,answer_aliases] \
    text_process_pipeline.processor_type=[length_filter] \
    text_process_pipeline.length_filter_config.max_chars=4096 \
    text_process_pipeline.length_filter_config.min_chars=10 \
    text_process_fields=[paragraphs] \
    dense_config.database_path=test \
    dense_config.encode_fields=[paragraphs] \
    dense_config.passage_encoder_config.encoder_type=hf \
    dense_config.passage_encoder_config.hf_config.model_path=$ENCODER_PATH \
    dense_config.passage_encoder_config.hf_config.prompt='query: ' \
    dense_config.passage_encoder_config.hf_config.normalize=True \
    dense_config.passage_encoder_config.hf_config.device_id=$DEVICE_ID \
    dense_config.index_type=faiss \
    dense_config.faiss_config.batch_size=4896 \
    dense_config.faiss_config.log_interval=100000 \
    dense_config.batch_size=4896 \
    dense_config.log_interval=100000
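
On the conversion question, here is a rough sketch of one possible preprocessing step, not something confirmed by the project. In each MuSiQue record, `paragraphs` is a list of objects with `idx`, `title`, `paragraph_text` and `is_supporting`, so the raw file holds one question per line rather than one passage per line. The sketch splits it into a deduplicated passage corpus and a separate question file. The function name `flatten_musique` and the output field names (`id`, `title`, `text`, `golden_answers`) are my own assumptions and may not match what FlexRAG expects.

```python
import json


def flatten_musique(src_path, corpus_path, questions_path):
    """Split MuSiQue records into a passage corpus and a question file.

    Output field names are hypothetical; rename them to whatever the
    indexing and evaluation scripts actually require.
    """
    seen = set()  # (title, text) pairs already written, to drop duplicates
    n_passages = 0
    n_questions = 0
    with open(src_path, encoding="utf-8") as src, \
            open(corpus_path, "w", encoding="utf-8") as corpus, \
            open(questions_path, "w", encoding="utf-8") as questions:
        for line in src:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)

            # One output line per unique paragraph.
            for para in record["paragraphs"]:
                key = (para["title"], para["paragraph_text"])
                if key in seen:
                    continue
                seen.add(key)
                passage = {
                    "id": f"p{n_passages}",
                    "title": para["title"],
                    "text": para["paragraph_text"],
                }
                corpus.write(json.dumps(passage, ensure_ascii=False) + "\n")
                n_passages += 1

            # One output line per question, with the answer and its aliases.
            question = {
                "id": record["id"],
                "question": record["question"],
                "golden_answers": [record["answer"]]
                + list(record.get("answer_aliases", [])),
            }
            questions.write(json.dumps(question, ensure_ascii=False) + "\n")
            n_questions += 1
    return n_passages, n_questions
```

If this is the right direction, `corpus_path` would point at the generated corpus file and `dense_config.encode_fields` at its text field, instead of passing the raw dev file with `encode_fields=[paragraphs]`.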