4080 GPU can handle very little data; training fails once the dataset exceeds 10,000 samples #54

Description

@iissy

I have already shrunk the model configuration:

```python
class T5ModelConfig:
    d_ff: int = 1024                # feed-forward layer dimension

    d_model: int = 512              # word embedding dimension
    num_heads: int = 8              # number of attention heads; d_model // num_heads == d_kv
    d_kv: int = 64                  # d_model // num_heads

    num_decoder_layers: int = 6     # number of Transformer decoder layers
    num_layers: int = 6             # number of Transformer encoder layers
```

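The vocabulary is only 10,000 tokens. The Baidu Baike dataset has millions of entries, but I can only train on a few thousand of them; with any more, training fails with an error.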

Machine specs:
GPU: 4080 (12 GB VRAM)
RAM: 32 GB
CPU: i9-14900HX (24 cores, 32 threads)

Is this hardware not good enough to train a large model?
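
For reference, here is a rough back-of-the-envelope estimate of how large the model state is with the configuration above. It is only a sketch under several assumptions that the issue does not confirm: a `vocab_size` field of 10,000 (taken from the text, not from the posted config), a non-gated feed-forward block, tied input/output embeddings, bias-free projections, and 16 bytes per parameter for fp32 weights, gradients, and two Adam moment buffers. Relative position biases and the final layer norms are ignored. `estimate_params` and `training_state_bytes` are hypothetical helper names, not part of the project.

```python
from dataclasses import dataclass


@dataclass
class T5ModelConfig:
    d_ff: int = 1024
    d_model: int = 512
    num_heads: int = 8
    d_kv: int = 64
    num_decoder_layers: int = 6
    num_layers: int = 6
    vocab_size: int = 10000  # assumption: from the issue text, not the posted config


def estimate_params(cfg: T5ModelConfig) -> int:
    """Approximate parameter count of a T5-style encoder-decoder."""
    inner = cfg.num_heads * cfg.d_kv
    attn = 4 * cfg.d_model * inner         # q, k, v, o projections (no biases)
    ffn = 2 * cfg.d_model * cfg.d_ff       # assumption: non-gated feed-forward
    norm = cfg.d_model                     # scale-only layer norm
    enc_layer = attn + ffn + 2 * norm      # self-attention + feed-forward
    dec_layer = 2 * attn + ffn + 3 * norm  # self-attention + cross-attention + feed-forward
    embed = cfg.vocab_size * cfg.d_model   # assumption: tied input/output embeddings
    return (embed
            + cfg.num_layers * enc_layer
            + cfg.num_decoder_layers * dec_layer)


def training_state_bytes(n_params: int, bytes_per_param: int = 16) -> int:
    """fp32 weights + gradients + two Adam moments = 16 bytes per parameter."""
    return n_params * bytes_per_param


if __name__ == "__main__":
    n = estimate_params(T5ModelConfig())
    print(f"parameters: {n:,}")
    print(f"weights + grads + Adam state: {training_state_bytes(n) / 1024**3:.2f} GiB")
```

Under these assumptions the model has roughly 36.6 million parameters, and its weights, gradients, and optimizer state take about 0.55 GiB. The estimate does not cover activation memory (which scales with batch size and sequence length) or how the dataset is loaded into RAM, so it says nothing about those as possible sources of the error.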
