May I know what the precision of the 130B chat model is? Would it be possible to provide a quantized version, perhaps in int4 or int8?