Hi, @XiangLi1999. Thank you for your great work!
According to the official code, the `teacher_student` argument is set to `True` for typical and nucleus decoding. I wonder whether `teacher_student` should be `True` only for contrastive decoding and `False` for all the other baselines, so that they run plain typical / nucleus decoding with no effect from contrastive decoding. Is it correct to set `teacher_student=False` in the typical decoding code, just as in the greedy decoding branch?
Thank you.

    elif args.do_sample == 'typical':
        output_sequences = model.generate(
            input_ids=input_ids,
            max_length=args.length + len(encoded_prompt[0]),
            min_length=args.length + len(encoded_prompt[0]),
            temperature=args.temperature,
            top_k=args.k,
            top_p=args.p,
            typical_p=0.95,
            repetition_penalty=args.repetition_penalty,
            do_sample=True,
            num_beams=1,
            num_return_sequences=args.num_return_sequences,
            student_lm=student_lm,
            teacher_student=True,
            model_kwargs_student={},
            st_coef=args.st_coef)
        print('typical sampling')
    elif args.do_sample == 'greedy' and args.contrastive_decoding == 'none':
        output_sequences = model.generate(
            input_ids=input_ids,
            max_length=args.length + len(encoded_prompt[0]),
            min_length=args.length + len(encoded_prompt[0]),
            temperature=args.temperature,
            top_k=args.k,
            top_p=args.p,
            repetition_penalty=args.repetition_penalty,
            do_sample=False,
            num_beams=1,
            num_return_sequences=args.num_return_sequences,
            student_lm=student_lm,
            teacher_student=False,
            model_kwargs_student={},
            st_coef=args.st_coef)
        print('greedy')
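
To make the question concrete, here is a minimal sketch of the behavior I have in mind. The helper `student_kwargs` is hypothetical and not part of the official code. It also assumes that `'none'` is the only value of `args.contrastive_decoding` that disables contrastive decoding, as the greedy branch above suggests.

```python
def student_kwargs(contrastive_decoding: str) -> dict:
    """Hypothetical helper, not part of the official repository.

    Returns the student-related keyword arguments for generate(), enabling
    the teacher/student path only when contrastive decoding is requested.
    """
    # Assumption: 'none' means contrastive decoding is off, any other value means on.
    use_student = contrastive_decoding != 'none'
    return {
        'teacher_student': use_student,
        'model_kwargs_student': {},
    }


# For a plain typical / nucleus baseline, the student path would be disabled.
baseline = student_kwargs('none')
print(baseline['teacher_student'])
```

With something like this, the typical and nucleus branches would pass `teacher_student=False` whenever `args.contrastive_decoding == 'none'`, the same as the greedy branch does.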