Loss stays at 87.3365 during training and does not change

I followed the instructions and did not change the settings in solver.prototxt, but the loss quickly reached 87.3365 and stayed there. I have read that this happens when the learning rate is too large and the features fed into the softmax layer become inf. What settings should I use with this network?
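For reference, here is a minimal sketch of where the value 87.3365 seems to come from. It assumes the softmax loss clamps the predicted probability at FLT_MIN before taking the log, as I understand Caffe's SoftmaxWithLoss layer to do. The helper name `clamped_log_loss` is my own and is not part of any library.

```python
import math

# Smallest positive normalized float32, i.e. C's FLT_MIN (2**-126).
FLT_MIN = 2.0 ** -126

def clamped_log_loss(prob):
    """Negative log-likelihood with the probability clamped at FLT_MIN.

    Hypothetical helper: it mirrors the clamping assumed above, not an
    actual library function.
    """
    return -math.log(max(prob, FLT_MIN))

# If the softmax output for the true class underflows to 0 (or is invalid
# because the inputs blew up), the clamped loss saturates at -log(FLT_MIN).
saturated_loss = clamped_log_loss(0.0)
print(round(saturated_loss, 4))  # 87.3365
```

If that assumption holds, a loss stuck at exactly this value suggests the network has diverged rather than converged, which would be consistent with the learning rate being too large.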
Thanks a lot!