- Use a small corpus and hand-calculated values to double-check the results.
- Use a large corpus and confirm that a few known phrases rank predictably against each other, e.g. "How are you" > "How are monkeys".
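
Both checks can be sketched in a few lines. The scoring method under test is not stated here, so this is only an illustration that assumes an unsmoothed bigram model; `train_bigram`, `phrase_score`, and the toy corpus are hypothetical names and data, not part of any existing code.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left-hand contexts (hypothetical helper)."""
    contexts = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams


def phrase_score(phrase, contexts, bigrams):
    """Product of unsmoothed P(word | previous word) over the phrase."""
    tokens = phrase.lower().split()
    score = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        if contexts[prev] == 0:
            return 0.0  # unseen context: no estimate without smoothing
        score *= bigrams[(prev, cur)] / contexts[prev]
    return score


# Check 1: small corpus, values worked out by hand.
#   P(are | how) = 3/3, P(you | are) = 2/3, P(monkeys | are) = 1/3
toy_corpus = ["how are you", "how are you today", "how are monkeys different"]
contexts, bigrams = train_bigram(toy_corpus)
assert math.isclose(phrase_score("How are you", contexts, bigrams), 2 / 3)
assert math.isclose(phrase_score("How are monkeys", contexts, bigrams), 1 / 3)

# Check 2: relative ordering of known phrases. On a real large corpus the
# exact values are unknown, so only the comparison is asserted.
assert phrase_score("How are you", contexts, bigrams) > phrase_score(
    "How are monkeys", contexts, bigrams
)
```

For the second check, the same comparison would be run against counts built from the large corpus instead of `toy_corpus`; the exact-value assertions only make sense for the small one.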