By calculating semantic entropy with a second LLM, we can better flag answers that are unreliable because the model lacks the relevant knowledge
As you surely know, AI has made huge strides in the last two years with the development and mass-scale deployment of large language models…