Hallucination Detection Feature

The hallucination detection feature detects hallucinations (plausible but false statements not based on data) in answers given by the Conversational Generative AI.

  • If hallucinations are detected, the feature calculates an indicator called the hallucination score, based on the intensity or amount of hallucinations.
  • The closer the hallucination score is to 100, the more likely it is that the answer contains false statements. Note that the detection itself may be incomplete or inaccurate.
  • Use this feature to decide whether to verify facts against reliable information sources.

How to use

1. Open the dialog

Press the button with the shield icon at the bottom of the answer frame you want to check. The hallucination detection dialog opens.

(* "including a logo for ICL" in the above answer is a hallucination.)

2. Select a function

Select one of the three hallucination detection functions on the Hallucination tab (see the red frame in the following figure).

  • The functions differ in granularity, method, and use cases.

"Majority vote for entire answer" function:

  • Granularity/Method) This function attempts to detect hallucinations in the entire answer passage based on a majority vote, and displays the hallucination score and the majority vote result.
  • The more unrelated or contradictory the answer passage is to the majority opinion, the higher the hallucination score.
  • If it detects hallucinations, it displays a new answer passage with the hallucinations removed or mitigated.
  • Use cases) This function is the fastest but has coarse granularity, so it is suitable for situations where you want to...
  • "quickly check for severe hallucinations"
  • "find hallucinations across multiple sentences"
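The idea of scoring an answer against a majority opinion can be illustrated with a toy sketch. This is not the feature's actual algorithm, which this guide does not describe. The function name `majority_vote`, the exact-match comparison of short claims, and the scoring rule (share of reference answers that disagree with the checked answer) are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Tuple


def majority_vote(checked: str, references: List[str]) -> Tuple[str, float]:
    """Toy majority vote over short claims (hypothetical, not the real detector).

    Returns the most common reference claim and a 0-100 score equal to the
    percentage of reference claims that differ from the checked claim.
    """
    if not references:
        raise ValueError("at least one reference answer is required")
    # Assumption: claims are compared by normalized exact match.
    normalized_refs = [r.strip().lower() for r in references]
    normalized_checked = checked.strip().lower()
    majority, _ = Counter(normalized_refs).most_common(1)[0]
    disagreeing = sum(1 for r in normalized_refs if r != normalized_checked)
    score = 100.0 * disagreeing / len(normalized_refs)
    return majority, score


# Three regenerated answers all disagree with the checked answer.
opinion, score = majority_vote("Lyon", ["Paris", "Paris", "Paris"])
print(opinion, score)
```

In this sketch, an answer that every reference contradicts receives the maximum score, and the majority opinion is what a mitigated answer would be corrected toward.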

"Majority vote for key phrases in answer" function:

  • Granularity/Method) This function attempts to detect hallucinations in the key phrases of each answer sentence based on majority votes, and displays the hallucination score and the majority vote results for each sentence.
  • The more unrelated or contradictory the answer sentence is to the majority opinion, the higher the hallucination score.
  • If it detects hallucinations, it displays each new answer sentence with the hallucinations removed or mitigated.
  • The tab shows the overall average of the per-sentence hallucination scores.
  • Use cases) This function is the slowest but has fine granularity, so it is suitable for situations where you want to...
  • "check each sentence for narrow hallucinations"
  • "find hallucinations in key phrases such as people's names and eras"

"Check answer/docs correspondences" function:

  • This function is only available in chats that refer to documents.
  • Granularity/Method) This function checks, sentence by sentence, whether the answer corresponds to the documents, and displays the hallucination score and whether a correspondence exists for each sentence.
  • The more unrelated or contradictory the answer sentence is to the documents, the higher the hallucination score.
  • This function does not remove or mitigate hallucinations.
  • The tab shows the overall average of the per-sentence hallucination scores.
  • Use cases) Because this function compares the answer directly with the documents, it is suitable for situations where you want to...
  • "check if there are any answer sentences that are not mentioned in any of the documents"
  • "find sentences in the documents that correspond to the answer sentences"
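For the two sentence-level functions, the tab shows the overall average of the per-sentence scores. A minimal sketch of that averaging follows. The function name `tab_score` is hypothetical, and skipping sentences that have no score (shown as "--" in the results) is an assumption, since this guide does not say how such sentences are treated.

```python
from typing import Optional, Sequence


def tab_score(sentence_scores: Sequence[Optional[float]]) -> Optional[float]:
    """Average per-sentence scores; None stands for an unscored ("--") sentence."""
    # Assumption: unscored sentences are excluded from the average.
    scored = [s for s in sentence_scores if s is not None]
    if not scored:
        return None  # nothing could be scored, so the tab would stay at "--"
    return sum(scored) / len(scored)


print(tab_score([0, 50, 100]))
print(tab_score([None, 80, 20]))
```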

3. Select an option

Select one of the hallucination detection options from the select box at the bottom of the dialog (see the red frame in the following figure).

  • The option sets the type of references that are compared with the answer during hallucination detection.

"Use regenerated answers as references" option:

  • The detector uses other answers, obtained by asking the Conversational AI the same question again, as the references for comparison.
  • This option is only available when using the "Majority vote on entire answer" function.

"Use document chunks as references" option:

  • The detector uses the document chunks that the Conversational AI referred to during the chat as the references for comparison.
  • This option is only available in chats that refer to documents when using the "Majority vote focusing on key phrases" or "Check answer/docs correspondences" functions.

"No references" option:

  • The detector does not use any references. It compares the answer only with the knowledge held by the Conversational AI.
  • This option is only available when using the "Majority vote focusing on key phrases" function.
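The availability rules above can be summarized as a small lookup. This is only a sketch of the rules as stated in this guide, not the product's code. The function name `available_options` and the `chat_refers_to_documents` flag are hypothetical, and the label strings are copied from the names used in this step.

```python
from typing import List

# Function and option labels as written in this step of the guide.
ENTIRE = "Majority vote on entire answer"
KEY_PHRASES = "Majority vote focusing on key phrases"
CORRESPONDENCE = "Check answer/docs correspondences"

REGENERATED = "Use regenerated answers as references"
DOC_CHUNKS = "Use document chunks as references"
NO_REFS = "No references"


def available_options(function: str, chat_refers_to_documents: bool) -> List[str]:
    """Return the options this guide describes as available for a function."""
    if function == ENTIRE:
        return [REGENERATED]
    if function == KEY_PHRASES:
        # Document chunks exist only in chats that refer to documents.
        return [DOC_CHUNKS, NO_REFS] if chat_refers_to_documents else [NO_REFS]
    if function == CORRESPONDENCE:
        # The function itself is only available in chats that refer to documents.
        return [DOC_CHUNKS] if chat_refers_to_documents else []
    raise ValueError(f"unknown function: {function}")


print(available_options(KEY_PHRASES, chat_refers_to_documents=True))
```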

4. Start checking

Press the Check button to run the selected hallucination detection function. The detection result is displayed after several tens of seconds (see the red frame in the following figure).

  • The actual processing time varies depending on the selected function and the length of the answer.
  • If a previous hallucination detection result would be overwritten, a Recheck button appears instead.

How to read detection results

Hallucination Score

The numbers displayed on the tab and on the chips at the beginning of each sentence are the hallucination scores assigned to the entire answer passage and to each sentence, respectively.

  • The score takes a value from 0 to 100. A value closer to 100 indicates a greater intensity or amount of hallucinations in the answer.
  • The color of the tab and chips changes according to the hallucination score. Blue corresponds to 0, gray to 50, and red to 100.
  • For answers that cannot be checked for hallucinations, such as greetings and apologies, the hallucination score remains "--" and the tab and chips are displayed in black.
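The color scale can be pictured as a two-segment gradient. The sketch below assumes linear interpolation between the three named colors and assumes specific RGB values. The guide states only that blue corresponds to 0, gray to 50, red to 100, and black to "--", so the intermediate colors and the name `score_color` are illustrative.

```python
from typing import Optional, Tuple

Color = Tuple[int, int, int]

# Assumed RGB anchors; the guide names the colors but not their exact values.
BLUE: Color = (0, 0, 255)
GRAY: Color = (128, 128, 128)
RED: Color = (255, 0, 0)
BLACK: Color = (0, 0, 0)


def score_color(score: Optional[float]) -> Color:
    """Map a hallucination score to a color; None stands for the "--" score."""
    if score is None:
        return BLACK
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Assumption: blend linearly from blue to gray, then from gray to red.
    if score <= 50:
        start, end, t = BLUE, GRAY, score / 50
    else:
        start, end, t = GRAY, RED, (score - 50) / 50
    r, g, b = (round(a + (c - a) * t) for a, c in zip(start, end))
    return (r, g, b)


print(score_color(0), score_color(50), score_color(100), score_color(None))
```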

"Majority vote on entire answer" function

EXPLANATION is the rationale for the hallucination score. The result of the majority vote is displayed as this rationale.

MITIGATED ANSWER is the new answer passage after the hallucinations have been removed or mitigated.

  • In general, the answer is corrected to the opinion agreed upon by the majority.
  • The correction does not consult an external database or the internet, so there is no guarantee that the corrected answer is a generally accepted fact.

CHECKED ANSWER is the answer passage that was the target of the hallucination detection. It is a copy of the answer displayed on the chat screen.

  • If the display area is collapsed, press the button at the end to expand it.

REFERENCE PASSAGES are the other answer passages used for the majority vote. They are the results of asking the Conversational AI to answer again three times in the same context.

  • If the display area is collapsed, press the button at the end to expand it.

"Majority vote focusing on key phrases" function

SCORE AND EXPLANATION FOR EACH CHECKED SENTENCE shows the hallucination score and its rationale for each sentence.

  • The result of the majority vote for each key phrase is displayed as the rationale.
  • If hallucinations are detected, each new sentence with the hallucinations removed or mitigated is displayed under "Hallucination mitigation".
  • In general, the sentence is corrected to the opinion agreed upon by the majority.
  • The correction does not consult an external database or the internet, so there is no guarantee that the corrected answer is a generally accepted fact.

REFERENCE PASSAGES are the document chunks referenced when a majority vote is taken on key phrases.

  • They are displayed only if the "Use document chunks as references" option is selected.
  • If the display area is collapsed, press the button at the end to expand it.

(* Answers were given in English with reference to Japanese documents.)

"Check answer/docs correspondences" function

SCORE AND EXPLANATION FOR EACH CHECKED SENTENCE shows the hallucination score and its rationale for each sentence.

  • The document sentence that corresponds to the answer sentence, and the relationship between the two, are displayed as the rationale.

REFERENCE PASSAGES are the document chunks referred to when checking the correspondence.

  • If the display area is collapsed, press the button at the end to expand it.

(* Answers were given in English with reference to Japanese documents.)