# Knowledge agents

EARLY ACCESS

___

Use the Quality center to evaluate and monitor your [knowledge agents](https://www.infobip.com/docs/agentos-ai-agents/knowledge-agents/overview). Test agent responses with sample questions, analyze real end user traffic, and use the results to identify issues and improve agent performance.

To access, go to **AI Agents** > **Quality center** > **Knowledge Agents**.

___

## Process overview

1. Select the knowledge agent you want to evaluate.
2. Create and run the task.
3. Check the [status](#view-task-status) of the task.
4. When the task is complete, download and [view the results](#view-the-results).
5. [Update the knowledge agent](#update-the-agent-based-on-results) as required.

___

## Evaluate

Test the knowledge agent with sample questions to verify it responds as expected. Upload a `.csv` file with questions, run the task, and review the results.

### Prepare test questions [#prepare-test-questions-evaluate]

Create a `.csv` file with sample questions that represent real end user queries.

Example questions for a car dealership agent:

- What car models do you have available?
- Do you sell new and used cars?
- Can I trade in my old car?
- What financing options do you provide?

To verify answer accuracy, use [ground truth evaluation](#ground-truth-evaluation-optional) by adding a `reference_answer` column alongside each question.
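A test file like this can be generated with a short script. Below is a minimal sketch using Python's standard `csv` module; the file name, the `question` header row, and the questions themselves are illustrative assumptions, not requirements confirmed by this guide:

```python
import csv

# Illustrative questions for a hypothetical car dealership agent.
questions = [
    "What car models do you have available?",
    "Do you sell new and used cars?",
    "Can I trade in my old car?",
    "What financing options do you provide?",
]

# Write one question per row; "question" as a header row is an assumption.
with open("test_questions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["question"])
    for q in questions:
        writer.writerow([q])
```

Using `csv.writer` rather than manual string joins ensures questions containing commas or quotes are escaped correctly.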

### Run the task [#run-the-task-evaluate]

1. On the Infobip web interface, go to **AI Agents** > **Quality center** > **Knowledge Agents**.
2. In the **Select knowledge agent** field, select the agent you want to test.
3. Upload the `.csv` file that contains the set of questions.
4. In the **Task name** field, enter a name for the task.
5. Select **Run task**.

### Ground truth evaluation (optional) [#ground-truth-evaluation-optional-evaluate]

Ground truth evaluation compares agent responses against expected answers you provide. Add two columns to your `.csv` file: `question` and `reference_answer`.

When the test runs, each generated answer receives one of these labels:

| Label | Meaning |
|---|---|
| **GOOD** | The answer matches the expected answer |
| **INCOMPLETE** | The answer is correct overall but misses some details from the expected answer |
| **BAD** | The answer contains incorrect or fabricated information |

The label and reasoning for each answer are included in the results file.
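A ground truth test file might look like the following. The rows are illustrative answers for a hypothetical car dealership agent; quote any field that contains a comma:

```csv
question,reference_answer
Do you sell new and used cars?,"Yes, we sell both new and used vehicles."
Can I trade in my old car?,"Yes, trade-ins are accepted after an appraisal."
```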

___

## View task status

You can view all tasks in the **Tasks** section, which shows up to the 20 most recent tasks of each type.

| Status | Description |
|---|---|
| **Completed** | The task is complete and the results are ready to download. |
| **Failed** | The task failed. Hover over the information icon next to the status to see why. |
| **In progress** | The task is running. |

___

## View the results

When the task is complete, download the results `.csv` file from the **Tasks** section.

### Results fields [#results-fields-view-the-results]

| Field | Description |
|---|---|
| `question` | The question from the test file. |
| `answer` | The response generated by the knowledge agent. |
| `original_contexts` | Relevant chunks from the knowledge sources that were retrieved, including the source document name. |
| `reranked_contexts` | Contexts after reranking, if enabled. Not available for very short messages. |
| `content_filter_results` | Content filter result for each message. Included only if the content filter is enabled. Contains the status (`safe`, `annotated`, `blocked`), and violation details per category including severity, threshold, configured action, and whether the filter was triggered. |
| `latency` | Time taken by the knowledge agent to generate the response. This differs from the chatbot response time, which may include additional processing. |
| `topic` | The topic determined by the agent based on the question. Provides insight into end user interests. |
| `is_question_answered` | `TRUE`: agent answered the question. `FALSE`: agent could not answer (missing content or out of scope). `N/A`: message was not a question. |
| `is_answer_in_context` | Indicates whether the answer is based on the retrieved context. |
| `answer_classification` | `GOOD`, `INCOMPLETE`, or `BAD`. Included only for Evaluate tasks when `reference_answer` is provided. |
| `answer_classification_reasoning` | Explanation of why the answer received its classification. Included only for Evaluate tasks when `reference_answer` is provided. |
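The results file can be summarized programmatically. The following is a minimal sketch, assuming the downloaded file is named `results.csv` and uses the column names and values listed above; the function name is illustrative:

```python
import csv
from collections import Counter

def summarize_results(path: str) -> dict:
    """Collect unanswered questions and count ground truth labels."""
    unanswered = []
    labels = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # "FALSE" means the agent could not answer the question.
            if row.get("is_question_answered") == "FALSE":
                unanswered.append(row["question"])
            # Present only for Evaluate tasks run with reference answers.
            label = row.get("answer_classification")
            if label:
                labels[label] += 1
    return {"unanswered": unanswered, "labels": dict(labels)}
```

Running this after each evaluation gives a quick view of how many answers were labeled `GOOD`, `INCOMPLETE`, or `BAD`, and which questions the agent could not answer at all.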

___

## Update the agent based on results

Use the results to identify and fix issues. Investigate the root cause before making changes.

| Issue | Solution |
|---|---|
| Agent answered correctly but the answer is not in the context | The prompt likely contains the answer. If the answer is wrong, check the knowledge source content. |
| Agent uses an inappropriate tone | Review and update the prompt in [agent settings](https://www.infobip.com/docs/agentos-ai-agents/knowledge-agents/create-knowledge-agent#configure-the-agent-settings). |
| Agent responds in the wrong language | Review and update the prompt in [agent settings](https://www.infobip.com/docs/agentos-ai-agents/knowledge-agents/create-knowledge-agent#configure-the-agent-settings). |
| Agent responses are cut off | Increase the [output tokens](https://www.infobip.com/docs/agentos-ai-agents/knowledge-agents/create-knowledge-agent#advanced-settings) setting. |
| Agent hallucinates | The knowledge source is likely missing relevant content. [Add it](https://www.infobip.com/docs/agentos-ai-agents/knowledge-agents/connect-knowledge-sources). |
| Agent answers out-of-scope questions | Tighten the prompt to define scope and limitations in [agent settings](https://www.infobip.com/docs/agentos-ai-agents/knowledge-agents/create-knowledge-agent#configure-the-agent-settings). |
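Some of the checks in the table can be partially automated before a manual review. The sketch below flags rows that may indicate hallucination or ungrounded answers; it assumes the column names and values from the results file described above, and the flag messages are illustrative:

```python
import csv

def flag_issues(path: str) -> list[tuple[str, str]]:
    """Return (question, issue) pairs from a results file worth a manual look."""
    flags = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # BAD labels may point at incorrect or fabricated information.
            if row.get("answer_classification") == "BAD":
                flags.append((row["question"], "possible hallucination: check knowledge sources"))
            # Answered, but not grounded in retrieved context: inspect the prompt.
            if (row.get("is_question_answered") == "TRUE"
                    and row.get("is_answer_in_context") == "FALSE"):
                flags.append((row["question"], "answer not based on context: check the prompt"))
    return flags
```

Treat the output as a triage list, not a verdict: as noted above, always investigate the root cause before changing the agent.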

For an overview of testing approaches, see [Test knowledge agent](https://www.infobip.com/docs/agentos-ai-agents/knowledge-agents/test-knowledge-agent).

___