agentlab.agents.generic_agent.reproducibility_agent

An agent that reproduces exactly the same traces as GenericAgent, to compare the results.

This module contains the classes and functions to reproduce the results of a study. It is used to create a new study that will run the same experiments as the original study, but with a reproducibility agent that will mimic the same answers as the original agent.

Stats are collected to compare the original agent’s answers with the new agent’s answers. Load the this reproducibility study in agent-xray to compare the results.

Functions

make_repro_agent(agent_args, exp_dir)

Create a reproducibility agent from an existing agent.

reproduce_study(original_study_dir[, log_level])

Reproduce a study by running the same experiments with the same agent.

Classes

ReproAgent(chat_model_args, flags[, ...])

ReproAgentArgs([agent_name, ...])

ReproChatModel(old_messages[, delay])

A chat model that reproduces a conversation.

class agentlab.agents.generic_agent.reproducibility_agent.ReproAgent(chat_model_args, flags, max_retry=4, repro_dir=None)

Bases: GenericAgent

get_action(obs)

Updates the agent with the current observation, and returns its next action (plus an info dict, optional).

Parameters:

obs:

The current observation of the environment, after it has been processed by obs_preprocessor(). By default, a BrowserGym observation is a dict with the following entries: - “chat_messages”: list[str], messages between the agent and the user. - “goal”: str, the current goal. - “open_pages_urls”: list[str], open pages. - “active_page_index”: int, the index of the active page. - “url”: str, the current URL. - “screenshot”: 3D np.array, the current screenshot. - “dom_object”: dict, the current DOM object. See DOMSnapshot from chrome devtools. - “axtree_object”: dict, the current AXTREE object. See Accessibility Tree from chrome devtools. - “extra_element_properties”: dict[bid, dict[name, value]] extra properties of elements in the DOM. - “focused_element_bid”: str, the bid of the focused element. - “last_action”: str, the last action executed. - “last_action_error”: str, the error of the last action. - “elapsed_time”: float, the time elapsed since the start of the episode.

Returns:

action: str

The action to be processed by action_mapping() (if any), and executed in the environment.

info: AgentInfo

Additional information about the action. with the following entries being handled by BrowserGym:

  • “think”: optional chain of thought

  • “messages”: list of messages with the LLM

  • “stats”: dict of extra statistics that will be saved and aggregated.

  • “markdown_page”: str, string that will be displayed by agentlab’s xray tool.

  • “extra_info”: dict, additional information that will be saved and aggregated.

class agentlab.agents.generic_agent.reproducibility_agent.ReproAgentArgs(agent_name: str = None, chat_model_args: agentlab.llm.base_api.BaseModelArgs = None, flags: agentlab.agents.generic_agent.generic_agent_prompt.GenericPromptFlags = None, max_retry: int = 4, _repro_dir: str = None)

Bases: GenericAgentArgs

make_agent()

Comply the experiments.loop API for instantiating the agent.

class agentlab.agents.generic_agent.reproducibility_agent.ReproChatModel(old_messages, delay=1)

Bases: object

A chat model that reproduces a conversation.

Parameters:
  • messages (list) – A list of messages previously executed.

  • delay (int) – A delay to simulate the time it takes to generate a response.

get_stats()
agentlab.agents.generic_agent.reproducibility_agent.make_repro_agent(agent_args: AgentArgs, exp_dir: Path | str)

Create a reproducibility agent from an existing agent.

Note, if a new flag was added, it was not saved in the original pickle. When loading the pickle it silently adds the missing flag and set it to its default value. The new repro agent_args will thus have the new flag set to its default value.

Parameters:
  • agent_args (AgentArgs) – The original agent args.

  • exp_dir (Path | str) – The directory where the experiment was saved.

Returns:

The new agent args.

Return type:

ReproAgentArgs

agentlab.agents.generic_agent.reproducibility_agent.reproduce_study(original_study_dir: Path | str, log_level=20)

Reproduce a study by running the same experiments with the same agent.