agentlab.analyze.inspect_results

Functions

`ablation_report`(result_df[, reduce_fn, ...])	Reduce the multi-index to a change description compared to the previous row.
`categorize_error`(row)
`display_report`(report[, ...])	Display the report in a nicer-ish format.
`error_report`(df[, max_stack_trace, use_log])	Report the error message for each agent.
`error_report_detailed`(df[, max_stack_trace])	Report the error message for each agent, categorizing them as server errors or retry errors.
`flag_report`(report[, metric, round_digits])
`get_all_summaries`(results_dir[, ...])
`get_all_task_messages`(exp_dir[, max_n_exp])
`get_constants_and_variables`(df[, drop_constants])	Filter out constants from the dataframe.
`get_sample_std_err`(df, metric)	Get the standard error for a binary metric.
`get_std_err`(df, metric)	Get the standard error for a binary metric.
`get_study_summary`(study_dir[, ignore_cache, ...])	Get the cached study summary for the given study directory, or compute it.
`global_report`(result_df[, reduce_fn, ...])	Produce a report that summarizes all tasks and all episodes for each agent.
`load_result_df`(exp_dir[, progress_fn, ...])	Load the result dataframe.
`map_err_key`(err_msg)
`print_errors_chronologically`(df)	Print the errors in chronological order, grouping contiguous chunks of the same error.
`reduce_episodes`(result_df)	Reduce the dataframe to a single row per episode and summarize some of the columns.
`report_2d`(df[, reduce_fn, n_row_keys])	Generic function to create a 2d report based on the dataframe.
`report_constant_and_variables`(df[, ...])
`report_different_errors`(sub_df)	Report the different errors in the dataframe.
`set_index_from_variables`(df[, ...])	Set the index, inplace, to env.task_name and all variables.
`set_wrap_style`(df)
`shrink_columns`(df[, also_wrap_index])	Make the column names more compact by replacing underscores with newlines
`split_by_key`(df, key)	Return a dict of dataframes separated by the given key.
`summarize`(sub_df)
`summarize_stats`(sub_df)	Summarize the stats columns.
`summarize_study`(result_df)	Create a summary of the study.

agentlab.analyze.inspect_results.ablation_report(result_df: DataFrame, reduce_fn=<function summarize>, progression=False)

Reduce the multi-index to a change description compared to the previous row.

NOTE: This assumes that the experiments were launched with make_ablation_study.

Rows will be sorted according to the average ExpArgs.order for all experiments associated with the multi-index.

Parameters:

result_df – The result dataframe as returned by load_result_df.
reduce_fn – The function to use to reduce the sub dataframe. By default this is summarize.
progression – If True, the change description will be the progression

Returns:

A dataframe with the change description as index.
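
To illustrate what a "change description compared to the previous row" can look like, here is a minimal sketch in plain Python. It is not the library's implementation: `describe_changes` and the flag names are hypothetical, and the real function works on the multi-index of a result dataframe rather than on a list of dicts.

```python
def describe_changes(rows):
    """For each config in an ordered list, describe what changed since the previous one.

    Hypothetical helper for illustration only; not part of agentlab.
    """
    descriptions = []
    previous = None
    for row in rows:
        if previous is None:
            # The first row has nothing to be compared against.
            descriptions.append("baseline")
        else:
            changed = [
                f"{key}: {previous.get(key)!r} -> {value!r}"
                for key, value in row.items()
                if previous.get(key) != value
            ]
            descriptions.append(", ".join(changed) if changed else "no change")
        previous = row
    return descriptions


# Made-up flags, listed in the order an ablation study might run them.
configs = [
    {"use_html": True, "use_screenshot": False},
    {"use_html": True, "use_screenshot": True},
    {"use_html": False, "use_screenshot": True},
]
changes = describe_changes(configs)
for line in changes:
    print(line)
```

Each row is labeled by its difference from the row before it, which is why the row ordering matters for the report.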

agentlab.analyze.inspect_results.categorize_error(row)

agentlab.analyze.inspect_results.display_report(report: DataFrame, apply_shrink_columns: bool = True, copy_to_clipboard: bool = True, rename_bool_flags: bool = True, print_only: str = None)

Display the report in a nicer-ish format.

To be able to wrap column names we need to use set_wrap_style, which returns a styled df that doesn’t behave like a normal df. We therefore encapsulate the displaying in this function.

Parameters:

report – The report to display
apply_shrink_columns – Make the column names more compact by replacing underscores with newlines
copy_to_clipboard – Copy the report to the clipboard
rename_bool_flags – Rename the boolean flags to be more compact and readable
print_only – Print only the given column

agentlab.analyze.inspect_results.error_report(df: DataFrame, max_stack_trace=10, use_log=False): Report the error message for each agent.

agentlab.analyze.inspect_results.error_report_detailed(df: DataFrame, max_stack_trace=10): Report the error message for each agent, categorizing them as server errors or retry errors.

agentlab.analyze.inspect_results.flag_report(report: DataFrame, metric: str = 'avg_reward', round_digits: int = 2)

agentlab.analyze.inspect_results.get_all_summaries(results_dir: Path, skip_hidden=True, ignore_cache=False, ignore_stale=False)

agentlab.analyze.inspect_results.get_all_task_messages(exp_dir, max_n_exp=None)

agentlab.analyze.inspect_results.get_constants_and_variables(df: DataFrame, drop_constants: bool = False): Filter out constants from the dataframe.
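
As a rough illustration of the constant/variable distinction, the sketch below separates fields that take a single value across all records from fields that vary. It uses plain dicts instead of a DataFrame; `split_constants_and_variables` and the field names are assumptions made for the example, and the return format of the real function is not documented here.

```python
def split_constants_and_variables(records):
    """Split field names into constants (one value everywhere) and variables.

    Hypothetical helper; assumes all values are hashable.
    """
    constants, variables = {}, {}
    keys = sorted({key for record in records for key in record})
    for key in keys:
        unique_values = {record.get(key) for record in records}
        if len(unique_values) == 1:
            constants[key] = next(iter(unique_values))
        else:
            # Sort by repr so mixed types still give a stable order.
            variables[key] = sorted(unique_values, key=repr)
    return constants, variables


# Made-up experiment arguments.
records = [
    {"agent.model": "model-a", "agent.temperature": 0.1, "env.max_steps": 10},
    {"agent.model": "model-b", "agent.temperature": 0.1, "env.max_steps": 10},
]
constants, variables = split_constants_and_variables(records)
```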

agentlab.analyze.inspect_results.get_sample_std_err(df, metric): Get the standard error for a binary metric.

agentlab.analyze.inspect_results.get_std_err(df, metric): Get the standard error for a binary metric.
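
The two entries above share the same description, so this page does not say how they differ. One plausible reading, which is an assumption and not confirmed by the source, is that one uses the closed-form standard error of a 0/1 metric and the other uses the sample standard deviation. Both textbook formulas are sketched below with the standard library; the function names are hypothetical.

```python
import math
import statistics


def binary_std_err(values):
    """Mean and standard error of a 0/1 metric: sqrt(p * (1 - p) / n)."""
    n = len(values)
    p = sum(values) / n
    return p, math.sqrt(p * (1 - p) / n)


def sample_std_err(values):
    """Mean and standard error from the sample standard deviation (ddof=1)."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Made-up rewards: 6 successes out of 8 episodes.
rewards = [1, 0, 1, 1, 0, 1, 1, 1]
p, se = binary_std_err(rewards)
mean, sample_se = sample_std_err(rewards)
print(f"binary: {p:.2f} +/- {se:.3f}")
print(f"sample: {mean:.2f} +/- {sample_se:.3f}")
```

The two estimates agree on the mean and differ slightly in the error term, because the sample version divides the variance by n - 1 instead of n.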

agentlab.analyze.inspect_results.get_study_summary(study_dir: Path, ignore_cache=False, ignore_stale=False, progress_fn=None, sentinel=None) → DataFrame

Get the cached study summary for the given study directory, or compute it.

The cache is based on the modified times of all the files in the study.

Parameters:

study_dir – The study directory to summarize
ignore_cache – If True, ignore the cache and recompute the summary
ignore_stale – If True, don’t verify if files have changed since the last summary was computed. This may lead to stale summaries.
progress_fn – Pass tqdm.tqdm to show progress.
sentinel – Captures internal values for unit testing.

Returns:

The study summary

Return type:

pd.DataFrame
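
The idea of a cache keyed on file modification times can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's caching code: `mtime_fingerprint`, `is_stale`, and the file names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def mtime_fingerprint(directory):
    """Map each file's relative path to its modification time in nanoseconds."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): path.stat().st_mtime_ns
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def is_stale(directory, cached_fingerprint):
    """A cached summary is stale if any file was added, removed, or modified."""
    return mtime_fingerprint(directory) != cached_fingerprint


with tempfile.TemporaryDirectory() as tmp:
    result_file = Path(tmp) / "exp_1" / "episode.json"  # made-up layout
    result_file.parent.mkdir()
    result_file.write_text("{}")

    cached = mtime_fingerprint(tmp)
    fresh_before_change = not is_stale(tmp, cached)

    # Simulate a later write by moving the modification time forward one hour.
    stat = result_file.stat()
    os.utime(result_file, ns=(stat.st_atime_ns, stat.st_mtime_ns + 3_600_000_000_000))
    stale_after_change = is_stale(tmp, cached)
```

Skipping this comparison, as ignore_stale does, saves the directory walk at the cost of possibly returning an outdated summary.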

agentlab.analyze.inspect_results.global_report(result_df: DataFrame, reduce_fn=<function summarize>, rename_index=<function <lambda>>)

Produce a report that summarizes all tasks and all episodes for each agent.

Parameters:

result_df – The result dataframe as returned by load_result_df.
reduce_fn – The function to use to reduce the sub dataframe. By default this is summarize.
rename_index – Function to rename the index. By default we remove the prefix “agent.flags.”

Returns:

The report

Return type:

pd.DataFrame

agentlab.analyze.inspect_results.load_result_df(exp_dir, progress_fn=<class 'tqdm.std.tqdm'>, set_index=True, result_df=None, index_white_list=('agent.*', ), index_black_list=('*model_url*', '*extra*', '*._*'), remove_args_suffix=True)

Load the result dataframe.

Sets the index to env.task_name and all columns that are not constant and start with agent. This makes it easy to group by and compare results. The index can be changed later using df.set_index.

Parameters:

exp_dir – Path to the experiment directory
progress_fn – Progress function to use when loading the results
set_index – If True, set the index to env.task_name and variable agent
result_df – If not None, speed up the loading process by reusing already loaded objects.
index_white_list – List of wildcard patterns to match variables that should be included in the index.
index_black_list – List of wildcard patterns to match variables that should be excluded from the index.
remove_args_suffix – If True, remove the _args suffix from the columns

Returns:

The result dataframe

Return type:

pd.DataFrame

agentlab.analyze.inspect_results.map_err_key(err_msg: str)

agentlab.analyze.inspect_results.print_errors_chronologically(df: DataFrame): Print the errors in chronological order, grouping contiguous chunks of the same error.
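
Grouping contiguous chunks of the same error is the run-length pattern that `itertools.groupby` provides. The sketch below shows the idea on a plain list; `contiguous_error_chunks` and the error names are hypothetical, and the real function reads the errors from the dataframe.

```python
from itertools import groupby


def contiguous_error_chunks(errors):
    """Collapse consecutive repeats into (error, count) pairs, keeping order."""
    return [(error, sum(1 for _ in group)) for error, group in groupby(errors)]


# Made-up error messages, already sorted by time.
errors_in_time_order = [
    "Timeout", "Timeout", "RateLimit", "Timeout", "Timeout", "Timeout",
]
chunks = contiguous_error_chunks(errors_in_time_order)
for error, count in chunks:
    print(f"{count} x {error}")
```

Because only adjacent repeats are merged, the same error can appear in several chunks, which keeps the timeline visible.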

agentlab.analyze.inspect_results.reduce_episodes(result_df: DataFrame) → DataFrame: Reduce the dataframe to a single row per episode and summarize some of the columns.

agentlab.analyze.inspect_results.report_2d(df: DataFrame, reduce_fn: callable = <function reduce_episodes>, n_row_keys=1)

Generic function to create a 2d report based on the dataframe.

The code is simple but can be a bit cryptic. It is best explained in the following 3 steps:

1) Groupby: Uses the existing multi-index to group by. Make sure to set the index to the desired keys before calling this function.
2) Reduce: Uses the reduce_fn to reduce the content of each group to a single variable, creating a 1D series indexed by its original index.
3) Unstack: Produces a 2D table in which the first n_row_keys index levels are used for the rows. The remaining levels are used for the columns.

Parameters:

df – The dataframe to reduce
reduce_fn – The function to use to reduce the sub dataframe. By default this is reduce_episodes.
n_row_keys – The number of keys to use for the rows.

Returns:

The 2D report

Return type:

pd.DataFrame
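
The groupby, reduce, and unstack steps can be traced in plain Python to make the data flow explicit. This sketch is only an analogy for what happens on a multi-indexed dataframe: `report_2d_sketch`, the record fields, and the choice of the mean as reducer are assumptions for the example.

```python
from collections import defaultdict


def report_2d_sketch(records, row_key, col_key, value_key):
    """Build a nested dict {row: {column: reduced value}} from flat records."""
    # 1) Groupby: collect the values of each (row, column) pair.
    groups = defaultdict(list)
    for record in records:
        groups[(record[row_key], record[col_key])].append(record[value_key])

    # 2) Reduce: one number per group (here, the mean).
    reduced = {key: sum(vals) / len(vals) for key, vals in groups.items()}

    # 3) Unstack: the first key indexes rows, the second indexes columns.
    table = defaultdict(dict)
    for (row, col), value in reduced.items():
        table[row][col] = value
    return dict(table)


# Made-up episodes: two tasks, two agents.
records = [
    {"task": "click", "agent": "A", "reward": 1},
    {"task": "click", "agent": "A", "reward": 0},
    {"task": "click", "agent": "B", "reward": 1},
    {"task": "type", "agent": "A", "reward": 0},
    {"task": "type", "agent": "B", "reward": 1},
]
table = report_2d_sketch(records, "task", "agent", "reward")
```

Here one key is used for the rows, which mirrors n_row_keys=1.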

agentlab.analyze.inspect_results.report_constant_and_variables(df, show_stack_traces=True)

agentlab.analyze.inspect_results.report_different_errors(sub_df): Report the different errors in the dataframe.

agentlab.analyze.inspect_results.set_index_from_variables(df: DataFrame, index_white_list=('agent.*',), index_black_list=('*model_url*', '*extra*', '*._*'), task_key='env.task_name', add_agent_and_benchmark=True)

Set the index, inplace, to env.task_name and all variables.

Introspects df to find all fields that are variable and sets the index to those fields. This makes it easy to group by and compare results. To filter undesired variables from the index, use index_white_list and index_black_list.

Parameters:

df – The dataframe to modify
index_white_list – List of wildcard patterns to match variables that should be included in the index.
index_black_list – List of wildcard patterns to match variables that should be excluded from the index.
task_key – The key to use as the first level of the index.
add_agent_and_benchmark – If True, add agent.agent_name and env.benchmark
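
To show how white-list and black-list wildcard patterns interact, the sketch below filters a list of column names with the default patterns from the signature. It assumes shell-style matching as implemented by the standard `fnmatch` module, which may not be exactly what the library uses; `select_index_keys` and the column names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_index_keys(
    keys,
    white_list=("agent.*",),
    black_list=("*model_url*", "*extra*", "*._*"),
):
    """Keep keys that match any white-list pattern and no black-list pattern."""
    return [
        key
        for key in keys
        if any(fnmatchcase(key, pattern) for pattern in white_list)
        and not any(fnmatchcase(key, pattern) for pattern in black_list)
    ]


# Made-up column names.
keys = [
    "agent.flags.use_html",  # kept: white-listed, not black-listed
    "agent.model_url",       # dropped: matches *model_url*
    "agent.extra_info",      # dropped: matches *extra*
    "agent._private",        # dropped: matches *._*
    "env.task_seed",         # dropped: not white-listed
]
selected = select_index_keys(keys)
```

The black list takes precedence in this sketch: a key that matches both lists is excluded.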

agentlab.analyze.inspect_results.set_wrap_style(df)

agentlab.analyze.inspect_results.shrink_columns(df, also_wrap_index=True): Make the column names more compact by replacing underscores with newlines
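
The renaming itself amounts to a string replacement, as this small sketch on a plain list of made-up column names shows. It illustrates the transformation only, not the function's handling of the dataframe or its index.

```python
# Made-up column names from a report.
columns = ["avg_reward", "std_err", "n_completed"]

# Each underscore becomes a line break, so wide headers wrap over several lines.
shrunk = [name.replace("_", "\n") for name in columns]
for name in shrunk:
    print(repr(name))
```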

agentlab.analyze.inspect_results.split_by_key(df: DataFrame, key): Return a dict of dataframes separated by the given key.
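
The shape of the result, one group per distinct value of the key, can be sketched with plain records. `split_records_by_key` and the field names are hypothetical, and the real function returns dataframes rather than lists.

```python
from collections import defaultdict


def split_records_by_key(records, key):
    """Return {value: [records having that value for key]}."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


# Made-up rows from two benchmarks.
records = [
    {"benchmark": "bench-a", "reward": 1},
    {"benchmark": "bench-b", "reward": 0},
    {"benchmark": "bench-a", "reward": 0},
]
by_benchmark = split_records_by_key(records, "benchmark")
```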

agentlab.analyze.inspect_results.summarize(sub_df)

agentlab.analyze.inspect_results.summarize_stats(sub_df): Summarize the stats columns.

agentlab.analyze.inspect_results.summarize_study(result_df: DataFrame) → DataFrame: Create a summary of the study. Similar to global report, but handles single agent differently.