thoth.lab package¶

Subpackages¶

thoth.lab.viz package
- Module contents

Submodules¶

thoth.lab.adviser module¶

Adviser results processing and analysis.

thoth.lab.adviser.aggregate_adviser_results(adviser_version: str, limit_results: bool = False, max_ids: int = 5) → pandas.core.frame.DataFrame[source]¶

Aggregate adviser results from jsons stored in Ceph.

Parameters

adviser_version – minimum adviser version considered for the analysis of adviser runs
limit_results – reduce the number of adviser runs ids considered to max_ids to test analysis
max_ids – maximum number of adviser runs ids considered

thoth.lab.adviser.create_adviser_heatmap(adviser_justification_df: pandas.core.frame.DataFrame, file_name: Optional[str] = None, save_result: bool = False, output_dir: Optional[str] = None)[source]¶

Create adviser justifications heatmap plot.

Parameters

adviser_justification_df – data frame as returned by `create_final_dataframe` per identifier.
file_name – file name used in the names of the saved files
save_result – if True, the resulting plots are stored in output_dir.
output_dir – output directory where plots are stored if save_result is set to True.

thoth.lab.adviser.create_adviser_results_histogram(plot_df: pandas.core.frame.DataFrame)[source]¶

Create inspection performance parameters plot in 3D.

Parameters: plot_df – data frame used to plot adviser results

thoth.lab.adviser.create_final_dataframe(adviser_dataframe: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame[source]¶

Create final dataframe with all information required for plots.

Parameters: adviser_dataframe – data frame as returned by aggregate_adviser_results method.

thoth.lab.adviser.extract_adviser_justifications(report: Dict[str, Any], adviser_dict: Dict[str, Any], ids: str) → Dict[str, Any][source]¶: Retrieve justifications from adviser results.

thoth.lab.adviser.extract_justifications_from_products(products: List[Dict[str, Any]], adviser_dict: Dict[str, Any], ids: str) → Dict[str, Any][source]¶: Extract justifications from products in adviser results.

thoth.lab.common module¶

Common methods for thoth-lab.

thoth.lab.common.aggregate_thoth_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, repo_path: Optional[pathlib.Path] = None, store_name: Optional[str] = None, is_inspection: Optional[str] = None) → Union[list, dict][source]¶

Aggregate results from jsons stored in Ceph for Thoth or locally from repo.

Parameters

limit_results – reduce the number of reports ids considered to max_ids to test analysis
max_ids – maximum number of reports ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
repo_path – required if you want to retrieve the dataset locally and is_local is set to True
store_name – ResultStorageBase type depending on Thoth data (e.g. solver, performance, adviser, etc.)
is_inspection – flag used only for InspectionResultStore as we store results in batches
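The interplay of limit_results and max_ids can be pictured with a small sketch. The helper name select_ids and the sample identifiers are hypothetical and not part of thoth.lab; reading documents from Ceph or from a local repository is left out entirely.

```python
from typing import List


def select_ids(ids: List[str], limit_results: bool = False, max_ids: int = 5) -> List[str]:
    """Return all ids, or only the first max_ids when limit_results is set."""
    if limit_results:
        return ids[:max_ids]
    return list(ids)


# Made-up report identifiers, used only for illustration
report_ids = [f"report-{n}" for n in range(8)]
subset = select_ids(report_ids, limit_results=True, max_ids=3)
print(subset)
```

Under this reading, max_ids has no effect unless limit_results is True, which matches the description of limit_results as a way to test an analysis on a reduced set.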

thoth.lab.common.aggregate_thoth_results_from_ceph(store_name: str, files: Union[dict, list], limit_results: bool = False, max_ids: int = 5) → Tuple[Union[dict, list], int][source]¶: Aggregate Thoth results from Ceph.

thoth.lab.common.extract_zip_file(file_path: pathlib.Path)[source]¶: Extract files from zip files.

thoth.lab.convert module¶

Utilities to work with package dependencies.

thoth.lab.dependency_monkey module¶

Dependency Monkey results processing and analysis.

thoth.lab.dependency_monkey.aggregate_dm_results_per_identifier(identifiers_inspection: List[str], limit_results: bool = False, max_batch_identifiers_ids: int = 5) → Union[dict, List[str]][source]¶

Aggregate inspection batch ids and specification from dm documents stored in Ceph.

Parameters

identifiers_inspection – list of identifier/s to filter inspection batch ids
limit_results – limit inspection batch ids considered to max_batch_identifiers_ids to test analysis
max_batch_identifiers_ids – maximum number of inspection batch ids considered

thoth.lab.exception module¶

Exceptions for thoth-lab methods.

exception thoth.lab.exception.NotUniqueValues[source]¶

Bases: Exception

An exception raised when the dataframe unique method cannot return results.

thoth.lab.graph module¶

Various helpers and utils for interaction with the graph database.

class thoth.lab.graph.DependencyGraph(incoming_graph_data=None, **attr)[source]¶

Bases: networkx.classes.ordered.OrderedDiGraph

Construct a dependency graph by extending nx.OrderedDiGraph.

adjlist_dict_factory¶: alias of collections.OrderedDict

static get_root(tree)[source]¶

Return root of the current graph, if any.

By default, tree topology is considered as input, so if there are multiple roots, only the first one is returned.

node_dict_factory¶: alias of collections.OrderedDict

class thoth.lab.graph.GraphQueryResult(result)[source]¶

Bases: object

Wrap results of graph database queries.

plot_bar()[source]¶: Plot histogram of results obtained.

plot_pie()[source]¶: Plot a pie of results into Jupyter notebook.

serialize()[source]¶: Serialize the output of graph query.

to_dataframe()[source]¶: Construct a pandas DataFrame from the results.

thoth.lab.graph.get_root(tree)¶

Return root of the current graph, if any.

By default, tree topology is considered as input, so if there are multiple roots, only the first one is returned.
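A minimal sketch of the root lookup described above, assuming a root is a node without incoming edges. It uses a plain ordered mapping of node to successors instead of networkx, and the name find_root is hypothetical; the actual implementation may differ.

```python
from collections import OrderedDict
from typing import Dict, Hashable, List, Optional


def find_root(successors: Dict[Hashable, List[Hashable]]) -> Optional[Hashable]:
    """Return the first node with no incoming edge, or None if there is none."""
    has_parent = {child for children in successors.values() for child in children}
    for node in successors:  # insertion order decides which root comes first
        if node not in has_parent:
            return node
    return None


tree = OrderedDict([("a", ["b", "c"]), ("b", ["d"]), ("c", []), ("d", [])])
root = find_root(tree)
print(root)
```

With several parentless nodes, only the first one in insertion order is returned, which mirrors the "only the first one is returned" behaviour stated above.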

thoth.lab.inspection module¶

Inspection results processing and analysis.

thoth.lab.inspection.aggregate_inspection_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, inspection_repo_path: pathlib.Path = PosixPath('performance')) → list[source]¶

Aggregate inspection results from jsons stored in Ceph or locally from performance repo.

Parameters

limit_results – reduce the number of inspection reports ids considered to max_ids to test analysis
max_ids – maximum number of inspection reports ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
inspection_repo_path – required to retrieve the performance dataset locally and is_local is set to True

thoth.lab.inspection.aggregate_inspection_results_per_identifier(inspection_ids: List[str], identifier_inspection: List[str], inspection_batch_data: Dict[str, dict]) → dict[source]¶

Aggregate inspection results per identifier from inspection documents stored in Ceph.

Parameters

inspection_ids – list of inspection ids
identifier_inspection – list of identifier/s to filter inspection ids
inspection_batch_data – info to be added to each inspection (e.g. specification)

thoth.lab.inspection.columns_to_analyze(df: pandas.core.frame.DataFrame, low: int = 0, display_clusters: bool = False, cluster_by_hue: bool = False) → pandas.core.frame.DataFrame[source]¶

Print all columns within dataframe and count of unique column values within limit.

Parameters

df – data frame to analyze as returned by `process_inspection_results`
low – the lower limit (0 if not specified) of distinct value counts
display_clusters – if true, displays grouped counts of parameter and parameter sort_values
cluster_by_hue – if true, displays distribution of parameters to analyze sorted by hues

thoth.lab.inspection.concatenated_df(dfs: List[pandas.core.frame.DataFrame], column: str)[source]¶

Reorganize dataframe to show the distribution of jobs in a category across different subsets of data.

Parameters

dfs – list of inspection result dataframes which can be different datasets or subset of datasets
column – column name or category for grouping to see the distribution of results

thoth.lab.inspection.create_duration_box(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, **kwargs)[source]¶: Create duration Box plot.

thoth.lab.inspection.create_duration_dataframe(inspection_df: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame[source]¶: Compute statistics and duration DataFrame.

thoth.lab.inspection.create_duration_histogram(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, bins: Optional[int] = None, **kwargs)[source]¶: Create duration Histogram plot.

thoth.lab.inspection.create_duration_scatter(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, **kwargs)[source]¶: Create duration Scatter plot.

thoth.lab.inspection.create_duration_scatter_with_bounds(data: pandas.core.frame.DataFrame, col: str, index: Optional[Union[list, pandas.core.indexes.base.Index, pandas.core.indexes.range.RangeIndex]] = None, **kwargs)[source]¶: Create duration Scatter plot with upper and lower bounds.

thoth.lab.inspection.create_filtered_df(df: pandas.core.frame.DataFrame, pi_name: Optional[str] = None, pi_component: Optional[str] = None, runtime_environment: Optional[str] = None, packages: Optional[List[Tuple[str, str, str]]] = None) → pandas.core.frame.DataFrame[source]¶: Create dataframe using the filters selected for plots.

thoth.lab.inspection.create_final_dataframe(packages_versions: dict, python_packages_dataframe: pandas.core.frame.DataFrame, inspection_df: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame[source]¶

Create final dataframe with all information required for plots.

Parameters

packages_versions – dict as returned by create_python_package_df method.
python_packages_dataframe – data frame as returned by create_python_package_df method.
inspection_df – data frame containing data of inspections results.

thoth.lab.inspection.create_inspection_2d_plot(plot_df: pandas.core.frame.DataFrame, quantity: str, components: List[str], color_scales: List[str], identifiers_inspections: List[str], have_annotations: bool = False)[source]¶

Create inspection performance parameters plot in 2D.

Parameters: plot_df – data frame used to plot inspection results

thoth.lab.inspection.create_inspection_3d_plot(plot_df: pandas.core.frame.DataFrame, quantity: str, identifiers_inspections: List[str])[source]¶

Create inspection performance parameters plot in 3D.

Parameters: plot_df – data frame used to plot inspection results

thoth.lab.inspection.create_inspection_analysis_plots(inspection_df: pandas.core.frame.DataFrame)[source]¶

Create inspection analysis plots for the inspection pd.Dataframe.

Parameters: inspection_df – data frame as returned by `process_inspection_results` for a specific inspection identifier

thoth.lab.inspection.create_inspection_dataframes(inspection_results_dict: dict, duration_info: bool = False) → dict[source]¶

Create dictionary with data frame as returned by `process_inspection_results` for each inspection identifier.

Parameters: inspection_results_dict – dictionary containing inspection results per inspection identifier.

thoth.lab.inspection.create_inspection_parameters_dataframes(parameters: List[str], inspection_df_dict: dict, component: Optional[str] = None) → Dict[str, pandas.core.frame.DataFrame][source]¶

Create pd.DataFrame of selected parameters from inspections results to be used for statistics and error analysis.

It also outputs batches and parameters map that is necessary for plots.

Parameters

parameters – inspection parameters used in the analysis
inspection_df_dict – dictionary with data frame as returned by `process_inspection_results` per identifier.
component – PI component name (e.g. tensorflow, pytorch).

thoth.lab.inspection.create_inspection_time_dataframe()[source]¶: Create pd.Dataframe of time of inspections for build and job.

thoth.lab.inspection.create_multiple_violin_plot(data: pandas.core.frame.DataFrame, quantity: str, x_label: str = '', y_label: str = '', save_result: bool = False, project_folder: str = '', folder_name: str = '', linewidth: int = 1)[source]¶: Create violin plot.

thoth.lab.inspection.create_plot_from_df(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, title_plot: str = ' ', x_label: str = ' ', y_label: str = ' ', static: str = True, save_result: bool = False, project_folder: str = '', folder_name: str = '', scatter: bool = False)[source]¶: Create plot using two columns of the DataFrame.

thoth.lab.inspection.create_plot_multiple_batches(data: pandas.core.frame.DataFrame, quantity: str, plot_type: str = 'box', x_label: str = '', y_label: str = '', static: str = True, save_result: bool = False, project_folder: str = '', folder_name: str = '')[source]¶: Create (Histogram or Box) plot using several columns of the dataframe(static as default).

thoth.lab.inspection.create_python_package_df(inspection_df: pandas.core.frame.DataFrame) → Union[pandas.core.frame.DataFrame, dict][source]¶: Create DataFrame with only python packages present in software stacks.

thoth.lab.inspection.create_scatter_and_correlation(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, title_scatter: str = 'Scatter plot')[source]¶: Create Scatter plot and evaluate correlation coefficients.

thoth.lab.inspection.create_scatter_plots_for_multiple_batches(inspection_df_dict: Dict[str, pandas.core.frame.DataFrame], list_batches: List[str], columns: Optional[Union[str, List[str]]] = None, title_scatter: str = ' ', x_label: str = ' ', y_label: str = ' ')[source]¶

Create Scatter plots for multiple batches.

Parameters

inspection_df_dict – dictionary with data frame as returned by `process_inspection_results` per identifier
list_batches – list of batches to be used for correlation analysis
columns – parameters to be considered, taken from data frame as returned by `process_inspection_results`
title_scatter – scatter plot name
x_label – x label name
y_label – y label name

thoth.lab.inspection.dataframe_statistics(inspection_df: pandas.core.frame.DataFrame, plot_title: str)[source]¶

Output a data frame with relevant statistics on job duration, build duration and time elapsed.

Parameters

inspection_df – data frame to analyze as returned by `process_inspection_results` (duration [ms])
plot_title – title of fit plot

thoth.lab.inspection.display_jobs_by_subcategories(df: pandas.core.frame.DataFrame)[source]¶

Create dataframe with job counts for each subcategory for every column in the data frame.

Parameters: df – dataframe with columns of unique value counts as returned by columns_to_analyze

thoth.lab.inspection.duration_plots(df: pandas.core.frame.DataFrame)[source]¶

Create plots for job and build duration, elapsed time, and lead time.

Parameters: df – data frame with duration information as returned by process_inspection_results

thoth.lab.inspection.evaluate_inspection_statistics(parameters: list, inspection_df_dict: dict, component: Optional[str] = None) → dict[source]¶

Aggregate statistical quantities per inspection parameter for inspection batches.

Parameters

parameters – inspection parameters used in the analysis
inspection_df_dict – dictionary with data frame as returned by `process_inspection_results` per identifier

thoth.lab.inspection.evaluate_statistics(inspection_df: pandas.core.frame.DataFrame, inspection_parameter: str) → Dict[source]¶: Evaluate statistical quantities of a specific parameter of inspection results.
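As an illustration of evaluating statistical quantities for one parameter, the following sketch summarizes a list of values with the standard library. The function name summarize and the chosen set of quantities are assumptions; the quantities actually returned by evaluate_statistics are not listed on this page.

```python
import statistics
from typing import Dict, List


def summarize(values: List[float]) -> Dict[str, float]:
    """Compute a few common statistical quantities for one parameter."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Made-up elapsed times for a single inspection parameter
elapsed = [10.0, 12.0, 14.0, 16.0]
stats = summarize(elapsed)
print(stats)
```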

thoth.lab.inspection.evaluate_statistics_on_inspection_df(df: pandas.core.frame.DataFrame, column_names: List[str]) → pandas.core.frame.DataFrame[source]¶: Evaluate statistics on performance values selected from Dataframe columns.

thoth.lab.inspection.extract_keys_from_dataframe(df: pandas.core.frame.DataFrame, key: str)[source]¶: Filter the specific dataframe created for a certain key, combination of keys or for a tree depth.

thoth.lab.inspection.extract_specification(inspection_batch_result: Dict[str, Any], inspection_id: str)[source]¶: Extract specification info for the inspection.

thoth.lab.inspection.extract_structure_json(input_json: dict, upper_key: str, depth: int, json_structure)[source]¶

Convert a json file structure into a list with rows showing tree depths, keys and values.

Parameters

input_json – inspection result json taken from Ceph
upper_key – key starting point to recursively traverse all tree
depth – depth in the tree
json_structure – recurrent list to store results while traversing the tree
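The recursive traversal can be sketched as follows. The row layout ([depth, key path, value]), the "." separator and the name walk_json are assumptions made for illustration, and the sketch only descends into nested dictionaries; the real function may record more or different information.

```python
from typing import Any, List


def walk_json(node: Any, upper_key: str, depth: int, rows: List[list]) -> List[list]:
    """Append one [depth, key path, value] row per key reachable from node."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{upper_key}.{key}" if upper_key else key
            # Nested containers are recorded by type name rather than by content
            shown = type(value).__name__ if isinstance(value, (dict, list)) else value
            rows.append([depth, path, shown])
            walk_json(value, path, depth + 1, rows)
    return rows


sample = {"job": {"duration": 12, "hw": {"cpu": "x86"}}, "status": "ok"}
rows = walk_json(sample, "", 0, [])
for row in rows:
    print(row)
```

Passing the accumulator list down the recursion corresponds to the json_structure parameter described above, which stores results while the tree is traversed.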

thoth.lab.inspection.filter_df(df, *args)[source]¶: Filter Dataframe.

thoth.lab.inspection.filter_document_ids(inspection_store, inspection_identifiers: List[str]) → Dict[str, List][source]¶

Filter inspection document ids list according to the inspection identifiers selected.

Parameters: inspection_identifiers – list of identifier/s to filter inspection ids

thoth.lab.inspection.filter_inspection_ids(inspection_identifiers: List[str]) → dict[source]¶

Filter inspection ids list according to the inspection identifier selected.

Parameters: inspection_identifiers – list of identifier/s to filter inspection ids
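A minimal sketch of such filtering, assuming that each inspection id contains its identifier as a substring. That naming scheme, the sample ids and the helper name group_ids_by_identifier are assumptions and are not confirmed by this page.

```python
from typing import Dict, List


def group_ids_by_identifier(inspection_ids: List[str], identifiers: List[str]) -> Dict[str, List[str]]:
    """Map each identifier to the inspection ids whose name contains it."""
    return {
        identifier: [i for i in inspection_ids if identifier in i]
        for identifier in identifiers
    }


# Made-up inspection ids
ids = [
    "inspection-tf-conv2d-0001",
    "inspection-tf-matmul-0002",
    "inspection-pt-conv2d-0003",
]
grouped = group_ids_by_identifier(ids, ["tf", "pt"])
print(grouped)
```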

thoth.lab.inspection.make_subplots(data: pandas.core.frame.DataFrame, columns: Optional[List[str]] = None, *, kind: str = 'box', **kwargs)[source]¶: Make subplots and arrange them in an optimized grid layout.

thoth.lab.inspection.map_column_to_feature_class(column_name: str)[source]¶

Map a column in the original dataframe to a feature class.

Parameters: column_name – column name in the inspection_df dataframe obtained by process_inspection_results with no columns dropped (drop=False)

thoth.lab.inspection.plot_distribution_of_jobs_combined_categories(df_hardware_category: pandas.core.frame.DataFrame, df_duration: pandas.core.frame.DataFrame, df_analyze: pandas.core.frame.DataFrame)[source]¶

Plot the job duration distribution for each unique hardware combination/configuration of data.

Parameters

df_hardware_category – dataframe of parameters to analyze grouped by distinct rows
df_duration – dataframe with duration information as returned by process_inspection_results
df_analyze – dataframe of parameters that show variation across the clusters

thoth.lab.inspection.plot_interpolated_statistics_of_inspection_parameters(statistical_results_dict: dict, identifier_inspection_list: dict, inspection_parameters: List[str], colour_list: List[str], statistical_quantities: List[str], title_plot: Optional[str] = None, title_xlabel: Optional[str] = None, title_ylabel: Optional[str] = None, save_result: bool = False, project_folder: Optional[str] = None, folder_name: Optional[str] = None, componet: Optional[str] = None)[source]¶: Plot interpolated statistical quantity/ies of inspection parameter/s from different inspection batches.

thoth.lab.inspection.plot_subcategories_by_hues(df_cat: pandas.core.frame.DataFrame, df: pandas.core.frame.DataFrame, column)[source]¶

Create scatter plots with parameter categories separated by hues.

Parameters

df_cat – filtered dataframe with columns to analyze as returned by columns_to_analyze
df – data frame with duration information as returned by process_inspection_results
column – job duration/build duration column from ‘df’

thoth.lab.inspection.process_empty_or_mutable_parameters(inspection_df: pandas.core.frame.DataFrame)[source]¶

Process empty or mutable parameters in dataframe.

These values will not work with further processing using the groupby function. Prints the unique value count of all columns that are unhashable (all such columns are constant). Drops these columns and returns a new dataframe.

Parameters: inspection_df – data frame as returned by process_inspection_results with no columns dropped (drop=False)
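Why unhashable values get in the way of grouping can be shown without pandas. The sketch below treats a list of dictionaries as a table, finds the columns holding unhashable values such as lists or dicts, and drops them. All names and the sample rows are hypothetical.

```python
from typing import Any, Dict, List


def is_hashable(value: Any) -> bool:
    """Return True if value can be hashed, i.e. used as a grouping key."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


def unhashable_columns(rows: List[Dict[str, Any]]) -> List[str]:
    """Return names of columns holding at least one unhashable value."""
    return [
        name for name in rows[0]
        if any(not is_hashable(row[name]) for row in rows)
    ]


def drop_columns(rows: List[Dict[str, Any]], names: List[str]) -> List[Dict[str, Any]]:
    """Return new rows without the given columns; the input is left untouched."""
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


rows = [
    {"cpu": "x86", "args": [], "env": {}},
    {"cpu": "arm", "args": [], "env": {}},
]
bad = unhashable_columns(rows)
cleaned = drop_columns(rows, bad)
print(bad, cleaned)
```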

thoth.lab.inspection.process_inspection_results(inspection_results: List[dict], exclude: Optional[Union[list, set]] = None, apply: Optional[List[Tuple]] = None, drop: bool = True, verbose: bool = False, duration_info: bool = False) → pandas.core.frame.DataFrame[source]¶: Process inspection result into pd.DataFrame.

thoth.lab.inspection.query_inspection_dataframe(inspection_df: pandas.core.frame.DataFrame, *args, **kwargs) → pandas.core.frame.DataFrame[source]¶: Wrapper around the _.query method that always includes duration columns in the filter expression.

thoth.lab.inspection.show_categories(inspection_df: pandas.core.frame.DataFrame)[source]¶: List categories in the given inspection pd.DataFrame.

thoth.lab.inspection.show_inspection_inputs(filtered_inspection_ids: List[str], inspection_batch_ids: List[str], filtered_inspection_batch_ids: List[str])[source]¶

Show inspections inputs for the analysis.

Parameters

filtered_inspection_ids – list of inspection ids after filtering
inspection_batch_ids – list of inspection batch ids
filtered_inspection_batch_ids – list of inspection batch ids after filtering

thoth.lab.inspection.show_unique_value_count_by_feature_class(processed_df: pandas.core.frame.DataFrame)[source]¶

Show unique count values per feature/class.

Show results per feature/class that are subdivided in subclasses that map to it.

Parameters: processed_df – processed dataframe as returned by the process_empty_or_mutable_parameters

thoth.lab.inspection.summary_bar_plot(df: pandas.core.frame.DataFrame, df_categories: pandas.core.frame.DataFrame, clusters: List[pandas.core.frame.DataFrame])[source]¶

Create trace stacked plot scaled by total jobs of each parameter within clusters (if any).

Parameters

df – data frame with duration information as returned by process_inspection_results
df_categories – filtered dataframe with columns to analyze as returned by columns_to_analyze
clusters – list of subset dataframes with the last value in list being the entire data set

thoth.lab.inspection.summary_trace_plot(df: pandas.core.frame.DataFrame, df_categories: pandas.core.frame.DataFrame, dfs: Optional[List[pandas.core.frame.DataFrame]] = None)[source]¶

Create trace plot scaled by percentage of compositions of each parameter separated by hues.

Parameters

df – data frame with duration information as returned by process_inspection_results
df_categories – filtered dataframe with columns to analyze as returned by columns_to_analyze
dfs – dataframes of clustered data (if any) appended to the dataframe of the entire dataset (i.e. [df_left_cluster, df_right_cluster, df_duration])

thoth.lab.inspection_report module¶

Inspection report generation and visualization.

thoth.lab.inspection_report.create_df_report(df: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame[source]¶: Show unique values for each column in the dataframe.

thoth.lab.inspection_report.create_dfs_inspection_classes(inspection_df: pandas.core.frame.DataFrame) → dict[source]¶: Create all inspection dataframes per class with unique values and complete values.

thoth.lab.inspection_report.multi_table(table_dict)[source]¶: Accept a list of IpyTable objects and return a table which contains each IpyTable in a cell.

thoth.lab.security module¶

Security results processing and analysis.

class thoth.lab.security.SecurityIndicators[source]¶

Bases: object

Class of methods used to analyze Security Indicators (SI).

static add_release_date(metadata_df: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame[source]¶: Add release date to metadata.

static aggregate_security_indicator_bandit_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, security_indicator_bandit_repo_path: pathlib.Path = PosixPath('security/si-bandit')) → list[source]¶

Aggregate si_bandit results from jsons stored in Ceph or locally from si_bandit repo.

Parameters

limit_results – reduce the number of si_bandit reports ids considered to max_ids to test analysis
max_ids – maximum number of si_bandit reports ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
security_indicator_bandit_repo_path – path used to retrieve the si_bandit dataset locally when is_local is set to True

static aggregate_security_indicator_cloc_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, security_indicator_cloc_repo_path: pathlib.Path = PosixPath('security/si-cloc')) → list[source]¶

Aggregate si_cloc results from jsons stored in Ceph or locally from si_cloc repo.

Parameters

limit_results – reduce the number of si_cloc reports ids considered to max_ids to test analysis
max_ids – maximum number of si_cloc reports ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
security_indicator_cloc_repo_path – path used to retrieve the si_cloc dataset locally when is_local is set to True

static create_package_releases_vulnerabilities_trend(si_bandit_df: pandas.core.frame.DataFrame, package_name: str, package_index: str, security_infos: Optional[List[str]] = None, show_vulnerability_data: bool = False)[source]¶

Plot vulnerabilities trend for a Python package from a certain index.

Parameters

si_bandit_df – pandas dataframe given by the ‘create_si_bandit_final_dataframe’ method with use_external_source_data set to True
package_name – Python package name filter
package_index – Python package index filter
security_infos – list of info to be visualized in the plot
show_vulnerability_data – show all data regarding vulnerabilities if set to True

create_security_confidence_dataframe(si_bandit_report: dict, filters_files: Optional[List[str]] = None) → Tuple[pandas.core.frame.DataFrame, Dict[str, int]][source]¶: Create Security/Confidence dataframe for si-bandit report.

create_si_bandit_final_dataframe(si_bandit_reports: List[dict], use_external_source_data: bool = False, filters_files: Optional[List[str]] = None) → pandas.core.frame.DataFrame[source]¶: Create final si-bandit dataframe.

create_si_bandit_metadata_dataframe(si_bandit_report: dict) → pandas.core.frame.DataFrame[source]¶: Create si-bandit report metadata dataframe.

create_si_cloc_final_dataframe(si_cloc_reports: list) → pandas.core.frame.DataFrame[source]¶: Create final si-cloc dataframe.

create_si_cloc_metadata_dataframe(si_cloc_report: dict) → pandas.core.frame.DataFrame[source]¶: Create si-cloc report metadata dataframe.

create_si_cloc_results_dataframe(si_cloc_report: dict) → pandas.core.frame.DataFrame[source]¶: Create si-cloc report results dataframe.

static create_vulnerabilities_plot(security_df: pandas.core.frame.DataFrame, security_infos: Optional[List[str]] = None, show_vulnerability_data: bool = False) → None[source]¶

Plot vulnerabilities trend for a Python package from a certain index.

Parameters

security_df – pandas dataframe given by the ‘create_si_bandit_final_dataframe’ method with use_external_source_data set to True
security_infos – list of info to be visualized in the plot
show_vulnerability_data – show all data regarding vulnerabilities if set to True

static define_si_scores()[source]¶

Define security scores from si bandit outputs.

WARNING: It depends on all data considered.

static extract_data_from_si_bandit_metadata(report_metadata: dict) → dict[source]¶: Extract data from si-bandit report metadata.

static extract_data_from_si_cloc_metadata(report_metadata: dict) → dict[source]¶: Extract data from si-cloc report metadata.

static extract_severity_confidence_info(si_bandit_report: dict, filters_files: Optional[List[str]] = None) → Tuple[List[dict], Dict[str, int]][source]¶: Extract severity and confidence from result metrics.

static produce_si_bandit_report_summary_dataframe(metadata_df: pandas.core.frame.DataFrame, si_bandit_sec_conf_df: pandas.core.frame.DataFrame, summary_files: Dict[str, int]) → pandas.core.frame.DataFrame[source]¶: Create si-bandit report summary dataframe.

static produce_si_cloc_report_summary_dataframe(metadata_df: pandas.core.frame.DataFrame, cloc_results_df: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame[source]¶: Create si-cloc report summary dataframe.

thoth.lab.solver module¶

Solver results processing and analysis.

thoth.lab.solver.aggregate_solver_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, solver_repo_path: pathlib.Path = PosixPath('solver')) → list[source]¶

Aggregate solver results from jsons stored in Ceph or locally from solver repo.

Parameters

limit_results – reduce the number of solver reports ids considered to max_ids to test analysis
max_ids – maximum number of solver reports ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
solver_repo_path – required if you want to retrieve the solver dataset locally and is_local is set to True

thoth.lab.solver.construct_solver_from_metadata(solver_report_metadata: dict) → str[source]¶: Construct solver from solver report metadata.

thoth.lab.solver.extract_data_from_solver_metadata(solver_report_metadata: dict) → dict[source]¶: Extract data from solver report metadata.

thoth.lab.solver.extract_errors_from_solver_result(solver_report_result_errors: list) → list[source]¶: Extract all errors from solver report (if any).

thoth.lab.solver.extract_tree_from_solver_result(solver_report_result: dict) → list[source]¶: Extract data from solver report result.

thoth.lab.underscore module¶

Pandas common operations and utilities.

thoth.lab.utils module¶

Various utilities for notebooks.

thoth.lab.utils.display_page(location: str, verify: bool = True, no_obtain_location: bool = False, width: int = 980, height: int = 900)[source]¶: Display the given page in notebook as iframe.

thoth.lab.utils.get(obj, attr, *, default: Any = <object object>)[source]¶: Combine both getattr and dict.get into universal get.
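The idea of one lookup that works for both mappings and objects can be sketched as below. The name universal_get, the private sentinel and the choice to raise when no default is given are assumptions; the library function may behave differently in edge cases.

```python
from types import SimpleNamespace
from typing import Any

_MISSING = object()  # sentinel meaning "no default supplied"


def universal_get(obj: Any, attr: str, *, default: Any = _MISSING) -> Any:
    """Look up attr as a key when obj is a dict, otherwise as an attribute."""
    if isinstance(obj, dict):
        if attr in obj:
            return obj[attr]
        if default is _MISSING:
            raise KeyError(attr)
        return default
    if default is _MISSING:
        return getattr(obj, attr)
    return getattr(obj, attr, default)


record = {"name": "tensorflow"}
package = SimpleNamespace(name="tensorflow", version="2.1.0")
print(universal_get(record, "name"), universal_get(package, "version"))
```

A sentinel object is used as the default so that None remains a legitimate default value, which appears to be the reason the signature above shows `<object object>`.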

thoth.lab.utils.get_column_group(df: pandas.core.frame.DataFrame, columns: Optional[Union[List[Union[str, int]], pandas.core.indexes.base.Index]] = None, label: Optional[str] = None) → pandas.core.series.Series[source]¶: Group columns of the DataFrame into a single column group.

thoth.lab.utils.get_index_group(df: pandas.core.frame.DataFrame, names: Optional[List[Union[str, int]]] = None, label: Optional[str] = None) → pandas.core.series.Series[source]¶: Group multiple index levels into single index group.

thoth.lab.utils.group_columns(df: pandas.core.frame.DataFrame, columns: Optional[Union[List[Union[str, int]], pandas.core.indexes.base.Index]] = None, label: Optional[str] = None, inplace: bool = False) → pandas.core.series.Series[source]¶: Group columns of the DataFrame into a single column group and set it to the DataFrame.

thoth.lab.utils.group_index(df: pandas.core.frame.DataFrame, names: Optional[List[Union[str, int]]] = None, label: Optional[str] = None, inplace: bool = False) → pandas.core.frame.DataFrame[source]¶: Group multiple index levels into single index group and set it as index to the DataFrame.

thoth.lab.utils.has(obj, attr)[source]¶: Combine both hasattr and in into universal has.

thoth.lab.utils.highlight(df: pandas.core.frame.DataFrame, content: Optional[str] = None, column_class: Optional[str] = None, colours: Optional[Union[list, str]] = None)[source]¶

Highlight rows of content column of a given DataFrame.

Highlight can be based on column_class or custom colours provided.

thoth.lab.utils.obtain_location(name: str, verify: bool = False, only_netloc: bool = False) → str[source]¶

Obtain the location of a service based on its name in Red Hat’s internal network.

This function checks the redirect of a URL registered in Red Hat’s internal network, which avoids exposing internal URLs. It queries https://url.corp.redhat.com for redirects.

>>> obtain_location('thoth-sbu', verify=False)

thoth.lab.utils.packages_info(thoth_packages: bool = True) → pandas.core.frame.DataFrame[source]¶: Display information about versions of packages available in the installation.

thoth.lab.utils.resolve_query(query: str, context: Optional[pandas.core.frame.DataFrame] = None, resolvers: Optional[tuple] = None, engine: Optional[str] = None, parser: str = 'pandas')[source]¶: Resolve query in the given context.

thoth.lab.utils.rget(obj: Any, attr: str, default: Any = <object object>) → Any¶

Recursively retrieve nested attributes of an object.

Parameters

f – callable, function to be used as getattr
obj – Any, object to check
attr – str, attribute to find declared by dot notation accessor
default – default attribute, similar to getattr’s default

Returns

Any, retrieved attribute
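Dot-notation access to nested attributes can be sketched as follows. The name rget_sketch and the sample object are hypothetical, and the sketch handles attributes only, with plain getattr standing in for the pluggable callable mentioned in the parameter list.

```python
from types import SimpleNamespace
from typing import Any

_MISSING = object()  # sentinel meaning "no default supplied"


def rget_sketch(obj: Any, attr: str, default: Any = _MISSING) -> Any:
    """Follow a dotted path such as "a.b.c" one attribute at a time."""
    current = obj
    for name in attr.split("."):
        try:
            current = getattr(current, name)
        except AttributeError:
            if default is _MISSING:
                raise
            return default
    return current


config = SimpleNamespace(runtime=SimpleNamespace(os=SimpleNamespace(name="fedora")))
print(rget_sketch(config, "runtime.os.name"))
```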

thoth.lab.utils.rgetattr(obj: Any, attr: str, default: Any = <object object>) → Any¶

Recursively retrieve nested attributes of an object.

Parameters

f – callable, function to be used as getattr
obj – Any, object to check
attr – str, attribute to find declared by dot notation accessor
default – default attribute, similar to getattr’s default

Returns

Any, retrieved attribute

thoth.lab.utils.rhas(obj: Any, attr: str) → bool¶

Recursively check nested attributes of an object.

Parameters

fhas – callable, function to be used as hasattr
fget – callable, function to be used as getattr
obj – Any, object to check
attr – str, attribute to find declared by dot notation accessor

Returns

bool, whether the object has the given attribute
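The fhas and fget callables listed above suggest a generic helper that is specialised for attributes or for mapping keys. The sketch below shows one way this could look; the name recursive_has and both specialisations are assumptions.

```python
from types import SimpleNamespace
from typing import Any, Callable


def recursive_has(
    fhas: Callable[[Any, str], bool],
    fget: Callable[[Any, str], Any],
    obj: Any,
    attr: str,
) -> bool:
    """Check every step of a dotted path with fhas, descending with fget."""
    current = obj
    for name in attr.split("."):
        if not fhas(current, name):
            return False
        current = fget(current, name)
    return True


# Attribute flavour: hasattr/getattr
ns = SimpleNamespace(a=SimpleNamespace(b=1))
attr_found = recursive_has(hasattr, getattr, ns, "a.b")

# Mapping flavour: key membership and item access
nested = {"a": {"b": 1}}
key_found = recursive_has(
    lambda o, k: isinstance(o, dict) and k in o,
    lambda o, k: o[k],
    nested,
    "a.b",
)
print(attr_found, key_found)
```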

thoth.lab.utils.rhasattr(obj: Any, attr: str) → bool¶

Recursively check nested attributes of an object.

Parameters

fhas – callable, function to be used as hasattr
fget – callable, function to be used as getattr
obj – Any, object to check
attr – str, attribute to find declared by dot notation accessor

Returns

bool, whether the object has the given attribute

thoth.lab.utils.scale_colour_continuous(arr: Iterable, colour_palette=None, n_colours: int = 10, norm=False)[source]¶

Scale given arrays into colour array by specific palette.

The default number of colours is 10, which translates to dividing an array on a scale from 0 to 1 into 0.1 colour bins.
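The binning arithmetic can be illustrated with a short sketch that maps values already scaled to the range 0 to 1 onto bin indices. Looking up actual colours in a palette is left out, and the name to_colour_bins as well as the handling of the upper edge are assumptions.

```python
from typing import Iterable, List


def to_colour_bins(arr: Iterable[float], n_colours: int = 10) -> List[int]:
    """Map values in [0, 1] to bin indices 0 .. n_colours - 1."""
    bins = []
    for value in arr:
        index = int(value * n_colours)
        bins.append(min(index, n_colours - 1))  # 1.0 falls into the last bin
    return bins


indices = to_colour_bins([0.0, 0.05, 0.15, 0.55, 0.99, 1.0])
print(indices)
```

With n_colours set to 10 each bin is 0.1 wide, so 0.05 lands in bin 0 and 0.55 in bin 5.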

Module contents¶

Routines for experiments in Thoth not only for Jupyter notebooks.

class thoth.lab.GraphQueryResult(result)[source]¶

Bases: object

Wrap results of graph database queries.

plot_bar()[source]¶: Plot histogram of results obtained.

plot_pie()[source]¶: Plot a pie of results into Jupyter notebook.

serialize()[source]¶: Serialize the output of graph query.

to_dataframe()[source]¶: Construct a pandas DataFrame from the results.

thoth.lab.obtain_location(name: str, verify: bool = False, only_netloc: bool = False) → str[source]¶

Obtain the location of a service based on its name in Red Hat’s internal network.

This function checks the redirect of a URL registered in Red Hat’s internal network, which avoids exposing internal URLs. It queries https://url.corp.redhat.com for redirects.

>>> obtain_location('thoth-sbu', verify=False)

thoth.lab.packages_info(thoth_packages: bool = True) → pandas.core.frame.DataFrame[source]¶: Display information about versions of packages available in the installation.
