thoth.lab package¶
Subpackages¶
Submodules¶
thoth.lab.adviser module¶
Adviser results processing and analysis.
- thoth.lab.adviser.aggregate_adviser_results(adviser_version: str, limit_results: bool = False, max_ids: int = 5) pandas.core.frame.DataFrame [source]¶
Aggregate adviser results from jsons stored in Ceph.
- Parameters
adviser_version – minimum adviser version considered for the analysis of adviser runs
limit_results – limit the number of adviser run ids considered to max_ids, for testing the analysis
max_ids – maximum number of adviser run ids considered
- thoth.lab.adviser.create_adviser_heatmap(adviser_justification_df: pandas.core.frame.DataFrame, file_name: Optional[str] = None, save_result: bool = False, output_dir: Optional[str] = None)[source]¶
Create adviser justifications heatmap plot.
- Parameters
adviser_justification_df – data frame as returned by `create_final_dataframe` per identifier.
file_name – file name used when naming the saved files
save_result – if True, the resulting plots are stored in output_dir.
output_dir – output directory where plots are stored if save_result is set to True.
- thoth.lab.adviser.create_adviser_results_histogram(plot_df: pandas.core.frame.DataFrame)[source]¶
Create adviser results histogram.
- Parameters
plot_df – data frame used to plot adviser results
- thoth.lab.adviser.create_final_dataframe(adviser_dataframe: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]¶
Create final dataframe with all information required for plots.
- Parameters
adviser_dataframe – data frame as returned by aggregate_adviser_results method.
thoth.lab.common module¶
Common methods for thoth-lab.
- thoth.lab.common.aggregate_thoth_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, repo_path: Optional[pathlib.Path] = None, store_name: Optional[str] = None, is_inspection: Optional[str] = None) Union[list, dict] [source]¶
Aggregate results from jsons stored in Ceph for Thoth or locally from repo.
- Parameters
limit_results – limit the number of report ids considered to max_ids, for testing the analysis
max_ids – maximum number of report ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
repo_path – required to retrieve the dataset locally, when is_local is set to True
store_name – ResultStorageBase type depending on Thoth data (e.g. solver, performance, adviser, etc.)
is_inspection – flag used only for InspectionResultStore as we store results in batches
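The interplay between limit_results and max_ids can be pictured with a minimal sketch. The helper name select_document_ids and the sample ids are hypothetical and are not part of thoth.lab; the sketch only assumes that, when limit_results is True, at most max_ids ids are kept.

```python
from itertools import islice
from typing import Iterable, List


def select_document_ids(
    document_ids: Iterable[str], limit_results: bool = False, max_ids: int = 5
) -> List[str]:
    """Return all ids, or only the first max_ids when limit_results is set.

    Hypothetical helper for illustration; not the thoth.lab implementation.
    """
    if limit_results:
        return list(islice(document_ids, max_ids))
    return list(document_ids)


ids = [f"report-{i}" for i in range(8)]  # made-up ids for illustration
limited = select_document_ids(ids, limit_results=True, max_ids=3)
everything = select_document_ids(ids)
```

Capping the ids before any document is loaded is what would make such a switch useful for quick test runs of an analysis.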
- thoth.lab.common.aggregate_thoth_results_from_ceph(store_name: str, files: Union[dict, list], limit_results: bool = False, max_ids: int = 5) Tuple[Union[dict, list], int] [source]¶
Aggregate Thoth results from Ceph.
- thoth.lab.common.extract_zip_file(file_path: pathlib.Path)[source]¶
Extract files from zip files.
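A minimal sketch of zip extraction with the standard library follows. The function name extract_zip_sketch is hypothetical, and extracting into the directory that contains the archive is an assumption; the documentation above does not state where extract_zip_file places the files.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def extract_zip_sketch(file_path: Path) -> List[str]:
    """Extract every member next to the archive and return the member names.

    Assumption: the target directory is the archive's parent directory.
    """
    with zipfile.ZipFile(file_path, "r") as archive:
        archive.extractall(file_path.parent)
        return archive.namelist()


# Build a throwaway archive so the sketch can run end to end.
workdir = Path(tempfile.mkdtemp())
archive_path = workdir / "results.zip"
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("report.json", '{"status": "ok"}')

extracted = extract_zip_sketch(archive_path)
```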
thoth.lab.convert module¶
Utilities to work with package dependencies.
thoth.lab.dependency_monkey module¶
Dependency Monkey results processing and analysis.
- thoth.lab.dependency_monkey.aggregate_dm_results_per_identifier(identifiers_inspection: List[str], limit_results: bool = False, max_batch_identifiers_ids: int = 5) Union[dict, List[str]] [source]¶
Aggregate inspection batch ids and specification from dm documents stored in Ceph.
- Parameters
identifiers_inspection – list of identifiers used to filter inspection batch ids
limit_results – limit inspection batch ids considered to max_batch_identifiers_ids, for testing the analysis
max_batch_identifiers_ids – maximum number of inspection batch ids considered
thoth.lab.exception module¶
Exceptions for thoth-lab methods.
thoth.lab.graph module¶
Various helpers and utils for interaction with the graph database.
- class thoth.lab.graph.DependencyGraph(incoming_graph_data=None, **attr)[source]¶
Bases:
networkx.classes.ordered.OrderedDiGraph
Construct a dependency graph by extending nx.OrderedDiGraph.
- adjlist_dict_factory¶
alias of
collections.OrderedDict
- static get_root(tree)[source]¶
Return root of the current graph, if any.
By default, tree topology is considered as input, so if there are multiple roots, only the first one is returned.
- node_dict_factory¶
alias of
collections.OrderedDict
- class thoth.lab.graph.GraphQueryResult(result)[source]¶
Bases:
object
Wrap results of graph database queries.
- thoth.lab.graph.get_root(tree)¶
Return root of the current graph, if any.
By default, tree topology is considered as input, so if there are multiple roots, only the first one is returned.
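The "first root" behaviour described above can be sketched without networkx. The sketch assumes a root is a node with no incoming edges and represents the graph as a plain adjacency dict; the name first_root and the sample graphs are hypothetical, and this is not the thoth.lab implementation.

```python
from typing import Dict, Hashable, List, Optional


def first_root(adjacency: Dict[Hashable, List[Hashable]]) -> Optional[Hashable]:
    """Return the first node without incoming edges, or None if there is none."""
    has_parent = {child for children in adjacency.values() for child in children}
    for node in adjacency:  # dicts preserve insertion order
        if node not in has_parent:
            return node
    return None


tree = {"tensorflow": ["numpy", "six"], "numpy": [], "six": []}
forest = {"a": ["b"], "c": ["d"], "b": [], "d": []}  # two roots: "a" and "c"
cycle = {"x": ["y"], "y": ["x"]}  # no root at all
```

With the forest input only "a" is returned, which mirrors the note that only the first root is reported when several exist.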
thoth.lab.inspection module¶
Inspection results processing and analysis.
- thoth.lab.inspection.aggregate_inspection_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, inspection_repo_path: pathlib.Path = PosixPath('performance')) list [source]¶
Aggregate inspection results from jsons stored in Ceph or locally from performance repo.
- Parameters
limit_results – limit the number of inspection report ids considered to max_ids, for testing the analysis
max_ids – maximum number of inspection report ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
inspection_repo_path – required to retrieve the performance dataset locally, when is_local is set to True
- thoth.lab.inspection.aggregate_inspection_results_per_identifier(inspection_ids: List[str], identifier_inspection: List[str], inspection_batch_data: Dict[str, dict]) dict [source]¶
Aggregate inspection results per identifier from inspection documents stored in Ceph.
- Parameters
inspection_ids – list of inspection ids
identifier_inspection – list of identifier/s to filter inspection ids
inspection_batch_data – info to be added to each inspection (e.g. specification)
- thoth.lab.inspection.columns_to_analyze(df: pandas.core.frame.DataFrame, low: int = 0, display_clusters: bool = False, cluster_by_hue: bool = False) pandas.core.frame.DataFrame [source]¶
Print all columns within dataframe and count of unique column values within limit.
- Parameters
df – data frame to analyze as returned by `process_inspection_results`
low – the lower limit (0 if not specified) of distinct value counts
display_clusters – if true, displays grouped counts of parameter and parameter sort_values
cluster_by_hue – if true, displays distribution of parameters to analyze sorted by hues
- thoth.lab.inspection.concatenated_df(dfs: List[pandas.core.frame.DataFrame], column: str)[source]¶
Reorganize dataframe to show the distribution of jobs in a category across different subsets of data.
- Parameters
dfs – list of inspection result dataframes which can be different datasets or subset of datasets
column – column name or category for grouping to see the distribution of results
- thoth.lab.inspection.create_duration_box(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, **kwargs)[source]¶
Create duration Box plot.
- thoth.lab.inspection.create_duration_dataframe(inspection_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]¶
Compute statistics and duration DataFrame.
- thoth.lab.inspection.create_duration_histogram(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, bins: Optional[int] = None, **kwargs)[source]¶
Create duration Histogram plot.
- thoth.lab.inspection.create_duration_scatter(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, **kwargs)[source]¶
Create duration Scatter plot.
- thoth.lab.inspection.create_duration_scatter_with_bounds(data: pandas.core.frame.DataFrame, col: str, index: Optional[Union[list, pandas.core.indexes.base.Index, pandas.core.indexes.range.RangeIndex]] = None, **kwargs)[source]¶
Create duration Scatter plot with upper and lower bounds.
- thoth.lab.inspection.create_filtered_df(df: pandas.core.frame.DataFrame, pi_name: Optional[str] = None, pi_component: Optional[str] = None, runtime_environment: Optional[str] = None, packages: Optional[List[Tuple[str, str, str]]] = None) pandas.core.frame.DataFrame [source]¶
Create dataframe using the filters selected for plots.
- thoth.lab.inspection.create_final_dataframe(packages_versions: dict, python_packages_dataframe: pandas.core.frame.DataFrame, inspection_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]¶
Create final dataframe with all information required for plots.
- Parameters
packages_versions – dict as returned by create_python_package_df method.
python_packages_dataframe – data frame as returned by create_python_package_df method.
inspection_df – data frame containing data of inspections results.
- thoth.lab.inspection.create_inspection_2d_plot(plot_df: pandas.core.frame.DataFrame, quantity: str, components: List[str], color_scales: List[str], identifiers_inspections: List[str], have_annotations: bool = False)[source]¶
Create inspection performance parameters plot in 2D.
- Parameters
plot_df – data frame used to plot inspection results
- thoth.lab.inspection.create_inspection_3d_plot(plot_df: pandas.core.frame.DataFrame, quantity: str, identifiers_inspections: List[str])[source]¶
Create inspection performance parameters plot in 3D.
- Parameters
plot_df – data frame used to plot inspection results
- thoth.lab.inspection.create_inspection_analysis_plots(inspection_df: pandas.core.frame.DataFrame)[source]¶
Create inspection analysis plots for the inspection pd.Dataframe.
- Parameters
inspection_df – data frame as returned by `process_inspection_results` for a specific inspection identifier
- thoth.lab.inspection.create_inspection_dataframes(inspection_results_dict: dict, duration_info: bool = False) dict [source]¶
Create dictionary with data frame as returned by `process_inspection_results` for each inspection identifier.
- Parameters
inspection_results_dict – dictionary containing inspection results per inspection identifier.
- thoth.lab.inspection.create_inspection_parameters_dataframes(parameters: List[str], inspection_df_dict: dict, component: Optional[str] = None) Dict[str, pandas.core.frame.DataFrame] [source]¶
Create pd.DataFrame of selected parameters from inspections results to be used for statistics and error analysis.
It also outputs batches and parameters map that is necessary for plots.
- Parameters
parameters – inspection parameters used in the analysis
inspection_df_dict – dictionary with data frame as returned by `process_inspection_results` per identifier.
component – PI component name (e.g. tensorflow, pytorch).
- thoth.lab.inspection.create_inspection_time_dataframe()[source]¶
Create pd.Dataframe of time of inspections for build and job.
- thoth.lab.inspection.create_multiple_violin_plot(data: pandas.core.frame.DataFrame, quantity: str, x_label: str = '', y_label: str = '', save_result: bool = False, project_folder: str = '', folder_name: str = '', linewidth: int = 1)[source]¶
Create violin plot.
- thoth.lab.inspection.create_plot_from_df(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, title_plot: str = ' ', x_label: str = ' ', y_label: str = ' ', static: str = True, save_result: bool = False, project_folder: str = '', folder_name: str = '', scatter: bool = False)[source]¶
Create plot using two columns of the DataFrame.
- thoth.lab.inspection.create_plot_multiple_batches(data: pandas.core.frame.DataFrame, quantity: str, plot_type: str = 'box', x_label: str = '', y_label: str = '', static: str = True, save_result: bool = False, project_folder: str = '', folder_name: str = '')[source]¶
Create (Histogram or Box) plot using several columns of the dataframe (static by default).
- thoth.lab.inspection.create_python_package_df(inspection_df: pandas.core.frame.DataFrame) Union[pandas.core.frame.DataFrame, dict] [source]¶
Create DataFrame with only python packages present in software stacks.
- thoth.lab.inspection.create_scatter_and_correlation(data: pandas.core.frame.DataFrame, columns: Optional[Union[str, List[str]]] = None, title_scatter: str = 'Scatter plot')[source]¶
Create Scatter plot and evaluate correlation coefficients.
- thoth.lab.inspection.create_scatter_plots_for_multiple_batches(inspection_df_dict: Dict[str, pandas.core.frame.DataFrame], list_batches: List[str], columns: Optional[Union[str, List[str]]] = None, title_scatter: str = ' ', x_label: str = ' ', y_label: str = ' ')[source]¶
Create Scatter plots for multiple batches.
- Parameters
inspection_df_dict – dictionary with data frame as returned by `process_inspection_results` per identifier
list_batches – list of batches to be used for correlation analysis
columns – parameters to be considered, taken from data frame as returned by `process_inspection_results`
title_scatter – scatter plot name
x_label – x label name
y_label – y label name
- thoth.lab.inspection.dataframe_statistics(inspection_df: pandas.core.frame.DataFrame, plot_title: str)[source]¶
Output a data frame with relevant statistics on job duration, build duration and time elapsed.
- Parameters
inspection_df – data frame to analyze as returned by `process_inspection_results` (duration [ms])
plot_title – title of fit plot
- thoth.lab.inspection.display_jobs_by_subcategories(df: pandas.core.frame.DataFrame)[source]¶
Create dataframe with job counts for each subcategory for every column in the data frame.
- Parameters
df – dataframe with columns of unique value counts as returned by columns_to_analyze
- thoth.lab.inspection.duration_plots(df: pandas.core.frame.DataFrame)[source]¶
Create plots for job and build duration, elapsed time, and lead time.
- Parameters
df – data frame with duration information as returned by process_inspection_results
- thoth.lab.inspection.evaluate_inspection_statistics(parameters: list, inspection_df_dict: dict, component: Optional[str] = None) dict [source]¶
Aggregate statistical quantities per inspection parameter for inspection batches.
- Parameters
parameters – inspection parameters used in the analysis
inspection_df_dict – dictionary with data frame as returned by `process_inspection_results` per identifier
- thoth.lab.inspection.evaluate_statistics(inspection_df: pandas.core.frame.DataFrame, inspection_parameter: str) Dict [source]¶
Evaluate statistical quantities of a specific parameter of inspection results.
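As a rough illustration of what evaluating statistical quantities of one parameter can look like, the sketch below summarizes a list of values with the standard library. The choice of quantities (mean, median, sample standard deviation, minimum, maximum), the function name summarize and the sample durations are assumptions; the documentation above does not list which quantities evaluate_statistics returns.

```python
import statistics
from typing import Dict, List


def summarize(values: List[float]) -> Dict[str, float]:
    """Return a few descriptive statistics for one parameter (hypothetical helper)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


durations = [10.0, 12.0, 14.0, 16.0, 18.0]  # made-up job durations
summary = summarize(durations)
```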
- thoth.lab.inspection.evaluate_statistics_on_inspection_df(df: pandas.core.frame.DataFrame, column_names: List[str]) pandas.core.frame.DataFrame [source]¶
Evaluate statistics on performance values selected from Dataframe columns.
- thoth.lab.inspection.extract_keys_from_dataframe(df: pandas.core.frame.DataFrame, key: str)[source]¶
Filter the specific dataframe created for a certain key, combination of keys or for a tree depth.
- thoth.lab.inspection.extract_specification(inspection_batch_result: Dict[str, Any], inspection_id: str)[source]¶
Extract specification info for the inspection.
- thoth.lab.inspection.extract_structure_json(input_json: dict, upper_key: str, depth: int, json_structure)[source]¶
Convert a json file structure into a list with rows showing tree depths, keys and values.
- Parameters
input_json – inspection result json taken from Ceph
upper_key – key starting point to recursively traverse all tree
depth – depth in the tree
json_structure – recurrent list to store results while traversing the tree
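The recursive traversal described by these parameters can be sketched as follows. The row layout ([depth, dotted key, value], leaves only), the dot separator and the name flatten_json are assumptions made for illustration; the actual output format of extract_structure_json may differ.

```python
from typing import Any, List


def flatten_json(node: Any, upper_key: str, depth: int, rows: List[list]) -> List[list]:
    """Append one [depth, dotted key, value] row per leaf found below node."""
    if isinstance(node, dict):
        for key, value in node.items():
            full_key = f"{upper_key}.{key}" if upper_key else key
            flatten_json(value, full_key, depth + 1, rows)
    else:
        # Leaf value: record how deep it sits and the path that leads to it.
        rows.append([depth, upper_key, node])
    return rows


document = {
    "specification": {"base": "fedora:32", "build": {"cpu": 2}},
    "status": "done",
}
rows = flatten_json(document, "", 0, [])
```

Passing the accumulator list explicitly, as the json_structure parameter suggests, keeps every recursive call writing into the same result.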
- thoth.lab.inspection.filter_document_ids(inspection_store, inspection_identifiers: List[str]) Dict[str, List] [source]¶
Filter inspection document ids list according to the inspection identifiers selected.
- Parameters
inspection_identifiers – list of identifier/s to filter inspection ids
- thoth.lab.inspection.filter_inspection_ids(inspection_identifiers: List[str]) dict [source]¶
Filter inspection ids list according to the inspection identifier selected.
- Parameters
inspection_identifiers – list of identifier/s to filter inspection ids
- thoth.lab.inspection.make_subplots(data: pandas.core.frame.DataFrame, columns: Optional[List[str]] = None, *, kind: str = 'box', **kwargs)[source]¶
Make subplots and arrange them in an optimized grid layout.
- thoth.lab.inspection.map_column_to_feature_class(column_name: str)[source]¶
Map a column in the original dataframe to a feature class.
- Parameters
column_name – column name in the inspection_df dataframe obtained by process_inspection_results with no columns dropped (drop=False)
- thoth.lab.inspection.plot_distribution_of_jobs_combined_categories(df_hardware_category: pandas.core.frame.DataFrame, df_duration: pandas.core.frame.DataFrame, df_analyze: pandas.core.frame.DataFrame)[source]¶
Plot the job duration distribution for each unique hardware combination/configuration of data.
- Parameters
df_hardware_category – dataframe of parameters to analyze grouped by distinct rows
df_duration – dataframe with duration information as returned by process_inspection_results
df_analyze – dataframe of parameters that show variation across the clusters
- thoth.lab.inspection.plot_interpolated_statistics_of_inspection_parameters(statistical_results_dict: dict, identifier_inspection_list: dict, inspection_parameters: List[str], colour_list: List[str], statistical_quantities: List[str], title_plot: Optional[str] = None, title_xlabel: Optional[str] = None, title_ylabel: Optional[str] = None, save_result: bool = False, project_folder: Optional[str] = None, folder_name: Optional[str] = None, componet: Optional[str] = None)[source]¶
Plot interpolated statistical quantity/ies of inspection parameter/s from different inspection batches.
- thoth.lab.inspection.plot_subcategories_by_hues(df_cat: pandas.core.frame.DataFrame, df: pandas.core.frame.DataFrame, column)[source]¶
Create scatter plots with parameter categories separated by hues.
- Parameters
df_cat – filtered dataframe with columns to analyze as returned by columns_to_analyze
df – data frame with duration information as returned by process_inspection_results
column – job duration/build duration columns from ‘df’
- thoth.lab.inspection.process_empty_or_mutable_parameters(inspection_df: pandas.core.frame.DataFrame)[source]¶
Process empty or mutable parameters in dataframe.
These values will not work with further processing using the groupby function. Prints the unique value count of all columns that are unhashable (all such columns are constant). Drops these columns and returns a new dataframe.
- Parameters
inspection_df – data frame as returned by process_inspection_results with no columns dropped (drop=False)
- thoth.lab.inspection.process_inspection_results(inspection_results: List[dict], exclude: Optional[Union[list, set]] = None, apply: Optional[List[Tuple]] = None, drop: bool = True, verbose: bool = False, duration_info: bool = False) pandas.core.frame.DataFrame [source]¶
Process inspection result into pd.DataFrame.
- thoth.lab.inspection.query_inspection_dataframe(inspection_df: pandas.core.frame.DataFrame, *args, **kwargs) pandas.core.frame.DataFrame [source]¶
Wrap the _.query method so that duration columns are always included in the filter expression.
- thoth.lab.inspection.show_categories(inspection_df: pandas.core.frame.DataFrame)[source]¶
List categories in the given inspection pd.DataFrame.
- thoth.lab.inspection.show_inspection_inputs(filtered_inspection_ids: List[str], inspection_batch_ids: List[str], filtered_inspection_batch_ids: List[str])[source]¶
Show inspections inputs for the analysis.
- Parameters
filtered_inspection_ids – list of inspection ids after filtering
inspection_batch_ids – list of inspection batch ids
filtered_inspection_batch_ids – list of inspection batch ids after filtering
- thoth.lab.inspection.show_unique_value_count_by_feature_class(processed_df: pandas.core.frame.DataFrame)[source]¶
Show unique count values per feature/class.
Show results per feature/class that are subdivided in subclasses that map to it.
- Parameters
processed_df – processed dataframe as returned by the process_empty_or_mutable_parameters
- thoth.lab.inspection.summary_bar_plot(df: pandas.core.frame.DataFrame, df_categories: pandas.core.frame.DataFrame, clusters: List[pandas.core.frame.DataFrame])[source]¶
Create trace stacked plot scaled by total jobs of each parameter within clusters (if any).
- Parameters
df – data frame with duration information as returned by process_inspection_results
df_categories – filtered dataframe with columns to analyze as returned by columns_to_analyze
clusters – list of subset dataframes with the last value in list being the entire data set
- thoth.lab.inspection.summary_trace_plot(df: pandas.core.frame.DataFrame, df_categories: pandas.core.frame.DataFrame, dfs: Optional[List[pandas.core.frame.DataFrame]] = None)[source]¶
Create trace plot scaled by percentage of compositions of each parameter separated by hues.
- Parameters
df – data frame with duration information as returned by process_inspection_results
df_categories – filtered dataframe with columns to analyze as returned by columns_to_analyze
dfs – dataframes of clustered data (if any) appended to the dataframe of the entire dataset (i.e. [df_left_cluster, df_right_cluster, df_duration])
thoth.lab.inspection_report module¶
Inspection report generation and visualization.
- thoth.lab.inspection_report.create_df_report(df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]¶
Show unique values for each column in the dataframe.
thoth.lab.security module¶
Security results processing and analysis.
- class thoth.lab.security.SecurityIndicators[source]¶
Bases:
object
Class of methods used to analyze Security Indicators (SI).
- static add_release_date(metadata_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]¶
Add release date to metadata.
- static aggregate_security_indicator_bandit_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, security_indicator_bandit_repo_path: pathlib.Path = PosixPath('security/si-bandit')) list [source]¶
Aggregate si_bandit results from jsons stored in Ceph or locally from si_bandit repo.
- Parameters
limit_results – limit the number of si_bandit report ids considered to max_ids, for testing the analysis
max_ids – maximum number of si_bandit report ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
security_indicator_bandit_repo_path – path used to retrieve the si_bandit dataset locally, when is_local is set to True
- static aggregate_security_indicator_cloc_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, security_indicator_cloc_repo_path: pathlib.Path = PosixPath('security/si-cloc')) list [source]¶
Aggregate si_cloc results from jsons stored in Ceph or locally from si_cloc repo.
- Parameters
limit_results – limit the number of si_cloc report ids considered to max_ids, for testing the analysis
max_ids – maximum number of si_cloc report ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
security_indicator_cloc_repo_path – path used to retrieve the si_cloc dataset locally, when is_local is set to True
- static create_package_releases_vulnerabilities_trend(si_bandit_df: pandas.core.frame.DataFrame, package_name: str, package_index: str, security_infos: Optional[List[str]] = None, show_vulnerability_data: bool = False)[source]¶
Plot vulnerabilities trend for a Python package from a certain index.
- Parameters
si_bandit_df – pandas dataframe given by the ‘create_si_bandit_final_dataframe’ method with use_external_source_data set to True
package_name – Python package name filter
package_index – Python package index filter
security_infos – list of info to be visualized in the plot
show_vulnerability_data – show all data regarding vulnerabilities if set to True
- create_security_confidence_dataframe(si_bandit_report: dict, filters_files: Optional[List[str]] = None) Tuple[pandas.core.frame.DataFrame, Dict[str, int]] [source]¶
Create Security/Confidence dataframe for si-bandit report.
- create_si_bandit_final_dataframe(si_bandit_reports: List[dict], use_external_source_data: bool = False, filters_files: Optional[List[str]] = None) pandas.core.frame.DataFrame [source]¶
Create final si-bandit dataframe.
- create_si_bandit_metadata_dataframe(si_bandit_report: dict) pandas.core.frame.DataFrame [source]¶
Create si-bandit report metadata dataframe.
- create_si_cloc_final_dataframe(si_cloc_reports: list) pandas.core.frame.DataFrame [source]¶
Create final si-cloc dataframe.
- create_si_cloc_metadata_dataframe(si_cloc_report: dict) pandas.core.frame.DataFrame [source]¶
Create si-cloc report metadata dataframe.
- create_si_cloc_results_dataframe(si_cloc_report: dict) pandas.core.frame.DataFrame [source]¶
Create si-cloc report results dataframe.
- static create_vulnerabilities_plot(security_df: pandas.core.frame.DataFrame, security_infos: Optional[List[str]] = None, show_vulnerability_data: bool = False) None [source]¶
Plot vulnerabilities trend for a Python package from a certain index.
- Parameters
security_df – pandas dataframe given by the ‘create_si_bandit_final_dataframe’ method with use_external_source_data set to True
security_infos – list of info to be visualized in the plot
show_vulnerability_data – show all data regarding vulnerabilities if set to True
- static define_si_scores()[source]¶
Define security scores from si bandit outputs.
WARNING: It depends on all data considered.
- static extract_data_from_si_bandit_metadata(report_metadata: dict) dict [source]¶
Extract data from si-bandit report metadata.
- static extract_data_from_si_cloc_metadata(report_metadata: dict) dict [source]¶
Extract data from si-cloc report metadata.
- static extract_severity_confidence_info(si_bandit_report: dict, filters_files: Optional[List[str]] = None) Tuple[List[dict], Dict[str, int]] [source]¶
Extract severity and confidence from result metrics.
thoth.lab.solver module¶
Solver results processing and analysis.
- thoth.lab.solver.aggregate_solver_results(limit_results: bool = False, max_ids: int = 5, is_local: bool = True, solver_repo_path: pathlib.Path = PosixPath('solver')) list [source]¶
Aggregate solver results from jsons stored in Ceph or locally from solver repo.
- Parameters
limit_results – limit the number of solver report ids considered to max_ids, for testing the analysis
max_ids – maximum number of solver report ids considered
is_local – if True, retrieve the dataset locally; otherwise retrieve it from S3 (credentials are required)
solver_repo_path – required to retrieve the solver dataset locally, when is_local is set to True
- thoth.lab.solver.construct_solver_from_metadata(solver_report_metadata: dict) str [source]¶
Construct solver from solver report metadata.
- thoth.lab.solver.extract_data_from_solver_metadata(solver_report_metadata: dict) dict [source]¶
Extract data from solver report metadata.
thoth.lab.underscore module¶
Pandas common operations and utilities.
thoth.lab.utils module¶
Various utilities for notebooks.
- thoth.lab.utils.display_page(location: str, verify: bool = True, no_obtain_location: bool = False, width: int = 980, height: int = 900)[source]¶
Display the given page in notebook as iframe.
- thoth.lab.utils.get(obj, attr, *, default: Any = <object object>)[source]¶
Combine both getattr and dict.get into universal get.
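One way to picture a "universal get" is the sketch below: dictionaries are queried by key and any other object by attribute. The name universal_get, the private sentinel and the choice to raise AttributeError when nothing is found are assumptions; the documented signature only shows that get accepts a keyword-only default backed by a sentinel object.

```python
from types import SimpleNamespace
from typing import Any

_MISSING = object()  # sentinel so that None remains a valid default


def universal_get(obj: Any, attr: str, *, default: Any = _MISSING) -> Any:
    """Query dicts by key and other objects by attribute (hypothetical helper)."""
    if isinstance(obj, dict):
        if attr in obj:
            return obj[attr]
    elif hasattr(obj, attr):
        return getattr(obj, attr)
    if default is _MISSING:
        # Assumption: a missing entry without a default is reported as an error.
        raise AttributeError(f"{attr!r} not found")
    return default


from_dict = universal_get({"name": "tensorflow"}, "name")
from_object = universal_get(SimpleNamespace(version="2.1.0"), "version")
with_default = universal_get({}, "index", default=None)
```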
- thoth.lab.utils.get_column_group(df: pandas.core.frame.DataFrame, columns: Optional[Union[List[Union[str, int]], pandas.core.indexes.base.Index]] = None, label: Optional[str] = None) pandas.core.series.Series [source]¶
Group columns of the DataFrame into a single column group.
- thoth.lab.utils.get_index_group(df: pandas.core.frame.DataFrame, names: Optional[List[Union[str, int]]] = None, label: Optional[str] = None) pandas.core.series.Series [source]¶
Group multiple index levels into single index group.
- thoth.lab.utils.group_columns(df: pandas.core.frame.DataFrame, columns: Optional[Union[List[Union[str, int]], pandas.core.indexes.base.Index]] = None, label: Optional[str] = None, inplace: bool = False) pandas.core.series.Series [source]¶
Group columns of the DataFrame into a single column group and set it to the DataFrame.
- thoth.lab.utils.group_index(df: pandas.core.frame.DataFrame, names: Optional[List[Union[str, int]]] = None, label: Optional[str] = None, inplace: bool = False) pandas.core.frame.DataFrame [source]¶
Group multiple index levels into single index group and set it as index to the DataFrame.
- thoth.lab.utils.highlight(df: pandas.core.frame.DataFrame, content: Optional[str] = None, column_class: Optional[str] = None, colours: Optional[Union[list, str]] = None)[source]¶
Highlight rows of content column of a given DataFrame.
Highlight can be based on column_class or custom colours provided.
- thoth.lab.utils.obtain_location(name: str, verify: bool = False, only_netloc: bool = False) str [source]¶
Obtain location of a service based on its name in Red Hat’s internal network.
This function checks the redirect of a URL registered in Red Hat’s internal network, which avoids exposing internal URLs. Redirects are queried from https://url.corp.redhat.com.
>>> obtain_location('thoth-sbu', verify=False)
- thoth.lab.utils.packages_info(thoth_packages: bool = True) pandas.core.frame.DataFrame [source]¶
Display information about versions of packages available in the installation.
- thoth.lab.utils.resolve_query(query: str, context: Optional[pandas.core.frame.DataFrame] = None, resolvers: Optional[tuple] = None, engine: Optional[str] = None, parser: str = 'pandas')[source]¶
Resolve query in the given context.
- thoth.lab.utils.rget(obj: Any, attr: str, default: Any = <object object>) Any ¶
Recursively retrieve nested attributes of an object.
- Parameters
f – callable, function to be used as getattr
obj – Any, object to check
attr – str, attribute to find declared by dot notation accessor
default – default attribute, similar to getattr’s default
- Returns
Any, retrieved attribute
- thoth.lab.utils.rgetattr(obj: Any, attr: str, default: Any = <object object>) Any ¶
Recursively retrieve nested attributes of an object.
- Parameters
f – callable, function to be used as getattr
obj – Any, object to check
attr – str, attribute to find declared by dot notation accessor
default – default attribute, similar to getattr’s default
- Returns
Any, retrieved attribute
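The dot notation accessor used by rget and rgetattr can be sketched with functools.reduce, which applies getattr once per path segment. The name rgetattr_sketch, the private sentinel and the sample object are hypothetical; this is an illustration of the idea, not the thoth.lab implementation.

```python
from functools import reduce
from types import SimpleNamespace
from typing import Any

_MISSING = object()  # sentinel: lets callers pass None as a real default


def rgetattr_sketch(obj: Any, attr: str, default: Any = _MISSING) -> Any:
    """Follow a dotted path such as 'a.b.c' one attribute at a time."""
    try:
        return reduce(getattr, attr.split("."), obj)
    except AttributeError:
        if default is _MISSING:
            raise
        return default


report = SimpleNamespace(
    metadata=SimpleNamespace(analyzer=SimpleNamespace(version="0.9.5"))
)
version = rgetattr_sketch(report, "metadata.analyzer.version")
fallback = rgetattr_sketch(report, "metadata.missing.version", default="unknown")
```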
- thoth.lab.utils.rhas(obj: Any, attr: str) bool ¶
Recursively check nested attributes of an object.
- Parameters
fhas – callable, function to be used as hasattr
fget – callable, function to be used as getattr
obj – Any, object to check
attr – str, attribute to find declared by dot notation accessor
- Returns
bool, whether the object has the given attribute
- thoth.lab.utils.rhasattr(obj: Any, attr: str) bool ¶
Recursively check nested attributes of an object.
- Parameters
fhas – callable, function to be used as hasattr
fget – callable, function to be used as getattr
obj – Any, object to check
attr – str, attribute to find declared by dot notation accessor
- Returns
bool, whether the object has the given attribute
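A recursive attribute check along a dotted path can be sketched in the same spirit: walk the path and stop at the first segment that is missing. The name rhasattr_sketch and the sample object are hypothetical and only illustrate the behaviour described for rhas and rhasattr.

```python
from types import SimpleNamespace
from typing import Any


def rhasattr_sketch(obj: Any, attr: str) -> bool:
    """Return True only if every segment of the dotted path exists."""
    current = obj
    for name in attr.split("."):
        if not hasattr(current, name):
            return False
        current = getattr(current, name)
    return True


report = SimpleNamespace(result=SimpleNamespace(stack=SimpleNamespace()))
present = rhasattr_sketch(report, "result.stack")
absent = rhasattr_sketch(report, "result.score.value")
```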
Module contents¶
Routines for experiments in Thoth not only for Jupyter notebooks.
- class thoth.lab.GraphQueryResult(result)[source]¶
Bases:
object
Wrap results of graph database queries.
- thoth.lab.obtain_location(name: str, verify: bool = False, only_netloc: bool = False) str [source]¶
Obtain location of a service based on its name in Red Hat’s internal network.
This function checks the redirect of a URL registered in Red Hat’s internal network, which avoids exposing internal URLs. Redirects are queried from https://url.corp.redhat.com.
>>> obtain_location('thoth-sbu', verify=False)