smefit.analyze package

smefit.analyze.run_report(report_card_file)[source]

Run the analysis given a report card name

Parameters:

report_card_file (pathlib.Path) – report configuration dictionary name

Submodules

smefit.analyze.chi2_utils module

class smefit.analyze.chi2_utils.Chi2tableCalculator(data_info)[source]

Bases: object

Compute the \(\chi^2\) for each replica and produce:

  • Tables with \(\chi^2\) for each dataset and datagroup.

  • Plot of \(\chi^2\) for each dataset.

  • Plot of \(\chi^2\) for each replica

Parameters:

data_info (pandas.DataFrame) – datasets information (references and data groups)

static add_normalized_chi2(chi2_df)[source]

Add the normalized \(\chi^2\) to the table.

Parameters:

chi2_df (pd.DataFrame) – \(\chi^2\) table for each dataset

Returns:

\(\chi^2\) table for each dataset with normalization

Return type:

pd.DataFrame
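As a rough illustration of what the normalization means, the sketch below divides each dataset's \(\chi^2\) by its number of data points. The row layout and the keys chi2, ndat and chi2/ndat are hypothetical and need not match the columns that add_normalized_chi2 actually uses.

```python
def add_normalized_chi2_sketch(rows):
    """Return copies of the rows with an extra chi2/ndat entry.

    Each row is a dict with the hypothetical keys "chi2" and "ndat".
    """
    normalized = []
    for row in rows:
        new_row = dict(row)  # copy, so the input table is left untouched
        new_row["chi2/ndat"] = row["chi2"] / row["ndat"]
        normalized.append(new_row)
    return normalized


table = [
    {"dataset": "A", "chi2": 12.0, "ndat": 10},
    {"dataset": "B", "chi2": 3.0, "ndat": 4},
]
result = add_normalized_chi2_sketch(table)
```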

static compute(datasets, smeft_predictions)[source]

Compute the \(\chi^2\) for each replica and dataset.

Parameters:
  • datasets (smefit.loader.DataTuple) – loaded datasets

  • smeft_predictions (np.ndarray) – array with all the predictions for each replica

Returns:

  • pd.DataFrame – \(\chi^2\) for each dataset

  • np.ndarray – \(\chi^2/n_{pts}\) for each replica
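
For orientation, a conventional definition of the per-replica \(\chi^2\) is written out below. The symbols \(D\) (data central values), \(T^{(k)}\) (predictions of replica \(k\)) and \(C\) (covariance matrix) are notation introduced here. This page does not state which covariance matrix compute() uses, so the formula is an assumption rather than a description of the implementation.

\[
\chi^2_k = \sum_{i,j=1}^{n_{pts}} \left(D_i - T^{(k)}_i\right) \left(C^{-1}\right)_{ij} \left(D_j - T^{(k)}_j\right)
\]

The second returned array then corresponds to \(\chi^2_k / n_{pts}\) for each replica \(k\).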

group_chi2_df(chi2_df)[source]

Group the \(\chi^2\) according to the data type.

Parameters:

chi2_df (pd.DataFrame) – \(\chi^2\) table for each dataset

Returns:

\(\chi^2\) table with deviation info

Return type:

pd.DataFrame

plot_dist(chi2_hist, fig_name, figsize=(7, 5))[source]

Plots the \(\chi^2\) distribution.

plot_exp(chi2_dict, fig_name, figsize=(10, 15))[source]

Plots a bar plot of the \(\chi^2\) values per experiment

write(chi2_dict, chi2_dict_group)[source]

Write the \(\chi^2\) latex tables.

Parameters:
  • chi2_dict (dict) – tables computed with compute() method for each fit

  • chi2_dict_group (dict) – tables obtained with group_chi2_df() method for each fit

Returns:

list with the latex commands

Return type:

list(str)

write_chi2_grouped(chi2_dict, chi2_dict_group)[source]

Write the \(\chi^2\) latex tables for each data group.

Parameters:

chi2_dict (dict) – tables computed with compute() method for each fit

Returns:

list with the latex commands

Return type:

list(str)

write_chi2_summary(chi2_dict_group)[source]

Write the summary \(\chi^2\) table for grouped data.

Parameters:

chi2_dict_group (dict) – tables obtained with group_chi2_df() method for each fit

Returns:

list with the latex commands

Return type:

list(str)

smefit.analyze.coefficients_utils module

class smefit.analyze.coefficients_utils.CoefficientsPlotter(report_path, coeff_config, logo=False)[source]

Bases: object

Plots central values + 95% CL errors, 95% CL bounds, probability distributions, residuals, residual distribution, and energy reach.

Also writes a table displaying values for 68% CL bounds and central value + 95% errors.

Takes into account parameter constraints and displays all non-zero parameters.

Note: coefficients that are known to have disjoint probability distributions (i.e. multiple solutions) are manually separated by including the coefficient name in disjointed_list or disjointed_list2 for global and single fit results, respectively.

Parameters:
  • report_path (pathlib.Path, str) – path to base folder, where the reports will be stored.

  • coeff_config (pandas.DataFrame) – coefficients latex names by group type

  • logo (bool) – if True display the logo on scatter and bar plots

plot_coeffs(bounds, figsize=(10, 15), x_min=-400, x_max=400, x_log=True, lin_thr=0.1)[source]

Plot central value + 95% CL errors

Parameters:

bounds (dict) – confidence level bounds per fit and coefficient. Note: double solutions are appended under “2”.

plot_coeffs_bar(error, figsize=(10, 15), plot_cutoff=400, x_log=True, x_min=0.01, x_max=500, legend_loc='best')[source]

Plot error bars at given confidence level

Parameters:
  • error (dict) – confidence level bounds per fit and coefficient

  • figsize (list, optional) – Figure size, (10, 15) by default

  • plot_cutoff (float) – Only show bounds up to here

  • x_log (bool, optional) – Use a log scale on the x-axis, true by default

  • x_min (float, optional) – Minimum x-value, 1e-2 by default

  • x_max (float, optional) – Maximum x-value, 500 by default

  • legend_loc (string, optional) – Legend location, “best” by default

plot_contours_2d(posteriors, labels, confidence_level=95, dofs_show=None, double_solution=None)[source]

Plots 2D marginalised projections confidence level contours

Parameters:
  • posteriors (list) – posterior distributions per fit and coefficient

  • labels (list) – list of fit names

  • dofs_show (list, optional) – List of coefficients to include in the cornerplot, set to None by default, i.e. all fitted coefficients are included.

  • double_solution (dict, optional) – Dictionary of operators with double (disjoint) solution per fit

plot_posteriors(posteriors, labels, disjointed_lists=None)[source]

Plot posteriors histograms.

Parameters:
  • posteriors (list) – posterior distributions per fit and coefficient

  • labels (list) – list of fit names

  • disjointed_lists (list, optional) – list of coefficients with double solutions per fit

plot_pull(pull, x_min=-3, x_max=3, figsize=(10, 15), legend_loc='best')[source]

Plot the pulls (fit residuals) per fit and coefficient

Parameters:
  • pull (dict) – Fit residuals per fit and coefficient

  • x_min (float, optional) – Minimum sigma to display, -3 by default

  • x_max (float, optional) – Maximum sigma to display, +3 by default

  • figsize (list, optional) – Figure size, (10, 15) by default

  • legend_loc (string, optional) – Legend location, “best” by default

plot_spider(error, labels, title, marker_styles, ncol, ymax=100, log_scale=True, fontsize=12, figsize=(9, 9), legend_loc='best', radial_lines=None, class_order=None)[source]

Creates a spider plot that displays the ratio of uncertainties to a baseline fit, which is taken as the first fit specified in the report runcard.

Parameters:
  • error (dict) – confidence level bounds per fit and coefficient

  • labels (list) – Fit labels, taken from the report runcard

  • title (string) – Plot title

  • marker_styles (list, optional) – Marker styles per fit

  • ncol (int, optional) – Number of columns in the legend. Uses a single row by default.

  • ymax (float, optional) – Radius in percentage

  • log_scale (bool, optional) – Use a logarithmic radial scale, true by default

  • fontsize (int, optional) – Font size

  • figsize (list, optional) – Figure size, (9, 9) by default

  • legend_loc (string, optional) – Location of the legend, “best” by default

  • radial_lines (list, optional) – Location of radial lines in percentage

  • class_order (list, optional) – Order of operator classes, starting at 12 o'clock and proceeding anticlockwise
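
The ratio to the baseline fit can be sketched as follows. This is only an illustration: the layout of error (fit label, then coefficient, then bound width) and the helper name ratio_to_baseline are assumptions, not part of the documented interface.

```python
def ratio_to_baseline(error, labels):
    """Express each fit's bound as a percentage of the baseline fit's bound.

    error  : hypothetical layout {fit label: {coefficient: bound width}}
    labels : fit labels; the first one is treated as the baseline
    """
    baseline = error[labels[0]]
    ratios = {}
    for label in labels:
        ratios[label] = {
            coeff: 100.0 * error[label][coeff] / baseline[coeff]
            for coeff in baseline
            if coeff in error[label]  # skip coefficients missing from this fit
        }
    return ratios


error = {
    "fit_A": {"c1": 2.0, "c2": 0.5},
    "fit_B": {"c1": 1.0, "c2": 1.0},
}
ratios = ratio_to_baseline(error, ["fit_A", "fit_B"])
```

With this convention the baseline fit sits at 100% for every coefficient, and values below 100% indicate tighter bounds than the baseline.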

write_cl_table(bounds, round_val=3)[source]

Coefficients latex table

smefit.analyze.coefficients_utils.compute_confidence_level(posterior, coeff_info, has_posterior, disjointed_list=None)[source]

Compute central value, 95% and 68% confidence levels and store the result in a dictionary given a posterior distribution.

Parameters:
  • posterior (dict) – posterior distributions per coefficient

  • coeff_info (pandas.Series) – coefficients list for which the bounds are computed with latex names

  • disjointed_list (list, optional) – list of coefficients with double solutions

Returns:

bounds – confidence level bounds per coefficient. Note: double solutions are appended under “2”.

Return type:

pandas.DataFrame

smefit.analyze.coefficients_utils.get_confidence_values(dist, has_posterior=True)[source]

Get confidence level bounds given the distribution
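
A minimal sketch of extracting a central value and central 68% and 95% intervals from posterior samples is given below. It uses interpolated percentiles, which is one common choice. The function names and the dictionary keys are hypothetical, and get_confidence_values may define its bounds differently.

```python
import statistics


def percentile(sorted_samples, q):
    """Linearly interpolated percentile (q in [0, 100]) of sorted samples."""
    pos = (len(sorted_samples) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_samples) - 1)
    frac = pos - lower
    return sorted_samples[lower] * (1.0 - frac) + sorted_samples[upper] * frac


def central_intervals(samples):
    """Return the mean and the central 68% and 95% intervals of the samples."""
    ordered = sorted(samples)
    return {
        "mid": statistics.fmean(ordered),
        "low68": percentile(ordered, 16.0),
        "high68": percentile(ordered, 84.0),
        "low95": percentile(ordered, 2.5),
        "high95": percentile(ordered, 97.5),
    }


samples = [float(i) for i in range(101)]  # 0.0, 1.0, ..., 100.0
bounds = central_intervals(samples)
```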

smefit.analyze.contours_2d module

smefit.analyze.contours_2d.confidence_ellipse(coeff1, coeff2, ax, facecolor='none', confidence_level=95, **kwargs)[source]

Draws the confidence level ellipse (95% C.L. by default) for the posterior samples coeff1 and coeff2

Parameters:
  • coeff1 (array_like) – (N,) ndarray containing N posterior samples for the first coefficient

  • coeff2 (array_like) – (N,) ndarray containing N posterior samples for the second coefficient

  • ax (matplotlib.axes) – Axes object to plot on

  • facecolor (str, optional) – Color of the ellipse

  • **kwargs – Additional plotting settings passed to matplotlib.patches.Ellipse

Returns:

Ellipse object

Return type:

matplotlib.patches.Ellipse
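
The geometry behind such an ellipse can be sketched without any plotting. The example below assumes the samples are approximately bivariate Gaussian, so that the contour enclosing a fraction \(p\) of the probability scales the covariance by \(-2\ln(1-p)\), the \(\chi^2\) quantile for two degrees of freedom. The function name and the returned keys are hypothetical, and confidence_ellipse itself may follow a different recipe.

```python
import math


def ellipse_parameters(xs, ys, confidence_level=95):
    """Semi-axes and rotation angle of a Gaussian confidence ellipse."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unbiased sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

    # Eigenvalues of the symmetric 2x2 covariance matrix
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy**2)
    lam_major, lam_minor = half_trace + root, half_trace - root

    # Chi-square quantile for 2 degrees of freedom: CDF(s) = 1 - exp(-s / 2)
    scale = -2.0 * math.log(1.0 - confidence_level / 100.0)

    # Orientation of the major axis with respect to the x-axis
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return {
        "center": (mean_x, mean_y),
        "semi_major": math.sqrt(scale * lam_major),
        "semi_minor": math.sqrt(scale * max(lam_minor, 0.0)),
        "angle_deg": math.degrees(angle),
    }


# Perfectly correlated toy samples along the line y = x
params = ellipse_parameters([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0])
```

For these toy samples the ellipse degenerates into a segment along the diagonal, rotated by 45 degrees.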

smefit.analyze.contours_2d.plot_contours(ax, posterior, ax_labels, coeff1, coeff2, kde, clr_idx, confidence_level=95, double_solution=None)[source]

Parameters:
  • ax (matplotlib.axes) – Axes object to plot on

  • posterior (pandas.DataFrame) – Posterior samples per coefficient

  • ax_labels (list) – Latex names

  • coeff1 (str) – Name of first coefficient

  • coeff2 (str) – Name of second coefficient

  • kde (bool) – Performs kernel density estimate (kde) when quadratics are turned on

  • clr_idx (int) – Color index that makes sure each fit gets associated a different color

  • confidence_level (int, optional) – Confidence level interval, set to 95% by default

  • double_solution (dict, optional) – Dictionary of operators with double (disjoint) solution per fit

Returns:

hndls – List of Patch objects

Return type:

list

smefit.analyze.contours_2d.split_solution(full_solution)[source]

Split the posterior solution

smefit.analyze.correlations module

smefit.analyze.correlations.plot_correlations(posterior_df, latex_names, fig_name, thr_show=None, hide_dofs=None, title=None, figsize=(10, 10))[source]

Computes and displays the correlation coefficients between parameters in a heat map

Parameters:
  • posterior_df (pd.DataFrame) – fit results

  • latex_names (pd.DataFrame) – coefficient latex name table

  • fig_name (str) – path to save the plot

  • thr_show (float, None) – if given shows only off diagonal entries higher than the threshold

  • hide_dofs (list, None) – coefficients to hide

  • title (str, None) – plot title

  • figsize (tuple, (10, 10)) – Figure size
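
The computation behind the heat map can be sketched as below. The example builds the Pearson correlation matrix from posterior samples and hides weak off-diagonal entries. Comparing the absolute value of the correlation with thr_show is an assumption, as are the helper names and the posterior layout.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sample lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_table(posterior, thr_show=None):
    """Correlation matrix as nested dicts; weak off-diagonal entries become None.

    posterior : hypothetical layout {coefficient name: list of samples}
    """
    names = list(posterior)
    table = {}
    for a in names:
        table[a] = {}
        for b in names:
            rho = pearson(posterior[a], posterior[b])
            hidden = thr_show is not None and a != b and abs(rho) < thr_show
            table[a][b] = None if hidden else rho
    return table


posterior = {
    "c1": [1.0, 2.0, 3.0, 4.0],
    "c2": [2.0, 4.0, 6.0, 8.0],  # fully correlated with c1
    "c3": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with c1 and c2
}
table = correlation_table(posterior, thr_show=0.5)
```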

smefit.analyze.fisher module

class smefit.analyze.fisher.FisherCalculator(coefficients, datasets, compute_quad)[source]

Bases: object

Computes and writes the Fisher information table, and plots heat map.

Linear Fisher information depends only on the theoretical corrections, while quadratic information requires fit results. Parameter constraints are also taken into account. Only fitted degrees of freedom are shown in the tables.

Parameters:
  • coefficients (smefit.coefficients.CoefficientManager) – coefficient manager

  • datasets (smefit.loader.DataTuple) – DataTuple object with all the data information

compute_linear()[source]

Compute linear Fisher information.
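
As a reminder of the underlying quantity, consider predictions that are linear in the coefficients, \(T = T^{SM} + \sum_i c_i \kappa_i\), with a Gaussian likelihood of covariance matrix \(C\). The Fisher information matrix is then given by the expression below. The notation (\(\kappa_i\) for the vector of linear corrections of coefficient \(c_i\)) is introduced here. This page does not state whether compute_linear() uses the full covariance matrix or only its diagonal, so the formula is an assumption.

\[
I_{ij} = \sum_{m,n=1}^{n_{dat}} \kappa_{i,m} \left(C^{-1}\right)_{mn} \kappa_{j,n}
\]

This expression involves only the theory corrections and the experimental uncertainties, which is why no fit results are needed at linear level.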

compute_quadratic(posterior_df, smeft_predictions)[source]

Compute quadratic Fisher information.

groupby_data(table, data_groups, norm, log)[source]

Merge fisher per data group.

static normalize(table, norm, log)[source]

Normalize a Pandas DataFrame

Parameters:
  • table (pandas.DataFrame) – table to normalize

  • norm ("data", "coeff") – if “data” it normalizes by columns, if “coeff” by rows

  • log (bool) – presents the log of the Fisher if True

Returns:

normalized table

Return type:

pandas.DataFrame
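
One plausible reading of this normalization is sketched below. The sketch assumes that normalizing means rescaling entries so that each row (for "coeff") or each column (for "data") sums to 100. The nested-dict layout, the assignment of coefficients to rows, and the function name are hypothetical, and the log option is left out.

```python
def normalize_sketch(table, norm):
    """Rescale a nested-dict table so rows or columns sum to 100.

    table : {row label: {column label: value}}
    norm  : "coeff" rescales each row, "data" rescales each column
    """
    if norm == "coeff":
        return {
            row: {col: 100.0 * val / sum(cols.values()) for col, val in cols.items()}
            for row, cols in table.items()
        }
    if norm == "data":
        # Accumulate the sum of every column first
        col_sums = {}
        for cols in table.values():
            for col, val in cols.items():
                col_sums[col] = col_sums.get(col, 0.0) + val
        return {
            row: {col: 100.0 * val / col_sums[col] for col, val in cols.items()}
            for row, cols in table.items()
        }
    raise ValueError(f"unknown normalization: {norm!r}")


fisher = {
    "c1": {"set_A": 1.0, "set_B": 3.0},
    "c2": {"set_A": 1.0, "set_B": 1.0},
}
by_rows = normalize_sketch(fisher, "coeff")
by_cols = normalize_sketch(fisher, "data")
```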

plot(latex_names, fig_name, title=None, summary_only=True, figsize=(11, 15))[source]

Plot the heat map of Fisher table.

Parameters:
  • latex_names (list) – list of coefficients latex names

  • fig_name (str) – figure path

  • summary_only – if True plot the Fisher information grouped by data group, otherwise plot the fine-grained table dataset by dataset

  • figsize (tuple) – figure size

  • title (str, None) – plot title

write_grouped(coeff_config, data_groups, summary_only)[source]

Write Fisher information tables in latex, both for grouped data and for summary.

Parameters:
  • coeff_config (dict) – coefficient dictionary per group with latex names

  • data_groups (dict) – dictionary with datasets per group and relative links

  • summary_only (bool) – if True only the summary Fisher table for grouped data is written

Returns:

list of the latex commands

Return type:

list(str)

smefit.analyze.html_utils module

smefit.analyze.html_utils.dump_html_index(html_report, html_index, report_path, report_title)[source]

Dump report index to html.

Parameters:
  • html_report (str) – html report content

  • html_index (str) – html index content

  • report_path (pathlib.Path) – report path

  • report_title (str) – report title

HTML link relative to report folder.

Parameters:
  • file (str) – file name

  • label (str) – label to display

  • add_meta (bool, optional) – if True add ‘meta/’ to file name

Returns:

HTML link

Return type:

str

smefit.analyze.html_utils.run_htlatex(report_path, tex_file)[source]

Run pandoc to generate HTML files.

Parameters:
  • report_path (str) – report path

  • tex_file (pathlib.Path) – path to source file

smefit.analyze.html_utils.write_html_container(title, figs=None, links=None, dataFrame=None)[source]

Write the content of single report section in HTML.

Parameters:
  • title (str) – section title

  • figs (list, optional) – list of figures to display

  • links (dict, optional) – links to tables

  • dataFrame (pd.DataFrame) – table to display

Returns:

HTML section content

Return type:

str

smefit.analyze.latex_tools module

smefit.analyze.latex_tools.chi2table_header(L, fit_labels)[source]

smefit.analyze.latex_tools.compile_tex(report, L, filename)[source]

Compile tex file.

Parameters:
  • report (str) – report path

  • L (list(str)) – latex lines

  • filename (str) – file name

smefit.analyze.latex_tools.dump_to_tex(tex_file, L)[source]

Dump a string to a tex file.

Parameters:

smefit.analyze.latex_tools.latex_packages()[source]

smefit.analyze.latex_tools.multicolum_table_header(fit_labels, ncolumn=2)[source]

Append the multicolumn table header

smefit.analyze.latex_tools.run_pdflatex(report, filename)[source]

Run pdflatex.

Parameters:
  • report (str) – report path

  • filename (str) – file name

smefit.analyze.pca module

class smefit.analyze.pca.PcaCalculator(datasets, coefficients, latex_names)[source]

Bases: object

Computes and writes PCA table and heat map.

Note: the matrix decomposed by SVD is the matrix of linear corrections multiplied by the inverse covariance matrix.

Parameters:
  • datasets (smefit.loader.DataTuple) – loaded datasets

  • coefficients (smefit.coefficients.CoefficientManager) – coefficient manager

  • latex_names (dict) – coefficient latex names

compute()[source]

Compute PCA.

plot_heatmap(fig_name, sv_min=0.0001, sv_max=100000.0, thr_show=0.1, figsize=(15, 15), title=None)[source]

Heat Map of PC coefficients.

Parameters:
  • fig_name (str) – plot name

  • sv_min (float) – minimum singular value range shown in the top heatmap plot

  • sv_max (float) – maximum singular value range shown in the top heatmap plot

  • thr_show (float) – minimal threshold to show in the PCA decomposition

  • title (str, None) – plot title

write(fit_label, thr_show=0.01)[source]

Write PCA latex table.

Parameters:
  • fit_label (str) – fit label

  • thr_show (float) – minimal threshold to show in the PCA decomposition

class smefit.analyze.pca.RotateToPca(loaded_datasets, coefficients, config)[source]

Bases: object

Construct a new fit runcard using PCA.

Parameters:
  • loaded_datasets (smefit.loader.DataTuple) – loaded datasets

  • coefficients (smefit.coefficients.CoefficientManager) – coefficient list

  • config (dict) – runcard configuration dictionary

compute()[source]

Compute the rotation matrix. This is composed of two blocks: PCA and an identity for the constrained dofs.

classmethod from_dict(config)[source]

Build the class from a runcard dictionary.

Parameters:

config (dict) – runcard configuration dictionary

save()[source]

Dump the updated runcard and rotation matrix into the result folder.

update_runcard()[source]

Update the runcard object.

smefit.analyze.pca.impose_constrain(dataset, coefficients, update_quad=False)[source]

Propagate coefficient constraint into the theory tables.

Note: only linear constraints are allowed in this method. Nonlinear constraints do not always make sense here.

Parameters:
  • dataset (smefit.loader.DataTuple) – loaded datasets

  • coefficients (smefit.coefficients.CoefficientManager) – coefficient manager

  • update_quad (bool, optional) – if True update also quadratic corrections

Returns:

  • np.ndarray – array of updated linear corrections (n_free_op, n_dat)

  • np.ndarray, optional – array of updated quadratic corrections (n_free_op, n_free_op, n_dat)

smefit.analyze.pca.make_sym_matrix(vals, n_op)[source]

Build a square tensor (n_op,n_op,vals.shape[0]), starting from the upper triangular part.

Parameters:
  • vals (np.ndarray) – triangular part

  • n_op (int) – dimension of the final matrix

Returns:

square tensor.

Return type:

np.ndarray

Examples

make_sym_matrix(array([1,2,3,4,5,6]), 3) -> array([[1,2,3],[0,4,5],[0,0,6]])
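
The index bookkeeping in the example above can be reproduced with a small pure-Python sketch. It covers only the two-dimensional case shown in the example, not the tensor with the extra vals.shape[0] axis, and it does not mirror values into the lower triangle. The helper name fill_upper_triangle is hypothetical and this is not the library implementation.

```python
def fill_upper_triangle(vals, n_op):
    """Place vals row by row into the upper triangle of an n_op x n_op matrix."""
    expected = n_op * (n_op + 1) // 2
    if len(vals) != expected:
        raise ValueError(f"expected {expected} values, got {len(vals)}")
    matrix = [[0] * n_op for _ in range(n_op)]
    values = iter(vals)
    for row in range(n_op):
        # Only entries on or above the diagonal are filled
        for col in range(row, n_op):
            matrix[row][col] = next(values)
    return matrix


matrix = fill_upper_triangle([1, 2, 3, 4, 5, 6], 3)
```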

smefit.analyze.report module

class smefit.analyze.report.Report(report_path, result_path, report_config)[source]

Bases: object

Report class manager.

If \(\chi^2\), Fisher or Data vs Theory plots are produced it computes the best fit theory predictions.

report

path to report folder

Type:

str

fits

array with fits (instances of smefit.fit_manager.FitManager) included in the report

Type:

numpy.ndarray

data_info

datasets information (references and data groups)

Type:

pandas.DataFrame

coeff_info

coefficients information (group and latex name)

Type:

pandas.DataFrame

Parameters:
  • report_path (pathlib.Path, str) – path to base folder, where the reports will be stored.

  • result_path (pathlib.Path, str) – path to base folder, where the results are stored.

  • report_config (dict) – dictionary with report configuration, see /run_cards/analyze/report_runcard.yaml for an example

chi2(table=True, plot_experiment=None, plot_distribution=None)[source]

\(\chi^2\) table and plots runner.

Parameters:
  • table (bool, optional) – write the latex \(\chi^2\) table per dataset

  • plot_experiment (bool, optional) – plot the \(\chi^2\) per dataset

  • plot_distribution (bool, optional) – plot the \(\chi^2\) distribution per each replica

coefficients(scatter_plot=None, confidence_level_bar=None, pull_bar=None, spider_plot=None, posterior_histograms=True, contours_2d=None, hide_dofs=None, show_only=None, logo=True, table=None, double_solution=None)[source]

Coefficients plots and table runner.

Parameters:
  • hide_dofs (list) – list of operators not to display

  • show_only (list) – list of all the operators to display; if None, all the free dofs are presented

  • logo (bool) – if True add logo to the plots

  • scatter_plot (None, dict) – kwarg scatter plot or None

  • confidence_level_bar (None, dict) – kwarg confidence level bar plot or None

  • posterior_histograms (bool) – if True plot the posterior distribution for each coefficient

  • table (None, dict) – kwarg the latex confidence level table per coefficient or None

  • double_solution (dict) – operators with double solution per fit

correlations(hide_dofs=None, thr_show=0.1, title=True, fit_list=None, figsize=(10, 10))[source]

Plot coefficients correlation matrix.

Parameters:
  • hide_dofs (list) – list of operators not to display.

  • thr_show (float, None) – minimum threshold value to show. If None the full correlation matrix is displayed.

  • title (bool) – if True display fit label name as title

  • fit_list (list, optional) – list of fit names for which the correlation is computed. By default all the fits included in the report

fisher(norm='coeff', summary_only=True, plot=None, fit_list=None, log=False)[source]

Fisher information table and plots runner.

Summary table and plots are the default

Parameters:
  • norm ("coeff", "dataset") – fisher information normalization: per coefficient, or per dataset

  • summary_only (bool, optional) – if False writes the fine-grained Fisher tables per dataset and group; if True only the summary table with grouped datasets is written

  • plot (None, dict) – plot options

  • fit_list (list, optional) – list of fit names for which the fisher information is computed. By default all the fits included in the report

  • log (bool, optional) – if True shows the log of the Fisher information

pca(table=True, plot=None, thr_show=0.01, fit_list=None)[source]

Principal Components Analysis runner.

Parameters:
  • table (bool, optional) – if True writes the PC directions in a latex list

  • plot (bool, optional) – if True produces a PC heatmap

  • thr_show (float) – minimum threshold value to show

  • fit_list (list, optional) – list of fit names for which the PCA is computed. By default all the fits included in the report

summary()[source]

Summary Table runner.

smefit.analyze.spider module

smefit.analyze.spider.radar_factory(num_vars, frame='circle')[source]

This function is copied from https://matplotlib.org/stable/gallery/specialty_plots/radar_chart.html

Create a radar chart with num_vars axes.

This function creates a RadarAxes projection and registers it.

Parameters:
  • num_vars (int) – Number of variables for radar chart.

  • frame ({'circle', 'polygon'}) – Shape of frame surrounding axes.
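
The axes of such a radar chart are placed at evenly spaced angles around the circle. The sketch below computes those angles with the standard library only. The helper name radar_angles is hypothetical and is not part of smefit or matplotlib.

```python
import math


def radar_angles(num_vars):
    """Evenly spaced axis angles in radians, starting at 0 and excluding 2*pi."""
    return [2.0 * math.pi * k / num_vars for k in range(num_vars)]


angles = radar_angles(4)
```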

smefit.analyze.summary module

class smefit.analyze.summary.SummaryWriter(fits, data_groups, coeff_config)[source]

Bases: object

Provides a summary of the fits included in the report: the fit settings, fitted parameters (including any parameter constraints), and datasets.

Uses the data_references.yaml, data_groups.yaml, and coeff_groups.yaml YAML files. The first gives the references used for hyperlinks, and the other two organize the data and parameters into groups (top, higgs, etc.).

fit_settings()[source]

Fit settings table.

Returns:

table with the most relevant fit settings

Return type:

pd.DataFrame

write_coefficients_table()[source]

Write the coefficients tables

Returns:

L – list of the latex commands

Return type:

list(str)

write_dataset_table()[source]

Write the summary tables

Returns:

L – list of the latex commands

Return type:

list(str)

smefit.analyze.summary.add_bounded_coeff(coeff_dict, coeff_df)[source]

Build the constraint formula