smefit.analyze package
- smefit.analyze.run_report(report_card_file)[source]
Run the analysis given a report card name
- Parameters:
report_card_file (pathlib.Path) – name of the report card (report configuration) file
Submodules
smefit.analyze.chi2_utils module
- class smefit.analyze.chi2_utils.Chi2tableCalculator(data_info)[source]
Bases:
object
Compute the \(\chi^2\) for each replica and produce:
Tables with \(\chi^2\) for each dataset and datagroup.
Plot of \(\chi^2\) for each dataset.
Plot of \(\chi^2\) for each replica.
- Parameters:
data_info (pandas.DataFrame) – datasets information (references and data groups)
- static add_normalized_chi2(chi2_df)[source]
Add the normalized \(\chi^2\) to the table.
- Parameters:
chi2_df (pd.DataFrame) – \(\chi^2\) table for each dataset
- Returns:
\(\chi^2\) table for each dataset with normalization
- Return type:
pd.DataFrame
- static compute(datasets, smeft_predictions)[source]
Compute the \(\chi^2\) for each replica and dataset.
- Parameters:
datasets (smefit.loader.DataTuple) – loaded datasets
smeft_predictions (np.ndarray) – array with all the predictions for each replica
- Returns:
pd.DataFrame – \(\chi^2\) for each dataset
np.ndarray – \(\chi^2/n_{pts}\) for each replica
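As an illustration of the quantity being computed, the per-replica \(\chi^2\) can be sketched with NumPy. This is a minimal sketch under stated assumptions, not smefit's implementation: it treats a single dataset with central values `data`, an inverse covariance matrix `inv_cov`, and a prediction array of shape `(n_replicas, n_points)`. The helper name `chi2_per_replica` and the toy inputs are hypothetical.

```python
import numpy as np


def chi2_per_replica(data, inv_cov, predictions):
    """Return chi^2 for each replica (hypothetical helper, not smefit API).

    data:        (n_points,) experimental central values
    inv_cov:     (n_points, n_points) inverse covariance matrix
    predictions: (n_replicas, n_points) theory predictions per replica
    """
    diff = predictions - data  # broadcast the data over replicas
    # chi2_r = sum_ij diff_ri * inv_cov_ij * diff_rj
    return np.einsum("ri,ij,rj->r", diff, inv_cov, diff)


# Toy inputs: 2 data points with unit, uncorrelated uncertainties.
data = np.array([1.0, 2.0])
inv_cov = np.eye(2)
predictions = np.array([[1.0, 2.0], [2.0, 4.0]])

chi2 = chi2_per_replica(data, inv_cov, predictions)
chi2_per_point = chi2 / data.size  # chi^2 / n_pts for each replica
```

The first toy replica reproduces the data exactly, so its \(\chi^2\) vanishes; the second deviates by one and two units on the two points.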
- group_chi2_df(chi2_df)[source]
Group the \(\chi^2\) according to the data type.
- Parameters:
chi2_df (pd.DataFrame) – \(\chi^2\) table for each dataset
- Returns:
\(\chi^2\) table with deviation info
- Return type:
pd.DataFrame
- plot_exp(chi2_dict, fig_name, figsize=(10, 15))[source]
Plots a bar plot of the \(\chi^2\) values per experiment
smefit.analyze.coefficients_utils module
- class smefit.analyze.coefficients_utils.CoefficientsPlotter(report_path, coeff_config, logo=False)[source]
Bases:
object
Plots central values + 95% CL errors, 95% CL bounds, probability distributions, residuals, residual distribution, and energy reach.
Also writes a table displaying values for 68% CL bounds and central value + 95% errors.
Takes into account parameter constraints and displays all non-zero parameters.
Note: coefficients that are known to have disjoint probability distributions (i.e. multiple solutions) are manually separated by including the coefficient name in disjointed_list or disjointed_list2 for global and single fit results, respectively.
- Parameters:
report_path (pathlib.Path, str) – path to base folder, where the reports will be stored.
coeff_config (pandas.DataFrame) – coefficients latex names by group type
logo (bool) – if True display the logo on scatter and bar plots
- plot_coeffs(bounds, figsize=(10, 15), x_min=-400, x_max=400, x_log=True, lin_thr=0.1)[source]
Plot central value + 95% CL errors
- Parameters:
bounds (dict) – confidence level bounds per fit and coefficient. Note: double solutions are appended under “2”
- plot_coeffs_bar(error, figsize=(10, 15), plot_cutoff=400, x_log=True, x_min=0.01, x_max=500)[source]
Plot error bars at given confidence level
- Parameters:
error (dict) – confidence level bounds per fit and coefficient
figsize (list, optional) – Figure size, (10, 15) by default
plot_cutoff (float) – Only show bounds up to here
x_log (bool, optional) – Use a log scale on the x-axis, true by default
x_min (float, optional) – Minimum x-value, 1e-2 by default
x_max (float, optional) – Maximum x-value, 500 by default
legend_loc (string, optional) – Legend location, “best” by default
- plot_contours_2d(posteriors, labels, confidence_level=95, dofs_show=None, double_solution=None)[source]
Plots 2D marginalised projections confidence level contours
- Parameters:
posteriors (list) – posterior distributions per fit and coefficient
labels (list) – list of fit names
dofs_show (list, optional) – List of coefficients to include in the cornerplot, set to None by default, i.e. all fitted coefficients are included.
double_solution (dict, optional) – Dictionary of operators with double (disjoint) solution per fit
- plot_pull(pull, x_min=-3, x_max=3, figsize=(10, 15))[source]
Plot the fit residuals (pulls) per fit and coefficient
- Parameters:
pull (dict) – Fit residuals per fit and coefficient
x_min (float, optional) – Minimum sigma to display, -3 by default
x_max (float, optional) – Maximum sigma to display, +3 by default
figsize (list, optional) – Figure size, (10, 15) by default
legend_loc (string, optional) – Legend location, “best” by default
- plot_spider(error, labels, title, marker_styles, ncol, ymax=100, log_scale=True, fontsize=12, figsize=(9, 9), legend_loc='best', radial_lines=None, class_order=None)[source]
Creates a spider plot that displays the ratio of uncertainties to a baseline fit, which is taken as the first fit specified in the report runcard.
- Parameters:
error (dict) – confidence level bounds per fit and coefficient
labels (list) – Fit labels, taken from the report runcard
title (string) – Plot title
marker_styles (list, optional) – Marker styles per fit
ncol (int, optional) – Number of columns in the legend. Uses a single row by default.
ymax (float, optional) – Radius in percentage
log_scale (bool, optional) – Use a logarithmic radial scale, true by default
fontsize (int, optional) – Font size
figsize (list, optional) – Figure size, (9, 9) by default
legend_loc (string, optional) – Location of the legend, “best” by default
radial_lines (list, optional) – Location of radial lines in percentage
class_order (list, optional) – Order of operator classes, starting at 12 o'clock and proceeding anticlockwise
- smefit.analyze.coefficients_utils.compute_confidence_level(posterior, coeff_info, has_posterior, disjointed_list=None)[source]
Compute central value, 95% and 68% confidence levels and store the result in a dictionary given a posterior distribution.
- Parameters:
posterior (dict) – posterior distributions per coefficient
coeff_info (pandas.Series) – coefficients list for which the bounds are computed with latex names
disjointed_list (list, optional) – list of coefficients with double solutions
- Returns:
bounds – confidence level bounds per coefficient. Note: double solutions are appended under “2”
- Return type:
pandas.DataFrame
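To make the idea of confidence level bounds concrete, the sketch below summarises one coefficient's posterior samples with NumPy percentiles. It is illustrative only: it assumes equal-tailed intervals and takes the sample mean as the central value, neither of which is confirmed by this page, and the helper name `confidence_bounds` and its dictionary keys are hypothetical.

```python
import numpy as np


def confidence_bounds(samples):
    """Summarise posterior samples (hypothetical helper, not smefit API)."""
    samples = np.asarray(samples, dtype=float)
    # Equal-tailed intervals: cut the same probability from each tail.
    low68, high68 = np.percentile(samples, [16.0, 84.0])
    low95, high95 = np.percentile(samples, [2.5, 97.5])
    return {
        "mid": float(np.mean(samples)),
        "low68": float(low68),
        "high68": float(high68),
        "low95": float(low95),
        "high95": float(high95),
    }


# Toy posterior: 101 evenly spaced values between -1 and 1.
posterior = np.linspace(-1.0, 1.0, 101)
bounds = confidence_bounds(posterior)
```

A coefficient with a disjoint (double) solution would need its samples split into two groups before a summary like this is applied to each group.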
smefit.analyze.contours_2d module
- smefit.analyze.contours_2d.confidence_ellipse(coeff1, coeff2, ax, facecolor='none', confidence_level=95, **kwargs)[source]
Draws a confidence level ellipse (95% C.L. by default) for the posterior samples coeff1 and coeff2
- Parameters:
coeff1 (array_like) – (N,) ndarray containing N posterior samples for the first coefficient
coeff2 (array_like) – (N,) ndarray containing N posterior samples for the second coefficient
ax (matplotlib.axes) – Axes object to plot on
facecolor (str, optional) – Color of the ellipse
**kwargs – Additional plotting settings passed to matplotlib.patches.Ellipse
- Returns:
Ellipse object
- Return type:
matplotlib.patches.Ellipse
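The geometry behind such an ellipse can be sketched from the sample covariance matrix. The example below is a minimal sketch assuming Gaussian-distributed samples, for which the \(\chi^2\) quantile with two degrees of freedom has the closed form \(-2\ln(1 - \mathrm{CL})\); whether smefit uses this exact construction is not stated here. The helper name `ellipse_parameters` and the toy samples are hypothetical.

```python
import numpy as np


def ellipse_parameters(coeff1, coeff2, confidence_level=95):
    """Centre, full axis lengths and rotation angle (degrees) of a Gaussian
    confidence ellipse (hypothetical helper, not smefit API)."""
    x = np.asarray(coeff1, dtype=float)
    y = np.asarray(coeff2, dtype=float)
    cov = np.cov(x, y)
    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    major_val, minor_val = eigvals[1], eigvals[0]
    major_vec = eigvecs[:, 1]
    # chi^2 quantile for 2 degrees of freedom at the requested level.
    scale = -2.0 * np.log(1.0 - confidence_level / 100.0)
    width = 2.0 * np.sqrt(scale * major_val)
    height = 2.0 * np.sqrt(scale * minor_val)
    angle = np.degrees(np.arctan2(major_vec[1], major_vec[0]))
    return (x.mean(), y.mean()), width, height, angle


# Toy samples: uncorrelated, with x spread twice as wide as y.
x_samples = np.array([-2.0, 2.0, -2.0, 2.0])
y_samples = np.array([-1.0, -1.0, 1.0, 1.0])
centre, width, height, angle = ellipse_parameters(x_samples, y_samples)
```

The returned values map directly onto the `xy`, `width`, `height` and `angle` arguments of `matplotlib.patches.Ellipse`.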
- smefit.analyze.contours_2d.plot_contours(ax, posterior, ax_labels, coeff1, coeff2, kde, clr_idx, confidence_level=95, double_solution=None)[source]
- Parameters:
ax (matplotlib.axes) – Axes object to plot on
posterior (pandas.DataFrame) – Posterior samples per coefficient
ax_labels (list) – Latex names
coeff1 (str) – Name of first coefficient
coeff2 (str) – Name of second coefficient
kde (bool) – Performs kernel density estimate (kde) when quadratics are turned on
clr_idx (int) – Color index that ensures each fit is assigned a different color
confidence_level (int, optional) – Confidence level interval, set to 95% by default
double_solution (dict, optional) – Dictionary of operators with double (disjoint) solution per fit
- Returns:
hndls – List of Patch objects
- Return type:
smefit.analyze.correlations module
- smefit.analyze.correlations.plot_correlations(posterior_df, latex_names, fig_name, thr_show=None, hide_dofs=None, title=None, figsize=(10, 10))[source]
Computes and displays the correlation coefficients between parameters in a heat map
- Parameters:
posterior_df (pd.DataFrame) – fit results
latex_names (pd.DataFrame) – coefficient latex name table
fig_name (str) – path to save the plot
thr_show (float, None) – if given shows only off diagonal entries higher than the threshold
hide_dofs (list, None) – coefficients to hide
title (str, None) – plot title
figsize (tuple, (10, 10)) – Figure size
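The thresholding described for thr_show can be sketched as follows. This is an illustration rather than smefit's code: it assumes the threshold is applied to the absolute value of the off-diagonal correlation coefficients and that hidden entries are represented as NaN. The helper name `thresholded_correlations` and the toy samples are hypothetical.

```python
import numpy as np


def thresholded_correlations(samples, thr_show=None):
    """Correlation matrix with small off-diagonal entries masked as NaN
    (hypothetical helper, not smefit API).

    samples: (n_samples, n_coefficients) posterior samples
    """
    corr = np.corrcoef(samples, rowvar=False)
    if thr_show is not None:
        off_diag = ~np.eye(corr.shape[0], dtype=bool)
        hide = off_diag & (np.abs(corr) < thr_show)
        corr = np.where(hide, np.nan, corr)
    return corr


# Toy samples: column 1 is proportional to column 0,
# column 2 is uncorrelated with column 0.
toy_samples = np.array([
    [1.0, 2.0, 1.0],
    [2.0, 4.0, -1.0],
    [3.0, 6.0, -1.0],
    [4.0, 8.0, 1.0],
])
corr = thresholded_correlations(toy_samples, thr_show=0.1)
```

With thr_show=None the full matrix is returned unchanged.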
smefit.analyze.fisher module
- class smefit.analyze.fisher.FisherCalculator(coefficients, datasets, compute_quad)[source]
Bases:
object
Computes and writes the Fisher information table, and plots heat map.
Linear Fisher information depends only on the theoretical corrections, while quadratic information requires fit results. Parameter constraints are also taken into account. Only fitted degrees of freedom are shown in the tables.
- Parameters:
coefficients (smefit.coefficients.CoefficientManager) – coefficient manager
datasets (smefit.loader.DataTuple) – DataTuple object with all the data information
- static normalize(table, norm, log)[source]
Normalize a Pandas DataFrame
- Parameters:
table (pandas.DataFrame) – table to normalize
norm ("data", "coeff") – if “data” it normalizes by columns, if “coeff” by rows
log (bool) – presents the log of the Fisher if True
- Returns:
normalized table
- Return type:
pandas.DataFrame
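The difference between column-wise and row-wise normalization can be illustrated with a small pandas table. The sketch assumes that normalizing means rescaling each column (or row) to sum to 100 and that the log option takes the natural logarithm of the result; both are assumptions, as are the helper name `normalize_table` and the neutral row and column labels.

```python
import numpy as np
import pandas as pd


def normalize_table(table, norm, log=False):
    """Rescale a table so each column ("data") or row ("coeff") sums to 100
    (hypothetical helper, not smefit API)."""
    if norm == "data":
        # Dividing by a Series indexed by column labels aligns on columns.
        out = table / table.sum(axis=0) * 100
    elif norm == "coeff":
        out = table.div(table.sum(axis=1), axis=0) * 100
    else:
        raise ValueError(f"unknown norm: {norm!r}")
    return np.log(out) if log else out


toy_table = pd.DataFrame(
    [[1.0, 3.0], [3.0, 1.0]],
    index=["r1", "r2"],
    columns=["a", "b"],
)
by_column = normalize_table(toy_table, "data")
by_row = normalize_table(toy_table, "coeff")
```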
- plot(latex_names, fig_name, title=None, summary_only=True, figsize=(11, 15))[source]
Plot the heat map of Fisher table.
smefit.analyze.html_utils module
- smefit.analyze.html_utils.dump_html_index(html_report, html_index, report_path, report_title)[source]
Dump report index to html.
- Parameters:
html_report (str) – html report content
html_index (str) – html index content
report_path (pathlib.Path) – report path
report_title (str) – report title
- smefit.analyze.html_utils.html_link(file, label, add_meta=True)[source]
HTML link relative to report folder.
- smefit.analyze.html_utils.run_htlatex(report_path, tex_file)[source]
Run pandoc to generate HTML files.
- Parameters:
report_path (str) – report path
tex_file (pathlib.Path) – path to source file
smefit.analyze.latex_tools module
- smefit.analyze.latex_tools.dump_to_tex(tex_file, L)[source]
Dump a string to a tex file.
- Parameters:
tex_file (pathlib.Path) – path to tex file
smefit.analyze.pca module
- class smefit.analyze.pca.PcaCalculator(datasets, coefficients, latex_names)[source]
Bases:
object
Computes and writes PCA table and heat map.
Note: the matrix decomposed by SVD is the linear corrections multiplied by the inverse covariance matrix.
- Parameters:
datasets (smefit.loader.DataTuple) – loaded datasets
coefficients (smefit.coefficients.CoefficientManager) – coefficient manager
latex_names (dict) – coefficient latex names
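The decomposition mentioned in the note can be sketched with NumPy's SVD. This is a literal reading of the note, not a reproduction of smefit's code: the product of the linear corrections with the inverse covariance matrix is decomposed directly, and any centering or symmetrization the package may apply is not shown. The array names and toy values are hypothetical.

```python
import numpy as np

# Toy inputs (hypothetical): 2 coefficients, 3 data points.
lin_corr = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
])                     # shape (n_coeff, n_dat)
inv_cov = np.eye(3)    # shape (n_dat, n_dat)

# Matrix to decompose: linear corrections times inverse covariance.
X = lin_corr @ inv_cov

# Columns of u are directions in coefficient space, ordered by
# decreasing singular value.
u, singular_values, vt = np.linalg.svd(X, full_matrices=False)
```

In this toy case the second coefficient has the larger correction, so the leading direction points along it.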
- class smefit.analyze.pca.RotateToPca(loaded_datasets, coefficients, config)[source]
Bases:
object
Construct a new fit runcard using PCA.
- Parameters:
loaded_datasets (smefit.loader.DataTuple) – loaded datasets
coefficients (smefit.coefficients.CoefficientManager) – coefficient list
config (dict) – runcard configuration dictionary
- compute()[source]
Compute the rotation matrix. It is composed of two blocks: the PCA rotation and an identity for the constrained dofs.
- smefit.analyze.pca.impose_constrain(dataset, coefficients, update_quad=False)[source]
Propagate coefficient constraint into the theory tables.
Note: only linear constraints are allowed in this method. Nonlinear constraints do not always make sense here.
- Parameters:
dataset (smefit.loader.DataTuple) – loaded datasets
coefficients (smefit.coefficients.CoefficientManager) – coefficient manager
update_quad (bool, optional) – if True update also quadratic corrections
- Returns:
np.ndarray – array of updated linear corrections (n_free_op, n_dat)
np.ndarray, optional – array of updated quadratic corrections (n_free_op, n_free_op, n_dat)
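How a linear constraint propagates into the linear corrections can be shown on a toy case. The sketch assumes a single constrained coefficient written as a linear combination of the free ones, so that its correction is redistributed onto the free coefficients; it does not cover quadratic corrections and is not taken from smefit's source. All names and numbers are hypothetical.

```python
import numpy as np

# Toy linear corrections for 3 coefficients and 2 data points.
lin_corr = np.array([
    [1.0, 0.0],   # c0 (free)
    [0.0, 1.0],   # c1 (free)
    [2.0, 2.0],   # c2 (constrained)
])

# Assumed linear constraint: c2 = 0.5 * c0 - 1.0 * c1
weights = np.array([0.5, -1.0])

free = lin_corr[:2]
constrained = lin_corr[2]

# sum_i L_i c_i = sum_{j free} (L_j + w_j * L_2) c_j
updated = free + np.outer(weights, constrained)  # (n_free_op, n_dat)

# Consistency check for one choice of the free coefficients.
c_free = np.array([3.0, 4.0])
c_all = np.append(c_free, weights @ c_free)
full_prediction = c_all @ lin_corr
reduced_prediction = c_free @ updated
```

Both predictions agree, which is the property the updated table is meant to preserve.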
- smefit.analyze.pca.make_sym_matrix(vals, n_op)[source]
Build a square tensor (n_op,n_op,vals.shape[0]), starting from the upper triangular part.
- Parameters:
vals (np.ndarray) – triangular part
n_op (int) – dimension of the final matrix
- Returns:
square tensor.
- Return type:
np.ndarray
Examples
make_sym_matrix(array([1,2,3,4,5,6]), 3) -> array([[1,2,3],[0,4,5],[0,0,6]])
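The filling pattern in the example above can be reproduced for a single 2D slice with `np.triu_indices`. This is a sketch of the indexing only, not the function's source: the documented function works on a tensor with an extra data axis, and this page does not make clear whether the lower triangle is left at zero or mirrored, so both variants are shown. The variable names are hypothetical.

```python
import numpy as np

vals = np.array([1, 2, 3, 4, 5, 6])
n_op = 3

# Fill the upper triangle (including the diagonal) row by row.
upper = np.zeros((n_op, n_op), dtype=vals.dtype)
upper[np.triu_indices(n_op)] = vals

# Mirror the upper triangle to obtain a symmetric matrix,
# subtracting the diagonal so it is not counted twice.
symmetric = upper + upper.T - np.diag(np.diag(upper))
```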
smefit.analyze.report module
- class smefit.analyze.report.Report(report_path, result_path, report_config)[source]
Bases:
object
Report class manager.
If \(\chi^2\), Fisher or Data vs Theory plots are produced it computes the best fit theory predictions.
- fits
array with fits (instances of smefit.fit_manager.FitManager) included in the report
- Type:
- data_info
datasets information (references and data groups)
- Type:
pandas.DataFrame
- coeff_info
coefficients information (group and latex name)
- Type:
pandas.DataFrame
- Parameters:
report_path (pathlib.Path, str) – path to base folder, where the reports will be stored.
result_path (pathlib.Path, str) – path to base folder, where the results are stored.
report_config (dict) – dictionary with report configuration, see /run_cards/analyze/report_runcard.yaml for an example
- chi2(table=True, plot_experiment=None, plot_distribution=None)[source]
\(\chi^2\) table and plots runner.
- coefficients(scatter_plot=None, confidence_level_bar=None, pull_bar=None, spider_plot=None, posterior_histograms=True, contours_2d=None, hide_dofs=None, show_only=None, logo=True, table=None, double_solution=None)[source]
Coefficients plots and table runner.
- Parameters:
hide_dofs (list) – list of operators not to display
show_only (list) – list of all the operators to display; if None, all the free dofs are presented
logo (bool) – if True add logo to the plots
scatter_plot (None, dict) – kwargs for the scatter plot, or None
confidence_level_bar (None, dict) – kwargs for the confidence level bar plot, or None
posterior_histograms (bool) – if True plot the posterior distribution for each coefficient
table (None, dict) – kwargs for the latex confidence level table per coefficient, or None
double_solution (dict) – operators with double solution per fit
- correlations(hide_dofs=None, thr_show=0.1, title=True, fit_list=None, figsize=(10, 10))[source]
Plot coefficients correlation matrix.
- Parameters:
hide_dofs (list) – list of operators not to display.
thr_show (float, None) – minimum threshold value to show. If None the full correlation matrix is displayed.
title (bool) – if True display fit label name as title
fit_list (list, optional) – list of fit names for which the correlation is computed. By default all the fits included in the report
- fisher(norm='coeff', summary_only=True, plot=None, fit_list=None, log=False)[source]
Fisher information table and plots runner.
Summary table and plots are the default.
- Parameters:
norm ("coeff", "dataset") – fisher information normalization: per coefficient, or per dataset
summary_only (bool, optional) – if False, writes the fine-grained Fisher tables per dataset and group; if True, only the summary table with grouped datasets is written
plot (None, dict) – plot options
fit_list (list, optional) – list of fit names for which the fisher information is computed. By default all the fits included in the report
log (bool, optional) – if True shows the log of the Fisher information
- pca(table=True, plot=None, thr_show=0.01, fit_list=None)[source]
Principal Components Analysis runner.
- Parameters:
table (bool, optional) – if True writes the PC directions in a latex list
plot (bool, optional) – if True produces a PC heatmap
thr_show (float) – minimum threshold value to show
fit_list (list, optional) – list of fit names for which the PCA is computed. By default all the fits included in the report
smefit.analyze.spider module
- smefit.analyze.spider.radar_factory(num_vars, frame='circle')[source]
This function is copied from https://matplotlib.org/stable/gallery/specialty_plots/radar_chart.html
Create a radar chart with num_vars axes.
This function creates a RadarAxes projection and registers it.
- Parameters:
num_vars (int) – Number of variables for radar chart.
frame ({'circle', 'polygon'}) – Shape of frame surrounding axes.
smefit.analyze.summary module
- class smefit.analyze.summary.SummaryWriter(fits, data_groups, coeff_config)[source]
Bases:
object
Provides a summary of the fits included in the report: the fit settings, fitted parameters (including any parameter constraints), and datasets.
Uses data_references.yaml, data_groups.yaml, and coeff_groups.yaml YAML files. The first gives the references used for hyperlinks, and the other two organize the data and parameters into groups (top, higgs, etc.).
- fit_settings()[source]
Fit settings table.
- Returns:
table with the most relevant fit settings
- Return type:
pd.DataFrame