How to run the code
In the following we provide detailed instructions on how to use the code in its different running modes and on how to analyse the results.
Runcard specifications
The basic object required to run the code is a runcard.
In this section we document the different parameters which have to be specified here.
As example we will refer to the runcard to reproduce smefit2.0
,
available from the repository smefit_database
together with the files containing experimental data and the corresponding theory predictions.
After cloning the repository, run
python update_runcards_path.py -d /path/to/runcard/destination/ runcards/NS_GLOBAL_NLO_NHO.yaml
This will create in /path/to/runcard/destination/
a smefit2.0
runcard ready to be used on the local machine of the user, pointing to the experimental data and an theory tables in the repository smefit_database.
In the folder smefit_database/runcards
the input runcards for MC and NS fits with both linear (NHO)
and linear+quadratic corrections (HO) are available.
Input and output path
The path to where the experimental data and the corresponding theory tables are stored
is automatically set to those contained in smefit_dayabase
by the script update_runcards_path.py
.
The user can change them manually if other set of data are desired.
The folder where the results will be saved can be set using result_path
. The file containing
the posterior of the fitted Wilson coefficient will be saved in resulth_path/result_ID
.
If result_ID
is not provided, it will be automatically set to the name of the runcard (and any already existing result will be overwritten).
result_ID:
result_path:
data_path:
theory_path:
Theory specifications
The default perturbative order of the theory prediction is set by the key default_order
. Orders may also be specified
per datset, see here for more details.
The order in the EFT expansion should be specified by setting use_quad
to either True
or False
to include quadratic or only linear corrections respectively. The option use_t0
controls the use
of the t0
prescription and use_theory_covmat
specifies whether or not to use the theory covariance matrix
which can be specified in the theory files.
default_order: LO
use_quad: False
use_t0: False
use_theory_covmat: True
cutoff_scale: 1000
Here cutoff_scale
specifies the scale (in GeV) above which all datapoints will be excluded from the fit.
Minimizer specifications
The different parameters controlling the minimizer used in the analysis are specified here.
If single_parameter_fits
is set to True
, the Wilson coefficient specified in the run-card
will be fit one at time, setting all the others to 0. See here for more details.
If pairwise_fits
is set to True
, the minimizer carries out an automated series of pair-wise fits to all possible pairs of
Wilson coefficients that are specified in the run-card.
Pairwise fits are supported only with |NS|.
pairwise_fits: False
single_parameter_fits: False
bounds: Null
# NS settings
nlive: 400 # number of live points used during sampling
lepsilon: 0.05 # Terminate when live point likelihoods are all the same, within Lepsilon tolerance.
target_evidence_unc: 0.5 # target evidence uncertanty
target_post_unc: 0.5 # target posterior uncertanty
frac_remain: 0.01 # Set to a higher number (0.5) if you know the posterior is simple.
store_raw: false # if true, store the raw result and enable resuming the job.
vectorized: false # if true, ultranest samples a vector from the prior (recommended for large scale problems)
float64: false # double precision
#MC settings
mc_minimiser: 'cma' # Allowed options are: 'cma', 'dual_annealing', 'trust-constr'
restarts: 7 # number of restarts (only for cma)
maxiter: 100000 # max number of iteration
chi2_threshold: 3.0 # post fit chi2 threshold
#A settings
n_samples: 1000 # number of the required samples of the posterior distribution
Datasets to consider and coefficients to fit
The datasets and Wilson coefficients to be included in the analysis must be listed under datasets
and coefficients
respectively. The default order for each dataset is taken from default_order
. However, it is
possible to specify specific orders per dataset. To do this, add the key order
to the dataset entry as follows.
datasets:
- name: ATLAS_tt_8TeV_ljets_Mtt
- name: ATLAS_tt_8TeV_dilep_Mtt
order: NLO_QCD
- name: CMS_tt_8TeV_ljets_Ytt
order: NLO_QCD
- name: CMS_tt2D_8TeV_dilep_MttYtt
order: NLO_QCD
- name: CMS_tt_13TeV_ljets_2015_Mtt
order: NLO_QCD
- name: CMS_tt_13TeV_dilep_2015_Mtt
order: NLO_QCD
- name: CMS_tt_13TeV_ljets_2016_Mtt
order: NLO_QCD
- name: CMS_tt_13TeV_dilep_2016_Mtt
order: NLO_QCD
- name: ATLAS_tt_13TeV_ljets_2016_Mtt
order: NLO_QCD
- name: ATLAS_CMS_tt_AC_8TeV
order: NLO_QCD
- name: ATLAS_tt_AC_13TeV
...
...
# Coefficients to fit
coefficients:
OpQM: {'min': -10, 'max': 10}
O3pQ3: {'min': -2.0, 'max': 2.0}
Opt: {'min': -25.0, 'max': 15.0}
OtW: {'min': -1.0, 'max': 1.0}
...
...
As exemplified above, the syntax to specify the Wilson coefficient corresponding to the operator
O1
is O1 : {'min': , 'max':}
where min
and max
indicate the bounds within
the sampling is performed.
Constrains between coefficients
Some Wilson coefficients are not directly fit, but rather constrained to be linear combinations
of the other ones. Taking as example some coefficient of the Higgs sector considered in smefit2.0
,
this can be specified in the runcard in the following way
OpWB: {'min': -0.3, 'max': 0.5}
OpD: {'min': -1.0, 'max': 1.0}
OpqMi: {'constrain': [{'OpD': 0.9248},{'OpWB': 1.8347}], 'min': -30, 'max': 30}
O3pq: {'constrain': [{'OpD': -0.8415},{'OpWB': -1.8347}], 'min': -0.5, 'max': 1.0}
In this example the coefficients CpqMi
and C3pq
correspondong to the operators
OpqMi
and O3pq
will be set to
CpqMi = 0.9248*CpD + 1.8347*CpWB
C3pq = -0.8415*CpD -1.8347*CpWB
Fit in different basis
It is possible to run a fit using a different basis. In this case the coefficients specified in the runcard should be the ones to fit, and a rotation basis defining the new basis in terms of the Warsaw basis should be given as input. This can be done using the option below.
rot_to_fit_basis: /path/to/rotation/rotation.json
In addition, it is possible to perform a fit in a PCA rotated basis. This corresponds to the basis spanned by the eigenvectors of the Fisher information matrix at the linear level in the EFT expansion. To carry out such a fit, add the following flag
smefit NS --rotate_to_pca path/to/the/runcard/runcard.yaml
Adding custom likelihoods
SMEFiT supports the addition of customised likelihoods. This can be relevant when an external likelihood is already at hand and one would like to combine it with the one constructed internally in SMEFiT. To make use of this feature, one should add the following to the runcard:
external_chi2:
'ExternalChi2': /path/to/external/chi2.py
Here, ExternalChi2
is the name of the class that must be defined in the referenced python file as follows:
import numpy as np
class ExternalChi2:
def __init__(self, coefficients):
"""
Constructor that allows one to set attributes that can be called in the compute_chi2 method
Parameters
----------
coefficients: smefit.coefficients.CoefficientManager
attributes: name, value
"""
self.example_attribute = coefficients.name
def compute_chi2(self, coefficient_values):
"""
Parameters
----------
coefficients_values : numpy.ndarray
|EFT| coefficients values
"""
# example
chi2_value = np.sum(coefficient_values**2)
return chi2_value
One is free to set custom attributes in the constructor. The coefficient values during optimisation
are accesible via coefficient_values
in the compute_chi2
method. In order for the external chi2
to work, it is important one does not change the name of the compute_chi2
method!
Running a fit with NS
To run a fiy using Nested Sampling use the command
smefit NS path/to/the/runcard/runcard.yaml
This will generate a file named posterior.json
in the result folder,
containing the posterior distribution of the coefficients specified in the runcard.
Running a fit with MC
Disclaimer: the MC mode is only supported for linear fits.
The basic command to run a fit using Monte Carlo is
smefit MC path/to/the/runcard/runcard.yaml -n replica_number
This will produce a file called replica_<replica_number>/coefficients_rep_<replica_number>.json
in the result folder, containing the values of the Wilson coefficients for the replica.
Once an high enough number of replicas have been produced, the results can be merged into the final posterior
running PostFit
smefit POSTFIT path/to/the/result/ -n number_of_replicas
where <number_of_replicas>
specifies the number of replicas to be used to build the posterior.
Replicas not satisfying the PostFit criteria will be discarded. If the final number of good replicas is lower than
<number_of_replicas>
the code will output an error message asking to produce more replicas first.
The final output is the file posterior.json
containing the full posterior of the Wilson coefficients.
Solving the linear problem
In case only linear cocrrections are used, one can find the analytic solution to the linear problem by
smefit A path/to/the/runcard/runcard.yaml
This will also sample the posterior distribution according to the runcard.
Single parameter fits
Given a runcard with a number of Wilson coefficients specified, it is possible to fit each of them in turn, keeping all the other ones fix to 0. To do this add to the runcard
single_parameter_fits: True
and proceed as documented above for a normal fit.
For both NS, MC and A the final output will be the file posterior.json
containing the independent posterior of the fitted Wilson coefficients, obtained by a series os independent single parameter fits.
Individual parameter scan
The code can also be used to produce 1-dimensional scans of the chi2 function. The command
smefit SCAN /path/to/the/runcard/runcard.yaml
will produce in the results folder a series of pdf files containing plots for 1-dimensional scans of the chi2 with respect to each parameter in the runcard.