EELSFitter.core package
EELSFitter.core.spectral_image module
- class EELSFitter.core.spectral_image.SpectralImage(data, deltaE=None, pixel_size=None, beam_energy=None, collection_angle=None, convergence_angle=None, name=None, **kwargs)[source]
- KK_pixel(i, j, signal_type='EELS', iterations=1, **kwargs)[source]
Perform a Kramers-Kronig analysis on pixel (i, j).
- Parameters:
i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel.
signal_type (str, optional) – Type of spectrum. Set to EELS by default.
iterations (int) – Number of the iterations for the internal loop to remove the surface plasmon contribution. If 1 the surface plasmon contribution is not estimated and subtracted (the default is 1).
- Returns:
dielectric_functions (numpy.ndarray, shape=(M,)) – Collection of dielectric-function replicas at pixel (i, j).
ts (float) – Thickness.
S_ss (array_like) – Surface scatterings.
signal_ssds (array_like) – Deconvoluted EELS spectrum.
- calc_axes()[source]
Determines the x_axis and y_axis of the spectral image. Stores them in self.x_axis and self.y_axis respectively.
- calc_thickness(signal, n=None, rho=None, n_zlp=None)[source]
- Calculates thickness from sample data by one of two methods:
Kramers-Kronig sum rule using the refractive index [Egerton and Cheng, 1987]
Log ratio using mass density [Iakoubovskii et al., 2008]
- Parameters:
signal (numpy.ndarray, shape=(M,)) – spectrum
n (float) – refraction index
rho (float) – mass density
n_zlp (float or int) – Set to 1 by default, for already normalized EELS spectra.
- Returns:
t – thickness
- Return type:
Notes
If using the refractive index, surface scatterings are not corrected for. If you wish to correct for surface scatterings, please extract the thickness t from kramers_kronig_analysis().
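The log-ratio method listed above builds on the standard relation t/λ = ln(I_total / I_ZLP), where λ is the inelastic mean free path. The sketch below computes only this relative thickness for a toy spectrum. The helper name `relative_thickness` is hypothetical and not part of EELSFitter, and the conversion to an absolute thickness (which needs λ, estimated from the mass density in the log-ratio method above) is not reproduced here.

```python
import numpy as np


def relative_thickness(signal, zlp):
    """Return t/lambda = ln(I_total / I_zlp) for one spectrum.

    Hypothetical helper for illustration, not an EELSFitter function.
    """
    i_total = np.sum(signal)   # integrated intensity of the full spectrum
    i_zlp = np.sum(zlp)        # integrated intensity of the zero-loss peak
    return np.log(i_total / i_zlp)


# Toy spectrum: a ZLP of 1000 counts plus 500 counts of inelastic signal.
zlp = np.zeros(100)
zlp[10] = 1000.0
signal = zlp.copy()
signal[30:80] += 10.0

t_over_lambda = relative_thickness(signal, zlp)
print(f"t/lambda = {t_over_lambda:.4f}")
```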
- cluster(n_clusters=5, based_on='log_zlp', init='k-means++', n_times=10, max_iter=300, seed=None, save_seed=False, algorithm='lloyd', **kwargs)[source]
Clusters the spectral image into clusters according to the (log) integrated intensity at each pixel. Cluster means are stored in the attribute self.cluster_centroids. This is then passed on to the cluster_on_centroids function, where the index of the cluster to which each pixel belongs is stored in the attribute self.cluster_labels.
- Parameters:
n_clusters (int, optional) – Number of clusters, 5 by default
based_on (str, optional) – One can cluster either on the sum of the intensities (pass 'sum'), the log of the sum (pass 'log_sum'), the log of the ZLP peak value (pass 'log_peak'), the log of the ZLP peak value + the two bins next to the peak value (pass 'log_zlp'), the log of the sum of the bulk spectrum (no zlp) (pass 'log_bulk'), or the thickness (pass 'thickness'). The default is 'log_zlp'.
init ({'k-means++', 'random'}, callable or array-like of shape (n_clusters, n_features), default='k-means++') –
n_times (int, default=10) – Number of times the k-means algorithm will be run with different centroid seeds. The final result will be the best output of n_times consecutive runs in terms of inertia.
max_iter (int, default=300) – Maximum number of iterations of the k-means algorithm for a single run.
seed (int or None, default=None) – Determines random number generation for centroid initialization. Use an int to make the randomness deterministic.
save_seed (bool, default=False) – save the seed with corresponding settings to get to the same result
algorithm ({'lloyd', 'elkan'}, default='lloyd') – K-means algorithm to use. The classical EM-style algorithm is 'lloyd'. The 'elkan' variation can be more efficient on some datasets with well-defined clusters, by using the triangle inequality. However, it’s more memory intensive due to the allocation of an extra array of shape (n_samples, n_clusters).
kwargs –
- cluster_on_centroids(cluster_centroids, based_on='log_zlp', **kwargs)[source]
If the image has been clustered before and the cluster centroids are already known, one can use this function to reconstruct the original clustering of the image.
- Parameters:
cluster_centroids (numpy.ndarray, shape=(M,)) – Array with the cluster centroids
based_on (str, optional) – One can cluster either on the sum of the intensities (pass 'sum'), the log of the sum (pass 'log_sum'), the log of the ZLP peak value (pass 'log_peak'), the log of the ZLP peak value + the two bins next to the peak value (pass 'log_zlp'), the log of the sum of the bulk spectrum (no zlp) (pass 'log_bulk'), or the thickness (pass 'thickness'). The default is 'log_zlp'.
kwargs –
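Reconstructing a clustering from known centroids amounts to assigning every pixel to its nearest centroid. The sketch below illustrates that idea for a single scalar quantity per pixel (for example a log intensity). The function name `assign_to_centroids` and the toy values are assumptions for illustration, not the library's implementation.

```python
import numpy as np


def assign_to_centroids(values, centroids):
    """Label each pixel with the index of its nearest centroid.

    values:    2D array with one scalar per pixel (e.g. a log intensity).
    centroids: 1D array of previously stored cluster centroids.
    Hypothetical helper, not an EELSFitter function.
    """
    values = np.asarray(values, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Distance of every pixel to every centroid, shape (n_y, n_x, n_clusters).
    distances = np.abs(values[..., np.newaxis] - centroids)
    return np.argmin(distances, axis=-1)


log_intensity = np.array([[1.0, 1.1, 5.0],
                          [4.9, 9.2, 9.0]])
centroids = np.array([1.0, 5.0, 9.0])
labels = assign_to_centroids(log_intensity, centroids)
print(labels)
```

Because the labels depend only on the centroids and on the per-pixel quantity, reusing the same `based_on` choice as in the original clustering is what makes the reconstruction reproducible.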
- cut_image_energy(e1=None, e2=None, include_edge=True)[source]
Cuts the spectral image at E1 and E2 and keeps only the part in between.
- Parameters:
- cut_image_pixels(range_width, range_height)[source]
Cuts the spectral image
- Parameters:
range_width (numpy.ndarray, shape=(2,)) – Contains the horizontal selection cut
range_height (numpy.ndarray, shape=(2,)) – Contains the vertical selection cut
- static decompress_pickle(file)[source]
Opens, decompresses and returns the pickle file at location file.
- Parameters:
file (str) – location where the pickle file is stored
- Returns:
data
- Return type:
- deconvolution(signal, zlp, correction=True)[source]
Perform deconvolution on a given signal with a given Zero Loss Peak. This removes both the ZLP and plural scattering. Based on the Fourier Log Method [Johnson and Spence, 1974, Egerton, 2011]
- Parameters:
signal (numpy.ndarray, shape=(M,)) – Raw signal of length M
zlp (numpy.ndarray, shape=(M,)) – zero-loss peak of length M
correction (Bool) – Sometimes a decreasing linear slope occurs on the place of the ZLP after deconvolution. This correction fits a linear function and subtracts that from the signal. Default is True.
- Returns:
output – deconvoluted spectrum.
- Return type:
numpy.ndarray, shape=(M,)
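The Fourier-log method relates the measured spectrum J, the zero-loss peak Z and the single scattering distribution S in Fourier space through S(ν) = I0 · ln(J(ν) / Z(ν)), with I0 the integrated ZLP intensity. The sketch below applies that relation to a synthetic, noise-free spectrum built from a known SSD. The function name `fourier_log_deconvolve` is hypothetical, and the linear-slope correction described above is not reproduced. Real data would also need noise handling at high frequencies.

```python
import numpy as np


def fourier_log_deconvolve(signal, zlp):
    """Fourier-log sketch: S(v) = I0 * ln(J(v) / Z(v)), with I0 the ZLP area.

    Hypothetical helper; assumes noise-free input of equal length.
    """
    i0 = np.sum(zlp)
    j_v = np.fft.fft(signal)
    z_v = np.fft.fft(zlp)
    s_v = i0 * np.log(j_v / z_v)
    return np.fft.ifft(s_v).real


# Synthetic check: build a plural-scattering spectrum from a known SSD.
m = 256
i0 = 1000.0
zlp = np.zeros(m)
zlp[0] = i0                       # idealised delta-like ZLP at index 0
ssd = np.zeros(m)
ssd[20:40] = 0.5 * i0 / 20        # single scattering, t/lambda = 0.5
# Forward model: J(v) = Z(v) * exp(S(v) / I0), here with Z(v) = I0.
signal = np.fft.ifft(i0 * np.exp(np.fft.fft(ssd) / i0)).real

recovered = fourier_log_deconvolve(signal, zlp)
print(np.max(np.abs(recovered - ssd)))
```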
- extrp_signal(signal, r=None)[source]
Extrapolate the signal beyond the measured energy range. The extrapolation model generates data that vanish to zero after the real experimental data; see Egerton, paragraph 4.2.2, for details. An extrapolation of the form A*E^-r is used.
- Parameters:
signal (numpy.ndarray, shape=(M,)) – spectrum
r (float, optional) – extrapolation parameter
- find_optimal_amount_of_clusters(cluster_start=None, sigma=0.05, n_models=1000, conf_interval=1, signal_type='EELS', **kwargs)[source]
Finds the optimal amount of clusters based on the amount of models that need to be trained.
- Parameters:
cluster_start –
sigma –
n_models –
conf_interval –
signal_type –
kwargs –
- get_cluster_signals(conf_interval=1, signal_type='EELS')[source]
Get the signals ordered per cluster. Cluster signals are stored in attribute self.cluster_signals. Note that the pixel location information is lost.
- Parameters:
- Returns:
cluster_signals – An array with size equal to the number of clusters. Each entry is a 2D array that contains all the spectra within that cluster.
- Return type:
numpy.ndarray, shape=(M,)
- get_extrp_param(signal, range_perc=0.1)[source]
Retrieve the extrapolation parameter from the last 10% of a given signal
- Parameters:
signal (numpy.ndarray, shape=(M,)) –
range_perc (float, optional) –
- Returns:
r – extrapolation parameter
- Return type:
float.
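One common way to obtain the exponent r of a power-law tail A*E^-r is a straight-line fit in log-log space over the last fraction of the spectrum. The sketch below does this with `numpy.polyfit` and then continues the spectrum beyond the measured range. The function name `fit_power_law_tail` and the use of a log-log fit are assumptions for illustration; the library may determine r differently.

```python
import numpy as np


def fit_power_law_tail(energy, signal, range_perc=0.1):
    """Fit signal ~ A * E**(-r) on the last `range_perc` fraction of points.

    Hypothetical helper; requires positive energies and intensities.
    """
    n_tail = max(2, int(len(signal) * range_perc))
    log_e = np.log(energy[-n_tail:])
    log_s = np.log(signal[-n_tail:])
    slope, intercept = np.polyfit(log_e, log_s, 1)
    return np.exp(intercept), -slope   # A, r


energy = np.linspace(10.0, 100.0, 200)
signal = 5.0e4 * energy ** -3.0        # exact power law with r = 3

A, r = fit_power_law_tail(energy, signal)

# Continue the spectrum beyond the measured range with the fitted tail.
energy_ext = np.linspace(100.0, 200.0, 50)
signal_ext = A * energy_ext ** -r
print(f"A = {A:.1f}, r = {r:.3f}")
```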
- get_image_signals(signal_type='EELS', **kwargs)[source]
Get all the signals of the image.
- Parameters:
signal_type (str, optional) – Description of signal, 'EELS' by default.
kwargs –
- Returns:
image_signals
- Return type:
numpy.ndarray, shape=(M,N)
- get_pixel_matched_zlp_models(i, j, signal_type='EELS', signal=None, **kwargs)[source]
Returns the shape-(M, N) array of matched ZLP model predictions at pixel (i, j) after training. M and N correspond to the number of model predictions and \(\Delta E\) values respectively.
- Parameters:
i (int) – y-coordinate of the pixel.
j (int) – x-coordinate of the pixel.
signal_type (str, bool) – Description of signal type. Set to 'EELS' by default.
signal (array, bool,) – Signal you want to match the ZLPs to. Important to do if you did not do any pooling, PCA or NMF on the whole image, otherwise it will calculate the denoised signal twice.
kwargs (dict, optional) – Additional keyword arguments.
- Returns:
predictions – The matched ZLP predictions at pixel (i, j).
- Return type:
numpy.ndarray, shape=(M, N)
- get_pixel_signal(i, j, signal_type='EELS', **kwargs)[source]
Retrieves the spectrum at pixel (i, j).
- Parameters:
- Returns:
signal – Array with the requested signal from the requested pixel
- Return type:
numpy.ndarray, shape=(M,)
- get_pixel_signal_subtracted(i, j, signal_type='EELS', **kwargs)[source]
Retrieves the spectrum at pixel (i, j) after subtraction of the ZLP.
- Parameters:
- Returns:
signal_subtracted – Array with the requested subtracted signal from the requested pixel
- Return type:
numpy.ndarray, shape=(M,)
- get_pixel_signal_subtracted_or_not(i, j, signal_type='EELS', zlp=False, **kwargs)[source]
Function to retrieve non-subtracted data or data after subtraction of the ZLP.
- Parameters:
i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
signal_type (str, optional) – The type of signal that is requested, should comply with the defined names. Set to EELS by default
zlp (bool) – If True, return the subtracted spectrum. Else, return the non-subtracted spectrum
kwargs (dict, optional) – Additional keyword arguments, which are passed to self.get_pixel_signal
- static get_prefix(unit, SIunit=None, numeric=True)[source]
Method to convert units to their associated SI values.
- Parameters:
- Returns:
prefix – The character of the prefix or the numeric value of the prefix
- Return type:
- get_zlp_models(int_i, **kwargs)[source]
Returns the shape-(M, N) array of ZLP model predictions at the integrated intensity int_i. The logarithm of the integrated intensity is taken, as the models are always trained on the log of the signal.
- Parameters:
int_i (float) – Integrated intensity
- kramers_kronig_analysis(signal_ssd, n_zlp=None, iterations=1, n=None, t=None, delta=0.5)[source]
Computes the complex dielectric function from the single scattering distribution (SSD) signal_ssd following the Kramers-Kronig relations. This code is based on Egerton’s MATLAB code [Egerton, 2011].
- Parameters:
signal_ssd (numpy.ndarray, shape=(M,)) – SSD of length energy-loss (M)
n_zlp (float) – Total integrated intensity of the ZLP
iterations (int) – Number of the iterations for the internal loop to remove the surface plasmon contribution. If 1 the surface plasmon contribution is not estimated and subtracted (the default is 1).
n (float) – The medium refractive index. Used for normalization of the SSD to obtain the energy loss function. If given the thickness is estimated and returned. It is only required when t is None.
t (float) – The sample thickness in nm.
delta (float) – Factor added to aid stability for calculating surface losses
- Returns:
eps (numpy.ndarray) – The complex dielectric function, \(\epsilon = \epsilon_1 + i\epsilon_2\).
te (float) – local thickness
srf_int (numpy.ndarray) – Surface losses correction
Notes
Relativistic effects are not considered when correcting surface scattering.
The value of delta depends on the thickness of your sample and is qualitatively determined by how realistic the output is.
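Once the Kramers-Kronig transform has produced Re(1/ε) from the energy loss function Im(-1/ε), the last step towards \(\epsilon = \epsilon_1 + i\epsilon_2\) is purely algebraic: since 1/ε = Re(1/ε) - i·Im(-1/ε), inverting gives ε₁ = Re(1/ε) / D and ε₂ = Im(-1/ε) / D with D = Re(1/ε)² + Im(-1/ε)². The sketch below shows only this inversion. The function name `dielectric_from_loss` is hypothetical, and the normalisation, the transform itself and the surface-loss iteration are not reproduced.

```python
import numpy as np


def dielectric_from_loss(re_inv_eps, im_minus_inv_eps):
    """Invert 1/eps = Re(1/eps) - i * Im(-1/eps) to eps = eps1 + i * eps2.

    Hypothetical helper for illustration, not an EELSFitter function.
    """
    denom = re_inv_eps ** 2 + im_minus_inv_eps ** 2
    eps1 = re_inv_eps / denom
    eps2 = im_minus_inv_eps / denom
    return eps1 + 1j * eps2


# Round trip: start from known dielectric values and recover them.
eps_true = np.array([2.0 + 1.0j, -3.0 + 0.5j])
inv = 1.0 / eps_true
eps = dielectric_from_loss(inv.real, -inv.imag)
print(eps)
```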
- classmethod load_compressed_Spectral_image(path_to_compressed_pickle)[source]
Loads spectral image from a compressed pickled file. This will take longer than loading from non-compressed pickle.
- Parameters:
path_to_compressed_pickle (str) – path to the compressed pickle image file.
- Raises:
ValueError – If path_to_compressed_pickle does not end on the desired format .pbz2.
FileNotFoundError – If path_to_compressed_pickle does not exist.
- Returns:
image – spectral_image.SpectralImage instance loaded from the compressed pickle file.
- Return type:
- classmethod load_data(path_to_dmfile, load_survey_data=False)[source]
Load the .dm4 (or .dm3) data and return a spectral_image.SpectralImage instance.
- Parameters:
- Returns:
spectral_image.SpectralImage instance of the dm4 file
- Return type:
- classmethod load_spectral_image(path_to_pickle)[source]
Loads a spectral_image.SpectralImage instance from a pickled file.
- Parameters:
path_to_pickle (str) – path to the pickled image file.
- Raises:
ValueError – If path_to_pickle does not end on the desired format .pkl.
FileNotFoundError – If path_to_pickle does not exist.
- Returns:
spectral_image.SpectralImage object (i.e. including all attributes) loaded from pickle file.
- Return type:
- load_zlp_models(path_to_models, plot_chi2=False, plot_pred=False, idx=None, **kwargs)[source]
Loads the trained ZLP models and stores them in self.zlp_models. Models that have a \(\chi^2 > \chi^2_{\mathrm{mean}} + 5\sigma\) are discarded, where \(\sigma\) denotes the 68% CI.
- Parameters:
path_to_models (str) – Location where the model predictions have been stored after training.
plot_chi2 (bool, optional) – When set to True, plot and save the \(\chi^2\) distribution.
plot_pred (bool, optional) – When set to True, plot and save the ZLP predictions per cluster.
idx (int, optional) – When specified, only the ZLP labelled by idx is loaded, instead of all model predictions.
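The discard rule \(\chi^2 > \chi^2_{\mathrm{mean}} + 5\sigma\) can be sketched as a boolean mask over the per-model \(\chi^2\) values. Here σ is taken to be the half-width of the central 68% interval of the \(\chi^2\) distribution; that reading of "the 68% CI" is an assumption, as is the function name `select_models`.

```python
import numpy as np


def select_models(chi2, n_sigma=5.0):
    """Return a mask that is True for models with chi2 <= mean + n_sigma * sigma.

    Hypothetical helper; sigma is assumed to be the half-width of the
    central 68% interval of the chi2 values.
    """
    chi2 = np.asarray(chi2, dtype=float)
    low, high = np.percentile(chi2, [16, 84])
    sigma = 0.5 * (high - low)
    return chi2 <= chi2.mean() + n_sigma * sigma


# 99 well-behaved models and one clear outlier.
chi2 = np.concatenate([np.linspace(0.9, 1.1, 99), [50.0]])
keep = select_models(chi2)
print(keep.sum(), "of", keep.size, "models kept")
```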
- match_zlp_to_signal(signal, zlp, de1, de2, fwhm=None)[source]
Apply the matching to the subtracted spectrum.
- Parameters:
signal (numpy.ndarray, shape=(M,)) – Signal to be matched
zlp (numpy.ndarray, shape=(M,)) – ZLP model to be matched, must match length of Signal.
de1 (float) – Value of the hyperparameter \(\Delta E_{I}\)
de2 (float) – Value of the hyperparameter \(\Delta E_{II}\)
fwhm (float) – Value of the hyperparameter \(FWHM\). If none is given, a fwhm is determined from the signal. Default is None
- Returns:
output – Matched ZLP model
- Return type:
numpy.ndarray, shape=(M,)
- property n_clusters
Returns the number of clusters in the spectral_image.SpectralImage object.
- property n_spectra
Returns the number of spectra present in the spectral_image.SpectralImage object.
- nmf_cluster(cluster, n_components=0.9, max_iter=100000)[source]
Use non-negative matrix factorization on a cluster of the spectral image. The signals of the cluster are already in reduced format (pixel location is lost).
- Parameters:
cluster (numpy.ndarray, shape=(M,)) – An array with size equal to the number of clusters. Each entry is a 2D array that contains all the spectra within that cluster.
n_components (float,) – number components to calculate. If between 0 and 1 the amount of components will be determined based on the sum of the variance of the components below the given value by PCA first. Default is 0.9.
max_iter (int,) – Default is 100000
- nmf_image(area_type='segment', n_components=0.9, max_iter=100000, segments_x=4, segments_y=4, **kwargs)[source]
Use non-negative matrix factorization on the spectral image.
- Parameters:
area_type (str) – Type of area used for non-negative matrix factorization. Usage types as follows:
'segment', the image is segmented and NMF is only done per segmented area.
'cluster', the data per cluster is used for NMF within that cluster.
'pixel', only the data around a pixel is used for NMF of that pixel.
n_components (float,) – number components to calculate. If between 0 and 1 the amount of components will be determined based on the sum of the variance of the components below the given value by first running PCA. Default is 0.9.
max_iter (int,) – Default is 100000
segments_x (int) – For the 'segment' option, number of segments the x-axis is divided into. Default is 4.
segments_y (int) – For the 'segment' option, number of segments the y-axis is divided into. Default is 4.
kwargs (dict, optional) – Additional keyword arguments.
- nmf_pixel(i, j, area=9, n_components=0.9, max_iter=100000)[source]
Use principal component analysis on the spectral image, using the data of a square window of size n_p around pixel (i, j).
- Parameters:
i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
area (int) – PCA area parameter. Area around the pixel used for principal component analysis, must be an odd number
n_components (float,) – number components to calculate. If between 0 and 1 the amount of components will be determined based on the sum of the variance of the components below the given value by PCA first. Default is 0.9.
max_iter (int,) – Default is 100000
- Returns:
output – PCA spectrum of the pixel
- Return type:
numpy.ndarray, shape=(M,)
- pca_cluster(cluster, n_components=0.9)[source]
Use principal component analysis on a cluster of the spectral image. The signals of the cluster are already in reduced format (pixel location is lost).
- Parameters:
cluster (numpy.ndarray, shape=(M,)) – An array with size equal to the number of clusters. Each entry is a 2D array that contains all the spectra within that cluster.
n_components (float,) – number components to calculate. If between 0 and 1 the amount of components will be determined based on the sum of the variance of the components below the given value. Default is 0.9.
- pca_image(area_type='segment', n_components=0.9, segments_x=4, segments_y=4, **kwargs)[source]
Use principal component analysis on the spectral image.
- Parameters:
area_type (str) – Type of area used for principal component analysis. Usage types as follows:
'segment', the image is segmented and PCA is only done per segmented area.
'cluster', the data per cluster is used for PCA within that cluster.
'pixel', only the data around a pixel is used for PCA of that pixel.
n_components (float,) – number components to calculate. If between 0 and 1 the amount of components will be determined based on the sum of the variance of the components below the given value. Default is 0.9.
segments_x (int) – For the 'segment' option, number of segments the x-axis is divided into. Default is 4.
segments_y (int) – For the 'segment' option, number of segments the y-axis is divided into. Default is 4.
kwargs (dict, optional) – Additional keyword arguments.
- pca_pixel(i, j, area=9, n_components=0.9)[source]
Use principal component analysis on the spectral image, using the data of a square window of size n_p around pixel (i, j).
- Parameters:
i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
area (int) – PCA area parameter. Area around the pixel used for principal component analysis, must be an odd number
n_components (float,) – number components to calculate. If between 0 and 1 the amount of components will be determined based on the sum of the variance of the components below the given value. Default is 0.9.
deconv –
zlp_num –
- Returns:
output – PCA spectrum of the pixel
- Return type:
numpy.ndarray, shape=(M,)
- pool_image(area=9, **kwargs)[source]
Pools the spectral image using a square window of size area around each pixel.
- pool_pixel(i, j, area=9, gaussian=True, **kwargs)[source]
Pools the data of a square window of size area around pixel (i, j).
- Parameters:
i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
area (int) – Pooling parameter: area around the pixel used for pooling, must be an odd number
gaussian (boolean) – If true the pooling weights will use a gaussian distribution
kwargs (dict, optional) – Additional keyword arguments.
- Returns:
output – Pooled spectrum of the pixel
- Return type:
numpy.ndarray, shape=(M,)
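Pooling around a pixel can be pictured as a weighted average of the spectra inside the window. The sketch below assumes that `area` is the odd side length of the square window, that the window is clipped at the image edges, and that the Gaussian width is `area / 4`. All three are illustrative assumptions, and `pool_pixel_sketch` is a hypothetical name rather than the library method.

```python
import numpy as np


def pool_pixel_sketch(data, i, j, area=9, gaussian=True):
    """Weighted average of the spectra in an area x area window around (i, j).

    data has shape (n_y, n_x, n_energy). Hypothetical helper; the window is
    clipped at the image borders and the Gaussian width is an assumption.
    """
    half = area // 2
    y0, y1 = max(i - half, 0), min(i + half + 1, data.shape[0])
    x0, x1 = max(j - half, 0), min(j + half + 1, data.shape[1])
    yy, xx = np.mgrid[y0:y1, x0:x1]
    if gaussian:
        sigma = area / 4.0
        weights = np.exp(-((yy - i) ** 2 + (xx - j) ** 2) / (2 * sigma ** 2))
    else:
        weights = np.ones(yy.shape)
    weights /= weights.sum()                     # weights sum to one
    # Contract the two spatial axes, leaving one pooled spectrum.
    return np.tensordot(weights, data[y0:y1, x0:x1], axes=([0, 1], [0, 1]))


data = np.full((5, 6, 4), 3.0)                   # uniform toy image
pooled_centre = pool_pixel_sketch(data, 2, 3, area=3)
pooled_corner = pool_pixel_sketch(data, 0, 0, area=9, gaussian=False)
print(pooled_centre, pooled_corner)
```

Normalising the weights keeps the pooled spectrum on the same intensity scale as the input, so a uniform image is returned unchanged.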
- save_compressed_image(filename)[source]
Function to save image, including all attributes, in compressed pickle (.pbz2) format. Image will be saved at location filename. Advantage over save_image() is that the saved file has a reduced file size, disadvantage is that saving and reloading the image takes significantly longer.
- Parameters:
filename (str) – path to save location plus filename. If it does not end on “.pbz2”, “.pbz2” will be added.
- save_image(filename)[source]
Function to save image, including all attributes, in pickle (.pkl) format. Image will be saved at indicated location and name in filename input.
- Parameters:
filename (str) – path to save location plus filename. If it does not end on “.pkl”, “.pkl” will be added.
- set_eaxis()[source]
Determines the energy losses of the spectral image, based on the bin width of the energy loss. It shifts the self.eaxis attribute such that the zero point corresponds with the point of the highest intensity.
It also sets the extrapolated eaxis for calculations that require extrapolation.
- Returns:
eaxis – Array of \(\Delta E\) values
- Return type:
numpy.ndarray, shape=(M,)
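The shift described above can be sketched as building an axis from the bin width and moving its origin to the brightest bin. The sketch assumes that "the point of the highest intensity" is the maximum of the image-averaged spectrum. That choice and the name `energy_axis` are illustrative assumptions, and the extrapolated axis is not shown.

```python
import numpy as np


def energy_axis(data, delta_e):
    """Energy-loss axis with its zero at the brightest bin.

    data has shape (n_y, n_x, n_energy); delta_e is the bin width.
    Hypothetical helper, not an EELSFitter function.
    """
    mean_spectrum = data.mean(axis=(0, 1))
    zero_idx = int(np.argmax(mean_spectrum))      # position of the ZLP maximum
    return (np.arange(data.shape[-1]) - zero_idx) * delta_e


data = np.zeros((2, 2, 10))
data[..., 3] = 5.0                                # ZLP maximum in bin 3
eaxis = energy_axis(data, delta_e=0.1)
print(eaxis)
```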
- set_mass_density(rho=None, rho_background=None)[source]
Sets value of mass density for the image as attribute self.rho. If not clustered, rho will be an array of length one, otherwise it is an array of length n_clusters. If rho_background is defined, the cluster with the lowest thickness (cluster 0) will be assumed to be the vacuum/background, and gets the value of the background mass density.
If more than one specimen is present in the image, it is wise to check by hand which cluster belongs to which specimen, and set the values by running:
image.rho[cluster_i] = rho_i
- Parameters:
rho –
rho_background –
- set_refractive_index(n=None, n_background=None)[source]
Sets value of refractive index for the image as attribute self.n. If not clustered, n will be an array of length one, otherwise it is an array of length n_clusters. If n_background is defined, the cluster with the lowest thickness (cluster 0) will be assumed to be the vacuum/background, and gets the value of the background refractive index.
If more than one specimen is present in the image, it is wise to check by hand which cluster belongs to which specimen, and set the values by running:
image.n[cluster_i] = n_i
- property shape
Returns the 3D shape of the spectral_image.SpectralImage object.
- static smooth_signal(signal, window_length=51, window_type='hanning')[source]
Smooth a signal using a window length window_length and a window type window_type.
This method is based on the convolution of a scaled window with the signal. The signal is prepared by introducing reflected copies of the signal (with the window size) at both ends so that transient parts are minimized in the beginning and end part of the output signal.
- Parameters:
signal (numpy.ndarray, shape=(M,)) – Signal of length M
window_length (int, optional) – The dimension of the smoothing window; should be an odd integer. Default is 51.
window_type (str, optional) – The type of window from 'flat', 'hanning', 'hamming', 'bartlett', 'blackman' and 'kasier'. A 'flat' window will produce a moving average smoothing. Default is 'hanning'.
- Returns:
signal_smooth – The smoothed signal
- Return type:
numpy.ndarray, shape=(M,)
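The convolution-with-reflection approach described above can be sketched in a few lines of NumPy. The helper below, with the hypothetical name `smooth_sketch`, supports only the window types that NumPy exposes as single-argument functions (plus 'flat'), assumes an odd `window_length`, and trims the result back to the input length. It is an illustration of the technique, not the library's code.

```python
import numpy as np


def smooth_sketch(signal, window_length=51, window_type="hanning"):
    """Smooth by convolving a normalised window with a reflect-padded signal.

    Hypothetical helper; assumes an odd window_length and a window_type of
    'flat', 'hanning', 'hamming', 'bartlett' or 'blackman'.
    """
    if window_type == "flat":
        window = np.ones(window_length)           # moving average
    else:
        window = getattr(np, window_type)(window_length)
    # Reflected copies at both ends reduce edge transients.
    padded = np.pad(signal, window_length - 1, mode="reflect")
    smoothed = np.convolve(window / window.sum(), padded, mode="valid")
    half = (window_length - 1) // 2
    return smoothed[half:len(smoothed) - half]    # back to the input length


rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0 * np.pi, 200)
noisy = np.sin(x) + 0.2 * rng.standard_normal(x.size)
smooth = smooth_sketch(noisy, window_length=21)

constant = smooth_sketch(np.full(200, 2.0))
```

Dividing the window by its sum keeps the overall intensity scale, which is why a constant input comes back unchanged.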
- train_zlp_models(conf_interval=1, lr=0.001, signal_type='EELS', **kwargs)[source]
Train the ZLP on the spectral image.
The spectral image is clustered in n_clusters clusters, according to e.g. the integrated intensity or thickness. A random spectrum is then taken from each cluster, which together defines one replica. The training is initiated by calling train_zlp_models_scaled().
EELSFitter.core.training module
- class EELSFitter.core.training.MultilayerPerceptron(num_inputs, num_outputs)[source]
Bases: Module
Multilayer Perceptron (MLP) class. It uses the following architecture
\[[n_i, 10, 15, 5, n_f],\]
where \(n_i\) and \(n_f\) denote the number of input features and output target values respectively.
- Parameters:
- class EELSFitter.core.training.TrainZeroLossPeak(spectra, eaxis, cluster_centroids=None, display_step=1000, training_report_step=1, n_batch_of_replica=1, n_batches=1, n_replica=100, n_epochs=1000, shift_de1=1.0, shift_de2=1.0, regularisation_constant=10.0, path_to_models='./models/', remove_temp_files=True, **kwargs)[source]
Bases: object
- calc_scale_var_log_int_i(based_on='log_zlp')[source]
Calculate the scale variables of the log of the integrated intensity of the spectra for the three highest bins of the Zero Loss Peak.
- calculate_hyperparameters()[source]
Calculate the values of the hyperparameters in the gain and loss region. dE1 and mdE1 are calculated by taking the location of the kneedles at each side of the ZLP and shifting them by the given shift value.
dE2 and mdE2 are calculated by taking the value of the eaxis where a fit of the log10 function intersects with a single count. If this value is not found, the end point of the signal is taken as the location for dE2.
- cleanup_files()[source]
Cleans up the files generated by train_zlp_models_scaled. costs_train_*, costs_test_*, and nn_rep_* files are merged into single files costs_train, costs_test, and nn_parameters respectively.
- find_fwhm_idx()[source]
- Determine the FWHM indices per cluster (Full Width at Half Maximum):
indices of the left and right side of the ZLP
indices of the left and right side of the log of the ZLP
These are all determined by taking the local minimum and maximum of dy/dx.
- find_kneedle_idx()[source]
Find the kneedle index per cluster. The kneedle algorithm is used to find the point of highest curvature in your concave or convex data set.
- find_local_min_idx()[source]
Determine the first local minimum index of the signals per cluster by setting it to the point where the derivative crosses zero.
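Locating a first local minimum through a zero crossing of the derivative can be sketched with a finite difference: the minimum sits where the slope changes from negative to non-negative. The helper name `first_local_min_idx` is hypothetical, and the smoothing of the slope that is applied per cluster elsewhere in this class is left out.

```python
import numpy as np


def first_local_min_idx(signal):
    """Index of the first point where the slope goes from negative to >= 0.

    Hypothetical helper; returns None if the signal has no such point.
    """
    dydx = np.diff(signal)
    crossings = np.where((dydx[:-1] < 0) & (dydx[1:] >= 0))[0]
    return int(crossings[0]) + 1 if crossings.size else None


signal = np.array([5.0, 3.0, 1.0, 2.0, 4.0, 3.0, 2.0, 6.0])
idx = first_local_min_idx(signal)
no_min = first_local_min_idx(np.array([4.0, 3.0, 2.0, 1.0]))
print(idx, no_min)
```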
- initialize_x_y_sigma_input(cluster_label)[source]
Initialize the x, y and sigma input for the Neural Network. The spectrum is split into the 3 regions as given by the toy model. For the y data, the data in region I is set to the log intensity up to dE1, the data in region III is set to zero. For the x data there are two input features: the first is the values of the energy axis in regions I and III, the second is the rescaled log of the total integrated intensity. This factor is to ensure symmetry is retained between input and output values. For the sigma data
- Parameters:
cluster_label (int) – Label of the cluster
- loss_function(output, output_for_derivative, target, error)[source]
The loss function to train the ZLP takes the model output, the raw spectrum target and the associated error. The latter corresponds to the one sigma spread within a given cluster at fixed \(\Delta E\). It returns the cost function \(C_{\mathrm{ZLP}}^{(m)}\) associated with the replica \(m\) as
\[C_{\mathrm{ZLP}}^{(m)} = \frac{1}{n_{E}} \sum_{k=1}^K \sum_{\ell_k=1}^{n_E^{(k)}} \frac{\left[I^{(i_{m,k}, j_{m,k})}(E_{\ell_k}) - I_{\rm ZLP}^{({\mathrm{NN}})(m)} \left(E_{\ell_k},\ln \left( N_{\mathrm{ tot}}^{(i_{m,k},j_{m,k})} \right) \right) \right]^2}{\sigma^2_k \left(E_{\ell_k} \right)}. \tag{38}\]
- Parameters:
eaxis (np.ndarray) – Energy-loss axis
output (torch.tensor) – Neural Network output
output_for_derivative (list of torch.tensor) – Each entry in the list should correspond to the neural network output between de1 and de2 of a single spectrum in the replica
target (torch.tensor) – Raw EELS spectrum
error (torch.tensor) – Uncertainty on \(\log I_{\mathrm{EELS}}(\Delta E)\).
- Returns:
loss – Loss associated with the model output.
- Return type:
torch.tensor
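The cost function above is a \(\chi^2\) per data point: squared residuals between spectrum and model, weighted by the squared uncertainty, averaged over all points of the replica. The sketch below evaluates that expression with NumPy on flat arrays. It is an illustration only: the method documented here works on torch tensors, and the role of `output_for_derivative` is not reproduced. The name `chi2_loss` is hypothetical.

```python
import numpy as np


def chi2_loss(output, target, error):
    """Mean of ((target - output) / error)**2 over all points of a replica.

    Hypothetical NumPy helper, not the torch-based method described above.
    """
    output, target, error = (
        np.asarray(a, dtype=float) for a in (output, target, error)
    )
    return np.mean(((target - output) / error) ** 2)


output = [1.0, 2.0, 3.0]     # model prediction
target = [1.0, 3.0, 5.0]     # spectrum values
error = [1.0, 1.0, 2.0]      # one-sigma spread per point
loss = chi2_loss(output, target, error)
print(loss)
```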
- plot_hp_cluster(**kwargs)[source]
Create a plot of the hyperparameters plotted on top of the spectra per cluster.
- Parameters:
kwargs (dict, optional) – Additional keyword arguments.
- plot_hp_cluster_slope(**kwargs)[source]
Create a plot of the hyperparameters plotted on top of the slopes of the spectra per cluster.
- Parameters:
kwargs (dict, optional) – Additional keyword arguments.
- plot_training_report()[source]
Create the training report plot: evolution of the training and validation loss per epoch.
- save_figplot(fig, title='no_title.pdf')[source]
Display the computed values of dE1 (both methods) together with the raw EELS spectrum.
- Parameters:
fig (matplotlib.Figure) – Figure to be saved.
title (str) – Filename to store the plot in.
- save_hyperparameters()[source]
- Save the hyperparameters in hyperparameters.txt. These are:
cluster centroids, keep note if they were determined from the raw data, or if the log had been taken.
dE1 for all clusters
dE2 for all clusters
FWHM for all clusters
- save_scale_var_log_int_i()[source]
Save the scale variables of the log of the total integrated intensity of the spectra, denoted I.
- scale_eaxis()[source]
Scales the features of the energy axis between [0.1, 0.9]. This is to optimize the speed of the neural network.
- set_dydx_data()[source]
Determines the slope of all spectra per cluster, smooths the slope and takes the median per cluster.
- set_path_for_training_report(j)[source]
Set the save directory for the training report of the replica being trained on.
- Parameters:
j (int) – Index of the replica being trained on.
- set_test_x_y_sigma()[source]
Take the x, y and sigma data for the test set and reshape them for neural network input
- set_train_x_y_sigma()[source]
Take the x, y and sigma data for the train set and reshape them for neural network input
- train_and_evaluate_model(i, j, lr)[source]
Train and evaluate the model. Also saves the values of cost_train and cost_test per epoch.
- EELSFitter.core.training.find_scale_var(inp, min_out=0.1, max_out=0.9)[source]
Computes the scaling parameters needed to rescale the training data to lie between min_out and max_out. For our neural network the value range [0.1, 0.9] ensures the neuron activation states will typically lie close to the linear region of the sigmoid activation function.
- Parameters:
inp (numpy.ndarray, shape=(M,)) – training data to be rescaled
min_out (float) – lower limit. Set to 0.1 by default.
max_out (float) – upper limit. Set to 0.9 by default
- Returns:
a, b – list of rescaling parameters
- Return type:
- EELSFitter.core.training.scale(inp, ab)[source]
Rescale the training data to lie between 0.1 and 0.9. Rescaling features is to help speed up the neural network training process. The value range [0.1, 0.9] ensures the neuron activation states will typically lie close to the linear region of the sigmoid activation function.
- Parameters:
inp (numpy.ndarray, shape=(M,)) – training data to be rescaled, e.g. \(\Delta E\)
ab (numpy.ndarray, shape=(M,)) – scaling parameters, which can be found with find_scale_var().
- Return type:
Rescaled training data
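Rescaling to a fixed range is typically a linear map x → a·x + b, with a and b chosen so that the smallest input lands on the lower limit and the largest on the upper limit. The sketch below assumes exactly that linear form. The names `find_scale_var_sketch` and `scale_sketch` are hypothetical stand-ins and may differ in detail from the functions documented here.

```python
import numpy as np


def find_scale_var_sketch(inp, min_out=0.1, max_out=0.9):
    """Return (a, b) such that a * inp + b spans [min_out, max_out].

    Hypothetical helper assuming a linear rescaling.
    """
    a = (max_out - min_out) / (inp.max() - inp.min())
    b = min_out - a * inp.min()
    return a, b


def scale_sketch(inp, ab):
    """Apply the linear rescaling with parameters ab = (a, b)."""
    return inp * ab[0] + ab[1]


energy = np.array([-2.0, 0.0, 6.0])
ab = find_scale_var_sketch(energy)
scaled = scale_sketch(energy, ab)
print(ab, scaled)
```

Computing (a, b) once on the training data and reusing them at prediction time keeps inputs on the same scale the network was trained on.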
- EELSFitter.core.training.smooth_signals_per_cluster(signals, window_length=51, window_type='hanning')[source]
Smooth all signals in a cluster using a window length window_length and a window type window_type.
This method is based on the convolution of a scaled window with the signal. The signal is prepared by introducing reflected copies of the signal (with the window size) at both ends so that transient parts are minimized in the beginning and end part of the output signal.
- Parameters:
signals (numpy.ndarray, shape=(M,)) – The input data
window_length (int, optional) – The dimension of the smoothing window; should be an odd integer. Default is 51
window_type (str, optional) – The type of window from "flat", "hanning", "hamming", "bartlett", "blackman" and "kasier". A "flat" window will produce a moving average smoothing. Default is "hanning".
- Returns:
signal_smooth – The smoothed signal
- Return type:
numpy.ndarray, shape=(M,)
EELSFitter.core.gainpeakfitter module
- class EELSFitter.core.gainpeakfitter.GainPeakFitter(x_signal, y_signal, image_shape, image=None)[source]
This class extracts gain peaks from an EEL spectrum. The GainPeakFitter fits the ZLP to a Gaussian based on the ZLP’s FWHM. The ZLP is then subtracted from the EEL spectrum (also referred to as ‘the signal’). The subtracted signal contains a peak which is then fitted to a Lorentzian. From the Lorentzian, the energy of the gain peak is determined.
INPUT
image (spectral image) – 4D data from a .dm4 file
Example
To fit the gain/loss peaks:
lrtz = GainPeakFitter(x_signal, y_signal, image_shape)
lrtz.generate_best_fits()
lrtz.fit_gain_peak()
lrtz.fit_loss_peak()
To plot the gain/loss peaks:
lrtz.create_new_plot()
lrtz.plot_all_results()
lrtz.ax.set_ylim(0, 1e4)
lrtz.fig
lrtz.plot_signal()
lrtz.plot_model()
lrtz.plot_subtracted()
lrtz.plot_gain_peak()
lrtz.plot_loss_peak()
lrtz.print_results()
- array2D_of_intensity_at_energy_loss(eloss=-1.1)[source]
Generate 2D array of intensities for each SI-pixel at a chosen energy loss.
- curve_fit_between_background(x_left=-1.6, x_right=-0.6)[source]
Curve fit between chosen eloss fitting range.
- determine_parameters(function)[source]
Determines the parameters specifying a Gaussian or Lorentzian function using the signal’s peak height and the FWHM of the peak.
- Parameters:
function (str, {'Gaussian', 'Lorentzian'}) – Model choice for the ZLP.
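For reference, the standard relations between the FWHM and the width parameter of each model can be written as below. This is a sketch of the textbook conversions, not the method's actual code; the function names are hypothetical, and the peak height is assumed to set the amplitude directly.

```python
import numpy as np


def gaussian_sigma_from_fwhm(fwhm):
    """Standard deviation of a Gaussian with the given FWHM."""
    return fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def lorentzian_gamma_from_fwhm(fwhm):
    """Half width at half maximum of a Lorentzian, so that 2 * gamma = FWHM."""
    return fwhm / 2.0


fwhm = 0.3  # eV, example value
sigma = gaussian_sigma_from_fwhm(fwhm)
gamma = lorentzian_gamma_from_fwhm(fwhm)
```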
- fit_gain_peak_mc(i, j, L_bound=-3.5, R_bound=-0.5, return_all=False, return_conf_interval=False, **kwargs)[source]
Use the Monte Carlo replica method to fit a Lorentzian model to the subtracted spectrum in a specified energy interval [L_bound, R_bound].
- Parameters:
i (int) – y-coordinate of the pixel.
j (int) – x-coordinate of the pixel.
L_bound (float, optional) – Left bound of the interval in which to fit to the subtracted spectrum.
R_bound (float, optional) – Right bound of the interval in which to fit to the subtracted spectrum.
return_all (bool, optional) – Option to return the subtracted spectra for all replicas corresponding to this pixel.
return_conf_interval (bool, optional) – Option to specify if the upper and lower bounds of the confidence interval must be returned.
kwargs (dict, optional) – Additional keyword arguments.
- Returns:
gain (numpy.ndarray) – Array with the median Lorentzian fit to the subtracted spectrum.
gain_low (numpy.ndarray, optional) – Lower bound of the Lorentzian fit.
gain_high (numpy.ndarray, optional) – Upper bound of the Lorentzian fit.
- fit_models(n_rep=500, n_clusters=5, function='Gaussian', conf_interval=1, signal_type='EELS', **kwargs)[source]
Use the Monte Carlo replica method to fit the chosen function model to the ZLP. This method assumes that, within each cluster, the ZLP is sampled from the same underlying distribution. It samples that distribution to obtain the median, low, and high predictions for the ZLP at each loss value.
The model predictions are stored in self.zlp_models_all, where the median, low, and high values are the first, second, and third element respectively.
- Parameters:
n_rep (int, default=500) – Number of Monte Carlo replicas to use.
n_clusters (int, default=5) – Number of clusters to use.
function (str, {'Gaussian', 'Split Gausian', 'Lorentzian', 'Pearson VII', 'Split Pearson VII', 'Pseudo-Voigt', 'Split Pseudo-Voigt', 'Generalised Peak', 'Kaiser window'}) – Model choice for the ZLP.
conf_interval (float, optional) – The ratio of spectra returned. The spectra are selected based on the based_on value. The default is 1.
signal_type (str, optional) – Description of signal, 'EELS' by default.
kwargs (dict, optional) – Additional keyword arguments.
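The step from replica predictions to median, low, and high values can be illustrated as follows. This sketch uses synthetic replicas in place of real model fits, and the central 68% band (16th and 84th percentiles) is an assumption chosen for the example; the package may define its bounds differently.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
x = np.linspace(-1.0, 1.0, 201)  # energy-loss axis (eV)
true_zlp = np.exp(-x**2 / (2 * 0.1**2))

# Hypothetical stand-in for n_rep replica predictions: in the real method each
# row would come from fitting the chosen model to one Monte Carlo replica.
n_rep = 500
replicas = true_zlp + rng.normal(scale=0.02, size=(n_rep, x.size))

# Median and an assumed central 68% band, evaluated at each energy-loss value.
low, median, high = np.percentile(replicas, [16, 50, 84], axis=0)

# Stored in the order described above: median, low, high.
zlp_models = np.stack([median, low, high])
```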
- fwhm(row=0, col=0, do_fit=True, **kwargs)[source]
Calculates the FWHM of the ZLP in a particular pixel (row, col). Optionally calculate FWHM through fitting a Gaussian to the ZLP and extracting the FWHM.
- Parameters:
do_fit (bool, default=True) – Option to obtain FWHM through fitting a Gaussian.
- Returns:
fwhm (float) – FWHM of the ZLP in chosen pixel.
fit (float) – FWHM obtained through fitting a Gaussian to the ZLP.
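One way to estimate the FWHM directly from sampled data, without a fit, is to locate the half-maximum crossings on either side of the peak by linear interpolation. The sketch below assumes a single positive peak on a zero baseline; the helper name fwhm_from_samples is hypothetical and this is not necessarily how the method above computes it.

```python
import numpy as np


def fwhm_from_samples(x, y):
    """Estimate the FWHM of a single peak by linear interpolation (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i_max = int(np.argmax(y))
    half = y[i_max] / 2.0

    # Walk outwards from the peak to the first sample at or below half maximum.
    left = i_max
    while left > 0 and y[left] > half:
        left -= 1
    right = i_max
    while right < y.size - 1 and y[right] > half:
        right += 1
    if y[left] > half or y[right] > half:
        raise ValueError("peak is not fully contained in the sampled range")

    # Interpolate between the two samples that bracket each crossing.
    x_left = np.interp(half, [y[left], y[left + 1]], [x[left], x[left + 1]])
    x_right = np.interp(half, [y[right], y[right - 1]], [x[right], x[right - 1]])
    return x_right - x_left


x = np.linspace(-1.0, 1.0, 2001)
y = np.exp(-x**2 / (2 * 0.1**2))
width = fwhm_from_samples(x, y)
```

For a Gaussian with standard deviation 0.1 the result should be close to 2 * sqrt(2 * ln 2) * 0.1.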
- gaussian(x, a, sigma, x0=0)[source]
Gaussian centered around x = x0.
- Parameters:
- Returns:
Gaussian.
- Return type:
- generalised_peak(x, x_0, delta, nu)[source]
Generalised Peak function calculated as
\[\frac{2}{\pi \delta} \Bigg|\frac{\Gamma\left[\frac{\nu}{2} + i \gamma_\nu \left(\frac{4 x_s^2}{\pi^2 \delta^2} \right)^2 \right]}{\Gamma\left[\frac{\nu}{2} \right]}\Bigg|^2,\]
where \(\gamma_\nu = \sqrt{\pi} \frac{\Gamma\left[\frac{\nu + 1}{2} \right]}{\Gamma\left[\nu + \frac{1}{2} \right]}\).
- Parameters:
x (numpy.ndarray) – 1D array of energy loss.
x_0 (float) – Energy offset.
delta (float) – Parameter describing the peak width.
nu (float) – Parameter describing the peak shape.
- Returns:
Generalised Peak function.
- Return type:
- get_model(i, j, signal_type='EELS', return_all=False, return_conf_interval=False, **kwargs)[source]
Retrieves the model fit to the ZLP at pixel (i, j).
- Parameters:
i (int) – y-coordinate of the pixel.
j (int) – x-coordinate of the pixel.
signal_type (str, optional) – The type of signal that is requested, should comply with the defined names. Set to 'EELS' by default.
return_all (bool, optional) – Option to return all models.
return_conf_interval (bool, optional) – Option to specify if the upper and lower bounds of the confidence interval must be returned.
kwargs (dict, optional) – Additional keyword arguments.
- Returns:
model (numpy.ndarray) – Array with the median model fit to the ZLP from the requested pixel.
model_low (numpy.ndarray, optional) – Lower bound of the confidence interval of the model fit to the ZLP.
model_high (numpy.ndarray, optional) – Upper bound of the confidence interval of the model fit to the ZLP.
- get_subtracted_spectrum(i, j, signal_type='EELS', return_all=False, return_conf_interval=False, **kwargs)[source]
Retrieves the subtracted spectrum at pixel (i, j).
- Parameters:
i (int) – y-coordinate of the pixel.
j (int) – x-coordinate of the pixel.
signal_type (str, optional) – The type of signal that is requested, should comply with the defined names. Set to 'EELS' by default.
return_all (bool, optional) – Option to return the subtracted spectra for all replicas corresponding to this pixel.
return_conf_interval (bool, optional) – Option to specify if the upper and lower bounds of the confidence interval must be returned.
kwargs (dict, optional) – Additional keyword arguments.
- Returns:
signal (numpy.ndarray) – Array with the median subtracted spectrum from the requested pixel.
signal_low (numpy.ndarray, optional) – Lower bound of the confidence interval of the subtracted spectrum.
signal_high (numpy.ndarray, optional) – Upper bound of the confidence interval of the subtracted spectrum.
- inspect_spectrum(row=0, col=0, function='Gaussian', method='fit', **kwargs)[source]
Fit the chosen function model to the ZLP and obtain the subtracted spectrum.
- Parameters:
function (str, {'Gaussian', 'Split Gausian', 'Lorentzian', 'Pearson VII', 'Split Pearson VII', 'Pseudo-Voigt', 'Split Pseudo-Voigt', 'Generalised Peak', 'Kaiser window'}) – Model choice for the ZLP.
method (str, {'fit', 'FWHM'}) – Method to use to extract model fit parameters. 'FWHM' is only supported for Gaussian and Lorentzian ZLP models.
kwargs (dict, optional) – Additional keyword arguments.
- kaiser(x, L, m)[source]
Kaiser window function, calculated as
\[\begin{split}w_0(x) \triangleq \left\{ \begin{array}{cl} \frac{1}{L} \frac{I_0\left[m \sqrt{1-(2 x / L)^2}\right]}{I_0[m]}, & |x| \leq L / 2 \\ 0, & |x|>L / 2 \end{array} \right.,\end{split}\]
where \(I_0\) is the zeroth-order modified Bessel function of the first kind.
- Parameters:
x (numpy.ndarray) – 1D array of energy loss.
L (float) – Window duration.
m (float) – Parameter determining the window shape.
- Returns:
Kaiser window function.
- Return type:
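The Kaiser window formula above translates to NumPy roughly as follows. This is a sketch written from the formula, using numpy.i0 for the zeroth-order modified Bessel function; the function name kaiser_window is chosen for this example and the package's own code may differ in details.

```python
import numpy as np


def kaiser_window(x, L, m):
    """Kaiser window w_0(x) of duration L and shape parameter m (sketch)."""
    x = np.asarray(x, dtype=float)
    inside = np.abs(x) <= L / 2.0
    # Clip at zero so points outside the window do not produce NaNs in sqrt.
    arg = np.clip(1.0 - (2.0 * x / L) ** 2, 0.0, None)
    values = np.i0(m * np.sqrt(arg)) / (L * np.i0(m))
    # The window is identically zero for |x| > L / 2.
    return np.where(inside, values, 0.0)


x = np.linspace(-2.0, 2.0, 401)
w = kaiser_window(x, L=2.0, m=3.0)
```

At x = 0 the Bessel ratio equals one, so the window value there is 1 / L.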
- lorentzian(x, x0, a, gam)[source]
Lorentzian centered around x = x0.
- Parameters:
x (numpy.ndarray) – 1D array of energy loss.
x0 (float) – Energy loss at the center of the Lorentzian.
a (float) – Height of the Lorentzian peak.
gam (float) – 2 * gam is the full width at half maximum (FWHM).
- Returns:
Lorentzian.
- Return type:
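A Lorentzian parameterised by peak height and half width, consistent with the parameter descriptions above, can be sketched as below. The exact normalisation used by the package is not shown in this reference, so the height-normalised form here is an assumption.

```python
import numpy as np


def lorentzian(x, x0, a, gam):
    """Lorentzian with peak height a at x0 and FWHM 2 * gam (assumed form)."""
    x = np.asarray(x, dtype=float)
    return a * gam**2 / ((x - x0) ** 2 + gam**2)


x = np.linspace(-3.0, 1.0, 401)
y = lorentzian(x, x0=-1.1, a=100.0, gam=0.2)
```

With this form the value drops to a / 2 at x0 ± gam, which is what makes 2 * gam the FWHM.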
- lorentzian_background(x, x0, a, gam, E_0, b)[source]
Lorentzian centered around x = x0.
- Parameters:
x (numpy.ndarray) – 1D array of energy loss.
x0 (float) – Energy loss at the center of the Lorentzian.
a (float) – Height of the Lorentzian peak.
gam (float) – 2 * gam is the full width at half maximum (FWHM).
- Returns:
Lorentzian.
- Return type:
- model(x, function, **kwargs)[source]
Calculates the ZLP model using the optimal fit parameters.
- Parameters:
x (numpy.ndarray) – 1D array of energy loss.
function (str, {'Gaussian', 'Split Gausian', 'Lorentzian', 'Pearson VII', 'Split Pearson VII', 'Pseudo-Voigt', 'Split Pseudo-Voigt', 'Generalised Peak', 'Kaiser window'}) – Model choice for the ZLP.
kwargs (dict, optional) – Additional keyword arguments.
- Returns:
ZLP model fit.
- Return type:
- model_fit_between(function='Gaussian', **kwargs)[source]
Fits the model to the ZLP after deleting the x coordinates that contain information about gain/loss peaks.
- Parameters:
- Returns:
popt (numpy.ndarray) – Optimal values for the parameters so that the sum of the squared residuals of f(xdata, *popt) - ydata is minimized.
pcov (numpy.ndarray) – The estimated covariance of popt. The diagonals provide the variance of the parameter estimates. To compute one standard deviation errors on the parameters use perr = np.sqrt(np.diag(pcov)).
- pearson(x, I_max, x_0, w, m)[source]
Pearson VII function calculated as
\[I(x, I_{\mathrm{max}}, x_0, w, m) = I_{\mathrm{max}} \frac{w^{2m}}{\left[w^2 + \left(2^\frac{1}{m} - 1 \right) \left(x - x_0 \right)^2 \right]^m}.\]
- Parameters:
x (numpy.ndarray) – 1D array of energy loss.
I_max (float) – Height of the peak.
x_0 (float) – Energy loss at the center of the peak.
w (float) – Parameter related to the width of the peak.
m (float) – Parameter chosen to suit a particular peak shape.
- Returns:
Pearson VII function.
- Return type:
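The Pearson VII profile in its standard form, with the bracket raised to the power m so that I_max is the peak height, can be sketched as below. The function name pearson_vii is chosen for this example; the sketch follows the textbook definition rather than the package source.

```python
import numpy as np


def pearson_vii(x, I_max, x_0, w, m):
    """Pearson VII profile with height I_max and half width w (standard form)."""
    x = np.asarray(x, dtype=float)
    denominator = (w**2 + (2 ** (1.0 / m) - 1.0) * (x - x_0) ** 2) ** m
    return I_max * w ** (2 * m) / denominator


x = np.linspace(-1.0, 1.0, 201)
y = pearson_vii(x, I_max=10.0, x_0=0.0, w=0.1, m=2.0)
```

In this form the profile falls to I_max / 2 at x_0 ± w for any m, and m = 1 reduces to a Lorentzian with half width w.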
- plot_all_results(i=None, j=None, monte_carlo=False)[source]
Plot all components: original signal, Model-fit ZLP, subtracted spectrum, Lorentzian-fit gain peak.
- plot_array2D_of_intensity_at_energy_loss(eloss=-1.1, cmap='turbo', cbar_title='intensity', dm4_filename='')[source]
Example heatmap plot: the SI showing only the intensity at a chosen energy loss (-1.1 eV by default).
- Parameters:
- Returns:
fig – pyplot figure.
ax – pyplot axis.
- pseudo_voigt(x, I_max, x_0, f, eta)[source]
Linear combination of a Gaussian and Lorentzian function, both described by the same FWHM f.
- Parameters:
x (numpy.ndarray) – 1D array of energy loss.
I_max (float) – Height of the peak.
x_0 (float) – Energy loss at the center of the peak.
f (float) – FWHM.
eta (float) – Mixing parameter.
- Returns:
Pseudo-Voigt function.
- Return type:
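A pseudo-Voigt profile built from a Gaussian and a Lorentzian that share the same FWHM f can be sketched as follows. The convention that eta weights the Lorentzian part and (1 - eta) the Gaussian part is an assumption; this reference does not state which component eta multiplies.

```python
import numpy as np


def pseudo_voigt(x, I_max, x_0, f, eta):
    """Pseudo-Voigt: eta * Lorentzian + (1 - eta) * Gaussian (assumed mixing)."""
    x = np.asarray(x, dtype=float)
    sigma = f / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # Gaussian std from FWHM
    gamma = f / 2.0                                  # Lorentzian HWHM from FWHM
    gauss = np.exp(-((x - x_0) ** 2) / (2.0 * sigma**2))
    lorentz = gamma**2 / ((x - x_0) ** 2 + gamma**2)
    return I_max * (eta * lorentz + (1.0 - eta) * gauss)


x = np.linspace(-1.0, 1.0, 201)
y = pseudo_voigt(x, I_max=5.0, x_0=0.0, f=0.2, eta=0.3)
```

Since both components have unit height and the same FWHM, the mixture has height I_max and FWHM f for any eta between 0 and 1.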
- signal_subtracted(function='Gaussian', **kwargs)[source]
Generate signal spectrum minus model fitted ZLP.
- Parameters:
- Returns:
1D array of signal spectrum minus model fitted ZLP.
- Return type:
- split_gaussian(x, a, sigma_left, sigma_right, x0=0)[source]
Gaussian centered around x = x0 with a different standard deviation for x < x0 and x > x0.
- Parameters:
- Returns:
Split Gaussian.
- Return type:
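The split construction can be sketched by selecting the standard deviation point by point, depending on the side of the centre. Treating a as the peak height is an assumption, since the parameters are not documented here.

```python
import numpy as np


def split_gaussian(x, a, sigma_left, sigma_right, x0=0):
    """Gaussian with separate widths left and right of x0 (a assumed to be height)."""
    x = np.asarray(x, dtype=float)
    # Choose the width according to which side of the centre each point lies on.
    sigma = np.where(x < x0, sigma_left, sigma_right)
    return a * np.exp(-((x - x0) ** 2) / (2.0 * sigma**2))


x = np.linspace(-1.0, 1.0, 201)
y = split_gaussian(x, a=1.0, sigma_left=0.05, sigma_right=0.15)
```

Both halves meet at the same height a at x = x0, so the profile stays continuous while its two sides decay at different rates.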
- split_lorentzian(x, x0, a, gam_left, gam_right)[source]
Lorentzian centered around x = x0 with a different FWHM for x < x0 and x > x0.
- Parameters:
x (numpy.ndarray) – 1D array of energy loss.
x0 (float) – Energy loss at the center of the Lorentzian.
a (float) – Height of the Lorentzian peak.
gam_left (float) – 2 * gam_left is full width at half maximum (FWHM) of the left half of the Lorentzian.
gam_right (float) – 2 * gam_right is full width at half maximum (FWHM) of the right half of the Lorentzian.
- Returns:
Split Lorentzian.
- Return type:
- split_pearson(x, I_max, x_0, w_left, w_right, m)[source]
Pearson VII function with a different w for x < x_0 and x > x_0.
- Parameters:
x (numpy.ndarray) – 1D array of energy loss.
I_max (float) – Height of the peak.
x_0 (float) – Energy loss at the center of the peak.
w_left (float) – Parameter related to the width of the peak for x < x_0.
w_right (float) – Parameter related to the width of the peak for x > x_0.
m (float) – Parameter chosen to suit a particular peak shape.
- Returns:
Split Pearson VII function.
- Return type: