EELSFitter.core package

EELSFitter.core.spectral_image module

class EELSFitter.core.spectral_image.SpectralImage(data, deltaE=None, pixel_size=None, beam_energy=None, collection_angle=None, convergence_angle=None, name=None, **kwargs)[source]

KK_pixel(i, j, signal_type='EELS', iterations=1, **kwargs)[source]

Perform a Kramers-Kronig analysis on pixel (i, j).

Parameters:

i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel.
signal_type (str, optional) – Type of spectrum. Set to EELS by default.
iterations (int) – Number of the iterations for the internal loop to remove the surface plasmon contribution. If 1 the surface plasmon contribution is not estimated and subtracted (the default is 1).

Returns:

dielectric_functions (numpy.ndarray, shape=(M,)) – Collection of dielectric-function replicas at pixel (i, j).
ts (float) – Thickness.
S_ss (array_like) – Surface scatterings.
signal_ssds (array_like) – Deconvoluted EELS spectrum.

calc_axes()[source]: Determines the x_axis and y_axis of the spectral image. Stores them in self.x_axis and self.y_axis respectively.

calc_thickness(signal, n=None, rho=None, n_zlp=None)[source]

Calculates thickness from sample data by one of two methods:

Kramers-Kronig sum rule using the refractive index [Egerton and Cheng, 1987]
Log ratio using mass density [Iakoubovskii et al., 2008]

Parameters:

signal (numpy.ndarray, shape=(M,)) – spectrum
n (float) – refractive index
rho (float) – mass density
n_zlp (float or int) – Set to 1 by default, for already normalized EELS spectra.

Returns:

t – thickness

Return type:

float

Notes

If using the refractive index, surface scatterings are not corrected for. If you wish to correct for surface scatterings, please extract the thickness t from kramers_kronig_analysis().
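The log-ratio idea can be illustrated with a minimal sketch. It assumes the standard relation t = λ ln(I_total / I_ZLP) and takes the inelastic mean free path λ as a plain number. How calc_thickness() obtains λ from rho is not shown here, and the function name and toy values below are hypothetical.

```python
import math


def log_ratio_thickness(signal, zlp, mean_free_path):
    """Thickness from t = lambda * ln(I_total / I_zlp) (illustrative only)."""
    total_intensity = sum(signal)
    zlp_intensity = sum(zlp)
    return mean_free_path * math.log(total_intensity / zlp_intensity)


# Toy spectrum: the ZLP carries 80 of the 100 total counts.
toy_signal = [0.0, 80.0, 12.0, 6.0, 2.0]
toy_zlp = [0.0, 80.0, 0.0, 0.0, 0.0]

# The mean free path (in nm) is an assumed input, not a value from EELSFitter.
thickness_nm = log_ratio_thickness(toy_signal, toy_zlp, mean_free_path=100.0)
```

The result is expressed in the same unit as the mean free path that is passed in.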

cluster(n_clusters=5, based_on='log_zlp', init='k-means++', n_times=10, max_iter=300, seed=None, save_seed=False, algorithm='lloyd', **kwargs)[source]

Clusters the spectral image according to the (log) integrated intensity at each pixel. Cluster means are stored in the attribute self.cluster_centroids. These are then passed on to the cluster_on_centroids function, where the index of the cluster to which each pixel belongs is stored in the attribute self.cluster_labels.

Parameters:

n_clusters (int, optional) – Number of clusters, 5 by default
based_on (str, optional) – One can cluster either on the sum of the intensities (pass 'sum'), the log of the sum (pass 'log_sum'), the log of the ZLP peak value (pass 'log_peak'), the log of the ZLP peak value + the two bins next to the peak value (pass 'log_zlp'), the log of the sum of the bulk spectrum (no zlp) (pass 'log_bulk'), the thickness (pass 'thickness'). The default is 'log_zlp'.
init ({'k-means++', 'random'}, callable or array-like of shape (n_clusters, n_features), default='k-means++') – Method used to initialise the centroids.
n_times (int, default=10) – Number of times the k-means algorithm will be run with different centroid seeds. The final result will be the best output of n_times consecutive runs in terms of inertia.
max_iter (int, default=300) – Maximum number of iterations of the k-means algorithm for a single run.
seed (int or None, default=None) – Determines random number generation for centroid initialization. Use an int to make the randomness deterministic.
save_seed (bool, default=False) – save the seed with corresponding settings to get to the same result
algorithm ({'lloyd', 'elkan'}, default='lloyd') – K-means algorithm to use. The classical EM-style algorithm is 'lloyd'. The 'elkan' variation can be more efficient on some datasets with well-defined clusters, by using the triangle inequality. However, it’s more memory intensive due to the allocation of an extra array of shape (n_samples, n_clusters).
kwargs –
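As a rough illustration of clustering on a scalar feature such as the log of the integrated intensity, the sketch below runs plain Lloyd iterations in one dimension. It is not the code used by cluster(): the helper name, the starting centroids and the toy intensities are all assumptions made for the example.

```python
import math


def kmeans_1d(values, centroids, max_iter=300):
    """Plain Lloyd iterations on scalar features, from given starting centroids."""
    centroids = list(centroids)
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centroids.append(
                sum(members) / len(members) if members else centroids[k]
            )
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical integrated intensities of six pixels: three thin, three thick.
intensities = [1.0e3, 1.2e3, 0.9e3, 5.0e4, 5.5e4, 4.8e4]
features = [math.log(v) for v in intensities]
centroids, labels = kmeans_1d(features, centroids=[features[0], features[3]])
```

The assignment step on its own is also what reconstructing a clustering from known centroids amounts to in this simplified picture.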

cluster_on_centroids(cluster_centroids, based_on='log_zlp', **kwargs)[source]

If the image has been clustered before and the cluster centroids are already known, one can use this function to reconstruct the original clustering of the image.

Parameters:

cluster_centroids (numpy.ndarray, shape=(M,)) – Array with the cluster centroids
based_on (str, optional) – One can cluster either on the sum of the intensities (pass 'sum'), the log of the sum (pass 'log_sum'), the log of the ZLP peak value (pass 'log_peak'), the log of the ZLP peak value + the two bins next to the peak value (pass 'log_zlp'), the log of the sum of the bulk spectrum (no zlp) (pass 'log_bulk'), the thickness (pass 'thickness'). The default is 'log_zlp'.
kwargs –

static compressed_pickle(title, data)[source]: Saves data at location title as compressed pickle.

cut_image_energy(e1=None, e2=None, include_edge=True)[source]

Cuts the spectral image at e1 and e2 and keeps only the part in between.

Parameters:

e1 (float, optional) – lower cut. The default is None, which means no cut is applied.
e2 (float, optional) – upper cut. The default is None, which means no cut is applied.
include_edge (bool, optional) – If True, the edge values given in e1 and e2 are included in the cut result. Default is True.
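The effect of include_edge can be sketched on a single spectrum. The helper below is a hypothetical stand-in that works on plain lists; the actual method operates on the whole spectral image and its internals are not shown in this reference.

```python
def cut_energy(eaxis, spectrum, e1=None, e2=None, include_edge=True):
    """Keep the bins between e1 and e2; the edges are kept only if include_edge."""

    def keep(energy):
        if e1 is not None and (energy < e1 or (energy == e1 and not include_edge)):
            return False
        if e2 is not None and (energy > e2 or (energy == e2 and not include_edge)):
            return False
        return True

    kept = [(e, s) for e, s in zip(eaxis, spectrum) if keep(e)]
    return [e for e, _ in kept], [s for _, s in kept]


eaxis = [-1.0, 0.0, 1.0, 2.0, 3.0]
spectrum = [5.0, 100.0, 20.0, 8.0, 3.0]

inclusive = cut_energy(eaxis, spectrum, e1=0.0, e2=2.0)
exclusive = cut_energy(eaxis, spectrum, e1=0.0, e2=2.0, include_edge=False)
```

Passing None for either bound leaves that side of the spectrum uncut, matching the documented defaults.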

cut_image_pixels(range_width, range_height)[source]

Cuts the spectral image to the given horizontal and vertical pixel ranges.

Parameters:

range_width (numpy.ndarray, shape=(2,)) – Contains the horizontal selection cut
range_height (numpy.ndarray, shape=(2,)) – Contains the vertical selection cut

static decompress_pickle(file)[source]

Opens, decompresses and returns the pickle file at location file.

Parameters: file (str) – location where the pickle file is stored
Returns: data
Return type: SpectralImage
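A compressed pickle round trip can be sketched with the standard library alone. The sketch assumes that the .pbz2 files are bz2-compressed pickles, which the extension suggests but this reference does not state; the helper names and the payload are hypothetical.

```python
import bz2
import os
import pickle
import tempfile


def save_pbz2(path, data):
    """Write data as a bz2-compressed pickle (assumed .pbz2 layout)."""
    with bz2.BZ2File(path, "wb") as handle:
        pickle.dump(data, handle)


def load_pbz2(path):
    """Read back an object written by save_pbz2."""
    with bz2.BZ2File(path, "rb") as handle:
        return pickle.load(handle)


# Round trip through a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.pbz2")
    payload = {"deltaE": 0.05, "counts": [1, 2, 3]}
    save_pbz2(path, payload)
    restored = load_pbz2(path)
```

As with any pickle, such files should only be loaded from trusted sources.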

deconvolution(signal, zlp, correction=True)[source]

Perform deconvolution on a given signal with a given Zero Loss Peak. This removes both the ZLP and plural scattering. Based on the Fourier Log Method [Johnson and Spence, 1974, Egerton, 2011]

Parameters:

signal (numpy.ndarray, shape=(M,)) – Raw signal of length M
zlp (numpy.ndarray, shape=(M,)) – zero-loss peak of length M
correction (bool) – Sometimes a decreasing linear slope appears at the position of the ZLP after deconvolution. This correction fits a linear function and subtracts it from the signal. Default is True.

Returns:

output – deconvoluted spectrum.

Return type:

numpy.ndarray, shape=(M,)
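The Fourier-log idea can be sketched in a few lines. The version below is a simplified illustration, not the deconvolution() implementation: it uses a naive DFT instead of an FFT, the ideal form S(ν) = I0 ln[J(ν)/Z(ν)], no reconvolution or slope correction, and a toy delta-like ZLP. All names and values are assumptions.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform; fine for a short demo."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    result = []
    for k in range(n):
        acc = sum(
            values[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
            for m in range(n)
        )
        result.append(acc / n if inverse else acc)
    return result


def fourier_log(signal, zlp):
    """Single-scattering distribution from S(nu) = I0 * ln(J(nu) / Z(nu))."""
    i0 = sum(zlp)
    s_nu = [i0 * cmath.log(j / z) for j, z in zip(dft(signal), dft(zlp))]
    return [v.real for v in dft(s_nu, inverse=True)]


# Toy data: a delta-like ZLP and a small single-scattering distribution.
n_bins = 16
zlp = [100.0] + [0.0] * (n_bins - 1)
ssd_true = [0.0, 0.0, 2.0, 5.0, 3.0, 1.0] + [0.0] * (n_bins - 6)

# Forward model with plural scattering: J(nu) = Z(nu) * exp(S(nu) / I0).
i0 = sum(zlp)
j_nu = [z * cmath.exp(s / i0) for z, s in zip(dft(zlp), dft(ssd_true))]
signal = [v.real for v in dft(j_nu, inverse=True)]

ssd_recovered = fourier_log(signal, zlp)
```

On noisy data the plain logarithm amplifies high-frequency noise, which is one reason practical implementations add further steps.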

extrp_signal(signal, r=None)[source]

Extrapolates the signal. The extrapolation model generates data that vanishes to zero beyond the measured range, using the form A*E^-r. See Egerton, paragraph 4.2.2, for details.

Parameters:

signal (numpy.ndarray, shape=(M,)) – spectrum
r (float, optional) – extrapolation parameter

find_optimal_amount_of_clusters(cluster_start=None, sigma=0.05, n_models=1000, conf_interval=1, signal_type='EELS', **kwargs)[source]

Finds the optimal amount of clusters based on the amount of models that need to be trained.

Parameters:

cluster_start –
sigma –
n_models –
conf_interval –
signal_type –
kwargs –

get_cluster_signals(conf_interval=1, signal_type='EELS')[source]

Get the signals ordered per cluster. Cluster signals are stored in attribute self.cluster_signals. Note that the pixel location information is lost.

Parameters:

conf_interval (float, optional) – The ratio of spectra returned. The spectra are selected based on the based_on value. The default is 1.
signal_type (str, optional) – Description of signal, 'EELS' by default.

Returns:

cluster_signals – An array with size equal to the number of clusters. Each entry is a 2D array that contains all the spectra within that cluster.

Return type:

numpy.ndarray, shape=(M,)

get_extrp_param(signal, range_perc=0.1)[source]

Retrieve the extrapolation parameter from the last 10% of a given signal

Parameters:

signal (numpy.ndarray, shape=(M,)) –
range_perc (float, optional) –

Returns:

r – extrapolation parameter

Return type:

float
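One way to picture how the exponent r of the A*E^-r model can be estimated from the tail of a spectrum is a straight-line fit in log-log space. The sketch below is an assumption about the approach, not the body of get_extrp_param(): it takes an explicit energy axis, requires positive energies and intensities, and uses hypothetical names.

```python
import math


def fit_power_law_tail(eaxis, signal, range_perc=0.1):
    """Least-squares fit of log(signal) = log(A) - r * log(E) on the tail."""
    n_tail = max(2, int(round(len(signal) * range_perc)))
    xs = [math.log(e) for e in eaxis[-n_tail:]]
    ys = [math.log(s) for s in signal[-n_tail:]]
    x_mean = sum(xs) / n_tail
    y_mean = sum(ys) / n_tail
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    r = -slope
    a = math.exp(y_mean + r * x_mean)
    return a, r


def extrapolate(eaxis_extra, a, r):
    """Evaluate A * E^-r on energies beyond the measured range."""
    return [a * e ** (-r) for e in eaxis_extra]


# Synthetic tail that follows 500 * E^-3 exactly.
eaxis = [float(e) for e in range(1, 41)]
signal = [500.0 * e ** -3.0 for e in eaxis]

a_fit, r_fit = fit_power_law_tail(eaxis, signal)
tail = extrapolate([41.0, 42.0], a_fit, r_fit)
```

With range_perc=0.1 only the last four of the forty bins enter the fit.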

get_image_signals(signal_type='EELS', **kwargs)[source]

Get all the signals of the image.

Parameters:

signal_type (str, optional) – Description of signal, 'EELS' by default.
kwargs –

Returns:

image_signals

Return type:

numpy.ndarray, shape=(M,N)

get_pixel_matched_zlp_models(i, j, signal_type='EELS', signal=None, **kwargs)[source]

Returns the shape-(M, N) array of matched ZLP model predictions at pixel (i, j) after training. M and N correspond to the number of model predictions and \(\Delta E\) s respectively.

Parameters:

i (int) – y-coordinate of the pixel.
j (int) – x-coordinate of the pixel.
signal_type (str, bool) – Description of signal type. Set to 'EELS' by default.
signal (array, bool,) – signal you want to match the zlps to. Important to do if you did not do any pooling, pca or nmf on the whole image, otherwise it will calculate the denoised signal twice.
kwargs (dict, optional) – Additional keyword arguments.

Returns:

predictions – The matched ZLP predictions at pixel (i, j).

Return type:

numpy.ndarray, shape=(M, N)

get_pixel_signal(i, j, signal_type='EELS', **kwargs)[source]

Retrieves the spectrum at pixel (i, j).

Parameters:

i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
signal_type (str, optional) – The type of signal that is requested, should comply with the defined names. Set to EELS by default.

Returns:

signal – Array with the requested signal from the requested pixel

Return type:

numpy.ndarray, shape=(M,)

get_pixel_signal_subtracted(i, j, signal_type='EELS', **kwargs)[source]

Retrieves the spectrum at pixel (i, j) after subtraction of the ZLP.

Parameters:

i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
signal_type (str, optional) – The type of signal that is requested, should comply with the defined names. Set to EELS by default.

Returns:

signal_subtracted – Array with the requested subtracted signal from the requested pixel

Return type:

numpy.ndarray, shape=(M,)

get_pixel_signal_subtracted_or_not(i, j, signal_type='EELS', zlp=False, **kwargs)[source]

Function to retrieve non-subtracted data or data after subtraction of the ZLP.

Parameters:

i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
signal_type (str, optional) – The type of signal that is requested, should comply with the defined names. Set to EELS by default
zlp (bool) – If True, return the subtracted spectrum. Else, return the non-subtracted spectrum
kwargs (dict, optional) – Additional keyword arguments, which are passed to self.get_pixel_signal

static get_prefix(unit, SIunit=None, numeric=True)[source]

Method to convert units to their associated SI values.

Parameters:

unit (str,) – unit of which the prefix is requested
SIunit (str, optional) – The SI unit of the unit
numeric (bool, optional) – Default is True. If True the prefix is translated to the numeric value (e.g. \(10^3\) for k)

Returns:

prefix – The character of the prefix or the numeric value of the prefix

Return type:

str or int

get_zlp_models(int_i, **kwargs)[source]

Returns the shape-(M, N) array of ZLP model predictions at the integrated intensity int_i. The logarithm of the integrated intensity is taken, as the models are always trained on the log of the signal.

Parameters: int_i (float) – Integrated intensity

kramers_kronig_analysis(signal_ssd, n_zlp=None, iterations=1, n=None, t=None, delta=0.5)[source]

Computes the complex dielectric function from the single scattering distribution (SSD) signal_ssd following the Kramers-Kronig relations. This code is based on Egerton’s MATLAB code [Egerton, 2011].

Parameters:

signal_ssd (numpy.ndarray, shape=(M,)) – SSD of length M, the number of energy-loss bins
n_zlp (float) – Total integrated intensity of the ZLP
iterations (int) – Number of the iterations for the internal loop to remove the surface plasmon contribution. If 1 the surface plasmon contribution is not estimated and subtracted (the default is 1).
n (float) – The medium refractive index. Used for normalization of the SSD to obtain the energy loss function. If given the thickness is estimated and returned. It is only required when t is None.
t (float) – The sample thickness in nm.
delta (float) – Factor added to aid stability for calculating surface losses

Returns:

eps (numpy.ndarray) –

The complex dielectric function,

\[\epsilon = \epsilon_1 + i*\epsilon_2,\]
te (float) – local thickness
srf_int (numpy.ndarray) – Surface losses correction

Notes

Relativistic effects are not considered when correcting surface scattering.
The value of delta depends on the thickness of your sample and is qualitatively determined by how realistic the output is.

classmethod load_compressed_Spectral_image(path_to_compressed_pickle)[source]

Loads spectral image from a compressed pickled file. This will take longer than loading from non-compressed pickle.

Parameters:

path_to_compressed_pickle (str) – path to the compressed pickle image file.

Raises:

ValueError – If path_to_compressed_pickle does not end with the expected extension .pbz2.
FileNotFoundError – If path_to_compressed_pickle does not exist.

Returns:

image – spectral_image.SpectralImage instance loaded from the compressed pickle file.

Return type:

SpectralImage

classmethod load_data(path_to_dmfile, load_survey_data=False)[source]

Load the .dm4 (or .dm3) data and return a spectral_image.SpectralImage instance.

Parameters:

path_to_dmfile (str) – location of .dm4 file
load_survey_data (bool, optional) – If there is HAADF survey data in your file, you can choose to also load that in. Default is False.

Returns:

spectral_image.SpectralImage instance of the dm4 file

Return type:

SpectralImage

classmethod load_spectral_image(path_to_pickle)[source]

Loads spectral_image.SpectralImage instance from a pickled file.

Parameters:

path_to_pickle (str) – path to the pickled image file.

Raises:

ValueError – If path_to_pickle does not end with the expected extension .pkl.
FileNotFoundError – If path_to_pickle does not exist.

Returns:

spectral_image.SpectralImage object (i.e. including all attributes) loaded from pickle file.

Return type:

SpectralImage

load_zlp_models(path_to_models, plot_chi2=False, plot_pred=False, idx=None, **kwargs)[source]

Loads the trained ZLP models and stores them in self.zlp_models. Models that have a \(\chi^2 > \chi^2_{\mathrm{mean}} + 5\sigma\) are discarded, where \(\sigma\) denotes the 68% CI.

Parameters:

path_to_models (str) – Location where the model predictions have been stored after training.
plot_chi2 (bool, optional) – When set to True, plot and save the \(\chi^2\) distribution.
plot_pred (bool, optional) – When set to True, plot and save the ZLP predictions per cluster.
idx (int, optional) – When specified, only the zlp labelled by idx is loaded, instead of all model predictions.
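The selection rule can be sketched as follows. This is an illustration under assumptions: σ is taken here as half the width of the central 68% interval (16th to 84th percentile) of the χ² values, which is one reading of "the 68% CI", and the helper names and χ² values are made up.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def select_models(chi2_values, n_sigma=5.0):
    """Indices of the models with chi2 <= mean + n_sigma * sigma."""
    ordered = sorted(chi2_values)
    mean = sum(ordered) / len(ordered)
    # Assumed definition: sigma is half the width of the central 68% interval.
    sigma = 0.5 * (percentile(ordered, 84) - percentile(ordered, 16))
    threshold = mean + n_sigma * sigma
    return [idx for idx, chi2 in enumerate(chi2_values) if chi2 <= threshold]


# Nine well-behaved models and one clear outlier (index 9).
chi2_values = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 0.98, 1.08, 0.92, 50.0]
kept = select_models(chi2_values)
```

A percentile-based σ is less sensitive to outliers than the standard deviation, although the mean in the threshold is still pulled upwards by them.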

match_zlp_to_signal(signal, zlp, de1, de2, fwhm=None)[source]

Apply the matching to the subtracted spectrum.

Parameters:

signal (numpy.ndarray, shape=(M,)) – Signal to be matched
zlp (numpy.ndarray, shape=(M,)) – ZLP model to be matched, must match length of Signal.
de1 (float) – Value of the hyperparameter \(\Delta E_{I}\)
de2 (float) – Value of the hyperparameter \(\Delta E_{II}\)
fwhm (float) – Value of the hyperparameter \(FWHM\). If none is given, a fwhm is determined from the signal. Default is None

Returns:

output – Matched ZLP model

Return type:

numpy.ndarray, shape=(M,)

property n_clusters: Returns the number of clusters in the spectral_image.SpectralImage object.

property n_spectra: Returns the number of spectra present in spectral_image.SpectralImage object.

nmf_cluster(cluster, n_components=0.9, max_iter=100000)[source]

Use non-negative matrix factorization on a cluster of the spectral image. The signals of the cluster are already in reduced format (pixel location is lost).

Parameters:

cluster (numpy.ndarray, shape=(M,)) – An array with size equal to the number of clusters. Each entry is a 2D array that contains all the spectra within that cluster.
n_components (float) – number of components to calculate. If between 0 and 1, the number of components is chosen, by first running PCA, such that the summed variance of the components stays below the given value. Default is 0.9.
max_iter (int,) – Default is 100000

nmf_image(area_type='segment', n_components=0.9, max_iter=100000, segments_x=4, segments_y=4, **kwargs)[source]

Use non-negative matrix factorization on the spectral image.

Parameters:

area_type (str) –
type of area used for non-negative matrix factorization. Usage types as follows:
- 'segment', the image is segmented and nmf is only done per segmented area.
- 'cluster', the data per cluster is used for nmf within that cluster.
- 'pixel', only the data around a pixel is used for nmf of that pixel.
n_components (float) – number of components to calculate. If between 0 and 1, the number of components is chosen, by first running PCA, such that the summed variance of the components stays below the given value. Default is 0.9.
max_iter (int,) – Default is 100000
segments_x (int) – For 'segment' option, number of segments the x-axis is divided upon. Default is 4.
segments_y (int) – For 'segment' option, number of segments the y-axis is divided upon. Default is 4.
kwargs (dict, optional) – Additional keyword arguments.

nmf_pixel(i, j, area=9, n_components=0.9, max_iter=100000)[source]

Use non-negative matrix factorization on the spectral image, using the data of a square window of size area around pixel (i, j).

Parameters:

i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
area (int) – NMF area parameter. Area around the pixel used for non-negative matrix factorization, must be an odd number
n_components (float) – number of components to calculate. If between 0 and 1, the number of components is chosen, by first running PCA, such that the summed variance of the components stays below the given value. Default is 0.9.
max_iter (int,) – Default is 100000

Returns:

output – NMF spectrum of the pixel

Return type:

numpy.ndarray, shape=(M,)

pca_cluster(cluster, n_components=0.9)[source]

Use principal component analysis on a cluster of the spectral image. The signals of the cluster are already in reduced format (pixel location is lost).

Parameters:

cluster (numpy.ndarray, shape=(M,)) – An array with size equal to the number of clusters. Each entry is a 2D array that contains all the spectra within that cluster.
n_components (float) – number of components to calculate. If between 0 and 1, the number of components is chosen such that the summed variance of the components stays below the given value. Default is 0.9.

pca_image(area_type='segment', n_components=0.9, segments_x=4, segments_y=4, **kwargs)[source]

Use principal component analysis on the spectral image.

Parameters:

area_type (str) –
type of area used for principal component analysis. Usage types as follows:
- 'segment', the image is segmented and pca is only done per segmented area.
- 'cluster', the data per cluster is used for pca within that cluster.
- 'pixel', only the data around a pixel is used for pca of that pixel.
n_components (float) – number of components to calculate. If between 0 and 1, the number of components is chosen such that the summed variance of the components stays below the given value. Default is 0.9.
segments_x (int) – For 'segment' option, number of segments the x-axis is divided upon. Default is 4.
segments_y (int) – For 'segment' option, number of segments the y-axis is divided upon. Default is 4.
kwargs (dict, optional) – Additional keyword arguments.

pca_pixel(i, j, area=9, n_components=0.9)[source]

Use principal component analysis on the spectral image, using the data of a square window of size area around pixel (i, j).

Parameters:

i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
area (int) – PCA area parameter. Area around the pixel used for principal component analysis, must be an odd number
n_components (float) – number of components to calculate. If between 0 and 1, the number of components is chosen such that the summed variance of the components stays below the given value. Default is 0.9.
deconv –
zlp_num –

Returns:

output – PCA spectrum of the pixel

Return type:

numpy.ndarray, shape=(M,)

pool_image(area=9, **kwargs)[source]

Pools the spectral image using a square window of size area around each pixel.

Parameters:

area (int) – Pooling parameter: area around the pixel, must be an odd number
kwargs (dict, optional) – Additional keyword arguments.

pool_pixel(i, j, area=9, gaussian=True, **kwargs)[source]

Pools the data of a square window of size area around pixel (i, j).

Parameters:

i (int) – y-coordinate of the pixel
j (int) – x-coordinate of the pixel
area (int) – Pooling parameter: area around the pixel used for pooling, must be an odd number
gaussian (boolean) – If true the pooling weights will use a gaussian distribution
kwargs (dict, optional) – Additional keyword arguments.

Returns:

output – Pooled spectrum of the pixel

Return type:

numpy.ndarray, shape=(M,)

save_compressed_image(filename)[source]

Function to save image, including all attributes, in compressed pickle (.pbz2) format. The image will be saved at location filename. The advantage over save_image() is a reduced file size; the disadvantage is that saving and reloading the image takes significantly longer.

Parameters: filename (str) – path to save location plus filename. If it does not end with “.pbz2”, “.pbz2” will be added.

save_image(filename)[source]

Function to save image, including all attributes, in pickle (.pkl) format. Image will be saved at indicated location and name in filename input.

Parameters: filename (str) – path to save location plus filename. If it does not end with “.pkl”, “.pkl” will be added.

set_eaxis()[source]

Determines the energy losses of the spectral image, based on the bin width of the energy loss. It shifts the self.eaxis attribute such that the zero point corresponds with the point of the highest intensity.

It also sets the extrapolated eaxis for calculations that require extrapolation.

Returns: eaxis – Array of \(\Delta E\) values
Return type: numpy.ndarray, shape=(M,)
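The shift of the energy axis can be pictured on a single spectrum. The sketch below is a simplified stand-in for set_eaxis(): it places the zero of the axis at the bin with the highest intensity and spaces the bins by the bin width. The function name and the toy spectrum are assumptions, and the extrapolated axis is not covered.

```python
def make_eaxis(spectrum, delta_e):
    """Energy-loss axis with zero at the bin of highest intensity."""
    peak_index = max(range(len(spectrum)), key=lambda k: spectrum[k])
    return [(k - peak_index) * delta_e for k in range(len(spectrum))]


# Toy spectrum whose maximum (the ZLP) sits in the third bin.
spectrum = [2.0, 10.0, 250.0, 40.0, 12.0, 5.0]
eaxis = make_eaxis(spectrum, delta_e=0.5)
```

Bins to the left of the peak receive negative energy losses, corresponding to the gain region.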

set_mass_density(rho=None, rho_background=None)[source]

Sets value of mass density for the image as attribute self.rho. If not clustered, rho will be an array of length one, otherwise it is an array of length n_clusters. If rho_background is defined, the cluster with the lowest thickness (cluster 0) will be assumed to be the vacuum/background, and gets the value of the background mass density.

If more than one specimen is present in the image, it is wise to check by hand which cluster belongs to which specimen, and set the values by running:

image.rho[cluster_i] = rho_i

Parameters:

rho –
rho_background –

set_refractive_index(n=None, n_background=None)[source]

Sets value of refractive index for the image as attribute self.n. If not clustered, n will be an array of length one, otherwise it is an array of length n_clusters. If n_background is defined, the cluster with the lowest thickness (cluster 0) will be assumed to be the vacuum/background, and gets the value of the background refractive index.

If more than one specimen is present in the image, it is wise to check by hand which cluster belongs to which specimen, and set the values by running:

image.n[cluster_i] = n_i

Parameters:

n (float) – refractive index of sample.
n_background (float, optional) – if defined: the refractive index of the background/vacuum. This value will automatically be assigned to pixels belonging to the thinnest cluster.

property shape: Returns 3D-shape of spectral_image.SpectralImage object

static smooth_signal(signal, window_length=51, window_type='hanning')[source]

Smooth a signal using a window length window_length and a window type window_type.

This method is based on the convolution of a scaled window with the signal. The signal is prepared by introducing reflected copies of the signal (with the window size) in both ends so that transient parts are minimized in the beginning and end part of the output signal.

Parameters:

signal (numpy.ndarray, shape=(M,)) – Signal of length M
window_length (int, optional) – The dimension of the smoothing window; should be an odd integer. Default is 51.
window_type (str, optional) – the type of window from 'flat', 'hanning', 'hamming', 'bartlett', 'blackman' and 'kaiser'. A 'flat' window will produce a moving average smoothing. Default is 'hanning'.

Returns:

signal_smooth – The smoothed signal

Return type:

numpy.ndarray, shape=(M,)
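The reflect-and-convolve procedure described above can be sketched in pure Python. This is a simplified illustration, not the smooth_signal() implementation: it supports only a Hann window, assumes an odd window_length shorter than the signal, and the exact padding used by the library may differ.

```python
import math


def hanning(length):
    """Hann window of the given length (end points are zero)."""
    return [
        0.5 - 0.5 * math.cos(2.0 * math.pi * k / (length - 1))
        for k in range(length)
    ]


def smooth(signal, window_length):
    """Convolve a reflect-padded signal with a normalised Hann window."""
    half = window_length // 2
    window = hanning(window_length)
    norm = sum(window)
    weights = [w / norm for w in window]
    # Reflected copies at both ends reduce transients at the boundaries.
    padded = signal[half:0:-1] + signal + signal[-2:-half - 2:-1]
    return [
        sum(w * padded[start + k] for k, w in enumerate(weights))
        for start in range(len(signal))
    ]


# An alternating toy signal is flattened to its mean by a 5-point window.
noisy = [1.0, 3.0] * 5
smoothed = smooth(noisy, window_length=5)
constant = smooth([3.0] * 10, window_length=5)
```

Because the weights sum to one, a constant signal passes through unchanged and the output has the same length as the input.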

train_zlp_models(conf_interval=1, lr=0.001, signal_type='EELS', **kwargs)[source]

Train the ZLP on the spectral image.

The spectral image is clustered in n_clusters clusters, according to e.g. the integrated intensity or thickness. A random spectrum is then taken from each cluster, which together defines one replica. The training is initiated by calling train_zlp_models_scaled().

Parameters:

conf_interval (int, optional) – Default is 1
lr (float, optional) – Learning rate. Default is 0.001
signal_type (str, optional) – Type of spectrum. Set to EELS by default.
**kwargs – Additional keyword arguments that are passed to the method train_zlp_models_scaled() in the training module.

EELSFitter.core.training module

class EELSFitter.core.training.MultilayerPerceptron(num_inputs, num_outputs)[source]

Bases: Module

Multilayer Perceptron (MLP) class. It uses the following architecture

\[[n_i, 10, 15, 5, n_f],\]

where \(n_i\) and \(n_f\) denote the number of input features and output target values respectively.

Parameters:

num_inputs (int) – number of input features
num_outputs (int) – dimension of the target output.
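To get a feeling for the size of this architecture, the number of trainable parameters can be counted by hand. The sketch assumes fully connected layers with one bias per neuron, which the reference does not state explicitly, and picks two inputs and one output as an example.

```python
def layer_sizes(num_inputs, num_outputs):
    """Layer widths of the documented architecture [n_i, 10, 15, 5, n_f]."""
    return [num_inputs, 10, 15, 5, num_outputs]


def count_parameters(sizes):
    """Weights plus biases of a fully connected network (assumed layout)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Example: two input features and a single output value.
sizes = layer_sizes(num_inputs=2, num_outputs=1)
n_params = count_parameters(sizes)
```

For this example the count is (2*10 + 10) + (10*15 + 15) + (15*5 + 5) + (5*1 + 1) = 281.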

forward(x)[source]

Propagates the input features x through the MLP.

Parameters: x (torch.tensor) – input features
Returns: x – MLP outcome
Return type: torch.tensor

class EELSFitter.core.training.TrainZeroLossPeak(spectra, eaxis, cluster_centroids=None, display_step=1000, training_report_step=1, n_batch_of_replica=1, n_batches=1, n_replica=100, n_epochs=1000, shift_de1=1.0, shift_de2=1.0, regularisation_constant=10.0, path_to_models='./models/', remove_temp_files=True, **kwargs)[source]

Bases: object

calc_scale_var_log_int_i(based_on='log_zlp')[source]: Calculate the scale variables of the log of the integrated intensity of the spectra for the three highest bins of the Zero Loss Peak.

calculate_hyperparameters()[source]

Calculate the values of the hyperparameters in the gain and loss region. dE1 and mdE1 are calculated by taking the location of the kneedle points at each side of the ZLP and shifting them by the given shift value.

dE2 and mdE2 are calculated by taking the value of the eaxis where a fit of the log10 function intersects a single count. If this value is not found, the end point of the signal is taken as the location of dE2.

cleanup_files()[source]: Cleans up the files generated by train_zlp_models_scaled. costs_train_*, costs_test_*, and nn_rep_* files are merged into single files costs_train, costs_test, and nn_parameters respectively.

find_fwhm_idx()[source]

Determine the FWHM indices per cluster (Full Width at Half Maximum):

indices of the left and right side of the ZLP
indices of the left and right side of the log of the ZLP

These are all determined by taking the local minimum and maximum of dy/dx.

find_kneedle_idx()[source]: Find the kneedle index per cluster. The kneedle algorithm is used to find the point of highest curvature in your concave or convex data set.

find_local_min_idx()[source]: Determine the first local minimum index of the signals per cluster by setting it to the point where the derivative crosses zero.

initialize_x_y_sigma_input(cluster_label)[source]

Initialize the x, y and sigma input for the neural network. The spectrum is split into the three regions given by the toy model. For the y data, region I (up to dE1) is set to the log intensity and region III is set to zero. The x data has two input features: the first is the energy-axis values in regions I and III, and the second is the rescaled log of the total integrated intensity. This factor is to ensure symmetry is retained between input and output values. For the sigma data

Parameters: cluster_label (int) – Label of the cluster

loss_function(output, output_for_derivative, target, error)[source]

The loss function to train the ZLP takes the model output, the raw spectrum target and the associated error. The latter corresponds to the one sigma spread within a given cluster at fixed \(\Delta E\). It returns the cost function \(C_{\mathrm{ZLP}}^{(m)}\) associated with the replica \(m\) as

\[C_{\mathrm{ZLP}}^{(m)} = \frac{1}{n_{E}} \sum_{k=1}^K \sum_{\ell_k=1}^{n_E^{(k)}} \frac{\left[I^{(i_{m,k}, j_{m,k})}(E_{\ell_k}) - I_{\rm ZLP}^{({\mathrm{NN}})(m)} \left(E_{\ell_k},\ln \left( N_{\mathrm{ tot}}^{(i_{m,k},j_{m,k})} \right) \right) \right]^2}{\sigma^2_k \left(E_{\ell_k} \right)}. \tag{38}\]

Parameters:

eaxis (np.ndarray) – Energy-loss axis
output (torch.tensor) – Neural Network output
output_for_derivative (list of torch.tensor) – Each entry in the list should correspond to the neural network output between de1 and de2 of a single spectrum in the replica
target (torch.tensor) – Raw EELS spectrum
error (torch.tensor) – Uncertainty on \(\log I_{\mathrm{EELS}}(\Delta E)\).

Returns:

loss – Loss associated with the model output.

Return type:

torch.tensor
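The core of this cost function is a χ²-like sum of squared residuals weighted by the per-bin uncertainty. The sketch below evaluates that term on plain Python lists rather than torch tensors and leaves out whatever loss_function() does with output_for_derivative; the function name and the numbers are illustrative assumptions.

```python
def chi2_loss(output, target, error):
    """Mean of squared residuals divided by the squared uncertainty."""
    n_points = len(target)
    return sum(
        ((t - o) / e) ** 2 for o, t, e in zip(output, target, error)
    ) / n_points


# Toy values: model output, target spectrum and one-sigma errors per bin.
output = [4.0, 3.0, 2.0]
target = [4.2, 2.9, 2.0]
error = [0.1, 0.1, 0.2]

loss = chi2_loss(output, target, error)
```

Here the residuals are two, one and zero standard deviations, so the loss is (4 + 1 + 0) / 3.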

plot_hp_cluster(**kwargs)[source]

Create a plot of the hyperparameters plotted on top of the spectra per cluster.

Parameters: kwargs (dict, optional) – Additional keyword arguments.

plot_hp_cluster_slope(**kwargs)[source]

Create a plot of the hyperparameters plotted on top of the slopes of the spectra per cluster.

Parameters: kwargs (dict, optional) – Additional keyword arguments.

plot_training_report()[source]: Create the training report plot: evolution of the training and validation loss per epoch.

save_figplot(fig, title='no_title.pdf')[source]

Display the computed values of dE1 (both methods) together with the raw EELS spectrum.

Parameters:

fig (matplotlib.Figure) – Figure to be saved.
title (str) – Filename to store the plot in.

save_hyperparameters()[source]

Save the hyperparameters in hyperparameters.txt. These are:

cluster centroids, keep note if they were determined from the raw data, or if the log had been taken.
dE1 for all clusters
dE2 for all clusters
FWHM for all clusters

save_scale_var_log_int_i()[source]: Save the scale variables of the log of the total integrated intensity of the spectra, denoted I.

scale_eaxis()[source]: Scales the features of the energy axis between [0.1, 0.9]. This is to optimize the speed of the neural network.

set_dydx_data()[source]: Determines the slope of all spectra per cluster, smooths the slope and takes the median per cluster.

set_minimum_intensities()[source]: Set all features smaller than 1 to 1.

set_path_for_training_report(j)[source]

Set the save directory for the training report of the replica being trained on.

Parameters: j (int) – Index of the replica being trained on.

set_sigma()[source]: Determine the sigma (spread of spectra per cluster) per cluster.

set_test_x_y_sigma()[source]: Take the x, y and sigma data for the test set and reshape them for neural network input

set_train_x_y_sigma()[source]: Take the x, y and sigma data for the train set and reshape them for neural network input

set_y_data()[source]: Smooths all the spectra per cluster and takes the median per cluster.

train_and_evaluate_model(i, j, lr)[source]

Train and evaluate the model. Also saves the values of cost_train and cost_test per epoch.

Parameters:

i (int) –
j (int) –
lr (float) – learning rate

train_zlp_models_scaled(lr=0.001)[source]

Train the ZLP models. This function calls the other functions step by step to complete the whole process. Refer to each function for details on what it does.

Parameters: lr (float) – Learning rate of the neural network

write_txt_of_loss(base_path, filename, loss)[source]

Write train/test loss to a .txt file.

Parameters:

base_path (str) – Directory to store the report in.
filename (str) – Filename of .txt file to store report in.
loss (list) – List containing loss value per epoch.

EELSFitter.core.training.find_scale_var(inp, min_out=0.1, max_out=0.9)[source]

Computes the scaling parameters needed to rescale the training data to lie between min_out and max_out. For our neural network the value range [0.1, 0.9] ensures the neuron activation states will typically lie close to the linear region of the sigmoid activation function.

Parameters:

inp (numpy.ndarray, shape=(M,)) – training data to be rescaled
min_out (float) – lower limit. Set to 0.1 by default.
max_out (float) – upper limit. Set to 0.9 by default

Returns:

a, b – list of rescaling parameters

Return type:

list

EELSFitter.core.training.log10_fit(x, a, b, order=1)[source]

EELSFitter.core.training.power_fit(x, a, r)[source]

EELSFitter.core.training.scale(inp, ab)[source]

Rescale the training data to lie between 0.1 and 0.9. Rescaling features is to help speed up the neural network training process. The value range [0.1, 0.9] ensures the neuron activation states will typically lie close to the linear region of the sigmoid activation function.

Parameters:

inp (numpy.ndarray, shape=(M,)) – training data to be rescaled, e.g. \(\Delta E\)
ab (numpy.ndarray, shape=(M,)) – scaling parameters, which can be found with find_scale_var().

Return type:

Rescaled training data
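
The rescaling performed by find_scale_var() and scale() can be illustrated with a minimal sketch. It assumes a linear map a * inp + b; the helper names find_scale_var_sketch and scale_sketch are hypothetical, and the actual EELSFitter implementation may compute the parameters differently.

```python
import numpy as np

def find_scale_var_sketch(inp, min_out=0.1, max_out=0.9):
    """Return [a, b] so that a * inp + b spans [min_out, max_out] (assumed linear map)."""
    inp = np.asarray(inp, dtype=float)
    a = (max_out - min_out) / (inp.max() - inp.min())
    b = min_out - a * inp.min()
    return [a, b]

def scale_sketch(inp, ab):
    """Apply the linear rescaling with parameters ab = [a, b]."""
    return np.asarray(inp, dtype=float) * ab[0] + ab[1]

energy_loss = np.linspace(-2.0, 10.0, 7)   # illustrative energy axis in eV
ab = find_scale_var_sketch(energy_loss)
scaled = scale_sketch(energy_loss, ab)     # now lies between 0.1 and 0.9
```

Keeping the parameters separate from the rescaling step lets the same [a, b] be reused on data that was not part of the training set.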

EELSFitter.core.training.smooth_signals_per_cluster(signals, window_length=51, window_type='hanning')[source]

Smooth all signals in a cluster using a window of length window_length and type window_type.

This method is based on the convolution of a scaled window with the signal. The signal is prepared by introducing reflected copies of the signal (with the window size) at both ends so that transient parts are minimized at the beginning and end of the output signal.

Parameters:

signals (numpy.ndarray, shape=(M,)) – The input data
window_length (int, optional) – The dimension of the smoothing window; should be an odd integer. Default is 51
window_type (str, optional) – The type of window, one of "flat", "hanning", "hamming", "bartlett", "blackman" and "kaiser". A "flat" window produces a moving-average smoothing. Default is "hanning".

Returns:

signal_smooth – The smoothed signal

Return type:

numpy.ndarray, shape=(M,)
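
The window-convolution smoothing described above might look roughly like the following for a single 1D signal. This is a sketch under stated assumptions: smooth_sketch is a hypothetical name, the padding uses np.pad with mode="reflect", and the Kaiser beta value is an arbitrary choice; the library's own padding and window handling may differ.

```python
import numpy as np

def smooth_sketch(signal, window_length=51, window_type="hanning"):
    """Smooth a 1D signal by convolving it with a normalised window."""
    signal = np.asarray(signal, dtype=float)
    half = window_length // 2
    # Reflect the signal at both ends to reduce edge transients.
    padded = np.pad(signal, half, mode="reflect")
    if window_type == "flat":
        window = np.ones(window_length)            # moving average
    elif window_type == "kaiser":
        window = np.kaiser(window_length, 14.0)    # beta = 14 is an arbitrary choice
    else:
        # np.hanning, np.hamming, np.bartlett or np.blackman
        window = getattr(np, window_type)(window_length)
    window = window / window.sum()
    # For an odd window_length, 'valid' mode returns exactly len(signal) points.
    return np.convolve(padded, window, mode="valid")

rng = np.random.default_rng(0)
noisy = np.sin(np.linspace(0.0, 2.0 * np.pi, 200)) + 0.1 * rng.normal(size=200)
smoothed = smooth_sketch(noisy)
```

Normalising the window by its sum keeps the overall signal level unchanged, so a constant input stays constant after smoothing.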

EELSFitter.core.training.weight_reset(m)[source]

Reset the weights and biases associated with the model m.

Parameters:: m (MLP) – Model of type MLP.

EELSFitter.core.gainpeakfitter module

class EELSFitter.core.gainpeakfitter.GainPeakFitter(x_signal, y_signal, image_shape, image=None)[source]

This class extracts gain peaks from an EEL spectrum. The GainPeakFitter fits the ZLP to a Gaussian based on the ZLP’s FWHM. The ZLP is then subtracted from the EEL spectrum (also referred to as ‘the signal’). The subtracted signal contains a peak, which is then fitted to a Lorentzian. From the Lorentzian, the energy of the gain peak is determined.

INPUT

image – spectral image: 4D data from a .dm4 file

Example

To fit the gain/loss peaks:

lrtz = GainPeakFitter(x_signal, y_signal, image_shape)

lrtz.generate_best_fits()

lrtz.fit_gain_peak()

lrtz.fit_loss_peak()

To plot the gain/loss peaks:

lrtz.create_new_plot()

lrtz.plot_all_results()

lrtz.ax.set_ylim(0, 1e4)

lrtz.fig

lrtz.plot_signal()

lrtz.plot_model()

lrtz.plot_subtracted()

lrtz.plot_gain_peak()

lrtz.plot_loss_peak()

lrtz.print_results()

array2D_of_intensity_at_energy_loss(eloss=-1.1)[source]: Generate 2D array of intensities for each SI-pixel at a chosen energy loss.
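
As a hedged illustration of this operation, the sketch below picks the energy channel closest to the requested energy loss and returns that slice for every pixel. It assumes the spectral image is stored as an array of shape (rows, columns, energy channels); intensity_map_sketch and its arguments are hypothetical names, not the class's actual attributes.

```python
import numpy as np

def intensity_map_sketch(image, eaxis, eloss=-1.1):
    """Return the 2D intensity map at the energy channel nearest to eloss."""
    idx = int(np.argmin(np.abs(np.asarray(eaxis) - eloss)))
    return image[:, :, idx]

eaxis = np.linspace(-2.0, 2.0, 41)                  # illustrative energy axis in eV
image = np.arange(2 * 3 * 41).reshape(2, 3, 41)     # toy 2 x 3 pixel spectral image
intensity_map = intensity_map_sketch(image, eaxis, eloss=-1.1)
```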

create_new_plot()[source]: Create pyplot.subplots figure with axes.

curve_fit_between(x_left=-1.6, x_right=-0.6)[source]

Curve fit within the chosen energy-loss (eloss) fitting range [x_left, x_right].

Parameters:

x (np.array) – eloss [in eV]
y (np.array) – signal, intensity, electron count [in a.u.]
x_left (float) – left margin of eloss fitting range
x_right (float) – right margin of eloss fitting range

curve_fit_between_background(x_left=-1.6, x_right=-0.6)[source]

Curve fit within the chosen energy-loss (eloss) fitting range [x_left, x_right].

Parameters:

x (np.array) – eloss [in eV]
y (np.array) – signal, intensity, electron count [in a.u.]
x_left (float) – left margin of eloss fitting range
x_right (float) – right margin of eloss fitting range

determine_parameters(function)[source]

Determines the parameters specifying a Gaussian or Lorentzian function using the signal’s peak height and the FWHM of the peak.

Parameters:: function (str, {'Gaussian', 'Lorentzian'}) – Model choice for the ZLP.
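
The relation between the FWHM and the width parameters of the two profiles can be sketched as follows. The height-normalised forms of the Gaussian and Lorentzian used here are assumptions based on the parameter descriptions (a is the peak height, 2 * gam is the FWHM); all function names are hypothetical.

```python
import numpy as np

def gaussian_sigma_from_fwhm(fwhm):
    """Standard deviation of a Gaussian with the given FWHM."""
    return fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def lorentzian_gam_from_fwhm(fwhm):
    """Half width at half maximum of a Lorentzian with the given FWHM."""
    return fwhm / 2.0

def gaussian_sketch(x, a, sigma, x0=0.0):
    return a * np.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))

def lorentzian_sketch(x, x0, a, gam):
    return a * gam ** 2 / ((x - x0) ** 2 + gam ** 2)

fwhm, height = 0.3, 1.0e4            # illustrative values (eV, counts)
sigma = gaussian_sigma_from_fwhm(fwhm)
gam = lorentzian_gam_from_fwhm(fwhm)
# Both profiles drop to half their height at x0 +/- fwhm / 2.
half_g = gaussian_sketch(np.array([fwhm / 2.0]), height, sigma)[0]
half_l = lorentzian_sketch(np.array([fwhm / 2.0]), 0.0, height, gam)[0]
```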

fit_gain_peak_mc(i, j, L_bound=-3.5, R_bound=-0.5, return_all=False, return_conf_interval=False, **kwargs)[source]

Use the Monte Carlo replica method to fit a Lorentzian model to the subtracted spectrum in a specified energy interval [L_bound, R_bound].

Parameters:

i (int) – y-coordinate of the pixel.
j (int) – x-coordinate of the pixel.
L_bound (float, optional) – Left bound of the interval in which to fit to the subtracted spectrum.
R_bound (float, optional) – Right bound of the interval in which to fit to the subtracted spectrum.
return_all (bool, optional) – Option to return the subtracted spectra for all replicas corresponding to this pixel.
return_conf_interval (bool, optional) – Option to specify if the upper and lower bounds of the confidence interval must be returned.
kwargs (dict, optional) – Additional keyword arguments.

Returns:

gain (numpy.ndarray) – Array with the median Lorentzian fit to the subtracted spectrum.
gain_low (numpy.ndarray, optional) – Lower bound of the Lorentzian fit.
gain_high (numpy.ndarray, optional) – Upper bound of the Lorentzian fit.

fit_models(n_rep=500, n_clusters=5, function='Gaussian', conf_interval=1, signal_type='EELS', **kwargs)[source]

Use the Monte Carlo replica method to fit the chosen function model to the ZLP. This method assumes that, within each cluster, the ZLP is sampled from the same underlying distribution. The method samples that distribution in order to obtain the median, low, and high predictions for the ZLP at each loss value.

The model predictions are stored in self.zlp_models_all, where the median, low, and high values are the first, second, and third element respectively.

Parameters:

n_rep (int, default=500) – Number of Monte Carlo replicas to use.
n_clusters (int, default=5) – Number of clusters to use.
function (str, {'Gaussian', 'Split Gausian', 'Lorentzian', 'Pearson VII', 'Split Pearson VII', 'Pseudo-Voigt', 'Split Pseudo-Voigt', 'Generalised Peak', 'Kaiser window'}) – Model choice for the ZLP.
conf_interval (float, optional) – The ratio of spectra returned. The spectra are selected based on the based_on value. The default is 1.
signal_type (str, optional) – Description of signal, 'EELS' by default.
kwargs (dict, optional) – Additional keyword arguments.
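
The idea behind the replica method can be sketched on synthetic data. This is not the library's implementation: it assumes each replica is one randomly drawn spectrum from the cluster, fits a Gaussian with scipy.optimize.curve_fit, and uses a 16th to 84th percentile band for the low and high predictions. All names are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian_sketch(x, a, sigma, x0=0.0):
    return a * np.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 201)                          # energy loss (eV)
# Synthetic "cluster": 40 noisy ZLPs drawn from the same underlying Gaussian.
cluster = gaussian_sketch(x, 1000.0, 0.1) + rng.normal(0.0, 10.0, size=(40, x.size))

n_rep = 50
predictions = np.empty((n_rep, x.size))
for k in range(n_rep):
    # Assumed sampling scheme: each replica is one random spectrum of the cluster.
    replica = cluster[rng.integers(cluster.shape[0])]
    popt, _ = curve_fit(gaussian_sketch, x, replica, p0=[replica.max(), 0.2, 0.0])
    predictions[k] = gaussian_sketch(x, *popt)

# Median, low and high predictions at each energy loss.
zlp_median = np.median(predictions, axis=0)
zlp_low, zlp_high = np.percentile(predictions, [16, 84], axis=0)
```

The spread between zlp_low and zlp_high reflects how much the fitted model varies from replica to replica, which is what gives the method its uncertainty estimate.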

fwhm(row=0, col=0, do_fit=True, **kwargs)[source]

Calculates the FWHM of the ZLP in a particular pixel (row, col). Optionally, the FWHM is also obtained by fitting a Gaussian to the ZLP.

Parameters:

do_fit (bool, default=True) – Option to obtain FWHM through fitting a Gaussian.

Returns:

fwhm (float) – FWHM of the ZLP in chosen pixel.
fit (float) – FWHM obtained through fitting a Gaussian to the ZLP.
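
One way to measure the FWHM directly from a sampled peak, without fitting, is to interpolate the two half-maximum crossings. The sketch below assumes a single-peaked spectrum that falls below half maximum on both sides; fwhm_sketch is a hypothetical helper and not necessarily how the method computes its value.

```python
import numpy as np

def fwhm_sketch(x, y):
    """FWHM of a single-peaked curve via linear interpolation at half maximum."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i_max = int(np.argmax(y))
    half = y[i_max] / 2.0
    # Last index left of the peak, and first index right of it, below half maximum.
    left = np.where(y[:i_max] < half)[0][-1]
    right = i_max + np.where(y[i_max:] < half)[0][0]
    # Interpolate the crossing position on each flank (np.interp needs increasing xp).
    x_left = np.interp(half, [y[left], y[left + 1]], [x[left], x[left + 1]])
    x_right = np.interp(half, [y[right], y[right - 1]], [x[right], x[right - 1]])
    return x_right - x_left

x = np.linspace(-1.0, 1.0, 2001)
zlp = 500.0 * np.exp(-x ** 2 / (2.0 * 0.1 ** 2))   # Gaussian ZLP with sigma = 0.1 eV
width = fwhm_sketch(x, zlp)                        # close to 2.3548 * sigma
```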

gaussian(x, a, sigma, x0=0)[source]

Gaussian centered around x = x0.

Parameters:

x (numpy.ndarray) – 1D array of the energy loss.
x0 (float) – Energy loss at the center of the Gaussian.
a (float) – Height of the Gaussian peak.
sigma (float) – Standard deviation of the Gaussian.

Returns:

Gaussian.

Return type:

numpy.ndarray

generalised_peak(x, x_0, delta, nu)[source]

Generalised Peak function calculated as

\[\frac{2}{\pi \delta} \Bigg|\frac{\Gamma\left[\frac{\nu}{2} + i \gamma_\nu \left(\frac{4 x_s^2}{\pi^2 \delta^2} \right)^2 \right]}{\Gamma\left[\frac{\nu}{2} \right]}\Bigg|^2,\]

where \(\gamma_\nu = \sqrt{\pi} \frac{\Gamma\left[\frac{\nu + 1}{2} \right]}{\Gamma\left[\nu + \frac{1}{2} \right]}\).

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
x_0 (float) – Energy offset.
delta (float) – Parameter describing the peak width.
nu (float) – Parameter describing the peak shape.

Returns:

Generalised Peak function.

Return type:

numpy.ndarray

get_model(i, j, signal_type='EELS', return_all=False, return_conf_interval=False, **kwargs)[source]

Retrieves the model fit to the ZLP at pixel (i, j).

Parameters:

i (int) – y-coordinate of the pixel.
j (int) – x-coordinate of the pixel.
signal_type (str, optional) – The type of signal that is requested, should comply with the defined names. Set to 'EELS' by default.
return_all (bool, optional) – Option to return all models.
return_conf_interval (bool, optional) – Option to specify if the upper and lower bounds of the confidence interval must be returned.
kwargs (dict, optional) – Additional keyword arguments.

Returns:

model (numpy.ndarray) – Array with the median model fit to the ZLP from the requested pixel.
model_low (numpy.ndarray, optional) – Lower bound of the confidence interval of the model fit to the ZLP.
model_high (numpy.ndarray, optional) – Upper bound of the confidence interval of the model fit to the ZLP.

get_subtracted_spectrum(i, j, signal_type='EELS', return_all=False, return_conf_interval=False, **kwargs)[source]

Retrieves the subtracted spectrum at pixel (i, j).

Parameters:

i (int) – y-coordinate of the pixel.
j (int) – x-coordinate of the pixel.
signal_type (str, optional) – The type of signal that is requested, should comply with the defined names. Set to 'EELS' by default.
return_all (bool, optional) – Option to return the subtracted spectra for all replicas corresponding to this pixel.
return_conf_interval (bool, optional) – Option to specify if the upper and lower bounds of the confidence interval must be returned.
kwargs (dict, optional) – Additional keyword arguments.

Returns:

signal (numpy.ndarray) – Array with the median subtracted spectrum from the requested pixel.
signal_low (numpy.ndarray, optional) – Lower bound of the confidence interval of the subtracted spectrum.
signal_high (numpy.ndarray, optional) – Upper bound of the confidence interval of the subtracted spectrum.

inspect_spectrum(row=0, col=0, function='Gaussian', method='fit', **kwargs)[source]

Fit chosen function model to the ZLP and obtain subtracted spectrum.

Parameters:

function (str, {'Gaussian', 'Split Gausian', 'Lorentzian', 'Pearson VII', 'Split Pearson VII', 'Pseudo-Voigt', 'Split Pseudo-Voigt', 'Generalised Peak', 'Kaiser window'}) – Model choice for the ZLP.
method (str, {'fit', 'FWHM'}) – Method to use to extract model fit parameters. 'FWHM' is only supported for Gaussian and Lorentzian ZLP models.
kwargs (dict, optional) – Additional keyword arguments.

kaiser(x, L, m)[source]

Kaiser window function, calculated as

\[\begin{split}w_0(x) \triangleq \left\{ \begin{array}{cl} \frac{1}{L} \frac{I_0\left[m \sqrt{1-(2 x / L)^2}\right]}{I_0[m]}, & |x| \leq L / 2 \\ 0, & |x|>L / 2 \end{array} \right. ,\end{split}\]

where \(I_0\) is the zeroth-order modified Bessel function of the first kind.

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
L (float) – Window duration.
m (float) – Parameter determining the window shape.

Returns:

Kaiser window function.

Return type:

numpy.ndarray
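
The piecewise definition above translates fairly directly into NumPy, using np.i0 for the zeroth-order modified Bessel function of the first kind. This is a minimal sketch with a hypothetical function name, not the method's actual code.

```python
import numpy as np

def kaiser_sketch(x, L, m):
    """Kaiser window w_0(x): nonzero only for |x| <= L / 2."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    inside = np.abs(x) <= L / 2.0
    # Guard against tiny negative values from rounding at the window edge.
    arg = np.sqrt(np.maximum(0.0, 1.0 - (2.0 * x[inside] / L) ** 2))
    out[inside] = np.i0(m * arg) / (L * np.i0(m))
    return out

L, m = 1.0, 3.0
w = kaiser_sketch(np.array([-1.0, 0.0, 0.5, 1.0]), L, m)
# w[1] is the maximum 1 / L; w[2] sits on the window edge; w[0] and w[3] are zero.
```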

lorentzian(x, x0, a, gam)[source]

Lorentzian centered around x = x0.

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
x0 (float) – Energy loss at the center of the Lorentzian.
a (float) – Height of the Lorentzian peak.
gam (float) – 2 * gam is the full width at half maximum (FWHM).

Returns:

Lorentzian.

Return type:

numpy.ndarray

lorentzian_background(x, x0, a, gam, E_0, b)[source]

Lorentzian centered around x = x0.

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
x0 (float) – Energy loss at the center of the Lorentzian.
a (float) – Height of the Lorentzian peak.
gam (float) – 2 * gam is the full width at half maximum (FWHM).

Returns:

Lorentzian.

Return type:

numpy.ndarray

model(x, function, **kwargs)[source]

Calculates the ZLP model using the optimal fit parameters.

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
function (str, {'Gaussian', 'Split Gausian', 'Lorentzian', 'Pearson VII', 'Split Pearson VII', 'Pseudo-Voigt', 'Split Pseudo-Voigt', 'Generalised Peak', 'Kaiser window'}) – Model choice for the ZLP.
kwargs (dict, optional) – Additional keyword arguments.

Returns:

ZLP model fit.

Return type:

numpy.ndarray

model_fit_between(function='Gaussian', **kwargs)[source]

Fits the chosen model to the ZLP. The x coordinates that contain information about gain/loss peaks are excluded from the fit.

Parameters:

function (str, {'Gaussian', 'Split Gausian', 'Lorentzian', 'Pearson VII', 'Split Pearson VII', 'Pseudo-Voigt', 'Split Pseudo-Voigt', 'Generalised Peak', 'Kaiser window'}) – Model choice for the ZLP.
kwargs (dict, optional) – Additional keyword arguments.

Returns:

popt (numpy.ndarray) – Optimal values for the parameters so that the sum of the squared residuals of f(xdata, *popt) - ydata is minimized.
pcov (numpy.ndarray) – The estimated covariance of popt. The diagonals provide the variance of the parameter estimates. To compute one standard deviation errors on the parameters use perr = np.sqrt(np.diag(pcov)).
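
The following sketch shows how such a fit could be set up with scipy.optimize.curve_fit on synthetic data: the points inside a chosen window around the gain peak are masked out, a Gaussian is fitted to the remainder, and the parameter errors are derived from pcov. The window bounds, the synthetic spectrum and the function names are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian_sketch(x, a, sigma, x0):
    return a * np.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(2)
x = np.linspace(-3.0, 3.0, 601)
# Synthetic spectrum: ZLP plus a small gain peak near -1.1 eV and noise.
y = (gaussian_sketch(x, 1000.0, 0.15, 0.0)
     + 30.0 * 0.2 ** 2 / ((x + 1.1) ** 2 + 0.2 ** 2)
     + rng.normal(0.0, 2.0, x.size))

# Exclude the window that contains the gain peak (bounds are illustrative).
x_left, x_right = -1.6, -0.6
keep = (x < x_left) | (x > x_right)
popt, pcov = curve_fit(gaussian_sketch, x[keep], y[keep], p0=[y.max(), 0.2, 0.0])
perr = np.sqrt(np.diag(pcov))    # one-standard-deviation parameter errors
```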

pearson(x, I_max, x_0, w, m)[source]

Pearson VII function calculated as

\[I(x, I_{\mathrm{max}}, x_0, w, m) = I_{\mathrm{max}} \frac{w^{2m}}{\left[w^2 + \left(2^\frac{1}{m} - 1 \right) \left(x - x_0 \right)^2 \right]^m}.\]

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
I_max (float) – Height of the peak.
x_0 (float) – Energy loss at the center of the peak.
w (float) – Parameter related to the width of the peak.
m (float) – Parameter chosen to suit a particular peak shape.

Returns:

Pearson VII function.

Return type:

numpy.ndarray
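
A minimal sketch of the Pearson VII profile is given below. It uses the standard form in which the bracket in the denominator is raised to the power m, so that the peak height equals I_max and the function falls to half its height at x_0 ± w. The function name is hypothetical.

```python
import numpy as np

def pearson_vii_sketch(x, I_max, x_0, w, m):
    """Pearson VII profile with peak height I_max and half width at half maximum w."""
    x = np.asarray(x, dtype=float)
    shape_factor = 2.0 ** (1.0 / m) - 1.0
    return I_max * w ** (2 * m) / (w ** 2 + shape_factor * (x - x_0) ** 2) ** m

# Evaluate at the centre and at one half-width from the centre.
peak = pearson_vii_sketch(np.array([0.0, 0.2]), I_max=100.0, x_0=0.0, w=0.2, m=1.5)
```

For m = 1 this reduces to a Lorentzian, and for large m it approaches a Gaussian, which is why m is described as the shape parameter.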

plot_all_results(i=None, j=None, monte_carlo=False)[source]: Plot all components: original signal, Model-fit ZLP, subtracted spectrum, Lorentzian-fit gain peak.

plot_array2D_of_intensity_at_energy_loss(eloss=-1.1, cmap='turbo', cbar_title='intensity', dm4_filename='')[source]

Example heatmap plot: SI showing only the intensity at an energy loss of -1.1 eV.

Parameters:

eloss (float, optional) – energy loss value. Defaults to -1.1.
cmap (str, optional) – color map. Defaults to ‘turbo’.
cbar_title (str, optional) – color bar title. Defaults to ‘intensity’.
dm4_filename (str, optional) – file name. Defaults to ‘’.

Returns:

fig – pyplot figure
ax – pyplot axis

pseudo_voigt(x, I_max, x_0, f, eta)[source]

Linear combination of a Gaussian and Lorentzian function, both described by the same FWHM f.

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
I_max (float) – Height of the peak.
x_0 (float) – Energy loss at the center of the peak.
f (float) – FWHM.
eta (float) – Mixing parameter.

Returns:

Pseudo-Voigt function.

Return type:

numpy.ndarray
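
The linear combination can be sketched as follows. Two points here are assumptions rather than documented behaviour: eta is taken to weight the Lorentzian (with 1 - eta weighting the Gaussian), and both components are normalised to unit height so that I_max is the peak height. The function name is hypothetical.

```python
import numpy as np

def pseudo_voigt_sketch(x, I_max, x_0, f, eta):
    """Pseudo-Voigt: mix of a Gaussian and a Lorentzian sharing the FWHM f."""
    x = np.asarray(x, dtype=float)
    sigma = f / (2.0 * np.sqrt(2.0 * np.log(2.0)))   # Gaussian width from FWHM
    gam = f / 2.0                                    # Lorentzian half width from FWHM
    gauss = np.exp(-((x - x_0) ** 2) / (2.0 * sigma ** 2))
    lorentz = gam ** 2 / ((x - x_0) ** 2 + gam ** 2)
    # Assumed convention: eta weights the Lorentzian, (1 - eta) the Gaussian.
    return I_max * (eta * lorentz + (1.0 - eta) * gauss)

f = 0.4
values = pseudo_voigt_sketch(np.array([0.0, f / 2.0]), I_max=10.0, x_0=0.0, f=f, eta=0.3)
```

Because both components share the same FWHM, the mixture also has FWHM f for any value of eta.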

signal_subtracted(function='Gaussian', **kwargs)[source]

Generate signal spectrum minus model fitted ZLP.

Parameters:

function (str, {'Gaussian', 'Split Gausian', 'Lorentzian', 'Pearson VII', 'Split Pearson VII', 'Pseudo-Voigt', 'Split Pseudo-Voigt', 'Generalised Peak', 'Kaiser window'}) – Model choice for the ZLP.
kwargs (dict, optional) – Additional keyword arguments.

Returns:

1D array of signal spectrum minus model fitted ZLP.

Return type:

numpy.ndarray

split_gaussian(x, a, sigma_left, sigma_right, x0=0)[source]

Gaussian centered around x = x0 with a different standard deviation for x < x0 and x > x0.

Parameters:

x (numpy.ndarray) – 1D array of the energy loss.
x0 (float) – Energy loss at the center of the Gaussian.
a (float) – Height of the Gaussian peak.
sigma_left (float) – Standard deviation of the left half of the Gaussian.
sigma_right (float) – Standard deviation of the right half of the Gaussian.

Returns:

Split Gaussian.

Return type:

numpy.ndarray
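
A split profile only needs the width to be chosen point by point according to which side of the centre each x value lies on. The sketch below does this with np.where for the Gaussian case; the function name is hypothetical and the height-normalised Gaussian form is an assumption based on the parameter descriptions.

```python
import numpy as np

def split_gaussian_sketch(x, a, sigma_left, sigma_right, x0=0.0):
    """Gaussian of height a with separate widths left and right of x0."""
    x = np.asarray(x, dtype=float)
    sigma = np.where(x < x0, sigma_left, sigma_right)
    return a * np.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))

# One sigma to the left, the centre, and one sigma to the right.
values = split_gaussian_sketch(np.array([-0.1, 0.0, 0.2]), a=10.0,
                               sigma_left=0.1, sigma_right=0.2)
```

The same pattern carries over to the split Lorentzian, Pearson VII and Pseudo-Voigt variants by switching the width parameter instead of sigma.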

split_lorentzian(x, x0, a, gam_left, gam_right)[source]

Lorentzian centered around x = x0 with a different FWHM for x < x0 and x > x0.

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
x0 (float) – Energy loss at the center of the Lorentzian.
a (float) – Height of the Lorentzian peak.
gam_left (float) – 2 * gam_left is full width at half maximum (FWHM) of the left half of the Lorentzian.
gam_right (float) – 2 * gam_right is full width at half maximum (FWHM) of the right half of the Lorentzian.

Returns:

Split Lorentzian.

Return type:

numpy.ndarray

split_pearson(x, I_max, x_0, w_left, w_right, m)[source]

Pearson VII function with a different w for x < x_0 and x > x_0.

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
I_max (float) – Height of the peak.
x_0 (float) – Energy loss at the center of the peak.
w_left (float) – Parameter related to the width of the peak for x < x_0.
w_right (float) – Parameter related to the width of the peak for x > x_0.
m (float) – Parameter chosen to suit a particular peak shape.

Returns:

Split Pearson VII function.

Return type:

numpy.ndarray

split_pseudo_voigt(x, I_max, x_0, f_left, f_right, eta)[source]

Pseudo-Voigt function with a different FWHM for x < x_0 and x > x_0.

Parameters:

x (numpy.ndarray) – 1D array of energy loss.
I_max (float) – Height of the peak.
x_0 (float) – Energy loss at the center of the peak.
f_left (float) – FWHM for the left half of the function.
f_right (float) – FWHM for the right half of the function.
eta (float) – Mixing parameter.

Returns:

Split Pseudo-Voigt function.

Return type:

numpy.ndarray