2. Documentation for the som module

class som.SOM(x: int, y: int, alpha_start: float = 0.6, sigma_start: Optional[float] = None, seed: Optional[int] = None)

Class implementing a self-organizing map with periodic boundary conditions. It has the following methods:

cycle(vector: ndarray, verbose: bool = True)

Perform one iteration of adapting the SOM towards a chosen data point.

Parameters

vector (np.ndarray) – current data point
verbose (bool) – verbosity control
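
As a rough illustration of what a single adaptation step in a self-organizing map usually involves, the sketch below finds the best-matching neuron and pulls the neurons around it towards the data point. It is a generic textbook update written in plain Python, not the module's implementation: the helpers `torus_delta` and `som_step`, the Gaussian neighborhood, and the fixed `alpha` and `sigma` values are all assumptions made for illustration.

```python
import math


def torus_delta(a, b, size):
    """Shortest distance between indices a and b on a ring of length size."""
    diff = abs(a - b)
    return min(diff, size - diff)


def som_step(weights, vector, alpha=0.6, sigma=1.0):
    """One generic SOM update on a periodic grid (hypothetical helper).

    weights: nested list, weights[i][j] is a list of floats (modified in place)
    vector:  list of floats, the current data point
    Returns the grid position (i, j) of the winning neuron.
    """
    x, y = len(weights), len(weights[0])
    # 1. Winner: the neuron whose weights are closest to the data point.
    winner = min(
        ((i, j) for i in range(x) for j in range(y)),
        key=lambda ij: math.dist(weights[ij[0]][ij[1]], vector),
    )
    # 2. Pull every neuron towards the data point, weighted by a Gaussian
    #    of its wrapped grid distance to the winner.
    for i in range(x):
        for j in range(y):
            d2 = (torus_delta(i, winner[0], x) ** 2
                  + torus_delta(j, winner[1], y) ** 2)
            h = math.exp(-d2 / (2 * sigma ** 2))
            w = weights[i][j]
            for k in range(len(w)):
                w[k] += alpha * h * (vector[k] - w[k])
    return winner
```

In this sketch the winner itself moves by the full fraction `alpha` of its distance to the data point, while neurons further away on the wrapped grid move less.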

distance_map(metric: str = 'euclidean')

Get the distance map of the neuron weights. Every cell is the normalized average of all distances between the neuron and all other neurons.

Parameters: metric (str) – distance metric to be used (see scipy.spatial.distance.cdist)
Returns: normalized sum of distances for every neuron to its neighbors, stored in SOM.distmap
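
The idea of a normalized average distance per neuron can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `distance_map_sketch` is hypothetical, the metric is fixed to Euclidean, and "normalized" is taken to mean division by the largest cell value, which the description above does not confirm.

```python
import math


def distance_map_sketch(weights):
    """Mean Euclidean distance of each neuron to all others, scaled to [0, 1].

    weights: nested list, weights[i][j] is the weight vector of neuron (i, j).
    Assumes at least two neurons and at least two distinct weight vectors.
    """
    flat = [w for row in weights for w in row]
    # The distance of a neuron to itself is 0, so summing over all neurons
    # and dividing by (n - 1) gives the average over the *other* neurons.
    means = [
        [sum(math.dist(w, other) for other in flat) / (len(flat) - 1)
         for w in row]
        for row in weights
    ]
    peak = max(max(row) for row in means)
    return [[m / peak for m in row] for row in means]
```

Cells with values close to 1 mark neurons whose weights lie far from the rest of the map; low values mark neurons in densely populated regions of weight space.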

fit(data: ndarray, epochs: int = 0, save_e: bool = False, interval: int = 1000, decay: str = 'hill', verbose: bool = True)

Train the SOM on the given data for several iterations.

Parameters

data (np.ndarray) – data to train on
epochs (int, optional) – number of iterations to train; if 0, epochs=len(data) and every data point is used once
save_e (bool, optional) – whether to save the error history
interval (int, optional) – interval of epochs to use for saving training errors
decay (str, optional) – type of decay for alpha and sigma. Choose from ‘hill’ (Hill function) and ‘linear’, with ‘hill’ having the form y = 1 / (1 + (x / 0.5) **4)
verbose (bool) – verbosity control
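
To make the two decay options concrete, the sketch below evaluates the Hill function quoted above next to a straight-line decay. It assumes that x is the fraction of training completed (0 at the start, 1 at the end) and that 'linear' means 1 - x; neither detail is stated in this section, and the function names are hypothetical.

```python
def hill_decay(x):
    """Hill function from the decay description: 1 / (1 + (x / 0.5) ** 4)."""
    return 1.0 / (1.0 + (x / 0.5) ** 4)


def linear_decay(x):
    """Assumed linear alternative: falls from 1 to 0 over training."""
    return 1.0 - x


# Compare both curves at a few points of the (assumed) training progress.
for step in (0, 250, 500, 750, 1000):
    x = step / 1000
    print(f"x={x:.2f}  hill={hill_decay(x):.3f}  linear={linear_decay(x):.3f}")
```

The Hill curve stays near 1 early on, passes through exactly 0.5 at x = 0.5, and ends at 1/17 (about 0.059) at x = 1, so it never reaches zero under this reading.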

get_neighbors(datapoint: ndarray, data: ndarray, labels: ndarray, d: int = 0) → ndarray

Return the labels of the neighboring data instances at distance d for a given data point of interest.

Parameters

datapoint (np.ndarray) – descriptor vector of the data point of interest to check for neighbors
data (np.ndarray) – reference data to compare datapoint to
labels (np.ndarray) – array of labels describing the target classes for every data point in data
d (int) – length of Manhattan distance to explore the neighborhood (0: same neuron as data point)

Returns

found neighbors (labels)

Return type

np.ndarray
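
Because the map has periodic boundary conditions, a Manhattan distance on the grid has to wrap around the edges. The sketch below shows one way such a neighborhood lookup could work. The helpers `manhattan_pbc` and `labels_within` are hypothetical, the winning neurons are supplied directly instead of being computed from data, and d is read as an inclusive radius (distance <= d), which is an assumption.

```python
def manhattan_pbc(a, b, shape):
    """Manhattan distance between grid positions a and b on a wrapped grid."""
    return sum(min(abs(p - q), n - abs(p - q)) for p, q, n in zip(a, b, shape))


def labels_within(winner, winners, labels, shape, d=0):
    """Labels of all data points whose winning neuron lies within distance d."""
    return [lab for pos, lab in zip(winners, labels)
            if manhattan_pbc(pos, winner, shape) <= d]


# Winning neurons of four reference points on a 5 x 5 map, with their labels.
winners = [(0, 0), (4, 0), (2, 2), (0, 4)]
labels = ["a", "b", "c", "d"]

same_neuron = labels_within((0, 0), winners, labels, shape=(5, 5), d=0)
one_step = labels_within((0, 0), winners, labels, shape=(5, 5), d=1)
print(same_neuron)  # ['a']
print(one_step)     # ['a', 'b', 'd']
```

With d=1 the points at (4, 0) and (0, 4) count as neighbors of (0, 0) because the grid wraps around, even though they sit on opposite edges of the map.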

initialize(data: ndarray, how: str = 'pca')

Initialize the SOM neurons.

Parameters

data (numpy.ndarray) – data to use for initialization
how (str) – how to initialize the map; available options: pca (via the first 4 eigenvalues) or random (via normally distributed random values in the shape of data)

Returns

initialized map in SOM.map

load(filename: str)

Load a SOM instance from a pickle file.

Parameters: filename (str) – filename (best to end with .p)
Returns: updated instance with data from filename
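
The pattern of updating an existing instance from a pickle file can be sketched as follows. `TinyMap` is a hypothetical stand-in class, not the module's SOM, and its `save` counterpart and the choice to pickle the attribute dictionary are assumptions made so the example is self-contained.

```python
import os
import pickle
import tempfile


class TinyMap:
    """Hypothetical stand-in for a trained map; not the module's SOM class."""

    def __init__(self, x, y):
        self.x, self.y = x, y
        self.map = [[0.0] * y for _ in range(x)]

    def save(self, filename):
        # Store all instance attributes as one pickled dictionary.
        with open(filename, "wb") as f:
            pickle.dump(self.__dict__, f)

    def load(self, filename):
        # Overwrite this instance's attributes with the stored ones.
        with open(filename, "rb") as f:
            self.__dict__.update(pickle.load(f))


path = os.path.join(tempfile.mkdtemp(), "tiny_map.p")
trained = TinyMap(2, 3)
trained.map[0][0] = 1.5
trained.save(path)

restored = TinyMap(1, 1)   # placeholder dimensions, replaced on load
restored.load(path)
```

After `load`, the placeholder instance carries the dimensions and weights of the saved one, which mirrors the "updated instance" wording above.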

plot_class_density(data: ndarray, targets: Union[list, ndarray], t: int = 1, name: str = 'actives', colormap: str = 'gray', example_dict: Optional[dict] = None, filename: Optional[str] = None)

Plot a density map only for the given class.

Parameters

data (np.ndarray) – data to visualize the SOM density (number of times a neuron was winner)
targets (list, np.ndarray) – array of target classes (0 to len(targetnames)) corresponding to data
t (int) – target class to plot the density map for
name (str) – target name corresponding to target given in t
colormap (str) – colormap to use; select from the matplotlib sequential colormaps
example_dict (dict) – dictionary containing names of examples as keys and corresponding descriptor values as values. These examples will be mapped onto the density map and marked.
filename (str) – optional, if given, the plot is saved to this location

Returns