Module for computing clusters in EOPatch

class eolearn.features.clustering.ClusteringTask(features, new_feature_name, distance_threshold=None, n_clusters=None, affinity='cosine', linkage='single', remove_small=0, connectivity=None, mask_name=None)[source]

Bases: eolearn.core.eotask.EOTask

Task that computes clusters on selected features using sklearn.cluster.AgglomerativeClustering.

The algorithm produces a timeless data feature in which each cell holds a natural number identifying the cluster it belongs to. Cells marked with -1 do not belong to any cluster: they were either excluded by a mask or removed afterwards because their cluster fell below the ‘remove_small’ threshold.
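The -1 convention described above can be illustrated with a small pure-Python sketch. It does not call eo-learn or scikit-learn and is not the library's implementation: the helper name `relabel_clusters`, the flat-list representation of the label map, and the reading that a mask value of 0 means "excluded" are all assumptions made for illustration.

```python
from collections import Counter


def relabel_clusters(labels, mask=None, remove_small=0):
    """Return a copy of `labels` where excluded cells are set to -1.

    Hypothetical helper, not part of eo-learn.
    labels       -- flat list of non-negative cluster ids, one per cell
    mask         -- optional flat list; 0 marks a cell excluded from clustering (assumed)
    remove_small -- clusters with fewer cells than this are dropped
    """
    if mask is None:
        mask = [1] * len(labels)

    # Cells outside the mask never belong to a cluster.
    out = [lab if keep else -1 for lab, keep in zip(labels, mask)]

    if remove_small > 0:
        # Count the remaining cells per cluster and drop the small ones.
        sizes = Counter(lab for lab in out if lab != -1)
        out = [-1 if lab != -1 and sizes[lab] < remove_small else lab
               for lab in out]
    return out


labels = [0, 0, 0, 1, 2, 2]
mask = [1, 1, 0, 1, 1, 1]
result = relabel_clusters(labels, mask, remove_small=2)
print(result)  # [0, 0, -1, -1, 2, 2]
```

In this sketch the third cell is excluded by the mask, and cluster 1 is removed because it contains a single cell, which is fewer than `remove_small=2`.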

Class constructor

  • features (dict(FeatureType.DATA_TIMELESS: set(str))) – A collection of features used for clustering. The features need to be of type DATA_TIMELESS

  • new_feature_name (str) – Name of feature that is the result of clustering

  • distance_threshold (float or None) – The linkage distance threshold above which clusters will not be merged. If not None, n_clusters must be None and compute_full_tree must be True

  • n_clusters (int or None) – The number of clusters found by the algorithm. If distance_threshold=None, it will be equal to the given n_clusters

  • affinity (str) – Metric used to compute the linkage. Can be “euclidean”, “l1”, “l2”, “manhattan”, “cosine”.

  • linkage ({“ward”, “complete”, “average”, “single”}) – Which linkage criterion to use. The linkage criterion determines which distance to use between sets of observations. The algorithm merges the pair of clusters that minimizes this criterion.
      - ward minimizes the variance of the clusters being merged.
      - average uses the average of the distances of each observation of the two sets.
      - complete or maximum linkage uses the maximum distances between all observations of the two sets.
      - single uses the minimum of the distances between all observations of the two sets.

  • remove_small (int) – If greater than 0, removes all clusters that have fewer points than “remove_small”

  • connectivity (array-like, callable or None) – Connectivity matrix. Defines for each sample the neighboring samples following a given structure of the data. This can be a connectivity matrix itself or a callable that transforms the data into a connectivity matrix, such as one derived from neighbors_graph. If set to None, the graph that connects adjacent pixels is used.

  • mask_name (str) – An optional mask feature used for exclusion of the area from clustering
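To make the linkage and distance_threshold parameters above more concrete, the following pure-Python sketch computes the single, complete and average linkage distances between two small clusters. It is an illustration under stated assumptions, not how scikit-learn or eo-learn compute them internally: the function name `linkage_distance` is hypothetical, Euclidean distance is used instead of the task's default cosine affinity, and ward linkage is left out because it is defined through variance rather than pairwise distances.

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def linkage_distance(cluster_a, cluster_b, linkage):
    """Distance between two sets of points under a given linkage criterion.

    Hypothetical helper for illustration only.
    """
    # All pairwise distances between a point of A and a point of B.
    pairwise = [dist(p, q) for p, q in product(cluster_a, cluster_b)]
    if linkage == "single":
        return min(pairwise)
    if linkage == "complete":
        return max(pairwise)
    if linkage == "average":
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unsupported linkage: {linkage!r}")


cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

single = linkage_distance(cluster_a, cluster_b, "single")      # 3.0
complete = linkage_distance(cluster_a, cluster_b, "complete")  # 6.0
average = linkage_distance(cluster_a, cluster_b, "average")    # 4.5

# With a threshold of 4.0, the two clusters would still be merged under
# single linkage but not under complete or average linkage.
distance_threshold = 4.0
merged = {name: value < distance_threshold
          for name, value in [("single", single),
                              ("complete", complete),
                              ("average", average)]}
print(merged)  # {'single': True, 'complete': False, 'average': False}
```

The same pair of clusters can therefore end up merged or separate depending on the linkage criterion, which is why distance_threshold should be chosen together with linkage.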


eopatch – EOPatch with all features that will be used


array of vectors constructed from the features listed


eopatch (EOPatch) – Input EOPatch


Transformed EOPatch

Return type