eolearn.geometry.sampling
Tasks for spatial sampling of points, for example to build training/validation samples.
- class eolearn.geometry.sampling.PointSampler(raster_mask, no_data_value=None, ignore_labels=None)[source]
Bases:
object
Randomly samples points from a raster mask. The number of points sampled from a polygon with a given label is proportional to the polygon's area.
The sampler first vectorizes the raster mask and then samples.
- Parameters
raster_mask (A numpy array of shape (height, width) and type int.) – A raster mask based on which the points are sampled.
no_data_value (integer) – The value marking no-data pixels, i.e. points that are not labeled and should not be sampled.
ignore_labels (list of integers) – A list of label values that should not be sampled.
- labels()[source]
Returns all label values found in the raster mask (except for the no_data_value and label values from ignore_labels).
- area(cc_index=None)[source]
Returns the area of the selected polygon if cc_index is provided, or of all polygons if it is not.
- sample(nsamples=1, weighted=True)[source]
Samples nsamples points from the provided raster mask. If weighted is set to True, the number of points belonging to each class is proportional to the area that class covers in the raster mask.
TODO: If a polygon has holes, the number of sampled points will be less than nsamples.
- Parameters
nsamples (integer) – number of points to sample
weighted (bool, default is True) – flag to apply weights proportional to total area of each class/polygon when sampling
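The area-proportional behaviour described above can be illustrated with a minimal sketch. It is not the PointSampler implementation: it works directly on pixels instead of vectorized polygons, and the helper name sample_points_by_class_area and its seed argument are assumptions made for illustration.

```python
import numpy as np


def sample_points_by_class_area(raster_mask, nsamples=1, no_data_value=None,
                                ignore_labels=(), weighted=True, seed=None):
    """Hypothetical helper: draw (row, col, label) points from a label mask.

    With weighted=True a class is chosen with probability proportional to
    the number of pixels (area) it covers; otherwise classes are equally likely.
    """
    rng = np.random.default_rng(seed)
    labels, counts = np.unique(raster_mask, return_counts=True)

    # Drop the no-data value and any labels the caller wants to ignore.
    excluded = {int(label) for label in ignore_labels}
    if no_data_value is not None:
        excluded.add(int(no_data_value))
    keep = np.array([int(label) not in excluded for label in labels], dtype=bool)
    labels, counts = labels[keep], counts[keep]
    if labels.size == 0:
        raise ValueError("no labels left to sample from")

    # Class probabilities: proportional to area, or uniform.
    if weighted:
        probs = counts / counts.sum()
    else:
        probs = np.full(labels.size, 1.0 / labels.size)
    chosen = rng.choice(labels, size=nsamples, p=probs)

    # For every chosen class, pick one of its pixels uniformly at random.
    points = []
    for label in chosen:
        rows, cols = np.nonzero(raster_mask == label)
        i = rng.integers(rows.size)
        points.append((int(rows[i]), int(cols[i]), int(label)))
    return points


mask = np.array([[0, 0, 1, 1],
                 [0, 2, 2, 1],
                 [2, 2, 2, 1]])
points = sample_points_by_class_area(mask, nsamples=50, no_data_value=0, seed=0)
```

In this toy mask label 2 covers five pixels and label 1 covers four, so with weighted=True label 2 is drawn slightly more often on average.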
- sample_cc(nsamples=1, weighted=True)[source]
Returns a random polygon of any class. If weighted is True, the probability of a polygon being sampled is proportional to its area.
- sample_within_cc(cc_index, nsamples=1)[source]
Returns randomly sampled points from a polygon.
The complexity of this procedure is proportional to A/a * nsamples, where A = area(bbox(P)), a = area(P), and P is the polygon of the connected component cc_index.
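A cost that scales with A/a is what rejection sampling inside the bounding box would give: each uniform draw from the bounding box lands inside the polygon with probability a/A. The source does not state the method, so the sketch below is an assumption. It uses only the standard library, and the names point_in_polygon and sample_within_polygon are hypothetical.

```python
import random


def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the simple polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_within_polygon(vertices, nsamples=1, seed=None):
    """Draw points uniformly from the bounding box and keep those inside.

    Returns the accepted points and the number of trials needed. The polygon
    must have non-zero area, otherwise the loop never terminates.
    """
    rng = random.Random(seed)
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)

    points, trials = [], 0
    while len(points) < nsamples:
        trials += 1
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, vertices):
            points.append((x, y))
    return points, trials


# Right triangle: area a = 8, bounding box area A = 16, so A/a = 2.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
points, trials = sample_within_polygon(triangle, nsamples=200, seed=0)
```

For this triangle about half of the draws are rejected, so the number of trials is on average about twice nsamples.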
- class eolearn.geometry.sampling.PointRasterSampler(labels, even_sampling=False)[source]
Bases:
object
Class to perform point sampling of a label image
Class that handles sampling of points from a label image representing classification labels. Labels are encoded as uint8 and the raster is a 2D or single-channel 3D array.
- Supported operations include:
exclusion of some labels from sampling
sampling based on label frequency in raster or even sampling of labels (i.e. over-sampling)
Initialisation of sampler parameters
- Parameters
even_sampling (bool) – Whether to sample class labels evenly or not. If True, all labels get the same number of samples, with less frequent labels being over-sampled (i.e. the same observation is sampled multiple times). If False, sampling follows the label distribution in the raster. Default is False.
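The difference between the two modes can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the PointRasterSampler code: the function sample_label_raster and its exclude and seed arguments are hypothetical.

```python
import numpy as np


def sample_label_raster(labels, n_samples, even_sampling=False, exclude=(), seed=None):
    """Hypothetical helper: return row and column indices of sampled pixels."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    if labels.ndim == 3:
        if labels.shape[-1] != 1:
            raise ValueError("a 3D label raster must have a single channel")
        labels = labels[..., 0]

    # Candidate pixels: everything whose label is not excluded.
    rows, cols = np.nonzero(~np.isin(labels, list(exclude)))
    values = labels[rows, cols]
    if values.size == 0:
        raise ValueError("no pixels left to sample from")

    if even_sampling:
        # Same number of samples per class; rare classes are over-sampled,
        # i.e. drawn with replacement when they have too few pixels.
        classes = np.unique(values)
        per_class = n_samples // classes.size
        picks = np.concatenate([
            rng.choice(idx, size=per_class, replace=idx.size < per_class)
            for idx in (np.flatnonzero(values == c) for c in classes)
        ])
    else:
        # Uniform draw over pixels, so classes follow their raster frequency.
        picks = rng.choice(values.size, size=min(n_samples, values.size), replace=False)
    return rows[picks], cols[picks]


label_raster = np.zeros((10, 10), dtype=np.uint8)
label_raster[0, :3] = 1   # rare class: 3 pixels
label_raster[5:, :] = 2   # frequent class: 50 pixels
even_rows, even_cols = sample_label_raster(label_raster, 20, even_sampling=True,
                                           exclude=[0], seed=1)
```

With even_sampling=True both remaining classes receive 10 samples each, so the three pixels of class 1 are necessarily repeated.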
- class eolearn.geometry.sampling.PointSamplingTask(*args, **kwargs)[source]
Bases:
eolearn.core.eotask.EOTask
Task for spatially sampling points from a time-series.
This task performs random spatial sampling of a time-series based on a label mask. The user specifies the number of points to be sampled, the name of the DATA time-series, the name of the label raster image, and the name of the output sample features and sampled labels.
Initialise sampling task.
The data to be sampled is expected to be a time-series stored as a DATA feature of the eopatch, while the raster image is expected to be stored in MASK_TIMELESS. The output sampled features are stored in DATA and have shape T x N_SAMPLES x 1 x D, where T is the number of time-frames, N_SAMPLES is the number of random samples, and D is the number of channels of the input time-series.
The row and column index of sampled points can also be stored in the eopatch, to allow the same random sampling of other masks.
- Parameters
n_samples (int) – Number of random spatial points to be sampled from the time-series
ref_mask_feature (str) – Name of MASK_TIMELESS raster image to be used as a reference for sampling
ref_labels (list(int)) – List of labels of ref_mask_feature mask which will be sampled
sample_features (list(tuple(FeatureType, str, str) or tuple(FeatureType, str))) –
A collection of features that will be sampled. Each feature is represented by a tuple of the form (FeatureType, ‘<feature_name>’) or (FeatureType, ‘<feature_name>’, ‘<sampled feature name>’). If the sampled feature name is not set, the default name ‘<feature_name>_SAMPLED’ will be used.
Example: [(FeatureType.DATA, ‘NDVI’), (FeatureType.MASK, ‘cloud_mask’, ‘cloud_mask_1’)]
sampling_fraction (float or None) – The maximal fraction of samples to be taken. The value should be in the interval [0, 1].
return_new_eopatch (bool) – If True, the task creates a new EOPatch, puts the sampled data and a copy of the timestamps and meta_info data in it, and returns it. If False, it adds the sampled data to the input EOPatch and returns that.
sampling_params – Any other parameters used by the PointRasterSampler class
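The T x N_SAMPLES x 1 x D output shape, and the reuse of stored row and column indices on other masks, can be sketched with plain NumPy indexing. The arrays below are random stand-ins for eopatch features, the helper sample_at_indices is hypothetical, and the N_SAMPLES x 1 x D shape for timeless features is an assumption made by analogy with the time-dependent case.

```python
import numpy as np


def sample_at_indices(feature, rows, cols):
    """Hypothetical helper: gather pixels at (rows, cols), keeping a width axis of 1."""
    if feature.ndim == 4:   # time-dependent: T x H x W x D -> T x N x 1 x D
        return feature[:, rows, cols, :][:, :, np.newaxis, :]
    if feature.ndim == 3:   # timeless: H x W x D -> N x 1 x D
        return feature[rows, cols, :][:, np.newaxis, :]
    raise ValueError("expected a 3D or 4D array")


rng = np.random.default_rng(0)
T, H, W, D = 4, 6, 5, 3
data = rng.random((T, H, W, D))                # stand-in for a DATA feature
labels = rng.integers(0, 3, size=(H, W, 1))    # stand-in for a MASK_TIMELESS feature

n_samples = 8
rows = rng.integers(0, H, size=n_samples)
cols = rng.integers(0, W, size=n_samples)

# The same indices are applied to both arrays, so sample i of the data
# and sample i of the labels refer to the same pixel.
sampled_data = sample_at_indices(data, rows, cols)
sampled_labels = sample_at_indices(labels, rows, cols)

print(sampled_data.shape)    # (4, 8, 1, 3)
print(sampled_labels.shape)  # (8, 1, 1)
```

Keeping rows and cols around is what allows the same random sample to be taken from any other mask of the same height and width.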