Tasks for spatial sampling of points, for example to build training and validation samples.
Class to perform point sampling of a label image
Class that handles sampling of points from a label image representing classification labels. Labels are encoded as uint8 and the raster is a 2D or single-channel 3D array.
Supported operations include:
- exclusion of some labels from sampling
- sampling based on label frequency in raster or even sampling of labels (i.e. over-sampling)
Initialisation of sampler parameters
- labels (list(int)) – A list of labels that will be sampled
- even_sampling (bool) – Whether to sample class labels evenly or not. If True, all labels get the same number of samples, with less frequent labels being over-sampled (i.e. the same observation is sampled multiple times). If False, sampling follows the label distribution in the raster. Default is False.
Sample n_samples points from raster
- raster (uint8 numpy array) – Input 2D or single-channel 3D label image
- n_samples (uint32) – Number of points to sample in total
List of row indices of samples, list of column indices of samples
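The difference between frequency-based and even sampling can be sketched in a few lines. This is a minimal illustration and not the library's implementation: the function name `sample_label_points`, the `seed` argument, and the use of nested lists instead of a numpy array are all assumptions made so the sketch runs with the standard library only. It also assumes sampling with replacement in both modes.

```python
import random
from collections import defaultdict


def sample_label_points(raster, labels, n_samples, even_sampling=False, seed=None):
    """Hypothetical helper: return (rows, cols) of sampled pixels whose label is in `labels`."""
    rng = random.Random(seed)

    # Group pixel coordinates by label; labels not listed are excluded from sampling.
    by_label = defaultdict(list)
    for r, row in enumerate(raster):
        for c, value in enumerate(row):
            if value in labels:
                by_label[value].append((r, c))
    if not by_label:
        return [], []

    if even_sampling:
        # Same number of draws per label. Drawing with replacement means rare
        # labels are over-sampled. If n_samples is not divisible by the number
        # of labels, slightly fewer than n_samples points are returned.
        per_label = n_samples // len(by_label)
        picks = []
        for label in sorted(by_label):
            picks.extend(rng.choices(by_label[label], k=per_label))
    else:
        # Draw from the pooled pixels, so counts follow the label frequency.
        pool = [p for label in sorted(by_label) for p in by_label[label]]
        picks = rng.choices(pool, k=n_samples)

    rows = [r for r, _ in picks]
    cols = [c for _, c in picks]
    return rows, cols
```

With `even_sampling=True` and two labels, a request for 10 samples yields 5 points per label even if one label covers a single pixel; that pixel is then simply returned several times.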
PointSampler(raster_mask, no_data_value=None, ignore_labels=None)
Randomly samples points from a raster mask, where the number of points sampled from a polygon with a specific label is proportional to its area.
The sampler first vectorizes the raster mask and then samples.
- raster_mask (numpy array of shape (height, width) and type int) – A raster mask based on which the points are sampled.
- no_data_value (integer) – The value that marks no-data pixels, i.e. points that are not labeled and should not be sampled.
- ignore_labels (list of integers) – A list of label values that should not be sampled.
Returns the area of the selected polygon if index is provided or of all polygons if it’s not.
Tests whether point lies within the polygon
Returns all label values found in the raster mask (except for the no_data_value and label values from ignore_labels).
Selects a random point in the interior of a rectangle
Parameters: bounds (tuple(float)) – Rectangle coordinates (x_min, y_min, x_max, y_max)
Returns: Random point from the interior of the rectangle
Return type: tuple of x and y coordinates
Selects a random point in the interior of a rectangle
Parameters: bounds (tuple(float)) – Rectangle coordinates (x_min, y_min, x_max, y_max)
Returns: Random point from the interior of the rectangle
Return type: shapely.geometry.Point
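Drawing a uniform point in a rectangle amounts to drawing each coordinate independently. The sketch below mirrors the tuple-returning variant; the function name and the optional `rng` argument are assumptions for illustration. The Point-returning variant would wrap the same coordinates in a `shapely.geometry.Point`.

```python
import random


def random_point_in_rectangle(bounds, rng=random):
    """Hypothetical helper: uniform random (x, y) inside (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = bounds
    # Independent uniform draws along each axis give a uniform point in the rectangle.
    return rng.uniform(x_min, x_max), rng.uniform(y_min, y_max)
```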
Selects a random point in the interior of a triangle
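One common way to pick a uniform point in a triangle is to draw two uniform numbers, reflect them if they fall outside the unit simplex, and use them as weights on two edge vectors. The sketch below shows that technique; it is not necessarily how the library does it, and the function name and `rng` argument are hypothetical.

```python
import random


def random_point_in_triangle(a, b, c, rng=random):
    """Hypothetical helper: uniform random (x, y) inside the triangle with vertices a, b, c."""
    u, v = rng.random(), rng.random()
    # (u, v) is uniform in the unit square; reflecting the upper half across
    # the diagonal folds it onto the triangle u + v <= 1 without bias.
    if u + v > 1:
        u, v = 1 - u, 1 - v
    x = a[0] + u * (b[0] - a[0]) + v * (c[0] - a[0])
    y = a[1] + u * (b[1] - a[1]) + v * (c[1] - a[1])
    return x, y
```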
Sample n points from the provided raster mask. The number of points belonging to each class is proportional to the area this class covers in the raster mask, if weighted is set to True.
TODO: If a polygon has holes, the number of sampled points will be less than nsamples
- nsamples (integer) – Number of points to sample
- weighted (bool, default is True) – flag to apply weights proportional to total area of each class/polygon when sampling
Returns a random polygon of any class. The probability of each polygon to be sampled is proportional to its area if weighted is True.
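Area-weighted selection of a polygon can be sketched with `random.choices`. The `Polygon` record and the function name below are hypothetical stand-ins used only to keep the example self-contained; the library stores its polygons differently.

```python
import random
from collections import namedtuple

# Hypothetical stand-in for a vectorized polygon: a class label and an area.
Polygon = namedtuple("Polygon", ["label", "area"])


def pick_random_polygon(polygons, weighted=True, rng=random):
    """Return one polygon; selection probability is proportional to area if weighted."""
    if weighted:
        weights = [p.area for p in polygons]
        return rng.choices(polygons, weights=weights, k=1)[0]
    # Unweighted: every polygon is equally likely, regardless of size.
    return rng.choice(polygons)
```

With `weighted=True`, a polygon covering 99% of the labeled area is returned in about 99% of the draws; with `weighted=False`, small polygons are picked as often as large ones.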
Returns randomly sampled points from a polygon.
Expected complexity of this procedure is O(A/a * nsamples), where A = area(bbox(P)), a = area(P), and P is the polygon of the connected component cc_index
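A cost proportional to A/a is what rejection sampling inside the bounding box gives: each candidate point lands in the polygon with probability a/A, so roughly A/a candidates are needed per accepted sample. The sketch below illustrates this under the assumption that the polygon is a simple ring of (x, y) vertices without holes; the function names and the ray-casting containment test are illustrative and not taken from the library.

```python
import random


def contains(polygon, point):
    """Ray-casting test: True if `point` is inside `polygon` (a list of (x, y) vertices)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count crossings of a ray going right from the point.
            if x < x_cross:
                inside = not inside
    return inside


def sample_within_polygon(polygon, nsamples=1, rng=random):
    """Hypothetical helper: draw points in the bounding box until nsamples fall inside."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    x_min, y_min, x_max, y_max = min(xs), min(ys), max(xs), max(ys)
    samples = []
    while len(samples) < nsamples:
        candidate = (rng.uniform(x_min, x_max), rng.uniform(y_min, y_max))
        if contains(polygon, candidate):
            samples.append(candidate)
    return samples
```

For an L-shaped polygon that fills three quarters of its bounding box, about four candidates are drawn for every three accepted points.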
PointSamplingTask(n_samples, ref_mask_feature, ref_labels, sample_features, return_new_eopatch=False, **sampling_params)
Task for spatially sampling points from a time-series.
This task performs random spatial sampling of a time-series based on a label mask. The user specifies the number of points to be sampled, the name of the DATA time-series, the name of the label raster image, and the name of the output sample features and sampled labels.
Initialise sampling task.
The data to be sampled is expected to be a time-series stored in a DATA feature of the eopatch, while the raster image is expected to be stored in MASK_TIMELESS. The output sampled features are stored in DATA and have shape T x N_SAMPLES x 1 x D, where T is the number of time-frames, N_SAMPLES is the number of random samples, and D is the number of channels of the input time-series.
The row and column index of sampled points can also be stored in the eopatch, to allow the same random sampling of other masks.
- n_samples (int) – Number of random spatial points to be sampled from the time-series
- ref_mask_feature (str) – Name of MASK_TIMELESS raster image to be used as a reference for sampling
- ref_labels (list(int)) – List of labels of ref_mask_feature mask which will be sampled
- sample_features (list(tuple(FeatureType, str, str) or tuple(FeatureType, str))) –
A collection of features that will be sampled. Each feature is represented by a tuple of the form (FeatureType, '<feature_name>') or (FeatureType, '<feature_name>', '<sampled_feature_name>'). If the sampled feature name is not set, the default name '<feature_name>_SAMPLED' will be used.
Example: [(FeatureType.DATA, 'NDVI'), (FeatureType.MASK, 'cloud_mask', 'cloud_mask_1')]
- return_new_eopatch (bool) – If True, the task will create a new EOPatch, put the sampled data and a copy of the timestamps and meta_info in it, and return it. If False, it will add the sampled data to the input EOPatch and return that.
- sampling_params – Any other parameter used by PointRasterSampler class
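The naming rule for sample_features can be made concrete with a small sketch. The `FeatureType` enum below is a stand-in defined only so the example runs on its own, and `resolve_sampled_names` is a hypothetical helper, not part of the library.

```python
from enum import Enum


class FeatureType(Enum):
    """Stand-in for the library's FeatureType, reduced to two members."""
    DATA = "data"
    MASK = "mask"


def resolve_sampled_names(sample_features):
    """Expand 2-tuples to 3-tuples by applying the '<feature_name>_SAMPLED' default."""
    resolved = []
    for feature in sample_features:
        if len(feature) == 2:
            feature_type, name = feature
            new_name = f"{name}_SAMPLED"
        else:
            feature_type, name, new_name = feature
        resolved.append((feature_type, name, new_name))
    return resolved
```

Applied to the example above, the NDVI entry resolves to the output name 'NDVI_SAMPLED', while the cloud mask keeps its explicit name 'cloud_mask_1'.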
Execute random spatial sampling of time-series stored in the input eopatch
An EOPatch with spatially sampled temporal features and associated labels
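The T x N_SAMPLES x 1 x D output shape follows from picking the same (row, column) positions in every time-frame and keeping a singleton axis in place of the width. The sketch below uses nested lists so it runs with the standard library only; the function name is hypothetical and the real task operates on numpy arrays inside an EOPatch.

```python
def sample_time_series(data, rows, cols):
    """Hypothetical helper: pick pixels (rows[i], cols[i]) from a T x H x W x D nested list.

    Returns a T x N x 1 x D nested list, where N = len(rows).
    """
    return [
        # One entry per sampled point, each wrapped in a length-1 list
        # that plays the role of the singleton spatial axis.
        [[list(frame[r][c])] for r, c in zip(rows, cols)]
        for frame in data
    ]
```

With numpy arrays, the same selection can be written as `data[:, rows, cols][:, :, None, :]`. Reusing the stored row and column indices on another mask returns the labels of exactly the same points.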