eolearn.mask.cloud_mask

Module for cloud masking

class eolearn.mask.cloud_mask.CloudMaskTask(*args, **kwargs)[source]

Bases: eolearn.core.eotask.EOTask

Cloud masking with an improved s2cloudless model and the SSIM-based multi-temporal classifier.

Its intended output is a cloud mask that is based on the outputs of both individual classifiers (a dilated intersection of individual binary masks). Additional cloud masks and probabilities can be added for either classifier or both.
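The "dilated intersection" mentioned above can be illustrated with a minimal pure-Python sketch. It is not the task's actual implementation: the helper names `intersect` and `dilate`, the `radius` parameter and the square window shape are assumptions made for illustration only.

```python
def intersect(mask_a, mask_b):
    """Pixel-wise logical AND of two equally sized binary masks."""
    return [[a and b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


def dilate(mask, radius):
    """Binary dilation with a square window (the window shape is an assumption)."""
    height, width = len(mask), len(mask[0])
    out = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                # Mark every pixel inside the window, clipped at the borders.
                for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                        out[yy][xx] = True
    return out


# Toy binary masks standing in for the mono- and multi-temporal outputs.
mono_mask = [
    [False, False, False, False, False],
    [False, True,  True,  False, False],
    [False, True,  True,  False, False],
    [False, False, False, False, False],
    [False, False, False, False, True],
]
multi_mask = [
    [False, False, False, False, False],
    [False, False, False, False, False],
    [False, False, True,  True,  False],
    [False, False, True,  True,  False],
    [False, False, False, False, False],
]

# Only pixels flagged by both classifiers survive; the result is then grown.
combined = dilate(intersect(mono_mask, multi_mask), radius=1)
```

In this toy case the isolated detection in the corner of `mono_mask` is dropped because the second mask does not confirm it, while the single agreed pixel grows into a 3 x 3 block.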

Prior to feature extraction and classification, it is recommended that the input be downscaled by specifying the source and processing resolutions. This should be done for the following reasons:

  • faster execution

  • lower memory consumption

  • noise mitigation

Resizing is performed with linear interpolation. After classification, the cloud probabilities are themselves upscaled to the original dimensions, before proceeding with masking operations.
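To make the effect of downscaling concrete, the sketch below computes the raster size at the processing resolution and the resulting reduction in pixel count. The helper name `processing_shape` and the rounding convention are assumptions; the task itself may compute the shape differently.

```python
def processing_shape(height, width, source_res, processing_res):
    """Return (height, width) of the raster at the processing resolution.

    source_res and processing_res are (x, y) resolutions in meters.
    """
    src_x, src_y = source_res
    proc_x, proc_y = processing_res
    # Rounding to the nearest pixel and keeping at least one pixel are assumptions.
    new_height = max(1, round(height * src_y / proc_y))
    new_width = max(1, round(width * src_x / proc_x))
    return new_height, new_width


# A 1200 x 1200 pixel patch at 10 m, processed at 120 m.
shape = processing_shape(1200, 1200, (10, 10), (120, 120))
reduction = (1200 * 1200) / (shape[0] * shape[1])
print(shape)      # (100, 100)
print(reduction)  # 144.0
```

Under these assumptions the classifiers see 144 times fewer pixels, which is where the savings in time and memory come from.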

Example usage:

from eolearn.mask.cloud_mask import CloudMaskTask

# Only output the combined mask
task1 = CloudMaskTask(processing_resolution=120,
                      mask_feature='CLM_INTERSSIM',
                      average_over=16,
                      dilation_size=8)

# Only output monotemporal masks. Only monotemporal processing is done.
task2 = CloudMaskTask(processing_resolution=120,
                      mono_features=(None, 'CLM_S2C'),
                      mask_feature=None,
                      average_over=16,
                      dilation_size=8)
Parameters
  • data_feature (str) – A data feature which stores raw Sentinel-2 reflectance bands. Default value: ‘BANDS-S2-L1C’.

  • is_data_feature (str) – A mask feature which indicates whether data is valid. Default value: ‘IS_DATA’.

  • all_bands (bool) – Flag, which indicates whether images will consist of all 13 Sentinel-2 bands or only the required 10. Default value: True.

  • processing_resolution (int or (int, int)) – Resolution to be used during the computation of cloud probabilities and masks, expressed in meters. Resolution is given as a pair of x and y resolutions. If a single value is given, it is used for both dimensions. Default is None (source resolution).

  • max_proc_frames (int) – Maximum number of frames (including the target, for multi-temporal classification) processed in a single batch iteration. To limit memory usage, the task operates on smaller batches of time frames. Default value: 11.

  • mono_features ((str or None, str or None)) – Tuple of keys to be used for storing cloud probabilities and masks (in that order!) of the mono classifier. The probabilities are added as a data feature, while masks are added as a mask feature. By default, none of them are added.

  • multi_features ((str or None, str or None)) – Tuple of keys used for storing cloud probabilities and masks of the multi classifier. The probabilities are added as a data feature, while masks are added as a mask feature. By default, none of them are added.

  • mask_feature (str or None) – Name of the output intersection feature. The masks are added to the eopatch.mask attribute dictionary. Default value: ‘CLM_INTERSSIM’. If None, the intersection feature is not computed.

  • mono_threshold (float) – Cloud probability threshold for the mono classifier. Default value: 0.4.

  • multi_threshold (float) – Cloud probability threshold for the multi classifier. Default value: 0.5.

  • average_over (int or None) – Size of the pixel neighbourhood used in the averaging post-processing step. A value of 0 or None skips this post-processing step. Default value mimics the default for s2cloudless: 4.

  • dilation_size (int or None) – Size of the dilation post-processing step. A value of 0 or None skips this post-processing step. Default value mimics the default for s2cloudless: 2.

  • mono_classifier (lightgbm.Booster or sklearn.base.BaseEstimator) – Classifier used for mono-temporal cloud detection (s2cloudless or equivalent). Must work on the 10 selected reflectance bands as features (“B01”, “B02”, “B04”, “B05”, “B08”, “B8A”, “B09”, “B10”, “B11”, “B12”). Default value: None (s2cloudless is used)

  • multi_classifier (lightgbm.Booster or sklearn.base.BaseEstimator) –

    Classifier used for multi-temporal cloud detection. Must work on the 90 multi-temporal features:

    • raw reflectance value in the target frame,

    • average value within a spatial window in the target frame,

    • maximum, mean and standard deviation of the structural similarity (SSIM) indices between a spatial window in the target frame and every other,

    • minimum and mean reflectance of all available time frames,

    • maximum and mean difference in reflectances between the target frame and every other.

    Default value: None (SSIM-based model is used)
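The interaction between `average_over` and the probability thresholds can be sketched as follows. This is an illustrative approximation only: the square averaging window, its centring, the border handling and the strict `>` comparison are all assumptions, and `box_average` and `threshold` are hypothetical helper names.

```python
def box_average(probs, size):
    """Mean of each pixel's size x size neighbourhood, clipped at the borders."""
    if not size:  # 0 or None skips the step, as the parameter description states
        return [row[:] for row in probs]
    height, width = len(probs), len(probs[0])
    lo = size // 2   # offset to the first row/column of the window
    hi = size - lo   # exclusive offset past the last row/column
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [probs[yy][xx]
                      for yy in range(max(0, y - lo), min(height, y + hi))
                      for xx in range(max(0, x - lo), min(width, x + hi))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def threshold(probs, limit):
    """Turn probabilities into a binary mask (strict comparison is an assumption)."""
    return [[p > limit for p in row] for row in probs]


# A single noisy high-probability pixel surrounded by clear pixels.
probs = [
    [0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0],
]

raw_mask = threshold(box_average(probs, None), 0.4)    # averaging skipped
smoothed_mask = threshold(box_average(probs, 3), 0.4)  # 3 x 3 averaging
```

Without averaging the isolated pixel is flagged as cloud; with a 3 x 3 window its averaged probability falls below the 0.4 threshold and it is suppressed.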

MODELS_FOLDER = '/home/docs/checkouts/readthedocs.org/user_builds/eo-learn/conda/latest/lib/python3.8/site-packages/eolearn/mask/models'
MONO_CLASSIFIER_NAME = 'pixel_s2_cloud_detector_lightGBM_v0.2.txt'
MULTI_CLASSIFIER_NAME = 'ssim_s2_cloud_detector_lightGBM_v0.2.txt'
property mono_classifier

An instance of pre-trained mono-temporal cloud classifier. It is loaded only the first time it is required.

property multi_classifier

An instance of pre-trained multi-temporal cloud classifier. It is loaded only the first time it is required.
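The "loaded only the first time it is required" behaviour of both properties is a lazy-loading pattern. A generic sketch of it is shown below. `LazyModelHolder` and `fake_loader` are hypothetical names, and the real task may cache its models differently.

```python
class LazyModelHolder:
    """Hypothetical holder that loads a model on first access only."""

    def __init__(self, loader):
        self._loader = loader  # callable that builds or loads the model
        self._model = None     # nothing is loaded at construction time

    @property
    def model(self):
        # Load once, then reuse the cached instance on every later access.
        if self._model is None:
            self._model = self._loader()
        return self._model


load_calls = []


def fake_loader():
    """Stand-in for reading a model file from disk."""
    load_calls.append(1)
    return object()


holder = LazyModelHolder(fake_loader)
first = holder.model
second = holder.model
```

Both accesses return the same object and the loader runs only once, so a task that never needs a given classifier never pays the cost of loading it.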

execute(eopatch)[source]

Add selected features (cloud probabilities and masks) to an EOPatch instance.

Parameters

eopatch – Input EOPatch instance

Returns

EOPatch with additional features

class eolearn.mask.cloud_mask.AddMultiCloudMaskTask(*args, **kwargs)[source]

Bases: eolearn.mask.cloud_mask.CloudMaskTask

Deprecated alias of CloudMaskTask, kept temporarily for backward compatibility. A warning is raised when it is used.

Parameters
  • data_feature (str) – A data feature which stores raw Sentinel-2 reflectance bands. Default value: ‘BANDS-S2-L1C’.

  • is_data_feature (str) – A mask feature which indicates whether data is valid. Default value: ‘IS_DATA’.

  • all_bands (bool) – Flag, which indicates whether images will consist of all 13 Sentinel-2 bands or only the required 10. Default value: True.

  • processing_resolution (int or (int, int)) – Resolution to be used during the computation of cloud probabilities and masks, expressed in meters. Resolution is given as a pair of x and y resolutions. If a single value is given, it is used for both dimensions. Default is None (source resolution).

  • max_proc_frames (int) – Maximum number of frames (including the target, for multi-temporal classification) processed in a single batch iteration. To limit memory usage, the task operates on smaller batches of time frames. Default value: 11.

  • mono_features ((str or None, str or None)) – Tuple of keys to be used for storing cloud probabilities and masks (in that order!) of the mono classifier. The probabilities are added as a data feature, while masks are added as a mask feature. By default, none of them are added.

  • multi_features ((str or None, str or None)) – Tuple of keys used for storing cloud probabilities and masks of the multi classifier. The probabilities are added as a data feature, while masks are added as a mask feature. By default, none of them are added.

  • mask_feature (str or None) – Name of the output intersection feature. The masks are added to the eopatch.mask attribute dictionary. Default value: ‘CLM_INTERSSIM’. If None, the intersection feature is not computed.

  • mono_threshold (float) – Cloud probability threshold for the mono classifier. Default value: 0.4.

  • multi_threshold (float) – Cloud probability threshold for the multi classifier. Default value: 0.5.

  • average_over (int or None) – Size of the pixel neighbourhood used in the averaging post-processing step. A value of 0 or None skips this post-processing step. Default value mimics the default for s2cloudless: 4.

  • dilation_size (int or None) – Size of the dilation post-processing step. A value of 0 or None skips this post-processing step. Default value mimics the default for s2cloudless: 2.

  • mono_classifier (lightgbm.Booster or sklearn.base.BaseEstimator) – Classifier used for mono-temporal cloud detection (s2cloudless or equivalent). Must work on the 10 selected reflectance bands as features (“B01”, “B02”, “B04”, “B05”, “B08”, “B8A”, “B09”, “B10”, “B11”, “B12”). Default value: None (s2cloudless is used)

  • multi_classifier (lightgbm.Booster or sklearn.base.BaseEstimator) –

    Classifier used for multi-temporal cloud detection. Must work on the 90 multi-temporal features:

    • raw reflectance value in the target frame,

    • average value within a spatial window in the target frame,

    • maximum, mean and standard deviation of the structural similarity (SSIM) indices between a spatial window in the target frame and every other,

    • minimum and mean reflectance of all available time frames,

    • maximum and mean difference in reflectances between the target frame and every other.

    Default value: None (SSIM-based model is used)
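A backward-compatibility class of this kind is commonly written as a thin subclass that emits a warning and then defers to its parent. The sketch below shows that general pattern with hypothetical classes `NewTask` and `OldTask`. The warning category and message used by eo-learn are not stated here, so `DeprecationWarning` is an assumption.

```python
import warnings


class NewTask:
    """Hypothetical replacement class."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class OldTask(NewTask):
    """Hypothetical deprecated alias that only adds a warning."""

    def __init__(self, **kwargs):
        # The warning category is an assumption made for this sketch.
        warnings.warn("OldTask is deprecated, use NewTask instead",
                      DeprecationWarning, stacklevel=2)
        super().__init__(**kwargs)


# Capture the warning to show that the old name still works but complains.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    task = OldTask(dilation_size=8)
```

Existing code keeps working through the old name, while the warning points users to the replacement class.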