How To: Land-Use-Land-Cover Prediction for Slovenia

This notebook walks through building a machine learning pipeline that predicts land use and land cover for the Republic of Slovenia. We will train a model on satellite images acquired by ESA’s Sentinel-2 and then use it for prediction. The example leads you through the whole process of creating the pipeline, with details provided at each step.

Before you start

Requirements

To run the example you’ll need a Sentinel Hub account. If you do not have one yet, you can create a free trial account on the Sentinel Hub webpage. If you are a researcher, you can also apply for a free non-commercial account on the ESA OSEO page.

Once you have the account set up, log in to Sentinel Hub Configurator. By default you will already have a default configuration with an instance ID (an alphanumeric code of length 36). For this tutorial we recommend that you create a new configuration ("Add new configuration") and base it on the Python scripts template. Such a configuration already contains all layers used in these examples. Otherwise you will have to define the layers for your configuration yourself.

After you have prepared a configuration, put its instance ID into the sentinelhub package’s configuration file, following the configuration instructions.

Overview

Part 1:

  1. Define the Area-of-Interest (AOI):

  • Obtain the outline of Slovenia (provided)

  • Split into manageable smaller tiles

  • Select a small 3x3 area for classification

  2. Use the integrated sentinelhub-py package to fill the EOPatches with content (band data, cloud masks, …)

  • Define the time interval (this example uses the whole year of 2017)

  3. Add additional information from band combinations (normalized difference vegetation index, NDVI, and normalized difference water index, NDWI)

  4. Add a reference map (provided)

  • Convert provided vector data to raster and add it to EOPatches
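
To make the "band combinations" step more concrete, here is a minimal NumPy sketch of the two indices. It assumes the common definitions NDVI = (NIR - red) / (NIR + red) and NDWI = (green - NIR) / (green + NIR); the helper name `normalized_difference` and the toy reflectance values are illustrative only and are not part of eo-learn or of this notebook's pipeline.

```python
import numpy as np


def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b) per pixel, with 0 where a + b == 0."""
    a = np.asarray(band_a, dtype=np.float64)
    b = np.asarray(band_b, dtype=np.float64)
    denom = a + b
    out = np.zeros_like(denom)
    # Only divide where the denominator is non-zero; other pixels stay 0.
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


# Toy 2x2 reflectance rasters (made-up values, for illustration only).
green = np.array([[0.10, 0.08], [0.00, 0.30]])
red = np.array([[0.05, 0.10], [0.00, 0.20]])
nir = np.array([[0.45, 0.30], [0.00, 0.10]])

ndvi = normalized_difference(nir, red)    # high over vegetation
ndwi = normalized_difference(green, nir)  # high over open water

print(ndvi)
print(ndwi)
```

Both indices are bounded to [-1, 1] for non-negative reflectances, which is one reason they are convenient as extra features next to the raw bands.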

Part 2:

  1. Prepare the training data

  • Remove scenes that are too cloudy

  • Perform temporal interpolation (filling gaps and resampling to the same dates)

  • Apply erosion

  • Random spatial sampling of the EOPatches

  • Split patches for training/validation

  2. Construct and train the ML model

  • Make the prediction for each patch

  3. Validate the model

  4. Visualise the results
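
The temporal interpolation step ("filling gaps and resampling to the same dates") can be sketched for a single pixel as follows. This is only an illustration of the idea using `np.interp`, not the implementation behind eo-learn's `LinearInterpolation` task; the function name `interpolate_series`, the day offsets and the values are all hypothetical.

```python
import numpy as np


def interpolate_series(days, values, valid, target_days):
    """Linearly interpolate the valid observations onto target_days.

    days        -- acquisition times as day offsets, in increasing order
    values      -- observed value at each acquisition (e.g. NDVI)
    valid       -- boolean mask, False for cloudy or missing observations
    target_days -- common dates to which every pixel is resampled
    """
    days = np.asarray(days, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    valid = np.asarray(valid, dtype=bool)
    target_days = np.asarray(target_days, dtype=np.float64)
    if not valid.any():
        # Nothing to interpolate from: return NaN for every target date.
        return np.full(target_days.shape, np.nan)
    # np.interp keeps the first/last valid value constant outside the range.
    return np.interp(target_days, days[valid], values[valid])


# One pixel, four acquisitions; the second one is flagged as cloudy.
days = [0, 10, 20, 30]
values = [0.2, 0.9, 0.4, 0.6]
valid = [True, False, True, True]

resampled = interpolate_series(days, values, valid, target_days=[0, 15, 30])
print(resampled)
```

Because every pixel ends up on the same `target_days`, the resulting time series have equal length and can be stacked into a fixed-size feature vector for the model.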

Let’s start!

[1]:
# Firstly, some necessary imports

# Jupyter notebook related
%reload_ext autoreload
%autoreload 2
%matplotlib inline

# Built-in modules
import pickle
import sys
import os
import datetime
import itertools
from enum import Enum

# Basics of Python data handling and visualization
import numpy as np
import geopandas as gpd
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from mpl_toolkits.axes_grid1 import make_axes_locatable
from shapely.geometry import Polygon
from tqdm import tqdm_notebook as tqdm

# Machine learning
import lightgbm as lgb
import joblib
from sklearn import metrics
from sklearn import preprocessing

# Imports from eo-learn and sentinelhub-py
from eolearn.core import EOTask, EOPatch, LinearWorkflow, FeatureType, OverwritePermission, \
    LoadFromDisk, SaveToDisk, EOExecutor
from eolearn.io import S2L1CWCSInput, ExportToTiff
from eolearn.mask import AddCloudMaskTask, get_s2_pixel_cloud_detector, AddValidDataMaskTask
from eolearn.geometry import VectorToRaster, PointSamplingTask, ErosionTask
from eolearn.features import LinearInterpolation, SimpleFilterTask
from sentinelhub import BBoxSplitter, BBox, CRS, CustomUrlParam