Author | Daniel Moraite
Source | Medium
Editor | Code Doctor Team (代码医生团队)
Today we will walk through a demo on tree cover prediction that shows how easy machine learning / deep learning is with eo-learn. We will train a U-Net deep learning network to predict tree cover.
An area of more than 600 square miles was selected in the UK (northwest of London). Geopedia's EU tree cover density layer was used to collect the ground-truth data.
Setup
- install Sentinel Hub
- install eo-learn
- install keras and tensorflow
(Links to all resources can be found at the end of the article.)
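A typical installation via pip (package names as published on PyPI; a fresh virtual environment is recommended):
pip install sentinelhub
pip install eo-learn
pip install tensorflow keras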
Data extraction
Details on how to obtain the coordinates of an area of interest (AOI) can be found in an earlier post, Satellite Imagery Analysis with Python. Make sure to save the coordinates as file.geojson in your working directory, or, if you have cloned the GitHub repo, under ../eo-learn-master/example_data/.
https://danielmoraite.github.io/docs/satellite1.html
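The snippets below assume roughly the following imports (a sketch based on the eo-learn 0.x, sentinelhub, and Keras 2 APIs used in this post; exact module paths may differ between versions):
import os.path as op

import numpy as np
import geopandas
import matplotlib.pyplot as plt
from tqdm import tqdm
from sklearn.metrics import confusion_matrix

from sentinelhub import CRS, BBoxSplitter
from eolearn.core import (EOTask, FeatureType, LinearWorkflow,
                          SaveToDisk, OverwritePermission)
from eolearn.io import S2L2AWCSInput, AddGeopediaFeature

from keras import backend as K
from keras.models import Model
from keras.layers import (Input, Conv2D, MaxPooling2D, UpSampling2D,
                          Dropout, concatenate)
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import to_categorical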
Global image request parameters
time_interval = ('2019-01-01', '2019-05-26')  # date range of the image search
img_width = 240
img_height = 256
maxcc = 0.2  # maximal allowed cloud coverage
Get the AOI and split it into bboxes
crs = CRS.UTM_31N
aoi = geopandas.read_file('../../example_data/europe.geojson')
aoi = aoi.to_crs(crs={'init':CRS.ogc_string(crs)})
aoi_shape = aoi.geometry.values.tolist()[-1]
bbox_splitter = BBoxSplitter([aoi_shape], crs, (19, 10))
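A quick check on the result of the split (the count assumes the 19 x 10 grid used above; BBoxSplitter only keeps boxes that intersect the AOI, so the list can hold fewer than 190 entries):
print('{} bounding boxes'.format(len(bbox_splitter.bbox_list)))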
Set up the raster_value conversion for the Geopedia task:
raster_value = {
    '0%': (0, [0, 0, 0, 0]),
    '10%': (1, [163, 235, 153, 255]),
    '30%': (2, [119, 195, 118, 255]),
    '50%': (3, [85, 160, 89, 255]),
    '70%': (4, [58, 130, 64, 255]),
    '90%': (5, [36, 103, 44, 255])
}
import matplotlib as mpl
tree_cmap = mpl.colors.ListedColormap(['#F0F0F0',
'#A2EB9B',
'#77C277',
'#539F5B',
'#388141',
'#226528'])
tree_cmap.set_over('white')
tree_cmap.set_under('white')
bounds = np.arange(-0.5, 6, 1).tolist()
tree_norm = mpl.colors.BoundaryNorm(bounds, tree_cmap.N)
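As a quick sanity check (not in the original post), the colormap and class boundaries can be previewed by painting the six class indices through tree_cmap and tree_norm:
# illustrative preview of the class colormap (uses raster_value from above)
fig, ax = plt.subplots(figsize=(6, 1))
ax.imshow(np.arange(6).reshape(1, -1), cmap=tree_cmap, norm=tree_norm, aspect='auto')
ax.set_xticks(range(6))
ax.set_xticklabels(raster_value.keys())
ax.set_yticks([])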
Create a task that computes the per-pixel median value
class MedianPixel(EOTask):
    """
    The task returns a pixelwise median value from a time-series and stores the results in a
    timeless data array.
    """
    def __init__(self, feature, feature_out):
        self.feature_type, self.feature_name = next(self._parse_features(feature)())
        self.feature_type_out, self.feature_name_out = next(self._parse_features(feature_out)())

    def execute(self, eopatch):
        eopatch.add_feature(self.feature_type_out, self.feature_name_out,
                            np.median(eopatch[self.feature_type][self.feature_name], axis=0))
        return eopatch
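For intuition, this is what the task does to a time stack, shown here as a plain NumPy sketch with dummy data: the median over the time axis collapses a (time, height, width, channels) array into a single cloud-robust composite.
import numpy as np

stack = np.random.rand(5, 4, 4, 3)    # dummy stack: 5 timestamps, 4x4 pixels, 3 bands
composite = np.median(stack, axis=0)  # -> shape (4, 4, 3), one value per pixel and band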
Initialize the task that fetches S2 L2A imagery
input_task = S2L2AWCSInput('TRUE-COLOR-S2-L2A', resx='10m', resy='10m', maxcc=0.2)
Task that fetches the ground truth from Geopedia
geopedia_data = AddGeopediaFeature((FeatureType.MASK_TIMELESS, 'TREE_COVER'),
layer='ttl2275', theme='QP', raster_value=raster_value)
Task that computes the median
get_median_pixel = MedianPixel((FeatureType.DATA, 'TRUE-COLOR-S2-L2A'),
feature_out=(FeatureType.DATA_TIMELESS, 'MEDIAN_PIXEL'))
Task that saves the results to disk
save = SaveToDisk(op.join('data', 'eopatch'),
overwrite_permission=OverwritePermission.OVERWRITE_PATCH,
compress_level=2)
Initialize the workflow
workflow = LinearWorkflow(input_task, geopedia_data, get_median_pixel, save)
A helper function that runs this workflow on a single bbox
def execute_workflow(index):
    bbox = bbox_splitter.bbox_list[index]
    info = bbox_splitter.info_list[index]
    patch_name = 'eopatch_{0}_row-{1}_col-{2}'.format(index,
                                                      info['index_x'],
                                                      info['index_y'])
    results = workflow.execute({input_task: {'bbox': bbox, 'time_interval': time_interval},
                                save: {'eopatch_folder': patch_name}})
    return list(results.values())[-1]
1. Test the workflow on an example patch and display the result
idx = 168  # feel free to try other indices
example_patch = execute_workflow(idx)
example_patch
EOPatch(
data: {
TRUE-COLOR-S2-L2A: numpy.ndarray(shape=(3, 427, 240, 3), dtype=float32)
}
mask: {
IS_DATA: numpy.ndarray(shape=(3, 427, 240, 1), dtype=bool)
}
scalar: {}
label: {}
vector: {}
data_timeless: {
MEDIAN_PIXEL: numpy.ndarray(shape=(427, 240, 3), dtype=float32)
}
mask_timeless: {
TREE_COVER: numpy.ndarray(shape=(427, 240, 1), dtype=uint8)
}
scalar_timeless: {}
label_timeless: {}
vector_timeless: {}
meta_info: {
maxcc: 0.2
service_type: 'wcs'
size_x: '10m'
size_y: '10m'
time_difference: datetime.timedelta(-1, 86399)
time_interval: ('2019-01-01', '2019-05-26')
}
bbox: BBox(((247844.22638276426, 5741388.945588876), (250246.2123813057, 5745654.985149694)), crs=EPSG:32631)
timestamp: [datetime.datetime(2019, 1, 17, 11, 16, 48), ..., datetime.datetime(2019, 2, 26, 11, 22, 1)], length=3
)
mp = example_patch.data_timeless['MEDIAN_PIXEL']
plt.figure(figsize=(15, 15))
plt.imshow(2.5 * mp)  # scale reflectance values up for display
tc = example_patch.mask_timeless['TREE_COVER']
plt.imshow(tc[..., 0], vmin=0, vmax=5, alpha=.5, cmap=tree_cmap)
plt.colorbar()
2. Run the workflow on all patches
Run over multiple bboxes
subset_idx = len(bbox_splitter.bbox_list)
x_train_raw = np.empty((subset_idx, img_height, img_width, 3))
y_train_raw = np.empty((subset_idx, img_height, img_width, 1))
pbar = tqdm(total=subset_idx)
for idx in range(0, subset_idx):
    patch = execute_workflow(idx)
    x_train_raw[idx] = patch.data_timeless['MEDIAN_PIXEL'][20:276, 0:240, :]
    y_train_raw[idx] = patch.mask_timeless['TREE_COVER'][20:276, 0:240, :]
    pbar.update(1)
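A quick shape check on the assembled arrays (expected shapes given img_height=256 and img_width=240):
print(x_train_raw.shape)  # (number_of_patches, 256, 240, 3)
print(y_train_raw.shape)  # (number_of_patches, 256, 240, 1)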
3. Create the training and validation data arrays
Data normalization and augmentation
img_mean = np.mean(x_train_raw, axis=(0, 1, 2))
img_std = np.std(x_train_raw, axis=(0, 1, 2))
x_train_mean = x_train_raw - img_mean
x_train = x_train_mean / img_std  # standardize: zero mean, unit variance per channel
train_gen = ImageDataGenerator(
horizontal_flip=True,
vertical_flip=True,
rotation_range=180)
y_train = to_categorical(y_train_raw, len(raster_value))
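to_categorical turns each integer class label into a one-hot vector, so the target gets one channel per tree cover class; a quick check (shapes assume the arrays built above):
print(y_train.shape)  # (number_of_patches, 256, 240, 6) - one channel per raster_value class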
4. Set up the U-Net model with Keras (TensorFlow backend)
The model setup uses a boundary-pixel-weighted loss adapted from a Keras 2 script: a weight tensor with the same shape as the mask image gives pixels near class borders a larger contribution to the loss.
def weighted_bce_loss(y_true, y_pred, weight):
    # avoiding overflow
    epsilon = 1e-7
    y_pred = K.clip(y_pred, epsilon, 1. - epsilon)
    logit_y_pred = K.log(y_pred / (1. - y_pred))
    # check: weighted_cross_entropy_with_logits
    # https://www.tensorflow.org/api_docs/python/tf/nn/weighted_cross_entropy_with_logits
    loss = (1. - y_true) * logit_y_pred + (1. + (weight - 1.) * y_true) * \
        (K.log(1. + K.exp(-K.abs(logit_y_pred))) + K.maximum(-logit_y_pred, 0.))
    return K.sum(loss) / K.sum(weight)
def weighted_dice_loss(y_true, y_pred, weight):
    smooth = 1.
    w, m1, m2 = weight * weight, y_true, y_pred
    intersection = (m1 * m2)
    score = (2. * K.sum(w * intersection) + smooth) / (K.sum(w * m1) + K.sum(w * m2) + smooth)
    loss = 1. - K.sum(score)
    return loss
def weighted_bce_dice_loss(y_true, y_pred):
    y_true = K.cast(y_true, 'float32')
    y_pred = K.cast(y_pred, 'float32')
    # if we want to get same size of output, kernel size must be odd number
    averaged_mask = K.pool2d(
        y_true, pool_size=(11, 11), strides=(1, 1), padding='same', pool_mode='avg')
    border = K.cast(K.greater(averaged_mask, 0.005), 'float32') * \
        K.cast(K.less(averaged_mask, 0.995), 'float32')
    weight = K.ones_like(averaged_mask)
    w0 = K.sum(weight)
    weight += border * 2
    w1 = K.sum(weight)
    weight *= (w0 / w1)
    loss = weighted_bce_loss(y_true, y_pred, weight) + \
        weighted_dice_loss(y_true, y_pred, weight)
    return loss
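To see what the border weighting does, here is an illustrative NumPy sketch (not from the original post; scipy's uniform_filter stands in for K.pool2d with pool_mode='avg'). Pixels whose 11x11 local mask average lies strictly between 0.005 and 0.995 sit near a class boundary and receive triple weight, after which the weights are rescaled so their total is unchanged:
import numpy as np
from scipy.ndimage import uniform_filter

mask = np.zeros((32, 32))
mask[:, 16:] = 1.0                   # toy binary mask with one vertical edge
avg = uniform_filter(mask, size=11)  # local 11x11 average, as in K.pool2d(..., 'avg')
border = ((avg > 0.005) & (avg < 0.995)).astype(float)
weight = np.ones_like(mask)
w0 = weight.sum()
weight += border * 2                 # border pixels now weigh 3x
weight *= w0 / weight.sum()          # rescale so the total weight stays w0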
def unet(input_size):
    inputs = Input(input_size)
    # contracting path
    conv1 = Conv2D(64, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(inputs)
    conv1 = Conv2D(64, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(pool1)
    conv2 = Conv2D(128, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(256, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(pool2)
    conv3 = Conv2D(256, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(512, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(pool3)
    conv4 = Conv2D(512, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(conv4)
    drop4 = Dropout(0.5)(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)
    conv5 = Conv2D(1024, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(pool4)
    conv5 = Conv2D(1024, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(conv5)
    drop5 = Dropout(0.5)(conv5)
    # expanding path with skip connections
    up6 = Conv2D(512, 2, activation='relu', padding='same',
                 kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(drop5))
    merge6 = concatenate([drop4, up6])
    conv6 = Conv2D(512, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(merge6)
    conv6 = Conv2D(512, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(conv6)
    up7 = Conv2D(256, 2, activation='relu', padding='same',
                 kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(conv6))
    merge7 = concatenate([conv3, up7])
    conv7 = Conv2D(256, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(merge7)
    conv7 = Conv2D(256, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(conv7)
    up8 = Conv2D(128, 2, activation='relu', padding='same',
                 kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(conv7))
    merge8 = concatenate([conv2, up8])
    conv8 = Conv2D(128, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(merge8)
    conv8 = Conv2D(128, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(conv8)
    up9 = Conv2D(64, 2, activation='relu', padding='same',
                 kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(conv8))
    merge9 = concatenate([conv1, up9])
    conv9 = Conv2D(64, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(merge9)
    conv9 = Conv2D(64, 3, activation='relu', padding='same',
                   kernel_initializer='he_normal')(conv9)
    # per-pixel softmax over the tree cover classes
    conv10 = Conv2D(len(raster_value), 1, activation='softmax')(conv9)
    model = Model(inputs=inputs, outputs=conv10)
    model.compile(optimizer=Adam(lr=1e-4),
                  loss=weighted_bce_dice_loss,
                  metrics=['accuracy'])
    return model
model = unet(input_size=(256, 240, 3))
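Note that the input height and width must be divisible by 16, since the encoder halves the resolution four times; 256 x 240 satisfies this. Keras can print the resulting architecture to verify that the skip connections line up:
model.summary()  # prints layer shapes and parameter counts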
5. Train the model
Fit the model
batch_size = 16
model.fit_generator(
train_gen.flow(x_train, y_train, batch_size=batch_size),
steps_per_epoch=len(x_train),
epochs=20,
verbose=1)
model.save('model.h5')
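Optionally (not part of the original post), a ModelCheckpoint callback can keep the best weights seen during training; pass it to fit_generator via the callbacks argument:
from keras.callbacks import ModelCheckpoint

# hypothetical filename; monitors the training loss since no validation split is used here
checkpoint = ModelCheckpoint('model-best.h5', monitor='loss', save_best_only=True)
# e.g. model.fit_generator(..., callbacks=[checkpoint])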
6. Validate the model and show some results
Plot an example (image, label, prediction)
idx = 4  # feel free to try different indices, as in the images below
p = np.argmax(model.predict(np.array([x_train[idx]])), axis=3)
fig = plt.figure(figsize=(12,4))
ax1 = fig.add_subplot(1,3,1)
ax1.imshow(x_train_raw[idx])
ax2 = fig.add_subplot(1,3,2)
ax2.imshow(y_train_raw[idx][:,:,0])
ax3 = fig.add_subplot(1,3,3)
ax3.imshow(p[0])
Display the confusion matrix over all images
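The snippet below calls plot_confusion_matrix, which is not defined earlier in the post; the accompanying notebook uses a helper along the lines of the scikit-learn documentation example. A minimal sketch:
def plot_confusion_matrix(cm, classes, normalize=False, cmap=plt.cm.Blues):
    """Display a confusion matrix, optionally row-normalized."""
    classes = list(classes)
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.colorbar()
    ticks = np.arange(len(classes))
    plt.xticks(ticks, classes, rotation=45)
    plt.yticks(ticks, classes)
    plt.ylabel('True label')
    plt.xlabel('Predicted label')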
predictions = np.argmax(model.predict(x_train), axis=3)
# each patch holds 256 * 240 pixels (img_height * img_width)
cnf_matrix = confusion_matrix(y_train_raw.reshape(len(y_train_raw) * 256 * 240, 1),
                              predictions.reshape(len(predictions) * 256 * 240, 1))
plot_confusion_matrix(cnf_matrix, raster_value.keys(), normalize=True)
Full code:
https://github.com/DanielMoraite/DanielMoraite.github.io/blob/master/assets/tree-cover-keras.ipynb
Resources:
1. eo-learn documentation
https://eo-learn.readthedocs.io/en/latest/index.html
2. eo-learn on GitHub
https://github.com/sentinel-hub/eo-learn
3. It is recommended to clone the eo-learn GitHub repository to your own drive.
4. Create a Sentinel Hub account
https://services.sentinel-hub.com/oauth/subscription?param_redirect_uri=https://apps.sentinel-hub.com/dashboard/oauthCallback.html&param_state=%2Fconfigurations&param_scope=SH&param_client_id=30cf1d69-af7e-4f3a-997d-0643d660a478&origin=
5. Sentinel Hub instance ID
https://www.sentinel-hub.com/faq/where-get-instance-id
6. sentinelhub installation
https://pypi.org/project/sentinelhub/
Notes:
# Sentinel Hub API key (instance ID) stored as an env variable (if you want to call it from your notebook):
import os
os.environ['INSTANCE_ID']='your instance ID'
INSTANCE_ID = os.environ.get('INSTANCE_ID')