
Kaggle Winner Solution Overview | Understanding the Amazon from Space: 1st place

Author: 杨熹 · Originally published 2017-11-21 on the author's blog; posted here 2018-04-03

Below is a brief introduction to the 1st place winning solution to the competition Understanding the Amazon from Space.

The goal of this competition is to better track and understand the causes of deforestation by analyzing satellite images of the Amazon basin.

The competition provides over 40,000 training images, and the task is to label each of them.

There are 17 labels from the following 3 groups:

  • Atmospheric conditions: clear, partly cloudy, cloudy, and haze
  • Common land cover and land use types: rainforest, agriculture, rivers, towns/cities, roads, cultivation, and bare ground
  • Rare land cover and land use types: slash and burn, selective logging, blooming, conventional mining, artisanal mining, and blow down

Each image can contain multiple labels.


This is a multi-label classification problem, and the labels are imbalanced.

The 1st place winner is bestfitting (https://www.kaggle.com/bestfitting).

In the preprocessing stage, he applies a haze-removal technique and resizing, as well as data-augmentation steps such as flipping, rotating, transposing, and elastic transforms.
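As a minimal illustration (not the winner's actual code), the geometric augmentations can be sketched with NumPy; the haze removal and elastic transform are more involved and omitted here:

```python
import numpy as np

def augment(image, rng=np.random.default_rng()):
    """Randomly flip, rotate (by 90-degree steps), and transpose an H x W x C image.
    A minimal sketch of the geometric augmentations mentioned above."""
    if rng.random() < 0.5:
        image = np.fliplr(image)                    # horizontal flip
    if rng.random() < 0.5:
        image = np.flipud(image)                    # vertical flip
    image = np.rot90(image, k=rng.integers(4))      # rotate 0/90/180/270 degrees
    if rng.random() < 0.5:
        image = np.transpose(image, (1, 0, 2))      # swap H and W (transpose)
    return image
```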

As for the models, his ensemble consists of 11 popular convolutional networks, a mixture of ResNets, DenseNets, Inception, and SimpleNet variants with different parameters and numbers of layers. Each model predicts the probabilities of all 17 labels.
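For illustration only, one ensemble member could be set up with torchvision roughly as below: a ResNet backbone whose final layer is replaced by a 17-output head followed by a sigmoid. The actual architectures, heads, and training details in the winning solution differ.

```python
import torch
import torch.nn as nn
import torchvision.models as models

def make_member(num_labels=17):
    """One hypothetical ensemble member: a ResNet-50 backbone whose final
    fully connected layer is replaced by a 17-output head."""
    net = models.resnet50(pretrained=False)   # pretrained ImageNet weights would normally be loaded
    net.fc = nn.Linear(net.fc.in_features, num_labels)
    return net

model = make_member()
images = torch.randn(4, 3, 224, 224)          # a dummy batch of images
probs = torch.sigmoid(model(images))          # per-label probabilities, shape (4, 17)
```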

There are correlations among the 17 labels: for example, the clear, partly cloudy, cloudy, and haze labels are mutually exclusive, while the habitation and agriculture labels appear together quite frequently.

To make use of this structure, he implements two levels of ridge regression on top of the CNN predictions.

One level exploits the relations among the 17 labels within a single model: that model's predictions for all 17 labels are used as features to re-predict the final probability of each of the 17 labels.
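A rough sketch of this first level with scikit-learn's Ridge, on dummy data and with illustrative names; in practice the features would be out-of-fold predictions to avoid leakage:

```python
import numpy as np
from sklearn.linear_model import Ridge

n_samples, n_labels = 1000, 17
P = np.random.rand(n_samples, n_labels)                        # one CNN's predicted probabilities (dummy data)
Y = (np.random.rand(n_samples, n_labels) > 0.5).astype(float)  # binary ground-truth labels (dummy data)

# Level 1: for each label, re-predict its probability from this model's outputs for all 17 labels,
# so correlations between labels (e.g. habitation and agriculture) can be exploited.
level1 = [Ridge(alpha=1.0).fit(P, Y[:, j]) for j in range(n_labels)]
refined = np.column_stack([m.predict(P) for m in level1])      # refined probabilities, shape (n_samples, 17)
```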

The other level selects the best models for predicting each label: for each label, a ridge regression is fit over all models' predictions of that label, so the learned weights favor the models that work best for it.
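A corresponding sketch of the second level, again on dummy data with assumed shapes rather than the winner's code:

```python
import numpy as np
from sklearn.linear_model import Ridge

n_samples, n_models, n_labels = 1000, 11, 17
# Level-1 (refined) predictions of every model for every label, plus ground truth (dummy data).
refined_all = np.random.rand(n_samples, n_models, n_labels)
Y = (np.random.rand(n_samples, n_labels) > 0.5).astype(float)

# Level 2: for each label, fit a ridge regression over the 11 models' predictions of that label.
# The learned weights effectively select (and blend) the models that work best for this label.
level2 = [Ridge(alpha=1.0).fit(refined_all[:, :, j], Y[:, j]) for j in range(n_labels)]
final = np.column_stack([level2[j].predict(refined_all[:, :, j]) for j in range(n_labels)])
```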

One more special technique is a Soft F2-Loss function he wrote himself, since the standard loss does not push his models to pay enough attention to each label's recall, which the competition's F2 metric weights heavily.
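For reference, a minimal differentiable soft F2 loss in PyTorch might look like the sketch below; the exact formulation bestfitting used may differ, this only illustrates the idea of computing F2 directly from predicted probabilities so that the gradient emphasizes recall (beta = 2 weights recall four times as much as precision):

```python
import torch

def soft_f2_loss(logits, targets, beta=2.0, eps=1e-6):
    """Differentiable (soft) F2 loss for multi-label prediction.
    logits: (batch, 17) raw scores; targets: (batch, 17) with values in {0, 1}."""
    probs = torch.sigmoid(logits)
    tp = (probs * targets).sum(dim=1)          # soft true positives per sample
    fp = (probs * (1 - targets)).sum(dim=1)    # soft false positives per sample
    fn = ((1 - probs) * targets).sum(dim=1)    # soft false negatives per sample
    b2 = beta ** 2
    f2 = (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp + eps)
    return 1.0 - f2.mean()                     # minimize 1 - mean soft F2
```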
