Kaggle winner solution overview | Understanding the Amazon from Space: 1st place

Below is a brief introduction to the 1st place solution of the Kaggle competition Understanding the Amazon from Space.

The goal of the competition is to better track and understand the causes of deforestation by analyzing satellite images of the Amazon basin.

The competition provides over 40,000 training images, and the task is to label each of them.

There are 17 labels from the following 3 groups:

  • Atmospheric conditions: clear, partly cloudy, cloudy, and haze
  • Common land cover and land use types: rainforest, agriculture, rivers, towns/cities, roads, cultivation, and bare ground
  • Rare land cover and land use types: slash and burn, selective logging, blooming, conventional mining, artisanal mining, and blow down

Each image can carry multiple labels.

This makes it a multi-label classification problem, and the labels are imbalanced.
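To make the multi-label setup concrete, here is a minimal sketch of turning an image's tags into a 17-dimensional multi-hot vector. The snake_case label spellings and the `encode_tags` helper are assumptions made for illustration, derived from the list above; they are not the competition's official tag names or file format.

```python
# Hypothetical identifiers for the 17 labels listed above (not the official tag names).
LABELS = [
    # Atmospheric conditions
    "clear", "partly_cloudy", "cloudy", "haze",
    # Common land cover and land use types
    "rainforest", "agriculture", "rivers", "towns_cities",
    "roads", "cultivation", "bare_ground",
    # Rare land cover and land use types
    "slash_and_burn", "selective_logging", "blooming",
    "conventional_mining", "artisanal_mining", "blow_down",
]
LABEL_INDEX = {name: i for i, name in enumerate(LABELS)}


def encode_tags(tags):
    """Convert a space-separated tag string into a multi-hot list of 0/1."""
    vec = [0] * len(LABELS)
    for tag in tags.split():
        vec[LABEL_INDEX[tag]] = 1
    return vec


# One image can switch on several positions at once.
example = encode_tags("clear rainforest agriculture roads")
```

Each model then outputs one probability per position of this vector, rather than a single class.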

The 1st place winner is bestfitting (https://www.kaggle.com/bestfitting).

In the preprocessing stage, he applies a haze removal technique and resizing, along with several data augmentation steps such as flipping, rotating, transposing, and elastic transforms.
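The flip, rotate, and transpose augmentations can be sketched with NumPy as below. This is only an illustration under stated assumptions: the `dihedral_augment` helper is hypothetical, it assumes a square H x W x C array, and it does not cover the haze removal or elastic transform steps. It is not the winner's actual pipeline.

```python
import numpy as np


def dihedral_augment(image):
    """Return the 8 flip/rotate variants of an H x W x C image.

    Four 90-degree rotations, each with and without a horizontal flip.
    The transposed image is one of these eight variants.
    """
    variants = []
    for k in range(4):
        rotated = np.rot90(image, k, axes=(0, 1))  # rotate in the spatial plane
        variants.append(rotated)
        variants.append(np.flip(rotated, axis=1))  # mirror left-right
    return variants


# Tiny 2 x 2 "image" with 3 channels, just to exercise the function.
image = np.arange(2 * 2 * 3).reshape(2, 2, 3)
variants = dihedral_augment(image)
```

Satellite tiles are taken from overhead, so these symmetries should generally leave the labels unchanged, which is why they are a common choice for this kind of data.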

As for the models, his ensemble consists of 11 popular convolutional networks: a mixture of ResNets, DenseNets, Inception, and SimpleNet with various parameters and numbers of layers. Each model predicts the probabilities of all 17 labels.

The 17 labels are correlated. For example, the clear, partly cloudy, cloudy, and haze labels are mutually exclusive, while the habitation and agriculture labels appear together quite frequently.

To make use of this structure, he implements a two-level ridge regression.

The first level takes advantage of the relations among the 17 labels: for a single model, it takes that model's predictions for all 17 labels as features and predicts the final probability of each of the 17 labels.

The second level selects the best models for predicting each individual label.
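One possible reading of this two-level scheme is sketched below with a closed-form ridge regression on synthetic data. Everything here is an assumption for illustration: the helper names, the `alpha` value, the random "model predictions", and the choice to let the second level weight all models per label rather than pick a single one. A real stack would also fit on out-of-fold predictions instead of in-sample ones.

```python
import numpy as np


def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    reg = alpha * np.eye(Xb.shape[1])
    reg[-1, -1] = 0.0  # do not shrink the intercept
    return np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ Y)


def ridge_predict(X, W):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W


rng = np.random.default_rng(0)
n_images, n_models, n_labels = 200, 3, 17

# Synthetic ground truth and noisy per-model probabilities (stand-ins for CNN outputs).
Y = (rng.random((n_images, n_labels)) < 0.3).astype(float)
model_preds = [
    np.clip(Y + rng.normal(0.0, 0.3, Y.shape), 0.0, 1.0) for _ in range(n_models)
]

# Level 1: per model, all 17 predicted probabilities -> refined value for each label.
level1 = [ridge_predict(P, ridge_fit(P, Y)) for P in model_preds]

# Level 2: per label, combine the level-1 outputs of all models for that label.
final = np.zeros_like(Y)
for j in range(n_labels):
    Xj = np.column_stack([out[:, j] for out in level1])  # shape (n_images, n_models)
    final[:, j] = ridge_predict(Xj, ridge_fit(Xj, Y[:, j]))
```

In the second level, the learned weights indicate how much each model contributes to a given label, so a model that is strong on a rare label can dominate there even if it is weaker elsewhere.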

One more special technique is his own Soft F2-Loss function. The competition is scored with the F2 metric, which weights recall more heavily than precision, and a standard loss function does not let his models pay more attention to optimizing each label's recall.
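A soft F2 loss is commonly built by replacing the hard true-positive, false-positive, and false-negative counts with sums of predicted probabilities, which keeps the score differentiable. The sketch below shows that generic formulation in NumPy; the `soft_f2_loss` name, the per-label averaging, and the `eps` term are assumptions, and this is not necessarily the exact function the winner wrote.

```python
import numpy as np


def soft_f2_loss(probs, targets, beta=2.0, eps=1e-7):
    """1 - mean soft F-beta over labels; probs and targets are (n_images, n_labels)."""
    tp = np.sum(probs * targets, axis=0)          # soft true positives per label
    fp = np.sum(probs * (1.0 - targets), axis=0)  # soft false positives per label
    fn = np.sum((1.0 - probs) * targets, axis=0)  # soft false negatives per label
    b2 = beta ** 2
    f_beta = (1.0 + b2) * tp / ((1.0 + b2) * tp + b2 * fn + fp + eps)
    return 1.0 - np.mean(f_beta)


targets = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Perfect predictions give a loss close to zero.
loss_perfect = soft_f2_loss(targets, targets)

# Missing one positive of the first label ...
missed = targets.copy()
missed[2, 0] = 0.0
loss_missed = soft_f2_loss(missed, targets)

# ... versus raising one false alarm on the first label.
false_alarm = targets.copy()
false_alarm[1, 0] = 1.0
loss_false_alarm = soft_f2_loss(false_alarm, targets)
```

With beta set to 2, a missed positive is penalized more than a false alarm, which is the recall emphasis described above.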
