Flink Forward 2019 -- In Practice (16) -- Yelp on Real-Time Store Visit Predictions at Scale

Realtime Store Visit Predictions at Scale -- Luca Giovagnoli (Yelp)

This talk aims to inspire attendees with a multidisciplinary Flink application that brings several fields together. You will hear about geospatial clustering algorithms, a gradient boosting ML model, and cutting-edge stream-processing technology, all in the same talk. And, in case you are wondering, all of this can be integrated into a service-oriented architecture (SOA) using Flink’s Async I/O.


After introducing our product use case (real-time notifications for nearby local businesses), we’ll dive into the big data challenges. We will describe a Visit Detection algorithm that we built to cluster raw GPS pings into Visits, using Flink state management and custom processing constructs (custom Windows, Triggers and Evictors). Finally, we will discuss a real-time machine learning model that predicts the correct nearby business for each Visit, leveraging Flink’s Async I/O at scale.
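To make the idea of clustering raw GPS pings into Visits more concrete, here is a minimal sketch in plain Python. It is not Yelp's algorithm and not Flink code: the names `Ping`, `Visit`, `detect_visits`, and the parameters `radius_m` and `min_dwell_s` are all assumptions made for illustration. It uses a simple stay-point heuristic, in which a ping is grouped with the preceding ones while it stays within a radius of their centroid, and a group becomes a Visit only if it spans a minimum dwell time. In a Flink job, the per-user `cluster` list would roughly correspond to keyed state or window contents.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ping:
    ts: float   # timestamp in seconds
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


@dataclass
class Visit:
    start: float
    end: float
    lat: float  # centroid latitude
    lon: float  # centroid longitude
    n_pings: int


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def _centroid(cluster: List[Ping]) -> tuple:
    n = len(cluster)
    return sum(p.lat for p in cluster) / n, sum(p.lon for p in cluster) / n


def _to_visit(cluster: List[Ping], min_dwell_s: float) -> Optional[Visit]:
    """Turn a cluster into a Visit if it lasted long enough, else None."""
    if not cluster or cluster[-1].ts - cluster[0].ts < min_dwell_s:
        return None
    lat, lon = _centroid(cluster)
    return Visit(cluster[0].ts, cluster[-1].ts, lat, lon, len(cluster))


def detect_visits(pings: List[Ping], radius_m: float = 50.0,
                  min_dwell_s: float = 300.0) -> List[Visit]:
    """Group one user's pings into Visits (hypothetical thresholds)."""
    visits: List[Visit] = []
    cluster: List[Ping] = []
    for ping in sorted(pings, key=lambda p: p.ts):
        if cluster:
            clat, clon = _centroid(cluster)
            if haversine_m(clat, clon, ping.lat, ping.lon) > radius_m:
                # The user moved away: close the current cluster.
                visit = _to_visit(cluster, min_dwell_s)
                if visit is not None:
                    visits.append(visit)
                cluster = []
        cluster.append(ping)
    # Flush whatever is left at the end of the input.
    visit = _to_visit(cluster, min_dwell_s)
    if visit is not None:
        visits.append(visit)
    return visits
```

A batch function like this only illustrates the grouping logic. A streaming version also has to decide when to emit a Visit without seeing the end of the input, which is where the custom Windows, Triggers and Evictors mentioned above come in.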

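Flink's Async I/O operator lets a streaming job call external services without blocking, with a bound on in-flight requests and a per-request timeout. The sketch below imitates that pattern with Python's `asyncio`. It is an analogy under stated assumptions, not the Flink API: `enrich_ordered`, `capacity`, `timeout_s`, and the stand-in `score_candidates` service are hypothetical names, and the choice to return `None` on timeout is one possible policy among several.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional


async def enrich_ordered(
    items: List[Any],
    lookup: Callable[[Any], Awaitable[Any]],
    capacity: int = 100,
    timeout_s: float = 1.0,
) -> List[Optional[Any]]:
    """Call `lookup` for every item concurrently and keep input order.

    At most `capacity` lookups are in flight at once; a lookup that
    exceeds `timeout_s` yields None instead of failing the whole batch.
    """
    sem = asyncio.Semaphore(capacity)

    async def one(item: Any) -> Optional[Any]:
        async with sem:
            try:
                return await asyncio.wait_for(lookup(item), timeout_s)
            except asyncio.TimeoutError:
                return None

    # gather returns results in the order of its arguments,
    # regardless of which lookup finishes first.
    return await asyncio.gather(*(one(i) for i in items))


async def score_candidates(visit_id: int) -> dict:
    """Hypothetical stand-in for a remote model-scoring service."""
    await asyncio.sleep(0.01)  # simulated network latency
    return {"visit_id": visit_id, "business_id": f"biz-{visit_id}"}


if __name__ == "__main__":
    results = asyncio.run(enrich_ordered([1, 2, 3], score_candidates, capacity=2))
    print(results)
```

The capacity bound matters because an unbounded fan-out can overwhelm the downstream service, and the timeout keeps one slow request from stalling everything behind it.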

Flink enabled us to scale complex algorithms to thousands of operations per second and to power hundreds of thousands of daily push notifications. It proved to be a clearly superior alternative: its performance brought Yelp substantial cost savings and allowed us to move away from Python alternatives that scaled poorly.


Originally published on the WeChat official account Flink实战应用指南 (FlinkChina)




