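In Pig, LIMIT can be used to take a small sample of the data, but it has many problems. For example, the data must not contain fewer than 10 records, otherwise everything is returned. ...0,1,2,3,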
g_log = group test_data by ($2, $4);
DESCRIBE g_log;
alldata = limit g_log 10;
dump alldata; -- returned all the data..., limit had no effect
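One possible explanation, offered as a guess rather than a confirmed diagnosis: after GROUP, each record of g_log is one whole group carrying a bag of all its rows, so LIMIT 10 keeps up to 10 groups, not 10 rows. With fewer than 10 distinct keys, every row is still dumped. The sketch below shows two common workarounds. It assumes test_data is already loaded and that the grouping keys are the positional fields $2 and $4; the aliases few_rows, g_few, capped and few are hypothetical names used only for illustration.

```pig
-- Option 1: limit the rows first, then group,
-- so at most 10 input rows ever reach the GROUP.
few_rows = LIMIT test_data 10;
g_few = GROUP few_rows BY ($2, $4);
DUMP g_few;

-- Option 2: keep every group, but cap each group's bag at 10 rows
-- by using LIMIT inside a nested FOREACH block.
g_log = GROUP test_data BY ($2, $4);
capped = FOREACH g_log {
    few = LIMIT test_data 10;
    GENERATE group, few;
};
DUMP capped;
```

Without a preceding ORDER BY, LIMIT makes no promise about which rows are kept, so either option gives an arbitrary subset rather than a statistically random sample.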
The structure of the returned group is as follows:
origin_cleaned_data:
{
wizad_ad_id: chararray,
guid: chararray,
Android_id: chararray...,
imei: chararray,
app_category_id: chararray
}
g_log: {
group: (android_id: chararray,app_category_id...: chararray,app_category_id: chararray),
test_data: {wizad_ad_id: chararray,guid: chararray,android_id