好了,这是我的桌子(大得多),有不同的风格(a,B,C,d……)。
+-------+-------+---------+
| Style | Range | Int |
+-------+-------+---------+
| A | 0-1 | . |
| A | 1-5 | . |
| A | 5-11 | . |
| A | 11-12 | . |
| A | 12-24 | -999999 |
| A | 24-36 | 0 |
| A | 36-48 | 0 |
| A | 48-60 | -999999 |
| A | 60+ | 0 |
+-------+-------+---------+
我想要根据如下的序列更改每个单一的-999999个值:
if(int = -999999) and range ='12-24'
then get the first non null value in int from
11-12, 5-11, 1-5, 0-1 24-36, 36-48, 48-60, 60+ in that order
if(int = -999999) and range ='0-1' or '1-5' or '5-11' or '11-12'
then get the first non null value in int from
12-24, 11-12, 5-11, 1-5, 0-1 24-36, 36-48, 48-60, 60+ in that order without looking up self
if(int = -999999) and range ='24-36'
then get the first non null value in int from
36-48, 48-60, 60+, 12-24, 11-12, 5-11, 1-5, 0-1 in that order
if(int = -999999) and range ='36-48' or '48-60' or '60+'
then get the first non null value in int from
24-36, 36-48, 48-60, 60+, 12-24, 11-12, 5-11, 1-5, 0-1 in that order without looking up self
这将使我得到这个表:
+-------+-------+---------+
| Style | Range | Int |
+-------+-------+---------+
| A | 0-1 | . |
| A | 1-5 | . |
| A | 5-11 | . |
| A | 11-12 | . |
| A | 12-24 | 0 |
| A | 24-36 | 0 |
| A | 36-48 | 0 |
| A | 48-60 | 0 |
| A | 60+ | 0 |
+-------+-------+---------+
希望这一点是清楚的。有没有更简单的方法来做到这一点?
谢谢
发布于 2019-05-18 20:25:07
对于每个处理过的样式,您必须将所有的int
值存储在一个数组(或散列)中,然后在12-24
上执行特定的向后和向前规则。
问:假设您的样本数据具有0-1
-999999
。规则将从range 12-24
中挑选出-999999
。您是搜索一次直到non-missing
,还是搜索多次直到non-missing
和non-<special-value>
。也许您认为这样的数据不会出现,但数据并不关心您的想法:)
假设每种风格都有九个范围。串行DOW循环将允许您在向下查找组的项目时排列组int
,然后根据查找规则处理每个项目。可以在增量表中对规则进行编码,该增量表指示下一个查找值在哪里。
data have; infile datalines dlm='|';
input
Style$ Range$ Int; datalines;
| A | 0-1 | . |
| A | 1-5 | . |
| A | 5-11 | . |
| A | 11-12 | . |
| A | 12-24 | -999999 |
| A | 24-36 | 0 |
| A | 36-48 | 0 |
| A | 48-60 | -999999 |
| A | 60+ | 0 |
run;
data want;
array ints(9) int_1 - int_9;
do _n_ = 1 by 1 until (last.style);
set have;
by style;
ints(_n_) = int;
end;
do _n_ = 1 to _n_;
set have;
if int = -999999 then do;
* traversal data for look up rules;
array rule[9,9] _temporary_ (
/* 0-1 */ +4 +3 +2 +1 +0 +5 +6 +7 +8
/* 1-5 */ +3 +2 +1 +0 -1 +4 +5 +6 +7
/* 5-11 */ +2 +1 +0 -1 -2 +3 +4 +5 +6
/* 11-12 */ +1 +0 -1 -2 -3 +2 +3 +4 +5
/* 12-24 */ -1 -2 -3 -4 +0 +1 +2 +3 +4
/* 24-36 */ +1 +2 +3 +0 -1 -2 -3 -4 -5
/* 36-48 */ -1 +0 +1 +2 -2 -3 -4 -5 -6
/* 48-60 */ -2 -1 +0 +1 -3 -4 -5 -6 -7
/* 60+ */ -3 -2 -1 +0 -4 -5 -6 -7 -8
);
do _m_ = 1 to 9 while (int = -999999);
step = rule(_n_,_m_);
if ints(_n_ + step) not in (., -999999) then
int = ints(_n_ + step);
end;
end;
output;
end;
drop int_: _m_ step;
run;
@Whymath询问rule
数组是如何构成的。
编码假设每个组都有每个范围,并且这些范围按照问题中规定的顺序进行排序。
Array
Range Index
----- -----
0-1 1
1-5 2
5-11 3
11-12 4
12-24 5
24-36 6
36-48 7
48-60 8
60+ 9
范围数组索引用于rules
数组的第一维。
那么,当-999999在0-1范围内找到时,该怎么做呢?第二个if告诉你
if(int = -999999) and range ='0-1' or '1-5' or '5-11' or '11-12' then get the first non null value in int from 12-24, 11-12, 5-11, 1-5, 0-1 24-36, 36-48, 48-60, 60+ in that order
第一个DO
循环用组的int
值填充ints
数组(包含9个值)(假设每个范围都有一个值)。
因此,当-999999找到@ 0-1时,if将按照范围查找顺序查找非null、非999999:
0-1 range lookup index delta
index lookup index [0-1] index to lookup index
----- ------ ------ --------
1 12-24 5 +4
1 11-12 4 +3
1 5-11 3 +2
1 1-5 2 +1
1 0-1 1 0
1 24-36 6 +5
1 36-48 7 +6
1 48-60 8 +7
1 60+ 9 +8
向下查看索引增量,当搜索非null非999999时,这些值成为要应用的步骤。索引增量是rules
表中的第二个维度,因此这些值会在2-D数组初始化中传递。
对于其他8个范围,重复逻辑转换为'steps‘,您将得到9x9 rules
矩阵。
如果范围不一定是完整的,则可以使用散列而不是数组来完成每个组中的值和搜索规则。下面是一个使用Proc DS2
和多数据散列规则的示例。
* implement loop over group_lookup[logic_lookup[range]];
* the inner lookup result becomes the key for the outer lookup;
proc ds2;
data want(overwrite=yes keep=(style range "int"));
declare char(8) style;
declare char(8) range key;
declare int "int" value;
declare package hash group_lookup([key], [value]);
declare package hash logic_lookup();
method init();
declare char(8) keys[9];
declare int index;
logic_lookup.keys([range]);
logic_lookup.data([key]);
logic_lookup.multiData('yes');
logic_lookup.defineDone();
keys := ('12-24' '11-12' '5-11' '1-5' '0-1' '24-36' '36-48' '48-60' '60+');
range = '0-1'; do index = 1 to dim(keys); key = keys[index]; if key ne range then logic_lookup.add(); end;
range = '1-5'; do index = 1 to dim(keys); key = keys[index]; if key ne range then logic_lookup.add(); end;
range = '5-11'; do index = 1 to dim(keys); key = keys[index]; if key ne range then logic_lookup.add(); end;
range = '11-12'; do index = 1 to dim(keys); key = keys[index]; if key ne range then logic_lookup.add(); end;
range = '12-24'; do index = 1 to dim(keys); key = keys[index]; if key ne range then logic_lookup.add(); end;
keys := ('24-36' '36-48' '48-60' '60+' '12-24' '11-12' '5-11' '1-5' '0-1' '24-36');
range = '24-36'; do index = 1 to dim(keys); key = keys[index]; if key ne range then logic_lookup.add(); end;
range = '36-48'; do index = 1 to dim(keys); key = keys[index]; if key ne range then logic_lookup.add(); end;
range = '48-60'; do index = 1 to dim(keys); key = keys[index]; if key ne range then logic_lookup.add(); end;
range = '60+'; do index = 1 to dim(keys); key = keys[index]; if key ne range then logic_lookup.add(); end;
end;
method findReplacement(char(10) range, int in_value) returns int;
declare int result rc;
if in_value not in (. -999999) then return in_value;
if logic_lookup.find() eq 0 then do; * hash host variables [range] and [key] are global, do not mask them with local variables of the same name;
if group_lookup.find() eq 0 then do; * hash host variables [keys] and [value] are global, do not mask them with local variables of the same name;
if value not in (. -999999) then return value;
do while (logic_lookup.has_next() = 0); * search the other rule ranges listed in the multidata;
if logic_lookup.find_next() = 0
& group_lookup.find() = 0
& value not in (. -999999) then return value;
end;
end;
end;
else do;
put 'ERROR: Invalid range in data,' range=;
end;
return in_value;
end;
method run();
declare int rc;
declare double index;
group_lookup.clear();
do index = 1 to CONSTANT('BIG') until(last.style);
set have(locktable=share);
by style;
group_lookup.add([range],["int"]); * key -> value;
end;
do index = 1 to index;
set have;
"int" = findReplacement(range, "int");
output;
end;
end;
enddata;
run;
quit;
%let syslast = want;
https://stackoverflow.com/questions/56188975
复制相似问题