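I have a table that lists people's names, dates of birth and dates of death (1900-2000). I need the population for each year within a given period, for example 2.3 billion in 1940, 2.4 billion in 1941, 2.2 billion in 1942, and so on up to 1950.
I am working in SAS Enterprise Guide, so the code may look slightly different from plain SQL. I would like to end up with at least something like this: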
~Population   | Year
2.300.000.000 | 1940
2.400.000.000 | 1941
...
select
    count(name)
from db
where bd < '01JAN1940'd
  and dd >= '01JAN1940'd
  and dd <= '31DEC1940'd
group by month
Posted on 2019-06-06 05:50:58
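First, you have to know the initial population at the end of 1899. Let's say it is 2 billion. Then, for each year, add the number of births and subtract the number of deaths. (To do this you have to access the table twice, once for births and once for deaths.) Use SUM OVER to get the running total.
I am not sure which DBMS you are actually using, but this is fairly standard SQL: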
select yr, 2000000000 + sum(births.cnt - deaths.cnt) over (order by yr)
from
(
    select extract(year from bd) as yr, count(*) as cnt
    from db
    group by extract(year from bd)
) births
join
(
    select extract(year from dd) as yr, count(*) as cnt
    from db
    group by extract(year from dd)
) deaths using (yr)
order by yr;
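The arithmetic behind that running total can be checked on a handful of rows. The following is only a minimal sketch in Python, not the query itself: the sample rows, the `people` list and the starting population of 10 are made-up assumptions for illustration. It builds the year list as the union of birth years and death years, so a year that has only births or only deaths still appears; with the inner join above, such a year would be left out of the result.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical sample rows: (name, birth_year, death_year); None means still alive.
people = [
    ("A", 1900, 1902),
    ("B", 1900, None),
    ("C", 1901, 1903),
    ("D", 1903, None),
]
INITIAL_POPULATION = 10  # assumed population at the end of 1899 (illustrative value)

# One pass for births, one for deaths, mirroring the two subqueries.
births = Counter(birth for _, birth, _ in people)
deaths = Counter(death for _, _, death in people if death is not None)

# Union of years: keeps years that appear on only one side.
years = sorted(set(births) | set(deaths))

# Net change per year, then a running total (the SUM ... OVER (ORDER BY yr) step).
net_change = [births[y] - deaths[y] for y in years]
running = list(accumulate(net_change))

# Population at the end of each year.
population = {y: INITIAL_POPULATION + r for y, r in zip(years, running)}

for year, count in population.items():
    print(year, count)
```

Each value is the population at the end of that year, because both the births and the deaths of the year have already been applied.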
Posted on 2019-06-11 00:26:01
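/* Simulated test data: 10,000 people with random dates of birth. */
/* Roughly 1 in 10 also gets a date of death; for the rest dod is missing. */
data dob_data;
    do i = 1 to 10000;
        num = ceil(rand('UNIFORM',0,10));
        dob = intnx('day','01JAN1899'd,ceil(rand('UNIFORM',1,36865)));
        select (num);
            when (1) dod = intnx('day',dob,ceil(rand('UNIFORM',1,36865)));
            otherwise dod = .;
        end;
        output;
    end;
    format dob dod date9.;
    drop num;
run;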
/* One row per year from 1900 to 2000, with its first day (soy) and last day (eoy). */
data calendar;
    do i=0 to 100;
        year = 1900+i;
        soy = intnx('year','01JAN1900'd,i,'s');
        eoy = intnx('year','01JAN1900'd,i,'e');
        output;
    end;
    format soy eoy date9.;
run;
/* Pair every person with every year and count per year. */
/* A missing DOD is replaced by a far-future date, so the person counts as alive. */
/* Passed is stored as a negative number. */
proc sql;
    create table pop as
    select year,
        sum(case when DOB < soy and coalesce(DOD,'31DEC2200'd) ge soy then 1 else 0 end) as Alive_At_Start,
        sum(case when DOB between soy and eoy then 1 else 0 end) as Born_During,
        sum(case when coalesce(DOD,'31DEC2200'd) between soy and eoy then -1 else 0 end) as Passed,
        sum(case when DOB le eoy and coalesce(DOD,'31DEC2200'd) > eoy then 1 else 0 end) as Alive_At_End
    from dob_data t1, calendar t2
    group by year;
quit;
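To see what the four conditions count, here is a small Python sketch of the same comparisons applied to three made-up people. The sample rows, the `year_summary` function and the `FAR_FUTURE` constant are illustrative assumptions and not part of the SAS code. Unlike the query, it reports deaths as a positive number.

```python
from datetime import date

# Hypothetical sample rows: (dob, dod); dod is None when no death is recorded.
people = [
    (date(1939, 5, 1), date(1940, 3, 1)),
    (date(1940, 7, 1), None),
    (date(1920, 1, 1), date(1941, 1, 1)),
]
FAR_FUTURE = date(2200, 12, 31)  # stand-in for a missing dod, like the coalesce() default

def year_summary(year):
    """Count people alive at the start, born, deceased and alive at the end of a year."""
    soy, eoy = date(year, 1, 1), date(year, 12, 31)
    alive_at_start = born_during = passed = alive_at_end = 0
    for dob, dod in people:
        dod = dod or FAR_FUTURE
        # Booleans add as 0 or 1, matching "case when ... then 1 else 0 end".
        alive_at_start += dob < soy and dod >= soy
        born_during += soy <= dob <= eoy
        passed += soy <= dod <= eoy  # positive here; the query stores -1 per death
        alive_at_end += dob <= eoy and dod > eoy
    return {
        "alive_at_start": alive_at_start,
        "born_during": born_during,
        "passed": passed,
        "alive_at_end": alive_at_end,
    }

for y in (1940, 1941):
    print(y, year_summary(y))
```

With these comparisons, alive at start plus births minus deaths equals alive at end for every year, which is a handy sanity check on the output table.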
https://stackoverflow.com/questions/56468189