I have an SPSS dataset with more than 5,000 cases that looks like this:
ID, relation to head of household
1, head of household
1, son
1, partner
2, head of household
2, son
3, head of household
3, son
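3, cousin
I need to count how many households there are.
I know this should use ID as the break variable, but I don't know how to do it.
Posted on 2015-11-02 12:29:12
One approach is to create a set of dummy variables, one per category, and then use AGGREGATE to get household-level statistics.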
DATA LIST LIST (",") /ID (F1.0) Relation (A20).
BEGIN DATA
1,head of household
1,son
1,partner
2,head of household
2,son
3,head of household
3,son
3,cousin
END DATA.
DATASET NAME Houses.
*Making dummy variables.
COMPUTE HeadHouse = (Relation = "head of household").
COMPUTE Partner = (Relation = "partner").
COMPUTE Child = (Relation = "son").
COMPUTE Relative = (Relation = "cousin").
DATASET DECLARE AggHouse.
AGGREGATE OUTFILE='AggHouse'
/BREAK ID
/HeadHouse = SUM(HeadHouse)
/Partner = SUM(Partner)
/Child = SUM(Child)
/Relative = SUM(Relative).
Then, on the aggregated dataset, you can use IF statements to compute whichever conditions you need. For example:
DATASET ACTIVATE AggHouse.
IF (HeadHouse > 0) AND (Child > 0) First = 1.
IF (HeadHouse > 0) AND (Partner > 0) AND (Child > 0) Second = 1.
For the real dataset you will need to add more conditions to the initial set of dummy variables, but I leave that as an exercise for you.
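To turn those flags into actual household counts, one option is to sum them. This is only a sketch: it assumes the First and Second variables were created by the IF statements above, so each is 1 where the condition holds and system-missing otherwise.

```spss
* Sketch: assumes First and Second exist in AggHouse, set to 1 by the IF statements above.
* Each flag is 1 or system-missing, so its sum is the number of matching households.
DATASET ACTIVATE AggHouse.
DESCRIPTIVES VARIABLES=First Second
  /STATISTICS=SUM.
```

On the eight-row example data this should report a sum of 3 for First (every household has a head and a son) and 1 for Second (only household 1 also has a partner).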
https://stackoverflow.com/questions/33473967