
R Programming Tutoring: IS4240 Business Intelligence Systems Assignment 1 (with Answers)

Author: 拓端
Published: 2022-11-30 12:02:41
Included in the column: 拓端tecdat

Full text link: http://tecdat.cn/?p=30629

Learning Objectives

·       Use the R environment to do data exploration and data preparation.


Submission Information

·       This assignment contributes 5% to the final course grade. The assignment is worth 20 marks in total.

·       Please ensure that you have written your name and matric number in the document.


1.     This question is based on the Heart Disease dataset (processed.va.data, Long Beach VA). The dataset consists of 200 instances, each with 14 numeric attributes. A description of the dataset is available at http://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/heart-disease.names

a)    Provide the R code for loading the dataset into a variable heart. The attributes should be given reasonable names based on the description above. Ensure that all the attributes are of numeric (or integer) type. (Hint: missing values can easily be converted to NA by using an appropriate function argument.) (3 marks)

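# Read the comma-separated file; "?" marks a missing value, so map it to NA on load.
# This keeps every column numeric instead of character.
heart <- read.csv("processed.va.data", header = FALSE, na.strings = "?")

# Attribute names taken from the dataset description
colnames(heart) <- c("age","sex","cp","trestbps","chol","fbs","restecg","thalach","exang","oldpeak","slope","ca","thal","num")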
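The hint about a function argument can be illustrated on a small inline sample. This is a minimal sketch, not the assignment data: `sample_text` and its values are made up, and only six of the fourteen attribute names are used. It shows that `na.strings = "?"` in `read.csv` converts the placeholder to NA during parsing, so the columns stay numeric.

```r
# Toy sample in the same comma-separated layout; the values are made up for illustration
sample_text <- "63,1,4,140,260,0\n44,1,4,?,209,0\n60,1,3,130,?,?"

# na.strings = "?" turns every "?" field into NA while the text is parsed
toy <- read.csv(text = sample_text, header = FALSE, na.strings = "?")
colnames(toy) <- c("age", "sex", "cp", "trestbps", "chol", "fbs")

# Every column should come out as integer or numeric, not character
col_types <- sapply(toy, class)
print(col_types)
print(all(sapply(toy, is.numeric)))
```

Without the `na.strings` argument, any column containing "?" would be read as character and would need a separate conversion step.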

b)    Provide the R code for getting the number of missing values for each attribute. Fill in the table below. (5 marks)

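# Count the NA entries in each column of heart
sumna <- numeric(ncol(heart))
for (i in 1:ncol(heart)) {
  for (j in 1:nrow(heart)) {
    if (is.na(heart[j, i])) sumna[i] <- sumna[i] + 1
  }
}
names(sumna) <- colnames(heart)
sumna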

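| Attribute | Number of missing values |
|-----------|--------------------------|
| age       | 0                        |
| sex       | 0                        |
| cp        | 0                        |
| trestbps  | 56                       |
| chol      | 7                        |
| fbs       | 7                        |
| restecg   | 0                        |
| thalach   | 53                       |
| exang     | 53                       |
| oldpeak   | 56                       |
| slope     | 102                      |
| ca        | 198                      |
| thal      | 166                      |
| num       | 0                        |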
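The per-attribute counts can also be obtained without explicit loops. The following is a minimal sketch on a made-up data frame (`toy`, `missing_counts` and `missing_table` are illustrative names, not part of the assignment): `is.na()` returns a logical matrix and `colSums()` counts the TRUE entries in each column.

```r
# Toy data frame with a few NAs; values are made up for illustration
toy <- data.frame(
  age   = c(63, 44, 60, 55),
  chol  = c(260, NA, 281, NA),
  slope = c(NA, NA, 2, NA)
)

# is.na() gives a logical matrix; colSums() counts the TRUE entries per column
missing_counts <- colSums(is.na(toy))
print(missing_counts)

# The same counts arranged as an attribute / count table
missing_table <- data.frame(
  attribute = names(missing_counts),
  n_missing = as.integer(missing_counts),
  row.names = NULL
)
print(missing_table)
```

Applied to the real data, the same call would be `colSums(is.na(heart))`, assuming `heart` was loaded with missing values already converted to NA.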

c)    Based on the number of missing values for each attribute, discuss one potential issue if we were to remove instances with one or more missing attributes. (4 marks)

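# Count the missing attributes in each instance (row)
na_per_row <- numeric(nrow(heart))
for (i in 1:nrow(heart)) {
  for (j in 1:ncol(heart)) {
    if (is.na(heart[i, j])) na_per_row[i] <- na_per_row[i] + 1
  }
}
# Instances that would be removed: those with at least one missing attribute.
# ca alone is missing in 198 of the 200 instances, so at most 2 instances could remain.
sum(na_per_row > 0)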
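The issue can be made concrete with `complete.cases()`, which flags the rows that have no missing values. This is a minimal sketch on a made-up data frame (all names and values below are illustrative); it mirrors the situation in the table, where ca is missing in 198 of the 200 instances, so removing incomplete instances could keep at most 2 of them.

```r
# Toy data frame where one column is almost entirely missing; values are made up
toy <- data.frame(
  age  = c(63, 44, 60, 55, 58),
  chol = c(260, NA, 281, 228, NA),
  ca   = c(NA, NA, NA, 0, NA)
)

# complete.cases() is TRUE only for rows without any NA
keep <- complete.cases(toy)
n_kept <- sum(keep)
n_dropped <- nrow(toy) - n_kept
cat("rows kept:", n_kept, " rows dropped:", n_dropped, "\n")

# Upper bound on surviving rows: total rows minus the worst column's NA count
max_complete <- nrow(toy) - max(colSums(is.na(toy)))
print(max_complete)
```

A single sparsely recorded attribute is enough to discard nearly the whole dataset, which leaves too few instances for any meaningful analysis.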

d)    Instead of removing instances with one or more missing attributes, propose an alternative approach for handling this problem. (4 marks)

# Sum the non-missing values in each column
col_sum <- numeric(ncol(heart))
for (i in 1:ncol(heart)) {
  for (j in 1:nrow(heart)) {
    if (!is.na(heart[j, i])) col_sum[i] <- col_sum[i] + heart[j, i]
  }
}
col_sum
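One possible alternative, sketched here under stated assumptions rather than as the intended answer, is to drop attributes that are mostly missing and fill the remaining gaps with the column mean of the observed values. The data frame `toy`, the threshold `max_missing_share` and the 0.5 cut-off are all hypothetical choices for illustration.

```r
# Toy data frame; values are made up for illustration
toy <- data.frame(
  age      = c(63, 44, 60, 55),
  trestbps = c(140, NA, 130, 150),
  ca       = c(NA, NA, NA, 0)
)

# Step 1 (assumed rule): drop columns where more than half of the values are missing
max_missing_share <- 0.5  # hypothetical threshold
missing_share <- colMeans(is.na(toy))
reduced <- toy[, missing_share <= max_missing_share, drop = FALSE]

# Step 2: replace each remaining NA with the mean of the observed values in its column
imputed <- reduced
for (col in names(imputed)) {
  col_mean <- mean(imputed[[col]], na.rm = TRUE)
  imputed[[col]][is.na(imputed[[col]])] <- col_mean
}
print(imputed)
```

Mean imputation keeps every instance but shrinks the variance of the imputed attribute, and for categorical codes such as sex or cp the mode would usually be a more sensible fill value than the mean.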

e)    Provide the R code for generating the correlation matrix for the attributes: age, sex, cp, restecg, num. Show the correlation matrix. (4 marks)

cor(heart[c("age", "sex", "cp", "restecg", "num")])
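The five attributes in this question have no missing values according to the table, so `cor()` works on them directly. The sketch below uses a made-up data frame to show the shape of the result and what changes when a column does contain NA; `toy` and its values are illustrative only, and `use = "complete.obs"` is one of the standard options of `cor()`.

```r
# Toy data frame; values are made up for illustration
toy <- data.frame(
  age = c(63, 44, 60, 55, 58),
  cp  = c(4, 4, 3, 2, 4),
  num = c(2, 0, 2, 1, 3)
)

# cor() on a data frame returns a symmetric matrix with ones on the diagonal
corr <- cor(toy[c("age", "cp", "num")])
print(round(corr, 2))

# With a missing value, the default setting yields NA for the affected pair
toy$chol <- c(260, NA, 281, 228, 240)
corr_na <- cor(toy[c("age", "chol")])

# use = "complete.obs" drops the rows that contain an NA before computing
corr_ok <- cor(toy[c("age", "chol")], use = "complete.obs")
print(corr_ok)
```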

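Original-content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, please contact cloudcommunity@tencent.com for removal.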

