R: Removing NAs when mutating a new variable by group (dplyr)

落寞瓜子壳 lv.3

Posted: 2022-05-24 13:39:54 339

Tags: # data

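I am working with EU-SILC data for a university project. I want to create a new variable that assigns every household to a housing cost group, so that I can draw a stacked density plot showing how the income distribution relates to housing costs.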

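I have run into two problems:

1. I cannot create the variable hcost_group, because my housing cost variable (the basis for assigning households to the groups) has 47 NAs (out of almost 70,000 observations). I have tried many different ways of removing the NAs while creating the new variable, but I always get an error message.
2. Since I do not want to drop the households without housing costs from the data in general, the hcost_group variable will be shorter than my income variable. How can I exclude the income of households without housing costs for the plot only?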

Thanks a lot in advance!

Here is my code for creating the variable and the plot (including the error messages):

> data <- data %>% filter(!is.na(hcost)) %>% group_by(country) %>% 
+   mutate(hcost_group = quantcut(hcost, q=c(0.1, 0.2, 0.3, 0.4)))
Error: Problem with `mutate()` column `hcost_group`.
i `hcost_group = quantcut(hcost, q = c(0.1, 0.2, 0.3, 0.4))`.
x missing value where TRUE/FALSE needed
i The error occurred in group 6: country = "UK".
Run `rlang::last_error()` to see where the error occurred.

> ggplot(data=data, aes(x=decile, group=hcost_group, fill=hcost_group)) +
+   geom_density(adjust=1.5, position="fill") +
+   facet_wrap(~country)+
+   xlab("Einkommensdezil")+
+   ylab("Anteil der Gruppen nach Wohnkostenbelastung")+
+   scale_fill_discrete(name = "Wohnkostenbelastung (Anteil der Wohnkosten am EK)",
+                       labels = 
+                         c("0-10%", "10-20%","20-30%",
+                           "30-40%", "40-100%"))
Error in FUN(X[[i]], ...) : object 'hcost_group' not found

I have also tried "na.rm = TRUE", "na.omit()" and "complete.cases".
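For reference, here is a minimal sketch of one direction that might address both problems. It rests on assumptions that are not confirmed above: that hcost is the share of housing costs in income on a 0 to 1 scale (which the legend labels suggest), and that fixed cut points rather than quantiles are what the groups should be based on. It swaps quantcut() for base R's cut(), and the toy data frame and its values are purely hypothetical stand-ins for the EU-SILC extract. Note also that the second error above follows from the first: the assignment to data never completed, so hcost_group was never created.

```r
library(dplyr)

# Hypothetical toy data standing in for the EU-SILC extract.
toy <- data.frame(
  country = rep(c("AT", "UK"), each = 6),
  decile  = rep(1:6, times = 2),
  hcost   = c(0.05, 0.15, 0.25, 0.35, 0.60, NA,
              0.08, 0.12, NA, 0.33, 0.45, 0.90)
)

# Fixed breaks chosen to match the legend labels used in the plot
# (assumption: hcost is a share between 0 and 1).
breaks <- c(0, 0.1, 0.2, 0.3, 0.4, 1)
labels <- c("0-10%", "10-20%", "20-30%", "30-40%", "40-100%")

# cut() returns NA for NA input, so no rows have to be dropped
# when the grouping variable is created.
toy <- toy %>%
  mutate(hcost_group = cut(hcost, breaks = breaks, labels = labels,
                           include.lowest = TRUE))

# Drop households without a housing cost group for the plot only;
# the full data set in `toy` keeps all rows.
plot_data <- toy %>% filter(!is.na(hcost_group))
```

With this layout, plot_data would be passed to ggplot() in place of data, while the complete data set stays untouched for everything else. Values of hcost outside the 0 to 1 range would also end up as NA under these breaks, so the upper break may need adjusting for real data.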
