Try:

<pre><code>ddf
 Person IPaddress
1 36598035 222.999.22.99
2 36598035 222.999.22.99
3 36598035 222.999.22.99
4 36598035 222.999.22.99
5 36598035 222.999.22.99
6 36598035 444.666.44.66
7 37811171 111.88.111.88
8 37811171 111.88.111.88
9 37811171 111.88.111.88
10 37811171 111.88.111.88
11 37811171 111.88.111.88

dd1 = data.table(with(ddf, table(Person, IPaddress)))[rev(order(N))][!duplicated(Person)]
dd1
 Person IPaddress N
1: 36598035 222.999.22.99 5
2: 37811171 111.88.111.88 5

dd1$all_login_count = data.table(with(ddf, table(Person)))$V1
dd1
 Person IPaddress N all_login_count
1: 36598035 222.999.22.99 5 6
2: 37811171 111.88.111.88 5 5
</code></pre>
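One thing to watch in the last step: `dd1$all_login_count = ...` attaches the totals by position, so it relies on `dd1` happening to be in the same order as `table(Person)`. A minimal base R sketch of attaching the totals by name instead is below. The `dd1` here is a hypothetical stand-in built by hand (deliberately in the "wrong" order), and `logins` and `totals` are names made up for illustration.

```r
# Hypothetical stand-in for dd1, deliberately not sorted by Person.
dd1 <- data.frame(
  Person = c("37811171", "36598035"),
  IPaddress = c("111.88.111.88", "222.999.22.99"),
  stringsAsFactors = FALSE
)

# Person column of the question's data: 6 rows for one user, 5 for the other.
logins <- rep(c("36598035", "37811171"), times = c(6, 5))
totals <- table(logins)  # total logins per person, named by Person

# Index the table by name so each total lands on the matching person,
# whatever order the rows of dd1 are in.
dd1$all_login_count <- as.vector(totals[dd1$Person])
dd1
```

With the sample data both approaches give the same numbers; the lookup by name only matters when the row order differs.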

Here is one approach.

<pre><code>library(dplyr)

mydf %&gt;%
 group_by(Person, IPaddress) %&gt;% # For each combination of Person and IPaddress
 summarize(freq = n()) %&gt;% # Count the log-ins for that combination
 arrange(Person, desc(freq)) %&gt;% # Put each user's most frequent IP address first
 group_by(Person) %&gt;% # For each user
 mutate(total = sum(freq)) %&gt;% # Get the total number of log-ins
 select(-freq) %&gt;% # Remove the per-IP count
 do(head(.,1)) # Take the first row for each user

# Person IPaddress total
#1 36598035 222.999.22.99 6
#2 37811171 111.88.111.88 5
</code></pre>

UPDATE

<code>dplyr</code> 0.3 is out now, so you could also do the following. Using <code>count</code> makes it one line shorter. I also used <code>slice</code>, as @aosmith recommended.

<pre><code>mydf %&gt;%
 count(Person, IPaddress) %&gt;%
 arrange(Person, desc(n)) %&gt;%
 group_by(Person) %&gt;%
 mutate(total = sum(n)) %&gt;%
 select(-n) %&gt;%
 slice(1)
</code></pre>
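For anyone who wants to follow the same steps without `dplyr`, here is a rough base R sketch of the pipeline (count per pair, sort, total per user, keep the first row). It is only an illustration, not the `dplyr` implementation: `mydf` is rebuilt by hand from the question's sample, storing both columns as character is an assumption, and `freq` and `out` are names made up here.

```r
# Sample data from the question (assumption: both columns stored as character).
mydf <- data.frame(
  Person = rep(c("36598035", "37811171"), times = c(6, 5)),
  IPaddress = c(rep("222.999.22.99", 5), "444.666.44.66",
                rep("111.88.111.88", 5)),
  stringsAsFactors = FALSE
)

# Step 1: count log-ins per (Person, IPaddress) pair.
freq <- aggregate(n ~ Person + IPaddress,
                  data = transform(mydf, n = 1L), FUN = sum)

# Step 2: sort so each person's most frequent IP address comes first.
freq <- freq[order(freq$Person, -freq$n), ]

# Step 3: total log-ins per person, repeated on each of that person's rows.
freq$total <- ave(freq$n, freq$Person, FUN = sum)

# Step 4: keep the first row per person and drop the per-IP count.
out <- freq[!duplicated(freq$Person), c("Person", "IPaddress", "total")]
rownames(out) <- NULL
out
```

As in the pipeline above, when two IP addresses are tied for a user, whichever one sorts first is kept.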

You can use <code>data.table</code> for a concise solution:

<pre><code>library(data.table)
setDT(dat)
dat[, list(IPaddress=names(which.max(table(IPaddress))),
 Logins=.N), 
 by=Person]
</code></pre>
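The `names(which.max(table(...)))` idiom is what picks the mode here. A small base R sketch of it on a single user's log-ins is below; the vector `ips` is a hypothetical stand-in for one `by=Person` group, copied from the question's sample.

```r
# Hypothetical stand-in for one user's IPaddress values (person 36598035).
ips <- c(rep("222.999.22.99", 5), "444.666.44.66")

counts <- table(ips)                  # frequency of each distinct IP address
mode_ip <- names(which.max(counts))   # name of the first entry with the highest count
n_logins <- length(ips)               # plays the role of .N within the group

mode_ip
n_logins
```

Note that `which.max` returns the first maximum, so if two addresses are tied the one that comes first in the table's sorted names is reported.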

I have a dataset (dat) that looks like this:

<pre><code> Person IPaddress
36598035 222.999.22.99
36598035 222.999.22.99
36598035 222.999.22.99
36598035 222.999.22.99
36598035 222.999.22.99
36598035 444.666.44.66
37811171 111.88.111.88
37811171 111.88.111.88
37811171 111.88.111.88
37811171 111.88.111.88
37811171 111.88.111.88
</code></pre>

It records each time an individual logged into a website over a certain period of time. I need the data to look like this:

<pre><code>Person IPaddress Number of Logins
36598035 222.999.22.99 6
37811171 111.88.111.88 5
</code></pre>

So, instead of multiple entries for the same person, there is just one row per individual, with a count of how many times they logged in.

Also, you'll notice in my example that person 36598035 logged in under more than 1 IP address. When this happens, I want the IP address in the final dataset to reflect the mode IP address--in other words, the IP address that the individual logged in under most frequently.

Data restructuring using R
