Q: Cassandra data modeling for a social network

Stack Overflow user

Asked on 2016-05-29 16:44:39

Answers 1, Views 2.1K, Followers 0, Votes 3

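We are using Cassandra for our social network and are designing the tables we need. The data modeling is confusing us: we don't know how to design some of the tables, and we have a few questions.

As we understand it, every query needs its own table. For example, user A follows users C and B.

In Cassandra we currently have a table named posts_by_user: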

user_id  |  post_id  |  text  |  created_on  |  deleted  |  view_count  |  likes_count  |  comments_count  |  user_full_name

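We also have a table driven by each user's followers: when a user posts, we insert the post's information into a table called user_timeline, once per follower. When a follower opens the home page, we fetch the posts from user_timeline.

Here is the user_timeline table: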

follower_id  |  post_id  |  user_id (who posted)  |  likes_count  |  comments_count  |  location_name  |  user_full_name

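First, is this data model correct for a follow-based (follower/following) social network?

Second, we want to count the likes on a post. As you can see, the like count is stored in both tables (user_timeline and posts_by_user). Suppose a user has 1000 followers: for every like we would have to update all 1000 rows in user_timeline plus 1 row in posts_by_user. That does not make sense!

So my second question is: how should this be done? In other words, what should the like (favorite) table look like?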

datastax

datastax-enterprise

datastax-startup

cassandra

1 Answer

Stack Overflow user

Accepted answer

Answered on 2016-06-14 17:57:52

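Consider using posts_by_user to hold the metadata for a post. It can store user_id, post_id, message_text, and so on, while view_count, likes_count, and comments_count are pulled out into a separate counter table. As long as you have the post_id, you can fetch either the post's metadata or its counters, and each like requires updating only a single counter record.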
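As a rough sketch of the counter idea above, the statements below record one like and then read the counts back. The keyspace replication settings and the literal post_id are illustrative assumptions, not part of the original answer; the post_counts table mirrors the one in the answer's schema.

```cql
-- Assumption: a single-node development keyspace; adjust replication for a real cluster.
CREATE KEYSPACE IF NOT EXISTS social_media
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- In a counter table, every column outside the primary key must be a counter.
CREATE TABLE IF NOT EXISTS social_media.post_counts (
    post_id uuid,
    likes_count counter,
    view_count counter,
    comments_count counter,
    PRIMARY KEY (post_id)
);

-- One like = one counter update, no matter how many followers the author has.
-- Counters cannot be INSERTed; the first UPDATE creates the row.
UPDATE social_media.post_counts
    SET likes_count = likes_count + 1
    WHERE post_id = 22222222-2222-2222-2222-222222222222;  -- hypothetical post_id

-- Read the counts when rendering the post.
SELECT likes_count, view_count, comments_count
    FROM social_media.post_counts
    WHERE post_id = 22222222-2222-2222-2222-222222222222;
```

With this layout the timeline rows no longer carry likes_count, so a like never touches user_timeline at all.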

DSE counter documentation: t.html

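However, the article linked below is a good starting point for Cassandra data modeling. There are a few things to consider when answering this question, many of which depend on the internals of your system and on how your queries are structured. The article's first two rules are:

Rule 1: Spread data evenly around the cluster

Rule 2: Minimize the number of partitions read

Take a moment to consider the "user_timeline" table.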

user_id and created_on as a compound key*, if:

- You want to query for posts by a certain user, and you can assume a decent number of users. Records are distributed evenly, and each query hits only one partition at a time.

user_id and hash_prefix as a compound key*, if:

- You have a small number of users with a large number of posts. The prefix lets the data spread evenly across the cluster, but you run the risk of having to query across multiple partitions.

follower_id and created_on as a compound key*, if:

- You want to query for the posts a certain follower is following. Records are distributed, and you minimize queries across partitions.
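The hash_prefix option is the least obvious of the three, so here is one way it might look. The table name posts_by_user_bucketed, the int type of hash_prefix, and the choice of four buckets (0 to 3, picked by the application, for example from a hash of post_id) are all assumptions made for illustration. The sketch also assumes the social_media keyspace already exists.

```cql
-- Hypothetical bucketed variant: (user_id, hash_prefix) together form the
-- partition key, so one heavy user's posts are spread over several partitions.
CREATE TABLE IF NOT EXISTS social_media.posts_by_user_bucketed (
    user_id uuid,
    hash_prefix int,        -- bucket number chosen by the application (assumed 0..3)
    created_on timestamp,
    post_id uuid,
    message_text text,
    PRIMARY KEY ((user_id, hash_prefix), created_on, post_id)
);

-- Write: the application computes the bucket before inserting.
INSERT INTO social_media.posts_by_user_bucketed
    (user_id, hash_prefix, created_on, post_id, message_text)
VALUES
    (11111111-1111-1111-1111-111111111111, 2, '2016-06-14 17:57:52+0000',
     22222222-2222-2222-2222-222222222222, 'hello');

-- Read: all of a user's posts now require touching every bucket,
-- which is the multi-partition cost mentioned above.
SELECT post_id, created_on, message_text
    FROM social_media.posts_by_user_bucketed
    WHERE user_id = 11111111-1111-1111-1111-111111111111
      AND hash_prefix IN (0, 1, 2, 3);
```

The trade-off is visible in the last query: writes are spread evenly (Rule 1), but reads fan out across partitions (against Rule 2), so the bucket count should stay small.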

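Those are three possible designs for a single table. The point I want to get across is that you should design your tables around the queries you intend to run. Also, don't be afraid to duplicate data across multiple tables, each set up to serve a different query; that is how modeling is done in Cassandra. Take some time to read the article below and to watch the data modeling course on DataStax Academy to get familiar with the nuances. I have also included a sample schema below that covers the basic counter schema I described earlier.

*The reason for using a compound key is that your primary key must be unique; otherwise an insert with an existing primary key becomes an update.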
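The footnote's point about inserts turning into updates can be seen with a small throwaway table. The table name upsert_demo and the literal values are hypothetical, and the sketch assumes the social_media keyspace already exists.

```cql
-- Hypothetical demo table keyed by (user_id, created_on).
CREATE TABLE IF NOT EXISTS social_media.upsert_demo (
    user_id uuid,
    created_on timestamp,
    message_text text,
    PRIMARY KEY ((user_id), created_on)
);

-- Two inserts with the same primary key values...
INSERT INTO social_media.upsert_demo (user_id, created_on, message_text)
VALUES (11111111-1111-1111-1111-111111111111, '2016-06-14 17:57:52+0000', 'first');

INSERT INTO social_media.upsert_demo (user_id, created_on, message_text)
VALUES (11111111-1111-1111-1111-111111111111, '2016-06-14 17:57:52+0000', 'second');

-- ...leave a single row: the second insert overwrote the first.
SELECT user_id, created_on, message_text
    FROM social_media.upsert_demo
    WHERE user_id = 11111111-1111-1111-1111-111111111111;
```

If two distinct posts could ever share the same key values, add another column (such as post_id) to the primary key so they remain separate rows.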

http://www.datastax.com/dev/blog/basic-rules-of-cassandra-data-modeling https://academy.datastax.com/courses

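-- Post metadata, one partition per author.
CREATE TABLE IF NOT EXISTS social_media.posts_by_user (
    user_id uuid,
    post_id uuid,
    message_text text,
    created_on timestamp,
    deleted boolean,
    user_full_name text,
    -- user_id is the partition key and created_on a clustering column,
    -- so one user's posts share a partition and are ordered by time.
    PRIMARY KEY ((user_id), created_on)
);

-- Home-page timeline, one partition per follower.
CREATE TABLE IF NOT EXISTS social_media.user_timeline (
    follower_id uuid,
    post_id uuid,
    user_id uuid,            -- the user who posted
    location_name text,
    user_full_name text,
    created_on timestamp,
    -- Partitioned by follower_id so a timeline read hits one partition;
    -- post_id keeps posts with the same timestamp as distinct rows.
    PRIMARY KEY ((follower_id), created_on, post_id)
);

-- Counters live in their own table, keyed by post_id.
CREATE TABLE IF NOT EXISTS social_media.post_counts (
    likes_count counter,
    view_count counter,
    comments_count counter,
    post_id uuid,
    PRIMARY KEY (post_id)
);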
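To show how the tables are meant to work together, here is a sketch of the write path when a user publishes a post. All UUIDs, names, and the timestamp are made-up values, and the sketch assumes the tables above already exist and that the application loops over the author's followers itself.

```cql
-- 1. Store the post's metadata once, under its author.
INSERT INTO social_media.posts_by_user
    (user_id, created_on, post_id, message_text, deleted, user_full_name)
VALUES
    (11111111-1111-1111-1111-111111111111, '2016-06-14 17:57:52+0000',
     22222222-2222-2222-2222-222222222222, 'hello', false, 'User C');

-- 2. Fan the post out: one timeline row per follower (here, follower A).
--    The application repeats this insert for every follower of the author.
INSERT INTO social_media.user_timeline
    (follower_id, created_on, post_id, user_id, location_name, user_full_name)
VALUES
    (33333333-3333-3333-3333-333333333333, '2016-06-14 17:57:52+0000',
     22222222-2222-2222-2222-222222222222,
     11111111-1111-1111-1111-111111111111, 'Somewhere', 'User C');
```

Only the rarely changing metadata is duplicated into the timeline. The frequently changing counts stay in post_counts and are looked up by post_id when the page is rendered.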

Votes 5

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/37512446
