Schema Design for Social Inboxes in MongoDB-MySQL 튜토리얼-php.cn

집

데이터 베이스

MySQL 튜토리얼

Schema Design for Social Inboxes in MongoDB

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 07, 2016 pm 04:32 PM

Designforsocial

Designing a schema is a critical part of any application. Like most databases, there are many options for modeling data in MongoDB, and it is important to incorporate the functional requirements and performance goals for your application when determining the best design. In this post, we’ll explore three approaches for using MongoDB when creating social inboxes or message timelines.

If you’re building a social network, like Twitter for example, you need to design a schema that is efficient for users viewing their inbox, as well as users sending messages to all their followers. The whole point of social media, after all, is that you can connect in real time.

There are several design considerations for this kind of application:

The application needs to support a potentially large volume of reads and writes.
Reads and writes are not uniformly distributed across users. Some users post much more frequently than others, and some users have many, many more followers than others.
The application must provide a user experience that is instantaneous.
Edit 11/6: The application will have little to no user deletions of data (a follow up blog post will include information about user deletions and historical data)

Because we are designing an application that needs to support a large volume of reads and writes we will be using a sharded collection for the messages. All three designs include the concept of “fan out,” which refers to distributing the work across the shards in parallel:

Fan out on Read
Fan out on Write
Fan out on Write with Buckets

Each approach presents trade-offs, and you should use the design that is best for your application’s requirements.

The first design you might consider is called Fan Out on Read. When a user sends a message, it is simply saved to the inbox collection. When any user views their own inbox, the application queries for all messages that include the user as a recipient. The messages are returned in descending date order so that users can see the most recent messages.

To implement this design, create a sharded collection called inbox, specifying the from field as the shard key, which represents the address sending the message. You can then add a compound index on the to field and the sent field. Once the document is saved into the inbox, the message is effectively sent to all the recipients. With this approach sending messages is very efficient.

Viewing an inbox, on the other hand, is less efficient. When a user views their inbox the application issues a find command based on the to field, sorted by sent. Because the inbox collection uses from as its shard key, messages are grouped by sender across the shards. In MongoDB queries that are not based on the shard key will be routed to all shards. Therefore, each inbox view will be routed to all shards in the system. As the system scales and many users go to view their inbox, all queries will be routed to all shards. This design does not scale as well as each query being routed to a single shard.

With the “Fan Out on Read” method, sending a message is very efficient, but viewing the inbox is less efficient.

Fan out on Read is very efficient for sending messages, but less efficient for reading messages. If the majority of your application consists of users sending messages, but very few go to read what anyone sends them — let’s call it an anti-social app — then this design might work well. However, for most social apps there are more requests by users to view their inbox than there are to send messages.

The Fan out on Write takes a different approach that is more optimized for viewing inboxes. This time, instead of sharding our inbox collection on the sender, we shard on the message recipient. In this way, when we go to view an inbox the queries can be routed to a single shard, which will scale very well. Our message document is the same, but now save a copy of the message for every recipient.

With the “Fan Out on Write” method, viewing the inbox is efficient, but sending messages consumes more resources.

In practice we might implement the saving of messages asynchronously. Imagine two celebrities quickly exchange messages at a high-profile event - the system could quickly be saturated with millions of writes. By saving a first copy of their message, then using a pool of background workers to write copies to all followers, we can ensure the two celebrities can exchange messages quickly, and that followers will soon have their own copies. Furthermore, we could maintain a last-viewed date on the user document to ensure they have accessed the system recently - zombie accounts probably shouldn’t get a copy of the message, and for users that haven’t accessed their account recently we could always resort to our first design - Fan out on Read - to repopulate their inbox. Subsequent requests would then be fast again.

At this point we have improved the design for viewing inboxes by routing each inbox view to a single shard. However, each message in the user’s inbox will produce a random read operation. If each inbox view produces 50 random reads, then it only takes a relatively modest number of concurrent users to potentially saturate the disks. Fortunately we can take advantage of the document data model to further optimize this design to be even more efficient.

Fan out on Write with Buckets refines the Fan Out on Write design by “bucketing” messages together into documents of 50 messages ordered by time. When a user views their inbox the request can be fulfilled by reading just a few documents of 50 messages each instead of performing many random reads. Because read time is dominated by seek time, reducing the number of seeks can provide a major performance improvement to the application. Another advantage to this approach is that there are fewer index entries.

To implement this design we create two collections, an inbox collection and a user collection. The inbox collection uses two fields for the shard key, owner and sequence, which holds the owner’s user id and sequence number (i.e. the id of 50-message “bucket” documents in their inbox). The user collection contains simple user documents for tracking the total number of messages in their inbox. Since we will probably need to show the total number of messages for a user in a variety of places in our application, this is a nice place to maintain the count instead of calculating for each request. Our message document is the same as in the prior examples.

To send a message we iterate through the list of recipients as we did in the Fan out on Write example, but we also take another step to increment the count of total messages in the inbox of the recipient, which is maintained on the user document. Once we know the count of messages, we know the “bucket” in which to add the latest message. As these messages reach the 50 item threshold, the sequence number increments and we begin to add messages to the next “bucket” document. The most recent messages will always be in the “bucket” document with the highest sequence number. Viewing the most recent 50 messages for a user’s inbox is at most two reads; viewing the most recent 100 messages is at most three reads.

Normally a user’s entire inbox will exist on a single shard. However, it is possible that a few user inboxes could be spread across two shards. Because our application will probably page through a user’s inbox, it is still likely that every query for these few users will be routed to a single shard.

Fan out on Write with Buckets is generally the most scalable approach of the these three designs for social inbox applications. Every design presents different trade-offs. In this case viewing a user’s inbox is very efficient, but writes are somewhat more complex, and more disk space is consumed. For many applications these are the right trade-offs to make.

Schema design is one of the most important optimizations you can make for your application. We have a number of additional resources available on schema design if you are interested in learning more:

Fan out on Read	Fan out on Write	Fan out on Write with Buckets
Send Message Performance	Best Single write	Good Shard per recipient Multiple writes	Worst Shard per recipient Appends (grows)
Read Inbox Performance	Worst Broadcast all shards Random reads	Good Single shard Random reads	Best Single shard Single read
Data Size	Best Message stored once	Worst Copy per recipient	Worst Copy per recipient

Schema design is one of the most important optimizations you can make for your application. We have a number of additional resources available on schema design if you are interested in learning more:

Check out the recording of our recent Schema Design webinar on this topic.
Schema Design in general in the MongoDB Manual
You can also view our schema design resources on the MongoDB docs page
If you have any schema design questions, please view the archived questions on our user forum or ask a question yourselfon the MongoDB User Forum.

原文地址：Schema Design for Social Inboxes in MongoDB, 感谢原作者分享。

성명

본 글의 내용은 네티즌들의 자발적인 기여로 작성되었으며, 저작권은 원저작자에게 있습니다. 본 사이트는 이에 상응하는 법적 책임을 지지 않습니다. 표절이나 침해가 의심되는 콘텐츠를 발견한 경우 admin@php.cn으로 문의하세요.

관련 기사

MySQL과 다른 SQL 방언의 구문의 차이점은 무엇입니까?Apr 27, 2025 am 12:26 AM

mysqldiffersfromothersqldialectsinsyntaxforlimit, 자동 점유, 문자열 comparison, 하위 쿼리 및 퍼포먼스 앤 알리 분석 .1) mysqluse Slimit, whilesqlSerVerusestOpandoracleSrownum.2) MySql'Sauto_incrementContrastSwithPostgresql'serialandoracle '

MySQL 파티셔닝이란 무엇입니까?Apr 27, 2025 am 12:23 AM

MySQL 파티셔닝은 성능을 향상시키고 유지 보수를 단순화합니다. 1) 큰 테이블을 특정 기준 (예 : 날짜 범위)으로 작은 조각으로 나누고, 2) 데이터를 독립적 인 파일로 물리적으로 나눌 수 있습니다.

MySQL에서 어떻게 권한을 부여하고 취소합니까?Apr 27, 2025 am 12:21 AM

MySQL에서 권한을 부여하고 취소하는 방법은 무엇입니까? 1. 보조금 명세서를 사용하여 grantallprivilegesondatabase_name.to'username'@'host '와 같은 부여 권한; 2. Revoke 문을 사용하여 Revokeallprivilegesondatabase_name.from'username'@'host '와 같은 권한을 취소하여 허가 변경의 적시에 의사 소통을 보장하십시오.

InnoDB와 MyISAM 스토리지 엔진의 차이점을 설명하십시오.Apr 27, 2025 am 12:20 AM

InnoDB는 거래 지원 및 높은 동시성이 필요한 응용 프로그램에 적합한 반면, MyISAM은 더 많은 읽기와 덜 쓰는 응용 프로그램에 적합합니다. 1. INNODB는 전자 상거래 및 은행 시스템에 적합한 거래 및 은행 수준의 자물쇠를 지원합니다. 2. Myisam은 블로깅 및 컨텐츠 관리 시스템에 적합한 빠른 읽기 및 색인을 제공합니다.

MySQL의 다른 유형의 조인은 무엇입니까?Apr 27, 2025 am 12:13 AM

MySQL에는 Innerjoin, Leftjoin, RightJoin 및 FullouterJoin의 네 가지 주요 조인 유형이 있습니다. 1. 결합 조건을 충족하는 두 테이블의 모든 행을 반환합니다. 2. Leftjoin 오른쪽 테이블에 일치하는 행이 없더라도 왼쪽 테이블의 모든 행을 반환합니다. 3. RightJoin은 LeftJoin과 상반되며 오른쪽 테이블의 모든 행을 반환합니다. 4. FULLOUTERNOIN은 조건을 충족 시키거나 충족하지 않는 두 테이블의 모든 행을 반환합니다.

MySQL에서 사용 가능한 다른 스토리지 엔진은 무엇입니까?Apr 26, 2025 am 12:27 AM

mysqloffersvariousStorageEngines, 각각의 everitedforentUsecases : 1) innodbisidealforapplicationsneedingAcidCoInceandHighConcurrency, 지원 트랜잭션 및 foreignKeys.2) myIsAmisbestforread-heverworkloads, memoryengineis

MySQL의 일반적인 보안 취약점은 무엇입니까?Apr 26, 2025 am 12:27 AM

MySQL의 일반적인 보안 취약점에는 SQL 주입, 약한 암호, 부적절한 권한 구성 및 업데이트되지 않은 소프트웨어가 포함됩니다. 1. 전처리 명령문을 사용하여 SQL 주입을 방지 할 수 있습니다. 2. 강력한 비밀번호 전략을 사용하여 약한 암호는 피할 수 있습니다. 3. 정기적 인 검토 및 사용자 권한 조정을 통해 부적절한 권한 구성을 해결할 수 있습니다. 4. Unupdated 소프트웨어는 MySQL 버전을 정기적으로 확인하고 업데이트하여 패치 할 수 있습니다.

MySQL에서 느린 쿼리를 어떻게 식별 할 수 있습니까?Apr 26, 2025 am 12:15 AM

느린 쿼리 로그를 활성화하고 임계 값을 설정하여 MySQL에서 느린 쿼리를 식별 할 수 있습니다. 1. 느린 쿼리 로그를 활성화하고 임계 값을 설정하십시오. 2. 느린 쿼리 로그 파일을보고 분석하고 심층 분석을 위해 MySQLDumpSlow 또는 PT-Query 소수성과 같은 도구를 사용하십시오. 3. 인덱스 최적화, 쿼리 재 작성 및 select*의 사용을 피함으로써 느린 쿼리 최적화를 달성 할 수 있습니다.

See all articles

핫 AI 도구

Undresser.AI Undress

사실적인 누드 사진을 만들기 위한 AI 기반 앱

AI Clothes Remover

사진에서 옷을 제거하는 온라인 AI 도구입니다.

Undress AI Tool

무료로 이미지를 벗다

Clothoff.io

AI 옷 제거제

Video Face Swap

완전히 무료인 AI 얼굴 교환 도구를 사용하여 모든 비디오의 얼굴을 쉽게 바꾸세요!

뜨거운 도구

드림위버 CS6

시각적 웹 개발 도구

SublimeText3 영어 버전

권장 사항: Win 버전, 코드 프롬프트 지원!

mPDF

mPDF는 UTF-8로 인코딩된 HTML에서 PDF 파일을 생성할 수 있는 PHP 라이브러리입니다. 원저자인 Ian Back은 자신의 웹 사이트에서 "즉시" PDF 파일을 출력하고 다양한 언어를 처리하기 위해 mPDF를 작성했습니다. HTML2FPDF와 같은 원본 스크립트보다 유니코드 글꼴을 사용할 때 속도가 느리고 더 큰 파일을 생성하지만 CSS 스타일 등을 지원하고 많은 개선 사항이 있습니다. RTL(아랍어, 히브리어), CJK(중국어, 일본어, 한국어)를 포함한 거의 모든 언어를 지원합니다. 중첩된 블록 수준 요소(예: P, DIV)를 지원합니다.