第五部分 架构篇 第十四章 MongoDB Replica Sets 架构(自动故障转移/读写分离实践)
说明:该篇内容部分来自红丸编写的MongoDB实战文章。
1、简介
MongoDB支持在多个机器中通过异步复制达到故障转移和实现冗余,多机器中同一时刻只有一台是用于写操作,正是由于这个情况,为了MongoDB提供了数据一致性的保障,担当primary角色的服务能把读操作分发给Slave(详情请看前两篇关于Replica Set成员组成和理解)。
MongoDB高可用分为两种:
- Master-Slave主从复制:只需要在某一个服务启动时加上-master参数,而另外一个服务加上-slave与-source参数,即可实现同步,MongoDB的最新版本已经不在推荐此方案。在官网的文档中有如下一段提醒:
IMPORTANT
Replica sets replace master-slave replication for most use cases. If possible, use replica sets rather than master-slave replication for all new production deployments. This documentation remains to support legacy deployments and for archival purposes only.
意思就是说在很多的案例中已经用Replica Set来替代Master-slave。
- Replica Set复制集:MongoDB在1.6版本后开发了新功能Replica Set,这比之前的Replication功能要强大一些,增加了故障自动切换和自动修复成员节点,各个DB之间数据完全一致,大大降低了维护难度,auto shard已经明确说明不支持replication paris,建议使用Replica Set,故障完全自动切换。
- 环境准备
- 步骤
[root@localhost mongodb]# mkdir -p r0 [root@localhost mongodb]# mkdir -p r1 [root@localhost mongodb]# mkdir -p r2创建日志文件路径:
[root@localhost mongodb]# mkdir -p log创建主从key文件,用于标识集群的私钥的完整路径,如果各个实例的keyfile内容不一致,程序将不能正常启动。
[root@localhost mongodb]# mkdir -p key
[root@localhost mongodb]# echo "this is rs1 super secret key">key/r0 [root@localhost mongodb]# echo "this is rs1 super secret key">key/r1 [root@localhost mongodb]# echo "this is rs1 super secret key">key/r2 [root@localhost mongodb]# chmod 600 key/r* [root@localhost mongodb]#启动三个实例:
[root@localhost bin]# ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r0 --fork --port 28010 --dbpath=/usr/local/mongodb/r0 --logpath=/usr/local/mongodb/log/r0.log --logappend about to fork child process, waiting until server is ready for connections. forked process: 2545
[root@localhost bin]# ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r1 --fork --port 28011 --dbpath=/usr/local/mongodb/r1 --logpath=/usr/local/mongodb/log/r1.log --logappend about to fork child process, waiting until server is ready for connections. forked process: 2596
[root@localhost bin]# ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r2 --fork --port 28012 --dbpath=/usr/local/mongodb/r2 --logpath=/usr/local/mongodb/log/r2.log --logappend about to fork child process, waiting until server is ready for connections. forked process: 2602说明:三个实例端口分别为28010、28011、28012 数据存放文件分别为r0、r1、r2。
[root@localhost bin]# ./mongo --port 28010 MongoDB shell version: 2.6.6 connecting to: 127.0.0.1:28010/test > config_rs1={_id:"rs1",members:[{_id:0,host:'localhost:28010',priority:1},{_id:1,host:'localhost:28011'},{_id:2,host:'localhost:28012'}]} { "_id" : "rs1", "members" : [ { "_id" : 0, "host" : "localhost:28010", "priority" : 1 }, { "_id" : 1, "host" : "localhost:28011" }, { "_id" : 2, "host" : "localhost:28012" } ] } >说明:指定每个阶段的IP和端口,priority=1作用将端口28010设置为primary。
> rs.initiate(config_rs1); { "info" : "Config now saved locally. Should come online in about a minute.", "ok" : 1 } >查看复制集的状态:
rs1:OTHER> rs.status(); { "set" : "rs1", "date" : ISODate("2015-01-16T03:10:41Z"), "myState" : 2, "members" : [ { "_id" : 0, "name" : "localhost:28010", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 260, "optime" : Timestamp(1421377833, 1), "optimeDate" : ISODate("2015-01-16T03:10:33Z"), "self" : true }, { "_id" : 1, "name" : "localhost:28011", "health" : 1, <span style="background-color: rgb(255, 0, 0);"><span style="color:#ff0000;"> </span>"state" : 5, "stateStr" : "STARTUP2",</span> "uptime" : 8, "optime" : Timestamp(0, 0), "optimeDate" : ISODate("1970-01-01T00:00:00Z"), "lastHeartbeat" : ISODate("2015-01-16T03:10:39Z"), "lastHeartbeatRecv" : ISODate("2015-01-16T03:10:39Z"), "pingMs" : 0, "<span style="background-color: rgb(255, 0, 0);">lastHeartbeatMessage" : "initial sync need a member to be primary or secondary to do our initial sync"</span> }, { "_id" : 2, "name" : "localhost:28012", "health" : 1, "state" : 5, "stateStr" : "STARTUP2", "uptime" : 8, "optime" : Timestamp(0, 0), "optimeDate" : ISODate("1970-01-01T00:00:00Z"), "lastHeartbeat" : ISODate("2015-01-16T03:10:39Z"), "lastHeartbeatRecv" : ISODate("2015-01-16T03:10:40Z"), "pingMs" : 0, <span style="background-color: rgb(255, 0, 0);">"lastHeartbeatMessage" : "initial sync need a member to be primary or secondary to do our initial sync"</span> } ], "ok" : 1 }
说明:在name为localhost:28010的节点的stateStr为SECONDARY,这是为什么呢?我们结合下面红色字体标注的地方来看,在调用rs.initiatie初始化Replica Set配置时,里面的提示信息为:
rs1:PRIMARY> rs.status(); { "set" : "rs1", "date" : ISODate("2015-01-16T03:13:09Z"), "myState" : 1, "members" : [ { "_id" : 0, "name" : "localhost:28010", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 408, "optime" : Timestamp(1421377833, 1), "optimeDate" : ISODate("2015-01-16T03:10:33Z"), "electionTime" : Timestamp(1421377841, 1), "electionDate" : ISODate("2015-01-16T03:10:41Z"), "self" : true }, { "_id" : 1, "name" : "localhost:28011", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 156, "optime" : Timestamp(1421377833, 1), "optimeDate" : ISODate("2015-01-16T03:10:33Z"), "lastHeartbeat" : ISODate("2015-01-16T03:13:09Z"), "lastHeartbeatRecv" : ISODate("2015-01-16T03:13:07Z"), "pingMs" : 1, "syncingTo" : "localhost:28010" }, { "_id" : 2, "name" : "localhost:28012", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 156, "optime" : Timestamp(1421377833, 1), "optimeDate" : ISODate("2015-01-16T03:10:33Z"), "lastHeartbeat" : ISODate("2015-01-16T03:13:08Z"), "lastHeartbeatRecv" : ISODate("2015-01-16T03:13:09Z"), "pingMs" : 0, "syncingTo" : "localhost:28010" } ], "ok" : 1 }此时Replica Set已经初始化完成,各个节点状态均以正常,state=1的为primary服务。state=2的为SECONDARY服务节点。两个SECONDARY状态的阶段都是通过28010端口同步数据,通过syncingTo字段可以看出。
rs1:PRIMARY> rs.isMaster(); { "setName" : "rs1", "setVersion" : 1, "ismaster" : true, "secondary" : false, "hosts" : [ "localhost:28010", "localhost:28012", "localhost:28011" ], "primary" : "localhost:28010", "me" : "localhost:28010", "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 1000, "localTime" : ISODate("2015-01-16T02:38:58.479Z"), "maxWireVersion" : 2, "minWireVersion" : 0, "ok" : 1 } rs1:PRIMARY>
3.1、主从操作日志oplog
MongoDB的Replica Set架构是通过一个日志来存储写操作的,这个日志叫做oplog,在前面的教程中已经学习过了,oplog.rs是一个固定长度的capped collection,它存在于local数据库中,用于记录Replica Sets的操作日志,在默认情况下,对于64位的MongoDB,oplog是比较大的,可以达到5%的磁盘空间,oplog的大小可以通过mongod的参数--oplogSize来改变oplog的日志大小。
rs1:PRIMARY> use local switched to db local rs1:PRIMARY> show collections me oplog.rs startup_log system.indexes system.replset rs1:PRIMARY> \
rs1:PRIMARY> db.oplog.rs.find(); { "ts" : Timestamp(1421375729, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } } rs1:PRIMARY>
字段说明:
rs1:PRIMARY> db.printReplicationInfo(); configured oplog size: 990MB log length start to end: 0secs (0hrs) oplog first event time: Fri Jan 16 2015 10:35:29 GMT+0800 (CST) oplog last event time: Fri Jan 16 2015 10:35:29 GMT+0800 (CST) now: Fri Jan 16 2015 10:45:18 GMT+0800 (CST) rs1:PRIMARY>字段说明:
rs1:PRIMARY> db.printSlaveReplicationInfo(); source: localhost:28011 syncedTo: Thu Jan 01 1970 08:00:00 GMT+0800 (CST) 1421375729 secs (394826.59 hrs) behind the primary source: localhost:28012 syncedTo: Thu Jan 01 1970 08:00:00 GMT+0800 (CST) 1421375729 secs (394826.59 hrs) behind the primary rs1:PRIMARY>字段说明:
rs1:PRIMARY> db.system.replset.find(); { "_id" : "rs1", "version" : 1, "members" : [ { "_id" : 0, "host" : "localhost:28010" }, { "_id" : 1, "host" : "localhost:28011" }, { "_id" : 2, "host" : "localhost:28012" } ] } rs1:PRIMARY>从这个集合中可以看出,Replica Sets的配置信息,也可以在任何一个成员实例上执行rs.conf()来查看配置信息。
3.3、Replica set测试
[root@localhost bin]# ./mongo --port 28010 MongoDB shell version: 2.6.6 connecting to: 127.0.0.1:28010/test rs1:PRIMARY> db.student.insert({name:"zhangsan",age:20}); WriteResult({ "nInserted" : 1 }) <span style="background-color: rgb(255, 0, 0);">rs1:PRIMARY</span>> db.student.find(); { "_id" : ObjectId("54b87ca7f663c819d621d590"), "name" : "zhangsan", "age" : 20 } rs1:PRIMARY>
[root@localhost bin]# ./mongo --port 28011 MongoDB shell version: 2.6.6 connecting to: 127.0.0.1:28011/test rs1:SECONDARY> show collections 2015-01-16T11:22:54.517+0800 error: { "$err" : "<span style="background-color: rgb(255, 102, 102);">not master and slaveOk=false</span>", "code" : 13435 } at src/mongo/shell/query.js:131
rs1:SECONDARY> db.getMongo().setSlaveOk();
<span style="background-color: rgb(255, 0, 0);">rs1:SECONDARY</span>> show collections; student system.indexes rs1:SECONDARY>此时便可以进行查询操作了。
在此要注意下连接到mongod服务之后,命令行开头变成了rs1.SECONDARY和rs1.PRIMARY,说明当前登录的rs1这个复制集得PRIMARY节点或者SECONDARY的节点。
rs1:SECONDARY> db.student.find(); { "_id" : ObjectId("54b883fb7bd891605d9c300f"), "name" : "zhangsan", "age" : 20 } rs1:SECONDARY>28012端口操作如下:
[root@localhost bin]# ./mongo --port 28012 MongoDB shell version: 2.6.6 connecting to: 127.0.0.1:28012/test rs1:SECONDARY> show collections 2015-01-16T11:27:04.747+0800 error: { "$err" : "not master and slaveOk=false", "code" : 13435 } at src/mongo/shell/query.js:131 rs1:SECONDARY> db.getMongo().setSlaveOk(); rs1:SECONDARY> show collections student system.indexes rs1:SECONDARY> db.student.find(); { "_id" : ObjectId("54b883fb7bd891605d9c300f"), "name" : "zhangsan", "age" : 20 } rs1:SECONDARY>在28011端口上进行写操作:
[root@localhost bin]# ./mongo --port 28011 MongoDB shell version: 2.6.6 connecting to: 127.0.0.1:28011/test rs1:SECONDARY> db.student.insert({name:"lisi",age:20}); WriteResult({ "writeError" : { "code" : undefined, "errmsg" : "not master" } })此时提示不是master不能进行写操作,这跟前面两章节详细讲解Replica Set架构的相关原理相符合。
bye [root@localhost bin]# <span style="color:#ff0000;">ps aux|grep mongod</span> root <span style="background-color: rgb(255, 0, 0);"> 6658 </span> 0.8 3.7 3175956 37508 ? Sl 11:06 0:12 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r0 --fork --port <span style="background-color: rgb(255, 0, 0);">28010</span> --dbpath=/usr/local/mongodb/r0 --logpath=/usr/local/mongodb/log/r0.log --logappend root 7461 0.7 3.7 3144172 37764 ? Sl 11:06 0:11 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r1 --fork --port 28011 --dbpath=/usr/local/mongodb/r1 --logpath=/usr/local/mongodb/log/r1.log --logappend root 28166 0.6 3.8 3144152 38520 ? Sl 11:10 0:08 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r2 --fork --port 28012 --dbpath=/usr/local/mongodb/r2 --logpath=/usr/local/mongodb/log/r2.log --logappend root 30833 0.0 0.0 103244 832 pts/1 S+ 11:31 0:00 grep mongod [root@localhost bin]# <span style="background-color: rgb(255, 0, 0);">kill -2 6658</span> [root@localhost bin]# ps aux|grep mongod root 7461 0.7 3.7 3158520 37960 ? Sl 11:06 0:11 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r1 --fork --port 28011 --dbpath=/usr/local/mongodb/r1 --logpath=/usr/local/mongodb/log/r1.log --logappend root 28166 0.6 3.8 3154396 38616 ? Sl 11:10 0:08 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r2 --fork --port 28012 --dbpath=/usr/local/mongodb/r2 --logpath=/usr/local/mongodb/log/r2.log --logappend root 30869 0.0 0.0 103244 832 pts/1 S+ 11:31 0:00 grep mongod [root@localhost bin]#此时通过28011端口连接mongod服务并查看复制集状态
rs1:PRIMARY> rs.status() { "set" : "rs1", "date" : ISODate("2015-01-16T03:33:15Z"), "myState" : 1, "members" : [ { "_id" : 0, "name" : "<span style="background-color: rgb(255, 0, 0);">localhost:28010</span>", "health" : 0, <span style="background-color: rgb(255, 0, 0);">"state" : 8,</span> <span style="background-color: rgb(255, 0, 0);">"stateStr" : "(not reachable/healthy)",</span> "uptime" : 0, "optime" : Timestamp(1421378555, 1), "optimeDate" : ISODate("2015-01-16T03:22:35Z"), "lastHeartbeat" : ISODate("2015-01-16T03:33:14Z"), "lastHeartbeatRecv" : ISODate("2015-01-16T03:31:50Z"), "pingMs" : 0 }, { "_id" : 1, "name" : <span style="color:#ff0000;">"localhost:28011</span>", "health" : 1, "state" : 1, <span style="background-color: rgb(255, 0, 0);">"stateStr" : "PRIMARY"</span>, "uptime" : 1608, "optime" : Timestamp(1421378555, 1), "optimeDate" : ISODate("2015-01-16T03:22:35Z"), "electionTime" : Timestamp(1421379114, 1), "electionDate" : ISODate("2015-01-16T03:31:54Z"), "self" : true }, { "_id" : 2, "name" : "localhost:28012", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 1358, "optime" : Timestamp(1421378555, 1), "optimeDate" : ISODate("2015-01-16T03:22:35Z"), "lastHeartbeat" : ISODate("2015-01-16T03:33:15Z"), "lastHeartbeatRecv" : ISODate("2015-01-16T03:33:13Z"), "pingMs" : 0, "lastHeartbeatMessage" : "syncing to: localhost:28011", "syncingTo" : "localhost:28011" } ], "ok" : 1 } rs1:PRIMARY>此时28010的状态变为了8,描述为不可达。健康状态为0,28011的状态变为了1,描述为PRIMARY,此时的架构为如下所示:
rs1:PRIMARY> use test switched to db test rs1:PRIMARY> db.student.insert({name:"lisi",age:20}); WriteResult({ "nInserted" : 1 }) rs1:PRIMARY> db.student.find(); { "_id" : ObjectId("54b883fb7bd891605d9c300f"), "name" : "zhangsan", "age" : 20 } { "_id" : ObjectId("54b8876aad5e04c1fe460154"), "name" : "lisi", "age" : 20 } rs1:PRIMARY>
郑重声明:本站内容如果来自互联网及其他传播媒体,其版权均属原媒体及文章作者所有。转载目的在于传递更多信息及用于网络分享,并不代表本站赞同其观点和对其真实性负责,也不构成任何其他建议。