Twemproxy, a Redis proxy from Twitter-mysql教程-PHP中文網

首頁

資料庫

mysql教程

Twemproxy, a Redis proxy from Twitter

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 07, 2016 pm 04:36 PM

fromproxyredis

While a big number of users use large farms of Redis nodes, from the point of view of the project itself currently Redis is a mostly single-instance business.

I've big plans about going distributed with the project, to the extent that I'm no longer evaluating any threaded version of Redis: for me from the point of view of Redis a core is like a computer, so that scaling multi core or on a cluster of computers is the same conceptually. Multiple instances is a share-nothing architecture. Everything makes sense AS LONG AS we have a *credible way* to shard :-)

This is why Redis Cluster will be the main focus of 2013 for Redis, and finally, now that Redis 2.6 is out and is showing to be pretty stable and mature, it is the right moment to focus on Redis Cluster, Redis Sentinel, and other long awaited improvements in the area of replication (partial resynchronization).

However the reality is that Redis Cluster is not yet production ready and requires months of work. Still our users already need to shard data on multiple instances in order to distribute the load, and especially in order to use many computers to get a big amount of RAM ready for data.

The sole option so far was client side sharding. Client side sharding has advantages as there are no intermediate layers between clients and nodes, nor routing of request, so it is a very scalable setup (linearly scalable, basically). However to implement it reliably requires some tuning, a way to take clients configuration in sync, and the availability of a solid client with consistent hashing support or some other partitioning algorithm.

Apparently there is a big news in the landscape, and has something to do with Twitter, where one of the biggest Redis farms deployed happen to serve timelines to users. So it comes as no surprise that the project I'm talking about in this blog post comes from the Twitter Open Source division.

Twemproxy
---

Twemproxy is a fast single-threaded proxy supporting the Memcached ASCII protocol and more recently the Redis protocol:

https://github.com/twitter/twemproxy

It is written entirely in C and is licensed under the Apache 2.0 License.
The project works on Linux and AFAIK can't be compiled on OSX because it relies on the epoll API.

I did my tests using my Ubuntu 12.04 desktop.

But well, I'm still not saying anything useful. What twemproxy does actually? (Note: I'll focus on the Redis part, but the project is also able to do the same things for memcached as well).

1) It works as a proxy between your clients and many Redis instances.
2) It is able to automatically shard data among the configured Redis instances.
3) It supports consistent hashing with different strategies and hashing functions.

What's awesome about Twemproxy is that it can be configured both to disable nodes on failure, and retry after some time, or to stick to the specified keys -> servers map. This means that it is suitable both for sharding a Redis data set when Redis is used as a data store (disabling the node ejection), and when Redis is using as a cache, enabling node-ejection for cheap (as in simple, not as in bad quality) high availability.

The bottom line here is: if you enable node-ejection your data may end into other nodes when a node fails, so there is no guarantee about consistency. On the other side if you disable node-ejection you need to have a per-instance high availability setup, for example using automatic failover via Redis Sentinel.

Installation
---

Before diving more inside the project features, I've good news, it is trivial to build on Linux. Well, not as trivial as Redis, but… you just need to follow those simple steps:

apt-get install automake
apt-get install libtool
git clone git://github.com/twitter/twemproxy.git
cd twemproxy
autoreconf -fvi
./configure --enable-debug=log
make
src/nutcracker -h

It is pretty trivial to configure as well, and there is sufficient documentation in the project github page to have a smooth first experience. For instance I used the following configuration:

redis1:
listen: 0.0.0.0:9999
redis: true
hash: fnv1a_64
distribution: ketama
auto_eject_hosts: true
timeout: 400
server_retry_timeout: 2000
server_failure_limit: 1
servers:
- 127.0.0.1:6379:1
- 127.0.0.1:6380:1
- 127.0.0.1:6381:1
- 127.0.0.1:6382:1

redis2:
listen: 0.0.0.0:10000
redis: true
hash: fnv1a_64
distribution: ketama
auto_eject_hosts: false
timeout: 400
servers:
- 127.0.0.1:6379:1
- 127.0.0.1:6380:1
- 127.0.0.1:6381:1
- 127.0.0.1:6382:1

Basically the first cluster is configured with node ejection, and the second as a static map among the configured instances.

What is great is that you can have multiple setups at the same time possibly involving the same hosts. However for production I find more appropriate to use multiple instances to use multiple cores.

Single point of failure?
---

Another very interesting thing is that, actually, using this setup does not mean you have a single point of failure, since you can run multiple instances of twemproxy and let your client connect to the first available.

Basically what you are doing with twemproxy is to separate the sharding logic from your client. At this point a basic client will do the trick, sharding will be handled by the proxy.

It is a straightforward but safe approach to partitioning IMHO.

Currently that Redis Cluster is not available, I would say, it is the way to go for most users that want a cluster of Redis instances today. But read about the limitations before to get too excited ;)

Limitations
---

I think that twemproxy do it right, not supporting multiple keys commands nor transactions. Currently is AFAIK even more strict than Redis Cluster that instead allows MULTI/EXEC blocks if all the commands are about the same key.

But IMHO it's the way to go, distribute the subset you can distribute efficiently, and pose this as a design challenge early to the user, instead to invest a big amount of resources into "just works" implementations that try to aggregate data from multiple instances, but that will hardly be fast enough once you start to have serious loads because of too big constant times to move data around.

However there is some support for commands with multiple keys. MGET and DEL are handled correctly. Interestingly MGET will split the request among different servers and will return the reply as a single entity. This is pretty cool even if I don't get the right performance numbers with this feature (see later).

Anyway the fact that multi-key commands and transactions are not supported it means that twemproxy is not for everybody, exactly like Redis Cluster itself. Especially since apparently EVAL is not supported (I think they should support it! It's trivial, EVAL is designed to work in a proxy like that because key names are explicit).

Things that could be improved
---

Error reporting is not always stellar. Sending a non supported command closes the connection. Similarly sending just a "GET" from redis-cli does not report any error about bad number of arguments but hangs the connection forever.

However other errors from the server are passed to the client correctly:

redis metal:10000> get list
(error) WRONGTYPE Operation against a key holding the wrong kind of value

Another thing that I would love to see is support for automatic failover. There are many alternatives:

1) twemproxy is already able to monitor instance errors, count the number of errors, and eject the node when enough errors are detected. Well it is a shame it is not able to take slave nodes as alternatives, and instead of eject nodes use the alternate nodes just after sending a SLAVE OF NOONE command. This would turn it into an HA solution as well.

2) Or alternatively, I would love if it could be able to work in tandem with Redis Sentinel, checking the Sentinel configuration regularly to upgrade the servers table if a failover happened.

3) Another alternative is to provide a way to hot-configure twemproxy so that on fail overs Sentinel could switch the configuration of the proxy ASAP.

There are many alternatives, but basically, some support for HA could be great.

Performances
---

This Thing Is Fast. Really fast, it is almost as fast as talking directly with Redis. I would say you lose 20% of performances at worst.

My only issue with performances is that IMHO MGET could use some improvement when the command is distributed among instances.

After all if the proxy has similar latency between it and all the Redis instances (very likely), if the MGETs are sent at the same time, likely the replies will reach the proxy about at the same time. So I expected to see almost the same numbers with an MGET as I see when I run the MGET against a single instance, but I get only 50% of the operations per second. Maybe it's the time to reconstruct the reply, I'm not sure.

Conclusions
---

It is a great project, and since Redis Cluster is yet not here, I strongly suggest Redis users to give it a try.

Personally I'm going to link it in some visible place in the Redis project site. I think the Twitter guys here provided some real value to Redis itself with their project, so…

Kudos! Comments

陳述

本文內容由網友自願投稿，版權歸原作者所有。本站不承擔相應的法律責任。如發現涉嫌抄襲或侵權的內容，請聯絡admin@php.cn

MySQL與Sqlite有何不同？Apr 24, 2025 am 12:12 AM

MySQL和SQLite的主要區別在於設計理念和使用場景：1.MySQL適用於大型應用和企業級解決方案，支持高性能和高並發；2.SQLite適合移動應用和桌面軟件，輕量級且易於嵌入。

MySQL中的索引是什麼？它們如何提高性能？Apr 24, 2025 am 12:09 AM

MySQL中的索引是數據庫表中一列或多列的有序結構，用於加速數據檢索。 1）索引通過減少掃描數據量提升查詢速度。 2）B-Tree索引利用平衡樹結構，適合範圍查詢和排序。 3）創建索引使用CREATEINDEX語句，如CREATEINDEXidx_customer_idONorders(customer_id)。 4）複合索引可優化多列查詢，如CREATEINDEXidx_customer_orderONorders(customer_id,order_date)。 5）使用EXPLAIN分析查詢計劃，避

說明如何使用MySQL中的交易來確保數據一致性。Apr 24, 2025 am 12:09 AM

在MySQL中使用事務可以確保數據一致性。 1)通過STARTTRANSACTION開始事務，執行SQL操作後用COMMIT提交或ROLLBACK回滾。 2)使用SAVEPOINT可以設置保存點，允許部分回滾。 3)性能優化建議包括縮短事務時間、避免大規模查詢和合理使用隔離級別。

在哪些情況下，您可以選擇PostgreSQL而不是MySQL？Apr 24, 2025 am 12:07 AM

選擇PostgreSQL而非MySQL的場景包括：1)需要復雜查詢和高級SQL功能，2)要求嚴格的數據完整性和ACID遵從性，3)需要高級空間功能，4)處理大數據集時需要高性能。 PostgreSQL在這些方面表現出色，適合需要復雜數據處理和高數據完整性的項目。

如何保護MySQL數據庫？Apr 24, 2025 am 12:04 AM

MySQL數據庫的安全可以通過以下措施實現：1.用戶權限管理：通過CREATEUSER和GRANT命令嚴格控制訪問權限。 2.加密傳輸：配置SSL/TLS確保數據傳輸安全。 3.數據庫備份和恢復：使用mysqldump或mysqlpump定期備份數據。 4.高級安全策略：使用防火牆限制訪問，並啟用審計日誌記錄操作。 5.性能優化與最佳實踐：通過索引和查詢優化以及定期維護兼顧安全和性能。

您可以使用哪些工具來監視MySQL性能？Apr 23, 2025 am 12:21 AM

如何有效監控MySQL性能？使用mysqladmin、SHOWGLOBALSTATUS、PerconaMonitoringandManagement(PMM)和MySQLEnterpriseMonitor等工具。 1.使用mysqladmin查看連接數。 2.用SHOWGLOBALSTATUS查看查詢數。 3.PMM提供詳細性能數據和圖形化界面。 4.MySQLEnterpriseMonitor提供豐富的監控功能和報警機制。

MySQL與SQL Server有何不同？Apr 23, 2025 am 12:20 AM

MySQL和SQLServer的区别在于：1)MySQL是开源的，适用于Web和嵌入式系统，2)SQLServer是微软的商业产品，适用于企业级应用。两者在存储引擎、性能优化和应用场景上有显著差异，选择时需考虑项目规模和未来扩展性。

在哪些情況下，您可以選擇SQL Server而不是MySQL？Apr 23, 2025 am 12:20 AM

在需要高可用性、高級安全性和良好集成性的企業級應用場景下，應選擇SQLServer而不是MySQL。 1)SQLServer提供企業級功能，如高可用性和高級安全性。 2)它與微軟生態系統如VisualStudio和PowerBI緊密集成。 3)SQLServer在性能優化方面表現出色，支持內存優化表和列存儲索引。

See all articles