Inspirations from Memcache may include the focus on simplicity, performance and a simple human readable protocol. As Yoshinori says, Kazuho Oku has already implemented a MySQLD-embedded Memcached server, no need to do it again. What's more, the Memcache protocol offers key-value functionality, whereas implementing a new protocol allows more functionality to be exposed.
The choice of name has come in for some flak. I believe the etymology is that HandlerSocket exposes the existing MySQL Handler interface directly over a separate socket. Perhaps a more exciting name will appear at some point, but looking at the MySQL Handler documentation gives a good background on the basis of the HandlerSocket Api.
HandlerSocket implements more than a Key-Value Api. It supports indexed data access in general including :
- Equality search on any index prefix (returning 0, 1 or more rows)
- Inequality search on any index prefix (returning 0, 1 or more rows)
SQL and non-SQL access to the same data
Yoshinori mentions inspiration from NdbApi for HandlerSocket - they wanted to get fast indexed access performance without excluding the possibility of performing ad-hoc SQL for reports etc. This has been one of the unique benefits of Ndb for some time - the ability to operate on the same underlying data via multiple Apis. Extending this to other MySQL engines (especially InnoDB) is a great idea. What is surprising is the difference a different access layer implementation can make to throughput. Who would have thought that so much performance could be consumed by parsing etc? I suspect that HandlerSocket may create a new benchmark for the MySQL team to optimise parsing towards in future.
As a MySQL daemon plugin, the HandlerSocket plugin gets to create threads running within a MySQLD server instance. These threads can then listen on network sockets and use the Storage Engine Api inside the server to perform primitive operations on Storage engines. From the point of view of the Storage engine, they are just client request handling threads in the Server accessing data concurrently with 'normal' SQL client threads.
Using the Storage Engine Api, HandlerSocket gets :
- Storage engine independence
- Concurrency control as implemented by the engine
- Index maintenance as implemented by the engine
- Constraint enforcement as implemented by the engine
- Engine features such as online backup, crash recovery, compression, encryption etc.
- SQL functionality (queries, joins, aggregation, UDFs etc.)
- Stored procedures
- Trigger activation
- Query cache
- ACL checks
- Views
The impressive published benchmarks use data which fits entirely in memory buffers, so that InnoDb need only write logs and checkpoints. HandlerSocket does not require that all data is in-cache, but the best performance will be achieved if this is the case.
Operation batching/pipelining
As the protocol description states, requests can be pipelined. There is no need for individual clients to make blocking synchronous DB requests. This can vastly increase throughput without excessively straining connection resources.
Often SQL DB access protocols do not make much use of the potential for batching requests on a single connection. JDBC supports batched updates and inserts, and appears to have support (not often talked about) for batched queries, but not every driver implements these and they're often not implemented in the same way. Often the SQL approach seems to be 'If you want different sets of data in one round trip, you need to find a way to get them all in one SQL statement'. This creates a false tension between reducing client-server round trips and avoiding complex joins and unnecessary unions. Decoupling query boundaries from thread blocking / flow of control changes removes this artificial tension and can simplify applications. A recent 'Facebook at Scale' talk describes how they use the MySQL Client's CLIENT_MULTI_STATEMENTS flag to decouple query boundaries and request latency. This pattern is one of the keys to implementing efficient NdbApi clients as well.
Commit grouping
Another HandlerSocket feature possibly inspired by Ndb is the commit grouping. Multiple user writes (it's not clear if/how transaction boundaries can be specified), are combined and committed to the engine together. This amortizes the commit cost across multiple operations. Where the engine performs expensive durability operations (e.g fsync) this can improve write throughput. Writes are group-committed to the Binlog as well, again in a similar way to Ndb. Another shared advantage of merging multiple client 'transactions' into fewer Binlog 'transactions' is that the Slave also gets to benefit from fewer, larger transactions for a given data change rate.
Impact
It's not clear yet what the impact of HandlerSocket will be. It enters a crowded market of SQL, NoSQL, NewSQL, DataGrid and other technologies. Competitors use JSON over HTTP, or Memcached protocols, support richer or simpler Apis, offer transparent sharding, eventually consistent replication or web-scale Erlang distributed Map Reduce. Perhaps HandlerSocket is too old-school?
I think it's a great technology, deserving of success. Coupled with an external sharding layer, it seems to offer a great way to improve the efficiency of MySQL scale out, without losing the ability to perform ad-hoc SQL etc. Time will tell.
No comments:
Post a Comment