Monday, 27 September 2010

Some MySQL projects I think are cool - OpenQuery Graph Engine (OQG)

This project was announced a year or so ago by Antony Curtis who used to work for MySQL AB. Having met Antony a few times I was intrigued to see what he was up to. The quote on the OpenQuery website describes it well :
The Open Query GRAPH engine (OQGRAPH) is a computation engine allowing hierarchies and more complex graph structures to be handled in a relational fashion. In a nutshell, tree structures and friend-of-a-friend style searches can now be done using standard SQL syntax, and results joined onto other tables.

That sounds cool, and it's the first time I've heard of a MySQL 'Computation engine' plugin. Delving further into the manual gives some insight, and there's some unexpected twists there :
  • OQG is a storage engine, but data stored is not persistent w.r.t. server crashes.
  • All tables have the same schema, storing details of graph 'edges'.
  • The fixed schema has a magic column called 'latch'
  • Depending on the constant value of latch used in a SELECT statement on the table, the engine will return different 'pseudo results'.
The last fact is the coolest one. As far as I understand it :
  • SELECT where latch = NULL AND ... allows queries on the graph as though it were a list of edges (as the data was entered).
  • SELECT where latch = 0 [AND ...] allows queries on the graph as though it were a list of nodes.
  • SELECT where latch = 1 [AND ...] allows Dijkstra's shortest path algorithm to be applied to the graph
  • SELECT where latch = 2 [AND ...] allows a breadth-first search to be applied to the graph
This is a superb hack! I imagine the OQG engine internally has an in-memory graph structure which is maintained as edges are added via the INSERT Api. The SELECT Api then gives access to different views of the underlying graph and even allows complex parameterised functions to be applied to the graph, giving results as a set of rows which can be decoded into the required result. It's not pretty, but it's an extremely pragmatic approach to embedding graph access and operations within a database.

It's also undeniable that the use of magic numbers and the 'latch' column adds a certain arcane wackiness that charms this reader. It's definitely a MySQL-style solution, continuing the tradition of MyISAM, Blackhole, Federated etc where good-enough gets to market before best, and 20% of the implementation effort delivers 80% of the functionality.

Once again I'd be interested to hear about how this is actually being used, and what sort of difference it is making.

Each of these three cool projects enable new solutions individually and expand the dimensions of what is possible using MySQL. In combination they open up a vast expanse of potential. One of the best things about them is that they all happened outside the confines of MySQL / Sun / Oracle. Hopefully they will get the success they deserve so that we can have more cool new projects in future.

Some MySQL projects I think are cool - Spider Storage Engine

One thing that has puzzled me about MySQL Server is that it became famous for sharded scale-out deployments in well known web sites and yet has no visible support for such deployments. The MySQL killer feature for some time has been built-in asynchronous replication and gigabytes of blogs have been written about how to setup, use, debug and optimise replication, but when it comes to 'sharding' there is nothing built in. Perhaps to have attempted to implement something would have artificially constrained user's imaginations, whereas having no support at all has allowed 1,000 solutions to sprout? Perhaps there just wasn't MySQL developer bandwidth available, or perhaps it just wasn't the best use of the available time. In any case, it remains unclaimed territory to this day.

On first hearing of the Federated storage engine some years ago, I mistakenly assumed that this could be the basis of some MySQL scale-out solution. Perhaps a layer of front end 'proxy' MySQLDs could federate tables from a layer of backend MySQLDs giving some level of distribution transparency to sharded data. However as I found out, the Federated engine was not designed with such a scenario in mind. It has a certain internal elegance and simplicity, but unfortunately it is a little too simple for anything other than light duties.

The Spider storage engine extends the Federated concept of a table definition being a 'link' to a table on a remote MySQL server. However, it also integrates with the table partitioning features of MySQL 5.1, allowing each partition of a table to be specified as a 'link' to a table on a remote MySQL server. This effectively allows the built-in partitioning mechanisms of MySQLD (PARTITION BY RANGE/LIST/HASH) to be used to shard/partition rows across multiple MySQL servers transparently.

One of the major drawbacks of the Federated engine was that it had very little support for 'pushing conditions' to the MySQLD instance storing the source tables. This meant that well behaved selective queries issued on the 'front-end' MySQLD instance could result in non-selective queries being issued to the 'back-end' MySQLD instances, and large volumes of data being unnecessarily transferred back to the 'front-end' MySQLD where query processing then discarded it.

Spider attempts to improve this situation by pushing conditions down to the MySQLDs containing the source data. Combined with the partition pruning available from the MySQLD partitioning engine this should significantly reduce the amount of redundant data transferred in some cases.

So I think Spider is a pretty cool project. Like MySQL Cluster, it bears the burden of making MySQLD more data-distribution-aware and I think they're doing great work. It'd be great to hear stories about how Spider is being used, especially if anyone is using it *with* MySQL Cluster.

Some MySQL projects I think are cool - Flexviews

Most of the time we think of SQL queries as being executed at a point in time and generating a single definitive result, but huge efficiency gains are available when data changes are tracked and derived views are partially updated as needed rather than being fully recomputed periodically. MySQL has support for views on tables, but there is currently no support for materialized views. While thinking about this topic I decided to have another look at Justin Swanhart's Flexviews tool and it's definitely a cool MySQL based project.

Flexviews is an open source set of non-intrusive addons to MySQL enabling materialized views to be defined and maintained as the underlying tables are changed. If you're not sure what a materialized view is or why they can be useful then I recommend reading the intro on the Flexviews site. I was particularly impressed by the documented support for GROUP BY, aggregates and joins.

I have a vague recollection of reading a blog post about an early version of Flexviews which used MySQL triggers to collect data changes on underlying tables, and feeling that it was probably a little flaky. However I now read that the 'Change Data Capture' in recent versions has been factored out into a separate tool called FlexCDC which 'mines' Row-based Binlog entries. This is a far more promising approach, and useful for many other applications. From my own point of view, it makes Flexviews potentially useful for maintaining materialized views of data stored in MySQL Cluster, where Binlog is the only centralised record of change. It also got me thinking that MySQL Cluster already has code inside the Server listening to data changes for writing the Binlog, which could be extended to capture data changes into some other tables if it were a useful feature.

The fact that Flexviews is implemented without Server changes or hooks is impressive and it's a great example of a MySQL 'ecosystem' project. It would be great to read some blog entries about how people are using it and what it is doing for them.

Thursday, 25 March 2010

ACID tradeoffs, modularity, plugins, Drizzle

Most software people are aware of the ACID acronym coined by Jim Gray. With the growth of the web and open source, the scaling and complexity constraints imposed on DBMS implementations supporting ACID are more visible, and new (or at least new terms for known) compromises and tradeoffs are being discussed widely. The better known NoSQL systems are giving insight by example into particular choices of tradeoffs.

Working at MySQL, I have often been surprised at the variety of potential alternatives when implementing a DBMS, and the number of applications which don't need the full set of ACID letters in the strictest form. The original MySQL storage engine, MyISAM is one of the first and most successful examples of an 'ACID remix'. The people drawn to DBMS development work often have a perfectionist streak, which can cause them to tend to prefer 'nothing' over 'imperfect'. MyISAM was and still is a flag-bearer for 'good enough'. Perhaps we should be less modest and call it 'more than good enough'.

One seldom discussed benefit of MySQL's storage engine architecture is that pressure to make 'The One True Storage Engine' is reduced. DBMS products with one fixed database engine need to optimise for all supported use cases. This is a great engineering challenge, but increases design effort, requirements for configuration and auto-tuning, constraints on any design change or reoptimisation etc. With MySQL, there are multiple existing storage engines, each with a (sub)set of target use-cases in mind. A single MySQL server can maintain and access tables in different storage engines, each tuned as closely as possible to the use-case for the data, without adding complexity to unrelated engines. Engines can be wildly optimised for a narrow use case as there are plausible alternative engines available for other use cases.

I understand that one aim of the Drizzle project is to extend the modularity of the MySQL Server on multiple axes, allowing diversity to flourish. As a one-time Java coder, who enjoyed the pleasures of design-by-interface, I can see the attraction. While the effort is guided by an actual need for modularity and real examples of alternative plugins, it can be a great force multiplier. There is always the risk of modularity for its own sake - a branch of Architecture Astronautics. Sure symptoms, which I may have suffered from in the past, include the class names FactoryFactory..., PolicyPolicy, or [Anything]Broker).

Another good vibe from Drizzle is the microkernel concept, although would say that there's some terminological abuse occurring here! Perhaps it could more reasonably be said that MySQL has a TeraKernel and Drizzle has a MegaKernel? In any case the motivations are good. Decoupling the huge chunks of functionality glued together inside MySQLD is great for long term software integrity, understanding dependencies, finding (and introducing) bugs, and might make it easier to start adding functionality again. Replication seems especially ripe ground for alternative plugins. User authentication is another often requested 'chunk'. It will take longer to crystalise interfaces for more deeply embedded areas like the query Optimizer/Executor, but if these interfaces are arising from a real need then that can drive the API design.

One aspect of storage engine modularity that is not often mentioned is that some MySQL storage engines also moonlight with other products. The Berkeley database (BDB) is probably the oldest and most promiscuous, embedded in DNS daemons, LDAP servers and all sorts of other places. Ndb is unusual in that it can be used from separate MySQLD and other NdbApi processes at the same time. InnoDB has also recently added an embedded variant. This trend will accelerate, especially when some of the distributed NoSQL systems start supporting 'pluggable local storage' APIs. I imagine that a NoSQL local storage engine API could be somewhat simpler to implement than the MySQL SE API, at least to start with!