2014-02-10

The final piece of the puzzle

I just pushed the new CREATE OR REPLACE TABLE syntax into MariaDB 10.0, for the soon to be released 10.0.8-gamma (RC). (Before we had only CREATE OR REPLACE for views)

When using the new syntax, the CREATE statement will automatically DROP the old table if it existed.

This is the last feature (which is also a bug fix) depending on me that needed to be pushed before we could release 10.0 gamma (RC). Next, I will start working on speed optimizations and features in 10.1.

The CREATE OR REPLACE TABLE syntax was needed to make global transaction id (GTID) work reliably with CREATE ... SELECT, both in statement-based and row-based replication.

We (Kristian Nielsen and I) didn't think that the solution used in MySQL 5.6 (to give an error message "CREATE TABLE ... SELECT is forbidden when @@GLOBAL.ENFORCE_GTID_CONSISTENCY = 1") when using CREATE ... SELECT was good enough. We wanted something better.

The solution now implemented ensures that we can store DROP TABLE + CREATE TABLE + INSERT INTO TABLE under one GTID.  The GTID entry can also be re-executed in case of slave failure during execution.

While developing CREATE OR REPLACE, I noticed several possible problems in the replication code that were not properly taken care of (neither in MySQL or MariaDB):
  • Having different storage engines on master and slave for any table would not work well together with the GTID code and would cause inconsistencies between GTID's in the master and slave.
  • If CREATE SELECT would fail on the slave, there was no way the slave could continue as it could not roll back the CREATE statement.
  • Slave failure during a DROP TABLE would make the slave stop and it would be unable to restart without user intervention.
  • Having different replication modes on master and slave (like statement based on master and row based on slave) would cause inconsistencies in GTID generation.
To fix these and make the slave more robust, I introduced the following things:
  • While the slave is running a transaction, it will treat all tables as transactional tables when it comes to the caching of statements for the binary log.
  • Commits will only happen when the binary log says so.  This ensures that the slave will log and commit changes in the same order as the master, independent of the storage engine used.
  • CREATE is replayed on the slave as CREATE OR REPLACE.  This makes CREATE SELECT statements repeatable on the slave.
  • DROP TABLE statements are replayed on the slave as DROP TABLE IF EXISTS.  This makes DROP TABLE statements repeatable on the slave.
  • One can now have a mix of DDL and DML statments in the binary log (we use this fact to handle CREATE ... SELECT which is logged as BEGIN ; DROP; CREATE ; INSERTS ; COMMITS). This can be very useful also for other things in the future.
The end effect of the above is that the slave in 10.0.8 is going to be more robust than ever before.  In addition, the replication mode will not affect how GTID's are generated anymore.

I also added a variable 'slave-ddl-exec-mode' that one can set to STRICT if one prefers the old behavior that the the slave will fail if the DDL would fail on the slave for any reason, including if it fails to repeat a command on restart.

As a bonus, I also fixed that if one used LOCK TABLES with CREATE OR REPLACE TABLE, the lock will be held while the table is deleted and re-created and the lock is then added to the new table. This makes it possible to replace a table with an empty one without other users noticing it.

Here is an extract from the documentation of CREATE OR REPLACE :

The CREATE OR REPLACE TABLE syntax was added in MariaDB 10.0.8 to make replication more robust if it has to rollback and repeat statements like CREATE ... SELECT on slaves.

CREATE OR REPLACE TABLE table_name (a int);

is basically the same as:

DROP TABLE IF EXISTS table_name;
CREATE TABLE table_name (a int);


with the following exceptions:
  • If table_name was locked with LOCK TABLES it will continue to be locked after the statement.
  • Temporary tables are only dropped if the TEMPORARY key word was used. (With DROP TABLE temporary tables are preferred to be dropped before normal tables).