Tuesday, December 29, 2015

On ProxySQL, MaxScale, Persistent Connection, response time, and bugs

A few days ago I came across the announcement that MaxScale 1.3 finally supports Persistent Connection.
ProxySQL has supported persistent connections since it was a prototype (sometime back in 2013), therefore I am very happy that the MaxScale team has finally introduced this long-awaited feature.
However, MaxScale's implementation of persistent connections has a serious drawback that I would consider a serious bug (more details at the end of the article). A bug so serious that it shouldn't reach any GA release.

Since I like running benchmarks, and given the new feature in MaxScale, I thought it was a good time to compare ProxySQL and MaxScale again, around 6 months after a previous benchmark.

Benchmark : ProxySQL vs MaxScale


Benchmark setup


The benchmark setup is very simple:
- a physical server (20 CPU cores) running sysbench , proxysql and maxscale
- a physical server (20 CPU cores) running 3 mysqld instances (1 master and 2 slaves) launched using MySQL Sandbox

Software version used:
- MySQL 5.6.28
- sysbench 0.4.12
- ProxySQL v1.1.0 stable
- MaxScale 1.3.0-beta

Configuration files can be found at the following URLs:
ProxySQL
MaxScale

ProxySQL configuration was completed by running these commands through the admin interface:

INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply) VALUES(1,1,'^SELECT.*FOR UPDATE$', 1, 1);
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply) VALUES(2,1,'^SELECT', 2, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
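
To double check that the rules were loaded as intended, they can simply be read back from the admin interface before running the benchmark; a minimal sanity check (the hostgroup numbers 1 and 2 mirror the rules above):

SELECT rule_id, active, match_pattern, destination_hostgroup, apply FROM mysql_query_rules ORDER BY rule_id;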


Preliminary benchmarks confirm the conclusion of my previous blog post : MaxScale is very CPU intensive, therefore to make a fair comparison between the two proxies I ran both of them with only 1 worker thread.
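
For reference, this is roughly how both proxies can be limited to a single worker thread; the option names below are stated from memory and should be treated as assumptions to verify against each project's documentation for the versions used here:

# maxscale.cnf (assumed option name)
[maxscale]
threads=1

# /etc/proxysql.cnf (assumed option name)
mysql_variables=
{
        threads=1
}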


Benchmarks were executed by running sysbench against the local proxy (either ProxySQL or MaxScale), using this command:

sysbench --max-requests=0 --test=oltp --mysql-user=rcannao --mysql-password=rcannao \
--mysql-db=test --oltp-table-size=1000000 --oltp-read-only=on --oltp-point-selects=1 \
--oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 \
--oltp-skip-trx=off --db-ps-mode=disable --max-time=60 \
--oltp-reconnect-mode=transaction --mysql-host=10.1.1.164 \
--num-threads=$thr  --mysql-port=$port run

What is important to note is that the workload is read-only, it performs only point selects, and it reconnects at the end of each transaction: this is a workload meant to measure the performance benefit of Persistent Connection.
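
Since the command above uses $thr and $port as shell variables, the different concurrency levels and the two proxies can be driven by a small wrapper; a minimal sketch, where the port numbers are only placeholders for whatever ProxySQL and MaxScale are actually listening on in this setup:

#!/bin/sh
# Placeholder ports: adjust to the listeners configured for ProxySQL and MaxScale.
for port in 6033 4006 ; do
  for thr in 1 4 16 64 256 1024 ; do
    sysbench --max-requests=0 --test=oltp --mysql-user=rcannao --mysql-password=rcannao \
    --mysql-db=test --oltp-table-size=1000000 --oltp-read-only=on --oltp-point-selects=1 \
    --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 \
    --oltp-skip-trx=off --db-ps-mode=disable --max-time=60 \
    --oltp-reconnect-mode=transaction --mysql-host=10.1.1.164 \
    --num-threads=$thr --mysql-port=$port run
  done
done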


The benchmark will compare:
- ProxySQL configured with read/write split
- MaxScale with readwritesplit module (RW)
- MaxScale with readconnroute module (RR)
- MaxScale with readwritesplit module and persistent connection (RW-PC)
- MaxScale with readconnroute module and persistent connection (RR-PC)


Benchmark result



Here is the graph of the benchmark results for throughput:



There is a lot of information here, but also some unanswered questions.
At very low concurrency, ProxySQL is slightly slower.
At 64 connections, ProxySQL and MaxScale RW-PC have very similar throughput, and that is great since these two configurations have similar behaviors.
Also at 64 connections, it seems that MaxScale without Persistent Connection has reached its maximum throughput: since throughput with Persistent Connection is higher, we can already conclude that this feature is indeed useful and improves performance. MaxScale RR-PC continues to give more throughput than the others, but this is expected as this routing module is very simple.

At 256 connections, ProxySQL and MaxScale RR-PC are the only two whose throughput continues to grow. That means the other configurations have saturated 1 core and are unable to scale any further, while ProxySQL keeps providing all its features and scales with just 1 core.

At 1024 connections, all proxy configurations show a drop in performance. While the drop in ProxySQL is marginal, the drop in MaxScale is severe.
This confirms ProxySQL's ability to scale.


What about response time?





From this graph of response time we can note that at high concurrency ProxySQL is able to provide the best response time. Let's remove the response time for 1024 connections and compare at lower concurrency:



What about maximum response time? This is really interesting:



No blue columns ... did I forget to add the response time of ProxySQL? No, I didn't forget, but the max response time of MaxScale is too high for a proper comparison.
Starting at 64 connections, the maximum response time of MaxScale becomes so high that it reaches 60 seconds: this matches the max time in sysbench, therefore we should conclude that at least one (or perhaps more) connection created by sysbench isn't able to complete a transaction before the end of the benchmark.
This needs to be further validated by its developers, but it seems that at high concurrency (even if 64 connections shouldn't be considered "high") MaxScale is processing only a subset of the connections while completely ignoring others. If that's correct, this should be considered a serious bug.

For further testing, I reran sysbench with 256 connections (not a lot, at all!) against MaxScale for 600 seconds, and the max response time was 600 seconds: that is, at least one transaction (maybe more) wasn't processed until all the other transactions were completed.
I am sure nobody wants a transaction stuck for an undefined amount of time because the proxy is busy processing other transactions.

For reference, here is the graph of max response times without the off-the-charts values:





Severe bug in Persistent Connection


The release notes of MaxScale 1.3.0 hide a very important detail that is available only in the Administration Tutorial, which I report here for reference:

Please note that because persistent connections have previously been in use, they may give a different environment from a fresh connection. For example, if the previous use of the connection issued "use mydatabase" then this setting will be carried over into the reuse of the same connection. [...] In exceptional cases this feature could be a problem.

If I read it correctly, this is not a feature but a severe series of bugs.
More details below.

MaxScale doesn't track current schema


$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE()" test
+------------+
| DATABASE() |
+------------+
| test       |
+------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE()" mysql
+------------+
| DATABASE() |
+------------+
| test       |
+------------+

I assume nobody wants this to happen: the second client believes it is connected to schema "mysql", but in reality it is connected to schema "test".
Unless your application uses only one schema, I strongly discourage the use of persistent connections.

MaxScale doesn't track charset and returns incorrect encoding


Let's try to identify another possible problem:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE(), @@session.character_set_client" mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | utf8                           |
+------------+--------------------------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SET NAMES latin1" mysql

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE(), @@session.character_set_client" mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | latin1                         |
+------------+--------------------------------+

In this example the current database is always incorrect (as already pointed out previously), but the character set is also compromised. This can be a serious issue for many applications, as MaxScale is ignoring the charset specified by the client.
That is, MaxScale ignores both the schemaname and the charset specified during the initial handshake.


MaxScale doesn't track autocommit


The same applies to autocommit ...

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SET autocommit=0"

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+


Here we can see another major issue/bug: an application could issue statements assuming autocommit=ON (the default) while in reality another client could have changed it.

MaxScale doesn't track transactions


I think this is perhaps the most serious bug in how Persistent Connections are implemented in MaxScale.
Without MaxScale, when a client disconnects its transaction is rolled back.
Let's see what happens with MaxScale and Persistent Connection.

First, we create a transaction the way many applications do: SET autocommit=0 , followed by any DML :

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+
$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SET autocommit=0"

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+

So far, we have modified autocommit in a Persistent connection.
Second, let's run some SELECT statements:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT COUNT(*) FROM sbtest" test
ERROR 1046 (3D000) at line 1: No database selected

Oops, an error ... I forgot that MaxScale ignores my request for a default schema ... Now I must specify it in the query itself!

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT COUNT(*) FROM test.sbtest" test
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT COUNT(*) FROM test.sbtest WHERE id < 1000" test
+----------+
| COUNT(*) |
+----------+
|      999 |
+----------+

All looks good so far. Let me check if there are active transactions:
 
$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE
$

No active transactions, that's good. Now, let's run a DML statement ...
 
$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "DELETE FROM test.sbtest WHERE id < 1000"
$

Let me check again if there are active transactions ...

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE
---TRANSACTION 2253315, ACTIVE 29 sec
$

Here is the bug! The client that issued the DML statement and started the transaction is gone/disconnected, yet MaxScale is holding the transaction open.
The bad news is that MaxScale doesn't track transactions, no matter whether they are started implicitly due to autocommit=0 or via an explicit START TRANSACTION. Here is an example (after restarting MaxScale):

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "START TRANSACTION; SELECT 1 FROM test.sbtest LIMIT 1" test
+---+
| 1 |
+---+
| 1 |
+---+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE
---TRANSACTION 2253317, ACTIVE 2 sec

$


ProxySQL provides a safe environment with Persistent Connection


By comparison, ProxySQL has a more mature implementation of Persistent Connection, and keeps track of the environment set by the client, ensuring that the environments of backend and frontend match.

ProxySQL tracks current schema


ProxySQL isn't affected by the same bug as MaxScale, and correctly tracks the schema specified by the client:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE()" test
+------------+
| DATABASE() |
+------------+
| test       |
+------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE()" mysql
+------------+
| DATABASE() |
+------------+
| mysql      |
+------------+

ProxySQL tracks character set


ProxySQL isn't affected by the same bug as MaxScale, and correctly tracks the character set specified by the client:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE(), @@session.character_set_client" mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | utf8                           |
+------------+--------------------------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SET NAMES latin1" mysql

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE(), @@session.character_set_client" mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | utf8                           |
+------------+--------------------------------+

ProxySQL tracks autocommit


Also in this case, ProxySQL isn't affected by the same bug as MaxScale, and correctly tracks the value of autocommit specified by the client:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SET autocommit=0"

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+  

ProxySQL tracks transactions


Also in this case, ProxySQL isn't affected by the same bug as MaxScale, and correctly tracks transactions, terminating them if required:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "START TRANSACTION; SELECT 1 FROM test.sbtest LIMIT 1" test
+---+
| 1 |
+---+
| 1 |
+---+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE

$

 

Why does ProxySQL implement Persistent Connection better?


Since its initial implementation 2 years ago, ProxySQL was designed to handle frontends (clients) and backends (servers) as different entities, connecting them only when needed and removing the link between them as soon as possible.
When a client connects to ProxySQL, no connection to any backend is established. When a client issues a request, ProxySQL determines whether the request needs a connection to a backend, and only if required does it forward the request to one. As soon as the request is completed, ProxySQL determines whether the connection to the backend is still required, and if not it returns it to a connection pool.
That is: the connection to the backend is returned to the connection pool not when the client disconnects, but when the request is completed, assuming it can be reused by other clients (for example, if there are no active transactions).
Similarly, when a backend connection is linked to a client connection, ProxySQL ensures that the environment is set correctly: schema, charset, autocommit, etc.
In other words, ProxySQL doesn't just implement Persistent Connection, it also implements Connection Multiplexing. In fact, ProxySQL can handle hundreds of thousands of clients, yet forward all their traffic to just a few connections to the backends.
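
One way to observe this multiplexing in practice is to compare the number of connected clients with the number of backend connections actually in use; a simple sketch through the admin interface (assuming these stats columns are available in the version at hand):

mysql> SELECT hostgroup, srv_host, ConnUsed, ConnFree FROM stats_mysql_connection_pool;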

As a final note, I invite everybody to try ProxySQL, now GA, and feel free to contact me with any questions.

Tuesday, September 22, 2015

ProxySQL Announcement : version 0.2 is stable and production ready

A long time has passed since ProxySQL was first announced in October 2013.
A lot has changed since that day: ProxySQL was initially just a prototype, and had many, many limitations.
Quickly these limitations became a showstopper for further development, but that was expected from a simple prototype.
Since then it has been completely rewritten, and after almost 2 years I am glad to announce that ProxySQL is now a proxy with a long list of features, stable and production ready.
Please join me and attend my session at Percona Live, where I will present the new and exciting features of ProxySQL, a high availability proxy designed to solve real issues of MySQL setups, from small to very large production environments.

Tuesday, September 15, 2015

ProxySQL tutorial : seamless replication switchover in less than one second

In the previous blog post we set up a replication topology with one master and 2 slaves, and we installed proxysql on a fourth server where the application should connect.
The post was quite long, and we covered several topics: from installation and configuration of the cluster, to installation of proxysql and creation of users and backends, from configuration of query rules for routing and their statistics, to the monitoring module and replication lag.

But ProxySQL can do more than this.
Among the most interesting features of proxysql is the ability to hold traffic until a backend is available (within a configurable time limit).
For instance, if a query should be sent to hostgroup 0 but there are no servers in hostgroup 0, ProxySQL will wait until a server is available in hostgroup 0 or a timeout expires.
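
The maximum time a request is held waiting for a backend is controlled by a global variable; since the exact name may differ between ProxySQL versions, a practical way to find and tune it is to inspect the timeout-related variables from the admin interface, a sketch:

mysql> SELECT variable_name, variable_value FROM global_variables WHERE variable_name LIKE 'mysql-%timeout%';

and then, after adjusting the chosen variable with an UPDATE:

mysql> LOAD MYSQL VARIABLES TO RUNTIME;
mysql> SAVE MYSQL VARIABLES TO DISK;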

This feature has the very interesting effect that, using ProxySQL, it is possible to perform a planned failover without returning any error to the application.
In a nutshell, the process is the following:
  1. ProxySQL is configured to disable the master from hostgroup 0 ;
  2. the failover is performed ;
  3. ProxySQL is configured to demote the old master to become a slave, and the new master is moved to hostgroup 0.
Let's see these steps in detail.

1. ProxySQL is configured to disable the master from hostgroup 0


As an example, we can run something like this:

vagrant@app:~$ export OldMaster='192.168.124.101'
vagrant@app:~$ export NewMaster='192.168.124.102' 
vagrant@app:~$ mysql -u admin -padmin -h 127.0.0.1 -P6032 -e "UPDATE mysql_servers SET STATUS='OFFLINE_SOFT' WHERE hostname=\"$OldMaster\" AND hostgroup_id=0; LOAD MYSQL SERVERS TO RUNTIME;"

What the above command does is simple: the master is disabled (status=OFFLINE_SOFT).
At this stage, no more queries can be sent to the old master.


2. Perform the failover


The reason we configured GTID for replication is to simplify the failover process. In fact, we can use MySQL Utilities to perform the failover. Here is an example:

vagrant@app:~$ mysqlrpladmin --discover-slaves-login=root:root --rpl-user=repl:replpass --master=root:root@$OldMaster --new-master=root:root@$NewMaster --demote-master switchover
# Discovering slaves for master at 192.168.124.101:3306
# Discovering slave at 192.168.124.102:3306
# Found slave: 192.168.124.102:3306
# Discovering slave at 192.168.124.103:3306
# Found slave: 192.168.124.103:3306
# Checking privileges.
# Performing switchover from master at 192.168.124.101:3306 to slave at 192.168.124.102:3306.
# Checking candidate slave prerequisites.
# Checking slaves configuration to master.
# Waiting for slaves to catch up to old master.
# Stopping slaves.
# Performing STOP on all slaves.
# Demoting old master to be a slave to the new master.
# Switching slaves to new master.
# Starting all slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Switchover complete.
#
# Replication Topology Health:
+------------------+-------+---------+--------+------------+---------+
| host             | port  | role    | state  | gtid_mode  | health  |
+------------------+-------+---------+--------+------------+---------+
| 192.168.124.102  | 3306  | MASTER  | UP     | ON         | OK      |
| 192.168.124.101  | 3306  | SLAVE   | UP     | ON         | OK      |
| 192.168.124.103  | 3306  | SLAVE   | UP     | ON         | OK      |
+------------------+-------+---------+--------+------------+---------+
# ...done.
vagrant@app:~$


3. ProxySQL is configured to demote the old master to become a slave, and the new master is moved to hostgroup 0


Similarly, we can quickly reconfigure ProxySQL to perform the traffic switch: the old master is reconfigured as a slave and re-enabled, and the new master is moved to hostgroup 0.

vagrant@app:~$ (
> echo "UPDATE mysql_servers SET STATUS='ONLINE', hostgroup_id=1 WHERE hostname=\"$OldMaster\" AND hostgroup_id=0;"
> echo "UPDATE mysql_servers SET hostgroup_id=0 WHERE hostname=\"$NewMaster\" AND hostgroup_id=1;"
> echo "LOAD MYSQL SERVERS TO RUNTIME;"
> ) | mysql -u admin -padmin -h 127.0.0.1 -P6032
Warning: Using a password on the command line interface can be insecure.
vagrant@app:~$



Seems very easy, right?
Let's try to put everything together in a failover script.

vagrant@app:~$ cat ./switchover.sh
#!/bin/sh
if [ $# -ne 2 ]; then
  echo "Usage: ./switchover.sh oldmaster newmaster"
  exit 1
fi
OldMaster=$1
NewMaster=$2

(
echo "UPDATE mysql_servers SET STATUS='OFFLINE_SOFT' WHERE hostname=\"$OldMaster\" AND hostgroup_id=0;"
echo "LOAD MYSQL SERVERS TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032

sleep 1 # let's give some time to current transactions to complete

mysqlrpladmin --discover-slaves-login=root:root --rpl-user=repl:replpass \
  --master=root:root@$OldMaster --new-master=root:root@$NewMaster \
  --demote-master switchover

(
echo "UPDATE mysql_servers SET STATUS='ONLINE', hostgroup_id=1 WHERE hostname=\"$OldMaster\" AND hostgroup_id=0;"
echo "UPDATE mysql_servers SET hostgroup_id=0 WHERE hostname=\"$NewMaster\" AND hostgroup_id=1;"
echo "LOAD MYSQL SERVERS TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032

Note that the script above doesn't perform any sanity check or error handling. This is intentional, to keep the script short and easy to understand.
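
For production use you would at least want to abort (and re-enable the old master in ProxySQL) if the switchover itself fails; a minimal sketch of such a check, to be adapted to your environment:

mysqlrpladmin --discover-slaves-login=root:root --rpl-user=repl:replpass \
  --master=root:root@$OldMaster --new-master=root:root@$NewMaster \
  --demote-master switchover
if [ $? -ne 0 ]; then
  # switchover failed: put the old master back online in hostgroup 0 and stop here
  echo "switchover failed, re-enabling the old master" >&2
  echo "UPDATE mysql_servers SET STATUS='ONLINE' WHERE hostname=\"$OldMaster\" AND hostgroup_id=0; LOAD MYSQL SERVERS TO RUNTIME;" \
    | mysql -u admin -padmin -h 127.0.0.1 -P6032
  exit 1
fi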


Seamless replication switchover


At this point we have a script (switchover.sh) that is able to perform a switchover in a very short period of time.
Let's put it into action: while sysbench is running (below), in another terminal I will run:
./switchover.sh 192.168.124.102 192.168.124.101
and then:
./switchover.sh 192.168.124.101 192.168.124.102

Here is the output of sysbench while performing 2 switchovers :


vagrant@app:~$ sysbench --report-interval=1 \
--num-threads=4 --max-requests=0 --max-time=20 \
--test=/home/vagrant/sysbench-0.5/sysbench/tests/db/oltp.lua \
--mysql-user=utest --mysql-password=ptest \
--mysql-host=127.0.0.1 --mysql-port=6033 --mysql-db=test \
--oltp-read-only=off run
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4
Report intermediate results every 1 second(s)
Random number generator seed is 0 and will be ignored


Threads started!

[   1s] threads: 4, tps: 137.94, reads: 1958.22, writes: 551.78, response time: 42.52ms (95%), errors: 0.00, reconnects:  0.00
[   2s] threads: 4, tps: 156.01, reads: 2205.11, writes: 635.03, response time: 32.96ms (95%), errors: 0.00, reconnects:  0.00
[   3s] threads: 4, tps: 155.93, reads: 2172.96, writes: 612.71, response time: 36.52ms (95%), errors: 0.00, reconnects:  0.00
[   4s] threads: 4, tps: 156.05, reads: 2200.76, writes: 628.22, response time: 32.25ms (95%), errors: 0.00, reconnects:  0.00
[   5s] threads: 4, tps: 75.00, reads: 996.06, writes: 296.02, response time: 30.29ms (95%), errors: 0.00, reconnects:  0.00
[   6s] threads: 4, tps: 12.00, reads: 197.01, writes: 52.00, response time: 1493.45ms (95%), errors: 0.00, reconnects:  0.00
[   7s] threads: 4, tps: 160.99, reads: 2230.92, writes: 639.98, response time: 33.70ms (95%), errors: 0.00, reconnects:  0.00
[   8s] threads: 4, tps: 158.99, reads: 2275.84, writes: 649.95, response time: 30.82ms (95%), errors: 0.00, reconnects:  0.00
[   9s] threads: 4, tps: 160.02, reads: 2240.22, writes: 642.06, response time: 31.89ms (95%), errors: 0.00, reconnects:  0.00
[  10s] threads: 4, tps: 155.99, reads: 2170.91, writes: 612.97, response time: 34.46ms (95%), errors: 0.00, reconnects:  0.00
[  11s] threads: 4, tps: 132.00, reads: 1825.05, writes: 523.01, response time: 39.40ms (95%), errors: 0.00, reconnects:  0.00
[  12s] threads: 4, tps: 157.00, reads: 2207.00, writes: 628.00, response time: 34.01ms (95%), errors: 0.00, reconnects:  0.00
[  13s] threads: 4, tps: 160.99, reads: 2271.83, writes: 643.95, response time: 31.08ms (95%), errors: 0.00, reconnects:  0.00
[  14s] threads: 4, tps: 164.01, reads: 2286.18, writes: 660.05, response time: 30.51ms (95%), errors: 0.00, reconnects:  0.00
[  15s] threads: 4, tps: 102.01, reads: 1405.07, writes: 405.02, response time: 34.48ms (95%), errors: 1.00, reconnects:  0.00
[  16s] threads: 4, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects:  0.00
[  17s] threads: 4, tps: 152.99, reads: 2187.93, writes: 618.98, response time: 32.53ms (95%), errors: 0.00, reconnects:  0.00
[  18s] threads: 4, tps: 162.00, reads: 2277.96, writes: 652.99, response time: 30.47ms (95%), errors: 0.00, reconnects:  0.00
[  19s] threads: 4, tps: 160.78, reads: 2250.98, writes: 640.14, response time: 33.41ms (95%), errors: 0.00, reconnects:  0.00
[  20s] threads: 4, tps: 160.22, reads: 2225.01, writes: 631.85, response time: 32.45ms (95%), errors: 0.00, reconnects:  0.00
OLTP test statistics:
    queries performed:
        read:                            37604
        write:                           10741
        other:                           5371
        total:                           53716
    transactions:                        2685   (134.13 per sec.)
    read/write requests:                 48345  (2415.02 per sec.)
    other operations:                    5371   (268.30 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          20.0185s
    total number of events:              2685
    total time taken by event execution: 80.0231s
    response time:
         min:                                 13.86ms
         avg:                                 29.80ms
         max:                               1505.34ms
         approx.  95 percentile:              34.40ms

Threads fairness:
    events (avg/stddev):           671.2500/4.15
    execution time (avg/stddev):   20.0058/0.00

The provided script, using proxysql and mysqlrpladmin, is able to perform a switchover in a very short period of time (roughly 1.5 seconds) without stopping the application and without generating any error.
The application (sysbench in this case) doesn't report any error/failure!
Yet, it is possible to note that sysbench pauses at seconds 5-6 and again at second ~16. Furthermore, sysbench also reports that the maximum response time is 1505ms: this is exactly the time it took to perform the switchover. The trade-off for zero errors is increased latency, within a configurable threshold.

ProxySQL is able to give us some more information about what happened during the switch:

mysql> SELECT max_time, digest_text FROM stats_mysql_query_digest ORDER BY max_time DESC LIMIT 3;
+----------+-------------------------------------+
| max_time | digest_text                         |
+----------+-------------------------------------+
| 1523629  | BEGIN                               |
| 157867   | COMMIT                              |
| 151759   | UPDATE sbtest1 SET k=k+? WHERE id=? |
+----------+-------------------------------------+
3 rows in set (0.00 sec)

BEGIN is the statement that was the slowest of all (1523ms). Let's see why.
When a server is put into OFFLINE_SOFT mode, new incoming connections aren't accepted anymore, while the existing connections are used until they become inactive. In other words, connections are kept in use until the current transaction is completed. This allows a backend to be detached gracefully.
With the sysbench run executed above, all statements are executed within a transaction that starts with BEGIN, therefore during the switchover only the BEGIN statements were blocked: the other statements hadn't been sent yet.


Even faster seamless replication switchover


In the previous switchover script there is a completely arbitrary "sleep 1" to wait for transactions to complete.
This means that if any transaction takes longer than 1 second it is killed, but also that the switchover may pause for longer than necessary and could be faster.

ProxySQL tracks the number of active connections (in use) in stats_mysql_connection_pool.ConnUsed .
Therefore, if SUM(ConnUsed)==0 it means that no connections are in use.
Let's use this information to improve the script, replacing "sleep 1" with a loop that waits up to 1 second for all connections to become free.

vagrant@app:~$ cat ./switchover.sh
#!/bin/sh
if [ $# -ne 2 ]; then
  echo "Usage: ./switchover.sh oldmaster newmaster"
  exit 1
fi
OldMaster=$1
NewMaster=$2

(
echo "UPDATE mysql_servers SET STATUS='OFFLINE_SOFT' WHERE hostname=\"$OldMaster\" AND hostgroup_id=0;"
echo "LOAD MYSQL SERVERS TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032


CONNUSED=`mysql -h 127.0.0.1 -P6032 -uadmin -padmin -e 'SELECT IFNULL(SUM(ConnUsed),0) FROM stats_mysql_connection_pool WHERE status="OFFLINE_SOFT" AND hostgroup=0' -B -N 2> /dev/null`
TRIES=0
while [ $CONNUSED -ne 0 -a $TRIES -ne 20 ]
do
  CONNUSED=`mysql -h 127.0.0.1 -P6032 -uadmin -padmin -e 'SELECT IFNULL(SUM(ConnUsed),0) FROM stats_mysql_connection_pool WHERE status="OFFLINE_SOFT" AND hostgroup=0' -B -N 2> /dev/null`
  TRIES=$(($TRIES+1))
  if [ $CONNUSED -ne "0" ]; then
    sleep 0.05
  fi
done

mysqlrpladmin --discover-slaves-login=root:root --rpl-user=repl:replpass \
  --master=root:root@$OldMaster --new-master=root:root@$NewMaster \
  --demote-master switchover

(
echo "UPDATE mysql_servers SET STATUS='ONLINE', hostgroup_id=1 WHERE hostname=\"$OldMaster\" AND hostgroup_id=0;"
echo "UPDATE mysql_servers SET hostgroup_id=0 WHERE hostname=\"$NewMaster\" AND hostgroup_id=1;"
echo "LOAD MYSQL SERVERS TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032


We will now re-run sysbench, while on another terminal we perform 2 switchovers:

vagrant@app:~$ sysbench --report-interval=1 --num-threads=4 --max-requests=0 --max-time=20 --test=/home/vagrant/sysbench-0.5/sysbench/tests/db/oltp.lua --mysql-user=utest --mysql-password=ptest --mysql-host=127.0.0.1 --mysql-port=6033 --mysql-db=test --oltp-read-only=off run
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4
Report intermediate results every 1 second(s)
Random number generator seed is 0 and will be ignored


Threads started!

[   1s] threads: 4, tps: 132.01, reads: 1885.13, writes: 532.04, response time: 40.72ms (95%), errors: 0.00, reconnects:  0.00
[   2s] threads: 4, tps: 139.95, reads: 1955.36, writes: 555.82, response time: 37.97ms (95%), errors: 0.00, reconnects:  0.00
[   3s] threads: 4, tps: 130.06, reads: 1820.88, writes: 524.25, response time: 43.67ms (95%), errors: 0.00, reconnects:  0.00
[   4s] threads: 4, tps: 138.02, reads: 1935.22, writes: 550.06, response time: 38.49ms (95%), errors: 0.00, reconnects:  0.00
[   5s] threads: 4, tps: 75.00, reads: 1013.98, writes: 297.99, response time: 41.33ms (95%), errors: 0.00, reconnects:  0.00
[   6s] threads: 4, tps: 103.00, reads: 1474.96, writes: 417.99, response time: 46.80ms (95%), errors: 0.00, reconnects:  0.00
[   7s] threads: 4, tps: 140.00, reads: 1955.00, writes: 557.00, response time: 37.95ms (95%), errors: 0.00, reconnects:  0.00
[   8s] threads: 4, tps: 137.99, reads: 1938.81, writes: 548.95, response time: 38.18ms (95%), errors: 0.00, reconnects:  0.00
[   9s] threads: 4, tps: 135.89, reads: 1919.42, writes: 543.55, response time: 36.65ms (95%), errors: 0.00, reconnects:  0.00
[  10s] threads: 4, tps: 133.09, reads: 1846.30, writes: 536.38, response time: 41.17ms (95%), errors: 0.00, reconnects:  0.00
[  11s] threads: 4, tps: 141.03, reads: 1970.43, writes: 560.12, response time: 35.89ms (95%), errors: 0.00, reconnects:  0.00
[  12s] threads: 4, tps: 137.00, reads: 1926.02, writes: 548.01, response time: 39.55ms (95%), errors: 0.00, reconnects:  0.00
[  13s] threads: 4, tps: 141.99, reads: 1961.80, writes: 567.94, response time: 37.37ms (95%), errors: 0.00, reconnects:  0.00
[  14s] threads: 4, tps: 125.01, reads: 1777.13, writes: 504.04, response time: 43.84ms (95%), errors: 0.00, reconnects:  0.00
[  15s] threads: 4, tps: 95.88, reads: 1336.31, writes: 379.52, response time: 59.99ms (95%), errors: 0.00, reconnects:  0.00
[  16s] threads: 4, tps: 122.16, reads: 1716.26, writes: 495.65, response time: 46.88ms (95%), errors: 0.00, reconnects:  0.00
[  17s] threads: 4, tps: 136.99, reads: 1917.90, writes: 543.97, response time: 42.31ms (95%), errors: 0.00, reconnects:  0.00
[  18s] threads: 4, tps: 125.01, reads: 1734.07, writes: 499.02, response time: 45.23ms (95%), errors: 0.00, reconnects:  0.00
[  19s] threads: 4, tps: 135.00, reads: 1912.97, writes: 541.99, response time: 44.89ms (95%), errors: 0.00, reconnects:  0.00
[  20s] threads: 4, tps: 127.00, reads: 1773.07, writes: 512.02, response time: 48.10ms (95%), errors: 0.00, reconnects:  0.00
OLTP test statistics:
    queries performed:
        read:                            35784
        write:                           10224
        other:                           5112
        total:                           51120
    transactions:                        2556   (127.71 per sec.)
    read/write requests:                 46008  (2298.83 per sec.)
    other operations:                    5112   (255.43 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          20.0137s
    total number of events:              2556
    total time taken by event execution: 79.9988s
    response time:
         min:                                 16.54ms
         avg:                                 31.30ms
         max:                                687.98ms
         approx.  95 percentile:              43.43ms

Threads fairness:
    events (avg/stddev):           639.0000/5.48
    execution time (avg/stddev):   19.9997/0.01

The switchovers happen at seconds ~5 and ~15. Each complete switchover took around 0.58s, of which ~0.51s was spent running mysqlrpladmin and ~0.07s reconfiguring proxysql.

Conclusion


In this tutorial we described how it is possible to use ProxySQL to perform a master switchover and reconfigure all the slaves in less than one second.
Using ProxySQL to perform a master switchover provides extremely high availability without returning any error to the application. That is: the application won't be aware that a switchover happened, as it doesn't need to be reconfigured, nor will it get any error. All the application will notice is sub-second latency while the switchover is running.


Last, a friendly reminder that I will be speaking at PerconaLive Amsterdam on 23rd September about ProxySQL: please come to my session and ask all the questions you have!

Monday, September 14, 2015

ProxySQL Tutorial : setup in a MySQL replication topology

With the imminent release of ProxySQL as a stable version, and its presentation at PerconaLive Amsterdam, I decided to write a few blog posts describing how to install, configure and use it without digging too much through the documentation.
Moreover, practical examples are often easier to understand.

This tutorial is a long one, although none of the topics are discussed in detail, to avoid making it even longer (or more boring).

For this tutorial, we will use a small setup with 4 nodes:
- node1 (192.168.124.101) , mysql master
- node2 (192.168.124.102) , mysql slave
- node3 (192.168.124.103) , mysql slave
- app (192.168.124.104) , application server where proxysql will also run


Replication setup


Without going into details on how to configure mysqld on the nodes (this is outside the scope of this tutorial), the nodes already have mysqld up and running, configured to support GTID, but replication is not running yet. We will configure replication during this tutorial from "app".
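
For reference, a GTID-enabled MySQL 5.6 node typically needs settings along these lines in my.cnf (the values are only illustrative, and server_id must be unique per node):

[mysqld]
server_id                = 101
log_bin                  = mysql-bin
log_slave_updates        = 1
gtid_mode                = ON
enforce_gtid_consistency = 1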

Create root user, replication user, set up replication, and create application user

First, let's create the root user to connect from app:
vagrant@app:~$ for i in 1 2 3 ; do
ssh node$i -- 'mysql -uroot -proot -e "GRANT ALL PRIVILEGES ON *.* TO root@app IDENTIFIED BY \"root\" WITH GRANT OPTION" -NB'
done
Warning: Using a password on the command line interface can be insecure.
Warning: Using a password on the command line interface can be insecure.
Warning: Using a password on the command line interface can be insecure.

These warnings are annoying, let's remove them and try our new user:
vagrant@app:~$ echo -e "[client]\npassword=root" > .my.cnf
vagrant@app:~$ for i in 1 2 3 ; do mysql -u root -h node$i -e "SELECT CURRENT_USER()" -NB ; done
root@app
root@app
root@app

Create replication user:

vagrant@app:~$ mysql -u root -h node1 -e "GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.124.%' IDENTIFIED BY 'replpass'"

Setup and start replication:

vagrant@app:~$ for i in 2 3 ; do mysql -u root -h node$i -e "CHANGE MASTER TO MASTER_HOST='192.168.124.101' , MASTER_USER='repl', MASTER_PASSWORD='replpass', MASTER_AUTO_POSITION=1; START SLAVE;" ; done 

At this stage, we assume replication is up and running (please double check this in your setup).
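A quick way to double check is to look at the replication threads on each slave; both should report Yes:

vagrant@app:~$ for i in 2 3 ; do mysql -u root -h node$i -e "SHOW SLAVE STATUS\G" | grep -E 'Slave_(IO|SQL)_Running:' ; done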
The application needs a user to connect to the database. We will create a simple user with credentials "utest"/"ptest":

vagrant@app:~$ mysql -u root -h node1 -e "GRANT ALL PRIVILEGES ON test.* TO utest@app IDENTIFIED BY \"ptest\""
vagrant@app:~$ mysql -u root -h node1 -e "CREATE DATABASE IF NOT EXISTS test"

Now that the user is created, we can run some tests to verify that we can connect everywhere and that replication is working, using sysbench to generate data on the master and to run a read-only workload on a slave:

vagrant@app:~$ sysbench --test=/home/vagrant/sysbench-0.5/sysbench/tests/db/oltp.lua --mysql-user=utest --mysql-password=ptest --mysql-host=node1 --mysql-db=test prepare
sysbench 0.5:  multi-threaded system evaluation benchmark

Creating table 'sbtest1'...
Inserting 10000 records into 'sbtest1'

vagrant@app:~$ sysbench --max-requests=0 --max-time=10 --test=/home/vagrant/sysbench-0.5/sysbench/tests/db/oltp.lua --mysql-user=utest --mysql-password=ptest --mysql-host=node2 --mysql-db=test --oltp-read-only=on run
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
Random number generator seed is 0 and will be ignored


Threads started!

OLTP test statistics:
    queries performed:
        read:                            22638
        write:                           0
        other:                           3234
        total:                           25872
    transactions:                        1617   (161.63 per sec.)
    read/write requests:                 22638  (2262.88 per sec.)
    other operations:                    3234   (323.27 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          10.0041s
    total number of events:              1617
    total time taken by event execution: 9.9958s
    response time:
         min:                                  4.96ms
         avg:                                  6.18ms
         max:                                212.22ms
         approx.  95 percentile:               6.66ms

Threads fairness:
    events (avg/stddev):           1617.0000/0.00
    execution time (avg/stddev):   9.9958/0.00



Download and install ProxySQL


Now that we know that replication is up and running correctly and the users have been created, it is time to download and install ProxySQL on the app server.
Precompiled binaries and packages of ProxySQL are available at https://github.com/sysown/proxysql-binaries , for CentOS 7, Ubuntu 12 and Ubuntu 14.

This is an Ubuntu 14 system, so once we download the package we can install it with dpkg:

vagrant@app:~$ wget -q https://github.com/sysown/proxysql-binaries/raw/master/binaries/Ubuntu14/proxysql_0.2.0902-ubuntu14_amd64.deb

vagrant@app:~$ sudo dpkg -i proxysql_0.2.0902-ubuntu14_amd64.deb
Selecting previously unselected package proxysql.
(Reading database ... 61067 files and directories currently installed.)
Preparing to unpack proxysql_0.2.0902-ubuntu14_amd64.deb ...
Unpacking proxysql (0.2.0902) ...
Setting up proxysql (0.2.0902) ...
update-rc.d: warning: /etc/init.d/proxysql missing LSB information
update-rc.d: see 
 Adding system startup for /etc/init.d/proxysql ...
   /etc/rc0.d/K20proxysql -> ../init.d/proxysql
   /etc/rc1.d/K20proxysql -> ../init.d/proxysql
   /etc/rc6.d/K20proxysql -> ../init.d/proxysql
   /etc/rc2.d/S20proxysql -> ../init.d/proxysql
   /etc/rc3.d/S20proxysql -> ../init.d/proxysql
   /etc/rc4.d/S20proxysql -> ../init.d/proxysql
   /etc/rc5.d/S20proxysql -> ../init.d/proxysql
Processing triggers for ureadahead (0.100.0-16) ...

vagrant@app:~$ 


ProxySQL installs a basic config file in /etc/proxysql.cnf , and uses /var/run/proxysql as its datadir (working directory).
However, /etc/proxysql.cnf is not configured with any mysql servers to be used as backends, nor with any mysql users to connect to such backends. Therefore we either edit the config file before starting proxysql, or we configure users and servers directly from within proxysql.
The config file /etc/proxysql.cnf does have one user configured, though: the user required to connect to the admin interface (details later).
Note that proxysql doesn't parse its config file after the first time it is executed (more details in https://github.com/sysown/proxysql-0.2/blob/master/doc/configuration_system.md), therefore the preferred way to configure proxysql is not through its config file but through its admin interface (more details in https://github.com/sysown/proxysql-0.2/blob/master/doc/configuration_system.md and https://github.com/sysown/proxysql-0.2/blob/master/doc/admin_tables.md).


Start ProxySQL

We can start proxysql using service:

vagrant@app:~$ sudo service proxysql start
Starting ProxySQL: DONE!

Checking the content of proxysql's datadir, we can identify: a database file, a log file and a pid file:

vagrant@app:~$ ls -l /var/run/proxysql/
total 36
-rw------- 1 root root 25600 Sep 12 22:01 proxysql.db
-rw------- 1 root root   892 Sep 12 22:01 proxysql.log
-rw-r--r-- 1 root root     5 Sep 12 22:01 proxysql.pid

Note that the database file is the file that stores all the configuration related to proxysql. If this file is present, the config file is not parsed because the database file is the preferred source of configuration.
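
A practical consequence: if you ever want proxysql to read /etc/proxysql.cnf again from scratch, one approach (a sketch, assuming the default datadir) is to stop proxysql and move the database file out of the way before restarting:

vagrant@app:~$ sudo service proxysql stop
vagrant@app:~$ sudo mv /var/run/proxysql/proxysql.db /var/run/proxysql/proxysql.db.bak
vagrant@app:~$ sudo service proxysql start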

From the list of processes running we can see two proxysql processes:
vagrant@app:~$ ps aux | grep proxysql | grep -v grep
root      3589  0.0  0.6  35700  5036 ?        S    22:01   0:00 proxysql -c /etc/proxysql.cnf -D /var/run/proxysql
root      3590  0.0  1.3  85148 10112 ?        Sl   22:01   0:00 proxysql -c /etc/proxysql.cnf -D /var/run/proxysql

The reason behind this is that by default proxysql forks at startup, and the parent process is nothing more than an angel process that restarts proxysql if it crashes.

 

Connect to ProxySQL Admin Interface

As said previously, proxysql.cnf is read just the very first time that proxysql is started.
When proxysql.cnf was read for that first time, the ProxySQL admin interface was configured as:
admin_variables=
{
        admin_credentials="admin:admin"
        mysql_ifaces="127.0.0.1:6032;/tmp/proxysql_admin.sock"
}

Therefore, to connect to the ProxySQL admin interface we should use that information.
Let's connect to proxysql and run a few commands:

vagrant@app:~$ mysql -u admin -padmin -h 127.0.0.1 -P6032
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 0
Server version: 5.1.30 (ProxySQL Admin Module)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>
mysql> SHOW DATABASES;
+-----+---------+-------------------------------+
| seq | name    | file                          |
+-----+---------+-------------------------------+
| 0   | main    |                               |
| 2   | disk    | /var/run/proxysql/proxysql.db |
| 3   | stats   |                               |
| 4   | monitor |                               |
+-----+---------+-------------------------------+
4 rows in set (0.00 sec)

mysql> SHOW TABLES;
+-------------------+
| tables            |
+-------------------+
| mysql_servers     |
| mysql_users       |
| mysql_query_rules |
| global_variables  |
| mysql_collations  |
+-------------------+
5 rows in set (0.01 sec)

The "main" database represents the in-memory configuration, while "disk" represents the on-disk configuration. More details here: https://github.com/sysown/proxysql-0.2/blob/master/doc/configuration_system.md
The "mysql_*" tables are described at https://github.com/sysown/proxysql-0.2/blob/master/doc/admin_tables.md , and at this stage we can verify that they are empty:

mysql> SELECT * FROM mysql_users;
Empty set (0.00 sec)

mysql> SELECT * FROM mysql_servers;
Empty set (0.01 sec)

mysql> SHOW CREATE TABLE mysql_users\G
*************************** 1. row ***************************
       table: mysql_users
Create Table: CREATE TABLE mysql_users (
username VARCHAR NOT NULL,
password VARCHAR,
active INT CHECK (active IN (0,1)) NOT NULL DEFAULT 1,
use_ssl INT CHECK (use_ssl IN (0,1)) NOT NULL DEFAULT 0,
default_hostgroup INT NOT NULL DEFAULT 0,
default_schema VARCHAR,
schema_locked INT CHECK (schema_locked IN (0,1)) NOT NULL DEFAULT 0,
transaction_persistent INT CHECK (transaction_persistent IN (0,1)) NOT NULL DEFAULT 0,
fast_forward INT CHECK (fast_forward IN (0,1)) NOT NULL DEFAULT 0,
backend INT CHECK (backend IN (0,1)) NOT NULL DEFAULT 1,
frontend INT CHECK (frontend IN (0,1)) NOT NULL DEFAULT 1,
max_connections INT CHECK (max_connections >=0) NOT NULL DEFAULT 10000,
PRIMARY KEY (username, backend),
UNIQUE (username, frontend))
1 row in set (0.01 sec)

mysql> SHOW CREATE TABLE mysql_servers\G
*************************** 1. row ***************************
       table: mysql_servers
Create Table: CREATE TABLE mysql_servers (
hostgroup_id INT NOT NULL DEFAULT 0,
hostname VARCHAR NOT NULL,
port INT NOT NULL DEFAULT 3306,
status VARCHAR CHECK (UPPER(status) IN ('ONLINE','SHUNNED','OFFLINE_SOFT', 'OFFLINE_HARD')) NOT NULL DEFAULT 'ONLINE',
weight INT CHECK (weight >= 0) NOT NULL DEFAULT 1,
compression INT CHECK (compression >=0 AND compression <= 102400) NOT NULL DEFAULT 0,
max_connections INT CHECK (max_connections >=0) NOT NULL DEFAULT 1000,
max_replication_lag INT CHECK (max_replication_lag >= 0 AND max_replication_lag <= 126144000) NOT NULL DEFAULT 0,
PRIMARY KEY (hostgroup_id, hostname, port) )
1 row in set (0.00 sec)

Configure backends

We can now start configuring the backends.
As described in previous articles, ProxySQL uses the concept of hostgroups: a hostgroup is a group of hosts with a logical functionality.
For example, you can have the production master in hostgroup0, all the production slaves in hostgroup1, reporting slaves in hostgroup2, DR slaves in hostgroup3, and so on.
For this tutorial, we will use just 2 hostgroups:
  • hostgroup0 for the master
  • hostgroup1 for the slaves
Configuring backends in ProxySQL is as easy as inserting records representing such backends into mysql_servers, specifying the correct hostgroup_id based on their roles:

mysql> INSERT INTO mysql_servers(hostgroup_id, hostname, port) VALUES (0,'192.168.124.101',3306);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO mysql_servers(hostgroup_id, hostname, port) VALUES (1,'192.168.124.102',3306);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO mysql_servers(hostgroup_id, hostname, port) VALUES (1,'192.168.124.103',3306);
Query OK, 1 row affected (0.01 sec)

mysql> SELECT * FROM mysql_servers;
+--------------+-----------------+------+--------+--------+-------------+-----------------+---------------------+
| hostgroup_id | hostname        | port | status | weight | compression | max_connections | max_replication_lag |
+--------------+-----------------+------+--------+--------+-------------+-----------------+---------------------+
| 0            | 192.168.124.101 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   |
| 1            | 192.168.124.102 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   |
| 1            | 192.168.124.103 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   |
+--------------+-----------------+------+--------+--------+-------------+-----------------+---------------------+
3 rows in set (0.00 sec)

Note that at this stage the backends are not yet activated, they are only configured. To load this configuration at runtime we need to issue a LOAD command:

mysql> LOAD MYSQL SERVERS TO RUNTIME;
Query OK, 0 rows affected (0.01 sec)

These configuration changes won't persist after proxysql is shut down because they are all in-memory. To persist them, we need to issue the appropriate SAVE command to save the changes into the on-disk database:

mysql> SELECT * FROM disk.mysql_servers;
Empty set (0.00 sec)

mysql> SAVE MYSQL SERVERS TO DISK;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM disk.mysql_servers;
+--------------+-----------------+------+--------+--------+-------------+-----------------+---------------------+
| hostgroup_id | hostname        | port | status | weight | compression | max_connections | max_replication_lag |
+--------------+-----------------+------+--------+--------+-------------+-----------------+---------------------+
| 0            | 192.168.124.101 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   |
| 1            | 192.168.124.102 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   |
| 1            | 192.168.124.103 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   |
+--------------+-----------------+------+--------+--------+-------------+-----------------+---------------------+
3 rows in set (0.00 sec)


Configure users

This is very similar to how we configured backends:

mysql> INSERT INTO mysql_users (username,password) VALUES ('utest','ptest');
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM mysql_users\G
*************************** 1. row ***************************
              username: utest
              password: ptest
                active: 1
               use_ssl: 0
     default_hostgroup: 0
        default_schema: NULL
         schema_locked: 0
transaction_persistent: 0
          fast_forward: 0
               backend: 1
              frontend: 1
       max_connections: 10000
1 row in set (0.00 sec) 
 
mysql> LOAD MYSQL USERS TO RUNTIME;
Query OK, 0 rows affected (0.00 sec) 
 
mysql> SELECT * FROM disk.mysql_users;
Empty set (0.00 sec)

mysql> SAVE MYSQL USERS TO DISK;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM disk.mysql_users\G
*************************** 1. row ***************************
              username: utest
              password: ptest
                active: 1
               use_ssl: 0
     default_hostgroup: 0
        default_schema: NULL
         schema_locked: 0
transaction_persistent: 0
          fast_forward: 0
               backend: 1
              frontend: 1
       max_connections: 10000
1 row in set (0.00 sec)
mysql> exit
Bye



Testing connectivity via proxysql


To test if the application is able to send queries to the backend through proxysql we can run any query against proxysql. For example:

vagrant@app:~$ mysql -u utest -pptest -h 127.0.0.1 -P6033 -e "SELECT @@hostname"
Warning: Using a password on the command line interface can be insecure.
+------------+
| @@hostname |
+------------+
| node1      |
+------------+

In this example, the SELECT statement is sent to node1, and it will always be sent to node1, because there is no query rule yet that defines routing for this query (the mysql_query_rules table is still empty); therefore the default_hostgroup for user "utest" applies (that is "0" according to mysql_users), and the query is sent to a host that has hostgroup_id=0 (that is "node1" according to mysql_servers).
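
You can also verify where queries ended up from the admin interface (port 6032), by looking at the per-hostgroup query statistics; a sketch, assuming the hostgroup column is present in this version of stats_mysql_query_digest:

mysql> SELECT hostgroup, count_star, digest_text FROM stats_mysql_query_digest ORDER BY count_star DESC LIMIT 5;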

We can run a bit more complex test using sysbench pointing to proxysql (host=127.0.0.1, port=6033) :

vagrant@app:~$ sysbench --num-threads=4 --max-requests=0 --max-time=10 \
> --test=/home/vagrant/sysbench-0.5/sysbench/tests/db/oltp.lua \
> --mysql-user=utest --mysql-password=ptest \
> --mysql-host=127.0.0.1 --mysql-port=6033 \
> --mysql-db=test \
> --oltp-read-only=off \
> run
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4
Random number generator seed is 0 and will be ignored


Threads started!

OLTP test statistics:
    queries performed:
        read:                            11592
        write:                           3312
        other:                           1656
        total:                           16560
    transactions:                        828    (82.15 per sec.)
    read/write requests:                 14904  (1478.62 per sec.)
    other operations:                    1656   (164.29 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          10.0797s
    total number of events:              828
    total time taken by event execution: 40.2186s
    response time:
         min:                                 17.03ms
         avg:                                 48.57ms
         max:                                307.84ms
         approx.  95 percentile:              96.42ms

Threads fairness:
    events (avg/stddev):           207.0000/3.08
    execution time (avg/stddev):   10.0546/0.02

vagrant@app:~$


Read write split

As already said in other posts, ProxySQL doesn't implement a dedicated read/write split algorithm, but treats read/write split as a subset of query routing. Therefore, to implement read/write split, we need to correctly configure query routing.
A very basic read/write split can be implemented by sending to the slaves all SELECTs without FOR UPDATE, and sending everything else to the master. We can configure this through the admin interface:



vagrant@app:~$ mysql -u admin -padmin -h 127.0.0.1 -P6032
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 19
Server version: 5.1.30 (ProxySQL Admin Module)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SELECT * FROM mysql_query_rules\G
Empty set (0.00 sec)

mysql> SHOW CREATE TABLE mysql_query_rules\G
*************************** 1. row ***************************
       table: mysql_query_rules
Create Table: CREATE TABLE mysql_query_rules (
rule_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
active INT CHECK (active IN (0,1)) NOT NULL DEFAULT 0,
username VARCHAR,
schemaname VARCHAR,
flagIN INT NOT NULL DEFAULT 0,
match_pattern VARCHAR,
negate_match_pattern INT CHECK (negate_match_pattern IN (0,1)) NOT NULL DEFAULT 0,
flagOUT INT,
replace_pattern VARCHAR,
destination_hostgroup INT DEFAULT NULL,
cache_ttl INT CHECK(cache_ttl > 0),
reconnect INT CHECK (reconnect IN (0,1)) DEFAULT NULL,
timeout INT UNSIGNED,
delay INT UNSIGNED,
apply INT CHECK(apply IN (0,1)) NOT NULL DEFAULT 0)
1 row in set (0.00 sec)

mysql> INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT.*FOR UPDATE$',0,1);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT',1,1);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM mysql_query_rules\G
*************************** 1. row ***************************
              rule_id: 1
               active: 1
             username: NULL
           schemaname: NULL
               flagIN: 0
        match_pattern: ^SELECT.*FOR UPDATE$
 negate_match_pattern: 0
              flagOUT: NULL
      replace_pattern: NULL
destination_hostgroup: 0
            cache_ttl: NULL
            reconnect: NULL
              timeout: NULL
                delay: NULL
                apply: 1
*************************** 2. row ***************************
              rule_id: 2
               active: 1
             username: NULL
           schemaname: NULL
               flagIN: 0
        match_pattern: ^SELECT
 negate_match_pattern: 0
              flagOUT: NULL
      replace_pattern: NULL
destination_hostgroup: 1
            cache_ttl: NULL
            reconnect: NULL
              timeout: NULL
                delay: NULL
                apply: 1
2 rows in set (0.00 sec)


mysql> SELECT rule_id, match_pattern,destination_hostgroup hg_id, apply FROM mysql_query_rules WHERE active=1;
+---------+----------------------+-------+-------+
| rule_id | match_pattern        | hg_id | apply |
+---------+----------------------+-------+-------+
| 1       | ^SELECT.*FOR UPDATE$ | 0     | 1     |
| 2       | ^SELECT              | 1     | 1     |
+---------+----------------------+-------+-------+
2 rows in set (0.00 sec)

mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)

mysql> SAVE MYSQL QUERY RULES TO DISK;
Query OK, 0 rows affected (0.00 sec)

mysql> quit
Bye

We created rules for SELECTs and SELECT FOR UPDATE: what about all the other queries? When the Query Processor scans the query rules looking for a match and reaches the end without finding one, it applies the default_hostgroup defined for that user in mysql_users.
In our case, user "utest" has default_hostgroup=0, therefore any query not matching the above rules will be sent to hostgroup 0.
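
If in doubt, the default that applies to each user can be checked directly from the admin interface; a minimal sketch that simply reads the mysql_users table configured earlier:

SELECT username, default_hostgroup FROM mysql_users WHERE username='utest';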

Also note that, by default, new servers are configured as ONLINE and new users as active. For query rules, however, the default is active=0. This is intentional.

Now that we have configured ProxySQL with query routing, let's try it by running new queries against proxysql. SELECT ... FOR UPDATE will be sent to the master, while the other SELECTs will be sent to one of the slaves, chosen randomly.

vagrant@app:~$ mysql -u utest -pptest -h 127.0.0.1 -P6033 -e "SELECT @@hostname"
Warning: Using a password on the command line interface can be insecure.
+------------+
| @@hostname |
+------------+
| node2      |
+------------+
vagrant@app:~$ mysql -u utest -pptest -h 127.0.0.1 -P6033 -e "SELECT @@hostname"
Warning: Using a password on the command line interface can be insecure.
+------------+
| @@hostname |
+------------+
| node3      |
+------------+
vagrant@app:~$ mysql -u utest -pptest -h 127.0.0.1 -P6033 -e "SELECT @@hostname FOR UPDATE"
Warning: Using a password on the command line interface can be insecure.
+------------+
| @@hostname |
+------------+
| node1      |
+------------+

And now sysbench:

vagrant@app:~$ sysbench --num-threads=4 --max-requests=0 --max-time=10 \
> --test=/home/vagrant/sysbench-0.5/sysbench/tests/db/oltp.lua \
> --mysql-user=utest --mysql-password=ptest \
> --mysql-host=127.0.0.1 --mysql-port=6033 \
> --mysql-db=test \
> --oltp-read-only=off \
> run
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4
Random number generator seed is 0 and will be ignored


Threads started!

OLTP test statistics:
    queries performed:
        read:                            21112
        write:                           6032
        other:                           3016
        total:                           30160
    transactions:                        1508   (150.51 per sec.)
    read/write requests:                 27144  (2709.16 per sec.)
    other operations:                    3016   (301.02 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          10.0193s
    total number of events:              1508
    total time taken by event execution: 40.0503s
    response time:
         min:                                 15.87ms
         avg:                                 26.56ms
         max:                                215.16ms
         approx.  95 percentile:              38.74ms

Threads fairness:
    events (avg/stddev):           377.0000/1.87
    execution time (avg/stddev):   10.0126/0.00

vagrant@app:~$

sysbench ran successfully, but from this report we cannot tell whether read/write split actually worked.
No problem, proxysql can tell us!

vagrant@app:~$ mysql -u admin -padmin -h 127.0.0.1 -P6032
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 32
Server version: 5.1.30 (ProxySQL Admin Module)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SHOW TABLES FROM stats;
+--------------------------------+
| tables                         |
+--------------------------------+
| stats_mysql_query_rules        |
| stats_mysql_commands_counters  |
| stats_mysql_processlist        |
| stats_mysql_connection_pool    |
| stats_mysql_query_digest       |
| stats_mysql_query_digest_reset |
| stats_mysql_global             |
+--------------------------------+
7 rows in set (0.00 sec)


mysql> SELECT * FROM stats_mysql_connection_pool;
+-----------+-----------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+
| hostgroup | srv_host        | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv |
+-----------+-----------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+
| 0         | 192.168.124.101 | 3306     | ONLINE | 0        | 4        | 4      | 0       | 32884   | 1991796         | 47602338        |
| 1         | 192.168.124.102 | 3306     | ONLINE | 0        | 4        | 4      | 0       | 9499    | 410383          | 25702366        |
| 1         | 192.168.124.103 | 3306     | ONLINE | 0        | 4        | 4      | 0       | 9457    | 410141          | 26360288        |
+-----------+-----------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+
3 rows in set (0.00 sec)

mysql> SELECT hostgroup, SUM(Queries), SUM(Bytes_data_sent), SUM(Bytes_data_recv) FROM stats_mysql_connection_pool GROUP BY hostgroup\G
*************************** 1. row ***************************
           hostgroup: 0
        SUM(Queries): 32884
SUM(Bytes_data_sent): 1991796
SUM(Bytes_data_recv): 47602338
*************************** 2. row ***************************
           hostgroup: 1
        SUM(Queries): 18956
SUM(Bytes_data_sent): 820524
SUM(Bytes_data_recv): 52062654
2 rows in set (0.01 sec)

From stats_mysql_connection_pool it seems queries are distributed across hosts and hostgroups. Yet, this is not enough to say that read/write split is working.
Let's try a read-only workload and see what changes.

vagrant@app:~$ sysbench --num-threads=4 --max-requests=0 --max-time=10 \
> --test=/home/vagrant/sysbench-0.5/sysbench/tests/db/oltp.lua \
> --mysql-user=utest --mysql-password=ptest \
> --mysql-host=127.0.0.1 --mysql-port=6033 \
> --mysql-db=test \
> --oltp-read-only=on --oltp-skip-trx=on \
> run
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4
Random number generator seed is 0 and will be ignored


Threads started!

OLTP test statistics:
    queries performed:
        read:                            32984
        write:                           0
        other:                           0
        total:                           32984
    transactions:                        0      (0.00 per sec.)
    read/write requests:                 32984  (3293.70 per sec.)
    other operations:                    0      (0.00 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          10.0143s
    total number of events:              2356
    total time taken by event execution: 40.0156s
    response time:
         min:                                  7.62ms
         avg:                                 16.98ms
         max:                                 73.30ms
         approx.  95 percentile:              33.39ms

Threads fairness:
    events (avg/stddev):           589.0000/146.87
    execution time (avg/stddev):   10.0039/0.01


vagrant@app:~$ mysql -u admin -padmin -h 127.0.0.1 -P6032 -e "SELECT hostgroup, SUM(Queries), SUM(Bytes_data_sent), SUM(Bytes_data_recv) FROM stats_mysql_connection_pool GROUP BY hostgroup\G"
Warning: Using a password on the command line interface can be insecure.
*************************** 1. row ***************************
           hostgroup: 0
        SUM(Queries): 32884
SUM(Bytes_data_sent): 1991796
SUM(Bytes_data_recv): 47602338
*************************** 2. row ***************************
           hostgroup: 1
        SUM(Queries): 51940
SUM(Bytes_data_sent): 2248260
SUM(Bytes_data_recv): 142653210

It all seems correct: the number of queries sent to hostgroup=0 didn't change, and all the new queries were sent to hostgroup=1.

More granular read/write split

We previously said that read/write split is a subset of query routing.
In fact, we can decide the destination of each group of queries. Maybe you don't want to send all SELECTs to the slaves, but only some specific ones. ProxySQL allows this granularity.

If you don't know what queries are generated by your application, you can check in proxysql which queries are passing through using the table stats_mysql_query_digest. For example:

mysql> SELECT * FROM stats_mysql_query_digest;
+------------+----------+--------------------+----------------------------------------------------------------------+------------+------------+------------+----------+----------+----------+
| schemaname | username | digest             | digest_text                                                          | count_star | first_seen | last_seen  | sum_time | min_time | max_time |
+------------+----------+--------------------+----------------------------------------------------------------------+------------+------------+------------+----------+----------+----------+
| test       | utest    | 0x4760CBDEFAD1519E | BEGIN                                                                | 2592       | 1442148405 | 1442148446 | 1997120  | 0        | 49046    |
| test       | utest    | 0x10634DACE52A0A02 | INSERT INTO sbtest1 (id, k, c, pad) VALUES (?, ?, ?, ?)              | 2592       | 1442148405 | 1442148446 | 2517386  | 377      | 35926    |
| test       | utest    | 0xE75DB8313E268CF3 | SELECT SUM(K) FROM sbtest1 WHERE id BETWEEN ? AND ?+?                | 4948       | 1442148405 | 1442149117 | 5694972  | 365      | 35860    |
| test       | utest    | 0xCCB481C7C198E52B | UPDATE sbtest1 SET k=k+? WHERE id=?                                  | 2592       | 1442148405 | 1442148446 | 4131398  | 531      | 19211    |
| test       | utest    | 0x55319B9EE365BEB5 | DELETE FROM sbtest1 WHERE id=?                                       | 2592       | 1442148405 | 1442148446 | 2694171  | 409      | 14392    |
| test       | utest    | 0xADF3DDF2877EEAAF | SELECT c FROM sbtest1 WHERE id BETWEEN ? AND ?+? ORDER BY c          | 4948       | 1442148405 | 1442149117 | 7315147  | 456      | 32245    |
| test       | utest    | 0x704822A0F7D3CD60 | SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN ? AND ?+? ORDER BY c | 4948       | 1442148405 | 1442149117 | 10310160 | 677      | 40372    |
| test       | utest    | 0x13781C1DBF001A0C | SELECT c FROM sbtest1 WHERE id=?                                     | 49480      | 1442148405 | 1442149117 | 40873691 | 321      | 33556    |
| test       | utest    | 0x7DD56217AF7A5197 | SELECT c FROM sbtest1 WHERE id BETWEEN ? AND ?+?                     | 4948       | 1442148405 | 1442149117 | 5989877  | 365      | 54172    |
| test       | utest    | 0x5A23CA36FB239BC9 | UPDATE sbtest1 SET c=? WHERE id=?                                    | 2592       | 1442148405 | 1442148446 | 3006402  | 519      | 21786    |
| test       | utest    | 0x5DBEB0DD695FBF25 | COMMIT                                                               | 2592       | 1442148405 | 1442148446 | 8335739  | 1209     | 55347    |
+------------+----------+--------------------+----------------------------------------------------------------------+------------+------------+------------+----------+----------+----------+
11 rows in set (0.00 sec)

We can run any sort of query against this table to get more interesting results. For example, let's assume we want to identify the SELECT statements that take the most time:

mysql> SELECT count_star, sum_time, digest_text FROM stats_mysql_query_digest WHERE digest_text LIKE 'SELECT%' ORDER BY sum_time DESC;
+------------+----------+----------------------------------------------------------------------+
| count_star | sum_time | digest_text                                                          |
+------------+----------+----------------------------------------------------------------------+
| 49480      | 40873691 | SELECT c FROM sbtest1 WHERE id=?                                     |
| 4948       | 10310160 | SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN ? AND ?+? ORDER BY c |
| 4948       | 7315147  | SELECT c FROM sbtest1 WHERE id BETWEEN ? AND ?+? ORDER BY c          |
| 4948       | 5989877  | SELECT c FROM sbtest1 WHERE id BETWEEN ? AND ?+?                     |
| 4948       | 5694972  | SELECT SUM(K) FROM sbtest1 WHERE id BETWEEN ? AND ?+?                |
+------------+----------+----------------------------------------------------------------------+
5 rows in set (0.00 sec)

The first type of SELECT is clearly the one taking the most time, for a total of ~40.8s (times in stats_mysql_query_digest are in microseconds).
Once we have identified these, we can decide to drop our generic read/write split and send to the slaves only the queries that take the most time. This is just an example of how proxysql allows fine-grained query routing, and of how read/write split is just a special case of query routing.

vagrant@app:~$ mysql -u admin -padmin -h 127.0.0.1 -P6032
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 74
Server version: 5.1.30 (ProxySQL Admin Module)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> DELETE FROM mysql_query_rules;
Query OK, 2 rows affected (0.01 sec)

mysql> INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT c FROM sbtest1 WHERE id=\d+$',1,1);
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN \d+ AND \d+\+\d+ ORDER BY c$',1,1);
Query OK, 1 row affected (0.00 sec)

mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)

mysql> quit
Bye

Once again, note that while you change the content of the mysql_query_rules table, the rules aren't effective until you run LOAD MYSQL QUERY RULES TO RUNTIME. This gives you the time not only to review them before loading them at runtime, but also to build complex chains of rules. In other words, while you are changing the content of mysql_query_rules, the rules at runtime aren't affected.
As soon as LOAD MYSQL QUERY RULES TO RUNTIME is executed, the current rules are wiped and replaced with the new ones, which immediately take effect.
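
In more recent ProxySQL versions, the rules currently in effect can also be inspected through the runtime_mysql_query_rules table, which makes it easy to compare what is configured against what is actually loaded; a minimal sketch (this runtime_ table may not exist in the exact build used for this tutorial):

SELECT rule_id, match_pattern, destination_hostgroup, apply FROM mysql_query_rules;
SELECT rule_id, match_pattern, destination_hostgroup, apply FROM runtime_mysql_query_rules;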

How can we verify that our new rules are matching queries and routing them as we expect?
Did we write the regular expressions correctly? (regexes can be hard to write and debug)
The table stats_mysql_query_rules provides a counter for the number of times each rule matched a query.
For example:

mysql> SELECT rule_id, hits, destination_hostgroup hg, match_pattern pattern FROM mysql_query_rules NATURAL JOIN stats_mysql_query_rules\G
*************************** 1. row ***************************
rule_id: 6
   hits: 19340
     hg: 1
pattern: ^SELECT c FROM sbtest1 WHERE id=\d+$
*************************** 2. row ***************************
rule_id: 7
   hits: 1934
     hg: 1
pattern: ^SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN \d+ AND \d+\+\d+ ORDER BY c$
2 rows in set (0.00 sec)

We have hits, which means queries are matching the rules and being routed accordingly.



Backends monitoring

To route MySQL traffic between clients and backends, ProxySQL doesn't need to proactively monitor the health of the backends. When a connection to a backend fails, either during the connect phase or during the execution of a query, ProxySQL reacts to the event by trying to establish a new connection (possibly to a different node) and eventually retrying the query. This is a fairly complex topic that deserves a blog post of its own, but the main point here is that ProxySQL doesn't need proactive monitoring of the backends to determine their availability and health.
Nonetheless, there is a Monitor module that checks the health of the backends and stores the results in a set of logging tables.

Without going into all the details of the Monitor module, for the scope of this tutorial we will look only at a few important variables.

A small aside before proceeding further: ProxySQL stores its configuration variables in a table called global_variables, reminiscent of mysqld's. The big difference is that this table is writable, and writing to it is the way to reconfigure ProxySQL.

Back to the Monitor module...

mysql> SELECT * FROM global_variables WHERE variable_name IN ('mysql-monitor_username','mysql-monitor_password','mysql-monitor_connect_interval','mysql-monitor_ping_interval','mysql-monitor_history');
+--------------------------------+----------------+
| variable_name                  | variable_value |
+--------------------------------+----------------+
| mysql-monitor_connect_interval | 120000         |
| mysql-monitor_history          | 600000         |
| mysql-monitor_password         | monitor        |
| mysql-monitor_ping_interval    | 60000          |
| mysql-monitor_username         | monitor        |
+--------------------------------+----------------+
5 rows in set (0.00 sec)

This means that the Monitor module will try to connect to all the backends once every 2 minutes (120000ms) and ping them once every minute (60000ms), storing statistics and logs for 10 minutes (600000ms), using the user/password "monitor"/"monitor".

We never created such a user, so the Monitor module is currently failing. To fix this we can either create the "monitor" user with the right privileges, or reconfigure the Monitor module to use "root".
As an exercise, instead of creating the user we will reconfigure the Monitor module.
This is as easy as updating the relevant records in the global_variables table and issuing LOAD MYSQL VARIABLES TO RUNTIME.
Again, don't forget to persist the changes to disk with SAVE MYSQL VARIABLES TO DISK.

mysql> UPDATE global_variables SET variable_value="root" WHERE variable_name="mysql-monitor_username";
Query OK, 1 row affected (0.00 sec)

mysql> UPDATE global_variables SET variable_value="root" WHERE variable_name="mysql-monitor_password";
Query OK, 1 row affected (0.00 sec)

mysql> LOAD MYSQL VARIABLES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)

mysql> SAVE MYSQL VARIABLES TO DISK;
Query OK, 46 rows affected (0.00 sec)


You can now check the results from the Monitor module in the tables mysql_server_connect_log and mysql_server_ping_log.

mysql> SHOW TABLES FROM monitor;
+----------------------------------+
| tables                           |
+----------------------------------+
| mysql_server_connect             |
| mysql_server_connect_log         |
| mysql_server_ping                |
| mysql_server_ping_log            |
| mysql_server_replication_lag_log |
+----------------------------------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM monitor.mysql_server_ping_log ORDER BY time_start DESC;
+-----------------+------+------------------+-------------------+------------+
| hostname        | port | time_start       | ping_success_time | ping_error |
+-----------------+------+------------------+-------------------+------------+
| 192.168.124.101 | 3306 | 1442153413013369 | 513               | NULL       |
| 192.168.124.102 | 3306 | 1442153413013369 | 400               | NULL       |
| 192.168.124.103 | 3306 | 1442153413013369 | 353               | NULL       |
| 192.168.124.101 | 3306 | 1442153353012568 | 695               | NULL       |
| 192.168.124.102 | 3306 | 1442153353012568 | 580               | NULL       |
| 192.168.124.103 | 3306 | 1442153353012568 | 520               | NULL       |
| 192.168.124.101 | 3306 | 1442153293011283 | 856               | NULL       |
| 192.168.124.102 | 3306 | 1442153293011283 | 683               | NULL       |
| 192.168.124.103 | 3306 | 1442153293011283 | 550               | NULL       |
| 192.168.124.101 | 3306 | 1442153233009628 | 225               | NULL       |
| 192.168.124.102 | 3306 | 1442153233009628 | 347               | NULL       |
| 192.168.124.103 | 3306 | 1442153233009628 | 347               | NULL       |
+-----------------+------+------------------+-------------------+------------+
12 rows in set (0.01 sec)

mysql> SELECT * FROM monitor.mysql_server_connect_log ORDER BY time_start DESC;
+-----------------+------+------------------+----------------------+---------------+
| hostname        | port | time_start       | connect_success_time | connect_error |
+-----------------+------+------------------+----------------------+---------------+
| 192.168.124.101 | 3306 | 1442153412970870 | 1334                 | NULL          |
| 192.168.124.102 | 3306 | 1442153412970870 | 1606                 | NULL          |
| 192.168.124.103 | 3306 | 1442153412970870 | 1555                 | NULL          |
| 192.168.124.101 | 3306 | 1442153292970677 | 1523                 | NULL          |
| 192.168.124.102 | 3306 | 1442153292970677 | 1386                 | NULL          |
| 192.168.124.103 | 3306 | 1442153292970677 | 1283                 | NULL          |
+-----------------+------+------------------+----------------------+---------------+
6 rows in set (0.00 sec)

Note that time_start is not a unix timestamp but a monotonic time expressed in microseconds.
Similarly, "connect_success_time" and "ping_success_time" are expressed in microseconds.

You can run more complex queries against these tables. For example, the average connection time per host is given by this query:

mysql> SELECT hostname,port, AVG(connect_success_time)/1000 avg_ms
    -> FROM monitor.mysql_server_connect_log
    -> WHERE connect_error IS NULL
    -> GROUP BY hostname,port;
+-----------------+------+---------+
| hostname        | port | avg_ms  |
+-----------------+------+---------+
| 192.168.124.101 | 3306 | 1.43025 |
| 192.168.124.102 | 3306 | 1.5365  |
| 192.168.124.103 | 3306 | 1.61875 |
+-----------------+------+---------+
3 rows in set (0.00 sec)



Replication lag monitoring


In the previous paragraph we said that ProxySQL doesn't perform any proactive monitoring. This is true with only one exception so far: replication lag.
In fact, ProxySQL can be configured to monitor replication lag and temporarily shun slaves when they lag beyond a certain threshold.
As noted in other posts, ProxySQL is not replication aware, and therefore it monitors replication lag only for hosts where a maximum replication lag was specified through mysql_servers.max_replication_lag.

To configure replication lag monitoring we need:
a) a monitor user (as specified in mysql-monitor_username) with SUPER or REPLICATION CLIENT privileges (we are already using "root");
b) mysql_servers.max_replication_lag greater than zero;
c) to tune mysql-monitor_replication_lag_interval accordingly: as the name suggests, this is how often replication lag is checked.

Since mysql_servers.max_replication_lag applies only to slaves, we can safely configure it on all hosts, slaves and masters alike (on a master it has no effect).

mysql> SELECT * FROM mysql_servers WHERE max_replication_lag>0;
Empty set (0.00 sec)

mysql> UPDATE mysql_servers SET max_replication_lag=5;
Query OK, 3 rows affected (0.00 sec)

mysql> LOAD MYSQL SERVERS TO RUNTIME;
Query OK, 0 rows affected (0.01 sec)

mysql> SAVE MYSQL SERVERS TO DISK;
Query OK, 0 rows affected (0.00 sec)

mysql> UPDATE global_variables SET variable_value="1500"
    -> WHERE variable_name="mysql-monitor_replication_lag_interval";
Query OK, 1 row affected (0.00 sec)

mysql> SAVE MYSQL VARIABLES TO DISK;
Query OK, 47 rows affected (0.00 sec)

mysql> LOAD MYSQL VARIABLES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)


After this configuration, ProxySQL will monitor replication lag on every host every 1500ms, and if a host's replication lag goes over 5 seconds the host will be temporarily shunned.

Please be very cautious about how you configure this: if misconfigured, it can lead to unexpected behavior.
For example, if all the slaves are lagging and reads go to slaves, proxysql will wait until some slave catches up before executing the query, resulting in increased latency. Or, if the slaves do not catch up within mysql-connect_timeout_server_max milliseconds, proxysql will return an error saying it can't find any host.
A workaround is to also configure the master in the same hostgroup as the slaves, but with a much lower weight: this way, as long as the slaves are up and running, queries are sent to the slaves most of the time and only occasionally to the master, while if the slaves are not available they are sent to the master.
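
For example, assuming the master is 192.168.124.101 and hostgroup 1 is the hostgroup used for reads, as in this tutorial, such a workaround could look like the following sketch (the weight values are purely illustrative):

INSERT INTO mysql_servers (hostgroup_id, hostname, port, weight) VALUES (1, '192.168.124.101', 3306, 1);
UPDATE mysql_servers SET weight=1000 WHERE hostgroup_id=1 AND hostname IN ('192.168.124.102','192.168.124.103');
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;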

We can check the status using the monitor table mysql_server_replication_lag_log:

mysql> SELECT * FROM monitor.mysql_server_replication_lag_log
    -> ORDER BY time_start DESC LIMIT 10;
+-----------------+------+------------------+--------------+----------+-------+
| hostname        | port | time_start       | success_time | repl_lag | error |
+-----------------+------+------------------+--------------+----------+-------+
| 192.168.124.102 | 3306 | 1442156260020471 | 830          | 0        | NULL  |
| 192.168.124.103 | 3306 | 1442156260020471 | 784          | 0        | NULL  |
| 192.168.124.102 | 3306 | 1442156258518415 | 537          | 0        | NULL  |
| 192.168.124.103 | 3306 | 1442156258518415 | 448          | 0        | NULL  |
| 192.168.124.102 | 3306 | 1442156257015337 | 844          | 0        | NULL  |
| 192.168.124.103 | 3306 | 1442156257015337 | 722          | 0        | NULL  |
| 192.168.124.102 | 3306 | 1442156255514011 | 837          | 0        | NULL  |
| 192.168.124.103 | 3306 | 1442156255514011 | 691          | 0        | NULL  |
| 192.168.124.102 | 3306 | 1442156254012744 | 726          | 0        | NULL  |
| 192.168.124.103 | 3306 | 1442156254012744 | 605          | 0        | NULL  |
+-----------------+------+------------------+--------------+----------+-------+
10 rows in set (0.00 sec)

mysql> SELECT hostname,repl_lag,COUNT(*)
    -> FROM monitor.mysql_server_replication_lag_log
    -> WHERE error IS NULL GROUP BY hostname\G
*************************** 1. row ***************************
hostname: 192.168.124.102
repl_lag: 0
COUNT(*): 400
*************************** 2. row ***************************
hostname: 192.168.124.103
repl_lag: 0
COUNT(*): 400
2 rows in set (0.00 sec)

Note that there are 400 entries per host: the check is performed every 1500ms and the log table holds entries for 600s (600 / 1.5 = 400).
As soon as an entry becomes older than 600s it is deleted.



This tutorial has reached its end.
In this tutorial we described:
  • how to install ProxySQL
  • how to perform some basic configuration adding users and backends
  • the concept of hostgroups
  • how to configure query routing and read/write split
  • how to retrieve some of the many information and statistics available from within ProxySQL
  • how to configure global variables
  • how to configure the Monitor module
  • how to monitor replication lag
Please stay tuned for the next tutorial, titled "seamless replication switchover in less than one second".




Wednesday, June 3, 2015

SQL Load Balancing Benchmark - Comparing Performance of ProxySQL vs MaxScale

In the MySQL ecosystem there are few load balancers that are also open source, and ProxySQL is one of the few proxies that work at the application layer and are therefore SQL aware.
In this blog post we will benchmark ProxySQL against MaxScale, another popular proxy for MySQL.
The idea to compare ProxySQL vs MaxScale came after reading an interesting blog post by Krzysztof Książek on SQL Load Balancing Benchmark, comparing the performance of MaxScale vs HAProxy.

Disclaimer: ProxySQL is not GA yet, therefore please do not use it in production.



Sysbench setup

I wanted to build a sysbench setup similar to the one Krzysztof used in his benchmark, but mine is slightly different:
a) instead of using a MySQL cluster with Galera, I set up a cluster with 1 master and 3 slaves. Since the workload was meant to be completely read-only and in-memory, the two setups are functionally identical;
b) instead of using AWS instances I used 4 physical servers: server A was running as the master and servers B, C and D were running as slaves. Since the master was idle (remember, this is a read-only workload that uses only the slaves), I used the same box to also run sysbench and all the various proxies.

Benchmarks were executed running the following command:
./sysbench \
--test=./tests/db/oltp.lua \
--num-threads=$THREADS \
--max-requests=0 \
--max-time=600 \
--mysql-user=rcannao \
--mysql-password=rcannao \
--mysql-db=test \
--db-driver=mysql \
--oltp-tables-count=128 \
--oltp-read-only=on \
--oltp-skip-trx=on  \
--report-interval=1 \
--oltp-point-selects=100 \
--oltp-table-size=400000 \
--mysql-host=127.0.0.1 \
--mysql-port=$PORT \
run

The versions used are:
Percona Server 5.6.22
sysbench 0.5
ProxySQL at commit a47136e with debugging disabled
MaxScale 1.0.5 GA
HAProxy 1.4.15

ProxySQL and MaxScale: a few design differences


In the benchmark executed by Krzysztof, MaxScale was configured to listen on port 4006 where the service "RW Split Router" was running, and on port 4008 where the service "Read Connection Router" was running.
To my understanding:
a) RW Split Router performs read/write split, parsing the queries and tracking the state of the transaction;
b) Read Connection Router performs a simple network forwarding, connecting clients to backends;
c) the two services, to operate, need to listen on different ports.

ProxySQL is, by design, different.

ProxySQL and RW split


ProxySQL performs a very simple query analysis to determine where each query needs to be sent.
ProxySQL decides where a query needs to be forwarded based on a user-configurable chain of rules, where a DBA can specify various matching criteria like username, schemaname, whether there is an active transaction (feature not completely implemented), and a regular expression to match the query.
Matching against a regular expression provides better speed than building a syntax tree, and having a chain of rules that match with either regex or other attributes allows a great degree of flexibility compared to hardcoded routing policies.
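
For example, a single rule can combine several of these attributes, such as routing to the slaves only the SELECTs issued by a given application user; a minimal sketch against the mysql_query_rules table (the username "app_ro" and the destination hostgroup 1 are illustrative):

INSERT INTO mysql_query_rules (active, username, match_pattern, destination_hostgroup, apply) VALUES (1, 'app_ro', '^SELECT', 1, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
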
Therefore, to implement a basic read/write split, ProxySQL was configured so that:
a) all the queries matching '^SELECT.*FOR UPDATE$' were sent to the master;
b) all the queries not matching the previous rule but matching '^SELECT.*' were sent to the slaves;
c) by default, all traffic not matching any of the previous rules was sent to the master.

Considering the 3 rules listed above, and that the sysbench workload is read-only, all traffic generated by sysbench was always sent to the slaves.

Additionally, even though ProxySQL doesn't perform any syntax parsing to determine the target of a query, no matter what routing rules are in place it also performs a very simple query analysis to determine what type of statement is being executed and generates statistics based on that. That is, ProxySQL counts the types of statements it is executing, and this information is accessible through ProxySQL itself.
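
These per-statement-type counters are exposed through the stats_mysql_commands_counters table; a minimal sketch of how they can be read (the exact column names may differ slightly between versions):

SELECT Command, Total_cnt, Total_Time_us FROM stats_mysql_commands_counters ORDER BY Total_cnt DESC;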

As already pointed out in previous articles, one of the main ideas behind ProxySQL is that the DBA is the one controlling and defining query routing rules. This makes routing completely transparent to the developers, eliminates the politics of DBAs depending on developers to tweak the setup, and therefore speeds up iteration.

ProxySQL and Fast Forwarding

I think that the way MaxScale implements different modules listening on different ports is a very interesting approach, yet it forces the developers to implement some sort of read/write split in the application: connect to port 4006 if you want R/W split, or port 4008 if you want read-only load balancing.
My aim with ProxySQL is that the application should have a single connection point, ProxySQL, and the proxy should determine what to do with the incoming requests. In other words, the application should just connect to ProxySQL, and ProxySQL takes care of the rest according to its configuration.
To do so, ProxySQL always authenticates the client before applying any rule. Therefore I thought that a quick feature to implement was Fast Forwarding based on username: when a specific user connects, all its requests are forwarded to the backends without any query processing or connection pooling.
In other words, ProxySQL's Fast Forwarding is a concept similar to MaxScale's Read Connection Router, but it uses the same port as the R/W split module, and the matching criterion is the client's username instead of the listener port.
Note that ProxySQL already supports multiple listeners, but the same rules apply to all ports; future versions will support matching criteria based on the listener's port as well, behaving in a way similar to MaxScale, and will also add additional matching criteria such as the source of the connection.
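
In current ProxySQL versions, fast forwarding is enabled per user through the fast_forward column of mysql_users; a minimal sketch, where "uff" is a hypothetical account dedicated to fast forwarding (this column may not be present in the early build benchmarked here):

INSERT INTO mysql_users (username, password, default_hostgroup, fast_forward) VALUES ('uff', 'pff', 0, 1);
LOAD MYSQL USERS TO RUNTIME;
SAVE MYSQL USERS TO DISK;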



Performance benchmarks


As said previously, on the same host where sysbench was running I also configured ProxySQL, MaxScale and HAProxy.
In the blog post published by Severalnines, one of the comments states that MaxScale was very slow with few connections on physical hardware.
Therefore, the first benchmark I wanted to run was exactly at a low number of connections, progressively increasing it.
ProxySQL and MaxScale were both configured with just 1 worker thread, and HAProxy was configured with only 1 process.

Please note that in the following benchmarks worker threads and connections are two completely different entities:
1) a connection is defined as a client connection;
2) a worker thread is a thread inside the proxy, either ProxySQL, MaxScale or HAProxy (even if HAProxy uses processes and not threads).
What could cause confusion is the fact that in sysbench a thread is a connection: from the proxy's perspective, it is just a connection.

Benchmark with 1 worker thread




Tagline:
maxscale rw = MaxScale with RW Split Router
maxscale rr = MaxScale with Read Connection Router
proxysql rw = ProxySQL with query routing enabled
proxysql ff = ProxySQL with fast forwarding enabled

Average throughput in QPS:

Connections | HAProxy  | MaxScale RW | MaxScale RR | ProxySQL RW | ProxySQL FF
          1 |  3703.36 |      709.99 |      722.27 |     3534.92 |     3676.04
          4 | 14506.45 |      2815.7 |     2926.44 |    13125.67 |    14275.66
          8 | 26628.44 |     5690.22 |     5833.77 |    23000.98 |    24514.94
         32 | 54570.26 |    14722.97 |    22969.73 |    41072.51 |    51998.35
        256 | 53715.79 |    13902.92 |    42227.46 |    45348.59 |    58210.93

In the above graphs we can easily spot that:
a) indeed, MaxScale performance is very low when running with just a few connections (more details below);
b) for any proxy, performance becomes quite unstable when the number of connections increases;
c) proxysql-ff is very close to the performance of haproxy;
d) with only 1 or 4 client connections, ProxySQL provides 5 times more throughput than MaxScale in both modules; with 8 client connections ProxySQL provides 4 times more throughput than MaxScale in R/W split, and 4.3 times more in fast forward mode;
e) at 32 client connections, proxysql-rw provides 2.8x more throughput than maxscale-rw, and proxysql-ff provides 2.3x more than maxscale-rr;
f) four proxy configurations (haproxy, maxscale-rw, proxysql-rw, proxysql-ff) behave similarly at 32 and 256 client connections, while maxscale-rr almost doubles its throughput at 256 connections vs 32 connections: in other words, when the number of connections is high some bottleneck is removed.

Below are also the graphs of average throughput, and of average and 95th percentile response time, at a low number of connections.





Fortunately, I have access to physical hardware (not AWS instances) and I was able to reproduce the issue reported in that comment: MaxScale seems to be very slow when running with just a few connections.
For comparison, though, I tried a simple benchmark on AWS and found that MaxScale doesn't behave as badly there as on a physical server.
After these interesting results, I tried running the same benchmark connecting to MaxScale and ProxySQL not through TCP but through a Unix Domain Socket, with further interesting results.
Unfortunately, I didn't have a version of HAProxy that accepted connections via UDS, so I ran the benchmark against HAProxy using TCP connections.

Average throughput in QPS:
Connections | HAProxy  | MaxScale RW | MaxScale RR | ProxySQL RW | ProxySQL FF
          1 |  3703.36 |     3276.85 |     3771.15 |     3716.19 |     3825.81
          4 | 14506.45 |    11780.27 |    14807.45 |    13333.03 |    14729.59
          8 | 26628.44 |    15203.93 |    27068.81 |    24504.42 |    25538.57
         32 | 54570.26 |    16370.69 |    44711.25 |    46846.04 |    58016.03
        256 | 53715.79 |    14689.73 |    45108.54 |    54229.29 |    71981.32

In the above graphs we can easily spot that:
a) MaxScale is no longer slow when running with just a few connections: the performance bottleneck at a low number of connections is not present when using UDS instead of TCP;
b) again, for any proxy, performance becomes quite unstable when the number of connections increases;
c) maxscale-rw is the slowest configuration at any number of connections;
d) with an increased number of client connections, MaxScale reaches its limits with an average of 16.4k reads/s peaking at 32 connections for maxscale-rw, and an average of 45.1k reads/s peaking at 256 connections for maxscale-rr;
e) with an increased number of client connections, ProxySQL reaches its limits with an average of 54.2k reads/s peaking at 256 connections for proxysql-rw, and an average of 72.0k reads/s peaking at 256 connections for proxysql-ff.

As already noted, with an increased number of connections performance becomes quite unstable, although it is easy to spot that:
1) in R/W split mode, ProxySQL can reach a throughput over 3 times higher than MaxScale;
2) ProxySQL in Fast Forward mode can reach a throughput 33% higher than MaxScale in Read Connection Router mode;
3) ProxySQL in R/W split mode is faster than MaxScale in simple Read Connection Router mode.

The above shows that, while MaxScale's readconnroute module has low latency, neither of the two MaxScale modules scales very well. The bottleneck seems to be that MaxScale uses a lot of CPU, as already pointed out by Krzysztof in his blog post, so it quickly saturates its CPU resources without being able to scale.

Of course, it is possible to scale by adding more threads: more results below!


MaxScale and TCP

At this stage I knew that, on physical hardware:
- ProxySQL was running well when clients were connecting via TCP or UDS at any number of connections;
- MaxScale was running well when clients were connecting via UDS at any number of connections;
- MaxScale was running well when clients were connecting via TCP with a high number of connections;
- MaxScale was not running well when clients were connecting via TCP with a low number of connections.

My experience with network programming quickly pointed me to where the bottleneck could be.
This search returns no results:
https://github.com/mariadb-corporation/MaxScale/search?utf8=%E2%9C%93&q=TCP_NODELAY

In other words, MaxScale never disables Nagle's algorithm, which adds latency to any communication with the client. The problem is noticeable only at a low number of connections because, at a high number of connections, the latency introduced by Nagle's algorithm becomes small compared to the overall latency caused by processing multiple clients. For reference:
http://en.wikipedia.org/wiki/Nagle%27s_algorithm

I will also soon open a bug report against MaxScale.

What I can't understand, and I would appreciate it if someone else could comment on this, is why Nagle's algorithm doesn't seem to have any effect on AWS or other virtualization environments.
In any case, this is a very interesting example of how software behaves differently on physical hardware and in virtualized environments.

Because MaxScale performs, on average, 5x more slowly at a low number of connections via TCP, the following graphs only use UDS for ProxySQL and MaxScale: the performance of MaxScale over TCP was too low to be considered.






Benchmark with 2 worker threads

Because MaxScale performs really badly at a low number of connections via TCP due to Nagle's algorithm on physical hardware, I decided to run all the next benchmarks connecting to MaxScale and ProxySQL only through UDS. HAProxy will still be used for comparison, even if its connections go through TCP sockets.
I know it is not fair to compare connections via TCP (HAProxy) against connections via UDS (ProxySQL and MaxScale), but HAProxy is used only as a reference.

Average throughput in QPS:
Connections | HAProxy   | MaxScale RW | MaxScale RR | ProxySQL RW | ProxySQL FF
          4 |  14549.61 |    11627.16 |    14185.88 |    13697.03 |    14795.74
          8 |  27492.41 |    21865.39 |    27863.94 |    25540.61 |     27747.1
         32 |  81301.32 |    29602.84 |    63553.77 |    62350.89 |    77449.45
        256 | 109867.66 |     28329.8 |    73751.24 |    81663.75 |   125717.18
        512 | 105999.84 |     26696.6 |    69488.71 |    81734.18 |   128512.32
       1024 | 103654.97 |    27340.47 |    63446.61 |    74747.25 |   118992.24


Notes with 2 worker threads (for MaxScale and ProxySQL) or 2 worker processes (HAProxy):
a) once again, for any proxy, performance becomes quite unstable when the number of connections increases. Perhaps this is not a bug in the proxies, but a result of how the kernel schedules processes;
b) up to 32 client connections, performance with 2 workers is very similar to performance with 1 worker, no matter the proxy. Each proxy configuration has its own performance profile, but it performs the same with either 1 or 2 workers;
c) maxscale-rw reaches its average peak at 32 connections, with 29.6k reads/s;
d) maxscale-rr reaches its average peak at 256 connections, with 73.8k reads/s;
e) proxysql-rw reaches its average peak at 512 connections, with 81.7k reads/s;
f) proxysql-ff reaches its average peak at 512 connections, with 128.5k reads/s.

As already noted, with an increased number of connections performance becomes quite unstable, but as in the single-worker workload it is easy to spot that:
1) in R/W split mode, ProxySQL can reach a throughput nearly 3 times higher than MaxScale;
2) ProxySQL in Fast Forward mode can reach a throughput 74% higher than MaxScale in Read Connection Router mode;
3) ProxySQL in R/W split mode is faster than MaxScale in simple Read Connection Router mode.

The above points confirm what was said previously: ProxySQL uses fewer CPU resources, and it is therefore able to scale much better than MaxScale as the number of client connections increases.




Benchmark with 4 worker threads

I ran more benchmarks using 4 worker threads for ProxySQL and MaxScale, and 4 worker processes for HAProxy.

Average throughput in QPS:
Connections | HAProxy   | MaxScale RW | MaxScale RR | ProxySQL RW | ProxySQL FF
         16 |  50258.21 |     41939.8 |    50621.74 |    46265.65 |    51280.99
         32 |  89501.33 |    50339.81 |    87192.58 |    70321.17 |    85846.94
        256 | 174666.09 |     52294.7 |    117709.3 |    115056.5 |    183602.6
        512 | 176398.33 |    46777.17 |   114743.73 |   112982.78 |   188264.03
       2048 | 157304.08 |           0 |   107052.01 |   102456.38 |   187906.29


What happens with 4 worker threads/processes?
a) as with 1 or 2 workers, for any proxy, performance becomes quite unstable when the number of connections increases, but this time the fluctuations seem smoother. ProxySQL looks like the most stable proxy at a high number of connections;
b) at 32 connections, ProxySQL and HAProxy give similar throughput with either 2 or 4 workers;
c) at 32 connections, MaxScale provides more throughput with 4 workers than with 2, showing that MaxScale needs more CPU power to provide better throughput;
d) at 32 connections, HAProxy, ProxySQL and MaxScale provide similar reads/s if they do not analyze traffic (89.5k, 85.8k and 87.2k);
e) using R/W functionality, at 16 connections ProxySQL provides 10% more reads/s than MaxScale (46.3k vs 41.9k), and at 32 connections ProxySQL provides 40% more reads/s than MaxScale (70.3k vs 50.3k);
f) MaxScale in R/W mode wasn't able to handle 2048 client connections;
g) maxscale-rw reaches its average peak at 256 connections, with 52.3k reads/s;
h) maxscale-rr reaches its average peak at 256 connections, with 117.7k reads/s;
i) proxysql-rw reaches its average peak at 256 connections, with 115.1k reads/s;
j) proxysql-ff reaches its average peak at 512 connections, with 188.3k reads/s.
 
A few more notes on scalability with 4 threads:
1) in R/W split mode, ProxySQL can reach a throughput over 2 times higher than MaxScale;
2) ProxySQL in Fast Forward mode can reach a throughput 60% higher than MaxScale in Read Connection Router mode;
3) ProxySQL in R/W split mode is, for the first time, slightly slower than MaxScale in simple Read Connection Router mode (115.1k vs 117.7k).


Note on transport layer load balancing

I consider only the benchmarks related to R/W split truly important, because only R/W split provides SQL load balancing; HAProxy, ProxySQL in fast forward mode and MaxScale with the readconnroute module do not provide SQL load balancing, but they are present in the benchmarks above to give a reference for the overhead caused by processing SQL traffic.
Furthermore, the performance of MaxScale's readconnroute cannot really be compared with the performance of HAProxy or ProxySQL. From a user's perspective, I would prefer HAProxy because it provides far better performance.


Conclusions

One of the main goals while developing ProxySQL is that it must be a very fast proxy that introduces almost no latency. This goal seems to be well achieved: ProxySQL is able to process MySQL traffic with very little overhead, and it is able to scale very well.
In all the benchmarks listed above ProxySQL scales easily.
In fact, in R/W split mode (highly configurable in ProxySQL, but hardcoded in MaxScale), ProxySQL is able to provide up to 5 times more throughput than MaxScale, depending on the workload.

Since ProxySQL in query processing mode (R/W split) provides more throughput than MaxScale's readconnroute in the majority of cases, I would always use ProxySQL's query processing, which implements important features like query routing, query rewrite, query caching, statistics, connection pooling, etc.
As of today, the only reason why I wouldn't use ProxySQL in production is that ProxySQL is not GA ... yet!