Sunday, October 27, 2013

Introducing ProxySQL : High Performance Proxy for MySQL

I concluded my previous post stating that what is really missing is a proxy that:
- is stable as HAProxy
- scales like MySQL Proxy
- is rich of features


That is what is driving my development for ProxySQL : a high performance proxy for MySQL.



Some background first.
In the past I have worked with customers that, after providing them a detailed SQL review on how to improve performance rewriting queries, answer with the usual "we can't modify the queries" , and the most common reasons for this are, among others:
- the queries are generated by ORM ;
- they don't own the application ;
- they don't have the time to dig the code and rewrite the queries .
It is a quite spread false believe that adding indexes and tuning MySQL can magically improve performance of badly written queries, but the truth is far from that.


I am pretty confident that any DBA was in the situation of seeing a perfectly tuned MySQL server dying under the load of poorly written SQL statements, or under the load of a high number of redundant queries sent by applications that do not implement any sort of caching.
And often this lead to downtime!
What makes it even worse is that you, as DBA, you have identified the bad queries and you are telling the Devs exactly how they need to rewrite them, but all your work and effort become worthless if Devs don't perform the rewrite. This may either never happen, or happen after few hours of downtime.
Why a DBA should depends from a Dev for this sort of scenarios? The DBA should be able to reconfigure the system with the new rewritten queries. The DBA should have more power over the RDBMS, power that right now is in the hands of developers.
So the idea of the proxy: when a DBA find a broken query, in case of emergency (s)he should fix it immediately on the proxy, without waiting any Dev.
MySQL Proxy can do that, but its current implementation is still in alpha and far from being production ready (see my previous post). I looked for alternatives for a while, but after some time I combined the need of a new proxy with some study of the MySQL protocol, and this leaded to a new project: ProxySQL.


What started as a “weekends project” being a prototype that was simply analyzing mysql protocol packets and rewriting queries using some basic regex pattern/replace, has slowly evolved into a more functional proxy. And ironically, the rewrite functionally is not implemented yet!




ProxySQL is still in alpha phase, and lots of features are not implemented yet, but before adding new features I want to ensure it is fast enough to not affect performance.
In my previous post I compared performance of MySQL vs MySQL Proxy vs HAProxy , using both sysbench and mysqlslap.
Now, it is time to add ProxySQL in the same graphs.

sysbench:
# for i in 1 2 4 8 16 32 64 128 ; do sysbench --num-threads=$i --max-requests=0 --max-time=60 --test=oltp --oltp-table-size=100000 --mysql-user=root --mysql-password=pass --mysql-host=127.0.0.1 --mysql-db=test --db-ps-mode=disable --mysql-port=6033 --oltp-read-only run | grep 'transactions:' ; done
    transactions:                        16357  (272.60 per sec.)
    transactions:                        31678  (527.95 per sec.)
    transactions:                        63219  (1053.60 per sec.)
    transactions:                        117977 (1966.17 per sec.)
    transactions:                        192814 (3213.33 per sec.)
    transactions:                        241416 (4023.18 per sec.)
    transactions:                        239564 (3992.18 per sec.)
    transactions:                        237762 (3961.61 per sec.)




mysqlslap:
# mysqlslap --create-schema=test -u root -ppass -h 127.0.0.1 -P6033 -c 1,2,4,8,16,32,64,128 -q select1.sql  | grep "Average number of seconds to run all queries"
Warning: Using a password on the command line interface can be insecure.
        Average number of seconds to run all queries: 8.189 seconds
        Average number of seconds to run all queries: 8.432 seconds
        Average number of seconds to run all queries: 8.716 seconds
        Average number of seconds to run all queries: 10.256 seconds
        Average number of seconds to run all queries: 11.808 seconds
        Average number of seconds to run all queries: 18.809 seconds
        Average number of seconds to run all queries: 34.577 seconds
        Average number of seconds to run all queries: 69.186 seconds





The above look very promising!
The performance of ProxySQL outnumbers the performance of both MySQL Proxy and HAProxy!!

One of the first features I wanted to introduce in ProxySQL is caching.
This features is not available in HAProxy and it is not available in MySQL Proxy, and I am pretty confident that a good implementation can't be developed with Lua scripting.
In future, users of ProxySQL (either DBAs or Devs) will have the opportunity to decide what to cache and what not, for how long and how to invalidate it, but for now the current implementation is simple as that: every SELECT statement that is not a SELECT FOR UPDATE is cached for a configurable amount of time (30 seconds by default). From an application point of view, it is like caching mysql resultsets in memcached (or other caching solutions), or reading from slaves that are lagging up to 30 seconds.

In the follow graphs I rerun the same benchmark, but with caching enabled.

sysbench:
 
# for i in 1 2 4 8 16 32 64 128 ; do sysbench --num-threads=$i --max-requests=0 --max-time=60 --test=oltp --oltp-table-size=100000 --mysql-user=root --mysql-password=pass --mysql-host=127.0.0.1 --mysql-db=test --db-ps-mode=disable --mysql-port=6033 --oltp-read-only run | grep 'transactions:' ; done
    transactions:                        43855  (730.91 per sec.)
    transactions:                        100656 (1677.58 per sec.)
    transactions:                        204358 (3405.90 per sec.)
    transactions:                        404678 (6744.51 per sec.)
    transactions:                        717239 (11953.78 per sec.)
    transactions:                        960561 (16008.91 per sec.)
    transactions:                        947388 (15788.89 per sec.)
    transactions:                        932289 (15536.14 per sec.)





mysqlslap:

# mysqlslap --create-schema=test -u root -ppass -h 127.0.0.1 -P6033 -c 1,2,4,8,16,32,64,128 -q select1.sql  | grep "Average number of seconds to run all queries"
Warning: Using a password on the command line interface can be insecure.
        Average number of seconds to run all queries: 3.106 seconds
        Average number of seconds to run all queries: 3.300 seconds
        Average number of seconds to run all queries: 3.569 seconds
        Average number of seconds to run all queries: 4.197 seconds
        Average number of seconds to run all queries: 5.028 seconds
        Average number of seconds to run all queries: 7.188 seconds
        Average number of seconds to run all queries: 14.405 seconds
        Average number of seconds to run all queries: 28.475 seconds



With cache enabled, performance of ProxySQL are way better than performance against MySQL directly!
I expect to question "what about MySQL with query cache?" .
The performance of the cache of ProxySQL are superior than the performance of query cache in MySQL, and this is easy to demonstrate without running any benchmark: MySQL query cache cannot cache statements like "SELECT 111" used in the mysqlslap benchmark, thus is of no use and the number of QPS of mysqlslap against MySQL cannot grow.


Being the above results really interesting and promising, I have good reasons to continue the development of ProxySQL , adding features, improving performance and stability, and fixing bugs.



Tuesday, October 15, 2013

Simple benchmark on MySQL Proxy vs HAProxy

A Proxy is a software that acts as an intermediary between clients seeking resources from servers.

In the MySQL world, the most common proxies are:
- MySQL Proxy
- HAProxy

While the first one is able to understand the MySQL protocol and provides a lot of functionality, it is often not used in production as tagged as very slow, unstable and CPU hungry.
HAProxy is widely used with MySQL setups as it is said to be very efficient in term of performance and very stable, although it acts as a Layer 7 proxy and doesn’t recognize the MySQL protocol.

The adjectives slow and efficient are meaningless words if not supported by numbers and fact, so let’s try to make some real comparison.



We are going to tests performance of mysqld vs MySQL Proxy vs HAProxy running 2 very simple benchmark :
1) sysbench OLTP read-only;
2) mysqlslap


Benchmarks were performed on a box with two processors Intel(R) Xeon(R) CPU E5-2620 ( 12 cores and 24 threads total ) , running Debian 7.1 , Linux 3.2 and MySQL 5.6.11 .

sysbench benchmark


sysbench table was created as follow:

 sysbench --test=oltp --oltp-table-size=100000 --mysql-user=root --mysql-password=pass --mysql-host=127.0.0.1 --mysql-db=test --db-ps-mode=disable --mysql-port=3306 prepare  
 sysbench 0.4.12: multi-threaded system evaluation benchmark  
 No DB drivers specified, using mysql  
 Creating table 'sbtest'...  
 Creating 100000 records in table 'sbtest'...  



Let's test the performance of mysqld without any proxy:


# for i in 1 2 4 8 16 32 64 128 ; do sysbench --num-threads=$i --max-requests=0 --max-time=60 --test=oltp --oltp-table-size=100000 --mysql-user=root --mysql-password=pass --mysql-host=127.0.0.1 --mysql-db=test --db-ps-mode=disable --mysql-port=3306 --oltp-read-only run | grep 'transactions:' ; done
    transactions:                        13758  (229.29 per sec.)
    transactions:                        27387  (456.42 per sec.)
    transactions:                        73474  (1224.52 per sec.)
    transactions:                        144594 (2409.78 per sec.)
    transactions:                        231488 (3857.90 per sec.)
    transactions:                        282317 (4704.87 per sec.)
    transactions:                        280961 (4682.07 per sec.)
    transactions:                        282064 (4700.08 per sec.)




Now it is time to try MySQL Proxy.


mysql-proxy was started as follows:

 # mysql-proxy --admin-username=root --admin-password=pass --admin-lua-script=/usr/share/mysql-proxy/admin.lua


# for i in 1 2 4 8 16 32 64 128 ; do sysbench --num-threads=$i --max-requests=0 --max-time=60 --test=oltp --oltp-table-size=100000 --mysql-user=root --mysql-password=pass --mysql-host=127.0.0.1 --mysql-db=test --db-ps-mode=disable --mysql-port=4040 --oltp-read-only run | grep 'transactions:' ; done
    transactions:                        15874  (264.56 per sec.)
    transactions:                        30653  (510.85 per sec.)
    transactions:                        56029  (933.78 per sec.)
    transactions:                        60014  (1000.15 per sec.)
    transactions:                        65829  (1096.99 per sec.)
    transactions:                        62449  (1040.50 per sec.)
    transactions:                        56317  (938.00 per sec.)
    transactions:                        51712  (859.88 per sec.)





Now it is time to try HAProxy.

The configuration file is very simple:

 # cat haproxy.cnf  
 global  
     log 127.0.0.1  local0  
     log 127.0.0.1  local1 notice  
     maxconn 4096  
     uid 99  
     gid 99  
     daemon  
     #debug  
     #quiet  
 defaults  
     log   global  
     mode  http  
     option tcplog  
     option dontlognull  
     retries 3  
     maxconn 2000  
     contimeout   5000  
     clitimeout   50000  
     srvtimeout   50000  
 listen mysql-cluster 0.0.0.0:9000  
   mode tcp  
   balance roundrobin  
   option mysql-check  
   server db01 127.0.0.1:3306 check  


HAProxy was executed as follow:

 # ./haproxy -f haproxy.cnf -db  


And sysbench:


# for i in 1 2 4 8 16 32 64 128 ; do sysbench --num-threads=$i --max-requests=0 --max-time=60 --test=oltp --oltp-table-size=100000 --mysql-user=root --mysql-password=pass --mysql-host=127.0.0.1 --mysql-db=test --db-ps-mode=disable --mysql-port=9000 --oltp-read-only run | grep 'transactions:' ; done
    transactions:                        18735  (312.24 per sec.)
    transactions:                        37342  (622.35 per sec.)
    transactions:                        70616  (1176.87 per sec.)
    transactions:                        70876  (1181.16 per sec.)
    transactions:                        83456  (1390.73 per sec.)
    transactions:                        87357  (1455.58 per sec.)
    transactions:                        66665  (1110.46 per sec.)
    transactions:                        63809  (1062.12 per sec.)



Indeed, HAProxy seems faster than MySQL Proxy .
So far, the statement "MySQL Proxy is slow" seems correct.
 But mysql-proxy has a very interesting option: --event-threads

Running again MySQL Proxy with 8 event threads:

 # mysql-proxy --admin-username=root --admin-password=pass --admin-lua-script=/usr/share/mysql-proxy/admin.lua --event-threads=8 



# for i in 1 2 4 8 16 32 64 128 ; do sysbench --num-threads=$i --max-requests=0 --max-time=60 --test=oltp --oltp-table-size=100000 --mysql-user=root --mysql-password=pass --mysql-host=127.0.0.1 --mysql-db=test --db-ps-mode=disable --mysql-port=4040 --oltp-read-only run | grep 'transactions:' ; done
    transactions:                        16219  (270.31 per sec.)
    transactions:                        26256  (437.58 per sec.)
    transactions:                        46753  (779.16 per sec.)
    transactions:                        83624  (1393.64 per sec.)
    transactions:                        118784 (1979.54 per sec.)
    transactions:                        125283 (2087.63 per sec.)
    transactions:                        127235 (2119.87 per sec.)
    transactions:                        126636 (2109.21 per sec.)


Now, no matter the bad reputation , MySQL Proxy with 8 event-threads seems way faster than HAProxy.

The reason behind that is that HAProxy is not multi-threaded (so is MySQL Proxy by default). Adding more threads can increase throughput.

Maybe who claims that HAProxy is faster than MySQL Proxy is not completely right.


After I completed this sysbench test I stopped haproxy and mysql-proxy pressing ctrl-c on the terminal, and ...

 # mysql-proxy --admin-username=root --admin-password=pass --admin-lua-script=/usr/share/mysql-proxy/admin.lua --event-threads=8  
 ^CSegmentation fault  

... mysql-proxy crashed while closing. I can't consider that stable.



Some readers would have notice that with 1-2 threads the throughput of MySQL without proxies is lower than with proxies. I tried it over and over, and I always get the same behaviour: can't explain that.


In image:



mysqlslap benchmark


With mysqlslap I want to just check the latency.
To do so I created an SQL file with 100k simple SELECT statements:

 # for i in `seq 1 100000` ; do echo "SELECT 1111;" ; done > select1.sql  


Running mysqlslap against mysqld:

 # mysqlslap --create-schema=test -u root -ppass -h 127.0.0.1 -P3306 -c 1,2,4,8,16,32,64,128 -q select1.sql  | grep "Average number of seconds to run all queries"
Warning: Using a password on the command line interface can be insecure.
        Average number of seconds to run all queries: 4.719 seconds
        Average number of seconds to run all queries: 5.090 seconds
        Average number of seconds to run all queries: 5.630 seconds
        Average number of seconds to run all queries: 6.285 seconds
        Average number of seconds to run all queries: 7.514 seconds
        Average number of seconds to run all queries: 11.273 seconds
        Average number of seconds to run all queries: 21.714 seconds
        Average number of seconds to run all queries: 43.227 seconds

Running mysqlslap against HAProxy:

 # mysqlslap --create-schema=test -u root -ppass -h 127.0.0.1 -P9000 -c 1,2,4,8,16,32,64,128 -q select1.sql  | grep "Average number of seconds to run all queries"
Warning: Using a password on the command line interface can be insecure.
        Average number of seconds to run all queries: 6.317 seconds
        Average number of seconds to run all queries: 6.537 seconds
        Average number of seconds to run all queries: 7.205 seconds
        Average number of seconds to run all queries: 13.875 seconds
        Average number of seconds to run all queries: 25.288 seconds
        Average number of seconds to run all queries: 50.841 seconds
        Average number of seconds to run all queries: 107.516 seconds
        Average number of seconds to run all queries: 210.910 seconds

Running mysqlslap against MySQL Proxy:

 # mysqlslap --create-schema=test -u root -ppass -h 127.0.0.1 -P4040 -c 1,2,4,8,16,32,64,128 -q select1.sql  | grep "Average number of seconds to run all queries"
Warning: Using a password on the command line interface can be insecure.
mysqlslap: Error when connecting to server: Bad handshake


So disappointing, it doesn't even connect with MySQL Proxy!
Surely, not something to try on production!


Graphing the number of queries per second:



Conclusions

From these simple benchmarks we can conclude that:
  • HAProxy is fast with few connections, but it has scalability issues;
  • MySQL Proxy has a reasonable throughput and latency , and it scales better than HAProxy;
  • MySQL Proxy is not stable at all
What is really missing is a proxy that:
- is stable as HAProxy
- scales like MySQL Proxy
- is rich of features