« MySQL のボトルネックを統計的に監視・解析する方法 | Main | PostgreSQL のボトルネックを統計的に監視・解析する方法 »

July 22, 2009

Using a statistical approach to analyze / monitor MySQL bottleneck queries

What comes up to your mind when you hear the phrase "MySQL performance tuning?"  Most of the discussions covering the topic are things like server parameter tuning and blog entries describing the use of the "EXPLAIN" command.  However, it is too time-consuming to check every single SQL query.  A better approach is at first to determine the SQL queries that are actually consuming the server resources, and then to optimize those queries.

So the question is how to find out the bottleneck queries.  At MySQL Conference & Expo 2009, Mark Callaghan explained in his keynote that Google was taking a statistical approach using the "SHOW PROCESSLIST" as a solution.  The same command is mentioned as a rival approach in the webpages of MySQL Query Analyzer as well.  Today, having a need for something alike, I wrote a set of tiny script files under the name "mprofile" that does the same thing.  It can be used as follows.

Sampling SQL queries

Use mpdump to sample running queries from mysqld periodically.  The default setting is to fetch 1,000 samples in 0.1 second interval  (in other words, it would take a little more than 100 seconds to execute).  Please use the --help option to find out the way to specify the number of samples, the interval, and connection parameters to mysqld can be specified as command line parameters.

% mpdump > mpdump.txt

Merging the queries and generating the report

Use mpfilter command to rewrite the variables in the sampled queries, then feed the result into mpreport command to sort the queries from the heaviest one to the lighter.  Each line of the report is in the format "probability:SQL query," where probability is the average percentage of if the SQL query is executed.  The example below is taken from the mysqld of Pathtraq.  It is obvious that which queries are consuming the resources (and that the heaviest query is running at the probability of 138.3%: it is run by 1.383 threads by average).

% mpfilter < mpdump.txt | mpreport | head -100
138.300:select pt_sum(hits,?) as cnt from url inner join counter on url.id=counter.url where (pt_url_compress(?)<=rurlc and rurlc<pt_url_compress(?,?))
99.100:insert into tt select t1.url,cnt_mixed&? as cnt,********** where url.status_code=?
94.200:select id,pt_url_decompress(rurlc) as rurl from url where crawl_cnt<? and crawl_at>=? and crawl_at<?-?*pow(?,crawl_cnt) limit ?
22.000:select id,status_code from url where rurlc=pt_url_compress(?)
14.300:select id,pt_url_decompress(rurlc) as rurl,cnt from (select url,cnt from counter_by_day **********
9.300:insert into counter (url,referer,hits) values (?,?,pt_cnt_add(null,?,?)) on duplicate key update hits=pt_cnt_add(hits,?,?)
7.400:insert ignore into **********
6.100:select pt_sum(hits,?) as cnt from counter where url=?
5.900:select (crawl_at<? and -crawl_at<? and pt_sum(hits,floor(-crawl_at/?))>?) as recrawl from url inner join counter on url.id=counter.url where id=?
5.700:insert into **********
5.700:select t1.id,pt_url_decompress(rurlc) as rurl,cnt,score from **********
4.600:select id,name,dispname from category_ja order by disporder,id
4.300:select addr from **********
3.200:update url set crawl_at=?,lang=?,status_code=if(status_code=?,?,?) where i
d=?
(snip)

By using the report, we can decide from which query to start optimizing, by using explain, or by letting their responses be cached by memcached, etc.  It is also possible to replace mpfilter with a house-made script to customize the report.

Conclusion

There are other ways to find out bottlneck queries, like using the slow query log or MySQL Enterprise Monitor.  Comparing with them, the merits of using the described solution using "SHOW PROCESSLIST" are

  • no need to change current environment
  • practically no overhead
  • potential to build a customized performance tuning / monitoring tool
As an example of using "SHOW PROCESSLIST" for other purposes than performance tuning, it would be possible to keep a daily sample log of SQL queries using mprofile, and use them to find out the long-term changes of the bottlenecks to build server reinforcement plans or for refactoring the design of the application.

Finally, the source code of mprofile is available from /platform/mysql/mprofile – CodeRepos::Share – Trac.  Have fun!

TrackBack

TrackBack URL for this entry:
http://www.typepad.jp/t/trackback/404050/20683269

Listed below are links to weblogs that reference Using a statistical approach to analyze / monitor MySQL bottleneck queries:

Comments

mk-query-digest from Maatkit does similar analysis, and it can use the processlist as a source of queries.

It's good to examine two types of queries: those consuming resources (a lot of execution time in aggregate), and those that will create a poor user experience (slow execution time, even if rarely executed).

Thank you for your comment. I didn't know that Maakit could use processlist as a source. I will try using it.

Jet Profiler for MySQL uses SHOW PROCESSLIST polling too. I totally agree on the advantages of using a statistical approach. It will give you the top offending queries. I've seen a lot of DBAs fighting (unindexed) queries in the slow query log, resulting only in minor performance improvements. Later, it turns out that the main load is caused by well-formed queries that are issued 10 times too often or so. They will never show up in the slow query log.

There's a big issue with continually polling SHOW PROCESSLIST like that though - one of the main reasons that MySQL Query Analyzer doesn't use that approach - it locks a major mutex when generating the processlist, LOCK_thread_count.

This is used pretty widely internally (and is one of the major mutex issues for scalability at the SQL layer imho), just checking the sql directory of a 5.1.35 source tree:

Cerberus:sql mark$ grep -rn LOCK_thread_count ./* | wc -l
163

So anybody reading this - beware. Using it "constantly" in production is *highly* not recommended (at least by me).

It's great for sampling though. :)

developer.cybozu.co.jp is very informative. Thank you very much for all your work. AAA Toronto Payday Loans 1172 Bay St #101, Toronto, ON M5S 2B4 (416) 477-2817

Post a comment