Some days ago I got a call from our support engineer on duty that MySQL on one of our database servers was lagging more than 1000 seconds behind in replication and the server got kicked out of the pool because of the delay. He was unable to find out why and there was absolutely nothing in the mysql log files. When I got the call it was still lagging behind but the lag was slowly decreasing again.
After a quick peek in all our monitoring systems I isolated it to this message:
Cache Battery 0 in controller 0 is Charging (Ready) [probably harmless]
Apparently not that harmless!
Obviously we did encountered this situation a couple of times before but apparently there was no detection on this machine.
The relearn cycle happens every 90 days and gets first scheduled when the machine gets powered on. Now imagine this happening in a master-master setup where both machines were powered on at the same time. Lucky enough you can use omconfig to reschedule the cycle up to 7 days, but then you obviously need to have detection in place.
Why did nobody come up with the idea to have a dual battery backed up cache with alternating relearn cycles? That way you can have your battery relearn without the controller going back into write-through mode.
At the company I work for we are still running Percona Server 5.1 in production and are slowly heading towards a Percona Server 5.5 rollout. It did take a lot of preparation in the past few months (write a my.cnf conversion script for example) and a lot of testing. A couple of machines already have been upgraded this week to 5.5 to compare performance and stability. So far the machines proved to be stable enough to keep them on 5.5 and even better: we already see a couple of benefits! However, the title wouldn’t have been blues if everything would have been a breeze, right?
First problem we ran into was that our Cacti templates broke due to the changed InnoDB status output. So I headed towards the Cacti templates and looked in the issue tracker if the issue was already known. Apparently it was already known, but unfortunately not fixed yet. Lucky enough writing the fixes myself wasn’t much of a problem.
Secondly we ran into the issue that the history list was growing from a “steady” 200 to 4000 after upgrade. Searching on this topic revealed a problem with the purge operations but it was not clear to me what exactly was the problem. According to the MySQL documentation the default should suffice. Uhm, right?
Now I knew some things have changed in 5.5 regarding purging: a separate purge thread was already introduced in 5.1 but could it have been so different then? So I tried to find out what each and every purge variable would do:
At first I assumed that increasing the batch size would make the purging more efficient: the larger the batch the more it could handle, right?
Wrong: the larger you set it, the later it will purge! I found this on the MySQL documentation about it:
The granularity of changes, expressed in units of redo log records, that trigger a purge operation, flushing the changed buffer pool blocks to disk.
So the name is actually confusing! In our case it went from 20 to 40 making things worse and then from 40 to 10 making the history list go from 4000 to 1800.
Then I decided to see what the purge lag would do. Changing the purge lag as described by Peter did indeed lower the history list for a short while, but MMM also kicked the 5.5 server out of its pool because it started lagging behind in replication! So this is definitely something to keep in the back of your mind!
I did not change the purge threads to 0 since it is a machine that runs in our production environment. Also confusing is the deprecated innodb_use_purge_thread that could be set to different values than 0 and 1 but is marked as experimental.
I’m positive a part two will come shortly, so stay tuned.
Just started MySQL Quicksand and hopefully it will turn out to be something fruitful.
Well, I feel the urge to share some experience and willing to learn from the feedback.
What to expect here?
Well I do have a lot of warstories to share and tons of stuff that I try to find out when reading other peoples blogs… More stuff to come soon!