Weekly Lockup - backup_database

Discussion for TimeTrex open source community developers.
EricM
Posts: 30
Joined: Wed Jan 16, 2013 2:22 pm

Weekly Lockup - backup_database

Post by EricM »

It appears that the CronJobFactory system built into TimeTrex doesn't wait for backup_database to complete before starting new jobs. If backup_database takes longer than a minute, it can overlap with other jobs.

If backup_database overlaps with PurgeDatabase (in MiscWeekly.php), the spike in load will almost always cause the virtual machine that we are running TimeTrex on to lock up.

My solution was to explicitly stop Timetrex cron jobs during a backup window and then run backup_database via my own cron job. This has been working reliably for several weeks.
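
For anyone wanting to do something similar, here is a minimal sketch of one way to keep two heavy jobs from overlapping when both are launched from your own cron entries: each job is wrapped so that it only runs while it holds an exclusive file lock. This is not how TimeTrex or CronJobFactory works internally; the lock path and the `run_exclusive` helper are names made up for illustration, and it assumes a Unix host (it relies on `fcntl.flock`).

```python
import fcntl
import os
import subprocess
import tempfile

# Hypothetical lock file location; TimeTrex itself does not use this file.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "timetrex_maintenance.lock")


def run_exclusive(command, lock_path=LOCK_PATH):
    """Run `command` (a list of arguments) only if the lock is free.

    Returns the command's exit code, or None if another job already
    holds the lock and the command was skipped.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fails at once if already held.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        try:
            # The lock stays held for as long as the command runs.
            return subprocess.call(command)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Each cron entry would then call `run_exclusive([...])` with its own command line (for example the backup command in one entry and the maintenance command in another), so whichever job starts second is skipped instead of piling on load. Skipping rather than waiting is a design choice; dropping `LOCK_NB` would make the second job queue up behind the first instead.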
shaunw
Posts: 7839
Joined: Tue Sep 19, 2006 2:22 pm

Re: Weekly Lockup - backup_database

Post by shaunw »

When you say "lock-up" what do you mean exactly?

Also what database are you using?
EricM
Posts: 30
Joined: Wed Jan 16, 2013 2:22 pm

Re: Weekly Lockup - backup_database

Post by EricM »

PostgreSQL 8.3.3 on Ubuntu; it's what the TimeTrex installer set up. It's on a Dell T110 with an X3440 CPU. The TimeTrex VM is running in Proxmox and is given 4GB RAM and a 2 core CPU via KVM32.

When I say lock up, I mean TimeTrex and PostgreSQL will appear to work normally for a short period of time after the jobs overlap, but it seems like PostgreSQL can't keep up. Disk access skyrockets and then it eventually locks up itself and its entire virtual machine. I then need to forcibly kill the virtual machine, start it back up, give PostgreSQL a little time to catch up (it seems to be finishing whatever it was trying to do before, but I'm not sure how to verify that yet), and then start TimeTrex like normal.

I do have a capture of timetrex.log, but it doesn't seem to show anything interesting other than the overlapping of jobs.

The RRDtool graphs just show a massive spike around the time it happens (see attachment, lock up happened Sunday).
Attachments
Untitled.png
shaunw
Posts: 7839
Joined: Tue Sep 19, 2006 2:22 pm

Re: Weekly Lockup - backup_database

Post by shaunw »

The Linux automated installer is primarily designed for extremely old Linux distributions that didn't offer the minimum requirements for TimeTrex. Since that hasn't been the case for about 5 years now, we recommend using the manual .ZIP installer with the versions of Apache/PHP/PostgreSQL that come bundled with your Linux distribution instead. Doing so may actually help alleviate the problem.

The Linux automated installer will be discontinued over the next year or so.
EricM
Posts: 30
Joined: Wed Jan 16, 2013 2:22 pm

Re: Weekly Lockup - backup_database

Post by EricM »

Thanks for the quick reply! I didn't realize the Linux installer had been deprecated.

I set up a test VM running 64 bit Ubuntu with PostgreSQL 9.3.9 & PHP 5.5.9 and am currently restoring a copy of our database.

Assuming everything works well in testing I'll move it into production next week. Then I'll re-enable the backup_database job and report back with what I find out.
EricM
Posts: 30
Joined: Wed Jan 16, 2013 2:22 pm

Re: Weekly Lockup - backup_database

Post by EricM »

Upgrading PostgreSQL to 9.3 and using the tweaks I posted in the installation forums seem to have fixed my problem and made TimeTrex run smoother in general.

http://forums.timetrex.com/viewtopic.php?f=1&t=6946

Thanks for the suggestion.
Attachments
stat.png
EricM
Posts: 30
Joined: Wed Jan 16, 2013 2:22 pm

Re: Weekly Lockup - backup_database

Post by EricM »

Update: It looks like everything has settled down into a nice regular load cycle. I'm not sure why I was getting high loads towards the end of the week initially, but it's gone now.
Attachments
usage.png