One of our servers ran into a backup problem because there are too many files to back up: nearly 200 million in total. What I need to do is remove unwanted files from some folders and have that cleanup run automatically on a schedule. On our web server, we have one temporary folder that tends to accumulate lots of temporary files. The folder is located at /home/mywebsite/temp_upload/ .
I started by checking how much space this folder uses (du -sk reports the total size in kilobytes, not the number of files or inodes):
$ du -sk /home/mywebsite/temp_upload/
10543660 /home/mywebsite/temp_upload/
As you can see, this directory holds about 10 GB of temporary files. Our developer forgot to remove the unused files, so I need to create a cron job that removes files older than 3 months (90 days) from this directory. The command to remove them is as follows:
$ find /home/mywebsite/temp_upload/ -type f -mtime +90 -print0 | xargs -0 rm -f
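Before deleting anything, it can help to preview which files match. The following is a minimal sketch rather than the exact method used here: the helper names list_old_files and purge_old_files are hypothetical, and the delete step uses find's `-exec ... {} +` form so that file names containing spaces are passed to rm intact.

```shell
#!/bin/sh
# Hypothetical helper: print regular files older than DAYS days under DIR.
# Nothing is deleted; this is only a preview.
# Usage: list_old_files DIR DAYS
list_old_files() {
    find "$1" -type f -mtime +"$2" -print
}

# Hypothetical helper: delete the same set of files.
# "-exec rm -f {} +" batches file names into as few rm calls as possible
# and handles names with spaces or other unusual characters.
# Usage: purge_old_files DIR DAYS
purge_old_files() {
    find "$1" -type f -mtime +"$2" -exec rm -f {} +
}
```

For example, `list_old_files /home/mywebsite/temp_upload/ 90 | head` shows the first few candidates, and `purge_old_files /home/mywebsite/temp_upload/ 90` removes them once the list looks right.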
It takes some time to complete. Once done, the usage reported by du has dropped to 2077 KB:
$ du -sk /home/mywebsite/temp_upload/
2077 /home/mywebsite/temp_upload
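Since `du -sk` reports kilobytes rather than a file count, one way to see how many files are actually in the folder is to count what find returns. This is a small sketch under that assumption; the function name count_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count regular files under DIR (recursively).
# Note: a file name containing a newline would be counted more than once,
# which is rare but possible.
# Usage: count_files DIR
count_files() {
    # tr strips the padding that some wc implementations add around the number
    find "$1" -type f -print | wc -l | tr -d ' '
}
```

Running `count_files /home/mywebsite/temp_upload/` before and after the cleanup gives a direct before/after file count. For inode usage of the whole filesystem, `df -i` is another option.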
To automate this, add the command to your crontab and schedule it to run weekly (at 6 AM every Sunday):
$ crontab -e
Add the following line:
0 6 * * 0 /bin/find /home/mywebsite/temp_upload/ -type f -mtime +90 -print0 | xargs -0 rm -f
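crontab -e installs the new schedule as soon as you save and exit, so restarting crond is normally not required. If you want to restart it anyway: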
$ service crond restart
Warning: Make sure the job runs during off-peak hours. This process can overload your server, which is what happened to me because of a wrong time zone setting 🙂
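To reduce the impact on a busy server, the cleanup could be wrapped in a small script that runs at low CPU priority. The sketch below is only an illustration: the function name cleanup_old_files is hypothetical, and `nice -n 19` lowers CPU priority only, so heavy disk I/O can still be noticeable. It is also worth running `date` on the server to confirm which time zone the schedule will follow.

```shell
#!/bin/sh
# Hypothetical wrapper for the cron job: delete old files at low CPU priority.
# Usage: cleanup_old_files DIR DAYS
cleanup_old_files() {
    # Refuse to run if the target directory is missing (e.g. a typo in the path)
    if [ ! -d "$1" ]; then
        echo "cleanup: $1 is not a directory" >&2
        return 1
    fi
    # nice -n 19 gives the job the lowest CPU scheduling priority
    nice -n 19 find "$1" -type f -mtime +"$2" -exec rm -f {} +
}
```

Saved in a script that ends with `cleanup_old_files /home/mywebsite/temp_upload/ 90`, this can be called from the crontab entry instead of the raw find command.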
Now the file removal is automated and you can focus on other things!