Disk Space Problem

8/23/2011: Glen implemented a nightly reboot to clean things out.  This morning, disk space is at 87%.  New disk space is on the way.  /tmp only has 57MB that my account can see.  console.log in jboss_home/logs is very large (about 835MB, over 5 million lines since the reboot 11 hours ago); the first 800 lines are server startup, and the remaining lines look like this:

1000 22:08:21,778 WARN  [arjLoggerI18N] [com.arjuna.ats.arjuna.recovery.RecoverAtomicAction_4] - RecoverAtomicAction: transaction not activated, unable to replay phase 2 commit

occurring several times per second, so the file grows very quickly.  The same error message appears in the server.log files.
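
For reference, the size and growth rate of that log can be checked with standard tools; this is just a sketch, assuming the file sits at jboss_home/logs/console.log as noted above:

    # size and line count of the console log
    du -h jboss_home/logs/console.log
    wc -l jboss_home/logs/console.log

    # how many of those lines are the RecoverAtomicAction warning
    grep -c 'RecoverAtomicAction: transaction not activated' jboss_home/logs/console.log

    # confirm that only the first ~800 lines are server startup messages
    head -n 800 jboss_home/logs/console.log | less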

Chris 8/22/2011: disk on bgcspace.berkeley.edu filled up again.  When I checked on Friday morning (19 August), the disk was at 85%.  Ran "runJboss restart all" to shut down and restart both servers; after the restart, disk space was at 86%.  I didn't check /tmp before the restart, but it now holds 1.8G.
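
For the record, the disk and /tmp figures above can be reproduced with the usual commands:

    # overall usage of the root file system
    df -h /

    # how much of that is sitting in /tmp
    du -sh /tmp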

Glen 8/16/2011

On bgcspace.berkeley.edu, which currently houses the UCJEPS Seaweed project, there have been several instances of the disk system reaching 100% capacity. This naturally causes all operations to fail. There are several contributing factors: large log files, large data files, and numerous temporary files in "/tmp", but these do not appear to fully explain the problem.
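
One way to rank those contributing factors is to ask du for the largest directories on the root file system; this is a generic sketch, not a record of a command actually run here:

    # largest directories two levels down, staying on one file system (-x),
    # reported in KB and sorted largest first
    du -xk --max-depth=2 / 2>/dev/null | sort -rn | head -n 20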

The last two times this occurred, I was able to account for only a small part of the disk usage.  The remainder, I believe, belongs to cache or swap space that is not part of the file system, so it does not appear in file system usage figures.  After removing as many temporary files as could be identified, usage had only dropped from 100% to ~85%. After halting both JBoss instances it dropped to ~82%.  After rebooting the VPS, it went to 46%.
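
If this happens again, one thing worth checking is whether the unaccounted-for space really sits outside the file system, or whether some of it belongs to files that were deleted but are still held open by a running process; such files keep consuming space until the owning process exits, which would also fit the pattern of usage dropping only after halting JBoss and rebooting. A diagnostic sketch (not something that was run at the time):

    # what the kernel reports as used on the root file system
    df -h /

    # what can actually be reached through the directory tree
    du -xsh / 2>/dev/null

    # open files whose directory entry has already been deleted (link count 0)
    lsof +L1 2>/dev/null | head -n 40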

The most recent occurrence involved temporary files in "/tmp". Note that when importing media files, there are two copies of each file on the system: the "official" file in the Nuxeo repository and a second copy in "/tmp", so disk usage grows faster than might be expected. In this case, the temp files eventually grew to 3 gigabytes, at which point the disk was full. There was also a 1.5 gigabyte log file. After deleting the log file the disk was still at 100%, so I started deleting the temp files. The disk continued to show 100% usage until more than half of the temp files had been deleted. After deleting all of the temp files, usage was still around 97%.
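
A hedged sketch of the temp-file cleanup itself; the exact names Nuxeo gives its upload copies in /tmp are not assumed here, so this goes only by size and age:

    # list large files in /tmp with size and modification date, largest first
    find /tmp -type f -size +10M -printf '%s\t%TY-%Tm-%Td\t%p\n' 2>/dev/null | sort -rn

    # remove files in /tmp not touched for more than a day
    # (safest while both JBoss instances are stopped, given the behaviour above)
    find /tmp -type f -mtime +1 -delete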

After rebooting the server, usage was back to 67%.