Monday, June 18, 2007

dCache pools went down

Starting Friday afternoon, several dCache pools went down. They ran out of memory; here is the relevant content of the sedsk01Domain.log file.

06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : at java.lang.Thread.run(Thread.java:595)
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : Storing incomplete file : 0003000000000000006E0B80 with 2756018417
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : Stacked Exception (Original) for : 0003000000000000006E0B80 <-P---------(0)[0]> 2756018417 si={cms:cms} : CacheException(rc=10006;msg=Pnfs request timed out)
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : Stacked Throwable (Resulting) for : 0003000000000000006E0B80 <-P---------(0)[0]> 2756018417 si={cms:cms} : CacheException(rc=33;msg=Illegal State Transition -P-------- -> -P--------)
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : CacheException(rc=33;msg=Illegal State Transition -P-------- -> -P--------)
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : at diskCacheV111.repository.CacheRepository2$CacheEntry.setPrimaryState(CacheRepository2.java:107)
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : at diskCacheV111.repository.CacheRepository2$CacheEntry.setPrecious(CacheRepository2.java:219)
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : at diskCacheV111.repository.CacheRepository2$CacheEntry.setPrecious(CacheRepository2.java:215)
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : at diskCacheV111.pools.MultiProtocolPool2$RepositoryIoHandler.run(MultiProtocolPool2.java:1538)
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : at diskCacheV111.util.SimpleJobScheduler$SJob.run(SimpleJobScheduler.java:64)
06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : at java.lang.Thread.run(Thread.java:595)
06/15 16:35:02 Cell(c-100@sedsk01Domain) : runIO : java.lang.OutOfMemoryError: Java heap space
06/15 16:35:02 Cell(c-100@sedsk01Domain) : java.lang.OutOfMemoryError: Java heap space
06/15 16:35:02 Cell(c-100@sedsk01Domain) : java.lang.OutOfMemoryError: Java heap space
06/15 16:38:25 Cell(c-100@sedsk01Domain) : runIO : java.lang.OutOfMemoryError: Java heap space
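
To see which cells hit the error and when it first appeared, a log like this can be scanned with a small script. The following is only a sketch: it assumes every line follows the "MM/DD HH:MM:SS Cell(name@domain) : message" shape seen in the excerpt, and the function names (find_oom_events, summarize) are made up for illustration, not part of dCache.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
# "MM/DD HH:MM:SS Cell(name@domain) : message"
LINE_RE = re.compile(r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}) Cell\(([^)]+)\) : (.*)$")


def find_oom_events(lines):
    """Return (timestamp, cell, message) for each line mentioning OutOfMemoryError."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "OutOfMemoryError" in match.group(3):
            events.append(match.groups())
    return events


def summarize(events):
    """Map each cell to (number of OutOfMemoryError lines, first timestamp seen)."""
    counts = Counter(cell for _, cell, _ in events)
    first_seen = {}
    for timestamp, cell, _ in events:
        first_seen.setdefault(cell, timestamp)
    return {cell: (counts[cell], first_seen[cell]) for cell in counts}


# A few lines copied from the log excerpt above, used as sample input.
SAMPLE = [
    "06/15 16:32:13 Cell(sedsk01_1@sedsk01Domain) : at java.lang.Thread.run(Thread.java:595)",
    "06/15 16:35:02 Cell(c-100@sedsk01Domain) : runIO : java.lang.OutOfMemoryError: Java heap space",
    "06/15 16:35:02 Cell(c-100@sedsk01Domain) : java.lang.OutOfMemoryError: Java heap space",
    "06/15 16:35:02 Cell(c-100@sedsk01Domain) : java.lang.OutOfMemoryError: Java heap space",
    "06/15 16:38:25 Cell(c-100@sedsk01Domain) : runIO : java.lang.OutOfMemoryError: Java heap space",
]

if __name__ == "__main__":
    for cell, (count, first) in summarize(find_oom_events(SAMPLE)).items():
        print(f"{cell}: {count} OutOfMemoryError lines, first at {first}")
```

To run it against the real file, replace SAMPLE with the lines read from sedsk01Domain.log.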


dCache is started with these JVM parameters:
-server -Xmx512m -XX:MaxDirectMemorySize=512m
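
For reference, -Xmx caps the Java heap and -XX:MaxDirectMemorySize caps direct (off-heap) buffer allocations, so the "Java heap space" errors above are about the first limit. The sketch below just turns the two size flags into byte counts; the helper names (parse_size, memory_limits) are hypothetical and it only handles the k/m/g suffixes.

```python
import re

# Binary multipliers for the size suffixes used in JVM size flags.
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a JVM-style size such as '512m' into bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text)
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * UNITS.get(unit.lower(), 1)


def memory_limits(options):
    """Extract the heap and direct-memory limits (in bytes) from an option string."""
    limits = {}
    for opt in options.split():
        if opt.startswith("-Xmx"):
            limits["heap"] = parse_size(opt[len("-Xmx"):])
        elif opt.startswith("-XX:MaxDirectMemorySize="):
            limits["direct"] = parse_size(opt.split("=", 1)[1])
    return limits


if __name__ == "__main__":
    limits = memory_limits("-server -Xmx512m -XX:MaxDirectMemorySize=512m")
    for name, size in limits.items():
        print(f"{name}: {size} bytes ({size // 1024 ** 2} MiB)")
```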

We don't yet know what caused this.
