= Write performance =

erinye2-vm2 writes a large file (16 GiB):
{{{
# /afs/ipp-garching.mpg.de/.cs/perftest/i386_rh90/write_test /afs/ifh.de/testsuite/testosd/testfile-large 0 17179869184
...
write of 17179869184 bytes took 187.631 sec. close took 0.602 sec.
Total data rate = 89130 Kbytes/sec. for write
}}}
Note that the quoted rate is computed over write plus close time: 17179869184 bytes / (187.631 s + 0.602 s) ≈ 89130 KiB/s, i.e. about 87 MiB/s.

= Read performance =

== load impact ==

...is considerable after all. Upon start of a batch job, read throughput has been observed to drop from ~115 MB/s to ~50 MB/s.

Things to find out:
 * what role does CPU load play?
 * what role does system load play?
 * is the amount of time jobs spend in kernel context important?

(A small measurement sketch appears at the end of this page.)

= Problems =

We observed cache corruption at least once. This is what five machines reported during one run:
{{{
wrong offset found: (0x1, 0xa71c0000) instead of (0x0, 0x15300000)
wrong offset found: (0x0, 0x73f90000) instead of (0x1, 0x22160000)
wrong offset found: (0x1, 0x7d120000) instead of (0x0, 0x94400000)
wrong offset found: (0x2, 0xab0f0000) instead of (0x0, 0xbd0000)
wrong offset found: (0x1, 0x50580000) instead of (0x0, 0x88900000)
}}}
Apparently, all of them were running pre-r690 versions of the AFS+OSD client, which seems to confirm that
 * there was indeed a problem with the vicep-access code on Lustre, and
 * r690 may indeed have fixed that very problem.
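For context, messages of the "wrong offset found" kind typically come from a test that stamps each block of the file with its own offset and checks the stamp when reading back; a stamp that doesn't match means the cache returned data belonging elsewhere. The actual testsuite code is not shown here. Below is a minimal sketch of that technique; the file name, the 64 KiB block size (the low words above are all multiples of 0x10000, which makes that granularity plausible), the (high, low) stamp layout, and host byte order are all assumptions:
{{{
/* verify_offsets.c -- hypothetical sketch, not the actual testsuite code.
 * Assumes every 64 KiB block of the test file starts with its own file
 * offset, stored as two 32-bit words (high, low) in host byte order. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define BLOCKSIZE 65536         /* assumed stamping granularity */

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) {
        perror(argv[1]);
        return 1;
    }

    unsigned char buf[BLOCKSIZE];
    uint64_t offset = 0;
    int errors = 0;

    while (fread(buf, 1, BLOCKSIZE, f) == BLOCKSIZE) {
        uint32_t high, low;
        memcpy(&high, buf, 4);          /* stamp written by the test's writer */
        memcpy(&low, buf + 4, 4);
        uint64_t found = ((uint64_t)high << 32) | low;
        if (found != offset) {          /* cache returned data from elsewhere */
            printf("wrong offset found: (0x%x, 0x%x) instead of (0x%x, 0x%x)\n",
                   (unsigned)high, (unsigned)low,
                   (unsigned)(offset >> 32), (unsigned)(offset & 0xffffffffu));
            errors++;
        }
        offset += BLOCKSIZE;
    }
    fclose(f);
    return errors ? 1 : 0;
}
}}}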
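As for the open questions in the load impact section above, one way to start collecting data is to sample the load average while reading. The following is a hypothetical sketch, not part of the testsuite; the file name, the 1 MiB chunk size, and the one-second reporting interval are arbitrary, and /proc/loadavg is Linux-specific:
{{{
/* read_load.c -- hypothetical sketch, not part of the testsuite.
 * Reads a file sequentially and, roughly once per second, prints the
 * observed read rate together with the current /proc/loadavg line,
 * so throughput dips can be lined up against system load. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

#define CHUNK (1024 * 1024)     /* 1 MiB read chunks (arbitrary) */

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror(argv[1]);
        return 1;
    }

    static char buf[CHUNK];
    long long bytes = 0;
    double last = now();
    ssize_t n;

    while ((n = read(fd, buf, CHUNK)) > 0) {
        bytes += n;
        double t = now();
        if (t - last >= 1.0) {
            char loadavg[64] = "?\n";
            FILE *lf = fopen("/proc/loadavg", "r");
            if (lf) {
                if (!fgets(loadavg, sizeof(loadavg), lf))
                    snprintf(loadavg, sizeof(loadavg), "?\n");
                fclose(lf);
            }
            /* the loadavg line already ends in '\n' */
            printf("%7.1f MB/s  loadavg: %s", bytes / (t - last) / 1e6, loadavg);
            bytes = 0;
            last = t;
        }
    }
    close(fd);
    return 0;
}
}}}
Extending this to sample /proc/stat as well would give per-mode CPU times and thus speak directly to the kernel-context question.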