GPU work queue and recent issues
The GPU-side work queue ran dry on Friday, and in our haste to refill it we encountered some additional speed bumps. Please see my post for more details.
19 Nov 2012 | 1:11:22 UTC
We had a server outage from about 2012-11-06 19:00 GMT to 2012-11-06 23:00 GMT. The primary cause was a PDU, owned and operated by the data center, going bad. There was some back and forth as the tech initially claimed the server power supply was dead (he made a few more trips to check things when I questioned the simultaneous death of the dual PSUs).
The server seems to be functioning normally again and is still handling a higher than normal load as it catches up with all the clients. Please let us know of any issues the outage may have caused.
Further timeline details:
19:05: PowerBlade and I received alerts
19:18: ticket opened for remote hands given the server *and* out of band management were not reachable
19:58: call to the DC to ask what the hold up was
20:16: tech claims the power supply is dead
20:23: I fired back questioning their conclusion
20:49: power was briefly restored
20:59: request for another check of the machine
21:32: machine made available again
file system checks, db integrity checks, RAID integrity confirmation, etc., etc.
22:46: began bringing everything back online
23:00: noticed that one important service (the upload handler) had been missed and started it
6 Nov 2012 | 23:33:56 UTC
cuda toolkit driver upgrades
To focus our efforts on performance on newer cards, we plan to cease creation of, and phase out, the cuda23 version of distrrtgen.
To support cuda40, everyone must run *at least* driver version 275.33.
While the clients still running drivers older than this are a minority, they are still a fair number. However, we'd really like to get everyone running a driver that supports at least cuda42; the minimum driver versions are listed below.
cuda41:
linux >= 285.05.33
win >= 286.19
cuda42 (first kepler release):
linux >= 295.41
win >= 301.27
cuda50:
linux >= 304.54
win >= 306.94
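As a rough illustration of the minimums above, the sketch below checks which app versions a given driver can run. It only uses the thresholds stated in this post for cuda40 and cuda42; the names `MIN_DRIVER`, `parse_version` and `supported_versions` are hypothetical and are not part of distrrtgen or BOINC, and it assumes the 275.33 minimum for cuda40 applies to both platforms.

```python
# Minimum driver versions as stated in this post. All names here are
# hypothetical; this is not how distrrtgen or BOINC performs the check.
MIN_DRIVER = {
    "cuda40": {"linux": "275.33", "win": "275.33"},  # assumed same on both platforms
    "cuda42": {"linux": "295.41", "win": "301.27"},
}


def parse_version(text):
    """Turn a dotted driver version such as '285.05.33' into (285, 5, 33)."""
    return tuple(int(part) for part in text.split("."))


def supported_versions(platform, driver):
    """Return the app versions whose minimum driver is met, in sorted order."""
    have = parse_version(driver)
    return [
        name
        for name in sorted(MIN_DRIVER)
        if have >= parse_version(MIN_DRIVER[name][platform])
    ]


if __name__ == "__main__":
    print(supported_versions("linux", "295.41"))
    print(supported_versions("win", "296.10"))
```

Comparing the versions as tuples of integers avoids the pitfalls of comparing them as strings or as floating point numbers.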
28 Oct 2012 | 21:31:20 UTC
First, wow and thanks everyone who participated! A special thanks to the Polish National Team for their work on the OpenCL app version for ATI cards.
The server held up nicely. As far as handing out and returning work goes, the only glitches raised related to ATI cards having trouble receiving work and the massive bandwidth demands of the uploads.
Statistics are now final in the data for the team and user stats on the project home page. You will note that the timestamp on it is 10 minutes after the end of the challenge, but after comparing the team data from 16:05:03 to 16:10:03 there were no changes. There were some changes in points from 16:00:03 to 16:05:03; this is because we counted all work by the reported time, and not all of the validation on the server side had completed until the 16:05:03 update. These few changes are 10947.5 points for the Polish National Team, 3124.5 points for SETI@klamm.de, 7823 points for Ars Technica, and 10947.5 points for TaiwainROC. Additionally, we have an archive of most of the statistics from the challenge available for public visibility, scrutiny, historical data, etc. via this archive.
From the start it looked like the Polish National Team was really putting some massive computation power to work, and it began to look like Sicituradastra. would be held to the number two rank. In the end Sicituradastra. did manage to finish in the top spot with a lead of approximately 353 (GPU) WUs in total, where each team completed approximately 22000 WUs when the total credit is expressed in terms of GPU WUs.
On the user side [GPU Force] Robert 7NBI and STE\/E put out insane crunching efforts! They finished first and second.
The battles throughout the team and user rankings, not just those at the top, were equally interesting and are best told by all those who wish to do so here. Congratulations to everyone!
The server saw some 40% increase in computational power from clients during the 3-day build-up to the challenge and during the challenge itself. At the start the increase was only about 12%.
Our biggest glitches related to getting boincstats.com data working correctly and some rather strange data in the last hour stats on our part. Then, of course, there was the missing countdown clock, a wrong clock, etc.
Everyone wants to know if there will be another challenge and given how well this one seemed to go the answer is YES! No, we do not have a date or time yet. For now there is lots of work to do besides planning another challenge :)
10 Apr 2012 | 17:17:19 UTC
First ever project challenge
I am pleased to announce that we will be hosting our first project challenge!
Rainbow Resurrection: 2012-04-07 16:00:00 GMT - 2012-04-10 16:00:00 GMT
This 72-hour challenge will feature PrimeGrid-style user and team statistics, and only work handed out after the start time and returned before the completion time will be counted. In addition, all WUs that are part of the challenge will receive 1.25x the credit that is ordinarily received.
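The counting rule can be sketched roughly as follows. This is only an illustration, not the server's actual logic: the function names are hypothetical, GMT is treated as UTC, and whether work sent or returned exactly on the boundary counts is an assumption (here it does).

```python
from datetime import datetime, timedelta, timezone

# Challenge window from the announcement (GMT treated as UTC).
START = datetime(2012, 4, 7, 16, 0, 0, tzinfo=timezone.utc)
END = datetime(2012, 4, 10, 16, 0, 0, tzinfo=timezone.utc)
DURATION = END - START  # the announced 72 hours
BONUS = 1.25  # credit multiplier for challenge WUs


def counts_for_challenge(sent, returned):
    """True if a WU was handed out and returned inside the window.

    Treating the boundaries as inclusive is an assumption.
    """
    return sent >= START and returned <= END


def awarded_credit(base_credit, in_challenge):
    """Apply the 1.25x bonus only to WUs that are part of the challenge."""
    return base_credit * BONUS if in_challenge else base_credit


if __name__ == "__main__":
    sent = START + timedelta(hours=2)
    returned = START + timedelta(hours=5)
    ok = counts_for_challenge(sent, returned)
    print(DURATION, ok, awarded_credit(100.0, ok))
```

Under this sketch, a WU handed out before the start time still earns its ordinary credit but does not count toward the challenge statistics.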
The reason for this challenge is twofold:
1) The new hardware has been performing excellently and is averaging less than 20% utilization.
2) An ATI/OpenCL app version is in early alpha stages, and this challenge will help us see whether we can handle a full 24/7 CPU+GPU load, to which ATI will be added in the future.
31 Mar 2012 | 2:20:24 UTC