GPU client keeps crashing

sandman85


Hi everyone...

I've got a small problem with my PC. I normally fold with 2 clients on my GPU, but over the last few days that hasn't exactly been a success :)

Both clients finished an 1888-point WU, and ever since then I get the following right after starting the clients:

Code:
# Windows GPU Console Edition #################################################
###############################################################################
                       Folding@Home Client Version 6.23
                          http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: E:\Folding\GeForce_1
Executable: E:\Folding\GeForce_1\Folding@home-Win32-GPU.exe
Arguments: -verbosity 9 -local 
[16:49:04] - Ask before connecting: No
[16:49:04] - User ID: 42F820D564EAB46F
[16:49:04] - Machine ID: 2
[16:49:04] 
[16:49:04] Work directory not found. Creating...
[16:49:04] Could not open work queue, generating new queue...
[16:49:04] - Preparing to get new work unit...
[16:49:04] + Attempting to get work packet
[16:49:04] - Autosending finished units... [October 2 16:49:04 UTC]
[16:49:04] - Will indicate memory of 3063 MB
[16:49:04] Trying to send all finished work units
[16:49:04] - Detect CPU.[16:49:04] + No unsent completed units remaining.
 Vendor: GenuineIntel, Family: 6, Model: 10, Stepping: 5
[16:49:04] - Autosend completed
[16:49:04] - Connecting to assignment server
[16:49:04] Connecting to http://assign-GPU.stanford.edu:8080/
[16:49:05] Posted data.
[16:49:05] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[16:49:05] + News From Folding@Home: Welcome to Folding@Home
[16:49:05] Loaded queue successfully.
[16:49:05] Connecting to http://171.67.108.11:8080/
[16:49:06] Posted data.
[16:49:06] Initial: 0000; - Receiving payload (expected size: 45886)
[16:49:07] - Downloaded at ~44 kB/s
[16:49:07] - Averaged speed for that direction ~44 kB/s
[16:49:07] + Received work.
[16:49:07] + Closed connections
[16:49:07] 
[16:49:07] + Processing work unit
[16:49:07] Core required: FahCore_11.exe
[16:49:07] Core not found.
[16:49:07] - Core is not present or corrupted.
[16:49:07] - Attempting to download new core...
[16:49:07] + Downloading new core: FahCore_11.exe
[16:49:07] Downloading core (/~pande/Win32/x86/NVIDIA/G80/Core_11.fah from www.stanford.edu)
[16:49:11] Initial: AFDE; + 10240 bytes downloaded
[16:49:11] Initial: 8CE4; + 20480 bytes downloaded
[16:49:11] Initial: D8CE; + 30720 bytes downloaded
[16:49:11] Initial: 268C; + 40960 bytes downloaded
[16:49:11] Initial: F040; + 51200 bytes downloaded
[16:49:11] Initial: 8F3D; + 61440 bytes downloaded
[16:49:11] Initial: E22C; + 71680 bytes downloaded
[16:49:11] Initial: 232D; + 81920 bytes downloaded
[16:49:11] Initial: 7B96; + 92160 bytes downloaded
[16:49:11] Initial: 35ED; + 102400 bytes downloaded
[16:49:11] Initial: 12C7; + 112640 bytes downloaded
[16:49:11] Initial: CE46; + 122880 bytes downloaded
[16:49:11] Initial: 4CD5; + 133120 bytes downloaded
[16:49:11] Initial: 0CD7; + 143360 bytes downloaded
[16:49:11] Initial: 0D37; + 153600 bytes downloaded
[16:49:11] Initial: 2D4A; + 163840 bytes downloaded
[16:49:11] Initial: 602C; + 174080 bytes downloaded
[16:49:11] Initial: 4001; + 184320 bytes downloaded
[16:49:11] Initial: D84F; + 194560 bytes downloaded
[16:49:11] Initial: 2151; + 204800 bytes downloaded
[16:49:11] Initial: 2ABF; + 215040 bytes downloaded
[16:49:11] Initial: 5CA8; + 225280 bytes downloaded
[16:49:11] Initial: 40D4; + 235520 bytes downloaded
[16:49:11] Initial: 1804; + 245760 bytes downloaded
[16:49:11] Initial: 1478; + 256000 bytes downloaded
[16:49:11] Initial: D0B4; + 266240 bytes downloaded
[16:49:11] Initial: B858; + 276480 bytes downloaded
[16:49:11] Initial: 1830; + 286720 bytes downloaded
[16:49:11] Initial: BA4E; + 296960 bytes downloaded
[16:49:11] Initial: 3985; + 307200 bytes downloaded
[16:49:11] Initial: 232C; + 317440 bytes downloaded
[16:49:11] Initial: 63DF; + 327680 bytes downloaded
[16:49:11] Initial: 0AAF; + 337920 bytes downloaded
[16:49:11] Initial: 7F28; + 348160 bytes downloaded
[16:49:11] Initial: 9005; + 358400 bytes downloaded
[16:49:11] Initial: 8084; + 368640 bytes downloaded
[16:49:11] Initial: FBCE; + 378880 bytes downloaded
[16:49:11] Initial: F76F; + 389120 bytes downloaded
[16:49:11] Initial: C594; + 399360 bytes downloaded
[16:49:11] Initial: B0BC; + 409600 bytes downloaded
[16:49:11] Initial: DD0D; + 419840 bytes downloaded
[16:49:11] Initial: DCF5; + 430080 bytes downloaded
[16:49:11] Initial: 6D67; + 440320 bytes downloaded
[16:49:11] Initial: A987; + 450560 bytes downloaded
[16:49:11] Initial: E0CA; + 460800 bytes downloaded
[16:49:11] Initial: 6090; + 471040 bytes downloaded
[16:49:11] Initial: 83FD; + 481280 bytes downloaded
[16:49:11] Initial: D0F9; + 491520 bytes downloaded
[16:49:11] Initial: A4B9; + 501760 bytes downloaded
[16:49:11] Initial: 7046; + 512000 bytes downloaded
[16:49:11] Initial: 3F00; + 522240 bytes downloaded
[16:49:11] Initial: 07F1; + 532480 bytes downloaded
[16:49:11] Initial: 6E7D; + 542720 bytes downloaded
[16:49:11] Initial: 4416; + 552960 bytes downloaded
[16:49:11] Initial: 353A; + 563200 bytes downloaded
[16:49:11] Initial: 6C4D; + 573440 bytes downloaded
[16:49:11] Initial: 305A; + 583680 bytes downloaded
[16:49:11] Initial: 3D23; + 593920 bytes downloaded
[16:49:11] Initial: A242; + 604160 bytes downloaded
[16:49:11] Initial: 1E02; + 614400 bytes downloaded
[16:49:11] Initial: 75E0; + 624640 bytes downloaded
[16:49:11] Initial: 8F3A; + 634880 bytes downloaded
[16:49:11] Initial: 1E1B; + 642475 bytes downloaded
[16:49:11] Verifying core Core_11.fah...
[16:49:11] Signature is VALID
[16:49:11] 
[16:49:11] Trying to unzip core FahCore_11.exe
[16:49:11] Decompressed FahCore_11.exe (1843200 bytes) successfully
[16:49:16] + Core successfully engaged
[16:49:21] 
[16:49:21] + Processing work unit
[16:49:21] Core required: FahCore_11.exe
[16:49:21] Core found.
[16:49:21] Working on queue slot 01 [October 2 16:49:21 UTC]
[16:49:21] + Working ...
[16:49:21] - Calling '.\FahCore_11.exe -dir work/ -suffix 01 -checkpoint 3 -verbose -lifeline 2712 -version 623'
[16:49:21] 
[16:49:21] *------------------------------*
[16:49:21] Folding@Home GPU Core - Beta
[16:49:21] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[16:49:21] 
[16:49:21] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[16:49:21] Build host: amoeba
[16:49:21] Board Type: Nvidia
[16:49:21] Core      : 
[16:49:21] Preparing to commence simulation
[16:49:21] - Looking at optimizations...
[16:49:21] - Created dyn
[16:49:21] - Files status OK
[16:49:21] - Expanded 45374 -> 251112 (decompressed 553.4 percent)
[16:49:21] Called DecompressByteArray: compressed_data_size=45374 data_size=251112, decompressed_data_size=251112 diff=0
[16:49:21] - Digital signature verified
[16:49:21] 
[16:49:21] Project: 5769 (Run 12, Clone 255, Gen 244)
[16:49:21] 
[16:49:21] Assembly optimizations on if available.
[16:49:21] Entering M.D.
[16:49:27] Working on Protein
[16:49:28] Client config found, loading data.
[16:49:28] mdrun_gpu returned 
[16:49:28] NANs detected on GPU
[16:49:28] 
[16:49:28] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:49:31] CoreStatus = 7A (122)
[16:49:31] Sending work to server
[16:49:31] Project: 5769 (Run 12, Clone 255, Gen 244)
[16:49:31] - Error: Could not get length of results file work/wuresults_01.dat
[16:49:31] - Error: Could not read unit 01 file. Removing from queue.
[16:49:31] Trying to send all finished work units
[16:49:31] + No unsent completed units remaining.
[16:49:31] - Preparing to get new work unit...
[16:49:31] + Attempting to get work packet
[16:49:31] - Will indicate memory of 3063 MB
[16:49:31] - Connecting to assignment server
[16:49:31] Connecting to http://assign-GPU.stanford.edu:8080/
[16:49:32] Posted data.
[16:49:32] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[16:49:32] + News From Folding@Home: Welcome to Folding@Home
[16:49:32] Loaded queue successfully.
[16:49:32] Connecting to http://171.67.108.11:8080/
[16:49:33] Posted data.
[16:49:33] Initial: 0000; - Error: Bad packet type from server, expected work assignment
[16:49:33] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[16:49:48] + Attempting to get work packet
[16:49:48] - Will indicate memory of 3063 MB
[16:49:48] - Connecting to assignment server
[16:49:48] Connecting to http://assign-GPU.stanford.edu:8080/
[16:49:48] Posted data.
[16:49:48] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[16:49:48] + News From Folding@Home: Welcome to Folding@Home
[16:49:49] Loaded queue successfully.
[16:49:49] Connecting to http://171.67.108.11:8080/
[16:49:49] Posted data.
[16:49:49] Initial: 0000; - Receiving payload (expected size: 45968)
[16:49:50] - Downloaded at ~44 kB/s
[16:49:50] - Averaged speed for that direction ~44 kB/s
[16:49:50] + Received work.
[16:49:50] Trying to send all finished work units
[16:49:50] + No unsent completed units remaining.
[16:49:50] + Closed connections
[16:49:55] 
[16:49:55] + Processing work unit
[16:49:55] Core required: FahCore_11.exe
[16:49:55] Core found.
[16:49:55] Working on queue slot 02 [October 2 16:49:55 UTC]
[16:49:55] + Working ...
[16:49:55] - Calling '.\FahCore_11.exe -dir work/ -suffix 02 -checkpoint 3 -verbose -lifeline 2712 -version 623'
[16:49:55] 
[16:49:55] *------------------------------*
 
.
.
.
 
[16:50:31] *------------------------------*
[16:50:31] Folding@Home GPU Core - Beta
[16:50:31] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[16:50:31] 
[16:50:31] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[16:50:31] Build host: amoeba
[16:50:31] Board Type: Nvidia
[16:50:31] Core      : 
[16:50:31] Preparing to commence simulation
[16:50:31] - Looking at optimizations...
[16:50:31] - Created dyn
[16:50:31] - Files status OK
[16:50:31] - Expanded 45456 -> 251112 (decompressed 552.4 percent)
[16:50:31] Called DecompressByteArray: compressed_data_size=45456 data_size=251112, decompressed_data_size=251112 diff=0
[16:50:31] - Digital signature verified
[16:50:31] 
[16:50:31] Project: 5770 (Run 13, Clone 63, Gen 644)
[16:50:31] 
[16:50:31] Assembly optimizations on if available.
[16:50:31] Entering M.D.
[16:50:37] Working on Protein
[16:50:38] Client config found, loading data.
[16:50:38] mdrun_gpu returned 
[16:50:38] NANs detected on GPU
[16:50:38] 
[16:50:38] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:50:41] CoreStatus = 7A (122)
[16:50:41] Sending work to server
[16:50:41] Project: 5770 (Run 13, Clone 63, Gen 644)
[16:50:41] - Error: Could not get length of results file work/wuresults_04.dat
[16:50:41] - Error: Could not read unit 04 file. Removing from queue.
[16:50:41] Trying to send all finished work units
[16:50:41] + No unsent completed units remaining.
[16:50:41] - Preparing to get new work unit...
[16:50:41] + Attempting to get work packet
[16:50:41] - Will indicate memory of 3063 MB
[16:50:41] - Connecting to assignment server
[16:50:41] Connecting to http://assign-GPU.stanford.edu:8080/
[16:50:42] Posted data.
[16:50:42] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[16:50:42] + News From Folding@Home: Welcome to Folding@Home
[16:50:42] Loaded queue successfully.
[16:50:42] Connecting to http://171.67.108.11:8080/
[16:50:43] Posted data.
[16:50:43] Initial: 0000; - Receiving payload (expected size: 45968)
[16:50:44] - Downloaded at ~44 kB/s
[16:50:44] - Averaged speed for that direction ~44 kB/s
[16:50:44] + Received work.
[16:50:44] Trying to send all finished work units
[16:50:44] + No unsent completed units remaining.
[16:50:44] + Closed connections
[16:50:49] 
[16:50:49] + Processing work unit
[16:50:49] Core required: FahCore_11.exe
[16:50:49] Core found.
[16:50:49] Working on queue slot 05 [October 2 16:50:49 UTC]
[16:50:49] + Working ...
[16:50:49] - Calling '.\FahCore_11.exe -dir work/ -suffix 05 -checkpoint 3 -verbose -lifeline 2712 -version 623'
[16:50:49] 
[16:50:49] *------------------------------*
[16:50:49] Folding@Home GPU Core - Beta
[16:50:49] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[16:50:49] 
[16:50:49] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[16:50:49] Build host: amoeba
[16:50:49] Board Type: Nvidia
[16:50:49] Core      : 
[16:50:49] Preparing to commence simulation
[16:50:49] - Looking at optimizations...
[16:50:49] - Created dyn
[16:50:49] - Files status OK
[16:50:49] - Expanded 45456 -> 251112 (decompressed 552.4 percent)
[16:50:49] Called DecompressByteArray: compressed_data_size=45456 data_size=251112, decompressed_data_size=251112 diff=0
[16:50:49] - Digital signature verified
[16:50:49] 
[16:50:49] Project: 5770 (Run 13, Clone 63, Gen 644)
[16:50:49] 
[16:50:49] Assembly optimizations on if available.
[16:50:49] Entering M.D.
[16:50:55] Working on Protein
[16:50:56] Client config found, loading data.
[16:50:56] Starting GUI Server
[16:50:56] mdrun_gpu returned 
[16:50:56] NANs detected on GPU
[16:50:56] 
[16:50:56] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:50:59] CoreStatus = 7A (122)
[16:50:59] Sending work to server
[16:50:59] Project: 5770 (Run 13, Clone 63, Gen 644)
[16:50:59] - Error: Could not get length of results file work/wuresults_05.dat
[16:50:59] - Error: Could not read unit 05 file. Removing from queue.
[16:50:59] EUE limit exceeded. Pausing 24 hours.
[17:18:22] ***** Got a SIGTERM signal (2)
[17:18:22] Killing all core threads
Folding@Home Client Shutdown.

Deleting the work folder etc. unfortunately didn't help either...

My graphics card is a GTX260 with driver 190.62 on Windows 7 (32-bit), and it isn't overclocked...
Can you help me out with this?
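
For anyone who wants to tally the failures in a log like the one above, here is a minimal sketch in Python. It relies only on the `Project: ...` and `Core Shutdown: ...` lines that appear in the log; the function name `summarize_shutdowns` and the shortened sample text are made up for illustration and are not part of the Folding@home client.

```python
import re
from collections import Counter

# Patterns taken from the log lines shown in the post.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Core Shutdown: (\w+)")


def summarize_shutdowns(log_text):
    """Count core shutdown reasons per (project, run, clone, gen)."""
    counts = Counter()
    current = None  # most recent work unit seen in the log
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
            continue
        match = SHUTDOWN_RE.search(line)
        if match:
            counts[(current, match.group(1))] += 1
    return counts


# Shortened excerpt of the log above (hypothetical sample for the demo).
SAMPLE = """\
[16:49:21] Project: 5769 (Run 12, Clone 255, Gen 244)
[16:49:28] NANs detected on GPU
[16:49:28] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:49:31] Project: 5769 (Run 12, Clone 255, Gen 244)
[16:50:31] Project: 5770 (Run 13, Clone 63, Gen 644)
[16:50:38] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:50:49] Project: 5770 (Run 13, Clone 63, Gen 644)
[16:50:56] Folding@home Core Shutdown: UNSTABLE_MACHINE
"""

if __name__ == "__main__":
    for (unit, reason), n in summarize_shutdowns(SAMPLE).items():
        print(unit, reason, n)
```

On the sample this counts one UNSTABLE_MACHINE shutdown for project 5769 and two for project 5770. If the same error shows up across different projects right after start, that points more towards the machine or driver setup than towards a single bad work unit.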

Regards
Sandman
 
Re: GPU client keeps crashing

I think I wrote somewhere that I have EXACTLY this problem too, but ONLY with the 353-point WUs and while running ONE client per card. If I start a second client anyway while a WU is running, the problem is the same: the second one errors out. A driver downgrade did nothing for my 9800s, and neither did a different PhysX pack.

However, the 190.38 ForceWare copes with the problem fairly well. Only a 50% error rate now :ugly:
 
Re: GPU client keeps crashing

So, there's a small update...
A little while ago I started the GPU client in compatibility mode for Win XP SP3, and it has been running without complaint for about 1 h. I didn't even delete the work folder beforehand.
We'll see whether that was actually the fix or just coincidence :ugly:
If it doesn't hold up, I'll try a driver downgrade next (if there are any Win7-compatible drivers from the 188 series at all)...
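
The compatibility-mode setting can also be applied from a small launcher script instead of the file properties dialog. The Python sketch below only generates the text of such a batch file. It assumes that Windows applies the `__COMPAT_LAYER` environment variable with the layer name `WINXPSP3` to programs started from that shell; that behavior is an assumption to verify on your own system. The path and arguments are the ones from the log in the first post, and `compat_launcher` is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def compat_launcher(exe_path, args="", layer="WINXPSP3"):
    """Return the text of a .cmd file that starts exe_path with a compat layer.

    Assumption: the __COMPAT_LAYER variable is honored by Windows for
    processes launched from the same shell session.
    """
    exe = PureWindowsPath(exe_path)
    lines = [
        "@echo off",
        f"set __COMPAT_LAYER={layer}",
        # Start from the client's own folder so it finds its work directory.
        f'cd /d "{exe.parent}"',
        f'"{exe.name}" {args}'.rstrip(),
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    script = compat_launcher(
        r"E:\Folding\GeForce_1\Folding@home-Win32-GPU.exe",
        "-verbosity 9 -local",
    )
    print(script)
```

Saving the printed text as a `.cmd` file next to the client would give each of the two clients its own launcher; whether this behaves the same as ticking the box in the properties dialog is untested here.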

Regards
Sandman
 
Re: GPU client keeps crashing

Well, I've got the 190.62 in my folding rig and haven't noticed any particular crashes... only on the upper card, since it overheats fairly often... so that one is reproducible, but it's not down to the driver.
 
Re: GPU client keeps crashing

I fold on all my machines with 190.62, without problems.

For lack of first-hand experience, though, I can't rule out that the combination of 190.62 with Win7 and 2 (353-point) clients on a GTX 260 causes problems.

If it runs in compatibility mode, leave it that way.
 
Re: GPU client keeps crashing

So, by now I'm pretty sure that compatibility mode was the solution. The two clients have been running nonstop since Friday, haven't caused any problems, and have already folded all sorts of WUs...
Thanks in any case for your tips...

Regards
Sandman
 