Linux / Folding Questions

ADGMike

F@H-Team-Member (m/w)
How do you stop the client properly?
When I use Ctrl+C as on Windows, the CPU stays at 100% load.

Is there something I need to change or configure on my side? When a WU is finished, it takes about 1 hour until "Shutting down core", and about 3 hours until it starts computing the new WU.

[00:39:08] Completed 245001 out of 250001 steps (98%)
[01:09:56] Completed 247501 out of 250001 steps (99%)
[01:40:44] Completed 250001 out of 250001 steps (100%)

Writing final coordinates.

Average load imbalance: 0.1 %
Part of the total run time spent waiting due to load imbalance: 0.1 %
Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 %


Parallel run - timing based on wallclock.

NODE (s) Real (s) (%)
Time: 185249.103 185249.103 100.0
2d03h27:29
(Mnbf/s) (GFlops) (ns/day) (hour/ns)
Performance: 492.458 26.805 0.466 51.458

gcq#0: Thanx for Using GROMACS - Have a Nice Day

[01:40:52] DynamicWrapper: Finished Work Unit: sleep=10000
[01:41:02]
[01:41:02] Finished Work Unit:
[01:41:02] - Reading up to 52713120 from "work/wudata_00.trr": Read 52713120
[01:41:02] trr file hash check passed.
[01:41:02] - Reading up to 42792752 from "work/wudata_00.xtc": Read 42792752
[01:41:02] xtc file hash check passed.
[01:41:02] edr file hash check passed.
[01:41:02] logfile size: 205947
[01:41:02] Leaving Run
[01:41:05] - Writing 95877127 bytes of core data to disk...
[01:41:09] ... Done.
[02:40:38] - Shutting down core
[02:40:38]
[02:40:38] Folding@home Core Shutdown: FINISHED_UNIT
Attempting to use an MPI routine after finalizing MPICH
[02:45:32] CoreStatus = 64 (100)
[02:45:32] Unit 0 finished with 64 percent of time to deadline remaining.
[02:45:32] Updated performance fraction: 0.692435
[02:45:32] Sending work to server
[02:45:32] Project: 2681 (Run 1, Clone 17, Gen 68)


[02:45:32] + Attempting to send results [January 21 02:45:32 UTC]
[02:45:32] - Reading file work/wuresults_00.dat from core
[02:45:32] (Read 95877127 bytes from disk)
[02:45:32] Connecting to http://171.67.108.22:8080/
[03:00:35] Posted data.
[03:00:35] Initial: 0000; - Uploaded at ~75 kB/s
[03:06:05] - Averaged speed for that direction ~99 kB/s
[03:06:05] + Results successfully sent
[03:06:05] Thank you for your contribution to Folding@Home.
[03:06:05] + Number of Units Completed: 57

[03:21:42] - Warning: Could not delete all work unit files (0): Core file absent
[03:21:42] Trying to send all finished work units
[03:21:42] + No unsent completed units remaining.
[03:21:42] - Preparing to get new work unit...
[03:21:42] Cleaning up work directory
[04:10:49] - Autosending finished units... [January 21 04:10:49 UTC]
[04:10:49] Trying to send all finished work units
[04:10:49] + No unsent completed units remaining.
[04:10:49] - Autosend completed
[04:11:15] + Attempting to get work packet
[04:11:15] - Will indicate memory of 5979 MB
[04:11:15] - Connecting to assignment server
[04:11:15] Connecting to http://assign.stanford.edu:8080/
[04:11:16] Posted data.
[04:11:16] Initial: 43AB; - Successful: assigned to (171.67.108.22).
[04:11:16] + News From Folding@Home: Welcome to Folding@Home
[04:11:16] Loaded queue successfully.
[04:11:16] Connecting to http://171.67.108.22:8080/
[04:11:55] Posted data.
[04:11:55] Initial: 0000; - Receiving payload (expected size: 30236950)
[04:12:18] - Downloaded at ~1283 kB/s
[04:12:18] - Averaged speed for that direction ~973 kB/s
[04:12:18] + Received work.
[04:12:18] Trying to send all finished work units
[04:12:18] + No unsent completed units remaining.
[04:12:18] + Closed connections
[04:12:18]
[04:12:18] + Processing work unit
[04:12:18] Core required: FahCore_a2.exe
[04:12:18] Core found.
[04:12:18] Working on queue slot 01 [January 21 04:12:18 UTC]
[04:12:18] + Working ...
[04:12:18] - Calling './mpiexec -np 8 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -nice 19 -suffix 01 -checkpoint 15 -verbose -lifeline 2509 -version 624'

[04:12:18]
[04:12:18] *------------------------------*
[04:12:18] Folding@Home Gromacs SMP Core
[04:12:18] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[04:12:18]
[04:12:18] Preparing to commence simulation
[04:12:18] - Ensuring status. Please wait.
[04:12:21] Called DecompressByteArray: compressed_data_size=30236438 data_size=159270593, decompressed_data_size=159270593 diff=0
[04:12:22] - Digital signature verified
[04:12:22]
[04:12:22] Project: 2683 (Run 14, Clone 12, Gen 18)
[04:12:22]
[04:12:22] Assembly optimizations on if available.
[04:12:22] Entering M.D.
[04:12:33] (Run 14, Clone 12, Gen 18)
[04:12:33]
[04:12:33] Entering M.D.
NNODES=8, MYRANK=0, HOSTNAME=adgmike
NODEID=0 argc=20
NNODES=8, MYRANK=2, HOSTNAME=adgmike
NODEID=2 argc=20
NNODES=8, MYRANK=3, HOSTNAME=adgmike
NODEID=3 argc=20
NNODES=8, MYRANK=5, HOSTNAME=adgmike
NNODES=8, MYRANK=6, HOSTNAME=adgmike
NODEID=6 argc=20
NNODES=8, MYRANK=7, HOSTNAME=adgmike
NNODES=8, MYRANK=4, HOSTNAME=adgmike
NODEID=4 argc=20
NNODES=8, MYRANK=1, HOSTNAME=adgmike
NODEID=1 argc=20
Reading file work/wudata_01.tpr, VERSION 3.3.99_development_20070618 (single precision)
NODEID=7 argc=20
NODEID=5 argc=20
Note: tpx file_version 48, software version 68

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 8 x 1 x 1
starting mdrun 'SINGLE VESICLE in water'
4750001 steps, 19000.0 ps (continuing from step 4500001, 18000.0 ps).
[04:43:26] Completed 2500 out of 250000 steps (1%)
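
The waits described in this post can be read straight off the timestamps. Below is a minimal Python sketch, assuming the client's log lines keep the "[HH:MM:SS]" prefix seen above. The function name find_gaps and the 10-minute threshold are made up for illustration and are not part of the Folding@home client. It lists every pair of consecutive timestamped lines that are separated by a long silence:

```python
import re
from datetime import timedelta

# "[HH:MM:SS]" prefix written by the client; a shell prompt may precede it.
STAMP = re.compile(r"\[(\d{2}):(\d{2}):(\d{2})\]\s?(.*)")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (gap, message_before, message_after) for each silence >= threshold."""
    gaps = []
    prev_time = prev_msg = None
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue  # GROMACS output and blank lines carry no timestamp
        hours, minutes, seconds, msg = match.groups()
        now = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
        if prev_time is not None:
            gap = now - prev_time
            if gap < timedelta(0):
                # The prefix has no date, so assume exactly one midnight rollover.
                gap += timedelta(days=1)
            if gap >= threshold:
                gaps.append((gap, prev_msg, msg))
        prev_time, prev_msg = now, msg
    return gaps


# Consecutive lines copied from the log above.
sample = [
    "[01:41:05] - Writing 95877127 bytes of core data to disk...",
    "[01:41:09] ... Done.",
    "[02:40:38] - Shutting down core",
    "[02:40:38]",
    "[02:40:38] Folding@home Core Shutdown: FINISHED_UNIT",
    "Attempting to use an MPI routine after finalizing MPICH",
    "[02:45:32] CoreStatus = 64 (100)",
]

for gap, before, after in find_gaps(sample):
    print(f"{gap} between {before!r} and {after!r}")
```

On this excerpt the sketch reports a single gap of 0:59:29 between "... Done." and "- Shutting down core", which is the roughly one-hour wait in question. Run over the whole log, it would also surface the later pauses before the new WU starts.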
 
Hello,
When I press Ctrl+C under Linux, the processes are terminated and the CPU goes idle, on all 8 cores.
I had similar problems at times in the beginning, but I really can't tell you exactly what caused them anymore.
I'm not completely sure anymore, but I think I reinstalled Linux back then with the default settings and without the CPU overclocked. Whenever I had installed Linux with the overclocked CPU, something did not run properly and I kept seeing unexplained CPU load or other problems. I also always had strong fluctuations when working on a big WU. Since I started doing it this way, stability under Linux has been fine. I only overclocked the CPU again after installing Linux.
In the System Monitor under Linux you can watch this behaviour nicely on the Resources tab.

Which Linux and which kernel are you using?

Three hours until the next WU does seem rather long to me, though. For me it takes on average about 1 hour, at most 2, until the next WU. I just checked, and a new one has just started.

Regards, Teci
 
Hi Teci,
I'm using Xubuntu 9.10 / kernel 2.6.31-19-generic / GNOME 2.28.1.
I'll try Ctrl+C again and wait a few minutes; if the CPU load still shows 100%, I'll try the shutdown -r command.
I'll report back.
 
It should definitely not take longer than 10 seconds for all cores to shut down.

The kernel is fine, I use it too. It performs very well.
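
One way to check whether the cores really have shut down after Ctrl+C is to look for leftover core processes. This is a rough Python sketch under a few assumptions: the system exposes the usual Linux /proc layout, and the processes are started under a name beginning with "FahCore", as in the log line "Core required: FahCore_a2.exe". The function name find_processes is hypothetical and not part of the client.

```python
import os


def find_processes(name_prefix="FahCore", proc_root="/proc"):
    """Return (pid, name) for every process whose executable name starts with name_prefix."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style system
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories describe processes
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                # Arguments are NUL-separated; the first one is the program path.
                argv0 = fh.read().split(b"\0")[0].decode(errors="replace")
        except OSError:
            continue  # process exited while we were scanning
        name = os.path.basename(argv0)
        if name.startswith(name_prefix):
            found.append((int(entry), name))
    return sorted(found)


if __name__ == "__main__":
    leftovers = find_processes()
    if leftovers:
        for pid, name in leftovers:
            print(f"still running: pid {pid} ({name})")
    else:
        print("no FahCore processes found")
```

If this still lists processes well after the 10 seconds mentioned above, the cores were not stopped by Ctrl+C and are still using the CPU.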
 
Well, Ctrl+C at 83%, and it doesn't stop.
This time it only took 2 hours to start the next WU:
[22:35:13] Completed 207500 out of 250000 steps (83%)
^C
adgmike@adgmike:~/fah$ [23:06:31] Completed 210000 out of 250000 steps (84%)
[23:40:09] Completed 212500 out of 250000 steps (85%)
[00:10:44] Completed 215000 out of 250000 steps (86%)
[00:41:19] Completed 217500 out of 250000 steps (87%)
[01:11:55] Completed 220000 out of 250000 steps (88%)
[01:42:30] Completed 222500 out of 250000 steps (89%)
[02:13:07] Completed 225000 out of 250000 steps (90%)
[02:43:42] Completed 227500 out of 250000 steps (91%)
[03:14:18] Completed 230000 out of 250000 steps (92%)
[03:44:53] Completed 232500 out of 250000 steps (93%)
[04:10:49] - Autosending finished units... [January 23 04:10:49 UTC]
[04:10:49] Trying to send all finished work units
[04:10:49] + No unsent completed units remaining.
[04:10:49] - Autosend completed
[04:15:29] Completed 235000 out of 250000 steps (94%)
[04:46:05] Completed 237500 out of 250000 steps (95%)
[05:16:41] Completed 240000 out of 250000 steps (96%)
[05:47:17] Completed 242500 out of 250000 steps (97%)
[06:17:53] Completed 245000 out of 250000 steps (98%)
[06:49:03] Completed 247500 out of 250000 steps (99%)
[07:19:40] Completed 250000 out of 250000 steps (100%)

Writing final coordinates.

Average load imbalance: 0.2 %
Part of the total run time spent waiting due to load imbalance: 0.1 %
Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 %


Parallel run - timing based on wallclock.

NODE (s) Real (s) (%)
Time: 184015.990 184015.990 100.0
2d03h06:55
(Mnbf/s) (GFlops) (ns/day) (hour/ns)
Performance: 493.232 26.895 0.470 51.115

gcq#0: Thanx for Using GROMACS - Have a Nice Day

[07:19:48] DynamicWrapper: Finished Work Unit: sleep=10000
[07:19:58]
[07:19:58] Finished Work Unit:
[07:19:58] - Reading up to 52544928 from "work/wudata_01.trr": Read 52544928
[07:19:58] trr file hash check passed.
[07:19:58] - Reading up to 41980720 from "work/wudata_01.xtc": Read 41980720
[07:19:58] xtc file hash check passed.
[07:19:58] edr file hash check passed.
[07:19:58] logfile size: 204666
[07:19:58] Leaving Run
[07:20:01] - Writing 94895230 bytes of core data to disk...
[07:20:04] ... Done.
adgmike@adgmike:~/fah$ [08:24:08] - Shutting down core
[08:24:08]
[08:24:08] Folding@home Core Shutdown: FINISHED_UNIT
Attempting to use an MPI routine after finalizing MPICH
[08:30:05] CoreStatus = 64 (100)
[08:30:05] Unit 1 finished with 64 percent of time to deadline remaining.
[08:30:05] Updated performance fraction: 0.681314
[08:30:05] Sending work to server
[08:30:05] Project: 2683 (Run 14, Clone 12, Gen 18)


[08:30:05] + Attempting to send results [January 23 08:30:05 UTC]
[08:30:05] - Reading file work/wuresults_01.dat from core
[08:30:05] (Read 94895230 bytes from disk)
[08:30:05] Connecting to http://171.67.108.22:8080/
[08:43:23] Posted data.
[08:43:24] Initial: 0000; - Uploaded at ~80 kB/s
[08:49:10] - Averaged speed for that direction ~96 kB/s
[08:49:10] + Results successfully sent
[08:49:10] Thank you for your contribution to Folding@Home.
[08:49:10] + Number of Units Completed: 58

[09:04:40] - Warning: Could not delete all work unit files (1): Core file absent
[09:04:40] Trying to send all finished work units
[09:04:40] + No unsent completed units remaining.
[09:04:40] - Preparing to get new work unit...
[09:04:40] Cleaning up work directory
[09:09:14] + Attempting to get work packet
[09:09:14] - Will indicate memory of 5979 MB
[09:09:14] - Connecting to assignment server
[09:09:14] Connecting to http://assign.stanford.edu:8080/
[09:09:15] Posted data.
[09:09:15] Initial: 43AB; - Successful: assigned to (171.67.108.22).
[09:09:15] + News From Folding@Home: Welcome to Folding@Home
[09:09:15] Loaded queue successfully.
[09:09:15] Connecting to http://171.67.108.22:8080/
[09:10:00] Posted data.
[09:10:00] Initial: 0000; - Receiving payload (expected size: 30234017)
[09:10:24] - Downloaded at ~1230 kB/s
[09:10:24] - Averaged speed for that direction ~1024 kB/s
[09:10:24] + Received work.
[09:10:24] Trying to send all finished work units
[09:10:24] + No unsent completed units remaining.
[09:10:24] + Closed connections
[09:10:24]
[09:10:24] + Processing work unit
[09:10:24] Core required: FahCore_a2.exe
[09:10:24] Core found.
[09:10:24] Working on queue slot 02 [January 23 09:10:24 UTC]
[09:10:24] + Working ...
[09:10:24] - Calling './mpiexec -np 8 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -nice 19 -suffix 02 -checkpoint 15 -verbose -lifeline 2509 -version 624'

[09:10:24]
[09:10:24] *------------------------------*
[09:10:24] Folding@Home Gromacs SMP Core
[09:10:24] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[09:10:24]
[09:10:24] Preparing to commence simulation
[09:10:24] - Ensuring status. Please wait.
[09:10:27] Called DecompressByteArray: compressed_data_size=30233505 data_size=159270593, decompressed_data_size=159270593 diff=0
[09:10:27] - Digital signature verified
[09:10:27]
[09:10:27] Project: 2683 (Run 8, Clone 17, Gen 28)
[09:10:27]
[09:10:28] Assembly optimizations on if available.
[09:10:28] Entering M.D.
[09:10:38] (Run 8, Clone 17, Gen 28)
[09:10:38]
[09:10:39] Entering M.D.
NNODES=8, MYRANK=0, HOSTNAME=adgmike
NODEID=0 argc=20
NNODES=8, MYRANK=3, HOSTNAME=adgmike
NNODES=8, MYRANK=4, HOSTNAME=adgmike
NODEID=4 argc=20
NNODES=8, MYRANK=5, HOSTNAME=adgmike
NODEID=5 argc=20
NNODES=8, MYRANK=7, HOSTNAME=adgmike
NNODES=8, MYRANK=2, HOSTNAME=adgmike
NODEID=2 argc=20
NODEID=3 argc=20
NNODES=8, MYRANK=6, HOSTNAME=adgmike
NODEID=6 argc=20
Reading file work/wudata_02.tpr, VERSION 3.3.99_development_20070618 (single precision)
NODEID=7 argc=20
NNODES=8, MYRANK=1, HOSTNAME=adgmike
NODEID=1 argc=20
Note: tpx file_version 48, software version 68

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 8 x 1 x 1
starting mdrun 'SINGLE VESICLE in water'
7250000 steps, 29000.0 ps (continuing from step 7000000, 28000.0 ps).
[09:11:01] Completed 0 out of 250000 steps (0%)
[09:41:44] Completed 2500 out of 250000 steps (1%)
I didn't want to risk the 83%, so I just let it run. Now, with the new WU, I'll try it with "shutdown -r".
 