Below are some results for a 3D problem, run on a Linux cluster at UTEP.
On this problem the iterative method (ISOLVE=3) was much faster than
the others. On many more difficult problems, however, the iterative method
converges very slowly or not at all, and then the direct methods
(ISOLVE=1, a sparse direct solver; ISOLVE=2, a frontal method;
ISOLVE=6, a parallel band solver) are the only options.
Uxx+Uyy+Uzz=3U in the unit cube.
| ISOLVE | procs | 13x13x13 grid     | 20x20x20 grid     | 20x20x20 grid    |
|        |       | 17576 unknowns    | 64000 unknowns    |                  |
|        |       | rel. err. = 5.E-8 | rel. err. = 9.E-9 |                  |
|        |       | time (seconds)    | time (seconds)    | memory/node (MW) |
| 1      | 1     | 117               | 2566              | 517              |
| 2      | 1     | 432               | (large)           | 22               |
| 3      | 1     | 10                | 68                | 13               |
|        | 2     | 7                 | 42                | 13               |
|        | 4     | 6                 | 36                | 7                |
|        | 8     | 6                 | 37                | 4                |
|        | 16    | 7                 | 39                | 2                |
| 6      | 1     | 276               | 14953             | 643              |
|        | 2     | 106               | 1893              | 327              |
|        | 4     | 87                | 1104              | 164              |
|        | 8     | 52                | 653               | 82               |
|        | 16    | 43                | 377               | 41               |
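The scaling in the table can be summarized as speedup and parallel efficiency relative to the one-processor run. The following is a minimal Python sketch of that arithmetic for the 20x20x20 grid. It is not part of PDE2D, and the names `times` and `speedup_table` are illustrative only. The timings are copied from the table.

```python
# Wall-clock seconds on the 20x20x20 grid, copied from the table above.
# Keys of the outer dict are ISOLVE values; inner keys are processor counts.
times = {
    3: {1: 68, 2: 42, 4: 36, 8: 37, 16: 39},           # iterative method
    6: {1: 14953, 2: 1893, 4: 1104, 8: 653, 16: 377},  # parallel band solver
}


def speedup_table(timings):
    """Return {procs: (speedup, efficiency)} relative to the 1-processor time.

    speedup    = T(1) / T(p)
    efficiency = speedup / p
    """
    base = timings[1]
    return {p: (base / t, base / (t * p)) for p, t in timings.items()}


for isolve, timings in times.items():
    for procs, (s, e) in speedup_table(timings).items():
        print(f"ISOLVE={isolve} procs={procs:2d} "
              f"speedup={s:6.2f} efficiency={e:5.2f}")
```

With these numbers, the ISOLVE=3 speedup peaks at about 1.9 on 4 processors and then declines slightly. The ISOLVE=6 speedups computed this way exceed the processor count (about 7.9 on 2 processors and about 39.7 on 16). The source does not explain this.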
For PDE2D, multiple processors are primarily useful for 3D problems,
because on 2D problems the sparse direct solver is usually (though not
always) faster on one processor than the parallel methods on many
processors. Multiple processors are useful for 2D problems, however, when
all eigenvalues of an eigenproblem are calculated, as illustrated by the
example below.
Uxx+Uyy=λU in the unit circle.
| procs | 21x51 grid                           |
|       | 4284 unknowns                        |
|       | 1.0E-7 rel. err. in first eigenvalue |
|       | time (seconds)                       |
| 1     | 3453                                 |
| 2     | 1524                                 |
| 4     | 1044                                 |
| 8     | 795                                  |
| 16    | 1171                                 |
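To see where adding processors stops paying off on this eigenproblem, one can compare each run with the run that uses twice as many processors. The short Python sketch below does that with the timings from the table. The variable names are illustrative and nothing here is a PDE2D interface.

```python
# Seconds to compute all eigenvalues on the 21x51 grid, from the table above.
eig_times = {1: 3453, 2: 1524, 4: 1044, 8: 795, 16: 1171}

# Processor count with the shortest run time, and its speedup over 1 processor.
best_procs = min(eig_times, key=eig_times.get)
best_speedup = eig_times[1] / eig_times[best_procs]
print(f"fastest: {best_procs} processors, speedup {best_speedup:.2f}")

# Gain from each doubling of the processor count: T(p) / T(2p).
# A ratio below 1 means the doubled run was slower.
doubling_gain = {
    p: eig_times[p] / eig_times[2 * p]
    for p in sorted(eig_times)
    if 2 * p in eig_times
}
for p, gain in doubling_gain.items():
    print(f"{p:2d} -> {2 * p:2d} processors: time ratio {gain:.2f}")
```

On these timings the fastest run uses 8 processors, about 4.3 times faster than one processor. Going from 8 to 16 processors gives a ratio below 1, meaning the 16-processor run took longer.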