\begin{verbatim}
PARDISO - Release Notes - version 6.2.0
---------------------------------------
PARDISO Contents
----------------
PARDISO - a direct and iterative sparse linear solver library - is a
tuned math solver library designed for high performance on homogeneous
multicore machines and clusters of multicores, including Intel Xeon, Intel
Itanium, AMD Opteron, and IBM Power processors. It includes both 32-bit
and 64-bit library versions. Different versions are available for Linux,
Mac OS X, and Windows 64-bit operating systems.
A full suite of sparse direct linear solver routines for all kinds of
sparse matrices is provided, and key data structures have been designed
for high performance on various multicore processors and clusters of
multicore processors in 64-bit mode.
A selected suite of iterative linear solvers for real and complex symmetric
indefinite matrices is provided that takes advantage of a new algebraic
multilevel incomplete factorization that is especially designed for good
performance in very large-scale applications.
New Features
------------
New features of release PARDISO 6.2.0:
(o) Added incremental LU updates: multiple-rank parallel update algorithms for
sparse LU factors. The incremental update has proven to be very useful
for transient simulations in circuit simulation.
(o) Added much faster internal block factorization method.
New features of release PARDISO 6.1.0:
(o) Added approximate minimum degree orderings.
New features of release PARDISO 6.0.0:
(o) Added support for the R-INLA project.
(o) Added acceleration of block orderings for symmetric indefinite matrices.
(o) Significantly improved the reordering time for matrices including dense columns.
(o) Added METIS 5.1 as additional preprocessing method.
(o) Improved scalability for factorization on higher number of cores.
(o) Added out-of-core option for real and complex symmetric indefinite matrices.
(o) New internal data structure to simplify future developments.
New features of release PARDISO 5.0.0:
(o) Switch to host-free license for all 64-bit libraries. This allows all users
to use the PARDISO software within a cluster environment.
(o) Full support of multi-threaded Schur-complement computations for all
kinds of matrices.
(o) Full support of multi-threaded parallel selected inversion to compute selected
entries of the inverse of A.
(o) Faster multi-threaded code for the solution of multiple right-hand sides.
(o) Full support of 32-bit sequential and parallel factorizations for all kinds
of matrices.
New features of release PARDISO 4.1.3:
(o) Bug fix in the computation of the log of the determinant.
New features of release PARDISO 4.1.2:
(o) Support of different licensing types
- Evaluation license for academic and commercial use.
- Academic license (host-unlocked, user-locked, 1 year).
- Commercial single-user license (host-unlocked, user-locked, 1 year).
- Commercial license (host-unlocked, user-unlocked, redistributable, 1 year).
New features of release PARDISO 4.1.0:
(o) New MPI-based numerical factorization and parallel forward/backward
substitution on distributed-memory architectures for symmetric
indefinite matrices. PARDISO 4.1.0 has the unique feature among all
solvers that it can compute the exact bit-identical solution
on multicores and clusters of multicores. Here are some results for
a nonlinear FE model with 800'000 elements from automobile sheet
metal forming simulations.
CPUs per node:
4 x Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (8 cores in total)
Memory per node:
12 GiB, Interconnect: Infiniband 4xQDR
(t_fact: factorization time in seconds, t_solve: solve time in seconds)
PARDISO 4.1.0 (deterministic) :
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
t_fact   | 1 core   4 cores   8 cores
---------+--------+---------+---------
1 host   | 92.032    23.312    11.966
2 hosts  | 49.051    12.516     7.325
4 hosts  | 31.646     8.478     5.018
t_solve  | 1 core   4 cores   8 cores
---------+--------+---------+---------
1 host   |  2.188     0.767     0.545
2 hosts  |  1.205     0.462     0.358
4 hosts  |  0.856     0.513     0.487
Intel MKL 10.2 (non-deterministic):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
t_fact   | 1 core   4 cores   8 cores
---------+--------+---------+---------
1 host   | 94.566    27.266    14.018
t_solve  | 1 core   4 cores   8 cores
---------+--------+---------+---------
1 host   |  2.223     2.183     2.207
- The MPI version is only available for academic research purposes.
(o) New host-unlimited licensing mechanism integrated into PARDISO 4.1.0.
We now have two different options available:
- a time-limited user-host locked license, and
- a time-limited user-locked (host-free) license.
(o) 32-bit sequential and parallel factorization and solve routines for
real unsymmetric matrices (matrix_type = 11). Mixed-precision refinement
can be used for these 32-bit sparse direct factorizations.
(o) Additional routines that can check the input data (matrix, right-hand side)
(contribution from Robert Luce, TU Berlin).
New features of release 4.0.0 of PARDISO since version 3.3.0:
(o) Due to the new features the interface to PARDISO and PARDISOINIT has
changed! This version is not backward compatible!
(o) Reproducibility of exact numerical results on multi-core architectures.
The solver is now able to compute the exact bit-identical solution
independent of the number of cores without affecting the scalability.
Here are some results for a nonlinear FE model with 500'000 elements.
Intel MKL PARDISO 10.2:
1 core - factor: 17.980 sec., solve: 1.13 sec.
2 cores - factor: 9.790 sec., solve: 1.13 sec.
4 cores - factor: 6.120 sec., solve: 1.05 sec.
8 cores - factor: 3.830 sec., solve: 1.05 sec.
U Basel PARDISO 4.0.0:
1 core - factor: 16.820 sec., solve: 1.09 sec.
2 cores - factor: 9.021 sec., solve: 0.67 sec.
4 cores - factor: 5.186 sec., solve: 0.53 sec.
8 cores - factor: 3.170 sec., solve: 0.43 sec.
This method is currently only working for symmetric indefinite matrices.
(o) 32-bit sequential and parallel factorization and solve routines for
real symmetric indefinite matrices, for symmetric complex matrices
and for structurally symmetric matrices. Mixed-precision refinement
is used for these 32-bit sparse direct factorizations.
(o) Internal 64-bit integer data structures for the numerical factors make
it possible to solve very large sparse systems with over 2^32 nonzeros in
the sparse direct factors.
(o) Work has been done to significantly improve the parallel performance of
the sparse direct solver which results in a much better scalability for
the numerical factorization and solve on multicore machines. At the same
time, the workspace memory requirements have been substantially reduced,
making the PARDISO direct routine better able to deal with large problem
sizes.
(o) Integration of a parallel multi-threaded METIS reordering that helps to
accelerate the reordering phase (contributed by Stefan Roellin, ETH Zurich).
(o) Integration of a highly efficient preconditioning method that is
based on a multi-recursive incomplete factorization scheme and
stabilized with a new graph-pivoting algorithm. The method has been
selected by the SIAM Journal on Scientific Computing as a very important
milestone in the area of new solvers for symmetric indefinite matrices,
and the related paper appeared as a SIGEST paper in SIAM Review in 2008.
This preconditioner is highly effective for large-scale matrices with
millions of equations.
[1] O. Schenk, M. Bollhoefer, and R. Roemer, On large-scale
diagonalization techniques for the Anderson model of localization.
Featured SIGEST paper in the SIAM Review selected "on the basis of its
exceptional interest to the entire SIAM community".
SIAM Review 50 (2008), pp. 91--112.
(o) Support of 32-bit and 64-bit Windows operating systems (based on
Intel Professional Compiler Suite and the Intel MKL Performance
Library)
(o) A new extended interface to the direct and iterative solvers.
Double-precision parameters are passed to the solver in a dparm array.
The interface allows for greater flexibility in the storage of input
and output data within supplied arrays through the setting of increment
arguments.
(o) Note that the interface to PARDISO and PARDISOINIT has changed and that
this version is not backward compatible.
(o) Computation of the determinant for symmetric indefinite matrices.
(o) Solve A^T x = b using the factorization of A.
(o) The solution process, e.g. LUx = b, can be performed in several phases
that the user can control.
(o) This version of PARDISO is compatible with the interior-point optimization
package IPOPT version 3.7.0.
(o) A new MATLAB interface has been added that allows flexible use of all
direct and iterative solvers.
Contributions
-------------
The following colleagues have contributed to the solver (in alphabetical order):
Peter Carbonetto (UBC Vancouver, Canada)
Radim Janalik (USI Lugano, Switzerland)
George Karypis (U Minnesota, US)
Arno Liegmann (ETHZ, Switzerland)
Esmond Ng (LBNL, US)
Stefan Roellin (ETHZ, Switzerland)
Michael Saunders (Stanford, US)
Applicability
-------------
Different PARDISO versions are provided for use with the GNU compilers
gfortran/gcc, with the Intel ifort compiler, and (for use under Solaris)
with the Sun f95/cc compilers.
Required runtime libraries under Microsoft Windows
--------------------------------------------------
PARDISO version 4.0.0 and later link with the standard runtime library
provided by the Microsoft Visual Studio 2008 compilers. The machine that
PARDISO runs on must therefore either have VS2K8 (or the Windows SDK for
Windows Server 2008) installed, or have the runtime libraries installed
separately from the appropriate Microsoft platform links provided below:
Visual Studio 2K8 Redist:
x86
x64
Bug Reports
-----------
Bugs should be reported to info@panua.ch with the string
"PARDISO-Support" in the subject line.
\end{verbatim}