Andrea and George,
1) Please write a standalone test/timing program for the QT code so that we can profile it using standard tools. 2) Compare it with a call to dgemv() using Lapack + an optimized Blas (possibly from a Matlab distribution). 3) Upload the code to SVN, so that we can test it on other machines. The huge variability between the two runs reported by George may be due to Windows; it is usually less pronounced under Linux.
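A standalone timing program of the kind requested in 1) could be sketched along these lines. This is only an illustration: `matvec` is a plain C++ stand-in for the kernel under test (the QT routine or dgemv() would be dropped in instead), and `time_trials` is a hypothetical helper name, not part of any existing code.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Dense column-major y = A*x; a plain C++ stand-in for the kernel under test.
void matvec(std::size_t n, const std::vector<double>& A,
            const std::vector<double>& x, std::vector<double>& y) {
    assert(A.size() == n * n && x.size() == n && y.size() == n);
    std::fill(y.begin(), y.end(), 0.0);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < n; ++i)
            y[i] += A[i + j * n] * x[j];
}

// Hypothetical helper: call `kernel` `iters` times per trial, repeat for
// `trials` trials, and return the per-trial wall-clock times in seconds,
// sorted in ascending order.
template <typename Kernel>
std::vector<double> time_trials(Kernel kernel, int iters, int trials) {
    assert(iters > 0 && trials > 0);
    std::vector<double> seconds;
    for (int t = 0; t < trials; ++t) {
        auto start = std::chrono::steady_clock::now();
        for (int k = 0; k < iters; ++k) kernel();
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        seconds.push_back(elapsed.count());
    }
    std::sort(seconds.begin(), seconds.end());
    return seconds;
}
```

Reporting the smallest and the middle element of the returned vector, rather than a single run, is one way to make the run-to-run variability visible instead of letting it distort the comparison.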
All the best,
Michel
G. Perendia wrote:
Dear Andrea
Thanks for the new libraries.
I ran some initial performance tests today for the simple T*a matrix*vector multiplication with two different QT matrix sizes. In summary, this is what I am getting:
10000 iteration loop with 100x100 random QT matrix (from qz decomposition) and a vector:
1st & 2nd run (after restart)
Native matlab matrix multiplication in a loop
Ta_time = 0.3010 & 0.6610
Calling dgemv() using Sylv Vector and General Matrix is faster than Matlab loop:
GMcppTaInnrLoop_time = 0.1600 & 0.3310
Calling QT f90 library using Sylv Vector and General Matrix:
QTcppTaInnerLoop_time = 8.5730 & 20.5300
Calling QT f90 library without use of Sylv Vector and General Matrix but using only pure C/C++ double arrays is only marginally faster:
QTcpp_noSylv_TaInnerLoop_time = 8.4420
1000 iteration loop with 10x10 random QT matrix and vector:
For a 10x10 matrix, calling the QT f90 library takes about twice the time the Matlab loop does, but dgemv is still faster.
Matlab: 0.0400
GMcppTaInnrLoop_time = 0.0300
QTcppTaInnerLoop_time = 0.0800
It is, however, possible that the MinGW f95 I am using is not the best optimising compiler that can be used and/or that the tests for PTP', which I am planning to do next, may turn out better.
What are your thoughts? Do you think that we may be able to improve the performance of this multiplication somehow?
I wonder if making many cross-language calls may be rather detrimental, and whether we may improve performance if we reduce this high level of modularisation and calling, e.g. by using a higher level subroutine that will perform all operations within f90, passing back only the final Ta?
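As a baseline that involves no cross-language calls at all, the T*a product can also be written directly in C++. The sketch below is not the QT f90 library's algorithm; it assumes column-major storage and that "quasi triangular" means T(i,j) = 0 whenever i > j + 1 (as for the quasi-upper-triangular factor of a QZ decomposition). `qt_matvec` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y = T*a for an n-by-n quasi-upper-triangular T stored column-major.
// Assumption: T(i,j) == 0 whenever i > j + 1, so those entries are never read.
std::vector<double> qt_matvec(std::size_t n, const std::vector<double>& T,
                              const std::vector<double>& a) {
    assert(T.size() == n * n && a.size() == n);
    std::vector<double> y(n, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
        // Column j has nonzeros only in rows 0 .. min(j + 1, n - 1).
        std::size_t last = (j + 1 < n) ? j + 1 : n - 1;
        for (std::size_t i = 0; i <= last; ++i)
            y[i] += T[i + j * n] * a[j];
    }
    return y;
}
```

Timing this against the per-call QT and dgemv figures above would help separate the cost of the arithmetic from the cost of the call interface.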
NOTES:
After a restart, Matlab appears to be much slower than later on!
Also, the Matlab multiplication reports both a real and an imaginary part, so the result appears complex, but the real part matches the QT and dgemv outputs.
Best regards
George artilogica@btconnect.com
----- Original Message ----- From: "Michel Juillard" michel.juillard@ens.fr To: "andrea pagano" pagano.andrea@gmail.com Cc: "G. Perendia" george@perendia.orangehome.co.uk Sent: Friday, June 19, 2009 1:33 PM Subject: Re: Quasi triangular matrices in Kalman filters
Thanks Andrea
amities
Michel
andrea pagano wrote:
Hi all, I would go for subroutines. I will do it over the weekend while looking at other possibilities such as Fortran pointers.
Best
Andrea
On Fri, Jun 19, 2009 at 10:04 AM, G. Perendia george@perendia.orangehome.co.uk wrote:
Dear Andrea
Problem:
I have encountered a problem integrating KalmanFilter with the f90 QT library - passing the QT result arrays back to C++.
The QT Fortran routines have been written in the standard Fortran FUNCTION format (i.e., not SUBROUTINE), so that they return a one- or two-dimensional array (the result they are named by), by value (not by reference). However, it appears that only simple, single variables (e.g. INT or REAL) can be passed from Fortran FUNCTIONs back to C++.
On the other hand, NAG, BLAS and LAPACK routines have all been written as Fortran SUBROUTINEs and they can be integrated with C more easily: they receive parameters and return their results through the variables passed as calling parameters, by reference.
For example, dgemv.f from the BLAS library gets Y by reference and returns the modified Y through the same calling parameter.
SUBROUTINE DGEMV(TRANS,M,N,ALPHA,A,LDA,X,INCX,BETA,Y,INCY)
....
- Y - DOUBLE PRECISION array of DIMENSION at least
...
Before entry .... the incremented array Y
must contain the vector y. On exit, Y is overwritten by the
updated vector y.
....
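Seen from C++, this by-reference convention means that every argument, including the scalars, is passed as a pointer and the result is written into the caller's Y. The sketch below uses `dgemv_like`, a hypothetical stand-in written in C++ with the same argument shape, so that it compiles without a BLAS library; it handles only TRANS = 'N' or 'T' and positive increments. With a real BLAS one would instead declare the Fortran routine as `extern "C"`, commonly under the name `dgemv_`, although the trailing underscore depends on the Fortran compiler.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in with the argument shape of BLAS DGEMV: every argument
// is a pointer, A is column-major, and y is overwritten with
// alpha*op(A)*x + beta*y. Only TRANS = 'N' or 'T' and positive increments
// are handled in this sketch.
void dgemv_like(const char* trans, const int* m, const int* n,
                const double* alpha, const double* A, const int* lda,
                const double* x, const int* incx,
                const double* beta, double* y, const int* incy) {
    assert(*incx > 0 && *incy > 0 && *lda >= *m);
    const bool transpose = (*trans == 'T' || *trans == 't');
    const int rows = transpose ? *n : *m;  // length of y
    const int cols = transpose ? *m : *n;  // length of x
    for (int i = 0; i < rows; ++i) {
        double sum = 0.0;
        for (int j = 0; j < cols; ++j) {
            const double a_ij =
                transpose ? A[j + i * (*lda)] : A[i + j * (*lda)];
            sum += a_ij * x[j * (*incx)];
        }
        // The updated value is stored in the caller's array, as in the
        // "On exit, Y is overwritten" description above.
        y[i * (*incy)] = (*alpha) * sum + (*beta) * y[i * (*incy)];
    }
}
```

Because the output lives in memory owned by the caller, no array has to be returned by value across the language boundary.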
Poss. Solutions:
I could not find any references on how to get an array from a Fortran FUNCTION as a return value back to C. Do you or anyone around you know how to do it, if it is at all possible? In any case, passing an array by value is also not recommended, as it is rather uneconomical, especially for larger matrices.
One alternative I can think of is the less explored option of returning a Fortran pointer to the resulting array from the QT functions instead of the array by value. I think I can work that out, but suggestions are more than welcome.
I can see a few options:
a) rewrite the QT library as SUBROUTINE instead of FUNCTION routines, or,
b) try to use Fortran pointers and, if we can, then also rewrite the QT library to return pointers, or,
c) write 3 or more high-level f90 shell SUBROUTINEs calling the existing and unmodified QT functions and performing all the operations needed to construct the resulting Ta and TPT' (for both cases 1 and 2), instead of doing low-level QT manipulation in C++.
This way the QT library need not be changed and those new SUBROUTINEs will also act as the interface with C++. I think this is the most productive and optimal alternative of the three, since those combination utilities would have to be written anyway, except that it seems easier to do that now in Fortran than in C++.
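The shape of such a high-level routine, as seen from the calling side, can be sketched in C++: a single call receives T and P and fills a caller-provided buffer with TPT'. The name `tpt_transpose` and its signature are hypothetical, the arithmetic is a plain dense product rather than the QT functions, and the output buffer is assumed not to overlap T or P.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical high-level routine: out = T * P * T' for n-by-n column-major
// matrices. The result is written into memory owned by the caller, so no
// array is returned by value. Assumes out does not alias T or P.
void tpt_transpose(std::size_t n, const double* T, const double* P,
                   double* out) {
    assert(T != nullptr && P != nullptr && out != nullptr);
    std::vector<double> TP(n * n, 0.0);  // workspace holding T * P
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t i = 0; i < n; ++i)
                TP[i + j * n] += T[i + k * n] * P[k + j * n];
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < n; ++i) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += TP[i + k * n] * T[j + k * n];  // T'(k,j) = T(j,k)
            out[i + j * n] = sum;
        }
}
```

With an interface of this kind the boundary between C++ and the numerical code is crossed once per TPT' product instead of once per low-level QT operation.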
If you like and/or are busy, I can develop the Ta and the first case of TPT' SUBROUTINEs by Monday, whilst the second case may need more thinking and a more granular approach to take advantage of multiple processors.
Please let me know your thoughts on this issue, and whether you have time to make the needed changes or additions in the f90 files.
Best regards
George artilogica@btconnect.com
----- Original Message ----- From: "andrea pagano" pagano.andrea@gmail.com To: george@perendia.orangehome.co.uk Cc: "Michel Juillard" Michel.Juillard@ens.fr Sent: Monday, June 01, 2009 7:47 PM Subject: Quasi triangular matrices in Kalman filters
Hi all
I am sending you a set of Fortran routines to calculate the matrix expressions in the Kalman filter, together with some explanations.
I hope they can be a starting point in optimizing the overall computation.
Best
Andrea
-- Andrea Pagano via Veratti VARESE tel. +3903321691261 cell.+393403804397