Dear Michel
After the first cut of refactoring, we have mixed results: about a 19% performance improvement and roughly a 30% reduction in copy-constructor use running the small model, as expected, though no significant performance improvement could be measured on the larger models (euro_sw3.mod) yet. I am looking into other possible causes for that lack of improvement.
Best regards
George
----- Original Message -----
From: "Michel Juillard" michel.juillard@ens.fr
To: "G. Perendia" george@perendia.orangehome.co.uk; "List for Dynare developers" dev@dynare.org
Sent: Sunday, June 07, 2009 9:01 PM
Subject: Re: [DynareDev] Kalman Filter
Thanks George,
very interesting. Please proceed with simplifying the KF code.
All the best,
Michel
G. Perendia wrote:
Dear Michel
I already have a stand-alone exe test I used last week (and uploaded it today too) and I ran it through gprof earlier today (though after wasting some time last week trying to use a reportedly sophisticated profiler, CodeAnalyst from AMD, which so far I could not get to work at all).
The other profiling tool I initially used (and reported on) last week ("Very Sleepy") is an "external" profiler: it does not get into the code details itself, but can be attached externally to either the stand-alone exe test or the Matlab-running DLL thread. For both it reported a lot of time (~40%) spent in the ztrsv solver and, at one snapshot, also pointed to a lot of time spent in dtrsv and GeneralMatrix copy (50% each). In contrast, gprof is a more internal, higher-resolution profiler: it puts the load weight (>10%) on housekeeping functions but does not even mention calls to external BLAS library functions such as dtrsv and ztrsv.
Both profiling tools, however, seem to confirm what my early code inspection concluded too: a very heavy use of the (not very productive) GeneralMatrix copy constructor. For example, the C++ Kalman filter stores copies of a few of the main time-variant system matrices (F, P and an intermediate L) for each step of the time-series evaluation, and also creates a copy of the input T, H and Z at each step as if they might be time-variant too, although that would not be the case for a simple, non-diffuse KF without missing observations.
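The copy-avoidance idea above can be sketched in a few lines. This is a hypothetical illustration, not the actual Dynare classes: a filter-step object holds const references to the time-invariant inputs (T, Z here) so they are never copied, and owns only the genuinely time-varying covariance P. The `Matrix` and `FilterStep` names are assumptions for the sketch.

```cpp
#include <utility>
#include <vector>

// Minimal row-major matrix; only what the sketch needs.
struct Matrix {
    int rows, cols;
    std::vector<double> data;
    Matrix(int r, int c) : rows(r), cols(c), data(static_cast<std::size_t>(r) * c, 0.0) {}
    double& at(int i, int j)       { return data[static_cast<std::size_t>(i) * cols + j]; }
    double  at(int i, int j) const { return data[static_cast<std::size_t>(i) * cols + j]; }
};

// One filter step: time-invariant inputs are held by const reference
// (no copy-constructor calls at all), while the time-varying P is owned
// and updated in place across steps.
struct FilterStep {
    const Matrix& T;  // transition matrix, never copied
    const Matrix& Z;  // measurement matrix, never copied
    Matrix P;         // covariance, moved in once and updated in place
    FilterStep(const Matrix& T_, const Matrix& Z_, Matrix P0)
        : T(T_), Z(Z_), P(std::move(P0)) {}
};
```

For a simple non-diffuse filter without missing observations this removes the per-step copies of T, H and Z entirely; only matrices that truly change each step need storage.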
This then resulted in a high percentage of time spent in the GeneralMatrix copy() function (which is called explicitly by the copy constructor), as reported by both profiling programs: Sleepy gives it up to 50% at one snapshot point, whilst gprof gives it first rank, with 11% on its own or 27.3% of total with its children.
The copy() function is followed by utility functions such as the two varieties (const and non-const) of the Vector indexing [] operator, and by the const and non-const varieties of the GeneralMatrix::get() element accessors, which utilise the aforementioned Vector indexing [] operator and are themselves directly called, among others, from the heavily used GeneralMatrix copy function.
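The cost of element-by-element copying through get()/operator[] can be illustrated with a small sketch (hypothetical function names, standing in for the GeneralMatrix internals): when the underlying storage is contiguous, the whole copy can be done with one std::copy (or memcpy) instead of one accessor call per element.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// What the profile suggests the hot path looks like: one indexed
// access per element, going through operator[]-style accessors.
void copy_elementwise(const std::vector<double>& src, std::vector<double>& dst) {
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = src[i];
}

// The refactored form: one bulk copy over the contiguous storage,
// removing the per-element indexing overhead entirely.
void copy_bulk(const std::vector<double>& src, std::vector<double>& dst) {
    std::copy(src.begin(), src.end(), dst.begin());
}
```

Both produce identical results; only the per-element call overhead differs, which is exactly the overhead the profilers attribute to copy() and its children.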
According to gprof, the above high burden of the copy constructor and related functions is only then followed by the productive functions such as PLUFact::multInvRight, a matrix multiplication with inversion (used for inversion of the F matrix), the GeneralMatrix constructors, and GeneralMatrix::gemm(), a general matrix multiplication (itself calling BLAS dgemm), with 4.7, 3.1 and 2.6% of total time respectively.
NOTE, however, that the gprof profiler paints a somewhat different picture: it does not even mention the external BLAS functions such as the dtrsv and ztrsv solvers reported as heavy users by the VerySleepy "external" profiler.
All in all, it appears from both profiler reports and my initial inspection that, for a start (and as I initially intended and suggested), we should refactor the current heavy use of the unproductive GeneralMatrix copy constructor and its current reliance on the element-by-element get() function before we get into any further performance improvements of the productive functions and external libraries.
Best regards
George
----- Original Message -----
From: "Michel Juillard" michel.juillard@ens.fr
To: "List for Dynare developers" dev@dynare.org
Sent: Saturday, June 06, 2009 3:08 PM
Subject: Re: [DynareDev] Kalman Filter
There are tools to do profiling in C++. All we need is a standalone executable calling the filter. Don't lose time adding timing functions inside the code. It may be difficult to do profiling in Windows. In that case, just prepare the code and we will do the profiling in Linux.
Best
Michel
G. Perendia wrote:
Dear Michel
Yes, as agreed initially, these are the Matlab Dynare KF measures, mainly to show the proportion of inversion vs. pure update in the Matlab KF. I have not yet done fine profiling for C++, so there is not much to upload either.
- I agree.
Best regards
George
----- Original Message -----
From: "Michel Juillard" michel.juillard@ens.fr
To: "List for Dynare developers" dev@dynare.org
Sent: Saturday, June 06, 2009 2:10 PM
Subject: Re: [DynareDev] Kalman Filter
Thanks George
One of the first things that we need to establish is whether identical basic matrix operations take much longer in the C++ implementation than in Matlab and, if that is the case, why.
- Indeed, and as a significant part of the overall work of updating P, one needs to invert the updated F too:

100000 loops of the small-model KF 4x4 F matrix inversion: iF = inv(F);
Fmx_inv_time = 2.2530

100000 loops of the corresponding core KF 8x8 P matrix update: P1 = T*(P-K*P(mf,:))*transpose(T)+QQ;
Pupdt_time = 3.4450

(and also, 100000 loops of the preceding K = P(:,mf)*iF;
Kupdt_time = 0.5910)

How do these operations compare with Matlab on your machine?
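For reference, the covariance update timed above, P1 = T*(P - K*P(mf,:))*transpose(T) + QQ, can be sketched in C++ with naive triple loops (hypothetical helper names; production code would delegate the products to BLAS dgemm, as GeneralMatrix::gemm does). KPmf stands for the already-formed n-by-n product K*P(mf,:).

```cpp
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// Naive dense product C = A*B (stand-in for BLAS dgemm).
Mat mul(const Mat& A, const Mat& B) {
    std::size_t n = A.size(), k = B.size(), m = B[0].size();
    Mat C(n, std::vector<double>(m, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t p = 0; p < k; ++p)
            for (std::size_t j = 0; j < m; ++j)
                C[i][j] += A[i][p] * B[p][j];
    return C;
}

Mat transpose(const Mat& A) {
    Mat At(A[0].size(), std::vector<double>(A.size(), 0.0));
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < A[0].size(); ++j)
            At[j][i] = A[i][j];
    return At;
}

// P1 = T*(P - K*P(mf,:))*transpose(T) + QQ
Mat cov_update(const Mat& T, const Mat& P, const Mat& KPmf, const Mat& QQ) {
    std::size_t n = P.size();
    Mat D = P;                                  // D = P - K*P(mf,:)
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            D[i][j] -= KPmf[i][j];
    Mat P1 = mul(mul(T, D), transpose(T));      // T*D*T'
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            P1[i][j] += QQ[i][j];               // + QQ
    return P1;
}
```

Written this way, each step does two full matrix products plus the subtraction and addition, which is why the P update dominates the F inversion in the timings above.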
The convergence of P exploited in the Matlab Dynare KFs (which, once reached, requires no further updates of P and K and no inversion of F) can greatly improve the performance of the KF. E.g., running the Matlab Dynare KF with a 57x57 system matrix in a 1000-iteration loop:

1000 loops of the usual KF: matlabKF_time = 337.1650

and then using P recursively in the loop with a modified kalman_filter.m which returns P too (therefore utilising P convergence and avoiding its update for most of the remaining 999 loops):

1000 loops of the recursive version: Matlab_rec_KF_time = 11.7060
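The convergence shortcut can be sketched as follows, with a scalar stand-in for the P update (the real update is the matrix recursion above; all names here are hypothetical): once successive P iterates stop changing beyond a tolerance, K and inv(F) are frozen and the expensive updates are skipped for all remaining observations.

```cpp
#include <cmath>

// Scalar stand-in for one covariance update; contracts toward the
// fixed point p = 0.4, mimicking a converging Riccati recursion.
double riccati_step(double p) {
    return 0.5 * p + 0.2;
}

// Iterate until |P1 - P| < tol; the return value is the step at which
// the filter could start reusing P, K and inv(F) unchanged.
int steps_until_converged(double p0, double tol, int max_steps) {
    double p = p0;
    for (int t = 0; t < max_steps; ++t) {
        double p1 = riccati_step(p);
        if (std::fabs(p1 - p) < tol)
            return t + 1;   // converged: freeze P, K and inv(F) from here on
        p = p1;
    }
    return max_steps;       // no convergence within the horizon
}
```

When convergence happens early (at t=3 in the small model, per the check below), almost the entire sample is processed with the frozen matrices, which is consistent with the 337s-vs-12s gap measured above.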
- And, although the convergence of P in the Matlab KF did not take place for the large sw_euro model, whose total run was much closer to the C++ KF's (see 3 below), today's check shows that the convergence did take place very early in the Matlab KF running the small model I initially tested (at step t=3!!!). So it certainly did affect, and, judging from the above results, rather greatly contributed to, the very much faster KF loops we experienced running the Matlab KF versus the C++ KF in the initial tests with the same small model (the C++ KF does not yet take advantage of convergence, and the comparative results were even)!!!
OK, we forget the first comparison on the small model, because C++ and Matlab didn't use the same algorithm (no convergence monitoring in C++). Matlab is still faster by 45% on the medium-size model. We should focus on explaining this difference, and we don't need to bring in monitoring the convergence of the filter for this particular example.
Could you please upload on SVN the code that you use for profiling?
Best
Michel
Best regards
George
Dev mailing list Dev@dynare.org http://www.dynare.org/cgi-bin/mailman/listinfo/dev