Scilab Bag Of Tricks: The Scilab-2.5 IAQ (Infrequently Asked Questions) | ||
---|---|---|
Prev | Chapter 6. Performance | Next |
The brute force way of getting a better performance is rewriting an existing Scilab script in a low-level language as C, Fortran, or even assembler. This option should be chosen with care, because the rapid prototyping facilities of Scilab are lost. On the other hand if the interface of the function has settled, its performance is known to be crucial and it is of use in future projects then the translation into compiled code could be be worth the time and the grief.
In the first part of this section we compare different ways of integrating an external function into Scilab. We focus on the ease of integration versus the runtime overhead introduced. The second part deals with writing the low-level functions themselves, especially their interfaces.
We revive our matrix mirroring example from the section called Avoiding Indexing.
Our Fortran-77 version looks like this:
subroutine mir(n, m, a, dir, b) * * Mirror n*m-matrix a along direction prescribed by dir. * If dir == 'r' then mirror along the rows, i.e. horizontally. * Any other value for dir mirrors along the columns, i.e. * vertically. The mirrored matrix is returned in b. * implicit none * ARGUMENTS integer n, m double precision a(n, m) character dir*(*) double precision b(n, m) * LOCAL VARIABLES integer i * TEXT if (dir(1:1) .eq. 'c') then do 100, i = 1, m call dcopy(n, a(1, m+1-i), 1, b(1, i), 1) 100 continue else do 200, i = 1, n call dcopy(m, a(n+1-i, 1), n, b(i, 1), n) 200 continue end if end
The dcopy(n, x, incx, y, incy, ) is from BLAS level 1, and copies n double precision elements from vector x in increments of incx to y, where it uses increments of incy.
The only thing missing is the glue code between Scilab and mir.
function b = mirf(a, dir) // interface function for 'mir.f' // Behavior is the same as mirror() [n, m] = size(a) b = zeros(n, m) if dir == 'r' | dir == 'c' then b = fort('mir', .. n, 1, 'i', m, 2, 'i', a, 3, 'd', dir, 4, 'c', .. 'out', .. [n, m], 5, 'd') else error("dir must be ''r'' or ''c''") end
OK, let's lock-and-load. We are ready to rock!
link('mir.o', 'mir') getf('mirf.sci')
The fast alternative to using fort, which dynamically creates an interface to a C or Fortran function is using intersci, which which creates an interface suitable for static loading.
intersci can create the Fortran glue code for a C or Fortran function to make it callable form the Scilab interpreter. The glue code is compiled (with a Fortran compiler) and linked to Scilab. intersci is described very well in the SCI/doc/Intro.ps. Anyhow, here is the description file for our current example. Finally it will supply us with a Scilab function called mirai(a, dir).
mirai a dir a matrix n m dir string 1 b matrix n m mir n m a dir b n integer m integer a double dir char b double out sequence b *
We do not want to go into detail here, but a desc-file has three parts separated by blank lines: The description of the Scilab-level function's signature (here: mirai), the same for the low-level function (here: mir), and finally the results' structure. The signatures resemble Fortran or K&R-style C function definitions with the parenthesis missing. The process of passing a desc-file through intersci, compiling the low-level routine and the glue code can be automated. Example 6-4, a snippet of our Makefile.intersci shows the relevant rules.
Example 6-4. Makefile for static Scilab interfaces via intersci
ifdef SCI SCIDIR := $(SCI) else SCIDIR := /site/X11R6/src/scilab-2.5 endif %.f.pre: %.desc $(SCIDIR)/bin/intersci $* mv $*.f $*.f.pre %.f: %.f.pre perl -pe 's#SCIDIR#$(SCIDIR)#' $< > $@ %.o: %.f $(FC) $(FFLAGS) -c $<
Running the automatically generated Fortran code through a filter (here: perl) is necessary to fix the lines include 'SCIDIR/routines/stack.h'. After everything is compiled a single Scilab command makes the new routine available to the user.
addinter(['mirai.o', 'mir.o'], 'mirai', 'mirai')
The first argument which almost always is a vector of strings tells Scilab the names of the object files to load. One of them is the interface code made by intersci. The rest are the user routines. The second argument specifies name of entry point into the interface routine. The third parameter is the name the new Scilab function will carry.
Entry point of interface function | |
---|---|
addinter's second argument must be the name of the interface routine, i.e. the one generated by intersci. Using the low-level function's entry point here causes Scilab to barf. |
Why do we go through that tedious process? After all we are in the performance section, so what we want is speed, high speed, or even better the ultimate speed. Now we can compare all the variants as is done in Figure 6-1.
Figure 6-1. Benchmark results for the mirror functions
Performance comparison of mirror[1-4], and mirai on a P5/166 Linux box.
Performance comparison of mirror[1-4], and mirai on a 2-way PIII/550 Linux box.
If we compare the performance of our three Scilab mirror routines mirror1, mirror2, and mirror3 together with the two incarnations of the hard-coded routine mirf, and mirai, we reach at the following conclusions.
Scilab code that makes heavy use of indexing, like mirror1, is extremely slow no matter what problem size. Thumbs down on that one.
Well written i.e. index-free Scilab code, like mirror4, performs very well. This is especially true for large vectors or matrices.
The overhead of the fort-call in mirf is high; it is hard to amortize for that. fort is only justified in situations where a significant amount of time is spent in the low-level user-routine. Usually this will be the case for large problem sizes. Of course the cross-over point has to be determined separately in each case.
Nothing can beat a compiled function that is integrated with addinter. mirai surpasses all other implementations. For small problem sizes the little overhead in comparison to all the other functions gives this function a factor 10 advantage, though, as the problems size increases mirai's lead is challenged by mirror4.
Conclusion: Never underestimate the power of the Emperor^H^H^H^H^H^H^H vectorized Scilab code.
In this section we will discuss the interfacing of C, C++, Fortran-77, Fortran-9x, or Ada routines with Scilab via link command. We restrict ourselves to the simple case of functions that expect exactly one double precision floating point parameter and return a double precision floating point result. Functions with that signature are required e.g. for the integration routine intg, or the root finder fsolve.
Before we dive into the language specific descriptions, let us point out the main features of Fortran we have be pay attention to when writing an interface in (another) language.
A function named FOO (foo, or whatever capitalization is chosen) in the Fortran source can become a different symbol on code generation. This is compiler dependent. Most often an underscore "_" is prepended or appended. Sometimes the name is downcased, sometimes it is upcased.
The nm(1) command provides easy access to the symbols in an object file. |
Fortran never passes the value of a parameter, but always a pointer to the parameter.
Arrays are stored so that their leftmost index varies fastest.
Fortran-77 or, how do you want to ruin your day? | |
lvd |
Extending Scilab with Fortran-77 is most straightforward. A Fortran-77 source for function FALS could look like this:
double precision function fals(x) double precision x fals = sin(10.0d0 * x) end
After compilation (e.g. f77 -c fals.f) the compiled code can be linked to Scilab and called with the integration routine.
link('fals.o', 'fals'); [res, aerr, neval, info] = .. intals(0.0, 1.0, -0.5, -0.5, 'alg', 'fals')
Fortran-90? Don't worry, it can't get much worse. | |
cls |
A bloated, but portable Fortran-90 source for a function could look like this:
function fsm(x) implicit none integer, parameter :: idp = kind(1.0d0) ! arguments/return value real(kind = idp), intent(in) :: x real(kind = idp) :: fsm ! text fsm = exp(x) / (1.0d0 + x*x) end function fsm
After compilation (e.g. f90 -c fsm.f90) the compiled code can be linked to Scilab and called with an integration routine.
link('fsm.o', 'fsm'); [ires, ierr, neval] = intsm(0.0, 1.0, 'fsm')
A simple C function meting out signature requirements has e.g. this shape:
#include <math.h> double fgen(const double *x) { if (*x > 0.0) return 1.0 / sqrt(*x); else return 0.0; }
After compilation (e.g. cc -c fgen.c) the compiled code can be linked to Scilab and called with the integration routine.
link('fgen.o', 'fgen', 'c'); [ires, ierr, neval, info] = intgen(0.0, 1.0, 'fgen')
There are several ways to get the naming convention differences between Fortran and C right. We show three possible solutions for the case where C uses no decoration at all and Fortran appends one underscore.
/* (1) GNU C compiler */ double foo(const double *x) __attribute__((weak, alias ("foo_"))); /* (2) good preprocessor */ #define C2F(name) name##_ /* (3) old preprocessor ;-) */ #define ANOTHERC2F(name) name/**/_
A C++ source for a function could look like this:
#include <math.h> extern "C" { double fgk(const double *x); } double fgk(const double *x) { return 2.0 / (2.0 + sin(10.0 * M_PI * (*x))); }
After compilation (e.g. c++ -c fgen.c) the compiled code can be linked to Scilab and called with the integration routine.
link('fgk.o', 'fgk', 'c'); [ires, ierr, neval, info] = .. intgk(0.0, 1.0, 'fgk', 0, %eps, '15-31')
For GNAT/Ada the package's interface part pulls in the Fortran interface definitions. Is the simplest case the mathematical functions are only instantiated with the type Double_Precsion. Ada requires to export every function's interface separately, as is clear from the following example.
with Interfaces.Fortran; use Interfaces.Fortran; with Ada.Numerics.Generic_Elementary_Functions; package TestFun is package Fortran_Elementary_Functions is new Ada.Numerics.Generic_Elementary_Functions(Double_Precision); use Fortran_Elementary_Functions; function foo(x : Double_Precision) return Double_Precision; pragma Export(Fortran, foo); pragma Export_Function(Internal => foo, External => "foo_", Mechanism => Reference, Result_Mechanism => Value); end TestFun;
According to the interface specification the package body looks like this:
package body TestFun is function foo(x : Double_Precision) return Double_Precision is begin return exp(x) / (1.0 + x*x); end foo; end TestFun;
The package is compiled as usual gnatmake -O2 testfun.adb.
Make sure that there is a GNAT runtime library libgnat-3.12p.so. Your version number may be different. The ending so is critical, as libgnat-3.12p.so.1.7 will not make dlopen(3) happy. From now on everything is downhill and the function can be linked almost as usual.
Again, the path to your gnat-library and the version numbers can differ.
In the case of several functions in the package it is preferable to rely on the extended dlopen(3) mechanism, and link the package/library combo with remembering the id of the shared library.
adacode = link("testfun.o -L/site/gnat-3.12p/lib -lgnat-3.12p", "foo")
Linking further functions from the library happens by referencing the number of the library.
link(adacode, "bar")
This saves space (Scilab's TRS) and time (to execute the link). Speaking about saving... Users with a loader e.g. GNU ld, capable of incremental linking (e.g. -i, -r, --relocatable) can of course link testfun.o with the library before linking everything to Scilab. To complete the example, here comes the command-line:
ld -i -o testfun-lib.o testfun.o -L/site/gnat-3.12p/lib -lgnat-3.12p
In Scilab the arguments to link reduce to
link("testfun-lib.o", "foo")
What? What are you doing in this section? Still not satisfied with your functions' performance?—Sorry, but there are no conventional ways to get more out of Scilab. Tinkering with the interface routines is not worth the effort. Some completely new approach is necessary.
If a problem is too tough, Scilab still can serve as a rapid prototyping environment. One sister program of Scilab, namely Tela has been written for exactly this purpose. Prototyping with an interpreted language is currently going through a big revival with C (and C++) developers discovering Python.
As whenever optimization is the final goal, an extensive test suite is the base for success. So one way to proceed could be to develop test routines and reference implementation completely in Scilab. The next step is rewriting the routines still in Scilab to match the signatures of for example BLAS/LAPACK routines as closely as possible. The test suite can remain untouched in this step. The final step is to migrate the Scilab code to Fortran, C, or whatever, while making extensive use of BLAS/LAPACK. Ideally the test suite remains under Scilab and can be used to exercise the new standalone code.