Writing Native Scilab Functions

In the following two sections we shall treat the "anatomy" of native, i.e. low-level Scilab functions. This will confront us with all the gory details of the stack, the low-level API, and the calling conventions. Having the "Guide for Developers", Internals.ps (see also the section called Local Documents in Chapter 8) ready is a good idea. Where the developer guide is at the end of its wits, a study of the source code is appropriate, especially the file SCI/routines/interf/stack1.f

We start out discussing simple functions. Simple in the sense that they are self-contained and only take non-function parameters as their arguments. In the second part we shall consider functions that take other functions (either Scilab functions or externals) as arguments.

Simple Functions

A typical native Scilab function proceeds as follows:

  1. Check the number of input and output parameters.

  2. Get the "pointers" to all actual input parameters; supply default values for optional parameters; issue warnings or errors as appropriate for too many or too few parameters.

  3. Allocate space for all temporary variables, "workspaces", etc.

  4. It might be necessary to translate the input variables which are in Scilab format into the appropriate format for the worker routine. This is necessary for example if the worker routine uses DOUBLE COMPLEX (or COMPLEX*16) variables.

  5. Perform the calculations or transformations that really make up the procedure.

  6. As in Step 4, it might be necessary to transform the results, now form the worker routine's format back into Scilab format.

  7. If necessary, allocate space on the Scilab stack and copy results to this space.

Now that the outline is clear we are ready to dissect a simple function: ortho. The function takes exactly one argument a, that is a real or complex m times n matrix. The single output parameter is a matrix of the same shape and type is the input matrix. The duty of ortho is to bring the columns of the input matrix into orthonormal form; to achieve this we employ the following LAPACK functions:

type Complex is record
    Re, Im : Float'Base;
end record;

type FloatVector   is array (Positive range <>) of Float;
type ComplexVector is array (Positive range <>) of Complex;
type FloatMatrix   is array (1..Lda, Positive range <>) of Float;
type ComplexMatrix is array (1..Lda, Positive range <>) of Complex;

procedure dgeqrf
    (M     : in     Natural;
     N     : in     Natural;
     A     : in out FloatMatrix;
     Lda   : in     Natural;
     Tau   :    out FloatVector;
     Work  :    out FloatVector;
     Lwork : in     Integer;
     Info  :    out Integer);

procedure dorgqr
    (M     : in     Natural;
     N     : in     Natural;
     K     : in     Natural;
     A     : in out FloatMatrix;
     Lda   : in     Natural;
     Tau   :    out FloatVector;
     Work  :    out FloatVector;
     Lwork : in     Integer;
     Info  :    out Integer);

procedure zgeqrf
    (M     : in     Natural;
     N     : in     Natural;
     A     : in out ComplexMatrix;
     Lda   : in     Natural;
     Tau   :    out ComplexVector;
     Work  :    out ComplexVector;
     Lwork : in     Integer;
     Info  :    out Integer);

procedure zungqr
    (M     : in     Natural;
     N     : in     Natural;
     K     : in     Natural;
     A     : in out ComplexMatrix;
     Lda   : in     Natural;
     Tau   :    out ComplexVector;
     Work  :    out ComplexVector;
     Lwork : in     Integer;
     Info  :    out Integer);

procedure dcopy
    (N     : in     Natural;
     X     : in     FloatVector;
     IncX  : in     Integer;
     Y     :    out FloatVector;
     IncY  : in     Integer);

The two [d|z]geqrf-functions compute a QR-factorization of a real or complex m-by-n matrix a, while the [dor|zun]gqr-functions generate an m-by-n real or complex matrix q with orthonormal columns. dcopy copies N elements of the vector X in increments of IncX into the vector Y using increments of IncY on the output side. For a detailed description please consult the LAPACK User Guide or the appropriate manual pages.

Example 7-1 is one of the longest examples in the running text, but don't be scared as we will explain line-by-line and variable-by-variable what is where and why.

Example 7-1. Simple native Scilab function

      subroutine ortho                                          -- Native functions are parameterless

      implicit none                                             -- Switch into weeny mode :-)

*     CONSTANTS
      integer realtype
      parameter (realtype = 0)                                  -- See Table 7-1 for type association

*     LOCAL VARIABLES
      character*6 fname                                         -- The name of the routine as string

      logical checklhs, checkrhs, cremat, getmat                -- Scilab API functions

      integer topk
      integer n, m, mattyp
      integer tausz, worksz, info
      integer areadr, aimadr, badr, tauadr
      integer wrkadr, rreadr, rimadr, dumadr

*     EXTERNAL FUNCTIONS/SUBROUTINES
      external checklhs, checkrhs, cremat, getmat               -- Scilab API functions
      external error

      external dcopy, dgeqrf, dorgqr, zgeqrf, zungqr            -- LAPACK/BLAS worker subroutines

*     HEADER
      include '/site/X11R6/src/scilab/routines/stack.h'         -- Scilab API header

*     TEXT
      fname = 'ortho'                                           -- Function name (for error messages)
      topk = top                                                -- top is defined in stack.h
      rhs = max(0, rhs)

      if (.not. checkrhs(fname, 1, 1)) return     (1)
      if (.not. checklhs(fname, 1, 1)) return

*     fetch input parameters                      (2)
      if (.not. getmat(fname, topk, top - rhs + 1,
     $        mattyp, m, n, areadr, aimadr)) return

      if (n * m .eq. 0) return                                   -- Quick return on empty matrix

      tausz = min(m, n)                                          -- Prescribed by man-page
      worksz = max(1, n)                                         -- ... same here


      if (mattyp .eq. realtype) then
*     real case

*     allocate temporary variables; all are real  (3)
          if (.not. cremat(fname, top + 1, realtype, tausz, 1, 
     $            tauadr, dumadr)) return
          if (.not. cremat(fname, top + 2, realtype, worksz, 1,
     $            wrkadr, dumadr)) return
          if (.not. cremat(fname, top + 3, realtype, m, n,
     $            badr, dumadr)) return

*     prepare worker routines' input parameters   (4)
          call dcopy(n * m, stk(areadr), 1, stk(badr), 1)

*     call worker routines                        (5)
          call dgeqrf(m, n, stk(badr), m, stk(tauadr),
     $            stk(wrkadr), worksz, info)
          if (info .ne. 0) then                                  -- Any error is considered fatal
              buf = fname // ' dgeqrf failed'
              call error(999)
              return
          endif

          call dorgqr(m, n, tausz, stk(badr), m, stk(tauadr),
     $            stk(wrkadr), worksz, info)
          if (info .ne. 0) then                                  -- Any error is considered fatal
              buf = fname // ' dorgqr failed'
              call error(999)
              return
          endif

      else
*     complex case; mattyp != realtype

*     allocate temporary variables, 
*     use two REAL*8 for one COMPLEX*16           (6)
          if (.not. cremat(fname, top + 1, realtype, 2 * tausz, 1,
     $            tauadr, dumadr)) return
          if (.not. cremat(fname, top + 2, realtype, 2 * worksz, 1,
     $            wrkadr, dumadr)) return
          if (.not. cremat(fname, top + 3, realtype, 2 * m, 2 * n,
     $            badr, dumadr)) return

*     prepare worker routines' input parameters, joining
*     two REAL*8 arrays into one COMPLEX*16 array (7)
          call dcopy(n * m, stk(areadr), 1, stk(badr), 2)
          call dcopy(n * m, stk(aimadr), 1, stk(badr + 1), 2)

*     call worker routines                        (8)
          call zgeqrf(m, n, stk(badr), m, stk(tauadr),
     $            stk(wrkadr), worksz, info)
          if (info .ne. 0) then                                  -- Any error is considered fatal
              buf = fname // ' zgeqrf failed'
              call error(999)
              return
          endif

          call zungqr(m, n, tausz, stk(badr), m, stk(tauadr),
     $            stk(wrkadr), worksz, info)
          if (info .ne. 0) then                                  -- Any error is considered fatal
              buf = fname // ' zorgqr failed'
              call error(999)
              return
          endif

      endif


*     get ready to exit
      if (lhs .ge. 1) then                        (9)
          if (.not. cremat(fname, top, mattyp, m, n, 
     $            rreadr, rimadr)) return
          if (mattyp .eq. realtype) then          (10)
              call dcopy(m * n, stk(badr), 1, stk(rreadr), 1)
          else
              call dcopy(m * n, stk(badr), 2, stk(rreadr), 1)
              call dcopy(m * n, stk(badr + 1), 2, stk(rimadr), 1)
          endif
      endif

      end
    
(1)
Check the number of input and output parameters. Here the task is easy as we need one and write one. This line and the next correspond to Step 1.
(2)
Get the addresses as mentioned in Step 2 of the real and imaginary part of the matrix passed as only parameter to ortho. Note that getmat will return False if the parameter at the given parameter stack position is not a matrix of numbers.

getmat is called with the second parameter, topk holding the value of the parameter stack pointer when the control flow entered ortho. This as well as the function name passed in fname is necessary for the cleanup and messaging in case of an error.

The only parameter we use is on top of the parameter stack as top - rhs + 1 equals top in our case.

On successful return getmat not only sets the data stack addresses areadr, and aimadr, but also tells us via mattyp whether the matrix is real complex, and via m, and n how large the matrix is.

The following lines directly depend on the sizes passed back, calculating the necessary space for two scratch arrays.

(3)
Allocating space for the temporary variables tau, work, and b on the data stack is Step 3. tau and work are necessary because of the LAPACK routines used; b is a copy of a as the LAPACK routine works with the matrix in place, i.e. would mangle the input variable a. The temporaries are accessed the same way parameters are accessed: through indices into the data stack. These indices are tauadr, wrkadr, and badr. Their position on the parameter stack is top + 1, top + 2, and top + 3, respectively.

We request a purely real storage for each of the three temporary variables with the third parameter being realtype = 0. Therefore the index for the imaginary part is a dummy index, dumadr.

The sizes of the vectors or matrices have been computed before.

(4)
There is no "translation" to do in the real case. So Step 4 is easy. The input variable – of which we definitely know that it is real – is simply copied into the scratch space we have allocated on the data stack.

Note how the address of the matrices is passed. The idiom is stk(index), where index has been obtained through a get*-, or cre*-function. The memnonic "stk" means data stack.

(5)
Everything is set up correctly and initialized. We have reached Step 5. The worker routines can take over now.
(6)
In the complex case the allocation of the temporaries variables requires a bit more thought, although it is again just Step 3. We know that the LAPACK routines need the complex vectors/matrices in packed form. Thus, we allocate one real (DOUBLE PRECISION) vector/matrix of twice the size each time thereby accommodating the the storage requirement of complex (DOUBLE COMPLEX) variables. Otherwise this step proceeds as in the real case.
(7)
Due to the different handling of complex variables in Scilab and in LAPACK, Step 4 requires two separate calls to the copy function.
      call dcopy(n * m, stk(areadr), 1, stk(badr), 2)
      call dcopy(n * m, stk(aimadr), 1, stk(badr + 1), 2)

The first line says: "Copy m times n elements from the first position in the DOUBLE PRECISION variable stk(areadr) taking each entry (3rd parameter, read stride: 1) into the DOUBLE COMPLEX output variable stk(badr) filling every other entry (5th parameter, write stride: 2)." The second line does almost the same, but starts off writing at the second element stk(badr + 1), therfore filling the imaginary parts into stk(badr). This corresponds to Step 4.

(8)
Again we have reached Step 5; everything is set up correctly and initialized, and the worker routines can take over.
(9)
If there is an output variable, we copy the results into it. Otherwise, we skip the expensive copy operation.
(10)
At this point a purely real result, stk(rreadr), can simply be copied onto the output parameter, stk(badr).

The situation is a bit more complicated for a complex result, as we have to split the DOUBLE COMPLEX result from LAPACK into two DOUBLE PRECISION matrices. Here are the crucial lines again:

      call dcopy(m * n, stk(badr),     2, stk(rreadr), 1)
      call dcopy(m * n, stk(badr + 1), 2, stk(rimadr), 1)

The first line says: "Copy m times n elements from the first position in the DOUBLE COMPLEX result stk(badr) taking every other entry (3rd parameter, read stride: 2) into the DOUBLE PRECISION output variable stk(rreadr) filling each entry (5th parameter, write stride: 1)." The second line does almost the same, but starts off at the second element stk(badr + 1), therfore copying the imaginary parts into stk(rimadr). This way we are merging Step 6, and Step 7 into one.

Functionals

FIXME: write it!