Previous Page Next Page Contents

generate::Macrofort::genFor -- FORTRAN code generator

Introduction

Mac::genFor (where Mac:=generate::Macrofort) is the Macrofort package for generating FORTRAN code.

The Macrofort package allows the user to make complete standard FORTRAN 77 code while remaining in the MuPAD environment. Necessities of FORTRAN code such as label numbering and complicated FORTRAN loops are handled automatically. Macrofort has capabilities for:

Furthermore, the combination of MuPAD's symbolic manipulation capabilities and, in particular, its optimizer generate::optimize can help yield efficient codes for ambitious goals in numerical computation. MuPAD can be also used as a preprocessor for numerical analysis.

Macrofort overlaps with MuPAD's generate::fortran but it is more general program for generation FORTRAN code and has many more options.

Call(s)

generate::Macrofort::genFor(l)

Parameters

l - list or list of lists

Returns

the void object of domain type DOM_NULL

Side Effects

writes FORTRAN code into an ascii file

Related Functions

generate::Macrofort::init, generate::Macrofort::setOptimizedOption, generate::Macrofort::setIOSettings, generate::Macrofort::setPrecisionOption, generate::Macrofort::setAutoComment, generate::Macrofort::openOutputFile, generate::Macrofort::closeOutputFile, generate::optimize, generate::fortran

Details

Example 1

Calculation of a matrix.

Initialize Macrofort and open the file "matrix.f"

>> Mac := generate::Macrofort:
   Mac::init():
   Mac::openOutputFile("matrix.f"):

Create a 2 x 2 array with symbolic entries called a

>> a := array(1..2, 1..2, [[x^2, x - y], [x/y, x^2 - 1]])
                             +-            -+
                             |   2          |
                             |  x ,  x - y  |
                             |              |
                             |   x   2      |
                             |   -, x  - 1  |
                             |   y          |
                             +-            -+

and generate a FORTRAN array v for a using "matrixm" in genFor:

>> Mac::genFor(["matrixm", v, a]):

Close the file "matrix.f"

>> Mac::closeOutputFile():
   delete a:

The output file matrix.f is:

              v(1,1) = x**2
              v(1,2) = x-y
              v(2,1) = x/y
              v(2,2) = x**2-1

Example 2

The calculation of a polynomial in Horner form.

Open the file "demo.f" and define the function using the macro "functionm". p is a list containing all specifications in the form of lists. "declarem" defines the FORTRAN variables and their types and "do_m" creates the do-loop by which the polynomial is summed in Horner form.

>> Mac::openOutputFile("demo.f"):
   
   p := ["functionm", doubleprecision,
         horner,[x,n,a],
         [["declarem", doubleprecision,
           [a(n), x]],
          ["equal", horner, a[n]],
          ["do_m", i, n - 1, 0, -1,
           ["equal", horner, a[i] + horner*x]]]]:
   
   Mac::genFor(p):
   Mac::closeOutputFile():
   delete p,a,x,n,i:

The output file demo.f is:

      c      
      c     FUNCTION horner
      c      
            doubleprecision function horner(x,n,a)
              doubleprecision a(n),x
              horner = a(n)
      c           
                do 1000, i=n-1,0,-1
                  horner = x*horner+a(i)
       1000     continue
      c         
            end

Bear in mind that this demo example is lame as the Macrofort MuPAD code is at least as extensive as the resulting FORTRAN output which could have been readily obtained directly by hand. The dividends of Macrofort become more clear in the following examples.

Example 3

A recursive function on a tree.

Although it is possible to write FORTRAN programs for recursive problems, this can be very tedious for complicated recursions and even if the recursion is not so complicated, other problems can arise as shown here. In this example, we consider a binary tree with nodes labeled by pairs of integers (i,j) where (1,1) is the root of the tree, (2,1) and (2,2) are the children nodes of the root and recursively (i+1,2j-1) and (i+1,2j) are the children nodes of node (i,j). Node (i,j) is at level i. Consider now the following sequence:

where g is a given function. We want to compute the values of the sequence up to a given level N, i.e. the 2 ^N-1 values f(N,1) ... f(N,2 ^N-1). To write the corresponding FORTRAN code, you only have to write two loops:

      real f(n,m)
      do 1,i=1,n
        do 2,j=1,2**(n-1)-1,2
            f(i,j)=g(f(i-1,(j+1)/2))
 2      continue
        do 3,j=2,2**(n-1),2
            f(i,j)=g(f(i-1,j/2))
 3      continue
 1    continue

but the dimension m of the array f is 2 ^N-1. In other words, we would have to retain the storage of N x 2 ^N-1 real values instead of 2 ^N-1 (which corresponds to 5 times more storage for N equal to 10).

A way to avoid this waste, is to have an array for each level, i.e. to have FORTRAN arrays f1(1), f2(2), f3(4) and so forth but it then becomes very tricky to write the resulting FORTRAN program by hand. However, this can be readily solved using a MuPAD program within Macrofort.

We suppose that a FORTRAN function g has already been defined. The MuPAD function which generates the FORTRAN program is:

>> Mac::openOutputFile("Func.f"):
   
   pushe := proc(e,l) begin [op(eval(l)),e] end_proc:
   gen_Func := proc(n)
     local i,pg,temp,temp1,temp2,temp3;
   begin
     pg:=[]:
   // declaration of the arrays
     for i from 1 to n do
       temp:=eval(text2expr(
       _concat("f",expr2text(i),"[",expr2text(2^(i-1)),"]"))):
       pg:=pushe(["declare",real,[temp]],pg):
     end_for:
   // loops for each array
     for i from 2 to n do
      temp1:=eval(text2expr(_concat("f",expr2text(i),"[j]"))):
      temp2:=eval(text2expr(_concat("f",expr2text(i-1),"[Hold(j+1)/2]"))):
      temp3:=eval(text2expr(_concat("f",expr2text(i-1),"[j/2]"))):
      pg:=pushe(["do_m",j,1,2^(i-1)-1,2,["equal",temp1,g(temp2)]],pg):
      pg:=pushe(["do_m",j,2,2^(i-1),2,["equal",temp1,g(temp3)]],pg):
     end_for:
     pg:=["programm",Func,pg]:
     Mac::genFor(pg):
   end_proc:
   gen_Func(4):
   Mac::closeOutputFile():
   delete j,pushe,gen_Func:

The output file Func.f is:

      c
      c     MAIN PROGRAM Func
      c
            program Func
              real f1(1)
              real f2(2)
              real f3(4)
              real f4(8)
      c
                do 1000, j=1,1,2
                  f2(j) = g(f1((j + 1)/2))
       1000     continue
      c
      c
                do 1001, j=2,2,2
                  f2(j) = g(f1(j/2))
       1001     continue
      c
      c
                do 1002, j=1,3,2
                  f3(j) = g(f2((j + 1)/2))
       1002     continue
      c
      c
                do 1003, j=2,4,2
                  f3(j) = g(f2(j/2))
       1003     continue
      c
      c
                do 1004, j=1,7,2
                  f4(j) = g(f3((j + 1)/2))
       1004     continue
      c
      c
                do 1005, j=2,8,2
                  f4(j) = g(f3(j/2))
       1005     continue
      c
            end

By calling gen_Func(n) for larger n, you can readily have Macrofort generate the needed code for larger and larger n. Of course, the output is proportionally larger but still ``digestible'' to modern-day compilers for a wide range of n.

Example 4

Optimized Vectorized FORTRAN code for Molecular Integrals.

Here, we present a method for the rapid numerical evaluation of molecular integrals which appear in the areas of Quantum Chemistry and Molecular Physics. The method is based on the exploitation of common intermediates using MuPAD's optimizer (see generate::optimize) and the optimization can be adjusted to both serial and vectorised computations.

Integrals based on atom-centered Gaussian-type functions known as the R-integrals are given by the recurrence relations:

 
R_j [t+1,m,n ]  =  x R_j+1 [t,m,n] + t R_j+1 [t-1,m,n]
 
R_j [t,m+1,n]  =  y R_j+1 [t,m,n] + m R_j+1 [t,m-1,n]
 
R_j [t,m,n+1]  =  z R_j+1 [t,m,n] + n R_j+1 [t,m,n-1]  
 
R_j [0,0,0]  =  (-2 p ) ^j F_j ( alpha QP ^2 )

where

 
F_m (W) = int(exp [ -W t ^2] t ^2m,t=0..1),

where the vector QP=-(x,y,z) and alpha are given geometrical quantities (determined on input) for a particular molecular geometry, and F is a known function in quantum chemistry for which there already exist various algorithms for its rapid computation (in this example, we are only interested in the computation of the polynomial part of R(t,u,v). The summation indices are bounded by zero and t+m+n <= L where L is a total angular quantum number for a given molecular problem, and consequently these induces adhere to a polyhedral structure. The total number N of R functions to compute for a given L is given by N = (L+1) (L+2) (L+3) /6 . In the diatomic molecular case (i.e. a molecule made of only two ``atoms'' whose centers are set on the z-axis), the R-integrals form a sparse three-dimensional matrix (see the work of the references for a fuller understanding of the framework).

Vectorization involved the introduction of a vectorization index, denoted M, which is the first index of all the arrays involved in the computation. The FORTRAN code obtained from the compiler is then ``sandwiched'' within do-loops. The code shown here is generated for the R integrals up to L=3 although it can generate the code for all the way up to L=16.

This example makes use of MuPAD's optimizer (see generate::optimize) and therefore, one obtains optimized vectorized FORTRAN code. This code had been tested on a CRAY and the FLOP (floating-point operations) rate was about 85 percent of its peak efficiency and has been used in specific relativistic quantum chemistry calculations.

Input code:
First we construct the S functions by which the R functions are later defined as shown in the work of V. Saunders (see references). This definition essentially exploits a decomposition in odd and even symmetries within these functions and provides an economy in the computations.

>> S:=proc(j,p,QP,QP2,t,u,v)
   begin
   if (t<0) or (u<0) or (v<0) then 
       0;
   elif (t=0) and (u=0) and (v=0) then
       ((-2*p)^j)*FS[M,j+1];
   elif (t>0) and (t mod 2 =1) then
       S(j+1,p,QP,QP2,t-1,u,v)+(t-1)*S(j+1,p,QP,QP2,t-2,u,v);
   elif (u>0) and (u mod 2 =1) then
       S(j+1,p,QP,QP2,t,u-1,v)+(u-1)*S(j+1,p,QP,QP2,t,u-2,v);
   elif (v>0) and (v mod 2 =1) then
       S(j+1,p,QP,QP2,t,u,v-1)+(v-1)*S(j+1,p,QP,QP2,t,u,v-2);
   elif (t>0) and (t mod 2 =0) then
       QP2[M,1]*S(j+1,p,QP,QP2,t-1,u,v)+(t-1)*S(j+1,p,QP,QP2,t-2,u,v);
   elif (u>0) and (u mod 2 =0) then
       QP2[M,2]*S(j+1,p,QP,QP2,t,u-1,v)+(u-1)*S(j+1,p,QP,QP2,t,u-2,v);
   elif (v>0) and (v mod 2 =0) then
       QP2[M,3]*S(j+1,p,QP,QP2,t,u,v-1)+(v-1)*S(j+1,p,QP,QP2,t,u,v-2);
   end_if;
   end_proc:
   
   Xi:=proc(xbar,i)
   begin
   if (QP[M,1] <> 0) then 
       (-xbar)^(i mod 2);
   elif 
       ((i mod 2) > 0) then 0; else 1; 
   end_if;
   end_proc:

Finally, we construct the R functions

>> R:=proc(j,p,QP,QP2,t,u,v)
   local X,Y,Z;
   begin
       X:=Xi(QP[M,1],t);
       Y:=Xi(QP[M,2],u);
       Z:=Xi(QP[M,3],v);
       S(j,p,QP,QP2,t,u,v)*(X*Y*Z);
   end_proc:

We restrict ourselves to the Diatomic case i.e. only the z-axis (the 3rd axis) has non-zero components.

>> QP[M, 1] := 0:  QP[M, 2] := 0:  QP[M, 3] := QP[M]:
   QP2[M, 1] := 0:  QP2[M, 2] := 0: QP2[M, 3] := QP2[M]:

We ensure plenty of guard digits for L up to L=16

>> DIGITS:=60: 
   subr:=null():
   for LL from 0 to 4 do
       tuv:=0: ds:=null():
       for mu from 0 to LL do
           t:=LL-mu;
           for nu from 0 to mu do
               u:=mu-nu;
               for tau from 0 to nu do
                   v:=nu-tau;
                   tuv:=tuv+1;
                   Rtuv:=float(R(0,ALPHA[M],QP,QP2,t,u,v));
                   if (Rtuv <>0 and Rtuv <> 0.0) then 
                      ds:=ds,[RC[M,tuv],Rtuv]: 
                   end_if;
               end_for:
           end_for:
       end_for:
       subr:=subr,["if_then_m",L=LL,
                  ["do_m",M,1,MAXM,["equal",[ds]]]];
   end_for:

Construct the input list for the FORTRAN Subroutine

>> delete QP2:
   subr := ["subroutinem", RMAKE, [L, FS, ALPHA, QP2, MAXM, RC],
            [["declare", "implicit doubleprecision", ["(a-h,o-z)"]],
             ["parameter", [LMAX = 12, MAXB = 32, 
              MAXB2 = "MAXB*MAXB", MAXR = 455]],
             ["declare", dimension,
              ["FS(MAXB2,LMAX)", "ALPHA(MAXB2)", "QP2(MAXB2)",
               "RC(MAXB2,MAXR)"]], subr]]:

Initialization of Macrofort:

>> Mac:=generate::Macrofort:
   Mac::init():

Open file "vectorized.f" and switch optimizer on. The desired precision for the FORTRAN code is double:

>> Mac::openOutputFile("vectorized.f"):
   Mac::setOptimizedOption(TRUE):
   Mac::setPrecisionOption("double"):

Code Generation:

>> subr:
   Mac::genFor(subr):
   Mac::closeOutputFile():
   delete subr,S,R,LL,Xi,t,u,v,Rtuv,tuv,ds,tu,mu,nu,QP,QP2,M:

The output file vectorized.f is:

      c      
      c     SUBROUTINE RMAKE
      c      
            subroutine RMAKE(L,FS,ALPHA,QP2,MAXM,RC)
              implicit doubleprecision (a-h,o-z)
              parameter (LMAX=12,MAXB=32,MAXB2=MAXB*MAXB,MAXR=455)
              dimension FS(MAXB2,LMAX),ALPHA(MAXB2),QP2(MAXB2),RC(MAXB2,MAXR)
                if (L.eq.0) then
      c               
                    do 1000, M=1,MAXM
                      RC(M, 1) = FS(M,1)
       1000         continue
      c             
                endif
                if (L.eq.1) then
      c               
                    do 1001, M=1,MAXM
                      RC(M, 4) = FS(M,1)
       1001         continue
      c             
                endif
                if (L.eq.2) then
      c               
                    do 1002, M=1,MAXM
                      t1 = ALPHA(M)
                      t2 = FS(M,2)
                      RC(M, 1) = -0.2D1*t1*t2
                      RC(M, 5) = RC(M,1)
                      RC(M, 8) = RC(M,1)+0.4D1*t1**2*QP2(M)*FS(M,3)
                      RC(M, 10) = FS(M,1)
       1002         continue
      c             
                endif
                if (L.eq.3) then
      c               
                    do 1003, M=1,MAXM
                      t3 = ALPHA(M)
                      t4 = FS(M,2)
                      RC(M, 4) = -0.2D1*t3*t4
                      RC(M, 13) = RC(M,4)
                      RC(M, 18) = RC(M,4)+0.4D1*t3**2*QP2(M)*FS(M,3)
                      RC(M, 20) = FS(M,1)
       1003         continue
      c             
                endif
                if (L.eq.4) then
      c               
                    do 1004, M=1,MAXM
                      t5 = ALPHA(M)
                      t6 = t5**2
                      t7 = FS(M,3)
                      RC(M, 1) = 0.12D2*t6*t7
                      RC(M, 5) = 0.4D1*t6*t7
                      t8 = QP2(M)
                      t9 = FS(M,4)
                      t10 = -0.8D1*t5*t6*t8*t9
                      RC(M, 8) = t10+RC(M,5)
                      t11 = FS(M,2)
                      RC(M, 10) = -0.2D1*t5*t11
                      RC(M, 21) = RC(M,1)
                      RC(M, 24) = RC(M,8)
                      RC(M, 26) = RC(M,10)
                      RC(M, 31) = t8*(0.16D2*t6**2*t8*FS(M,5)-0.24D2*t5*t6*t9)
           #+RC(M,1)-0.24D2*t5*t6*t8*t9
                      RC(M, 33) = 0.4D1*t6*t7*t8+RC(M,10)
                      RC(M, 35) = FS(M,1)
       1004         continue
      c             
                endif
            end

The benefits of MuPAD's optimizer become more evident at higher L but we can already see its effects. Common intermediates are exploited and in a number of cases, only re-assignations and not actual computation were needed.

Background




Do you have questions or comments?


Copyright © SciFace Software GmbH & Co. KG 2000