Hi,

Writing code, and understanding someone else's code, takes a major effort, especially
at the beginning.

The structure of my code follows this outline:

**NewCode** directory:

Makefile // to compile the code

**Add**

mat-addkernels.c/h // matrix addition, matrix comparison

**Error**

doubly_compensated_sum.c // error analysis/estimation

**Examples**

Example.3.c, example.4.c, example.6.c

**Executable**

**MAT-ADD-Generator**

addgen.c // matrix addition kernel generator

**Matrices**

architecture.h // architecture-specific macros

mat-operands.h // specifies how matrices are stored and accessed (row/column major ...)

**Mul**

mat-mulkernels.c/h // multiplication kernels

**Scaling**

scaling.c/h // for processors that allow frequency/voltage scaling

**Scripts**

**Sort**

quicksort.h/c // used for the error analysis
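The listing pairs doubly_compensated_sum.c with a sort routine, which matches the textbook doubly compensated summation (attributed to Priest): the terms are first sorted by decreasing magnitude and then added while carrying a running correction. The sketch below is a minimal version of that textbook algorithm, not the code shipped in the package. The function names are assumptions, the standard `qsort` stands in for the package's quicksort, and strict IEEE double arithmetic is assumed (no `-ffast-math`).

```c
#include <assert.h>
#include <stdlib.h>

/* Absolute value without pulling in math.h. */
static double magnitude(double x) { return x < 0.0 ? -x : x; }

/* qsort comparator: larger magnitudes first. */
static int by_decreasing_magnitude(const void *a, const void *b)
{
    double x = magnitude(*(const double *)a);
    double y = magnitude(*(const double *)b);
    return (x < y) - (x > y);
}

/* Hypothetical helper: sorts x in place, then returns the compensated sum. */
static double doubly_compensated_sum(double *x, size_t n)
{
    assert(x != NULL || n == 0);
    if (n == 0)
        return 0.0;
    qsort(x, n, sizeof *x, by_decreasing_magnitude);

    double s = x[0];   /* running sum */
    double c = 0.0;    /* running correction */
    for (size_t k = 1; k < n; k++) {
        double y = c + x[k];
        double u = x[k] - (y - c);   /* rounding error of c + x[k] */
        double t = y + s;
        double v = y - (t - s);      /* rounding error of y + s */
        double z = u + v;
        s = t + z;
        c = z - (s - t);             /* what is still missing from s */
    }
    return s;
}
```

For the input {1.0, 1e16, 1.0}, a plain left-to-right sum of the sorted terms loses both small terms, while the compensated loop recovers them.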

**You will need to install the BLAS library of your choice and modify the Makefile accordingly.**

GotoBLAS: directory Linux_P4SSE2.

ATLAS: pre-built library for P4. You may use any library you wish, pre-built or not.

There is not much more to it. The example* files, as the name suggests, show
how to call the matrix multiplication routines.

All the code is for double-precision matrices (typedef double Mat).

The file example.4.c presents my implementation of the parallel algorithm for an
Opteron system with two processors, each composed of two cores. This implementation is
specific to the operating system available on that particular machine.

Unfortunately, this package is not self-installing: you will need to spend some
time understanding its structure, installation, and use. Nothing major.

The high-performance MM routines must be installed separately; then my code
can be built and used. Some tuning of the Matrix Addition (MA) is advised, but
I have found that the optimized version provided suits most of the architectures
I have used.

**Makefile**

Every architecture has its own compiler and libraries. My
code uses Matrix Multiplication (MM) routines that can come from ATLAS, GotoBLAS, or your preferred vendor library. So far I
have experimented (heavily) with ATLAS and GotoBLAS; in the past I used SGI BLAS and, more recently, MKL BLAS.

**architecture.h**

This file contains the macros that specify which library is used. For example,
the macro mm_leaf_computation identifies the leaf computation
(the point where Strassen/Winograd yields control to the fancy
library routines).
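As a rough illustration of how a macro such as mm_leaf_computation can hide the choice of library, the sketch below routes the leaf product either to a CBLAS `dgemm` or to a plain triple loop. Only the macro name comes from the package; its argument list, the `USE_CBLAS` switch, and `naive_dgemm` are assumptions made for this example, and the real architecture.h may look quite different.

```c
#include <assert.h>
#include <stddef.h>

typedef double Mat;   /* element type, as in the package */

#ifdef USE_CBLAS      /* hypothetical switch, e.g. set from the Makefile */
#include <cblas.h>
/* C (m x n) = A (m x k) * B (k x n), row-major, no transposition. */
#define mm_leaf_computation(C, A, B, m, n, k)                        \
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,           \
                (m), (n), (k), 1.0, (A), (k), (B), (n), 0.0, (C), (n))
#else
/* Fallback used when no BLAS is selected: plain triple loop, row-major. */
static void naive_dgemm(Mat *C, const Mat *A, const Mat *B,
                        int m, int n, int k)
{
    assert(C && A && B && m > 0 && n > 0 && k > 0);
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) {
            Mat sum = 0.0;
            for (int p = 0; p < k; p++)
                sum += A[(size_t)i * k + p] * B[(size_t)p * n + j];
            C[(size_t)i * n + j] = sum;
        }
}
#define mm_leaf_computation(C, A, B, m, n, k) \
    naive_dgemm((C), (A), (B), (m), (n), (k))
#endif
```

With this arrangement the recursive kernels only ever mention `mm_leaf_computation`, so switching library is a matter of changing one compiler flag and the link line.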

**mat-operands.h**

The matrices are defined here, together with the basic routines for their
manipulation and division. For example, how
to get the sub-matrix A0 from the matrix A ...
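One common way to obtain a sub-matrix such as A0 without copying is to describe a matrix by a pointer, its dimensions, and a leading dimension, and to build a quadrant as a new descriptor over the same storage. The sketch below assumes row-major storage; the `MatView` type, the function names, and the quadrant numbering (0 = top-left, taken here to correspond to A0) are hypothetical and are not the definitions in mat-operands.h.

```c
#include <assert.h>
#include <stddef.h>

typedef double Mat;

/* Hypothetical descriptor: a view into row-major storage. */
typedef struct {
    Mat *data;   /* address of element (0,0) of the view */
    int  rows;
    int  cols;
    int  ld;     /* leading dimension: elements between consecutive rows */
} MatView;

/* Address of element (i, j) of the view. */
static Mat *mat_at(MatView a, int i, int j)
{
    assert(i >= 0 && i < a.rows && j >= 0 && j < a.cols);
    return a.data + (size_t)i * a.ld + j;
}

/* Quadrant q of a: 0 top-left, 1 top-right, 2 bottom-left, 3 bottom-right.
 * No data is copied; the result aliases the storage of a. */
static MatView mat_quadrant(MatView a, int q)
{
    assert(q >= 0 && q < 4);
    int top  = (a.rows + 1) / 2;   /* upper half takes the extra row if odd */
    int left = (a.cols + 1) / 2;   /* left half takes the extra column if odd */
    int r0 = (q & 2) ? top : 0;
    int c0 = (q & 1) ? left : 0;

    MatView s;
    s.data = a.data + (size_t)r0 * a.ld + c0;
    s.rows = (q & 2) ? a.rows - top : top;
    s.cols = (q & 1) ? a.cols - left : left;
    s.ld   = a.ld;                 /* same stride as the parent matrix */
    return s;
}
```

Because the quadrant keeps the parent's leading dimension, writing through `mat_at` on the view updates the original matrix in place.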

**mat-addkernels.c**

Matrix addition for matrices stored in row-major and column-major order.
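The main difference between the two layouts is which index should run in the inner loop so that memory is walked contiguously. The following is a minimal sketch under that assumption; the `Layout` enum and the `mat_add` signature are made up for illustration and are not the tuned kernels in mat-addkernels.c.

```c
#include <assert.h>
#include <stddef.h>

typedef double Mat;

typedef enum { ROW_MAJOR, COL_MAJOR } Layout;   /* hypothetical */

/* C = A + B for m x n matrices that share one layout.
 * lda, ldb, ldc are leading dimensions: the distance in elements between
 * consecutive rows (row-major) or consecutive columns (column-major). */
static void mat_add(Mat *C, int ldc, const Mat *A, int lda,
                    const Mat *B, int ldb, int m, int n, Layout layout)
{
    /* The contiguous index always goes in the inner loop. */
    int outer = (layout == ROW_MAJOR) ? m : n;
    int inner = (layout == ROW_MAJOR) ? n : m;

    assert(C && A && B && m > 0 && n > 0);
    assert(lda >= inner && ldb >= inner && ldc >= inner);

    for (int o = 0; o < outer; o++)
        for (int i = 0; i < inner; i++)
            C[(size_t)o * ldc + i] =
                A[(size_t)o * lda + i] + B[(size_t)o * ldb + i];
}
```

Passing a leading dimension larger than the inner extent lets the same routine add sub-matrices that live inside larger arrays.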

**mat-mulkernels.c/h**

Strassen, oblivious + Strassen, dynamic Strassen, Winograd,
and oblivious + Winograd are all defined here. The
LEAF constant, defined in mat-mulkernels.h, sets the recursion point.
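To make the role of a recursion point concrete, here is a minimal sketch of the classic Strassen recursion for square row-major matrices, which falls back to a plain triple loop once the size drops to LEAF or becomes odd. This is the textbook algorithm, not the package's kernels: the function names are hypothetical, the LEAF value is purely illustrative, the triple loop stands in for the library routine, and the output must not overlap the inputs.

```c
#include <assert.h>
#include <stdlib.h>

typedef double Mat;
#define LEAF 2   /* illustrative value only; the package defines its own */

/* C = A + sign * B on h x h blocks (C may alias A). */
static void add_blk(Mat *C, int ldc, const Mat *A, int lda,
                    const Mat *B, int ldb, int h, Mat sign)
{
    for (int i = 0; i < h; i++)
        for (int j = 0; j < h; j++)
            C[(size_t)i * ldc + j] =
                A[(size_t)i * lda + j] + sign * B[(size_t)i * ldb + j];
}

/* Leaf product: triple loop standing in for a library dgemm. */
static void leaf_mm(Mat *C, int ldc, const Mat *A, int lda,
                    const Mat *B, int ldb, int n)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            Mat sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += A[(size_t)i * lda + k] * B[(size_t)k * ldb + j];
            C[(size_t)i * ldc + j] = sum;
        }
}

/* C = A * B for n x n matrices. */
static void strassen(Mat *C, int ldc, const Mat *A, int lda,
                     const Mat *B, int ldb, int n)
{
    assert(n > 0);
    if (n <= LEAF || n % 2 != 0) {          /* recursion stops here */
        leaf_mm(C, ldc, A, lda, B, ldb, n);
        return;
    }
    int h = n / 2;
    size_t sz = (size_t)h * h;
    Mat *buf = malloc(9 * sz * sizeof *buf); /* S, T and M1..M7 */
    assert(buf != NULL);
    Mat *S = buf, *T = buf + sz, *M[7];
    for (int k = 0; k < 7; k++)
        M[k] = buf + (2 + (size_t)k) * sz;

    const Mat *A11 = A, *A12 = A + h, *A21 = A + (size_t)h * lda, *A22 = A21 + h;
    const Mat *B11 = B, *B12 = B + h, *B21 = B + (size_t)h * ldb, *B22 = B21 + h;
    Mat *C11 = C, *C12 = C + h, *C21 = C + (size_t)h * ldc, *C22 = C21 + h;

    add_blk(S, h, A11, lda, A22, lda, h, 1.0);
    add_blk(T, h, B11, ldb, B22, ldb, h, 1.0);
    strassen(M[0], h, S, h, T, h, h);        /* M1 = (A11+A22)(B11+B22) */
    add_blk(S, h, A21, lda, A22, lda, h, 1.0);
    strassen(M[1], h, S, h, B11, ldb, h);    /* M2 = (A21+A22) B11 */
    add_blk(T, h, B12, ldb, B22, ldb, h, -1.0);
    strassen(M[2], h, A11, lda, T, h, h);    /* M3 = A11 (B12-B22) */
    add_blk(T, h, B21, ldb, B11, ldb, h, -1.0);
    strassen(M[3], h, A22, lda, T, h, h);    /* M4 = A22 (B21-B11) */
    add_blk(S, h, A11, lda, A12, lda, h, 1.0);
    strassen(M[4], h, S, h, B22, ldb, h);    /* M5 = (A11+A12) B22 */
    add_blk(S, h, A21, lda, A11, lda, h, -1.0);
    add_blk(T, h, B11, ldb, B12, ldb, h, 1.0);
    strassen(M[5], h, S, h, T, h, h);        /* M6 = (A21-A11)(B11+B12) */
    add_blk(S, h, A12, lda, A22, lda, h, -1.0);
    add_blk(T, h, B21, ldb, B22, ldb, h, 1.0);
    strassen(M[6], h, S, h, T, h, h);        /* M7 = (A12-A22)(B21+B22) */

    add_blk(C11, ldc, M[0], h, M[3], h, h, 1.0);   /* C11 = M1+M4-M5+M7 */
    add_blk(C11, ldc, C11, ldc, M[4], h, h, -1.0);
    add_blk(C11, ldc, C11, ldc, M[6], h, h, 1.0);
    add_blk(C12, ldc, M[2], h, M[4], h, h, 1.0);   /* C12 = M3+M5 */
    add_blk(C21, ldc, M[1], h, M[3], h, h, 1.0);   /* C21 = M2+M4 */
    add_blk(C22, ldc, M[0], h, M[1], h, h, -1.0);  /* C22 = M1-M2+M3+M6 */
    add_blk(C22, ldc, C22, ldc, M[2], h, h, 1.0);
    add_blk(C22, ldc, C22, ldc, M[5], h, h, 1.0);
    free(buf);
}
```

Each level replaces eight half-size products with seven at the cost of extra additions, which is why the cutoff matters: below some size the additions and temporaries cost more than the saved product, and a tuned library routine is the better choice.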

- Free = no responsibility
- This software is freely available for non-commercial use. By downloading our software, you explicitly agree to this restriction.
- Please send an email to fastmm@ics.uci.edu to request a copy of the source files.

Copyright (c) 2007, P. D'Alberto, A. Nicolau, and A. Kumar.