An alpha version of the Co-Array Fortran compiler 2.0 will be released soon.
Co-Array Fortran (CAF) is an SPMD parallel programming model based on a small set of language extensions to Fortran 90. CAF supports access to non-local data using a natural extension to Fortran 90 syntax, lightweight and flexible synchronization primitives, pointers and dynamic allocation of shared data, and parallel I/O. An executing CAF program consists of a static collection of asynchronous process images. Like MPI programs, CAF programs explicitly manage locality, data distribution, and computation distribution; however, CAF is a shared-memory programming model based on one-sided communication. Rather than explicitly coding message exchanges to obtain off-processor data, CAF programs can directly reference off-processor values using an extension of Fortran 90 syntax for subscripted references. Since both remote data access and synchronization are expressed in the language, communication and synchronization are amenable to compiler-based optimizing transformations.
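As an illustration of the subscripted remote references described above, here is a minimal sketch in which each image reads an array from its left neighbor. It is written in the coarray syntax later standardized in Fortran 2008, which may differ in details from the CAF dialect accepted by this compiler. The program name, array names, and sizes are chosen for the example only.

```fortran
program ring_shift
  implicit none
  integer, parameter :: n = 4
  real :: a(n)[*]        ! co-array: every image holds its own a(1:n)
  real :: b(n)           ! ordinary local array
  integer :: me, left

  me = this_image()      ! index of this image, 1..num_images()
  left = me - 1
  if (left < 1) left = num_images()   ! wrap around to form a ring

  a = real(me)           ! each image fills its local copy
  sync all               ! all images have written a before anyone reads

  b = a(:)[left]         ! one-sided read of a(1:n) from image "left"

  print *, 'image', me, 'read', b(1), 'from image', left
end program ring_shift
```

The square brackets select the image that owns the data; a reference without brackets is always local. No matching send or receive is written on the owning image, which is what distinguishes this one-sided style from MPI message exchange.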
To date, CAF has not appealed to application scientists as a model for developing scalable, portable codes, because the language is still somewhat immature and a fledgling compiler is only available on Cray platforms. At Rice University, we are working to create an open-source, portable, retargetable, high-quality CAF compiler suitable for use with production codes. Our compiler translates CAF into Fortran 90 plus calls to ARMCI, a multi-platform library for one-sided communication. Recently, we completed implementation of the core CAF language features, enabling us to begin experimentation to assess the potential of CAF as a high-performance programming model. Preliminary experiments comparing CAF and MPI versions of the BT, MG, SP and CG NAS parallel benchmarks on a large Itanium 2 cluster with a Myrinet 2000 interconnect show that our CAF compiler prototype already yields code with performance that is roughly equal to hand-tuned MPI.
We are in the process of designing and implementing a portable, high-performance, source-to-source CAF compiler. To achieve portability, our compiler performs a source-to-source translation from CAF to Fortran 90 with calls to run-time library primitives. To achieve high performance, we want the Fortran 90 code that we generate from CAF to be optimizable by vendor compilers. To generate high-performance code for a variety of target platforms, our CAF compiler will use a parameterization of target platform characteristics to guide optimization. In our design, CAF optimization strategies are guided by platform-specific cost models that contain information about the costs of various operations for a particular architecture and interconnect. Among other things, platform-specific cost models will be used to decide how best to implement communication. For example, on a platform that has an interconnect suited to coarse-grain communication, the communication cost model should guide our CAF compiler to vectorize communication; on a platform with hardware shared memory, the communication cost model should guide our CAF compiler to generate code that uses loads and stores to directly access non-local data.
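To make the communication choice concrete, the sketch below shows, at the source level, the two forms a cost model would choose between: a loop with one remote reference per iteration, and an equivalent single array-section transfer. It is an illustration only, written in Fortran 2008 coarray syntax with hypothetical names; it is not the code our compiler generates, and the choice of the last image as the data owner is arbitrary.

```fortran
program vectorize_comm
  implicit none
  integer, parameter :: n = 1000
  real :: a(n)[*]        ! co-array holding the remote data
  real :: b(n), c(n)     ! local destinations for the two variants
  integer :: i, p

  p = num_images()       ! read from the last image (arbitrary choice)
  a = real(this_image())
  sync all               ! remote data is ready before it is read

  ! Fine-grain form: one remote element reference per iteration.
  ! On hardware shared memory this can map onto plain loads.
  do i = 1, n
     b(i) = a(i)[p]
  end do

  ! Coarse-grain (vectorized) form: one bulk transfer of the section.
  ! On a cluster interconnect this amortizes per-message overhead.
  c(1:n) = a(1:n)[p]

  ! Both forms must deliver the same values.
  if (any(b /= c)) error stop 'fine-grain and vectorized forms disagree'
end program vectorize_comm
```

Both forms have the same meaning, so a compiler is free to pick whichever the platform's cost model predicts to be cheaper.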
We don't have a version available for download yet. The current status of the compiler is presented in the project review slides.
This work was supported in part by the Department of Energy under Grant DE-FC03-01ER25504/A000, the Los Alamos Computer Science Institute (LACSI) through LANL contract number 03891-99-23 as part of the prime contract (W-7405-ENG-36) between the DOE and the Regents of the University of California, the Texas Advanced Technology Program under Grant 003604-0059-2001, and Compaq Computer Corporation under a cooperative research agreement. The Itanium cluster used in this work was purchased with support from the NSF under Grant EIA-0216467, Intel, and Hewlett Packard.