The Habanero-C (HC) language, under development at Rice University, builds on past work on Habanero-Java (HJ), which in turn was derived from X10 v1.5. HC serves as a research testbed for new compiler and runtime software technologies for extreme-scale systems built from homogeneous and heterogeneous processors. Like HJ, HC implements an execution model for multicore processors based on four orthogonal dimensions of portable parallelism:
1. Lightweight dynamic task creation and termination using the async, finish, future, forall, foreach, and ateach constructs
2. Collective and point-to-point synchronization using phasers
3. Mutual exclusion and isolation using the isolated construct
4. Locality control using hierarchical place trees
Unlike HJ, Habanero-C is not constrained by the need for a managed runtime system such as a Java Virtual Machine. This makes it possible to map HC onto hardware platforms with lightweight system software stacks, such as the Customizable Heterogeneous Platform (CHP) being developed in the NSF Expeditions Center for Domain-Specific Computing (CDSC). It also makes it easier to integrate HC with communication middleware for cluster systems (such as MPI and GASNet).
The Habanero-C compiler is written in C++ and is built on top of the ROSE compiler infrastructure, which is also used in the PACE project at Rice University. The bulk of the Habanero-C runtime has been written from scratch in portable ANSI C; only a few library routines for low-level synchronization and atomic operations are written in assembly language for the target platform. To date, the Habanero-C runtime has been ported to and tested on Intel x86, Cyclops 64, and Intel SCC multicore platforms.
A short summary of the HC language is included below. Details on the underlying implementation technologies can be found on the Habanero publications web page. The HC implementation is at an early stage and is still evolving. If you would like to try out HC, please contact one of the following people: Zoran Budimlić, Yonghong Yan, Vincent Cave, or Vivek Sarkar.
Habanero-C borrows its two basic primitives for task-parallel programming from X10: async and finish. The async statement, async <stmt>, causes the parent task to fork a new child task that executes <stmt>. Execution of the async statement returns immediately, i.e., the parent task can proceed to the statement that follows without waiting for the child task to complete.
The finish statement, finish <stmt>, performs a join operation that causes the parent task to execute <stmt> and then wait until all the tasks created within <stmt> have terminated (including transitively spawned tasks).
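The semantics of async and finish can be illustrated with a minimal sketch in plain C on top of POSIX threads. This is not the HC runtime or its API: the names finish_scope, finish_begin, async_spawn, finish_end and run_finish_demo are hypothetical, and one thread per task is used only for brevity, whereas HC tasks are lightweight. The sketch shows the two properties described above: async_spawn returns without waiting for the child, and finish_end blocks until every task created inside the scope, including a transitively spawned one, has terminated.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical finish scope: counts spawned tasks that are still running. */
typedef struct finish_scope {
    pthread_mutex_t lock;
    pthread_cond_t  all_done;
    int             pending;
} finish_scope;

typedef void (*task_fn)(finish_scope *scope, void *arg);
typedef struct { finish_scope *scope; task_fn body; void *arg; } task;

static void finish_begin(finish_scope *f) {
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->all_done, NULL);
    f->pending = 0;
}

/* Called once per task when its body has returned. */
static void task_done(finish_scope *f) {
    pthread_mutex_lock(&f->lock);
    if (--f->pending == 0)
        pthread_cond_signal(&f->all_done);
    pthread_mutex_unlock(&f->lock);
}

static void *task_trampoline(void *p) {
    task t = *(task *)p;
    free(p);
    t.body(t.scope, t.arg);
    task_done(t.scope);
    return NULL;
}

/* "async <stmt>": register the child in the scope, fork it, return at once. */
static void async_spawn(finish_scope *f, task_fn body, void *arg) {
    pthread_t tid;
    task *t = malloc(sizeof *t);

    pthread_mutex_lock(&f->lock);
    f->pending++;
    pthread_mutex_unlock(&f->lock);

    if (t != NULL) {
        t->scope = f; t->body = body; t->arg = arg;
        if (pthread_create(&tid, NULL, task_trampoline, t) == 0) {
            pthread_detach(tid);
            return;
        }
        free(t);
    }
    body(f, arg);   /* Fallback: run the child inline if no thread is available. */
    task_done(f);
}

/* End of "finish <stmt>": join all tasks created within the scope. */
static void finish_end(finish_scope *f) {
    pthread_mutex_lock(&f->lock);
    while (f->pending > 0)
        pthread_cond_wait(&f->all_done, &f->lock);
    pthread_mutex_unlock(&f->lock);
}

static int results[2];

static void grandchild(finish_scope *f, void *arg) {
    (void)f; (void)arg;
    results[1] = 2;
}

static void child(finish_scope *f, void *arg) {
    (void)arg;
    async_spawn(f, grandchild, NULL);   /* transitively spawned task */
    results[0] = 1;
}

/* Equivalent in spirit to: finish { async child(); } */
int run_finish_demo(void) {
    static finish_scope f;
    results[0] = results[1] = 0;
    finish_begin(&f);
    async_spawn(&f, child, NULL);
    finish_end(&f);                     /* waits for child and grandchild */
    return results[0] + results[1];
}
```

Because a child is still counted as pending while it spawns its own children, the counter cannot reach zero before the transitively spawned task has been registered, which is what lets finish_end wait for the whole task tree.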
Habanero-C uses phasers for synchronization. Phasers are programming constructs that unify collective and point-to-point synchronization in task-parallel programming. They are designed for ease of use and safety, improving programmer productivity when writing and debugging task-parallel programs. The use of phasers guarantees two safety properties: deadlock-freedom and phase-ordering. These properties, along with the generality of phasers for dynamic parallelism, distinguish them from other synchronization constructs such as barriers, counting semaphores and X10 clocks.
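To make the phase-ordering property concrete, the following is a deliberately simplified sketch in plain C with POSIX threads. It is not the HC phaser implementation: the type phaser and the functions phaser_init, phaser_next, phaser_drop and run_phaser_demo are hypothetical, and only the collective case is modelled, where every registered task both signals and waits. Real phasers also support signal-only and wait-only registration for point-to-point synchronization, which this sketch omits.

```c
#include <pthread.h>

#define NTASKS 3

/* Hypothetical, simplified phaser: all registered tasks signal and wait. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  advanced;
    int registered;   /* tasks currently registered on the phaser */
    int signaled;     /* tasks that have arrived in the current phase */
    int phase;        /* current phase number */
} phaser;

static phaser ph;
static int a[NTASKS], b[NTASKS];

static void phaser_init(phaser *p, int registered) {
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->advanced, NULL);
    p->registered = registered;
    p->signaled = 0;
    p->phase = 0;
}

/* Advance the phase if every registered task has arrived. Caller holds lock. */
static void advance_if_complete(phaser *p) {
    if (p->registered > 0 && p->signaled == p->registered) {
        p->signaled = 0;
        p->phase++;
        pthread_cond_broadcast(&p->advanced);
    }
}

/* "next": signal arrival, then wait until the phaser moves to the next phase. */
static void phaser_next(phaser *p) {
    pthread_mutex_lock(&p->lock);
    int my_phase = p->phase;
    p->signaled++;
    advance_if_complete(p);
    while (p->phase == my_phase)
        pthread_cond_wait(&p->advanced, &p->lock);
    pthread_mutex_unlock(&p->lock);
}

/* Deregister a task so that the remaining tasks are not blocked by it. */
static void phaser_drop(phaser *p) {
    pthread_mutex_lock(&p->lock);
    p->registered--;
    advance_if_complete(p);
    pthread_mutex_unlock(&p->lock);
}

static void *worker(void *arg) {
    int i = *(int *)arg;
    a[i] = i + 1;                  /* phase 0: produce own value */
    phaser_next(&ph);              /* all phase-0 writes complete before phase 1 */
    b[i] = a[(i + 1) % NTASKS];    /* phase 1: read a neighbour's value */
    return NULL;
}

int run_phaser_demo(void) {
    pthread_t tid[NTASKS];
    int ids[NTASKS], started[NTASKS];

    phaser_init(&ph, NTASKS);
    for (int i = 0; i < NTASKS; i++) {
        ids[i] = i;
        started[i] = (pthread_create(&tid[i], NULL, worker, &ids[i]) == 0);
        if (!started[i])
            phaser_drop(&ph);      /* a task that never ran must not block others */
    }
    for (int i = 0; i < NTASKS; i++)
        if (started[i])
            pthread_join(tid[i], NULL);
    return b[0] + b[1] + b[2];
}
```

If all three workers start, each one reads its neighbour's value only after the phase boundary, so the reads always see the phase-0 writes. The phaser_drop path hints at why dynamic registration matters: a task that leaves the phaser must not keep the remaining tasks waiting.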
For locality, Habanero-C uses Hierarchical Place Trees (HPTs). HPTs abstract the underlying hardware as a hierarchical tree, allowing the program to spawn tasks at places, which could be, for example, cores, groups of cores sharing a cache, nodes, groups of nodes, or other devices such as GPUs or FPGAs. The work-stealing runtime takes advantage of the hardware hierarchy to preserve locality when executing tasks.
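As a rough illustration of how a place tree can guide a locality-aware search for work, the sketch below models places as nodes with parent links and lets a worker look for queued tasks starting at its own leaf place and moving towards the root. Everything here is an assumption made for illustration: the place struct, the example topology (one node, two cache groups, three cores), the queued_tasks counter and the functions nearest_place_with_work and hops_to_work are hypothetical and do not describe the actual HC runtime data structures or its stealing policy.

```c
#include <stddef.h>

/* Hypothetical place: one node of a hierarchical place tree. */
typedef struct place {
    const char   *name;
    struct place *parent;        /* NULL for the root place */
    int           queued_tasks;  /* tasks spawned at this place, not yet run */
} place;

/* Assumed example topology: a node with two cache groups and three cores. */
static place root   = { "node",   NULL,    0 };
static place cache0 = { "cache0", &root,   1 };  /* one task queued here */
static place cache1 = { "cache1", &root,   0 };
static place core0  = { "core0",  &cache0, 0 };
static place core1  = { "core1",  &cache0, 0 };
static place core2  = { "core2",  &cache1, 0 };

/* Search from a worker's leaf place towards the root, so that work found
 * closer in the hierarchy (sharing more hardware) is preferred. */
static place *nearest_place_with_work(place *leaf) {
    for (place *p = leaf; p != NULL; p = p->parent)
        if (p->queued_tasks > 0)
            return p;
    return NULL;  /* nothing on this path; a runtime might steal elsewhere */
}

/* Number of levels climbed to find work, or -1 if the path holds none. */
static int hops_to_work(place *leaf) {
    int hops = 0;
    for (place *p = leaf; p != NULL; p = p->parent, hops++)
        if (p->queued_tasks > 0)
            return hops;
    return -1;
}
```

With this topology, a worker at core0 or core1 finds the task queued at cache0 one level up, while a worker at core2 finds nothing on its own path to the root and would have to look elsewhere.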