Grid Application Framework for Java

A framework that abstracts all grid semantics from the application logic and provides a simpler programming model.


Date Posted: February 13, 2003

1. What is GAF4J built on?
2. When can I look at using GAF4J?
3. What is the model available to developers for distributing computations over the grid?
4. What are the differences between GAF4J 0.9.5 and GAF4J 0.9.6?
5. If I choose to "Distribute Threads over the Grid," then why must I download DPPEJ?
6. Is it necessary to download DPPEJ even if my client application does not use the "Distribute Threads over the Grid" model?
7. Is there any specific set-up of the grid expected if the client application uses the "Distribute Threads over the Grid" model?
8. Does GAF4J support resource brokering?
9. What are the contents of the shell file named java, and why is it required?
10. What are the various configuration files used by GAF4J 0.9.5 and what is the significance of each?
11. What is the significance of the configuration information in gaf4jAttributes.conf?
12. What is the significance of the configuration information in gaf4jRISs.conf?
13. How can I know the "Mds-Vo-name" of a GIIS?
14. How can I retrieve a list of all the computers (along with their attributes) in a grid?
15. How can I register the GRIS of a machine X to the GIIS of a machine Y?
16. Where can I get more information about setting up a hierarchical grid?
17. Where can I get information about the various attributes of nodes supported by MDS?
18. The sample gaf4jAttributes.conf file has an attribute with MDS name Mds-Shared-dir. But when I see http://www.globus.org/mds/Schema.html, I find that attribute Mds-Shared-dir is not supported by MDS. What is its significance?
19. What can a regular DPPEJ user observe as a difference in the operational model of DPPEJ integrated into GAF4J?
20. To a regular DPPEJ user, what are the programming model differences when using GAF4J's "Distribute Threads over the Grid" model?
21. What is the significance of providing the arguments "gaf4jLocation" and "statusListener" when creating a GridDistributedExecutionManager in order to use DPPEJ in GAF4J 0.9.6?


1. What is GAF4J built on?

The profile of GAF4J 0.9.6 is as follows:
  • Built over Globus Toolkit 2.4 and Java CoG 1.1
  • Uses GridFTP to make directories and transfer files across local and remote grid nodes
  • Integrated with Globus MDS (GIIS) for resource discovery
  • Implements a Resource Broker that is configurable for grid-node weight computations and grid-node attribute comparisons
  • Can be integrated with DPPEJ to provide an additional programming model for distributing threads over a Globus grid.

2. When can I look at using GAF4J?

GAF4J can be used to develop Java applications or to enable existing Java applications to use computer resources on a grid. In this regard, there are two scenarios in which a Java application developer can look at using GAF4J:
  • When there is a computation-intensive task that can be spawned off and executed in parallel with the client application's current running thread
  • When there is a computation-intensive task that must be cloned into multiple instances, with the cloned instances executing in parallel and exchanging MPI-like messages.

3. What is the model available to developers for distributing computations over the grid?

Presently, GAF4J provides two models for distributing computations over a grid:
  • Distributing Tasks over the Grid: If a function or task requires computation resources that are locally unavailable to the client application, and if this task can be executed asynchronously in an independent thread, then it can be encapsulated as a GAF4J task and executed over the grid.
  • Distributing Threads over the Grid: Consider a function or task that must be cloned into multiple instances, all of which are to be executed in parallel and can synchronize with each other using MPI-like messages. In this second model, GAF4J handles both the cloning of the function into multiple instances and their distribution over the grid.
For more information on programming against each of these models, refer to the "GAF4J Programming Model" document.
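As a purely local analogy, the difference between the two models can be sketched with the standard java.util.concurrent classes. This is not the GAF4J or DPPEJ API: every class and method name below is made up for illustration, and everything runs inside a single JVM rather than on grid nodes. The first method runs one independent task asynchronously; the second clones the same work into several instances that report partial results by passing messages.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

// Local analogy only: none of these names belong to GAF4J or DPPEJ.
class DistributionModelsSketch {

    // Model 1 analogy: one independent task, run asynchronously from the caller.
    static long runAsTask(int n) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Long> task = () -> sumTo(n);
            Future<Long> pending = pool.submit(task); // the caller could continue working here
            return pending.get();                     // collect the result when it is needed
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Model 2 analogy: the same function cloned into several instances that
    // cooperate by sending messages (here, partial sums put into a mailbox).
    static long runAsClones(int n, int clones) {
        BlockingQueue<Long> mailbox = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(clones);
        try {
            for (int id = 0; id < clones; id++) {
                final int rank = id;
                pool.execute(() -> {
                    long partial = 0;
                    // Each clone sums the numbers rank+1, rank+1+clones, ...
                    for (int i = rank + 1; i <= n; i += clones) {
                        partial += i;
                    }
                    mailbox.add(partial); // "send" the partial result
                });
            }
            long total = 0;
            for (int i = 0; i < clones; i++) {
                total += mailbox.take(); // "receive" one message per clone
            }
            return total;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    static long sumTo(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(runAsTask(100));
        System.out.println(runAsClones(100, 4));
    }
}
```

Both calls compute the same sum; only the way the work is organized differs.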

4. What are the differences between GAF4J 0.9.5 and GAF4J 0.9.6?

The differences are as follows:
GAF4J 0.9.5:
  • Built over Globus Toolkit 2.4 and Java CoG 1.1
  • Uses GridFTP to make directories and transfer files across local and remote grid nodes
  • Can be integrated with Globus MDS (GIIS) for resource discovery
  • Implements a resource broker that is configurable for grid-node weight computations and grid-node attribute comparisons
  • Supports only the model of "Distribute Tasks over the Grid."
GAF4J 0.9.6:
  • Extends GAF4J 0.9.5 to support the "Distribute Threads over the Grid" model in addition to the "Distribute Tasks over the Grid" model that exists in Version 0.9.5.

5. If I choose to "Distribute Threads over the Grid," then why must I download DPPEJ?

GAF4J provides the "Distribute Threads over the Grid" model by integrating another distributed computing environment, the Distributed Parallel Programming Environment for Java (DPPEJ), which is available here at alphaWorks. DPPEJ is not distributed with the GAF4J package; developers must download it directly from IBM alphaWorks. After downloading DPPEJ, the developer must extract the file dthread.jar into the same subdirectory where the GAF4J JAR files have been extracted.

6. Is it necessary to download DPPEJ even if my client application does not use the "Distribute Threads over the Grid" model?

No. If your client application does not use the "Distribute Threads over the Grid" model and uses only the "Distribute Tasks over the Grid" model, then DPPEJ is not required.

7. Is there any specific set-up of the grid expected if the client application uses the "Distribute Threads over the Grid" model?

GAF4J supports this model of task distribution over the grid by integrating DPPEJ. The DPPEJ environment uses Java RMI to enable MPI-like message passing among the threads. To facilitate this message passing, the grid server nodes must permit Java RMI communication. If the owner of a grid server node wishes to specify custom security policies for this RMI communication, those policies must be coded in a file named java.policy and placed in the /tmp directory on that node. If this file cannot be located in /tmp, GAF4J uses a policy file that comes with its package.
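The fallback rule described above can be sketched as follows. This is only an illustration of the rule, not GAF4J's actual lookup code: the helper name choosePolicy and the packaged file name gaf4j-default.policy are assumptions. The java.security.policy system property shown in main is the standard way to tell a JVM which policy file to load.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: GAF4J's real lookup code is not shown in this FAQ.
class PolicyFileSketch {

    // Prefer the node owner's policy file when it is readable,
    // otherwise fall back to the policy file shipped with the package.
    static Path choosePolicy(Path ownerPolicy, Path packagedPolicy) {
        return Files.isReadable(ownerPolicy) ? ownerPolicy : packagedPolicy;
    }

    public static void main(String[] args) {
        Path owner = Paths.get("/tmp/java.policy");
        Path packaged = Paths.get("gaf4j-default.policy"); // assumed file name
        Path chosen = choosePolicy(owner, packaged);
        // JVM option that would select the chosen policy file.
        System.out.println("-Djava.security.policy=" + chosen);
    }
}
```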

8. Does GAF4J support resource brokering?

GAF4J is designed to include resource brokering. In the current release, however, this takes the form of a simple Resource Broker Service (RBS) that runs in the client JVM. This service obtains node capability information from the Resource Information Service (RIS) and compares it with the required capabilities specified by the Java application. The nodes that match are considered for subsequent job submission and execution. The RBS may collect information from more than one RIS instance, where each instance is a wrapper around the GIIS service of the Globus Toolkit's information service (MDS).
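The matching step can be pictured with a small sketch. It is not the GAF4J broker: the class, the attribute names (cpuCount, freeMemoryMB), and the "at least the required value" rule are all assumptions chosen to keep the example short. Node capabilities are modeled as plain maps, and a node qualifies only if it meets every requirement.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of capability matching; not GAF4J's actual broker classes.
class BrokerMatchSketch {

    // A node qualifies when every required attribute is present and its
    // value is at least the required value (an assumed matching rule).
    static boolean satisfies(Map<String, Integer> node, Map<String, Integer> required) {
        for (Map.Entry<String, Integer> need : required.entrySet()) {
            Integer have = node.get(need.getKey());
            if (have == null || have.compareTo(need.getValue()) < 0) {
                return false;
            }
        }
        return true;
    }

    // Returns the names of all nodes that satisfy the requirements,
    // in the order in which the nodes were supplied.
    static List<String> matchingNodes(Map<String, Map<String, Integer>> nodes,
                                      Map<String, Integer> required) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Map<String, Integer>> entry : nodes.entrySet()) {
            if (satisfies(entry.getValue(), required)) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> nodes = new LinkedHashMap<>();
        nodes.put("nodeA", Map.of("cpuCount", 4, "freeMemoryMB", 2048));
        nodes.put("nodeB", Map.of("cpuCount", 1, "freeMemoryMB", 512));
        Map<String, Integer> required = Map.of("cpuCount", 2, "freeMemoryMB", 1024);
        System.out.println(matchingNodes(nodes, required)); // prints [nodeA]
    }
}
```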

9. What are the contents of the shell file named java, and why is it required?

The shell file named java is required on the grid nodes to point to the JRE installed there. It must be available in the /tmp directory of each grid node, since /tmp is normally accessible to all users. The contents of this file are as follows:
    <absolute path of java executable> $*
For example:
    /usr/local/jsdk1.4.0/bin/java $*

10. What are the various configuration files used by GAF4J 0.9.5 and what is the significance of each?

The following configuration files are connected with GAF4J 0.9.5. The details of each are as follows:
  • FwkJars: This file contains a list of JAR files that GAF4J must distribute to remote grid nodes as part of the framework classes. This file must not be modified by client programs.
  • FwkJarsExt: This file contains a list of JAR files that are external to GAF4J, such as the ones from DPPEJ. These JAR files must also be distributed to remote grid nodes as part of the framework classes. This file must not be modified by client programs.
  • gaf4jAttributes.conf: This file contains client application-defined names for the attributes of grid nodes. Each name maps to a corresponding attribute name in MDS and the name of a class that is to be used to represent that attribute's string value as returned by the MDS.
  • gaf4jRISs.conf: This file contains initialization information for instantiating the Resource Information Service of GAF4J. Each line in this file holds information about one GIIS (Grid Index Information Service) instance of Globus MDS. This information is used by the Resource Broker service to instantiate the Resource Information Service instances that it must hold.
  • gaf4jBroker.conf: This file contains initialization information for the Resource Broker Service of GAF4J. This file defines whether the Resource Broker in GAF4J must display information of nodes that it deals with, the class that must be used by the Broker to compute node weights, the weights that must be associated with grid nodes according to their model, etc. The Resource Broker instantiates one or more Resource Information Service (RIS) instances in order to collect information about grid nodes in the domain or network.
For more details, please refer to the included white paper and sample configuration files.

11. What is the significance of the configuration information in gaf4jAttributes.conf?

Information about grid nodes can be discovered using the MDS service of the Globus Toolkit. MDS describes a grid node through various attributes and their values. However, the names used by MDS are lengthy and inconvenient to remember, so it is preferable for client programs to define their own attribute names that, in turn, map to the MDS names. In addition, the values returned by MDS are strings that require further interpretation, and client programs benefit from the flexibility of defining how these strings are to be interpreted for each attribute. This is done through data types (classes) that represent the attribute values, given the string returned by MDS. Such classes must implement the interface java.lang.Comparable so that the Resource Broker can compare attribute values correctly when searching for grid nodes with specified attributes.

The configuration file gaf4jAttributes.conf enables client programs to map each MDS attribute name to a client program-defined attribute name and to a class that is used to represent the attribute's value. This mapping means that client programs can remain unaffected by changes to MDS attribute names. There are some mandatory entries in this file that must not be modified by client programs.
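To see why a value class matters, consider an attribute that holds a memory size. A minimal sketch of such a class follows. The class name MemoryMegabytes and its single-String constructor are assumptions made for illustration; this FAQ does not state what constructor GAF4J expects, so the sample configuration files remain the authority on the real contract.

```java
// Hypothetical value class: the constructor contract GAF4J expects is not
// documented in this FAQ, so a single-String constructor is an assumption.
final class MemoryMegabytes implements Comparable<MemoryMegabytes> {

    private final long megabytes;

    // Interprets the raw string returned by MDS as a whole number of megabytes.
    MemoryMegabytes(String mdsValue) {
        this.megabytes = Long.parseLong(mdsValue.trim());
    }

    // Numeric ordering, so that 512 sorts below 2048.
    @Override
    public int compareTo(MemoryMegabytes other) {
        return Long.compare(megabytes, other.megabytes);
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof MemoryMegabytes
                && ((MemoryMegabytes) obj).megabytes == megabytes;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(megabytes);
    }

    public static void main(String[] args) {
        // Compared as raw strings, "512" sorts after "2048" because '5' > '2'.
        System.out.println("512".compareTo("2048") > 0);
        // Compared as typed values, 512 MB is correctly less than 2048 MB.
        System.out.println(new MemoryMegabytes("512").compareTo(new MemoryMegabytes("2048")) < 0);
    }
}
```

Both lines print true: the raw strings order incorrectly for a numeric attribute, while the typed values order as expected.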

12. What is the significance of the configuration information in gaf4jRISs.conf?

The Resource Broker of GAF4J works with one or more Resource Information Service (RIS) instances to obtain information about grid nodes. Every instance of an RIS is a wrapper around a GIIS. Through this configuration file, the client program can include several domains or networks of grid nodes for the task distribution by simply adding information about the GIIS that indexes information about the nodes.

13. How can I know the "Mds-Vo-name" of a GIIS?

In the file $GLOBUS_LOCATION/etc/grid-info-slapd.conf, look for the line database giis. The next line mentions the Mds-Vo-name (or giis-name) of that GIIS.

14. How can I retrieve a list of all the computers (along with their attributes) in a grid?

Use the following command (all on one line):
    grid-info-search -x -h giishost -p giisport -b "mds-vo-name=giis-name, o=grid" "(objectclass=mdshost)"
where giishost is the IP address or DNS name of the machine hosting the root GIIS of the grid, giisport is the port number on which the LDAP server is running on that host, and giis-name is the Mds-Vo-name of that GIIS.
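Because the GIIS is served over LDAP, the same query can in principle be issued from Java through the standard JNDI LDAP provider. The following is a minimal sketch that has not been tested against a live GIIS. The class and method names are illustrative, and it assumes the GIIS accepts anonymous binds (taken here to be what -x requests) and a subtree search.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

// Illustrative only: mirrors the grid-info-search command line using JNDI.
class GiisQuerySketch {

    static String providerUrl(String giisHost, int giisPort) {
        return "ldap://" + giisHost + ":" + giisPort;
    }

    // Same search base as the -b argument of the command.
    static String searchBase(String giisName) {
        return "mds-vo-name=" + giisName + ", o=grid";
    }

    static void printHosts(String giisHost, int giisPort, String giisName)
            throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, providerUrl(giisHost, giisPort));
        env.put(Context.SECURITY_AUTHENTICATION, "none"); // anonymous bind (assumption)

        DirContext ctx = new InitialDirContext(env);
        try {
            SearchControls controls = new SearchControls();
            controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
            NamingEnumeration<SearchResult> results =
                    ctx.search(searchBase(giisName), "(objectclass=mdshost)", controls);
            while (results.hasMore()) {
                SearchResult entry = results.next();
                System.out.println(entry.getNameInNamespace());
                System.out.println("  " + entry.getAttributes());
            }
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) throws NamingException {
        if (args.length != 3) {
            System.err.println("usage: GiisQuerySketch giishost giisport giis-name");
            return;
        }
        printHosts(args[0], Integer.parseInt(args[1]), args[2]);
    }
}
```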

15. How can I register the GRIS of a machine X to the GIIS of a machine Y?

On machine X, update the configuration file $GLOBUS_LOCATION/etc/grid-info-resource-register.conf and restart its LDAP server (MDS).

16. Where can I get more information about setting up a hierarchical grid?

Please see the paper "Creating a Hierarchical GIIS."

17. Where can I get information about the various attributes of nodes supported by MDS?

Please see MDS Schemas.

18. The sample gaf4jAttributes.conf file has an attribute with MDS name Mds-Shared-dir. But when I see http://www.globus.org/mds/Schema.html, I find that attribute Mds-Shared-dir is not supported by MDS. What is its significance?

MDS allows additional information to be posted to it through GRIS. Further releases of GAF4J will exploit this capability in order to allow the user to specify a shared directory other than /tmp, as currently assumed.

19. What can a regular DPPEJ user observe as a difference in the operational model of DPPEJ integrated into GAF4J?

GAF4J expands DPPEJ from a preset cluster of nodes to a wider and more open Globus grid. The following are the differences that GAF4J makes in the existing operational model of DPPEJ:
  • Automatic identification of nodes during application run time, in order to run DThread instances on the grid
  • Run-time set-up and start of the DPPEJ Daemon on remote server nodes
  • Facility for defining the required capability of nodes for running the DThread instances so that only suitable nodes are selected
  • Facility for transferring input, output, and library files to DThread instances from client node to remote server nodes
  • Transfer of log files from remote server nodes to the client nodes
  • Asynchronous status updates on the various DThread instances spawned by a client application.

20. To a regular DPPEJ user, what are the programming model differences when using GAF4J's "Distribute Threads over the Grid" model?

There is no change to the existing DPPEJ programming model. However, because this integration brings additional facilities into DPPEJ, it is essential to use the classes extended by GAF4J, such as the GridDistributedExecutionManager and GridDThreadExecutionContext classes. The DPPEJ developer will see the following minor differences from the known programming model:
Creation of the context object: To create the context object, developers must use the class GridDThreadExecutionContext. In this context instance, developers may optionally configure the following:
  • Capabilities required from grid server nodes to execute the DThread
  • Input, library, and output files that need to be transferred to the grid server nodes.
Creation of the execution manager: To create the execution manager, developers must use the class GridDistributedExecutionManager, providing two arguments, "gaf4jLocation" and "statusListener," to the constructor. The significance of these arguments is discussed in the answer to the next question. For more details on the programming model, please refer to the document "GAF4J Programming Model."

21. What is the significance of providing the arguments "gaf4jLocation" and "statusListener" when creating a GridDistributedExecutionManager in order to use DPPEJ in GAF4J 0.9.6?

The primary benefit that GAF4J brings to DPPEJ is the automated set-up of the DPPEJ environment over the grid, together with asynchronous updates about the status of the various DThread instances. To set up the DPPEJ environment at run time, the GridDistributedExecutionManager must know the location of the DPPEJ and GAF4J JAR files on the client node's file system. This location is passed as the first argument to the constructor of GridDistributedExecutionManager. Next, to enable the GridDistributedExecutionManager to post status updates to the client program, the client program must provide a TaskStatusListener to handle these updates. This status handler is the second argument passed to the constructor of GridDistributedExecutionManager.
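The role of the two arguments can be illustrated with stand-in types. The sketch below deliberately does not use the GAF4J classes: SketchExecutionManager and SketchStatusListener are invented names, their method signatures and parameter types are assumptions, and the example path /opt/gaf4j/lib is hypothetical. The stand-in also reports status synchronously for simplicity, whereas GAF4J posts its updates asynchronously. For the real signatures, consult the "GAF4J Programming Model" document.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in demo: none of these types are the real GAF4J classes.
class StatusListenerSketch {

    // Builds a manager with a JAR location and a listener, runs one
    // pretend thread instance, and returns the status lines it reported.
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        SketchExecutionManager manager = new SketchExecutionManager(
                "/opt/gaf4j/lib",                                // hypothetical JAR location
                (name, status) -> seen.add(name + ": " + status)); // status handler
        manager.run("worker-0");
        return seen;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}

// Assumed callback shape; the real TaskStatusListener may differ.
interface SketchStatusListener {
    void statusChanged(String threadName, String status);
}

// Assumed constructor shape: first the JAR location, then the listener.
class SketchExecutionManager {
    private final String jarLocation;
    private final SketchStatusListener listener;

    SketchExecutionManager(String jarLocation, SketchStatusListener listener) {
        this.jarLocation = jarLocation;
        this.listener = listener;
    }

    // Pretends to run one thread instance and reports each state change.
    void run(String threadName) {
        listener.statusChanged(threadName, "STAGED from " + jarLocation);
        listener.statusChanged(threadName, "RUNNING");
        listener.statusChanged(threadName, "DONE");
    }
}
```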