

Net Juggler Guide

Jérémie Allard - Valérie Gouranton - Loïck Lecointre - Emmanuel Melin
Université d'Orléans
Laboratoire d'Informatique Fondamentale d'Orléans (LIFO)
45067 Orleans Cedex 2, France
Bruno Raffin
Laboratoire ID-Imag
38330 Montbonnot, France
 
 
http://netjuggler.sourceforge.net

15th of June 2001 
First Revision, 9th November 2001
Second Revision, 26th of July 2002
Third Revision, 19th of September 2002



Introduction

Throughout this book we assume the reader has some experience with VR Juggler. If not, refer to the VR Juggler documentation (www.vrjuggler.org). This book is also about clusters. We define a cluster as a set of computing nodes (or hosts) connected by a network, where each node supports a single system image but the whole set of nodes does not. A set of Linux PCs connected by an Ethernet network is a cluster; an SGI Onyx is not. Net Juggler is a software layer on top of VR Juggler that turns a cluster where each node supports VR Juggler into a single VR Juggler image machine. In other words, from the user's point of view it makes (almost) no difference whether a VR Juggler application runs on a cluster, a single PC or an SGI Onyx (operationally, not in terms of performance).

A very high-quality multi-display projection or active stereo display requires the different video signals to be genlocked (video signal synchronization). Net Juggler does not include any support for genlock. If required, use appropriate hardware and/or software for genlocking the video signals (refer to netjuggler.sourceforge.net for the SoftGenLock solution).

Chapter 1 is about installing Net Juggler and running a first application. Chapter 2 details how to prepare and launch an application. Chapters [*] and [*] are for readers interested in the design and the implementation of Net Juggler.


Getting Started Guide

Hardware Requirements

Graphics Cards

Net Juggler does not require any specific graphics card. In particular, because Net Juggler implements a software swaplock (swap buffer synchronization), graphics cards do not have to support swaplock. Most of today's common 3D-accelerated graphics cards give satisfactory results. A very high-quality multi-display projection or active stereo display requires the different video signals to be genlocked (video signal synchronization). Net Juggler does not include any support for genlock. If required, use appropriate hardware and/or software for genlocking the video signals (see the SoftGenLock library, which enables genlocking and active stereo for Linux clusters: netjuggler.sourceforge.net).

Cluster Nodes

Net Juggler runs a copy of the VR Juggler application on each node of the cluster. Thus, if computation precision is not identical on each node, the copies may compute slightly different results and the data may become inconsistent. Using a cluster with identical nodes guarantees that this problem does not occur.

Network

Any kind of network can be used, provided that a communication API supported by Net Juggler is available (currently MPI) and that the performance is sufficient. Net Juggler uses synchronization barriers and data communication instructions. Barriers are mainly used for the swaplock. Because only input events are sent over the network, bandwidth should not be a limiting factor. Communication and synchronization time add extra latency to the main loop and can thus affect interactivity. We tested Net Juggler on a 4-node cluster with several networks. Performance was acceptable with the Ethernet network. With the faster networks, the extra time induced by communications and synchronizations was typically a few hundred microseconds. This is not significant compared to the tens of milliseconds required for a frame.

Software Requirements

Operating System

Net Juggler should support any operating system that VR Juggler supports. This includes IRIX, Linux, Windows, FreeBSD and Solaris. At the moment we have only tested Net Juggler with Linux and Windows. Please contact us if you successfully compile and run Net Juggler on another OS.

VR Juggler

Net Juggler uses VR Juggler to run a copy of the application on each node. Thus, VR Juggler should be installed on each node. Note that VR Juggler should be patched (patch included in the Net Juggler distribution) so that Net Juggler can be installed.

Graphics API

Net Juggler should support any graphics API that VR Juggler supports. The present version supports OpenGL and Performer, but swaplock support is not yet available for Performer.

Communication Library

Net Juggler is designed so that it can easily be ported on top of various communication libraries. Currently only MPI is supported. Thus, MPI should be installed on your cluster. MPI is widely available and has been ported to almost any kind of network. The two main MPI implementations over TCP/IP for PC clusters are MPICH (www-unix.mcs.anl.gov/mpi/mpich) and LAM/MPI (www.lam-mpi.org). We advise using LAM/MPI. It is distributed in rpm format, which eases the installation process. For specific networks, higher-performance MPI implementations may be available. Here is a non-exhaustive list of high-performance MPI implementations:

Qt

The Net Juggler GUI, NjRun, requires the Qt library (www.trolltech.com). Be sure Qt is installed on your system. Most Linux distributions (Mandrake, RedHat, ...) come with Qt.

Installing Net Juggler

Downloading and Uncompressing VR Juggler

Currently Net Juggler only works with VR Juggler 1.0.*. It does not yet support the 1.1.0 and higher releases. Download and uncompress the latest VR Juggler 1.0 source code (www.vrjuggler.org). Set the VJ_BASE_DIR (VR Juggler installation directory), JDK_HOME (Java installation directory, required by VjControl, the VR Juggler configuration program) and LD_LIBRARY_PATH (VR Juggler library directory) environment variables (see the VR Juggler documentation for more details).

Downloading and Uncompressing Net Juggler

Download the latest Net Juggler source code at netjuggler.sourceforge.net. Unpack Net Juggler:
% tar -xzvf <netjuggler-distrib.tar.gz>
If your tar version does not support unpacking gzipped tar files, execute instead:
% gunzip <netjuggler-distrib.tar.gz>
% tar -xvf <netjuggler-distrib.tar>
A new directory containing the source code is created (we call it <netjuggler_source_dir>).

Patching VR Juggler

The <netjuggler_source_dir>/patch directory contains patches for different versions of VR Juggler. Choose the patch corresponding to your VR Juggler distribution (vrjuggler-distrib.patch). If the corresponding file does not exist, please update your Net Juggler distribution, or refer to chapter 4 for a description of the modifications that must be applied to VR Juggler. Go to the VR Juggler source directory and apply the patch:
% cd <vrjuggler_source_dir>
% patch -u -p 2 -i <netjuggler_source_dir>/patch/vrjuggler-distrib.patch

Compiling and Installing VR Juggler

Compile and install VR Juggler activating the POSIX performance flag (see the VR Juggler documentation for more details):
% autoheader
% autoconf
% ./configure --enable-performance=POSIX
% make
% make install

Compiling and Installing Net Juggler

To compile Net Juggler invoke configure and make in the Net Juggler source directory:
% cd <netjuggler_source_dir>
% ./configure
% make
Several options are available with configure to customize Net Juggler. To obtain the list of these options:
% ./configure --help
To install Net Juggler:
% make install

Getting the Environment Ready for Net Juggler

The following environment variables, as well as the xhost + command, should be included in your ~/.bashrc file for a personal installation, or in a global configuration file like /etc/bashrc if you have root access and want Net Juggler to be available to all users. To apply these changes to the current shell, source the file:
% source ~/.bashrc

The VR Juggler environment variables

Check you did not forget the variables required for VR Juggler:
% VJ_BASE_DIR=<vrjuggler_install_dir> ; export VJ_BASE_DIR
% JDK_HOME=<java_install_dir> ; export JDK_HOME
% LD_LIBRARY_PATH=$VJ_BASE_DIR/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH

The NJ_BASE_DIR variable

Set the NJ_BASE_DIR environment variable to the directory where Net Juggler is installed:
% NJ_BASE_DIR=<netjuggler_install_directory> ; export NJ_BASE_DIR

The USE_NETJUGGLER variable

The USE_NETJUGGLER environment variable makes it easy to switch between a compilation for VR Juggler and a compilation for Net Juggler. See section 2.2.1 for more details. Set USE_NETJUGGLER to yes:
% USE_NETJUGGLER=yes ; export USE_NETJUGGLER

X Settings

To ensure proper window management, you must tell the X server of each PC controlling a display to grant access to clients launched from remote hosts, and make sure windows are not redirected to another display:
% xhost +
% DISPLAY=:0 ; export DISPLAY

The PATH environment variable

Net Juggler includes several utility programs (juggler-config, njrun, netjuggler-buildconfig). Be sure the directory containing these programs is included in your PATH. If not, update your PATH environment variable:
% export PATH=$NJ_BASE_DIR/bin:$PATH

The QTDIR environment variable

To find the Qt library, Net Juggler's GUI, NjRun, uses the QTDIR environment variable. Be sure to set it properly:
% export QTDIR=/usr/lib/qt2

Testing VR Juggler

To check your VR Juggler installation works properly, compile and run the cube application delivered with VR Juggler:
% cd $VJ_BASE_DIR/samples/ogl/cubes
% gmake
% ./cubes $VJ_BASE_DIR/share/Data/configFiles/simstandalone.config
Point your mouse in the control window and you should be able to control the head, the wand and the camera position (see VR Juggler documentation for more details).

Testing Net Juggler

To check your Net Juggler installation works properly, compile and run the cube application delivered with Net Juggler:
% cd $NJ_BASE_DIR/doc/netjuggler/samples/cubes
% make
To launch the application use the NjRun GUI (Fig. 1.1; see section 2.5 for more details about NjRun):
% njrun         # or $NJ_BASE_DIR/bin/njrun if not in your PATH variable
Click on the Settings button to open the Settings window. There, click on the button corresponding to the MPI implementation you are using (panel MPI to use). NjRun should have set the other parameters correctly. Close the Settings window. In the main NjRun window, add the following configuration files from the $NJ_BASE_DIR/etc/netjuggler/config directory:
cluster.base.config
cluster.display.config
cluster.netconnect.config
cluster.wand.config
Select the cube executable in the Program to execute panel:
$NJ_BASE_DIR/doc/netjuggler/samples/cubes
Add the name of your current machine to the Nodes list. Push the Launch button and the cube application should start on your machine. You can control the camera position with the mouse. To leave the control window press the Esc key. To stop the application click the Kill button of NjRun.
Figure 1.1: NjRun main window
Repeat the operation, adding one or two other nodes. When you launch the application, a display window should open on each of these machines.


User's Guide

This chapter covers all the steps required to run a VR Juggler application on a Net Juggler cluster.

Preparing a VR Juggler application for Net Juggler

Argument Parsing

In the main procedure of the VR Juggler application insert a call to vjKernel::parseArg just after the vjKernel instantiation:
vjKernel* kernel = vjKernel::instance();
kernel->parseArg(&argc, &argv);  // Added line of code
The variables argc and argv should not be used before the parseArg call.

Use VR Juggler Input Devices

Check that the application retrieves all input data through VR Juggler input devices. If not, modify the application. This is fundamental to run the application with Net Juggler. Net Juggler runs a copy of the VR Juggler application on each node of the cluster. To keep data coherent between all application instances, Net Juggler runs only one instance of each VR Juggler input device on the cluster and broadcasts the data to all the nodes. Any input event not retrieved through a VR Juggler input device cannot be intercepted and broadcast to the different nodes of the cluster, so the coherence of the displayed images cannot be guaranteed.

For example, assume the application uses time data. Once executed on a Net Juggler cluster, each copy of the application will have its own local clock, leading to potentially different time data. The different projectors will potentially display images corresponding to different times. The problem is solved by retrieving the time data through a VR Juggler input device. The Net Juggler distribution includes a VR Juggler time input device (described in the following section). It can be used as a pattern to develop other input devices, a random generator input device for example. Please refer to the VR Juggler documentation for more information about input devices.

TimeSystem

The time input device included in the Net Juggler distribution is an analog input that returns the amount of time taken by the last frame. It can be used to retrieve time data for physical simulations or other computations that modify application states based on a time delay. TimeSystem is implemented as a VR Juggler analog input device. Use a vjAnalogInterface to access it:
vjAnalogInterface mTime; // put this in your application class
In the vjApp::init(), mTime must be initialized and named:
mTime.init("Time"); // put this in your application init() method
When you need to obtain the time taken by the last frame, usually in the preFrame method, you have to use:
float dtime=mTime->getData(); // dtime contains the last frame duration 
                              // in seconds (clamped to 1)
Then use dtime in your code to update the application states.
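As an illustration of such an update, the following minimal sketch advances a simple application state by one frame. The BallState structure and the advance function are hypothetical names, not part of VR Juggler or Net Juggler; in a real application dtime would be the value read from mTime in the preFrame method.

```cpp
#include <cassert>

// Hypothetical application state: a ball moving along one axis.
struct BallState {
    float position;  // in meters
    float velocity;  // in meters per second
};

// Advance the state by one frame using the frame duration dtime (seconds).
// Because Net Juggler broadcasts the same dtime to every node, every copy
// of the application computes the same new state.
inline BallState advance(BallState s, float dtime, float acceleration) {
    assert(dtime >= 0.0f);               // a frame duration is never negative
    s.velocity += acceleration * dtime;  // update the velocity first
    s.position += s.velocity * dtime;    // then the position, with the updated
                                         // velocity (semi-implicit Euler step)
    return s;
}
```

Calling advance once per frame with the shared dtime keeps the simulated state identical on all nodes, which would not be the case if each node measured the time with its own local clock.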


Configuration Chunks for TimeSystem

TimeSystem is a standard VR Juggler analog input device, so it requires configuration chunks that must be included in a configuration file. One chunk defines the TimeSystem device and another chunk defines the associated proxy. For a VR Juggler system, the chunks are:
vjincludedescfile 
  Name "timesystem.desc" 
  end 
TimeSystem 
  Name "TimeDevice" 
  end 
AnaProxy 
  Name "Time" 
  device { "TimeDevice" } 
  unit { "0" } 
  end 
End

Compiling a VR Juggler application for Net Juggler


The juggler-config command

The juggler-config command provided with Net Juggler returns the options required to compile and link a VR Juggler application. The command
$NJ_BASE_DIR/bin/juggler-config -cflags
returns the compilation flags. The command
$NJ_BASE_DIR/bin/juggler-config -libs
returns the libraries required for linking. Adding the vrjuggler option forces juggler-config to return the flags or libraries required for a VR Juggler compilation, while the netjuggler mpi options force juggler-config to return the flags or libraries required for a Net Juggler compilation. Another way to select between VR Juggler and Net Juggler is the USE_NETJUGGLER environment variable: if it is set to yes, the default options of juggler-config are netjuggler mpi; otherwise the default is vrjuggler. Check the possible options with juggler-config -help.

Compilation by Hand

If your application only contains a few source files you can build it by directly invoking the compiler. All you need to do is to use juggler-config:
% gcc -o <app_exe> <source_files> `juggler-config -libs \
-cflags <juggler_options>` <app_options>
where:
<app_exe>
is the executable file name
<source_files>
are the application source files
<juggler_options>
are the Net/VR Juggler options
<app_options>
are the application specific options and libraries
For example, if you want to compile the "cubes" sample application ($NJ_BASE_DIR/samples/vrjuggler/cubes) for Net Juggler with MPI, use:
% gcc -o cubes cubes.cpp cubesApp.cpp `juggler-config -libs \
-cflags netjuggler mpi`
Note that juggler-config works exactly like gtk-config from GTK+.

Using a Makefile

Have a look at <netjuggler_base>/samples/vrjuggler/cubes for a sample Makefile for Net Juggler. All you need to do is to use juggler-config to define compiler flags and linker options. This can be done by adding the following lines to your Makefile:
CPPFLAGS += `juggler-config -cflags netjuggler`
LIBS += `juggler-config -libs netjuggler`

Preparing Configuration Files for Net Juggler

The VR Juggler configuration system is based on a set of files containing "chunks". These chunks describe the configuration of each system component. To run a VR Juggler application on a Net Juggler cluster, the configuration files should be modified (by directly editing the files or by using VjControl) to include extra cluster-related information. We describe the required modifications below. Also refer to the cluster-ready configuration files delivered with Net Juggler ($NJ_BASE_DIR/Data/config).

The Host Parameter

Each configuration chunk must include a Host parameter. The Host specifies the cluster node the chunk applies to. A Host parameter can take one of the following values: "All", meaning the chunk applies to every node of the cluster, or the name of a node, meaning the chunk applies to that node only. For example, a User chunk that concerns every host and a FrontDisplay chunk that concerns only the host grappe7 should be defined as:
JugglerUser 
  Name "User" 
  Host { "All" } 
  ... 
  end 
DisplaySurface 
  Name "FrontDisplay" 
  Host { "grappe7" } 
  ... 
  end 
End
The following general rules can be observed to define the Host parameters: The host names in the config files are used literally, i.e. there is no name resolution. When launching an application, Net Juggler collects the names of the different hosts with the gethostname function call. To know whether a host is concerned by a chunk of config data, Net Juggler compares the string extracted from the config data with the host name returned by the OS and with the same host name without the domain name (all characters following the first "." are removed). If there is something wrong with the host names you will get error messages like stream 1 has bad source host -1.
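The comparison rule described above can be sketched as follows. This is only an illustration of the rule as stated in this section, not the actual Net Juggler code; stripDomain and chunkAppliesTo are hypothetical names, and the host name is passed as a parameter instead of being read with gethostname.

```cpp
#include <cassert>
#include <string>

// Return the host name without its domain part: everything from the first
// '.' onward is dropped ("grappe7.example.org" -> "grappe7").
inline std::string stripDomain(const std::string& host) {
    return host.substr(0, host.find('.'));
}

// Decide whether a chunk whose Host parameter is hostParam concerns the node
// whose OS-reported name (for instance from gethostname) is localHost.
inline bool chunkAppliesTo(const std::string& hostParam,
                           const std::string& localHost) {
    assert(!localHost.empty());  // the OS is expected to report a name
    if (hostParam == "All") return true;
    // Literal comparison only: no name resolution is performed.
    return hostParam == localHost || hostParam == stripDomain(localHost);
}
```

With this rule, a chunk with Host { "grappe7" } matches a node that reports grappe7.example.org, but a chunk that spells out a domain name does not match a node that reports only the short name.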


Input Proxies

A VR Juggler application never directly accesses an input device but uses an intermediate proxy device. Net Juggler extends this approach by defining a new class of input proxies, the "shared" proxies. On a cluster, an input device, a wand for example, is connected to one node, but the data must be broadcast to all other nodes. When a "shared" proxy is encountered, Net Juggler knows that the data retrieved from that proxy must be broadcast to each node of the cluster. This solution is elegant as it requires no modification of the application code. A shared proxy chunk is similar to a standard proxy chunk except that the chunk name is prefixed by Shared. The Host parameter of a shared proxy chunk is interpreted as the source of the shared data. Thus, the Host parameter must be the same as the Host parameter of the associated input device. For example, the following chunks define a shared proxy for the TimeSystem input device running on pc1 (see the TimeSystem section above for more details).
vjincludedescfile 
  Name "timesystem.desc" 
  end 
TimeSystem 
  Name "TimeDevice" 
  Host { "pc1" } 
  end 
SharedAnaProxy 
  Name "Time" 
  Host { "pc1" } 
  device { "TimeDevice" } 
  unit { "0" } 
  end 
End
Note that standard (non-shared) input proxies can still be useful on a cluster. For example, a keyboard can be associated with a node only to change the viewport of the display associated with that node. In that case, the keyboard proxy should not be shared.

Template configuration files

The template configuration files are generic configuration files where host parameters are set to template host names like @pc1@, @pc2@, .... Net Juggler comes with two utility programs, netjuggler-buildconfig and njrun, that use these template configuration files to generate configuration files containing the actual names of the hosts of your cluster. Working with template configuration files saves time when moving between different clusters. The configuration files distributed with Net Juggler are template files. The njrun utility is described in section 2.5.

By default netjuggler-buildconfig takes the template files from the $NJ_BASE_DIR/etc/netjuggler/config directory and creates configuration files in the $NJ_BASE_DIR/etc/netjuggler directory. Template host names are replaced by real host names based on the data contained in the $NJ_BASE_DIR/etc/netjuggler/hosts.txt file. The default behavior of netjuggler-buildconfig can be modified with the -list (file of template/actual host associations), -source (template directory) and -destination (generated configuration files) options. Use netjuggler-buildconfig -help to obtain the full list of available options.
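The substitution step can be pictured with the following minimal sketch. It is not the netjuggler-buildconfig implementation; instantiateTemplate is a hypothetical function, and the map of template names to actual names stands in for the content of hosts.txt, whose file format is not described here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Replace every template host name such as "@pc1@" found in text by the
// actual host name given in hosts. Tags with no entry are left untouched.
inline std::string instantiateTemplate(
        std::string text, const std::map<std::string, std::string>& hosts) {
    for (const auto& entry : hosts) {
        const std::string& tag = entry.first;    // e.g. "@pc1@"
        const std::string& name = entry.second;  // e.g. "grappe7"
        assert(!tag.empty());  // an empty tag would match everywhere
        std::string::size_type pos = 0;
        while ((pos = text.find(tag, pos)) != std::string::npos) {
            text.replace(pos, tag.size(), name);
            pos += name.size();  // continue after the inserted name
        }
    }
    return text;
}
```

Applied to each line of a template file, this turns Host { "@pc1@" } into Host { "grappe7" } while leaving chunks with Host { "All" } unchanged.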

Launching an Application

We detail here how to launch an application by directly calling the MPI launcher script, mpirun. However, we advise using the NjRun utility, which hides many tedious details you otherwise have to take care of. See section 2.5 for more details. MPI implementations generally include an mpirun script. The arguments of the mpirun script must include the VR Juggler application you want to execute, the number of processes you want to execute (generally one per node) and the configuration files. Note that mpirun is not standardized: the syntax may vary from one implementation to another. For example, the mpirun command delivered with the MPICH implementation is used as follows to launch the "cubes" application on 4 nodes using the cluster configuration files:
  cd $NJ_BASE_DIR/doc/netjuggler/samples/cubes
  mpirun -np 4 cubes \
      $NJ_BASE_DIR/etc/netjuggler/cluster.base.config \
      $NJ_BASE_DIR/etc/netjuggler/cluster.netconnect.config \
      $NJ_BASE_DIR/etc/netjuggler/cluster.displays.config \
      $NJ_BASE_DIR/etc/netjuggler/cluster.wand.mixin.config
To run the application in simulator mode, change the configuration files:
  mpirun -np 4 cubes \
      $NJ_BASE_DIR/Data/config/simstandalone.config
Here are some essential mpirun arguments:
-np <num>: Number of processes to launch.
-machinefile <file>: Configuration file containing the cluster node list.
-nolocal: Do not launch a process on the local host.

Use a Control Console

A convenient setup for running a VR Juggler application on a PC cluster is to have an extra PC that you can use as a control console. The example below assumes four display PCs (pc1 to pc4) plus the console PC. Start Linux on your 5 PCs. Start X on the console PC only. From that PC, remotely launch X on the other PCs (use one xterm for each PC):
% rsh pc1 X
% rsh pc2 X
% rsh pc3 X
% rsh pc4 X
This way you can easily control the launching of X and see error messages if X is not set up correctly. In a fifth xterm you can launch your VR application:
% mpirun -np 4 -nolocal -machinefile mpi.conf cubes ...
where mpi.conf should look like:
pc1
pc2
pc3
pc4
The console PC can also be used to launch VjControl, the VR Juggler configuration tool, to control the cluster configuration dynamically.


NjRun

NjRun lets you launch a VR Juggler application on a Net Juggler cluster with a few clicks. NjRun takes the list of nodes, the MPI library and the template configuration files specified by the user. It generates the appropriate files and scripts and launches the application. NjRun is programmed in C++ with the Qt library (www.trolltech.com), which makes it usable in Win32 and POSIX environments. All the values you enter are saved in a NjRun.conf file located in your HOME directory under POSIX and in "Application Data" in your profile under Win32. A history of your previous values is also kept, for convenient and fast access to your favorite programs or MPI implementations.

NjRun Launching

To launch NjRun (Fig. 1.1):
% njrun         # or $NJ_BASE_DIR/bin/njrun if not in your PATH variable

NjRun Set Up

Before running your application, you need to set up some parameters. Click on the "Settings" button to open the appropriate window (Fig. 2.1).
Figure 2.1: NjRun settings window

NjRun Main Operating Window

The main window of NjRun is divided into five main parts:

Advanced Features

MPI templates

There are several MPI templates included with NjRun, but you may be interested in creating your own. Go to the templates directory. The MPI templates are the *.mpi files. If you look at the existing ones, you will notice that each template is made of three lines:

First line: the first line of the template should be:
@pc@ 2

Second line: 'pre' argument, i.e. mpirun arguments to place before the call to the executable. This is needed, for example, with LAM/MPI if you wish to optimize network accesses using the -c2c argument. In that case the second line should be:
-c2c
Of course, you can specify several arguments on this line.

Third line: 'post' argument, i.e. mpirun arguments to place after the call to the executable and its own arguments. This is useful, for example, with the MPD daemon of MPICH when you want to specify environment variables to define on each node, using -MPDENV-:
-MPDENV- VJ_BASE_DIR=/usr/local

Juggle with several configurations

To ease the frequent use of several Net Juggler programs, NjRun can load and save different configuration sets. To do this, use the corresponding buttons in the toolbar, which let you save the program configuration to another file of your choice (rather than .NjRun.conf in your HOME directory). These files include all the settings at the moment of saving, including the history.

Override Proxies / Host association

By default, the replacement of tags in template configuration files is made according to the order of the nodes in the nodes list. What if you want to assign a particular proxy (e.g. the keyboard) to a specific node without editing the template files? You can force this association by checking "Force hosts for input and display proxies". The list contains all the proxy names that have been parsed in the configuration files. When you select a proxy and a node on the left and then click the "Link" button, you force the node to manage the proxy, whatever the position of the node in its list. To restore a proxy name (remove the link), just double-click on it.

You can also specify which proxies have to be parsed by clicking the "Proxies List" button. A window appears where you can add, remove or edit elements of the list by double-clicking on them.


Design Guide

Net Juggler was developed with the following goals: We present in this chapter how Net Juggler was designed to meet these goals.

General Design Choices

Parallelization Paradigm

To run a VR Juggler application on a cluster we adopted a simple parallelization paradigm: each node of the cluster runs its own copy of the application with its own local parameters, such as the viewport. Obviously, input devices are not duplicated. Thus, to ensure data consistency across the different copies, input events are broadcast to each node. This parallelization can easily be hidden from the user, is scalable, and keeps the amount of data to communicate small. The main drawback is that it can lead to redundant computations. Future work will address this problem.

Sharing Inputs

The user of a VR application needs different input devices, such as gloves, keyboards or trackers, to interact in real time with the application. VR Juggler collects these inputs and forwards them to the application. The approach is the same with Net Juggler, except that a given input device is connected to one given node only. Consequently, Net Juggler must get the inputs from the device and broadcast the collected data to each node of the cluster.

Proxies and Inputs

Let us explain more specifically how Net Juggler gets the input data and how it broadcasts them. VR Juggler manages each input through a driver (vjInput class). This driver is connected to a proxy (vjProxy class) that forwards the data to the application. We could use specific drivers to transmit data: we would associate a server input driver with the node the device is connected to, and a client input driver with the other nodes. The main advantage is that it is very easy to add new drivers to VR Juggler. We would just need to instantiate a client class and a server class for each kind of input driver. The drawback is that every single device driver would require a client and a server input driver, which may be pretty laborious. We did not adopt this solution, but we transposed it to the proxy level. Instead of having client and server input drivers, we have client and server proxies. Proxies provide an abstraction of input drivers, so their number is limited and should not increase significantly in the future. This approach only requires modifying the vjProxy class in VR Juggler so that we can derive from it. Also note that a VR Juggler proxy stores a pointer to its input driver, which is used to detect whether the driver is connected. A server proxy does the same. For a client proxy the pointer is set to null.
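The client/server proxy idea can be illustrated with the following simplified sketch. It does not reproduce the vjProxy or vjInput interfaces; AnalogDriver and SharedAnalogProxy are hypothetical stand-ins, and the broadcast itself is reduced to passing a value from the server proxy to a client proxy.

```cpp
#include <cassert>

// Hypothetical stand-in for an input driver attached to one node.
struct AnalogDriver {
    float value;
    float read() const { return value; }
};

// Hypothetical shared proxy: a server proxy holds a driver pointer,
// a client proxy holds a null pointer.
class SharedAnalogProxy {
public:
    explicit SharedAnalogProxy(AnalogDriver* driver)
        : mDriver(driver), mData(0.0f) {}

    bool isServer() const { return mDriver != nullptr; }

    // Server side: sample the local driver and return the value to broadcast.
    float sample() {
        assert(isServer());
        mData = mDriver->read();
        return mData;
    }

    // Client side: store the value received from the server node.
    void receive(float data) {
        assert(!isServer());
        mData = data;
    }

    // What the application reads; identical on every node after the broadcast.
    float getData() const { return mData; }

private:
    AnalogDriver* mDriver;  // null on client nodes
    float mData;
};
```

The application only calls getData, so it cannot tell whether it runs on the node that owns the device or on a node that received the value over the network.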

Configuration Management

System Configuration

The system configuration is very important in VR Juggler. It can be controlled by files given when starting the program, or by requests sent during the execution from VjControl. The system configuration is seen as a list of chunks, each chunk holding information about one part of the system (display, input, ...). One goal of Net Juggler is to use only one global configuration for the whole cluster, while still allowing nodes to have different configurations (different viewports, for example). We add a "Host" parameter to each configuration chunk, which can be equal to "All" or to a node name. It indicates whether the chunk applies to all nodes of the cluster or only to the specified node. The chunk associated with each client/server proxy pair is named by prefixing the regular VR Juggler proxy name with "Shared". The "Host" parameter then has a different meaning: it indicates the node that runs the server proxy, all the other nodes having a client proxy.

Processing Configuration Chunks

Configuration chunks are stored in a database on each node before being transmitted to VR Juggler. We want each node to know the whole cluster configuration, to avoid centralizing configuration information on one specific node and to avoid handling scattered chunks when the user asks for the configuration. Each node has a configuration filter that selects the chunks that must be applied locally.
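A configuration filter of this kind could look like the sketch below. It is an assumption-laden illustration and not the Net Juggler filter: Chunk, isShared and filterForNode are hypothetical names, host names are compared literally, and shared proxy chunks are kept on every node because each node needs either the server or a client proxy.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced view of a configuration chunk.
struct Chunk {
    std::string type;  // e.g. "DisplaySurface" or "SharedAnaProxy"
    std::string name;  // e.g. "FrontDisplay"
    std::string host;  // "All" or a node name
};

// Shared proxy chunks are recognized by their "Shared" prefix.
inline bool isShared(const Chunk& c) {
    return c.type.compare(0, 6, "Shared") == 0;
}

// Select, from the whole cluster configuration, the chunks that the node
// named node must apply locally.
inline std::vector<Chunk> filterForNode(const std::vector<Chunk>& all,
                                        const std::string& node) {
    assert(!node.empty());
    std::vector<Chunk> local;
    for (const Chunk& c : all) {
        // For a shared proxy, Host names the server node, but every node
        // still needs the chunk to create its client proxy.
        if (isShared(c) || c.host == "All" || c.host == node) {
            local.push_back(c);
        }
    }
    return local;
}
```

Every node runs the same filter on the same full list of chunks, so no node has to act as a central configuration holder.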

Dynamic Configuration

To dynamically configure VR Juggler, VjControl connects to VR Juggler through a TCP connection and sends configuration requests to vjConfigManager. We extend this concept to Net Juggler. VjControl can connect to any node of the cluster running a configuration server. Configuration requests are intercepted and broadcast to all nodes before being stored in each local database and forwarded to the configuration filter. Note that we keep two open ports per node: the "old" VR Juggler port opened by the environment manager, and the Net Juggler port. The global cluster configuration can be obtained and modified by connecting VjControl to the Net Juggler port. Through the VR Juggler port only local node information can be retrieved. This is convenient for debugging or for retrieving performance data. However, this connection should not be used to modify the node configuration.

Communications

Communications are required to broadcast configuration requests and input data. For performance reasons these data transfers must be carefully managed.

Streams

We use and extend the classical stream paradigm to represent data communication between nodes. There is one stream per server proxy and one per configuration server. A stream is associated with a specific source node and can have several destination nodes. Each stream is identified by a unique id number and can be created, deleted or modified at run-time. The abstraction level provided by the streams hides the actual data movements that take place at a lower level.

Messages

Data communications take place only once per frame. When a node writes into a stream, it builds a message containing the data and appends it to the buffer of pending messages. When the communication actually takes place, each node broadcasts its buffer to every other node. This collective communication operation is usually called an allgather. The adopted semantics for the allgather requires all nodes to know the size of the messages they will receive. Configuration events, however, can take place at any time and cause buffers to have an unpredictable size. For this reason, when the allgather is executed, it sends the input data and a special message indicating the size of the reconfiguration data. If this size is different from 0, a second communication step is triggered to send the list of reconfiguration messages.
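The two-step exchange can be sketched with an in-process simulation. Nothing here is Net Juggler code: NodeOutput, FrameResult, simulatedAllGather and exchangeFrame are hypothetical names, and the allgather is replaced by a local stand-in that only checks that the sizes were known in advance.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// What one node contributes during a frame (hypothetical layout).
struct NodeOutput {
    std::string input;     // input data, size known to every node in advance
    std::string reconfig;  // reconfiguration messages, unpredictable size
};

// What every node holds after the exchange.
struct FrameResult {
    std::vector<std::string> inputs;
    std::vector<std::string> reconfigs;
    int steps = 0;  // number of collective steps performed
};

// In-process stand-in for an allgather: every node receives every buffer.
// A real implementation would call the communication library here.
std::vector<std::string> simulatedAllGather(
        const std::vector<std::string>& sendBufs,
        const std::vector<std::size_t>& expectedSizes) {
    assert(sendBufs.size() == expectedSizes.size());
    for (std::size_t i = 0; i < sendBufs.size(); ++i) {
        assert(sendBufs[i].size() == expectedSizes[i]);  // sizes must be known
    }
    return sendBufs;
}

FrameResult exchangeFrame(const std::vector<NodeOutput>& nodes,
                          const std::vector<std::size_t>& inputSizes) {
    FrameResult r;
    // Step 1: input data, plus the size of the pending reconfiguration data
    // (modelled here as a separate vector instead of a trailing message).
    std::vector<std::string> inputs;
    std::vector<std::size_t> reconfigSizes;
    bool anyReconfig = false;
    for (const NodeOutput& n : nodes) {
        inputs.push_back(n.input);
        reconfigSizes.push_back(n.reconfig.size());
        anyReconfig = anyReconfig || !n.reconfig.empty();
    }
    r.inputs = simulatedAllGather(inputs, inputSizes);
    r.steps = 1;
    r.reconfigs.assign(nodes.size(), std::string());
    // Step 2: only triggered when some announced size is different from 0.
    if (anyReconfig) {
        std::vector<std::string> reconfigs;
        for (const NodeOutput& n : nodes) reconfigs.push_back(n.reconfig);
        r.reconfigs = simulatedAllGather(reconfigs, reconfigSizes);
        r.steps = 2;
    }
    return r;
}
```

In the common case where no configuration event occurred during the frame, the sketch performs a single collective step.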

Network API

To ease porting to different communication libraries, Net Juggler has a communication interface that hides the library used.

Starting the Application

VR Juggler triggers the following sequence of actions when launched: the config files are loaded, next the kernel starts, and only then is the application associated with the kernel. Though rarely used, it should also be possible to change the application at run-time. Net Juggler reuses the same sequence of actions. To ensure that the application is started on each node with the same context (same configuration and same input data), a synchronization barrier is required.

Net Juggler Architecture

Net Juggler architecture is organized as follows (figure 3.1):

[Figure: UML diagram of the Net Juggler modules (UML/uml_modules.eps)]



The role of the different modules is:

NetKernel

UML Specification



[Figure: NetKernel UML diagram (UML/uml_netkernel.eps)]



Description

Remarks

Deriving from the vjKernel class makes it possible to add the functionality required by Net Juggler. This approach enforces modularity but requires modifying vjKernel and the singleton system (see section 4). The NetAPI can be seen as a manager. It is initialized and controlled by the NetKernel. It does not interact directly with other managers, in keeping with the micro-kernel organization.

NetConfigManager

UML Specification



[Figure: NetConfigManager UML diagram (UML/uml_netconfigmanager.eps)]



Description

Remarks

Each node must store the current cluster configuration to answer VjControl requests. NetConfigManager has the same methods as vjConfigManager, but the former holds the cluster configuration and the latter the local node configuration. NetConfigManager does not filter the pending chunks, so as not to create a dependency on the NetStreamManager, which would conflict with the micro-kernel architecture. Filtering takes place in the NetKernel::checkForReconfig() method.

NetStreamManager

UML Specification



[Figure: NetStreamManager UML diagram (UML/uml_netstreammanager.eps)]



Description

Remarks

The class NetStreamFactory is similar to a VR Juggler factory. The NetStreamManager manages streams and is also responsible for filtering stream configuration chunks.

NetInputStream

UML Specification



[Figure: NetInputStream UML diagram (UML/uml_netstreaminput.eps)]



Description

Remarks

The vjServerProxy and vjClientProxy classes are templates that can be used for any type of proxy (vjAnalogProxy in this example).

NetConfigStream

UML Specification



[Figure: NetConfigStream UML diagram (UML/uml_netconfigstream.eps)]



Description

NetMessage

UML Specification



[Figure: NetMessage UML diagram (UML/uml_netmessage.eps)]



Description

Remarks

The buffer is the space reserved to store a message. A message can be a concatenation of smaller messages. The methods of NetMsgList hide the details of reading and writing a message from a list (or concatenation) of messages. Message copies can significantly affect communication performance, in particular for large messages (what is considered large depends on the network). Specific protocols have been developed to avoid message copies. So as not to limit the benefits of such protocols, Net Juggler should also avoid message copies, even if message size is typically small (a few hundred Kbytes).
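One way to read messages from a concatenation without copying their payload is to hand out views into the buffer. The following is a minimal sketch under assumptions of our own: MsgView, MsgListReader and appendMessage are hypothetical names, and the length-prefixed layout is chosen for the example, not taken from NetMsgList.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical non-owning view of one message inside a larger buffer.
struct MsgView {
    const char* data;
    std::size_t size;
};

// Hypothetical reader over a concatenation of messages. Each message is
// stored as a size_t length followed by that many payload bytes.
class MsgListReader {
public:
    MsgListReader(const char* buffer, std::size_t size)
        : cur_(buffer), end_(buffer + size) {
        assert(buffer != nullptr || size == 0);
    }
    // Returns true and fills 'out' with a view of the next message.
    // The payload is not copied: 'out.data' points into the buffer.
    bool next(MsgView& out) {
        if (static_cast<std::size_t>(end_ - cur_) < sizeof(std::size_t)) {
            return false;
        }
        std::size_t len;
        std::memcpy(&len, cur_, sizeof len);  // only the header is copied
        cur_ += sizeof len;
        if (static_cast<std::size_t>(end_ - cur_) < len) {
            return false;  // truncated buffer
        }
        out = MsgView{cur_, len};
        cur_ += len;
        return true;
    }
private:
    const char* cur_;
    const char* end_;
};

// Appends one length-prefixed message to the buffer of pending messages.
// This is the single copy made on the sending side in this sketch.
void appendMessage(std::vector<char>& buffer, const char* data, std::size_t size) {
    const char* header = reinterpret_cast<const char*>(&size);
    buffer.insert(buffer.end(), header, header + sizeof size);
    buffer.insert(buffer.end(), data, data + size);
}
```

The views stay valid only as long as the underlying buffer is neither resized nor released, which is the usual price of avoiding copies.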

NetAPI

UML Specification



[Figure: NetAPI UML diagram (UML/uml_netapi.eps)]



Description

Remarks

Splitting the allgather into an initialization function InitGather and a communication function AllGather avoids repeating unnecessary initializations. The Init function may contain the code necessary to build a database storing the correspondence between node ranks and node names. This database is then accessed using the getRank and getName methods.
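The shape of such an interface can be sketched as an abstract class with one implementation per communication library. The method names below follow the remarks above, but the signatures are assumptions, and NetApiSketch and LoopbackApi are hypothetical classes (the latter is a single-node stand-in that needs no network library).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical communication interface; signatures are assumptions.
class NetApiSketch {
public:
    virtual ~NetApiSketch() {}
    // Builds the table mapping node ranks to node names.
    virtual void init() = 0;
    virtual int getRank(const std::string& name) const = 0;
    virtual std::string getName(int rank) const = 0;
    // Fixes the per-node message sizes once, ahead of repeated exchanges.
    virtual void initGather(const std::vector<std::size_t>& sizes) = 0;
    // Exchanges one buffer per node using the sizes set by initGather.
    virtual std::vector<std::string> allGather(const std::string& localBuf) = 0;
};

// Single-node stand-in, useful for testing without any network library.
class LoopbackApi : public NetApiSketch {
public:
    explicit LoopbackApi(const std::string& host) : host_(host) {}
    void init() override { names_.assign(1, host_); }
    int getRank(const std::string& name) const override {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return static_cast<int>(i);
        }
        return -1;  // unknown node
    }
    std::string getName(int rank) const override {
        assert(rank >= 0 && static_cast<std::size_t>(rank) < names_.size());
        return names_[static_cast<std::size_t>(rank)];
    }
    void initGather(const std::vector<std::size_t>& sizes) override {
        sizes_ = sizes;
    }
    std::vector<std::string> allGather(const std::string& localBuf) override {
        // The size check stands for what a real library would rely on.
        assert(sizes_.size() == 1 && localBuf.size() == sizes_[0]);
        return std::vector<std::string>(1, localBuf);
    }
private:
    std::string host_;
    std::vector<std::string> names_;
    std::vector<std::size_t> sizes_;
};
```

Code written against the abstract class does not need to change when another implementation, for instance one based on MPI, is substituted.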

Sequence Diagrams

This section shows the calling order of the main Net Juggler methods.

Data Exchange

The main function of the application launches InitConfig(), which sets up the NetAPI and the NetStreamManager. The NetStreamManager initializes the AllGather() parameters. At each iteration of the main loop, the kernel calls the shareData() function, which is divided into 3 steps: each source node stores in a buffer the concatenation of the messages to send; the allgather communication takes place; each destination node reads the received data.

[Figure: sequence diagram of the data exchange during a frame (UML/uml_frame.eps)]



Configuration Chunk Handling

Before configuration chunks are passed to VR Juggler, they go through a configuration filter implemented in NetKernel::checkForReconfig. The filter detects stream chunks and filters out non-local chunks depending on the host parameter. The following diagram shows the filter main loop:

[Figure: sequence diagram of the configuration filter main loop (UML/uml_configfilter.eps)]



For each chunk processed, 3 cases are possible. They are described in the following sections.

Stream Chunk Processing

Stream chunks are associated with shared objects, for example a shared proxy. The configuration filter must first recognize this kind of chunk. The chunk is then passed to NetStreamManager to create the stream and generate two chunks, one for the client and one for the server. These chunks are added to the list of chunks to be processed. They are not directly passed to VR Juggler as they may not be local. For example the server is only instantiated on one host. The initial stream chunk is then added to the current cluster configuration.

[Figure: sequence diagram of stream chunk processing (UML/uml_chunkstream.eps)]



Local Chunk Processing

To detect a local chunk, the filter uses the isLocal method, passing as argument the host parameter of the chunk. If the host parameter corresponds to "All" or to the local host name, the chunk is added to the pending chunk list of VR Juggler's vjConfigManager. It is also added to the current cluster configuration.

[Figure: sequence diagram of local chunk processing (UML/uml_chunklocal.eps)]



Non Local Chunk Processing

A non-local chunk is a chunk that failed the preceding tests. If this chunk name also appears in the current local VR Juggler configuration, the chunk has moved to a distant node and must be removed from the local VR Juggler configuration. It is then added to the current cluster configuration.

[Figure: sequence diagram of non-local chunk processing (UML/uml_chunknonlocal.eps)]
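The three cases handled by the filter can be summarized in a simplified sketch. Chunk, FilterState and filterChunk are hypothetical names; only isLocal is named in the text, and its signature here is an assumption. The sketch records decisions in plain lists instead of calling VR Juggler, and it leaves out the client and server chunks generated by the stream manager.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical chunk, reduced to what the filter needs.
struct Chunk {
    std::string name;
    std::string host;  // "All" or a node name
    bool isStream;     // true for stream chunks
};

// Hypothetical record of the filter decisions on one node.
struct FilterState {
    std::string localHost;
    std::set<std::string> localConfig;       // chunks active in VR Juggler
    std::vector<std::string> toVrJuggler;    // passed on as pending chunks
    std::vector<std::string> toStreamMgr;    // handed to the stream manager
    std::vector<std::string> removed;        // removed from the local config
    std::vector<std::string> clusterConfig;  // every chunk ends up here
};

bool isLocal(const FilterState& s, const std::string& host) {
    return host == "All" || host == s.localHost;
}

void filterChunk(FilterState& s, const Chunk& c) {
    assert(!c.name.empty());
    if (c.isStream) {
        // Case 1: stream chunk, the stream manager creates the stream.
        s.toStreamMgr.push_back(c.name);
    } else if (isLocal(s, c.host)) {
        // Case 2: local chunk, forwarded to VR Juggler.
        s.toVrJuggler.push_back(c.name);
        s.localConfig.insert(c.name);
    } else if (s.localConfig.erase(c.name) == 1) {
        // Case 3: non-local chunk that used to be local, so it moved away.
        s.removed.push_back(c.name);
    }
    // In all three cases the chunk joins the cluster configuration.
    s.clusterConfig.push_back(c.name);
}
```

A chunk named "wall" first configured for node1 and later reassigned to node2 is thus added on node1 by the first call and removed from node1 by the second.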




Implementation Guide

VR Juggler modifications

This section details the modifications VR Juggler requires to support Net Juggler. A patch that spares you from applying them by hand is included in the Net Juggler distribution (see section [*]). These modifications do not affect the overall VR Juggler architecture. In the future, they may be directly included in the main VR Juggler distribution.

Derived Classes

Whenever possible, Net Juggler derives from VR Juggler classes instead of directly modifying the VR Juggler code. This requires the modified methods to be declared virtual, which is not always the case (no one ever thought that vjKernel::checkForReconfig could be overridden). The affected classes are:

Chunk Type Checking in the vj*Proxy Class

The vj*Proxy::config method checks chunk types. Because Net Juggler defines two new proxy types (client proxy and server proxy), the test must be modified accordingly. For example the vjAnalogProxy test must be modified as follows:
vjASSERT(((std::string)chunk->getType()) == "AnaProxy"
      || ((std::string)chunk->getType()) == "AnaClientProxy"
      || ((std::string)chunk->getType()) == "AnaServerProxy");

VR Juggler 1.0 Related Issues

The VR Juggler 1.0 implementation leads to compilation problems when proxies and input devices are registered externally to VR Juggler. These problems are related to the constructors of the template classes vjDeviceConstructor and vjProxyConstructor, which are defined in the .cpp files and not in the .h files. To compile Net Juggler you need to move the corresponding code into the .h files (Input/InputManager/vjProxyFactory.h and Input/InputManager/vjDeviceFactory.h). The InputManager produces an error when a proxy is added without an attached input device. Net Juggler requires such a possibility because client proxies are not attached to an input device. The methods vjInputManager::add*Proxy in the Input/InputManager/InputManager.cpp file must be modified to accept stupified proxies. On Win32 systems, the vjTimeStamp class is not implemented, hence the Net Juggler timer cannot work. You must add an empty diff method in vjTimeStampNone (file Performance/vjTimeStampNone.h) to be able to compile:
//: returns 0.0
inline float diff (const vjTimeStampNone& t2) const {
   return 0.0;
}

Calling a Derived Class Constructor

We describe the method we chose to invoke a derived class constructor without naming it explicitly, which improves code modularity. For the sake of clarity we concentrate on Net Juggler kernel creation, but this method also applies to other classes, for example the NetAPI and NetAPI_MPI classes. The vjKernel class is called a "singleton" because only one instance of that class can be created. This is achieved by hiding the call to the constructor in an instance method. This method creates the kernel instance if it does not already exist, and returns the instance address. Because a VR Juggler application needs a pointer to the kernel instance, it initializes such a pointer with the instance method:
vjKernel* kernel = vjKernel::instance();
On a Net Juggler cluster, an instance of the NetKernel is required instead. A solution would be to modify each application to call the instance method of the NetKernel class:
vjKernel* kernel = NetKernel::instance();
To avoid such a modification, we take advantage of the singleton system and modify its implementation (see Utils/vjSingleton.h for the original VR Juggler singleton implementation and below page [*] for the modified version). The idea is the following: instead of calling the vjKernel constructor, the instance method uses a pointer, sInstanceConstructor, to an active constructor. This pointer points to vjKernel's constructor if the vjKernel class is not derived, and to NetKernel's constructor if NetKernel derives from vjKernel. The NetKernel class is a "derived singleton". It has a specific method, registerSingleton, that sets the sInstanceConstructor pointer of the base class. Because the sInstanceConstructor pointer must be set before the instance of NetKernel is created, the NetKernel class has a static variable isRegistered. Initializing this variable changes the constructor pointed to by sInstanceConstructor. Singletons are used for other classes, like vjDeviceFactory, so it is important that our implementation stays compatible with the VR Juggler singleton system: if no derived class is provided, sInstanceConstructor should point to the base constructor. To that end, vjKernel initializes the static variable sInstanceConstructor with vjKernel's constructor. We now have to make sure that isRegistered is initialized after sInstanceConstructor, so that sInstanceConstructor is set to the expected constructor if a derived class is provided. We force a proper order by initializing sInstanceConstructor with a pointer assignment while isRegistered is initialized with a function call. Compilers first initialize simple variables, for example those without constructors or function calls, and then complex ones (in C++ terms, static initialization with a constant expression precedes dynamic initialization). All compilers we are working with respect this initialization order, but others may not. Please contact us if you encounter such a situation. We also defined an "abstract singleton". An abstract singleton differs from a normal singleton in that it cannot be instantiated unless a non-abstract derived class is defined (see page [*]). The abstract singleton is required by NetAPI, the Net Juggler class defining the network interface.
#define vjSingletonHeader( TYPE )                    \ 
protected:                                           \ 
   typedef TYPE *vjSingletonPtr;                     \ 
   typedef TYPE vjSingletonBase;                     \ 
   typedef vjSingletonPtr vjSingletonConstructor();  \ 
   static vjSingletonConstructor *sInstanceConstructor; \ 
   static vjSingletonPtr constructor( void );        \ 
public:                                              \ 
   static TYPE* instance( void ) 
  
#define vjDerivedSingletonHeader( TYPE )             \ 
protected:                                           \ 
   static vjSingletonPtr constructor();              \ 
   static bool registerSingleton();                  \ 
   static bool isRegistered;                         \ 
public:                                              \ 
   static TYPE* instance( void ) 
  
#define vjAbstractSingletonHeader( TYPE )            \ 
protected:                                           \ 
   typedef TYPE *vjSingletonPtr;                     \ 
   typedef TYPE vjSingletonBase;                     \ 
   typedef vjSingletonPtr vjSingletonConstructor();  \ 
   static vjSingletonConstructor *sInstanceConstructor; \ 
public:                                              \ 
   static TYPE* instance( void ) 
  
#define vjSingletonImp( TYPE )                       \ 
   TYPE::vjSingletonConstructor *TYPE::sInstanceConstructor=TYPE::constructor; \ 
   TYPE::vjSingletonPtr TYPE::constructor( void )    \ 
   {  return new TYPE; }                             \ 
   TYPE* TYPE::instance( void )                      \ 
   {                                                 \ 
      static vjMutex singleton_lock1;                \ 
      static TYPE* the_instance1 = NULL;             \ 
                                                     \ 
      if (the_instance1 == NULL)                     \ 
      {                                              \ 
         vjGuard<vjMutex> guard( singleton_lock1 );  \ 
         if (the_instance1 == NULL)                  \ 
         /*{ the_instance1 = new TYPE; }*/           \ 
         { the_instance1 = sInstanceConstructor(); } \ 
      }                                              \ 
      return the_instance1;                          \ 
   } 
  
#define vjAbstractSingletonImp( TYPE )               \ 
   TYPE::vjSingletonConstructor *TYPE::sInstanceConstructor=NULL; \ 
   TYPE* TYPE::instance( void )                      \ 
   {                                                 \ 
      static vjMutex singleton_lock1;                \ 
      static TYPE* the_instance1 = NULL;             \ 
                                                     \ 
      if (the_instance1 == NULL)                     \ 
      {                                              \ 
         vjGuard<vjMutex> guard( singleton_lock1 );  \ 
         if (the_instance1 == NULL)                  \ 
         /*{ the_instance1 = new TYPE; }*/           \ 
         { the_instance1 = sInstanceConstructor(); } \ 
      }                                              \ 
      return the_instance1;                          \ 
   } 
  
#define vjDerivedSingletonImp( TYPE )                \ 
   TYPE::vjSingletonPtr TYPE::constructor( void )    \ 
   {  return new TYPE; }                             \ 
   bool TYPE::registerSingleton()                    \ 
   {                                                 \ 
     printf("registering singleton " #TYPE "\n");    \ 
     sInstanceConstructor=TYPE::constructor;         \ 
     return true;                                    \ 
   }                                                 \ 
   bool TYPE::isRegistered=TYPE::registerSingleton(); \ 
   TYPE* TYPE::instance( void )                      \ 
   {                                                 \ 
      return static_cast<TYPE*>(vjSingletonBase::instance()); \ 
   }
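The registration mechanism implemented by the macros above can be reproduced in a few lines without macros, which may make the initialization order easier to follow. This is a sketch, not the Net Juggler code: Kernel and NetKernelSketch are hypothetical stand-ins for vjKernel and NetKernel, and a function-local static replaces the vjMutex based locking.

```cpp
#include <cassert>
#include <string>

// Stand-in for the base singleton (hypothetical, not vjKernel itself).
class Kernel {
public:
    typedef Kernel* Constructor();
    virtual ~Kernel() {}
    virtual std::string kind() const { return "base"; }
    static Kernel* instance() {
        // A null pointer here corresponds to the "abstract singleton" case.
        assert(sInstanceConstructor != nullptr);
        // Function-local static: created once, on first call.
        static Kernel* theInstance = sInstanceConstructor();
        return theInstance;
    }
protected:
    Kernel() {}
    static Kernel* construct() { return new Kernel; }
    static Constructor* sInstanceConstructor;
};

// Initialized with a constant expression (the address of a function), so
// this happens before any dynamic initialization.
Kernel::Constructor* Kernel::sInstanceConstructor = Kernel::construct;

// Stand-in for the derived singleton (hypothetical, not NetKernel itself).
class NetKernelSketch : public Kernel {
public:
    std::string kind() const override { return "net"; }
protected:
    static Kernel* construct() { return new NetKernelSketch; }
    static bool registerSingleton() {
        sInstanceConstructor = NetKernelSketch::construct;
        return true;
    }
    static bool isRegistered;
};

// Initialized with a function call (dynamic initialization): it runs after
// the pointer above has been set and replaces the active constructor.
bool NetKernelSketch::isRegistered = NetKernelSketch::registerSingleton();
```

An application that only calls Kernel::instance() then receives a NetKernelSketch object as soon as the derived class is linked in, and a plain Kernel otherwise.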

Swaplock Support

For proper display synchronization, all nodes should synchronize before swapping their frame buffers (swaplock). VR Juggler does not include any swaplock support. It assumes that the underlying system is responsible for swap synchronization. This is for example the case on an SGI Onyx system. Net Juggler is aimed at running VR applications on machines built from commodity components that usually do not support swaplock, so Net Juggler includes software swaplock support. VR Juggler rendering occurs as follows:
drawmanager->draw(); // start drawing
drawmanager->sync(); // wait until frame is displayed on
                     // screen
For swaplocking we use a synchronization barrier that forces the different nodes to wait for each other before swapping their frame buffers. This synchronization barrier is preceded by a call to swapReady and followed by a call to swap, two methods that were added to the drawmanager class. The sequence of calls in the NetKernel main loop is the following:
drawmanager->draw();      // start drawing
drawmanager->swapReady(); // wait until rendering is finished
                          // and frame is ready to be displayed
netapi->barrier();        // synchronization with other nodes
                          // (swaplock)
drawmanager->swap();      // display frame on screen
drawmanager->sync();      // wait until frame is displayed on
                          // screen
For OpenGL, swapReady() is based on a call to glFinish(). Swaplock for Performer is not yet supported.
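The effect of the barrier can be checked with a small simulation in which each node is a thread. This is only a sketch: SimpleBarrier stands in for netapi->barrier(), runSwaplockDemo is a hypothetical helper, and rendering and swapping are reduced to counter updates.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal reusable barrier standing in for netapi->barrier().
class SimpleBarrier {
public:
    explicit SimpleBarrier(int count) : count_(count) { assert(count > 0); }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        int gen = generation_;
        if (++waiting_ == count_) {
            // Last node to arrive releases everybody.
            waiting_ = 0;
            ++generation_;
            cv_.notify_all();
        } else {
            cv_.wait(lock, [&] { return gen != generation_; });
        }
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
    int waiting_ = 0;
    int generation_ = 0;
};

// Runs 'frames' frames on 'nodes' simulated nodes. Returns true if no node
// swapped before all nodes had finished rendering the same frame.
bool runSwaplockDemo(int nodes, int frames) {
    SimpleBarrier barrier(nodes);
    std::mutex m;
    std::vector<int> ready(frames, 0);  // nodes done rendering, per frame
    bool ok = true;
    std::vector<std::thread> threads;
    for (int n = 0; n < nodes; ++n) {
        threads.emplace_back([&] {
            for (int f = 0; f < frames; ++f) {
                {   // draw() + swapReady(): rendering of frame f is finished
                    std::lock_guard<std::mutex> g(m);
                    ++ready[f];
                }
                barrier.wait();  // swaplock
                {   // swap(): every node must be ready by now
                    std::lock_guard<std::mutex> g(m);
                    if (ready[f] != nodes) ok = false;
                }
            }
        });
    }
    for (std::thread& t : threads) t.join();
    return ok;
}
```

Removing the call to barrier.wait() makes the check fail on most runs, since a fast node then swaps while others are still rendering.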

Communication Library

Because we assume the communication library used may not be thread safe, calls to the NetAPI are all performed by the same thread (the kernel thread).

MPI

Thread Safe

MPI implementations are not necessarily thread safe.

Collective Communication Implementation

Depending on your MPI implementation, collective operations may not be optimized for Net Juggler communication requirements. For example the allgather operation is typically implemented by having all processors shift messages around a ring. This is efficient for large messages, but for small messages a gather followed by a broadcast is generally more efficient. The NetAPI_MPI class contains constants that are used to select between different implementations (see NetAPI_MPI/NetAPI_MPI.h). By default all these constants are set to 1. Refer to the code of the NetAPI_MPI class to know the other implementations available. By changing the constant values you can select different implementations. The netapi_mpi_test program can be used to measure performance.
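The difference between the two allgather strategies can be illustrated with an in-process model. It does not use MPI and says nothing about actual timings: ringAllGather and gatherBroadcast are hypothetical functions that only reproduce the data movement and count communication rounds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> Blocks;  // one block per node

// Ring allgather: in each of the n-1 steps, every node forwards to its
// right neighbour the block it received in the previous step.
std::vector<Blocks> ringAllGather(const Blocks& local, int* steps) {
    const std::size_t n = local.size();
    assert(n > 0);
    std::vector<Blocks> view(n, Blocks(n));
    for (std::size_t i = 0; i < n; ++i) view[i][i] = local[i];
    *steps = 0;
    for (std::size_t s = 0; s + 1 < n; ++s) {
        for (std::size_t i = 0; i < n; ++i) {
            std::size_t block = (i + n - s) % n;  // block node i sends now
            view[(i + 1) % n][block] = view[i][block];
        }
        ++*steps;
    }
    return view;
}

// Gather on a root followed by a broadcast: two collective phases,
// whatever the number of nodes.
std::vector<Blocks> gatherBroadcast(const Blocks& local, std::size_t root,
                                    int* phases) {
    const std::size_t n = local.size();
    assert(root < n);
    Blocks gathered = local;                // phase 1: root collects all blocks
    std::vector<Blocks> view(n, gathered);  // phase 2: root sends the whole set
    *phases = 2;
    return view;
}
```

Both functions leave every node with the same set of blocks. The ring needs a number of rounds that grows with the number of nodes, which is one reason a gather followed by a broadcast can be preferable when messages are small and latency dominates.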

About this document ...

The translation was initiated by Bruno Raffin on 2002-09-19

