Legion: A Worldwide Virtual Computer

legion_run FAQ
Before running the command
1. What does this command do?
2. What is the difference between legion_run and legion_run_multi?
3. What are the prerequisites?
4. What are the required parameters?
5. What is an options file and how do I use it?
6. Can I set a time limit?
7. I want (or don't want) my program to run on a queue system
Controlling what happens on the command line
1. Can I control the output?
2. How does the debug option work?
3. What is the difference between blocking and nonblocking modes? Which should I use?
Setting up your run
1. Can I control which host runs my job?
2. What is a probe file and do I need one?
3. My program expects input files -- how do I get them to the remote host?
4. My program will produce output files -- how do I get them back?
5. Is this the only way to handle input and output files?
6. Can I control stdin, stdout, and stderr?
7. Can I check the program's status after it has started?
General
1. What if the remote host crashes?
2. Some hints to make life easier

What does this command do?

The legion_run command starts one instance of a program on a remote host. You can ask that the job be started on a specific architecture or a specific host, if you wish. If necessary, you can tell Legion to copy input files to and output files from the remote host. The command is fully documented here.

What is the difference between legion_run and legion_run_multi?

Both of these commands are used to run programs on remote Legion hosts. But where legion_run executes only a single copy of a program on a single host, legion_run_multi executes multiple copies on multiple remote hosts.

What are the prerequisites?

You must have previously registered the program with Legion, with either legion_register_program or legion_register_runnable (see chapter 8 in the Basic User Manual). If there are any required input files, they must be visible in your local file space or context space.

What are the required parameters?

You must include the program's Legion class path (created when you registered the program in Legion). If there are any required command-line arguments, you must include them as well. This may require repeating information already provided in Legion flags: e.g., if you have specified an input file with Legion flags you may need to specify it again for your program's parameters.

What is an options file and how do I use it?

An options file is a local text file that contains instructions for input/output files, architecture, timing, nodes, and tty. If your program requires extensive instructions, this is a useful way to avoid typing everything at the command line. The file can include any of the legion_run options, but not the program class name or the program's own command-line arguments. Use spaces, tabs, or blank lines to separate your parameters. For example:
-IN /dir1/inputStuff.1
-IN /dir2/inputStuff.2
-in /home/users/myContext/anotherInput
-OUT /dir1/outputStuff.1
-v
-n 4
-a linux
To use an options file, use the -f flag.
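
As a rough sketch of the steps above, the following shell fragment writes a small options file and builds the matching legion_run command line. The file name myrun.opts and the class path /home/users/myContext/myProgram are hypothetical, and the fragment assumes that -f takes the options file name and that flags come before the class path. The command is only printed, not executed, so the fragment can be tried on a machine without Legion installed.

```shell
#!/bin/sh
# Hypothetical names: substitute your own options file and program class path.
OPTIONS_FILE="myrun.opts"
PROGRAM_CLASS="/home/users/myContext/myProgram"

# One option per line; these entries are taken from the example above.
cat > "$OPTIONS_FILE" <<'EOF'
-IN /dir1/inputStuff.1
-OUT /dir1/outputStuff.1
-v
-a linux
EOF

# The class path (and any program arguments) stay on the command line.
CMD="legion_run -f $OPTIONS_FILE $PROGRAM_CLASS"
echo "$CMD"   # dry run: prints the command instead of executing it
```

Once the printed command looks right, it can be typed at the prompt or run by replacing the echo line with eval "$CMD".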

Can I set a time limit?

Yes. If the remote host enforces time limits for outside jobs, the -t flag specifies how many minutes the program needs to run. This may prove especially useful if you are running on queue systems. If a Legion job goes on a queue host, Legion's default run time limit is one hour.

I want (or don't want) my program to run on a queue system

By default, Legion will choose a remote host, depending on what flags you have used. If you need to restrict the possibilities to queueing systems (a.k.a. batch queue hosts) you need to set the program class's desired_host_property to 'queue' with the legion_update_attributes command:
$ legion_update_attributes -c <program class path> \
     -a "desired_host_property('queue')"
Or, if you want to be sure that the program does not run on a queueing system, you can set the property to 'interactive':
$ legion_update_attributes -c <program class path> \
     -a "desired_host_property('interactive')"

Be aware that Legion has a default time limit of one hour for jobs on queueing systems. You may need to use the -t flag to adjust this setting.
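
The two steps can be combined along these lines. This is a sketch only: the class path is hypothetical, the three-hour run time is an arbitrary illustration, and both commands are printed rather than executed. The point to note is that -t is given in minutes, so a limit expressed in hours has to be converted first.

```shell
#!/bin/sh
# Hypothetical class path: substitute the one created when you registered the program.
PROGRAM_CLASS="/home/users/myContext/myProgram"

HOST_PROPERTY="queue"     # use "interactive" to keep the job off queue systems
RUN_HOURS=3               # illustrative run time, longer than the one-hour default
RUN_MINUTES=$((RUN_HOURS * 60))   # -t expects minutes

# Attribute string passed to legion_update_attributes -a
ATTRIBUTE="desired_host_property('$HOST_PROPERTY')"

# Dry run: print the two commands instead of executing them.
echo "legion_update_attributes -c $PROGRAM_CLASS -a \"$ATTRIBUTE\""
echo "legion_run -v -t $RUN_MINUTES $PROGRAM_CLASS"
```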

Can I control the output?

The -w flag tells Legion to direct the program's output to the set tty object (i.e., any output will appear in your current window).

The -v flag, just as in Unix, causes the command to run in verbose mode (i.e., the program reports progress whenever possible). We strongly suggest that you always use this flag, especially if the program takes more than a few seconds to run.

How does the debug option work?

The -debug flag will give you tons of debugging information about legion_run (i.e., about the Legion objects that are working to remotely start your program), but not about your program or its input and output files.

You can redirect this output to a file by resetting stdout.

What is the difference between blocking and nonblocking modes? Which should I use?

You can run legion_run in blocking or nonblocking mode. Until the 1.7 release, all jobs were run in blocking mode and that is still the default setting. Essentially, this means that the command will monitor your job until it terminates and clean up the remote host afterwards. The command will block at the command line and poll the remote job regularly until it has finished. If your job is relatively short or you don't need to use the command line for other work, this may be the best choice.

If you run in nonblocking mode, nothing will monitor your remote job: the legion_run command will start your job and then exit, freeing up your command line for other work. You do not need to keep the shell open or even remain logged in to the local host while the job is running.

Once the job has finished, it will pass any output files marked with an -out flag into context space but it will ignore any files marked with an -OUT flag. You can pick up these files with the legion_probe_run command or, if you know the name of the remote host and the job's working directory, you can conceivably pick them up by hand. The remote host will hold on to the job's working directory for six hours. If you do not clean up the directory within that time frame, the remote host will tar and compress it and move it into your context scratch space.

Can I control which host runs my job?

Yes. There are two options: you can use the -h flag to name a specific host or -a to name a specific architecture. If more than one acceptable host is available, Legion will choose one.

The -n flag specifies how many of the remote host's nodes should be allocated for running your program. This is useful if your program is a native parallel job.
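
A small wrapper can make sure that only one of -h and -a is ever passed, which avoids the conflicting-architecture mistake. The sketch below is illustrative: the class path and the variable names are hypothetical, the form of the host name that -h expects is not covered in this FAQ, and the command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical class path: substitute your own.
PROGRAM_CLASS="/home/users/myContext/myProgram"

TARGET_HOST=""        # set to a host name to pin the job to one host
TARGET_ARCH="linux"   # used only when TARGET_HOST is empty
NODES=4               # nodes to allocate on the remote host

# Pass either -h or -a, never both.
if [ -n "$TARGET_HOST" ]; then
    PLACEMENT="-h $TARGET_HOST"
else
    PLACEMENT="-a $TARGET_ARCH"
fi

CMD="legion_run -v $PLACEMENT -n $NODES $PROGRAM_CLASS"
echo "$CMD"   # dry run: prints the command instead of executing it
```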

What is a probe file and do I need one?

A probe file is a text file that sits on your local host. It is created through legion_run's -p flag and is associated with whatever job the legion_run is starting. It contains information for contacting that job. You don't have to create one unless you intend to use legion_probe_run to monitor the job. If you are running in nonblocking mode we strongly suggest that you use a probe file.

In either mode, if your client crashes and you have created a probe file, you can still retrieve your work.
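
For illustration, a run that creates a probe file might be put together as follows. The probe file name and class path are hypothetical, and the sketch assumes that -p is followed by the name of the probe file to create. The command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical names: substitute your own class path and probe file.
PROGRAM_CLASS="/home/users/myContext/myProgram"
PROBE_FILE="myProgram.probe"

# Assumption: -p takes the name of the probe file that legion_run should create.
CMD="legion_run -v -p $PROBE_FILE $PROGRAM_CLASS"
echo "$CMD"   # dry run: prints the command instead of executing it

# Keep the probe file somewhere safe: legion_probe_run needs it to contact the job.
echo "probe file will be written to: $PROBE_FILE"
```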

My program expects input files -- how do I get them to the remote host?

You must make sure that these files are visible from either context space or local file space and you must use either -in or -IN to tell Legion to copy the contents of your input files to the remote host. The copied files will be placed in the local directory of the remote host and given the same file name. Once the program is finished, the copied files are deleted.

My program will produce output files -- how do I get them back?

Use either -out or -OUT to tell Legion to look for these files after the program has finished and to copy them back to your local file space or context space. If the program crashes midway through, any existing output files will be available.

Please note that if you start legion_run in nonblocking mode the remote run will copy files marked with the -out flag but NOT files marked with the -OUT flag. You can use legion_probe_run to pick them up.
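
The following sketch puts the input and output flags together with the program's own arguments. It assumes a hypothetical program that takes its input file name as its only argument; the class path is also hypothetical, and the file paths are borrowed from the options file example. Because the copy on the remote host keeps the same file name, the program is given the base name rather than the full local path. The command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical class path: substitute your own.
PROGRAM_CLASS="/home/users/myContext/myProgram"

INPUT_FILE="/dir1/inputStuff.1"     # copied to the remote host before the run
OUTPUT_FILE="/dir1/outputStuff.1"   # copied back after the run

# The remote copy keeps the same file name, so the program sees the base name.
INPUT_NAME=$(basename "$INPUT_FILE")

# The input file appears twice: once for Legion (-IN), once for the program itself.
CMD="legion_run -v -IN $INPUT_FILE -OUT $OUTPUT_FILE $PROGRAM_CLASS $INPUT_NAME"
echo "$CMD"   # dry run: prints the command instead of executing it
```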

Is this the only way to handle input and output files?

No. If you have a big pile of files to organize and you don't want to enter them all on the command line, you can use an options file. If you create a probe file you can also use legion_probe_run to move input and output files between the remote and local hosts and context space.

Can I control stdin, stdout, and stderr?

Yes. You can use -stdin, -stdout, and -stderr to specify a local file to use for standard input, output, and error.
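
A possible invocation is sketched below. The three file names and the class path are hypothetical, and the sketch assumes that each flag is followed by a single local file name. It warns if the standard input file cannot be read, then prints the command rather than executing it.

```shell
#!/bin/sh
# Hypothetical names: substitute your own class path and local files.
PROGRAM_CLASS="/home/users/myContext/myProgram"
STDIN_FILE="myProgram.in"
STDOUT_FILE="myProgram.out"
STDERR_FILE="myProgram.err"

# Catch a missing or unreadable input file before starting the job.
if [ ! -r "$STDIN_FILE" ]; then
    echo "warning: cannot read $STDIN_FILE" >&2
fi

CMD="legion_run -stdin $STDIN_FILE -stdout $STDOUT_FILE -stderr $STDERR_FILE $PROGRAM_CLASS"
echo "$CMD"   # dry run: prints the command instead of executing it
```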

Can I check the program's status after it has started?

Yes, but only if you use a probe file. You can use the legion_probe_run command to get vital information about the program while it is running.

Some hints to make life easier

  • Be careful not to run the program on the wrong architecture.

  • Don't specify conflicting architectures with -a and -h.

  • Double-check your input and output filenames.

  • Don't use the -a flag more than once: you will get an error.

  • Be sure to include any necessary command-line parameters for the program. For example, you may need to specify input files twice: once in Legion's -IN flag and then again in the program's command-line arguments.

  • A program can fail for multiple reasons, including internal bugs: try to be sure that your program is bug-free before you run it.

  • We suggest that you always use -v, especially if the program will be running for more than a few seconds.

Last modified: Wed Jun 20 11:00:59 2001

 


This work partially supported by DOE grant DE-FG02-96ER25290, Logicon (for the DoD HPCMOD/PET program) DAHC 94-96-C-0008, DOE D459000-16-3C, DARPA (GA) SC H607305A, NSF-NGS EIA-9974968, NSF-NPACI ASC-96-10920, and a grant from NASA-IPG.

legion@Virginia.edu
http://legion.virginia.edu/