HTCondor Cookbook
Overview
This cookbook contains small snippets of code that can be helpful when submitting jobs to the various CSaW clusters. On their own they are not complete submission files, but you can add them to your existing submission files or write new ones based on the ideas presented here.
For more complete examples please see the HTCondor Submission File Examples page.
Submitting lots of jobs
If you need to submit multiple jobs, don’t write an individual submission file for each job. Instead, write one file that handles multiple jobs and submit them as a batch. This makes life easier for you and for the scheduler.
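As a minimal sketch (the executable name and arguments here are placeholders), one submit file can queue 100 numbered jobs:

executable = my_program
arguments = --task $(Process)

# $(Process) runs from 0 through 99, giving each job its own log files.
output = logs/out.$(Process)
error = logs/err.$(Process)
log = logs/condor_log

queue 100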
Setting the maximum number to run simultaneously
Just because you can submit 10,000 jobs doesn’t mean that you can physically run 10,000 jobs simultaneously. There may be limiting factors such as the file server, your free WandB account may cap the number of jobs you can run at once, or the admins may have asked you to leave resources available for other users. Whatever the reason, you can set the max_materialize variable to limit the number of jobs created at any given time. When a job finishes, new ones will be created up to this limit until all jobs in the batch are completed.
max_materialize = 5
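In context, a sketch of a large batch throttled this way might look like the following (my_program is a placeholder):

executable = my_program
arguments = --task $(Process)
log = logs/condor_log

# Only 5 jobs exist in the queue at once; as each finishes,
# the next one is created until all 10,000 have run.
max_materialize = 5

queue 10000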
Setting the maximum number to keep around idle
Sometimes your jobs are tiny and complete quickly. Sometimes you just need to submit 100,000 jobs and don’t want to kill the scheduler when you submit that many. Setting max_idle lets as many jobs run as resources allow, while keeping up to max_idle jobs idle in the queue, ready to run. This lets you submit massive numbers of jobs without the scheduler having to check whether the next 99,999 of the 100,000 you submitted can run. For teeny-tiny jobs, set this higher, to something like 100, to ensure there’s always a fresh batch ready to go. For longer-running jobs, setting it to 10 or fewer may be appropriate.
max_idle = 50
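For example, a sketch of a 100,000-job batch that keeps up to 50 jobs idle and ready (again with a placeholder executable):

executable = tiny_task
arguments = --item $(Process)
log = logs/condor_log

# Keep up to 50 jobs idle in the queue; the scheduler never has
# to consider the other tens of thousands until they're needed.
max_idle = 50

queue 100000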
Creating jobs from a list of files
Do you need to submit a job for every input file in some directory? This recipe works great! Just make sure you test it on a few files before running it on the giant directory full of them. Combine this recipe with max_materialize or max_idle from above to ensure that you play nicely with others.
arguments = -arg1 -input $(filename)

queue filename matching files data/*.dat
This will submit a job for every file ending with the extension “.dat” inside a directory named data. It also sets the arguments to your program to tell it which file to read.
Note
$(filename) is a relative path to a file by default. If you want just the filename without the path, you can use the built-in function macro $BASENAME(). This is particularly useful for creating logs named after the input file without including its path:
output = logs/out.$BASENAME(filename)
error = logs/err.$BASENAME(filename)
log = logs/condor_log

queue filename matching files data/*.dat
Assuming that the above matched the files data/a.dat, data/b.dat, and data/c.dat, this would create log entries for logs/out.a.dat, logs/err.a.dat, logs/out.b.dat, logs/err.b.dat, logs/out.c.dat, logs/err.c.dat, and logs/condor_log. This makes it much easier to see which input file generated which logs.
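Putting the whole recipe together, a complete submit file might look like this (my_program and its -arg1 flag are placeholders carried over from the snippets above):

executable = my_program
arguments = -arg1 -input $(filename)

output = logs/out.$BASENAME(filename)
error = logs/err.$BASENAME(filename)
log = logs/condor_log

# Play nicely with others when the directory is huge.
max_idle = 50

queue filename matching files data/*.dat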
Creating jobs from a list of directories
Sometimes you have lots of individual directories, each containing all the input for one job you want to run. Also, sometimes you need to run a command from INSIDE that exact directory. Instead of copying your submission file over and over into each directory and submitting it there, this recipe lets you keep one file at the top level and find the appropriate directories.
initialdir = $(dirname)

queue dirname matching dirs to_process/*
This will submit a job for every directory inside a directory named “to_process”, and set the initial working directory to that path when the job starts. Imagine a scenario where you have 100 directories, each containing a file named INPUT. Your program takes no arguments but reads directives from this file named INPUT, so it needs to run inside that directory to open the INPUT file. This recipe sets the working directory to each matched directory, then runs your program there.
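A sketch of that scenario (my_program is a placeholder) looks like this; note that because initialdir is set, the relative output, error, and log paths also land inside each matched directory:

executable = my_program

# Each job starts inside its matched directory, so my_program
# can open its INPUT file with a plain relative path.
initialdir = $(dirname)

output = out
error = err
log = condor_log

queue dirname matching dirs to_process/*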
Digging deep into directories to find jobs
Naming is one of the really hard problems in Computer Science, so it makes sense that naming your data directories would also be complicated. Have you found yourself in the situation where deeply nested directories contain your work? HTCondor supports multiple globs in a single pattern and lets you combine them in many ways.
Consider the following directory tree:
.
├── data
│   ├── a
│   │   ├── a
│   │   │   └── file.txt
│   │   ├── b
│   │   │   └── file.txt
│   │   └── c
│   │       └── file.txt
│   ├── b
│   │   ├── a
│   │   │   └── file.txt
│   │   ├── b
│   │   │   └── file.txt
│   │   └── c
│   │       └── file.txt
│   └── c
│       ├── a
│       │   └── file.txt
│       ├── b
│       │   └── file.txt
│       └── c
│           └── file.txt
├── test.job
└── test.sh
All that we know is that some directories underneath data contain a file named “file.txt” that we want to process, but we don’t know exactly where they are. We can give HTCondor a hint on how to find them, even though it still needs to search through multiple layers of directories. The following finds every “file.txt” and then uses the directory portion of each matched path:
initialdir = $DIRNAME(filename)
queue filename matching files data/*/*/file.txt
This submitted 9 jobs and set the working directory of each to the directory that contained its file.txt. Afterwards, the tree looks like this:
.
├── data
│   ├── a
│   │   ├── a
│   │   │   ├── condor.log
│   │   │   ├── err
│   │   │   ├── file.txt
│   │   │   └── out
│   │   ├── b
│   │   │   ├── condor.log
│   │   │   ├── err
│   │   │   ├── file.txt
│   │   │   └── out
│   │   └── c
│   │       ├── condor.log
│   │       ├── err
│   │       ├── file.txt
│   │       └── out
│   ├── b
│   │   ├── a
│   │   │   ├── condor.log
│   │   │   ├── err
│   │   │   ├── file.txt
│   │   │   └── out
│   │   ├── b
│   │   │   ├── condor.log
│   │   │   ├── err
│   │   │   ├── file.txt
│   │   │   └── out
│   │   └── c
│   │       ├── condor.log
│   │       ├── err
│   │       ├── file.txt
│   │       └── out
│   └── c
│       ├── a
│       │   ├── condor.log
│       │   ├── err
│       │   ├── file.txt
│       │   └── out
│       ├── b
│       │   ├── condor.log
│       │   ├── err
│       │   ├── file.txt
│       │   └── out
│       └── c
│           ├── condor.log
│           ├── err
│           ├── file.txt
│           └── out
├── test.job
└── test.sh
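For reference, a test.job consistent with the tree above would look something like this (test.sh is the executable shown in the tree; treat this as a sketch):

executable = test.sh

# Run each job inside the directory containing its file.txt.
initialdir = $DIRNAME(filename)

# Relative paths resolve inside initialdir, which is why out,
# err, and condor.log appear next to each file.txt above.
output = out
error = err
log = condor.log

queue filename matching files data/*/*/file.txt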
Creating jobs from a file of pre-determined arguments
Do you ever need to run 100 jobs with slightly different input for each one? Have you already written a tiny script or program that generates the list of all the combinations of inputs? Don’t generate individual submit files for each task by setting a different arguments line. Instead, let HTCondor submit all your jobs in a batch by reading your arguments from a text file. For example, given a file named myargs.txt containing:
-a 1 -b 10 -c 100
-a 2 -b 9 -c 1000
-a 3 -b 8 -c 10000
-a 4 -b 7 -c 100000
-a 5 -b 6 -c 1000000
queue arguments from myargs.txt
This submits 5 jobs, each passing one line of the “myargs.txt” file as the arguments to your program.
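In full, the submit file is just your usual setup plus that queue line (my_program is a placeholder):

executable = my_program

output = logs/out.$(Process)
error = logs/err.$(Process)
log = logs/condor_log

# Each line of myargs.txt becomes the arguments for one job.
queue arguments from myargs.txt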
Limiting the number of jobs for testing
If you’re playing around with the recipes above to submit lots of jobs, make sure you test with a small subset of the jobs before submitting them all at once. Rather than submitting 30,000 jobs and having them all slowly fail, submit 5 and make sure your logic is all correct first. The easy way to do this is to add a “slice” that limits the number of jobs created. HTCondor’s slices are similar to Python slices: an opening square bracket ([), an optional begin, a required colon (:), an optional end, and a required closing square bracket (]). This means [0:] is all jobs, and [:5] is only the first 5 jobs. The slice is specified after one of the following keywords on the queue line: in, files, dirs, from.
The following recipes limit condor_submit to 5 jobs per batch no matter how many would normally be submitted.
queue arguments from [:5] 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
queue filename matching files [:5] alphabet/*.txt
queue dirname matching dirs [:5] ten_thousand_dirs/*
Notice that even though many more jobs would normally have been submitted (10, 26, and 10,000 respectively), HTCondor will only submit the first 5 jobs, which is perfect for testing. When you’re sure you’re ready to submit all the jobs, remove the slice ([:5]) and do your normal submission.
Using GPUs from the CSE Cluster
The CSE cluster does not currently (08/12/2024) have any GPU nodes that accept jobs unless you are in a research group that funded the node purchase. Because of this, you have two options for running code on a GPU node: preemption and flocking.
Accepting preemption on the CSE cluster will get you access to two Nvidia T4s, one Nvidia A2, and one Nvidia A100.
+WantPreempt = True

request_gpus = 1
Alternatively, you can submit your job to the CSE cluster and have it “Flock” to the CSCI cluster, where there are sometimes readily available GPUs.
+WantFlocking = True

request_gpus = 1
If you’re not picky about your GPU requirements, you can set both WantPreempt and WantFlocking to True, and see which one gets you a GPU fastest.
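That combination is simply both flags together:

+WantPreempt = True
+WantFlocking = True

request_gpus = 1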
Requesting an H100
The CSCI cluster has one node with three Nvidia H100s installed. Two of the H100s are open, with no priority, to any CSCI student who accepts preemption and sets the appropriate flag to request an H100. The third GPU is prioritized for the researchers who funded its purchase, and HTCondor will automatically steer jobs from members of those research groups to that card when possible.
+WantH100 = True
+WantPreempt = True

request_gpus = 1
require_gpus = GlobalMemoryMb >= 80000
Setting WantH100 is not enough to guarantee your job will only run on an H100, just that you would like it to run on one. If your job must run on an H100, you will need to set a require_gpus line such as the one above to ensure that it runs only on a card with >= 80GB of memory.
Ensuring Access to InfiniBand
Depending on the cluster (CSE or CSCI) the InfiniBand nodes may be scheduled differently.
On the CSCI cluster, all of the compute nodes already have InfiniBand, so there is no additional request needed.
On the CSE cluster, parallel scheduling groups exist to ensure that all of the nodes allocated to your parallel job are consistent.
# Stay only on InfiniBand or non-InfiniBand nodes
+WantParallelSchedulingGroups = True

# Run only on InfiniBand nodes
Requirements = ParallelSchedulingGroup == "ib40_1"
Submit your own recipes
Do you have a cool HTCondor trick you’ve discovered or that an admin helped you write? Share your recipe with us and we’ll add it to the cookbook!