Cloud Storage
Overview
The CSaW clusters have enabled the HTCondor credential server and its corresponding OAuth2 web service, and have provisioned two cloud providers: OneDrive and Google Drive. This lets you request an OAuth2 token and send it along with a job, giving each job that requests it access to cloud storage.
Using cloud storage is simple, though it requires a couple of additional steps when submitting a job. The process starts the same as usual: run condor_submit and pass it your job file. With credential access, however, there are two possible paths. If you already have a valid OAuth2 token from your provider stored in CredMon, your job submits and runs as normal. If you do not have a valid token stored in CredMon, you must obtain one first. Fortunately, the OAuth web service that runs alongside CredMon makes this easy. condor_submit will print a message containing a URL you need to visit in a web browser, something like:
USER@head-node:~/test$ condor_submit test_cloud.job
Submitting job(s)
Hello, USER.
Please visit: https://head-node.cluster.cs.wwu.edu/credmon/key/acad68f997146a7ab51
Visiting the URL printed in your terminal gives you the option to log in with your chosen cloud provider, or possibly to grant the HTCondor credential system access to your account if it’s your first time using it. If you’ve already authenticated with your cloud provider, you may just notice the page refresh and the blue Login button be replaced by the text “Logged in as ”. This indicates that your OAuth2 token has been saved and is available for your next submission. Simply re-submit the same job (pressing ↑ on your keyboard usually recalls the previous command so you can quickly run it again).
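If you script your submissions, the authorization link can be pulled out of the condor_submit output. The following is a minimal sketch, assuming the message keeps the “Please visit: <URL>” format shown in the sample above. The auth_url helper is a hypothetical name, and whether the message is written to standard output or standard error is not confirmed here, so the usage comment captures both.

```shell
#!/bin/sh
# Hypothetical helper: print the CredMon URL found in condor_submit output.
# Assumes the "Please visit: <URL>" line format shown in the sample above.
auth_url() {
    sed -n 's/^Please visit:[[:space:]]*//p'
}

# Example use (not run here):
#   condor_submit test_cloud.job 2>&1 | auth_url
```

The helper prints nothing when no login is required, so an empty result can be treated as “token already stored”.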
One thing to be mindful of when using file transfers is that you need to properly request disk space to ensure the local EP has enough room for your job. Make sure to account for any extra space you need if you’re using compressed files you plan to extract before running, as well as space for your results to be written to the disk before they’re transferred back somewhere.
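As a rough illustration of that accounting, the sketch below adds up the compressed input, the extracted data, and the expected results, then pads the total. The helper names and the 20% headroom are assumptions made for this example, not site policy; the result is printed as a request_disk line in KB.

```shell
#!/bin/sh
# Hypothetical helpers for sizing request_disk; the names and the 20%
# headroom are illustrative assumptions, not site policy.

# Size of a file or directory in KB, as reported by du.
size_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# usage: estimate_disk_kb ARCHIVE_KB EXTRACTED_KB RESULTS_KB
# Sums the three figures and adds 20% headroom, rounding up.
estimate_disk_kb() {
    total=$(( $1 + $2 + $3 ))
    echo $(( ($total * 12 + 9) / 10 ))
}

# Example: a 1000 KB archive that extracts to 4000 KB, plus 500 KB of results
echo "request_disk = $(estimate_disk_kb 1000 4000 500)KB"
```

The size_kb helper can supply the inputs when the data is available locally, for example size_kb my_data.tar.xz.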
See below for additional details on each supported cloud provider for information on how to use that service.
OneDrive
Note
The current configuration only supports wwu.edu based accounts. You will not be able to use your personal, non-wwu.edu based account.
Enabling OneDrive is simple. You only need to add three lines to your existing submission file:
should_transfer_files = YES
use_oauth_services = onedrive
onedrive_oauth_permissions = https://graph.microsoft.com/Files.ReadWrite
These three lines break down to the following:
The use_oauth_services line will enable HTCondor to use the OneDrive service (i.e. request or reuse an existing OAuth2 token).
should_transfer_files = YES will force file transfers on. Without this, the OneDrive plugin will not be used because the CSaW environments have a shared file server by default.
The last line sets the default permissions to enable read and write access to your OneDrive. See the section below on permissions for additional information.
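Before submitting, it can help to confirm that a submit file actually contains the three settings described above. The sketch below is a hypothetical pre-submit check, not part of HTCondor: it only looks for the presence of each setting (case-insensitively) and does not validate the values. The service name is a parameter, so the same helper works for any service whose settings follow the SERVICE_oauth_permissions naming used here.

```shell
#!/bin/sh
# Hypothetical pre-submit check: verify that a submit file sets the three
# cloud-storage options. Only presence is checked, not the values.
# usage: check_cloud_submit FILE SERVICE
check_cloud_submit() {
    file=$1
    svc=$2
    status=0
    for key in should_transfer_files use_oauth_services "${svc}_oauth_permissions"; do
        if ! grep -Eiq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
            echo "$file: missing '$key'" >&2
            status=1
        fi
    done
    return "$status"
}

# Example use (not run here):
#   check_cloud_submit test.job onedrive && condor_submit test.job
```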
Transferring files from OneDrive
Transferring files uses the same transfer_input_files directive
you would normally use for file transfers, except each file needs to
know where it belongs at the start. As such, each file or directory
that should come from OneDrive should contain a onedrive:// prefix
before the file path. The OneDrive plugin goes out of its way to make
file transfers easy, so you don’t need to specify the absolute path to
the file or directory. This means both of the following examples refer to
a file named “test.txt” at the top level of your OneDrive storage:
# Option A - Absolute path beginning with /
transfer_input_files = onedrive:///test.txt
# -- OR --
# Option B - Implied path starting from the toplevel
transfer_input_files = onedrive://test.txt
The example above shows that having three /’s in a row is fine, though not required.
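To make the equivalence concrete, the sketch below reduces both URL forms to the same top-level path. It only mimics the behaviour described above for illustration; it is not the plugin's code, and cloud_path is a hypothetical name.

```shell
#!/bin/sh
# Illustration only: reduce a cloud URL to the path it refers to.
# This mimics the behaviour described above; it is not the plugin's code.
cloud_path() {
    path=${1#*://}                 # drop the scheme, e.g. "onedrive://"
    while [ "${path#/}" != "$path" ]; do
        path=${path#/}             # drop any leading slashes
    done
    printf '/%s\n' "$path"
}

cloud_path onedrive:///test.txt
cloud_path onedrive://test.txt
```

Both calls print /test.txt.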
In addition to specifying files to transfer you can also specify directories to transfer, which will recursively transfer all files inside of it. Doing this will preserve the tree structure of the files exactly as they were in your OneDrive.
Assuming a directory named “data_dir” exists in OneDrive, and beneath “data_dir” are three directories, “a”, “b”, and “c”, each with its own “data.txt” inside, you could transfer “data_dir” and all of its contents to the local scratch directory of your job for processing:
transfer_input_files = onedrive://data_dir
The job’s working directory would look something like the following:
.
|-- _condor_stderr
|-- _condor_stdout
|-- data_dir
|   |-- a
|   |   `-- data.txt
|   |-- b
|   |   `-- data.txt
|   `-- c
|       `-- data.txt
|-- test.sh
|-- tmp
`-- var
    `-- tmp
To access files transferred for your job you can reference them as a
relative path from your initial working directory. If you change your
current working directory to a different directory, you can find the
initial path by reading the environment variable
_CONDOR_SCRATCH_DIR.
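For example, a job script that changes directories can still build paths to its transferred files from that variable. The sketch below is a minimal illustration; scratch_path is a hypothetical helper, and the fallback to the current directory is an assumption added so the script can also be tried outside of HTCondor.

```shell
#!/bin/sh
# Hypothetical helper: build an absolute path to a transferred file.
# Falls back to the current directory when _CONDOR_SCRATCH_DIR is unset,
# e.g. when testing the script outside of HTCondor (an assumption).
scratch_path() {
    printf '%s/%s\n' "${_CONDOR_SCRATCH_DIR:-$PWD}" "$1"
}

# Example: after "cd data_dir/a" a script can still locate test.txt with
#   cat "$(scratch_path test.txt)"
```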
Transferring files to OneDrive
Just as downloading files closely follows the normal HTCondor file transfer mechanism, uploading to OneDrive is nearly the same, with only one extra line needed:
output_destination = onedrive:///results_dir
The output_destination tells HTCondor to transfer anything
specified in transfer_output_files to that location. In the
example above, all files will be placed in your OneDrive under the
“results_dir” directory.
With output_destination set, use the traditional
transfer_output_files variable to specify a list of files and
directories to transfer after the job completes.
For example, if your job writes its results to a local directory called “results” and you want to keep all of the files inside it under a “my_project” directory in your OneDrive, you would specify exactly that:
output_destination = onedrive:///my_project
# Transfer all of results directory
transfer_output_files = results
This creates and populates /my_project/results in your OneDrive.
Alternatively, if your “results” directory holds lots of temporary results and you don’t care about the intermediate output, you can transfer only your final result instead of keeping everything:
# Transfer single file from results directory
transfer_output_files = results/my_answer.txt
This gives /my_project/results/my_answer.txt in your OneDrive.
Finally, it’s also possible to save your logs to OneDrive:
output = onedrive:///my_project/logs/out.$(JobId).txt
error = onedrive:///my_project/logs/err.$(JobId).txt
log = condor.log
This saves your job’s STDOUT and STDERR to OneDrive. The HTCondor log cannot currently be saved to OneDrive because HTCondor needs to merge the logs of multiple jobs into a single file, so you will need to save it to the local system (or disable the log, but this is not recommended!).
Note
You will receive a warning when submitting with
output_destination = onedrive://. The OneDrive plugin works
around this, but not all cloud plugins do. It’s safer to specify
output_destination = onedrive:/// (with the extra / at the
end) if you’re not specifying a named directory.
OneDrive Job Example
Below are a simple test.sh script and test.job file that demonstrate transferring data to and from OneDrive.
Note
You must have a file named `test.txt` at the top level of your OneDrive for this example to run. From the My files section of your OneDrive, right-click inside the files area, select Add New, then Text Document. In the pop-up dialog, type “test” (without the quotes), then click Create. In the text editor that opens, type any simple message, such as “This file lives in my OneDrive”, then click the 💾 (floppy disk icon) in the top left, followed by the X icon in the top right to close the file. Once the file exists on your OneDrive, you’re ready to run the job using the files below:
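#!/bin/sh
# test.sh - print the job environment, show the file from OneDrive, write a result

# Show where, and as whom, the job is running
printenv
whoami
pwd
tree

echo 'File contents:'
echo '--------------'
echo
echo

# test.txt is transferred from OneDrive via transfer_input_files
if [ -e test.txt ] ; then
    cat test.txt
fi

echo
echo
echo
echo 'Writing to results.txt...'

# Look up this job's ClusterId.ProcId from the job ad that HTCondor provides
myjobnum=$(condor_q -jobads "$_CONDOR_JOB_AD" -f '%d' ClusterId -f '.%d\n' ProcId)

echo "Hello, world from ${myjobnum}!" > results.txt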
# test.job
executable = test.sh

request_cpus = 1
request_memory = 256MB
request_disk = 10MB

use_oauth_services = onedrive
onedrive_oauth_permissions = https://graph.microsoft.com/Files.ReadWrite

should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = onedrive:///test.txt
transfer_output_files = results.txt

output_destination = onedrive:///

output = output.txt
error = error.txt
log = condor.log

queue
OneDrive Permissions
The recommended default setting for onedrive_oauth_permissions is
https://graph.microsoft.com/Files.ReadWrite. Technically, there
are additional Graph API permissions that could be used, but they were
not enabled for WWU’s HTCondor application when it was provisioned.
https://graph.microsoft.com/Files.Read (Read only permissions)
https://graph.microsoft.com/Files.Read.All (Read all files accessible by account, not just personal files)
https://graph.microsoft.com/Files.ReadWrite.All (Read and write all files accessible by account, not just personal files)
These permissions are not currently supported. If you’re interested in any of them, please contact CSaW Support and we’ll work to accommodate your use case.
OneDrive Tips & Troubleshooting
Once you have your credentials saved, you can check their status with the htcondor credential list command.
USER@csci-head:~$ htcondor credential list
>> no Windows password found
>> no Kerberos credential found
>> Found OAuth2 credentials for 'USER@cluster.cs.wwu.edu':
   Service              Handle               Last Refreshed       File                 Jobs
   onedrive                                  2026-06-25 15:38:29  onedrive.use
While the system will keep your token refreshed as long as you have jobs running that need it, there is a small window of time in which the token has expired but not yet been cleaned up. During that window a job can be submitted without a new token being requested, but it then fails to start because it can’t authorize the file transfer. To clean up your token manually before it is cleaned up automatically, use the htcondor credential command again, this time telling it to explicitly remove the OneDrive OAuth2 token:
htcondor credential remove --service onedrive oauth2
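A script can also check whether a token is stored before submitting. The sketch below is a hypothetical helper that assumes the htcondor credential list output keeps the table layout shown above, with the service name in the first column; it works for either service name. It only reports that a token is stored, not that the token is still valid, so it does not detect the expiry window described above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if "htcondor credential list" shows a row
# for the given service (e.g. onedrive or gdrive).
# Assumes the table layout shown above, with the service name in column one.
has_oauth_token() {
    htcondor credential list 2>/dev/null |
        awk -v svc="$1" '$1 == svc { found = 1 } END { exit !found }'
}

# Example use (not run here):
#   has_oauth_token onedrive || echo "No stored OneDrive token; expect a login URL on submit"
```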
Google Drive
Note
The current configuration only supports wwu.edu based accounts. You will not be able to use your personal, non-wwu.edu based account.
Please be aware of Western’s limitations around your Google Account.
To quote the ATUS warnings:
Please note that while ITS will make our best effort to provide support, we have no enterprise support agreement with Google. File recovery support is particularly limited.
Google Workspace is not recommended for files that have HIPAA, FERPA, or Personally Identifiable Information (PII) because the university does not have an agreement in place with Google regarding sensitive data. Storage of such files would be in violation of the University’s compliance with these standards.
Google Drive is not recommended for storing files that belong to a department or group. Microsoft Teams, Office 365 Groups, and SharePoint are the fully supported shared storage locations in the cloud.
Enabling Google Drive is simple. You only need to add three lines to your existing submission file (very similar to the OneDrive settings above):
should_transfer_files = YES
use_oauth_services = gdrive
gdrive_oauth_permissions = https://www.googleapis.com/auth/drive
These three lines break down to the following:
The use_oauth_services line will enable HTCondor to use the Google Drive service (i.e. request or reuse an existing OAuth2 token).
should_transfer_files = YES will force file transfers on. Without this, the Google Drive plugin will not be used because the CSaW environments have a shared file server by default.
The last line sets the default permissions to enable read and write access to your Google Drive. Without this the request for the OAuth2 token will fail.
Warning
The Google Drive plugin does not currently support all the same features that the OneDrive plugin does. It cannot recursively download directories, and the maximum file size for uploading is limited to just under 2 GB (2147483647 bytes max). It also does not always upload files to the top level directory correctly. As a workaround, create a directory and set it as the upload destination instead. See the Transferring files to Google Drive section below for details.
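Because an oversized upload only fails at the end of a job, it may be worth checking result files against the limit inside the job script. The sketch below is a minimal example using the byte limit quoted above; fits_gdrive_limit and GDRIVE_MAX_BYTES are hypothetical names introduced for this illustration.

```shell
#!/bin/sh
# Upload size limit quoted in the warning above.
GDRIVE_MAX_BYTES=2147483647

# Hypothetical helper: succeed if FILE is small enough to upload.
# Returns 2 if FILE is not a regular file.
fits_gdrive_limit() {
    [ -f "$1" ] || return 2
    size=$(( $(wc -c < "$1") ))   # arithmetic expansion trims wc's padding
    [ "$size" -le "$GDRIVE_MAX_BYTES" ]
}

# Example use (not run here):
#   fits_gdrive_limit results.tar.xz || echo "results.tar.xz is too large for Google Drive" >&2
```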
Transferring files from Google Drive
Transferring files uses the same transfer_input_files directive
you would normally use for file transfers, except each file needs to
know where it belongs at the start. As such, each file that should
come from Google Drive should contain a gdrive:// prefix before the
file path.
# Option A - Absolute path beginning with /
transfer_input_files = gdrive:///test.txt
# -- OR --
# Option B - Implied path starting from the toplevel
transfer_input_files = gdrive://test.txt
The example above shows that having three /’s in a row is fine, though not required.
Note
Unlike the OneDrive plugin, the Google Drive plugin does not support downloading directories. One workaround is to compress your directory, then upload the compressed file to Google Drive. We suggest using a modern, efficient compression algorithm like xz or zstd.
You can compress a directory into a single file using tar. Passing it the -J option enables xz compression:
tar -cJf my_data.tar.xz my_data
That creates a “my_data.tar.xz” file that contains the directory my_data.
To decompress the file later, you can use the -x option instead of -c:
tar -xJf my_data.tar.xz
That would extract the contents of the “my_data.tar.xz” file.
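Inside a job, the extraction step usually belongs at the top of the job script. The sketch below is one way to do it, assuming the archive was listed in transfer_input_files with a gdrive:// prefix and therefore sits in the job's scratch directory. The unpack_input helper is hypothetical, and deleting the archive after extraction is a choice made here to free disk space, not a requirement.

```shell
#!/bin/sh
# Hypothetical job-script helper: extract an xz-compressed tar archive in the
# job's scratch directory, then delete the archive to free disk space.
# Falls back to the current directory outside HTCondor (an assumption).
unpack_input() {
    dir=${_CONDOR_SCRATCH_DIR:-.}
    if [ ! -f "$dir/$1" ]; then
        echo "unpack_input: $1 not found in $dir" >&2
        return 1
    fi
    tar -xJf "$dir/$1" -C "$dir" && rm -f "$dir/$1"
}

# Example use at the top of a job script (not run here):
#   unpack_input my_data.tar.xz || exit 1
```

Remember that the scratch directory briefly holds both the archive and the extracted files, so the disk request needs to cover both.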
To access files transferred for your job you can reference them as a
relative path from your initial working directory. If you change your
current working directory to a different directory, you can find the
initial path by reading the environment variable
_CONDOR_SCRATCH_DIR.
Transferring files to Google Drive
Just as downloading files closely follows the normal HTCondor file transfer mechanism, uploading to Google Drive is nearly the same, with only one extra line needed:
output_destination = gdrive:///results_dir
The output_destination tells HTCondor to transfer anything
specified in transfer_output_files to that location. In the
example above, all files will be placed in your Google Drive under the
“results_dir” directory.
With output_destination set, use the traditional
transfer_output_files variable to specify a list of files and
directories to transfer after the job completes.
For example, if your job writes its results to a local directory called “results” and you want to keep all of the files inside it under a “my_project” directory in your Google Drive, you would specify exactly that:
output_destination = gdrive:///my_project
# Transfer all of results directory
transfer_output_files = results
This creates and populates /my_project/results in your Google
Drive.
Alternatively, if your “results” directory holds lots of temporary results and you don’t care about the intermediate output, you can transfer only your final result instead of keeping everything:
# Transfer single file from results directory
transfer_output_files = results/my_answer.txt
This gives /my_project/results/my_answer.txt in your Google Drive.
Finally, it’s also possible to save your logs to Google Drive:
output = gdrive:///my_project/logs/out.$(JobId).txt
error = gdrive:///my_project/logs/err.$(JobId).txt
log = condor.log
This saves your job’s STDOUT and STDERR to Google Drive. The HTCondor log cannot currently be saved to Google Drive because HTCondor needs to merge the logs of multiple jobs into a single file, so you will need to save it to the local system (or disable the log, but this is not recommended!).
Note
You will receive a warning when submitting with
output_destination = gdrive://. The OneDrive plugin works
around this, but the Google Drive plugin does not. At the very
least, you will need to specify output_destination =
gdrive:/// (with the extra / at the end) if you’re not
specifying a named directory. However, uploading directories to the
top level of Google Drive often names the uploaded files
incorrectly. We suggest uploading to a subdirectory
instead, such as output_destination = gdrive://my_project.
Google Drive Tips & Troubleshooting
Once you have your credentials saved, you can check their status with the htcondor credential list command.
USER@csci-head:~$ htcondor credential list
>> no Windows password found
>> no Kerberos credential found
>> Found OAuth2 credentials for 'USER@cluster.cs.wwu.edu':
   Service              Handle               Last Refreshed       File                 Jobs
   gdrive                                    2026-06-25 11:28:37  gdrive.use
While the system will keep your token refreshed as long as you have jobs running that need it, there is a small window of time in which the token has expired but not yet been cleaned up. During that window a job can be submitted without a new token being requested, but it then fails to start because it can’t authorize the file transfer. To clean up your token manually before it is cleaned up automatically, use the htcondor credential command again, this time telling it to explicitly remove the Google Drive OAuth2 token:
htcondor credential remove --service gdrive oauth2