Transferring Files To and From a Remote Host
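Transfer files between the local filesystem and a remote host's filesystem using Rsync via SSH.
The following example downloads a file located on the remote host's filesystem at /home/ubuntu/remote_unprocessed_file.png to the local filesystem at unprocessed_file.png, converts it to grayscale, and uploads the result to the remote host at /home/ubuntu/remote_processed_file.png, using Rsync via SSH for both transfers.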
Prerequisites
Upload a color image file to a remote host. Make note of the address of the remote host.
Procedure
- Define an Rsync strategy with the remote host and private key path to be used for SSH.
import covalent as ct
from pathlib import Path
from typing import List, Tuple
from skimage import io, color
private_key = "/path/to/private/key"
host_address = "123.remote.host.address.com"
username = "ubuntu"
unprocessed_filename = "unprocessed_file.png"
processed_filename = "processed_file.png"
unprocessed_filepath = str(Path(unprocessed_filename).resolve())
processed_filepath = str(Path(processed_filename).resolve())
remote_source_path = f"/home/{username}/remote_{unprocessed_filename}"
remote_dest_path = f"/home/{username}/remote_{processed_filename}"
rsync_strategy = ct.fs_strategies.Rsync(user=username, host=host_address, private_key_path=private_key)
- Generate the FileTransfer objects using the TransferFromRemote and TransferToRemote factories.
ft_1 = ct.fs.TransferFromRemote(remote_source_path, unprocessed_filepath, strategy=rsync_strategy)
ft_2 = ct.fs.TransferToRemote(remote_dest_path, processed_filepath, strategy=rsync_strategy)
The Covalent Transfer* functions automatically assign the stage at which each file transfer takes place. The TransferFromRemote transfer takes place before the electron is executed so that the electron can process the file. Conversely, the TransferToRemote transfer takes place after the electron creates the outgoing file.
Note that TransferToRemote is the only case in which the destination path is passed first, then the source. The FileTransfer object generated from it still adheres to the (<source_file_path>, <dest_file_path>) convention.
- Define an electron, passing the Covalent FileTransfer objects to the files keyword argument in the decorator.
@ct.electron(files=[ft_1, ft_2]) # ft_1 is done before the electron is executed; ft_2 is done after.
def to_grayscale(files: List[Tuple[str]] = None):
    # Get the downloaded file's path
    image_path = files[0][1] # destination filepath of first file transfer, downloaded before executing this electron
    # Convert the image to grayscale
    img = io.imread(image_path)[:, :, :3] # limiting image to 3 channels
    gray_img = color.rgb2gray(img)
    # Save the grayscale image to the upload file path
    gray_image_path = files[1][0] # source filepath of second file transfer, to be uploaded
    io.imsave(gray_image_path, gray_img)
- Create and dispatch a lattice to run the electron.
@ct.lattice
def process_remote_data():
    return to_grayscale()
dispatch_id = ct.dispatch(process_remote_data)()
status = ct.get_result(dispatch_id, wait=True).status
print(status)
COMPLETED
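The argument-order rule from step 2 and the files[i][j] indexing used inside the electron can be easy to mix up, so here is a minimal sketch of the convention in plain Python. The helpers from_remote_pair and to_remote_pair are hypothetical stand-ins written for this illustration; they are not Covalent functions and only mimic the (<source_file_path>, <dest_file_path>) ordering described above.

```python
from typing import List, Tuple

# Hypothetical stand-ins for illustration only; these are NOT part of Covalent.

def from_remote_pair(remote_source: str, local_dest: str) -> Tuple[str, str]:
    """Arguments arrive as (source, destination); the pair keeps that order."""
    return (remote_source, local_dest)


def to_remote_pair(remote_dest: str, local_source: str) -> Tuple[str, str]:
    """Arguments arrive as (destination, source); the pair is flipped so that
    it still reads (source, destination)."""
    return (local_source, remote_dest)


# Mirrors the shape of the files argument the electron receives in this guide.
files: List[Tuple[str, str]] = [
    from_remote_pair("/home/ubuntu/remote_unprocessed_file.png", "unprocessed_file.png"),
    to_remote_pair("/home/ubuntu/remote_processed_file.png", "processed_file.png"),
]

download_path = files[0][1]  # local destination of the first (download) transfer
upload_path = files[1][0]    # local source of the second (upload) transfer

print(download_path)
print(upload_path)
```

In both cases the local path is the one the electron reads from or writes to; only its position in the pair differs.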
Notes:
- The transfer operations use rsync to perform the transfer.
- In a typical real-world scenario, this kind of transfer can be used to move data generated by the workflow.
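To make the first note more concrete, the sketch below builds the kind of rsync-over-SSH command line that a download like the one in this guide could correspond to. This is an assumption for illustration: the guide does not state which flags Covalent passes to rsync, and build_rsync_command is a hypothetical helper, not a Covalent function. The command is only constructed and printed, never executed.

```python
import shlex
from typing import List


def build_rsync_command(source: str, dest: str, private_key_path: str) -> List[str]:
    """Hypothetical helper: assemble an rsync command that uses SSH with a
    specific private key. rsync's -e option selects the remote shell."""
    ssh_cmd = f"ssh -i {shlex.quote(private_key_path)}"
    return ["rsync", "-e", ssh_cmd, source, dest]


# Placeholder values taken from the example in this guide.
username = "ubuntu"
host_address = "123.remote.host.address.com"
private_key = "/path/to/private/key"

# Remote paths use rsync's user@host:path form.
remote_source = f"{username}@{host_address}:/home/{username}/remote_unprocessed_file.png"

download_cmd = build_rsync_command(remote_source, "unprocessed_file.png", private_key)

# Print a shell-quoted version for inspection; nothing is run here.
print(shlex.join(download_cmd))
```

Running the printed command by hand can be a quick way to check that the host address and key work before dispatching a workflow.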
See Also
Transferring Local Files During Workflows