First, you should know that every tensor in PyTorch has an underlying storage that holds its actual data. You can use .untyped_storage() to retrieve a tensor's underlying storage, and then use .set_() to replace the tensor's storage (together with its storage offset, size, and strides) with another one.
with torch.no_grad():
    x_storage = x.untyped_storage()
    # save x's metadata before the first set_() overwrites it
    x_offset, x_size, x_stride = x.storage_offset(), x.size(), x.stride()
    y_storage = y.untyped_storage()
    x.set_(y_storage, y.storage_offset(), y.size(), y.stride())
    y.set_(x_storage, x_offset, x_size, x_stride)
Note: the swap does not affect references to the tensor objects themselves; x and y remain the same Python objects, only the data they point to changes. It also does not interfere with reference counting or garbage collection, since PyTorch manages storage lifetimes automatically and you are not creating new tensors or touching the reference counts directly.
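For instance, here is a small sketch of that point (assuming x and y are plain float tensors; the example values are only for illustration), showing that a reference taken before the swap still points at the same tensor object afterwards:

import torch

x = torch.arange(4.0)
y = torch.ones(4)
alias = x                             # a second reference taken before the swap

with torch.no_grad():
    x_storage = x.untyped_storage()
    x_offset, x_size, x_stride = x.storage_offset(), x.size(), x.stride()
    x.set_(y.untyped_storage(), y.storage_offset(), y.size(), y.stride())
    y.set_(x_storage, x_offset, x_size, x_stride)

print(alias is x)                     # True: still the same Python object
print(alias)                          # tensor([1., 1., 1., 1.]) -- alias now sees y's old data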
Update
Since you mentioned in the comments that the swap will happen inside an optimizer class, be aware that this can affect the autograd graph. When you swap storage objects between tensors, the data that the autograd graph references changes underneath it, which can lead to inconsistencies between the computed gradients and the values actually stored in the swapped tensors. In short, swapping storages directly inside an optimizer can break autograd, so I do not recommend it.
The safe solution is to swap the values through a temporary tensor instead, as shown below.
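A minimal sketch of that approach, assuming x and y are two parameter tensors of the same shape (the names and shapes here are only illustrative):

import torch

x = torch.randn(4, requires_grad=True)
y = torch.randn(4, requires_grad=True)

with torch.no_grad():
    tmp = x.clone()    # temporary copy of x's current values
    x.copy_(y)         # x now holds y's values
    y.copy_(tmp)       # y now holds x's old values

Because only the values are copied and each storage stays attached to its original tensor, the references autograd holds to x and y remain consistent.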