Question
How to copy files
How do I copy a file in Python?
Question
How do I copy a file in Python?
Solution
shutil
has many methods you can use. One of which is:
import shutil
shutil.copyfile(src, dst)
# 2nd option
shutil.copy(src, dst) # dst can be a folder; use shutil.copy2() to preserve timestamp
src
to a file named dst
. Both src
and dst
need to be the entire filename of the files, including path.IOError
exception will be raised.dst
already exists, it will be replaced.copy
, src
and dst
are path names given as str
s.Another shutil
method to look at is shutil.copy2()
. It's similar but preserves more metadata (e.g. time stamps).
If you use os.path
operations, use copy
rather than copyfile
. copyfile
will only accept strings.
Solution
Function | Copies metadata |
Copies permissions |
Uses file object | Destination may be directory |
---|---|---|---|---|
shutil.copy | No | Yes | No | Yes |
shutil.copyfile | No | No | No | No |
shutil.copy2 | Yes | Yes | No | Yes |
shutil.copyfileobj | No | No | Yes | No |
Solution
copy2(src,dst)
is often more useful than copyfile(src,dst)
because:
dst
to be a directory (instead of the complete target filename), in which case the basename of src
is used for creating the new file;Here is a short example:
import shutil
shutil.copy2('/src/dir/file.ext', '/dst/dir/newname.ext') # complete target filename given
shutil.copy2('/src/file.ext', '/dst/dir') # target filename is /dst/dir/file.ext
Solution
In Python, you can copy the files using
shutil
moduleos
modulesubprocess
moduleimport os
import shutil
import subprocess
shutil
moduleshutil.copyfile
signature
shutil.copyfile(src_file, dest_file, *, follow_symlinks=True)
# example
shutil.copyfile('source.txt', 'destination.txt')
shutil.copy
signature
shutil.copy(src_file, dest_file, *, follow_symlinks=True)
# example
shutil.copy('source.txt', 'destination.txt')
shutil.copy2
signature
shutil.copy2(src_file, dest_file, *, follow_symlinks=True)
# example
shutil.copy2('source.txt', 'destination.txt')
shutil.copyfileobj
signature
shutil.copyfileobj(src_file_object, dest_file_object[, length])
# example
file_src = 'source.txt'
f_src = open(file_src, 'rb')
file_dest = 'destination.txt'
f_dest = open(file_dest, 'wb')
shutil.copyfileobj(f_src, f_dest)
os
moduleos.popen
signature
os.popen(cmd[, mode[, bufsize]])
# example
# In Unix/Linux
os.popen('cp source.txt destination.txt')
# In Windows
os.popen('copy source.txt destination.txt')
os.system
signature
os.system(command)
# In Linux/Unix
os.system('cp source.txt destination.txt')
# In Windows
os.system('copy source.txt destination.txt')
subprocess
modulesubprocess.call
signature
subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False)
# example (WARNING: setting `shell=True` might be a security-risk)
# In Linux/Unix
status = subprocess.call('cp source.txt destination.txt', shell=True)
# In Windows
status = subprocess.call('copy source.txt destination.txt', shell=True)
subprocess.check_output
signature
subprocess.check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False)
# example (WARNING: setting `shell=True` might be a security-risk)
# In Linux/Unix
status = subprocess.check_output('cp source.txt destination.txt', shell=True)
# In Windows
status = subprocess.check_output('copy source.txt destination.txt', shell=True)
Solution
You can use one of the copy functions from the shutil
package:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Function preserves supports accepts copies other permissions directory dest. file obj metadata ―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― shutil.copy ✔ ✔ ☐ ☐ shutil.copy2 ✔ ✔ ☐ ✔ shutil.copyfile ☐ ☐ ☐ ☐ shutil.copyfileobj ☐ ☐ ✔ ☐ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Example:
import shutil
shutil.copy('/etc/hostname', '/var/tmp/testhostname')
Solution
Copying a file is a relatively straightforward operation as shown by the examples below, but you should instead use the shutil stdlib module for that.
def copyfileobj_example(source, dest, buffer_size=1024*1024):
"""
Copy a file from source to dest. source and dest
must be file-like objects, i.e. any object with a read or
write method, like for example StringIO.
"""
while True:
copy_buffer = source.read(buffer_size)
if not copy_buffer:
break
dest.write(copy_buffer)
If you want to copy by filename you could do something like this:
def copyfile_example(source, dest):
# Beware, this example does not handle any edge cases!
with open(source, 'rb') as src, open(dest, 'wb') as dst:
copyfileobj_example(src, dst)
Solution
Use the shutil module.
copyfile(src, dst)
Copy the contents of the file named src to a file named dst. The destination location must be writable; otherwise, an IOError exception will be raised. If dst already exists, it will be replaced. Special files such as character or block devices and pipes cannot be copied with this function. src and dst are path names given as strings.
Take a look at filesys for all the file and directory handling functions available in standard Python modules.
Solution
Directory and File copy example, from Tim Golden's Python Stuff:
import os
import shutil
import tempfile
filename1 = tempfile.mktemp (".txt")
open (filename1, "w").close ()
filename2 = filename1 + ".copy"
print filename1, "=>", filename2
shutil.copy (filename1, filename2)
if os.path.isfile (filename2): print "Success"
dirname1 = tempfile.mktemp (".dir")
os.mkdir (dirname1)
dirname2 = dirname1 + ".copy"
print dirname1, "=>", dirname2
shutil.copytree (dirname1, dirname2)
if os.path.isdir (dirname2): print "Success"
Solution
For small files and using only Python built-ins, you can use the following one-liner:
with open(source, 'rb') as src, open(dest, 'wb') as dst: dst.write(src.read())
This is not optimal way for applications where the file is too large or when memory is critical, thus Swati's answer should be preferred.
Solution
There are two best ways to copy file in Python.
shutil
moduleCode Example:
import shutil
shutil.copyfile('/path/to/file', '/path/to/new/file')
There are other methods available also other than copyfile, like copy, copy2, etc, but copyfile is best in terms of performance,
OS
moduleCode Example:
import os
os.system('cp /path/to/file /path/to/new/file')
Another method is by the use of a subprocess, but it is not preferable as it’s one of the call methods and is not secure.
Solution
Firstly, I made an exhaustive cheat sheet of the shutil methods for your reference.
shutil_methods =
{'copy':['shutil.copyfileobj',
'shutil.copyfile',
'shutil.copymode',
'shutil.copystat',
'shutil.copy',
'shutil.copy2',
'shutil.copytree',],
'move':['shutil.rmtree',
'shutil.move',],
'exception': ['exception shutil.SameFileError',
'exception shutil.Error'],
'others':['shutil.disk_usage',
'shutil.chown',
'shutil.which',
'shutil.ignore_patterns',]
}
Secondly, explaining methods of copy in examples:
shutil.copyfileobj(fsrc, fdst[, length])
manipulate opened objects
In [3]: src = '~/Documents/Head+First+SQL.pdf'
In [4]: dst = '~/desktop'
In [5]: shutil.copyfileobj(src, dst)
AttributeError: 'str' object has no attribute 'read'
# Copy the file object
In [7]: with open(src, 'rb') as f1,open(os.path.join(dst,'test.pdf'), 'wb') as f2:
...: shutil.copyfileobj(f1, f2)
In [8]: os.stat(os.path.join(dst,'test.pdf'))
Out[8]: os.stat_result(st_mode=33188, st_ino=8598319475, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067347, st_mtime=1516067335, st_ctime=1516067345)
shutil.copyfile(src, dst, *, follow_symlinks=True)
Copy and rename
In [9]: shutil.copyfile(src, dst)
IsADirectoryError: [Errno 21] Is a directory: ~/desktop'
# So dst should be a filename instead of a directory name
shutil.copy()
Copy without preseving the metadata
In [10]: shutil.copy(src, dst)
Out[10]: ~/desktop/Head+First+SQL.pdf'
# Check their metadata
In [25]: os.stat(src)
Out[25]: os.stat_result(st_mode=33188, st_ino=597749, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516066425, st_mtime=1493698739, st_ctime=1514871215)
In [26]: os.stat(os.path.join(dst, 'Head+First+SQL.pdf'))
Out[26]: os.stat_result(st_mode=33188, st_ino=8598313736, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516066427, st_mtime=1516066425, st_ctime=1516066425)
# st_atime,st_mtime,st_ctime changed
shutil.copy2()
Copy with preserving the metadata
In [30]: shutil.copy2(src, dst)
Out[30]: ~/desktop/Head+First+SQL.pdf'
In [31]: os.stat(src)
Out[31]: os.stat_result(st_mode=33188, st_ino=597749, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067055, st_mtime=1493698739, st_ctime=1514871215)
In [32]: os.stat(os.path.join(dst, 'Head+First+SQL.pdf'))
Out[32]: os.stat_result(st_mode=33188, st_ino=8598313736, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067063, st_mtime=1493698739, st_ctime=1516067055)
# Preserved st_mtime
shutil.copytree()
Recursively copy an entire directory tree rooted at src, returning the destination directory.
Solution
As of Python 3.5 you can do the following for small files (ie: text files, small jpegs):
from pathlib import Path
source = Path('../path/to/my/file.txt')
destination = Path('../path/where/i/want/to/store/it.txt')
destination.write_bytes(source.read_bytes())
write_bytes
will overwrite whatever was at the destination's location
Solution
You could use os.system('cp nameoffilegeneratedbyprogram /otherdirectory/')
.
Or as I did it,
os.system('cp '+ rawfile + ' rawdata.dat')
where rawfile
is the name that I had generated inside the program.
This is a Linux-only solution.
Solution
Use
open(destination, 'wb').write(open(source, 'rb').read())
Open the source file in read mode, and write to the destination file in write mode.
Solution
Use subprocess.call
to copy the file
from subprocess import call
call("cp -p <file> <file>", shell=True)
Solution
For large files, I read the file line by line and read each line into an array. Then, once the array reached a certain size, append it to a new file.
for line in open("file.txt", "r"):
list.append(line)
if len(list) == 1000000:
output.writelines(list)
del list[:]
Solution
In case you've come this far down. The answer is that you need the entire path and file name
import os
shutil.copy(os.path.join(old_dir, file), os.path.join(new_dir, file))
Solution
Here is a simple way to do it, without any module. It's similar to this answer, but has the benefit to also work if it's a big file that doesn't fit in RAM:
with open('sourcefile', 'rb') as f, open('destfile', 'wb') as g:
while True:
block = f.read(16*1024*1024) # work by blocks of 16 MB
if not block: # end of file
break
g.write(block)
Since we're writing a new file, it does not preserve the modification time, etc.
We can then use os.utime
for this if needed.
Solution
Similar to the accepted answer, the following code block might come in handy if you also want to make sure to create any (non-existent) folders in the path to the destination.
from os import path, makedirs
from shutil import copyfile
makedirs(path.dirname(path.abspath(destination_path)), exist_ok=True)
copyfile(source_path, destination_path)
As the accepted answers notes, these lines will overwrite any file which exists at the destination path, so sometimes it might be useful to also add: if not path.exists(destination_path):
before this code block.
Solution
For the answer everyone recommends, if you prefer not to use the standard modules, or have completely removed them as I've done, preferring more core C methods over poorly written python methods
The way shutil works is symlink/hardlink safe, but is rather slow due to os.path.normpath()
containing a while (nt, mac) or for (posix) loop, used in testing if src
and dst
are the same in shutil.copyfile()
This part is mostly unneeded if you know for certain src
and dst
will never be the same file, otherwise a faster C approach could potentially be used.
(note that just because a module may be C doesn't inherently mean it's faster, know that what you use is actually written well before you use it)
After that initial testing, copyfile()
runs a for loop on a dynamic tuple of (src, dst)
, testing for special files (such as sockets or devices in posix).
Finally, if follow_symlinks
is False, copyfile()
tests if src
is a symlink with os.path.islink()
, which varies between nt.lstat()
or posix.lstat()
(os.lstat()
) on Windows and Linux, or Carbon.File.ResolveAliasFile(s, 0)[2]
on Mac.
If that test resolves True, the core code that copies a symlink/hardlink is:
os.symlink(os.readlink(src), dst)
Hardlinks in posix are done with posix.link()
, which shutil.copyfile()
doesn't call, despite being callable through os.link()
.
(probably because the only way to check for hardlinks is to hashmap the os.lstat()
(st_ino
and st_dev
specifically) of the first inode we know about, and assume that's the hardlink target)
Else, file copying is done via basic file buffers:
with open(src, 'rb') as fsrc:
with open(dst, 'wb') as fdst:
copyfileobj(fsrc, fdst)
(similar to other answers here)
copyfileobj()
is a bit special in that it's buffer safe, using a length
argument to read the file buffer in chunks:
def copyfileobj(fsrc, fdst, length=16*1024):
"""copy data from file-like object fsrc to file-like object fdst"""
while 1:
buf = fsrc.read(length)
if not buf:
break
fdst.write(buf)
Hope this answer helps shed some light on the mystery of file copying using core mechanisms in python. :)
Overall, shutil isn't too poorly written, especially after the second test in copyfile()
, so it's not a horrible choice to use if you're lazy, but the initial tests will be a bit slow for a mass copy due to the minor bloat.
Solution
I would like to propose a different solution.
def copy(source, destination):
with open(source, 'rb') as file:
myFile = file.read()
with open(destination, 'wb') as file:
file.write(myFile)
copy("foo.txt", "bar.txt")
The file is opened, and it's data is written to a new file of your choosing.