Opening a remote file using paramiko in Python is slow [duplicate]
I am using paramiko to open a remote SFTP file in Python. With the file object returned by paramiko, I read the file line by line and process the information. This seems really slow compared to using Python's built-in open function. Following is the code I am using to get the file object.
Using paramiko (slower by 2 times) -
import paramiko
client = paramiko.SSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(myHost, myPort, myUser, myPassword)
sftp = client.open_sftp()
fileObject = sftp.file(fullFilePath, 'rb')
Using the built-in open -
# open is a built-in function; no import is needed
fileObject = open(fullFilePath, 'rb')
Am I missing anything? Is there a way to make reading from the paramiko file object as fast as reading from the file object returned by the built-in open?
Thanks!!
Your problem is most likely caused by the file being a remote object. You've opened it on the server and are requesting one line at a time; because it's not local, each request takes much longer than it would if the file were sitting on your hard drive. The best alternative is probably to copy the file down to a local location first, using Paramiko's SFTP get.
Once you've done that, you can open the file from the local location using the built-in open.
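The download-then-read approach described above could look roughly like the following. This is only a sketch: the function name `process_remote_file_locally` and the `handle_line` callback are made up for illustration, and `sftp` is assumed to be an already-open paramiko SFTPClient such as the one returned by `client.open_sftp()` in the question.

```python
import os
import tempfile


def process_remote_file_locally(sftp, remote_path, handle_line):
    """Download remote_path once, then pass each line to handle_line.

    `sftp` is assumed to be an open paramiko SFTPClient.
    `handle_line` is a hypothetical callback supplied by the caller.
    Returns the number of lines processed.
    """
    # Reserve a temporary local path; delete=False so it can be reopened by name.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        local_path = tmp.name
    try:
        # One bulk transfer instead of one network round trip per line.
        sftp.get(remote_path, local_path)
        count = 0
        with open(local_path, 'rb') as local_file:
            for line in local_file:
                handle_line(line)
                count += 1
        return count
    finally:
        # Remove the local copy once processing is finished.
        os.remove(local_path)
```

With the question's variables, a call might be `process_remote_file_locally(sftp, fullFilePath, my_handler)`, where `my_handler` is whatever per-line processing you already have.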
I was having the same issue and could not afford to copy the file locally for security reasons. I solved it by using a combination of prefetching and BytesIO:
import io

def fetch_file_as_bytesIO(sftp, path):
    """
    Retrieve the file at the given path through the SFTP client, using prefetching.
    :param sftp: the sftp client
    :param path: path of the file to retrieve
    :return: BytesIO with the file content
    """
    with sftp.file(path, mode='rb') as file:
        file_size = file.stat().st_size
        # Request the whole file in the background so read() does not
        # wait on a separate round trip for every chunk.
        file.prefetch(file_size)
        file.set_pipelined()
        return io.BytesIO(file.read(file_size))
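A possible variation on the same in-memory idea, not the method shown above, is to let paramiko's `SFTPClient.getfo` copy the remote file straight into a `BytesIO`. The helper name `fetch_file_with_getfo` is made up for this sketch, and `sftp` is again assumed to be an open paramiko SFTPClient.

```python
import io


def fetch_file_with_getfo(sftp, path):
    """Return the remote file's content as a BytesIO, without writing to local disk.

    `sftp` is assumed to be an open paramiko SFTPClient.
    """
    buffer = io.BytesIO()
    # getfo() copies the remote file into any writable file-like object.
    sftp.getfo(path, buffer)
    # Rewind so the caller can read from the start.
    buffer.seek(0)
    return buffer
```

Because the result is an ordinary in-memory file object, iterating with `for line in fetch_file_with_getfo(sftp, path):` makes no further network requests. The whole file is held in memory, so this suits files that comfortably fit in RAM.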
Here is a way that works by running cat on the remote host through paramiko and reading all the lines at once. It works well for me:
import paramiko
client = paramiko.SSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.WarningPolicy())
client.connect(hostname=host, port=port, username=user, key_filename=ssh_file)
stdin, stdout, stderr = client.exec_command('cat /proc/net/dev')
net_dump = stdout.readlines()
# your entire file is now in net_dump; do as you wish with it below
client.close()
The files I open are quite small so it all depends on your file size. Worth a try :)
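If you go the `cat` route, it may be worth quoting the path and checking the exit status so that a missing file does not silently yield an empty result. The following is a minimal sketch under some assumptions: the remote host runs a POSIX shell, `client` is a connected paramiko SSHClient, and the helper name `read_remote_lines_via_cat` is made up for illustration.

```python
import shlex


def read_remote_lines_via_cat(client, remote_path):
    """Run `cat` on the remote host and return the file's lines as bytes.

    `client` is assumed to be a connected paramiko SSHClient.
    """
    # Quote the path so spaces or shell metacharacters are not interpreted remotely.
    stdin, stdout, stderr = client.exec_command('cat ' + shlex.quote(remote_path))
    # Drain the output before asking for the exit status, so a large
    # file cannot stall the remote command on a full buffer.
    data = stdout.read()
    error_text = stderr.read()
    status = stdout.channel.recv_exit_status()
    if status != 0:
        raise IOError('cat exited with status %d: %r' % (status, error_text))
    return data.splitlines(keepends=True)
```

For the example above, `read_remote_lines_via_cat(client, '/proc/net/dev')` would return the same content as `net_dump`, but as bytes rather than decoded strings.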