Amazon S3 - "Text file busy" error on multithreading Python
I have a Python script that downloads shell scripts from an Amazon S3 server and executes them (each script is 3 GB in size). The function that downloads and executes a file looks like this:
import os
import stat

import boto3

def parse_object_key(key):
    key_parts = key.split(':::')
    return key_parts[1]

def process_file(file):
    client = boto3.client('s3')
    node = parse_object_key(file)
    file_path = "/tmp/" + node + "/tmp.sh"
    os.makedirs(os.path.dirname(file_path))
    client.download_file('category', file, file_path)
    os.chmod(file_path, stat.S_IXUSR)
    os.system(file_path)
The node is unique for each file.
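For illustration only, here is what parse_object_key from the snippet above would return for a hypothetical key (the real key layout is not shown in the question):

# Hypothetical key, used only to illustrate the ':::' split;
# the actual key format is an assumption.
key = "category:::node-42:::tmp.sh"
node = parse_object_key(key)  # returns "node-42", used as the per-file directory name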
I created a loop to execute this:
s3 = boto3.resource('s3')
bucket = s3.Bucket('category')
for object in bucket.objects.page_size(count=50):
    process_file(object.key)
This works perfectly, but when I try to create a separate thread for each file, I get this error:
sh: 1: /path/to/file: Text file busy
The script with threading looks like this:
s3 = boto3.resource('s3')
bucket = s3.Bucket('category')
threads = []
for object in bucket.objects.page_size(count=50):
    t = threading.Thread(target=process_file, args=(object.key,))
    threads.append(t)
    t.start()
for t in threads:
    t.join()
Out of all the threads, only one thread succeeds and the others fail with the "Text file busy" error. Can anyone help me figure out what I am doing incorrectly?
boto3 is not thread-safe, so you cannot re-use the S3 connection for each download. See here for details of a workaround.
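For reference, the workaround usually suggested is to give each thread its own boto3 session and client instead of relying on the shared default session. Below is a minimal sketch along those lines, reusing the bucket name and key parsing from the question; the exact workaround behind the link may differ.

import os
import stat
import threading

import boto3

def parse_object_key(key):
    return key.split(':::')[1]

def process_file(key):
    # Build a session and client inside the thread so that no
    # connection state is shared between threads.
    session = boto3.session.Session()
    client = session.client('s3')

    node = parse_object_key(key)
    work_dir = "/tmp/" + node
    file_path = work_dir + "/tmp.sh"
    os.makedirs(work_dir, exist_ok=True)

    client.download_file('category', key, file_path)
    # Keep the script readable as well as executable so sh can run it.
    os.chmod(file_path, stat.S_IRUSR | stat.S_IXUSR)
    os.system(file_path)

s3 = boto3.resource('s3')
bucket = s3.Bucket('category')

threads = []
for obj in bucket.objects.page_size(count=50):
    t = threading.Thread(target=process_file, args=(obj.key,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()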