{"id":636,"date":"2018-02-13T14:03:04","date_gmt":"2018-02-13T19:03:04","guid":{"rendered":"http:\/\/jasonralph.org\/?p=636"},"modified":"2018-02-13T14:04:30","modified_gmt":"2018-02-13T19:04:30","slug":"python-generator-find-files-with-wildcard","status":"publish","type":"post","link":"https:\/\/jasonralph.org\/?p=636","title":{"rendered":"Python Generator Find Files With Wildcard"},"content":{"rendered":"<p>This is a neat way to generate the names of files in a directory tree that match a specific pattern. I use it to list the files exported from Hive so they can be loaded into S3.<\/p>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\nimport fnmatch\r\nimport os\r\n\r\ndef find_files(directory, pattern):\r\n    # Walk the tree under directory and lazily yield each matching path.\r\n    for root, dirs, files in os.walk(directory):\r\n        for basename in sorted(files):\r\n            if fnmatch.fnmatch(basename, pattern):\r\n                filename = os.path.join(root, basename)\r\n                yield filename\r\n<\/pre>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\n# awss3 and log are assumed to be helper objects defined elsewhere.\r\nlocal_dir = '\/mnt\/share\/etl\/date\/'\r\nfor path in find_files(local_dir, '*.gz'):\r\n    key = path[1:]  # drop the leading slash to form the S3 key\r\n    try:\r\n        awss3.upload(key, path)\r\n        log.write('uploading file: [{0}] to S3'.format(path))\r\n    except Exception as e:\r\n        # {0} is the exception, {1} is the file path\r\n        log.write('ERROR: {0} uploading file: [{1}] to S3'.format(e, path), 'error')\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>This is a neat way to generate the names of files in a directory tree that match a specific pattern. I use it to list the files exported from Hive so they can be loaded into S3. 
def find_files(directory, pattern): for root, dirs, files in os.walk(directory): for basename in sorted(files): if fnmatch.fnmatch(basename, pattern): filename = os.path.join(root, basename) yield [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[38,1,4],"tags":[61,24],"class_list":["post-636","post","type-post","status-publish","format-standard","hentry","category-coding-thoughts","category-general-code","category-python","tag-generator","tag-python-2"],"_links":{"self":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts\/636","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=636"}],"version-history":[{"count":1,"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts\/636\/revisions"}],"predecessor-version":[{"id":637,"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts\/636\/revisions\/637"}],"wp:attachment":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=636"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=636"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=636"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}