{"id":449,"date":"2016-04-03T23:38:12","date_gmt":"2016-04-04T03:38:12","guid":{"rendered":"http:\/\/jasonralph.org\/?p=449"},"modified":"2016-04-24T17:14:46","modified_gmt":"2016-04-24T21:14:46","slug":"python-download-videos-for-offline-viewing-pytube-lib","status":"publish","type":"post","link":"https:\/\/jasonralph.org\/?p=449","title":{"rendered":"PYTHON &#8211; Script to download YouTube videos for offline viewing"},"content":{"rendered":"<p>I was interested in viewing this video of a news conference (USENIX 2016) on my trip home on the Metro North train, NYC => CT.  The trip is about an hour and 10 minutes from Manhattan&#8217;s Grand Central Terminal to Milford CT, express train that is.  My concern was that I would have choppy internet service on the way, since I recently updated my laptop and the built-in Verizon Mobile card was not activated yet.  I would need to use my ATT iPhone as a hotspot, which proved to be very shaky at times.  A colleague of mine recommended a website for making YouTube videos available for offline viewing.  The name of this site was:<\/p>\n<p><a href=\"http:\/\/www.keepvid.com\">http:\/\/www.keepvid.com<\/a><\/p>\n<p>Right off the rip I was concerned that this site was infested with malware and any other bullshit associated with a free video ripping service.  I used the site and was able to create a download of the video I was interested in, however who knows how sick my Windows-based machine just got. I could have contracted anything from this site.<\/p>\n<p>I thought about this and figured there had to be a better way, or a Python lib for this, and lo and behold, a search came up with PYTUBE:<br \/>\n<a href=\"https:\/\/github.com\/nficano\/pytube\">https:\/\/github.com\/nficano\/pytube<\/a><\/p>\n<p>This library has some interesting features and blew away the keepvid site in terms of flexibility. Here is some explanation of what this library can do.  
Please have a look at the examples below; I will do my best to narrate them.<\/p>\n<p>Here I use pip to install the pytube lib. You can ignore the DEPRECATION: warning about my outdated Python that blares at you for being such an idiot.<\/p>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\n[root@jasonralph ~]# pip install pytube\r\nDEPRECATION: Python 2.6 is no longer supported by the Python core team, please upgrade your Python. A future version of pip will drop support for Python 2.6\r\nCollecting pytube\r\n  Using cached pytube-6.1.8.tar.gz\r\nInstalling collected packages: pytube\r\n  Running setup.py install for pytube ... done\r\nSuccessfully installed pytube-6.1.8\r\n<\/pre>\n<p>Next up, you can see that I am setting a variable yt (this is the video you want to download).  Using Python&#8217;s pprint (pretty print) module, you can call pprint(yt.get_videos()) to see what formats are available for download.<\/p>\n<p>Please have a look at the comments in the code for a bit more detail on what is going on. In this example I am passing Pulp_Fiction.mp4 as the filename I want the download to have.<\/p>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\n[jralph@jasonralph ~]$ cat py_video_downloader.py\r\nfrom pytube import YouTube\r\nfrom pprint import pprint\r\n\r\nyt = YouTube(\"http:\/\/www.youtube.com\/watch?v=Ik-RsDGPI5Y\")\r\n\r\npprint(yt.get_videos())\r\n\r\nprint(yt.filename)\r\n\r\nyt.set_filename('Pulp_Fiction.mp4')\r\n\r\n# Notice that the list is ordered by lowest resolution to highest. 
If you\r\n# wanted the highest resolution available for a specific file type, you\r\n# can simply do:\r\nprint(yt.filter('mp4')[-1])\r\n# <Video: H.264 (.mp4) - 720p>\r\n\r\n# You can also get all videos for a given resolution\r\npprint(yt.filter(resolution='720p'))\r\n\r\n\r\nvideo = yt.get('mp4', '720p')\r\n\r\n# NOTE: get() can only be used if and only if one object matches your criteria.\r\n# for example:\r\n\r\npprint(yt.videos)\r\n\r\n\r\nvideo.download('\/home\/jralph\/')\r\n<\/pre>\n<p>Ok so here is what it looks like when you execute the program:<\/p>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\n[23:32:59] jralph@jasonralph:~ $ python py_video_downloader.py\r\n[<Video: MPEG-4 Visual (.3gp) - 144p - Simple>,\r\n <Video: MPEG-4 Visual (.3gp) - 240p - Simple>,\r\n <Video: Sorenson H.263 (.flv) - 240p - N\/A>,\r\n <Video: H.264 (.mp4) - 360p - Baseline>,\r\n <Video: H.264 (.mp4) - 720p - High>,\r\n <Video: VP8 (.webm) - 360p - N\/A>]\r\nPulp Fiction - Dancing Scene\r\n<Video: H.264 (.mp4) - 720p - High>\r\n[<Video: H.264 (.mp4) - 720p - High>]\r\n\/usr\/lib\/python2.6\/site-packages\/pytube\/api.py:141: DeprecationWarning: videos property deprecated. 
Use `get_videos()` instead.\r\n  \"instead.\", DeprecationWarning)\r\n[<Video: MPEG-4 Visual (.3gp) - 144p - Simple>,\r\n <Video: MPEG-4 Visual (.3gp) - 240p - Simple>,\r\n <Video: Sorenson H.263 (.flv) - 240p - N\/A>,\r\n <Video: H.264 (.mp4) - 360p - Baseline>,\r\n <Video: H.264 (.mp4) - 720p - High>,\r\n <Video: VP8 (.webm) - 360p - N\/A>]\r\n<\/pre>\n<p>As you can see, we now have a new file with the video we asked for, ready to watch without a streaming internet connection. Note the doubled .mp4.mp4 extension in the listing: it looks like pytube adds the extension itself, so the name passed to set_filename() should probably leave it off. Here is an ls to show:<\/p>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\n[23:33:02] jralph@jasonralph:~ $ ls -ltr\r\ntotal 32352\r\ndrwxr-xr-x 2 root   root       4096 Mar 20 22:50 image_staging\r\ndrwxr-xr-x 2 root   root       4096 Mar 20 22:51 JR.ORG_SITE_BACKUPS\r\n-rwxrwxr-x 1 jralph jralph      691 Apr  3 00:34 py_video_downloader.py\r\n-rw-rw-r-- 1 jralph jralph 33113243 Apr  3 23:33 Pulp_Fiction.mp4.mp4\r\n<\/pre>\n<p>As always, I am sure there are better ways to do this and I am sure there is cleaner code. Most of this code was taken right from the author&#8217;s site, and he is a badass. Here is his link:<\/p>\n<p><a href=\"https:\/\/github.com\/nficano\/pytube\">https:\/\/github.com\/nficano\/pytube<\/a><\/p>\n<p>Hope you liked it,<br \/>\nJ$0N<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I was interested in viewing this video of a news conference (USENIX 2016) on my trip home on the Metro North train, NYC => CT. The trip is about an hour and 10 minutes from Manhattan&#8217;s Grand Central Terminal to Milford CT, express train that is. 
My concern was that I would have choppy internet service [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[8,35,12,24,42,41],"class_list":["post-449","post","type-post","status-publish","format-standard","hentry","category-python","tag-bash","tag-code","tag-linux","tag-python-2","tag-pytube","tag-youtube"],"_links":{"self":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts\/449","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=449"}],"version-history":[{"count":16,"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts\/449\/revisions"}],"predecessor-version":[{"id":471,"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts\/449\/revisions\/471"}],"wp:attachment":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=449"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=449"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=449"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}