AWS CLI Max Concurrent Requests Tuning

In this post I would like to go over how I tuned a test server for copying / syncing files from the local filesystem to S3 over the internet. If you have ever had to do this, you will have noticed that as the file count grows, so does the time it takes to upload the files to S3. After some web searching I found out that the AWS CLI lets you tune its config to allow more concurrency than the default.
AWS CLI S3 Config

The parameter that we will be playing with is max_concurrent_requests.
This has a default value of 10, which means the CLI sends at most 10 requests to S3 at the same time. Let's see if we can make some changes to that value and get some performance gains. My test setup is as follows:

I have 56 files of 102MB each in the test directory:

For the first test I am going to run aws s3 sync with no changes, so out of the box it should have max_concurrent_requests set to 10. Let's use the Linux time command to measure how long it takes to copy all 56 files to S3. I will delete the folder on S3 before each iteration to keep the test the same. You can also view the connections to port 443 via netstat and count them to see what's going on. Across all the tests, my best result came with a value of 250. So you will need to play with the setting to get the best result, and the best value will change along with the server specs.
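To make the netstat counting concrete, here is a minimal Python sketch. It assumes Linux-style `netstat -ant` output, where the remote address is the fifth column and the state is the sixth; the function names are made up for this illustration.

```python
import subprocess


def count_https_connections(netstat_output):
    """Count ESTABLISHED TCP connections whose remote port is 443."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed row layout: proto, recv-q, send-q, local, remote, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        remote, state = fields[4], fields[5]
        if remote.endswith(":443") and state == "ESTABLISHED":
            count += 1
    return count


def current_https_connections():
    """Run netstat once and count the HTTPS connections it reports."""
    result = subprocess.run(
        ["netstat", "-ant"], capture_output=True, text=True, check=True
    )
    return count_https_connections(result.stdout)
```

Calling `current_https_connections()` in a second terminal while a sync is running should give a number that tracks the configured concurrency, although other HTTPS traffic on the box is counted too.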

1. 1m25.919s with the default configuration:

2. Now let's set max_concurrent_requests to 20 and try again. You can do this with the command below; after running it we can see a little gain.

3. Bumped up to 50 shows a bit more gain:

4. Bumped up to 100, I start to notice that we lose some speed:

5. Bumped up to 250 we see the best result so far:

6. Bumped up to 500, we lose performance, most likely due to the machine resources.
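
The whole series of runs above can be scripted. Below is a rough Python sketch of that loop, not the exact commands used for the numbers in this post. It relies on `aws configure set default.s3.max_concurrent_requests N`, `aws s3 sync` and `aws s3 rm --recursive`; the local directory and S3 URI are placeholders you would supply, and the helper names are made up here.

```python
import subprocess
import time


def configure_cmd(value, profile="default"):
    """Command that sets max_concurrent_requests for the given profile."""
    key = "{}.s3.max_concurrent_requests".format(profile)
    return ["aws", "configure", "set", key, str(value)]


def sync_cmd(local_dir, s3_uri):
    """Command that syncs a local directory up to S3."""
    return ["aws", "s3", "sync", local_dir, s3_uri]


def cleanup_cmd(s3_uri):
    """Command that removes the previous upload so each run starts clean."""
    return ["aws", "s3", "rm", s3_uri, "--recursive"]


def benchmark(local_dir, s3_uri, values=(10, 20, 50, 100, 250, 500)):
    """Time one full sync per setting and return {value: seconds}."""
    results = {}
    for value in values:
        subprocess.run(configure_cmd(value), check=True)
        subprocess.run(cleanup_cmd(s3_uri), check=True)
        start = time.perf_counter()
        subprocess.run(
            sync_cmd(local_dir, s3_uri), check=True, stdout=subprocess.DEVNULL
        )
        results[value] = time.perf_counter() - start
    return results
```

Something like `benchmark("/path/to/test_dir", "s3://your-bucket/test/")` would then return the wall-clock seconds per setting. A single run per value is noisy, so repeating each one a few times gives a fairer comparison.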

So to wrap up, you can tune the number of concurrent requests allowed from the AWS CLI to S3. You will need to play with this setting to get the best results for your machine.

Python Backup WORDPRESS Site / DATABASE and HTML

I have this blog hosted on a LINODE dedicated LINUX server. It’s about 10 dollars a month for a 1 core system with about 250GB of disk space and 1GB of RAM. This server runs the common LAMP stack, and I needed a quick and dirty script to back up the MYSQL database and the PHP code contained in the /var/www/html folder. I wanted the script to compress the contents of both and move them into a directory named with the current date. See the comments below outlining the code and the action of running it.
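
For readers who want the general shape of such a script, here is a minimal sketch of the idea rather than the exact script from this post. It assumes mysqldump is on the PATH and reads its credentials from ~/.my.cnf; the function names and the date format of the directory are choices made for this example.

```python
import datetime
import gzip
import os
import shutil
import subprocess


def dated_backup_dir(root, today=None):
    """Create and return root/YYYY-MM-DD (date format is an assumption)."""
    today = today or datetime.date.today()
    path = os.path.join(root, today.strftime("%Y-%m-%d"))
    os.makedirs(path, exist_ok=True)
    return path


def zip_directory(src_dir, dest_dir, name="html"):
    """Zip the contents of src_dir into dest_dir/name.zip."""
    # make_archive appends ".zip" itself and returns the archive path
    return shutil.make_archive(os.path.join(dest_dir, name), "zip", root_dir=src_dir)


def dump_database(db_name, dest_dir):
    """Stream mysqldump output into dest_dir/db_name.sql.gz."""
    out_path = os.path.join(dest_dir, db_name + ".sql.gz")
    # Credentials are assumed to come from ~/.my.cnf, not the command line
    dump = subprocess.Popen(["mysqldump", db_name], stdout=subprocess.PIPE)
    with gzip.open(out_path, "wb") as gz:
        shutil.copyfileobj(dump.stdout, gz)
    if dump.wait() != 0:
        raise RuntimeError("mysqldump failed for " + db_name)
    return out_path
```

A run would call `dated_backup_dir("/backups")` first and then pass the returned directory to `zip_directory("/var/www/html", ...)` and `dump_database("wordpress", ...)`, where `/backups` and `wordpress` are placeholder names.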

So you can see we generated 2 files in a dated directory. I chose to use both zip and gzip for compression. To view the contents you can run the normal Linux commands to extract the files.

So there you have it, I can now tar up the entire dated directory for easy offsite backup of my entire site jasonralph.org. Hope this helps someone, feel free to copy the source code and change it at will.
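
The tar step can also be done from Python with the standard tarfile module. This is only a sketch with a made-up function name. It writes a plain, uncompressed tar because the files inside are already compressed.

```python
import os
import tarfile


def tar_backup_dir(dated_dir):
    """Bundle a dated backup directory into <dated_dir>.tar next to it."""
    dated_dir = dated_dir.rstrip(os.sep)
    archive = dated_dir + ".tar"
    with tarfile.open(archive, "w") as tar:
        # arcname keeps only the directory name, not the full absolute path
        tar.add(dated_dir, arcname=os.path.basename(dated_dir))
    return archive
```

The resulting single file is then easy to copy offsite with scp, rsync or aws s3 cp.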

Best,
Jason

PYTHON – Script to download youtube videos for offline viewing

I was interested in viewing this video of a news conference (USENIX 2016) on my trip home on the Metro North Train, NYC => CT. The trip is about an hour and 10 minutes from Manhattan’s Grand Central Terminal to Milford CT, express train that is. My concern was that I would have choppy internet service on the way, since I recently updated my laptop and the built in Verizon Mobile card was not activated yet. I would need to use my ATT iPhone as a hotspot, which proved to be very shaky at times. A colleague of mine recommended a website for making youtube videos available for offline viewing. The name of this site was:

http://www.keepvid.com

Right off the rip I was concerned that this site was infested with malware and any other bullshit associated with a free video ripping service. I used the site and was able to create a download of the video I was interested in, however who knows how sick my Windows-based machine just got. I could have contracted anything from this site.

I thought about this and said, there has to be a better way, or a python lib for this, and lo and behold a search came up with PYTUBE:
https://github.com/nficano/pytube

This library had some interesting features and literally blew away the keepvid site in regards to flexibility. Here is some explanation of what this library can do. Please have a look at the examples below, I will do my best to narrate them.

Here I use PIP to install the PYTUBE lib, you can ignore the DEPRECATION: warning for my outdated python that blares at you for being such an idiot.

Next up you can see that I am setting a variable yt (this is the video you want to download). Using Python’s pprint (pretty print) lib you can run pprint(yt.get_videos()) to see what formats are available for download.

Please have a look at the comments in the code for a bit more detail on what is going on. In this example I am using Pulp_Fiction.mp4 as the filename I want the video to have when downloaded.
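
As a rough sketch of the flow described above, and not the script from this post: the code below assumes the older pytube API that exposed get_videos(), set_filename(), get() and download(). Newer pytube releases changed that interface, so treat every call here as an assumption to check against your installed version. The function name and the youtube_cls parameter are made up for this example; the parameter only exists so the logic can be exercised without touching the network.

```python
def download_video(url, filename, out_dir, extension="mp4",
                   resolution="720p", youtube_cls=None):
    """Download one video using the (assumed) older pytube API."""
    if youtube_cls is None:
        # Imported lazily so this file loads even without pytube installed
        from pytube import YouTube as youtube_cls
    yt = youtube_cls(url)
    # As I understand it, the old API appended the extension itself,
    # so "Pulp_Fiction" would end up on disk as Pulp_Fiction.mp4
    yt.set_filename(filename)
    video = yt.get(extension, resolution)
    video.download(out_dir)
    return video
```

With the old library installed, a call would look like `download_video(url, "Pulp_Fiction", "/tmp/")`, where url is the watch link of the video you want.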

Ok so here is what it looks like when you execute the program:

As you can see we have a new file with the video we asked for, ready to watch without a streaming internet connection. Here is an ls to show:

As always, I am sure there are better ways to do this and I am sure there is cleaner code. Most of this code was taken right from the author's site, who is a badass. Here is his link:

https://github.com/nficano/pytube

Hope you liked,
J$0N

CYGWIN – clear.exe from scratch C Program

Hello all,

TL;DR – I wrote a C program to use as clear.exe on cygwin, I know about CTRL-L but I wanted a binary for scripts, and I am used to typing clear at a terminal

I was recently forced to use Windows as my daily computer at work for all types of PCI compliance reasons, reasons that I do not wish to dive into. So I am now on Windows, but I spend most of my day in a BASH shell on a remote system, so it’s not that bad. However, traversing around Windows with CMD or PS sucked. I was introduced to cygwin and I was really impressed: I was able to use a ton of tools that I was already familiar with on the Linux CLI. Native SSH and SCP were huge for me at this time, and the ability to customize the colors with the TTY settings was awesome.

One issue I ran into was that I like to use the clear command in my scripts and from the CLI when I want a fresh terminal. Well, I figured I could just relaunch the cygwin installer, search for clear.exe and install it. This was not the case. Instead I was instructed to install ncurses and the associated libraries. I followed these recommendations and never got clear.exe to work.

So I did what any computer scientist would do: research how to write a C program for this, compile it and copy it to my $PATH, and wouldn’t you know it, it works!!

So here it is in all its glory. clear.exe for Windows and CYGWIN:

### Clear no workie ###

Ok so we can see the clear program was not working in the above example. So I wrote the following lines of C code below and compiled them.
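
To show what a clear program has to do, here is the idea as a short Python illustration. It is not a translation of the C program from this post. It assumes the terminal honors ANSI escape sequences, and the constant and function names are made up here.

```python
import sys

# ANSI escape sequences: erase the whole screen, then move the cursor home
CLEAR_SCREEN = "\x1b[2J"
CURSOR_HOME = "\x1b[H"


def clear(stream=None):
    """Write the clear-screen sequences to the given stream (stdout by default)."""
    if stream is None:
        stream = sys.stdout
    stream.write(CLEAR_SCREEN + CURSOR_HOME)
    stream.flush()
```

A C version of the same idea would print the same two sequences to stdout.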

OK, cool, now you will need to get GCC on your Windows machine from the CYGWIN installer, so just rerun the installer and search for GCC, you can install it from there like so.


Ok now for the awesome stuff, it’s time to compile this with GCC and create a new clear.exe binary for Windows and CYGWIN.

Now that we see we can use the new executable to clear out cygwin shell, let’s copy it somewhere so we can just type “clear” to get what we want.

And there you have it, have fun!!!

Jason

MD5 check via ssh to be sure data has rsynced properly then delete 5 days of data from collector

EDIT: A colleague assisted with this.

BASH Clear Semaphores Linux

Rsync Market data from Multicast Collector To SAN – BASH

Here is a snippet from the log:

Find Large Files Specified By Size Linux – BASH

You can execute this and get your results to stdout.
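
For comparison, the same idea can be sketched in Python. This is not the BASH script from this post, only a minimal equivalent with a made-up function name that walks a directory tree and reports files at or above a size threshold, largest first.

```python
import os


def find_large_files(root, min_bytes):
    """Return (size, path) pairs for files >= min_bytes, largest first."""
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are measured, not followed
                size = os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable, skip it
            if size >= min_bytes:
                results.append((size, path))
    return sorted(results, reverse=True)
```

Printing each pair from `find_large_files("/var", 100 * 1024 * 1024)` would list everything of 100MB and up under /var on stdout.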

Confirm String exists in log file and is not stale longer than 600 seconds – PYTHON
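
A minimal sketch of this kind of check follows. It interprets "stale" as the log file's modification time being older than the limit, which is an assumption, and the function name is made up for this example.

```python
import os
import time


def log_is_healthy(path, needle, max_age_seconds=600, now=None):
    """True if the log was modified recently enough and contains needle."""
    now = time.time() if now is None else now
    age = now - os.path.getmtime(path)
    if age > max_age_seconds:
        return False  # nothing has written to the log lately
    with open(path, "r", errors="replace") as handle:
        return any(needle in line for line in handle)
```

A monitoring wrapper could call this from cron and exit non-zero when it returns False.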

Examine Log File To Be Sure O values Are Not Received – BASH