AWS CLI Max Concurrent Requests Tuning

In this post I would like to go over how I tuned a test server for copying / syncing files from the local filesystem to S3 over the internet. If you ever had the task of doing this, you will notice that as the file count grows, so does the time it takes to upload the files to S3. After some web searching I found out that AWS allows you to tune the config to allow more concurrency than default.
AWS CLI S3 Config

The parameter that we will be playing with is max_concurrent_requests
This has a default value of 10, which allows only 10 requests to the AWS API for S3. Lets see if we can make some changes to that value and get some performance gains. My test setup is as follows:

I have 56 102MB files in the test directory:

For the first test I am going to run aws s3 sync with no changes, so out of the box it should have 10 max_concurrent_requests. Lets use the Linux time command to gather the time result to copy all 56 files to S3. I will delete the folder on S3 with each iteration to keep the test the same. You can also view the 443 requests via netstat and count them as well to show whats going on. In all the tests my best result was 250. So as you can see you will need to play with the settings to get the best result, these settings will change along with the server specs.

1. 1m25.919s with the default configuration:

2. Now lets set the max conqurent requests to 20 and try again, you can do this with the command below, after running we can see a little gain.

3. Bumped up to 50 shows a bit more gain:

4. Bumped up to 100, I start to notice that we lost some speed:

5. Bumped up to 250 we see the best result so far:

6. Bumped up to 500, we lose performance, most likely due to the machine resources.

So to wrap up, you can tune the amount of concurrent requests allowed from the aws cli to s3, you will need to play with this setting to get the best results for your machine.

Postgres Long Running Active Queries Send To Slack

I needed a utility to alert our team when any long running queries were running on a production postgres cluster. I came up with the following python code that achieves just that. This would alert slack if an active query exceeds 45 mins. The script takes in user parameters as well, I will demonstrate the way to call it. Hope it helps someone.

CRON CALL:

CODE:

SLACK MESSAGE:

Python Function Execute Subprocess With Timeout

I have a project that rsync’s data from an RPM repository for a local version of this repo. The issue I was faced with was the remote mirror would sometimes stop the rsync due to overloaded network or other unforeseen issues. I wanted to use rsyncs hashing algorithm to have it start right where it left off so I wrote a function to do this. If 900 seconds was hit it usually meant there was an issue with the transfer. I also want to state here that I observed the rsync stop serving issue on many mirrors so it was not just an issue with the TCP network. I use this in production and it logs each iteration or restart. The function below will also kill the current rsync so multiple copies are not running at the same time. I also only wanted to perform 5 iterations of rsync upon error or timeout so I use a while loop here.

Here are the individual rsync commands in the INI configuration.

Here is how I call the execute_jobs_timeout() function:

The function:

Log Snippet showing each command executing:

CENTOS6 Postgres pg_upgrade 9 to 11 – In Place – Link – No Copy – Limited Disk Space

I wanted to share my experience with upgrading postgres database server from major version 9 to 11. I am showing the steps that I took to get many servers in dev and production upgraded with limited disk space(not enough space to copy). I am hoping this will help with the problems I faced when testing this procedure. Using the –link parameter has drawbacks as noted in the documentation, however we perform full VM backups of each server so we can always restore from backup if the upgrade fails and we will not need to start the pg9.3 database again.

https://www.postgresql.org/docs/11/pgupgrade.html

-k
--link

use hard links instead of copying files to the new cluster
If you ran pg_upgrade with --link, the data files are shared between the old and new cluster. If you started the new cluster, the new server has written to those shared files and it is unsafe to use the old cluster.

Before we get started make a backup of the files pg_hba.conf and postgresql.conf for later use, you will need to use them later to reconstruct the pg11 configs.

Use WGET to grab the RPMS from https://yum.postgresql.org

Install the RPMS for postgres11 that we just downloaded

We will create the data location for postgres11 where the files will be hardlinked and not copied. You can see the tablespace disk locations and the index locations from the pg9.3 install. Its important to create the new pg11 data directory on the same filesystem since we will be using the –link parameter and it uses hardlinks which will not traverse filesystems.

We will need to init a postgres database in our new location on disk data11.

Now we are ready to stop pg9.3 and check pg_upgrade compatibility. pg_upgrade ships with a –check argument that will check the compatibility of the clusters and be sure the upgrade will work before changing any files. Lets stop pg9.3 and run the pg_upgrade with the –check parameter.


Ok checks have passed and the cluster versions are ready for upgrade, lets run this without the –check parameter and upgrade postgres.

OK the pg_upgrade code completed successfully and has generated 2 scripts. One to analyze the new pg11 cluster to get stats for the query planner and vacuum. The other to cleanup and remove the old pg9.3 locations on disk. Let’s start pg11, we will need to create an override file to tell pg11 where the data11 data lives, then we should be able to start postgres and check some things and verify our upgrade.


OK we can see we have pg11 running and we can run the generated scripts to cleanup, but lets take a look at the data and index directories to see what the upgrade produced.

We can view the shell scripts that pg_upgrade produced and cleanup the old pg9.3 references and run the analyze vacuums.


This looks good, lets execute them and cleanup any pg9.3 references as well as remove the pg9.3 rpms.

Remove the pg9.3 rpms and references, set the new data location in the .pgsql_profile.

You can now view the pg_hba.conf and postgresql.conf you saved in /root and add whats needed to the new pg11 configs.

That’s it!!

SINOPIA NPM allow connections to GITHUB as well as the NPM registry

SINOPIA LINK HERE
We use SINOPIA as a proxy on our internal network behind the firewall to allow users to install NODE packages without an internet connection. We basically run sinopia on a machine that has access to the internet and the clients point to the server to install packages that are not locally available. We have been running into issues where installs that needed access to github would fail with something like this:

As you can see, we are getting choked at:

To get around this we need to change the config.yml on the server to allow proxies to github, here is the final configuration. Hope this helps other users as we had a fun time trying to figure it out. Pay attention to the uplinks section and the proxy requests where github is defined.

PSQL Connect To AWS Redshift From Windows 10 PowerShell

Coming from a completely Linux background, I was tasked with connecting to a aws redshift cluster or a postgres cluster via Windows powershell and PSQL. I knew it was possible and searching the internet came up with CMD prompt solutions, when I attempted via powershell, I was faced with the following error below, you will need to install postgres on windows10 to get access to the psql binary, you can get it here:
https://www.postgresql.org/download/windows/

Turns out a colleague of mine and I figured out you will need to set the variable PGCLIENTENCODING via the powershell command line. This was expected but we could not nail down the syntax, we found it.

Once this is set, you can connect to PG as normal.

Python Generator Find Files With Wildcard

This is a neat way to generate file names in a directory that match a specific pattern, I use this to generate a list of files exported out of hive to load into S3.

Nagios Python Plugin Check If File Is Stale

Wrote this simple plugin to check if a log file was stale on a server using nagios and nrpe. This plugin checks multiple files with the app. naming convention.

POSTGRES – Top 100 Tables In Tablespace

I had a situation where I needed to find the top 100 largest tables in a certain tablespace on a postgres 9 database, in my case we archive tables into an archive1 tablespace. This query will find all the largest relations in the archive1 tablespace. Its important to swap out ‘archive1’ with whatever tablespace you are trying to list.

Hope this helps you out, took some time to get it to work.

Nagios Check Postgres Table Date Column Against now()

I had a situation where a daily sync of a table from one database to another was failing. This table was updated daily so the query should return something like this when it was synced correctly:

I use Nagios very heavily and I setup a custom plugin to check the query’s date against today’s date, this should warn or critical based on user supplied arguments. Here is what a failure looks like when running from the nagios servers command line. This worked well at alerting me when the sync failed, this was integrated into the nagios subsystem and emails and slack alerts are generated as expected.