HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out

I recently had an issue where one of our EMR clusters failed to bootstrap its Python modules via pip. I checked the logs and saw that we had run into the read timeout quoted in the title.

I wanted pip to survive a timeout instead of dying, and I also wanted it to retry on failure. By adding the following to my bootstrap.sh, I was able to lengthen pip's socket timeout and bump the retries up to 10. I have not seen the issue since I applied the new settings.
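Here is a minimal sketch of what such a bootstrap.sh fragment could look like. The retry count of 10 matches what I described above; the 100-second timeout and the package names are placeholders I am assuming for illustration, so adjust them to your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical bootstrap.sh fragment; the timeout value is an assumption.

# pip picks these up as defaults for --timeout and --retries.
export PIP_DEFAULT_TIMEOUT=100   # socket timeout in seconds (assumed value)
export PIP_RETRIES=10            # maximum retries per connection

# Equivalent explicit form, with placeholder package names:
# sudo pip install --default-timeout=100 --retries 10 boto3 requests
```

The exported variables only affect pip runs that inherit this shell's environment. If your bootstrap calls pip through sudo, the environment may be reset, so passing the flags on the command line as in the commented line is the safer choice.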

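From the pip help page: --timeout <sec> sets the socket timeout (default 15 seconds), and --retries <retries> sets the maximum number of retries each connection should attempt (default 5 times).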