I recently had an issue where one of our EMR clusters failed to bootstrap its Python modules via PIP. I checked the logs and saw that we had run into the following error:
    HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out
I wanted PIP not to die when a read timed out, and I also wanted it to retry on failure. By adding the following to my bootstrap.sh, I raised the PIP socket timeout to 100 seconds and bumped the retries up to 10. I have not seen the issue since I applied the new settings.
    sudo python3 -m pip --timeout 100 --retries 10 install --upgrade pip
    sudo python3 -m pip --timeout 100 --retries 10 install
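PIP's --retries setting applies per connection, so a bootstrap script could also retry the whole command as an extra safeguard. This is not something I used in my bootstrap.sh; it is a minimal sketch, and the retry function, its argument order, and the attempt and delay values are all hypothetical.

```shell
# Hypothetical helper (not part of the original bootstrap.sh):
# run a command up to $1 times, sleeping $2 seconds between attempts.
retry() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1
    # Re-run the command until it exits successfully.
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Command failed after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "Attempt ${attempt} failed, retrying in ${delay}s..." >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

With this defined, a call might look like:

    retry 3 30 sudo python3 -m pip --timeout 100 --retries 10 install --upgrade pip

The wrapper only helps with transient failures; a command that fails for a permanent reason will simply fail again on every attempt.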
From the PIP help page:
    --retries <retries>  Maximum number of retries each connection should attempt (default 5 times).
    --timeout <sec>      Set the socket timeout (default 15 seconds).
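The same two options can also be set through environment variables instead of flags, since PIP reads variables of the form PIP_<OPTION NAME>. I have not tested this on EMR, so treat it as a sketch; the values simply mirror the flags above.

```shell
# Equivalent to passing --timeout 100 --retries 10 on every pip invocation.
# --default-timeout is the long form of --timeout, hence PIP_DEFAULT_TIMEOUT.
export PIP_DEFAULT_TIMEOUT=100
export PIP_RETRIES=10
```

Note that sudo usually resets the environment, so the variables may not reach a PIP process started with sudo unless they are passed through (for example with sudo -E, depending on the sudoers policy).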