{"id":569,"date":"2017-09-16T19:08:57","date_gmt":"2017-09-16T23:08:57","guid":{"rendered":"http:\/\/jasonralph.org\/?p=569"},"modified":"2020-01-31T19:51:54","modified_gmt":"2020-02-01T00:51:54","slug":"amazon-redshift-long-running-query-alert-to-slack","status":"publish","type":"post","link":"https:\/\/jasonralph.org\/?p=569","title":{"rendered":"Amazon Redshift Long Running Query Alert to Slack"},"content":{"rendered":"<p>This python code when called with a user that can query the STV_RECENTS table will check the duration on a current running query against the threshold set by the config in microseconds and send an alert to slack if it exceeds 30 minutes.  I have it cronned up and running every 30 minutes. <\/p>\n<p>CLI example:<\/p>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\n\r\n[Prompt#]>python rs_long_running_query_alert.py\r\n\r\n<\/pre>\n<p>You will need slackclient:<br \/>\nhttps:\/\/pypi.python.org\/pypi\/slackclient<br \/>\nYou will need psycopg2:<br \/>\nhttps:\/\/pypi.python.org\/pypi\/psycopg2<\/p>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\n#!\/usr\/bin\/python2.6\r\n__author__ = \"Jason Ralph\"\r\n\r\nfrom datetime import timedelta\r\nimport os\r\nimport psycopg2\r\nimport ConfigParser\r\nfrom slackclient import SlackClient\r\nslack_token = os.environ[\"SLACK_API_TOKEN\"]\r\nsc = SlackClient(slack_token)\r\n\r\n\r\nconfig = ConfigParser.ConfigParser()\r\nconfig.read(('%s\/qa.ini') % (os.environ['QA_INI_PATH']))\r\nconfig.sections()\r\n\r\n# redshift\r\nrs_host = config.get('redshift', 'hostname')\r\nrs_port = config.get('redshift', 'port')\r\nrs_database = config.get('redshift', 'database')\r\nrs_sys_user = config.get('redshift', 'sys_user')\r\nrs_sys_pass = config.get('redshift', 'sys_password')\r\nrs_app_lq_thresh = config.get('redshift', 'lq_app_threshold')\r\n\r\n\r\n\"\"\"\r\nGet running queries over set threshold set in config in microseconds.\r\n\"\"\"\r\n\r\n\r\ndef get_long_running_queries():\r\n    try:\r\n        conn = psycopg2.connect((\"dbname=%s user=%s host=%s password=%s port=%s\") %\r\n                                (rs_database, rs_sys_user, rs_host, rs_sys_pass, rs_port))\r\n    except psycopg2.Error as e:\r\n        print e\r\n\r\n    cast_thresh = int(rs_app_lq_thresh)\r\n    sql = (\"select userid, \\\r\n                   status, \\\r\n                   starttime, \\\r\n                   duration, \\\r\n                   user_name, \\\r\n                   db_name, \\\r\n                   query, \\\r\n                   pid from stv_recents \\\r\n                   where duration >= %s and status='Running';\") % cast_thresh\r\n    cursor = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)\r\n    cursor.execute(sql)\r\n    count = 0\r\n\r\n    while True:\r\n        row = cursor.fetchone()\r\n        if row is None:\r\n            break\r\n        userid =    row['userid']\r\n        status =    row['status']\r\n        starttime = row['starttime']\r\n        duration =  row['duration']\r\n        user_name = row['user_name']\r\n        db_name =   row['db_name']\r\n        query =     row['query']\r\n        pid =       row['pid']\r\n        human_date_string = timedelta(microseconds=duration)\r\n\r\n        if (user_name == 'user_1'\r\n            or user_name == 'user_2'\r\n            or user_name == 'user_3' and duration >= rs_app_lq_thresh):\r\n            count += 1\r\n            msg_items = []\r\n            msg_items.append('%s: App Limit Breached - Long Running Query Alert\\n' % rs_database)\r\n            msg_items.append('PID: %s\\n' % pid)\r\n            msg_items.append('COUNT: %s\\n' % count)\r\n            msg_items.append('USERID: %s\\n' % userid)\r\n            msg_items.append('STATUS: %s\\n' % status)\r\n            msg_items.append('STARTTIME: %s\\n' % starttime)\r\n            msg_items.append('DURATION: %s\\n' % human_date_string)\r\n            msg_items.append('USER_NAME: %s\\n' % user_name)\r\n            msg_items.append('DB_NAME: %s\\n' % db_name)\r\n            msg_items.append('QUERY: %s' % query)\r\n            msg = ''.join(msg_items)\r\n            sc.api_call('chat.postMessage', channel='#python', text=\"'%s' :tada:\") % msg\r\n    conn.close()\r\n\r\n\r\ndef main():\r\n    get_long_running_queries()\r\n\r\n\r\nif __name__ == '__main__':\r\n    main()\r\n<\/pre>\n<p>INI file:<\/p>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\n[redshift]\r\nhostname = redshift.amazonaws.com\r\nport = 5439\r\ndatabase = redshift\r\nuser = user\r\npassword = password\r\nschema = public\r\nsys_user = rs_sys\r\nsys_password = password\r\nlq_app_threshold = 3600000000\r\n<\/pre>\n<p>Slack message example:<\/p>\n<pre class=\"theme:solarized-dark lang:default decode:true \" >\r\n'db-redshift' APP [10:15 AM] \r\nredshift_dev: Long Running Query Alert\r\nPID: 10723\r\nCOUNT: 1\r\nUSERID: 100\r\nSTATUS: Running             \r\nSTARTTIME: 2017-09-16 11:00:01.417040\r\nDURATION: 3:15:00.735017\r\nUSER_NAME: admin                                             \r\nDB_NAME: database                                      \r\nQUERY: vacuum;\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>This python code when called with a user that can query the STV_RECENTS table will check the duration on a current running query against the threshold set by the config in microseconds and send an alert to slack if it exceeds 30 minutes. I have it cronned up and running every 30 minutes. CLI example: [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[53,55,54,56],"class_list":["post-569","post","type-post","status-publish","format-standard","hentry","category-python","tag-amazon","tag-query","tag-redshit","tag-slack"],"_links":{"self":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts\/569","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=569"}],"version-history":[{"count":16,"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts\/569\/revisions"}],"predecessor-version":[{"id":820,"href":"https:\/\/jasonralph.org\/index.php?rest_route=\/wp\/v2\/posts\/569\/revisions\/820"}],"wp:attachment":[{"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=569"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=569"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jasonralph.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=569"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}