Capturing Script Logs

At GameChanger, we use scripts in many of our flows such as during deploys or running Ansible while booting new instances. Some of these flows are critical to our operations and require good visibility. Traditionally we tried to send both metrics and logs to DataDog where we can both monitor what scripts are doing and […]

At GameChanger, we use scripts in many of our flows such as during deploys
or running Ansible while booting new instances. Some of these flows are
critical to our operations and require good visibility. Traditionally we
tried to send both metrics and logs to DataDog where we can both monitor what
scripts are doing and set alerts on various metrics. DataDog integration,
however, is not always available such as during the instance boot. In addition,
sometimes a DataDog alert does not provide enough context of what failed and most
importantly why it failed. It simply states that some threshold was reached. To
investigate the issue requires more manual steps by looking for the appropriate
log which is not always intuitive. That is why recently at GameChanger we
started integrating Slack error reporting directly into some of our critical
scripts. This post describes the exact mechanism of how that is achieved since
it uses a really cool bash trick.

Useful Commands

First some background on some of the useful commands.

tee Anyone?

Lets drink some tea

Tee is a really useful command. It captures an output from a script
and both echoes it to standard out as well as forwards it to a file.
This allows to both see an output as well as capture the same output
for later use. For example:

$ echo hello there | tee hello.log
hello there
$ cat hello.log
hello there

Process Substitution

Lets swap some things

Some commands only work with files. For example a classic diff:

$ echo one > one.txt
$ echo two > two.txt
$ diff -u one.txt two.txt
--- one.txt
+++ two.txt
@@ -1 +1 @@
-one
+two

Sometimes however it is convenient to be able to refer to an output of commands
as a file without manually creating a temporary file. This is what process
substitution allows to do. Same example as above but with process substitution:

$ diff -u <(echo one) <(echo two)
--- /dev/fd/63  2020-11-30 12:33:08.539905663 -0500
+++ /dev/fd/62  2020-11-30 12:33:08.540272967 -0500
@@ -1 +1 @@
-one
+two

This is the output form of process substitution which uses <(command) syntax.
It stores an output of a command into a temporary file for the duration
of a command. There is also input form of process substitution which uses
>(command) syntax. It similarly creates a temporary file to which data can be
written to and process substitution will forward that content to a command. For
example:

$ echo "
> 2
> 1" > >(sort)
1
2

exec

Lets execute things

Most common form of exec is to simply execute another command by replacing
the process:

$ bash -c 'exec echo foo'
foo

However exec also overtakes process file descriptors which allows exec to
adjust what a process does with its file descriptors which includes stdout
and stderr. For example exec can capture script output into a file:

$ cat test.sh
exec &> log.txt
echo stdout > /dev/stdout
echo stderr > /dev/stderr
$ bash test.sh
$ cat log.txt
stdout
stderr

Similar results can be achieved by manually sending script output to a file
(e.g. command &> log.txt) however exec allows to do that directly within
the script.

Putting Everything Together

Lets put everything together

Putting all commands together allows to do something like:

LOG_PATH=/tmp/deploy_$(date +%s).log
exec &> >(tee -a $LOG_PATH)

function error_msg {
    echo "Deploy last 50 logs:
$(tail -n 50 $LOG_PATH)
"
}

function slack_error {
    msg=${1:-}
    slacksend 
        --channel=alerts 
        --color=danger 
        --snippet="$(error_msg)" 
        --filename="deploy.log" 
        "$msg" 
        || true
}

deploy || (slack_error "terminating rolling deployment at $HOSTNAME"; exit 1)

Here is what the above does:

  • Shows script output as normal via tee. This allows normal script log
    aggregation to work as normal.
  • In addition it fully captures script stdout and stderr to a log file via
    exec, process substitution and tee
  • When the deploy command fails, it sends a slack message with the last 50
    lines of the script logs

Example slack message

Wrap Up

It might not seem like much, but immediately seeing what failed and why by
including the failure logs makes for a much more pleasant debugging experience.
Most importantly it does not adjust how the script is used so no other changes
are required to other systems since all the enhancements are baked directly in
into the script itself.

Hopefully you will find some of these cool bash capabilities useful as well.

Source: GameChanger