Using Apache Zeppelin’s shell (sh) interpreter is a great way to run and share shell scripts. You can use your favorite command-line tools, such as sed and awk, to prepare data for Zeppelin’s visualizations. By default, however, shell output is shown as plain text: it is not displayed in a tabular format and does not use Zeppelin’s built-in visualizations.

There are three things required to display your shell output as tables or graphs:
1. Output columns must be tab (\t) delimited
2. Output records must be separated by newline (\n) characters
3. %table must be echoed/printed as part of your script
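
As a minimal sketch of these three requirements, the snippet below prints a tiny table by hand. The function name print_table and the fruit/count rows are made up for illustration; in a Zeppelin notebook the code would sit in a paragraph that runs under the shell interpreter (typically one that starts with %sh).

```shell
# Hypothetical helper: prints a small, hard-coded table in the format
# Zeppelin expects. The rows are placeholder data, not real results.
print_table() {
  # Requirement 3: emit the %table directive (%% is a literal % in printf).
  printf '%%table\n'
  # Requirements 1 and 2: tab-delimited columns, one record per line.
  printf 'fruit\tcount\n'   # header row
  printf 'apple\t3\n'
  printf 'pear\t5\n'
}

print_table
```

Using printf rather than echo keeps the handling of \t and \n consistent across shells, since echo treats backslash escapes differently from one shell to another.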

For your tables and graphs to be labeled correctly, the first line of the table data must be a header row. Here is an example that graphs an NBA team’s (the Utah Jazz) point totals by using shell and awk commands to process a local text file. In this example, comma delimiters are converted to tabs (\t), and the header row (first record) of the text file is included in the output to ensure that graphs and tables are labeled correctly.

echo "%table"

cat /data/data/nba-elo/nba-elo.csv \
  | tr ',' '\t' \
  | awk 'BEGIN {FS="\t"; OFS="\t"}  NR == 1 || $10 == "Jazz" {print $5, $11}'
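
The same idea can be tried without the NBA file. The following is only a sketch under stated assumptions: csv_to_table is a hypothetical helper, and the three-column sample (season, team, pts) is invented for illustration and does not mirror the layout of nba-elo.csv. Here awk splits on commas directly, so the tr step is not needed.

```shell
# Hypothetical helper: reads CSV on stdin, keeps the header row plus the rows
# whose second column equals the team name passed as $1, and prints
# columns 1 and 3 separated by tabs.
csv_to_table() {
  printf '%%table\n'
  awk -F',' -v OFS='\t' -v team="$1" 'NR == 1 || $2 == team { print $1, $3 }'
}

# Made-up sample data, for illustration only.
printf 'season,team,pts\n2015,Jazz,94\n2015,Spurs,101\n2016,Jazz,99\n' \
  | csv_to_table "Jazz"
```

Like the original pipeline, this splits on every comma, so it assumes that no field contains a quoted comma.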


Creating Graphs with the Apache Zeppelin Shell Interpreter