How do you search for strings within a zip archive?
I’m tinkering with EPUB3 files, and I wanted to be able to find certain strings within .epub files, so I had a look around and immediately found zgrep and family. The trouble was that zgrep assumes a single gzip-compressed file, not a zip archive containing many files.
So, without further ado, I wrote the following script, which I called, naturally, zipgrep. It uses grep and unzip, which it assumes to be available on the PATH. Not wanting to have to pick through the argument list, I decided to mark the end of arguments to grep with the traditional ‘--’, after which I could stack up as many zip file names as I liked.
It was a case of not enough time in the library; or Google, in this case. As soon as I had it working, I discovered the original zipgrep.
All was not lost. The original zipgrep handles a single archive using egrep and unzip, with the nice wrinkle of optional sets of filenames to include in, or exclude from, the search. However, I liked the ability to search multiple zip archives, and grep can be converted to any of its relatives with an appropriate flag, so I decided to hang on to son of zipgrep. All I needed was a new name: hence zargrep.
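As a quick illustration of the "appropriate flag" point, here is a small sketch using the standard -F (fixed strings, as fgrep) and -E (extended regular expressions, as egrep) options. The sample text is made up for the demonstration.

```sh
# Sample input: one line with a literal dot, one without.
sample='a.b
axb'

# Default grep uses basic regular expressions, so "." matches any character.
printf '%s\n' "$sample" | grep -c 'a.b'

# -F treats the pattern as a fixed string (fgrep behaviour): "." is literal.
printf '%s\n' "$sample" | grep -c -F 'a.b'

# -E enables extended regular expressions (egrep behaviour): alternation works unescaped.
printf '%s\n' "$sample" | grep -c -E 'a(x|\.)b'
```

Since the script passes everything before the marker straight to grep, something like `zargrep -E 'nav|toc' -- book.epub` (with book.epub standing in for a real file) should give egrep-style matching.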
You can retrieve it here. It has been tested on OS X against multiple EPUB3 files.
Because they are zip files, this should also work for jar files, but I haven’t yet tried it.
#!/bin/bash
# Greps files in a zip archive.
# Same argument sequence as for grep, except that
# zip file arguments must be separated from flags and
# patterns by --. If no -- is found in the argument list, returns error.
# Needs bash rather than plain sh: it uses arrays, (( )) and ${var:0:1}.

usage() {
    echo Usage: >&2
    echo "$0 <grep flags> <pattern> -- zipfiles ..." >&2
}

# Collect everything before the first -- as arguments for grep.
declare -a args
filesmarked=
for (( i=0; $# > 0; i++ ))
do
    if [ "$1" != "--" ]; then
        args[$i]="$1"
        shift
    else
        filesmarked=1
        shift
        break
    fi
done

if [ -z "$filesmarked" ]; then
    echo "No '--' marker for zipfiles args." >&2
    usage
    exit 1
fi

# Scratch directory for unpacking; removed again when the script exits.
tmpfile=/tmp/zipgrep$$
rm -rf "$tmpfile"
mkdir "$tmpfile"
trap 'rm -rf "$tmpfile"' EXIT

wd=$(pwd)
cd "$tmpfile"

while [ $# -gt 0 ]; do
    zipfile="$1"
    zfile="$1"
    shift
    # If zipfile is not absolute, set it relative to wd
    if [ "${zipfile:0:1}" != / ]; then
        zipfile="$wd/${zipfile}"
    fi
    unzip "$zipfile" >/dev/null
    result=$(find . -type f -print0|xargs -0 grep "${args[@]}")
    if [ -n "$result" ]; then
        echo "zip: $zfile"
        echo "$result"
    fi
    # Start the next archive with an empty scratch directory.
    cd "$wd"
    rm -rf "$tmpfile"
    mkdir "$tmpfile"
    cd "$tmpfile"
done
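To see the ‘--’ convention on its own, here is a minimal sketch that only reports how an argument list would be divided. The helper name show_split and the file names are hypothetical and not part of the script. It is written in plain sh, so it prints the two groups rather than storing them in an array.

```sh
# Hypothetical helper: shows which arguments would go to grep
# and which would be treated as zip archives.
show_split() {
    seen_marker=0
    for arg in "$@"; do
        if [ "$seen_marker" -eq 0 ] && [ "$arg" = "--" ]; then
            # Only the first -- acts as the separator.
            seen_marker=1
        elif [ "$seen_marker" -eq 0 ]; then
            printf 'grep arg: %s\n' "$arg"
        else
            printf 'archive:  %s\n' "$arg"
        fi
    done
    # As in the script, a missing marker counts as an error.
    [ "$seen_marker" -eq 1 ]
}

show_split -i -n 'toc' -- book1.epub book2.epub
```

The flags and the pattern land in the first group and the two archive names in the second, which is how the script decides what to hand to grep and what to unzip.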