zargrep: grep files in a zip archive

How do you search for strings within a zip archive?

I’m tinkering with EPUB3 files, and I wanted to find certain strings within .epub files, so I had a look around and immediately found zgrep and family. The trouble was that zgrep assumes a single compressed file (as produced by gzip), not a zip archive holding many files.

So, without further ado, I wrote the following script, which I called, naturally, zipgrep. It uses grep and unzip, which it assumes to be available on the PATH.  Not wanting to have to pick through the argument list, I decided to mark the end of arguments to grep with the traditional ‘--’, after which I could stack up as many zip file names as I liked.
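As a rough illustration of the ‘--’ convention on its own, here is a minimal bash sketch that splits an argument list at the first marker. The names split_at_marker, grep_args and zip_files are made up for this example, and the file names in the demo call are placeholders.

```bash
#!/bin/bash
# Sketch: split "$@" at the first "--".
# split_at_marker, grep_args and zip_files are illustrative names only.
split_at_marker() {
  grep_args=()
  zip_files=()
  local seen_marker=0
  while [ $# -gt 0 ]; do
    if [ "$1" = "--" ]; then
      seen_marker=1
      shift
      break
    fi
    grep_args+=("$1")   # everything before the marker goes to grep
    shift
  done
  # Whatever is left after the marker is treated as archive names.
  zip_files=("$@")
  [ "$seen_marker" -eq 1 ]   # non-zero status if no "--" was given
}

# Placeholder arguments, just to show the split.
split_at_marker -i -n 'nav' -- book1.epub book2.epub
printf 'grep args: %s\n' "${grep_args[*]}"
printf 'zip files: %s\n' "${zip_files[*]}"
```

Using arrays rather than a flat string keeps patterns that contain spaces intact when they are later handed to grep.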

It was a case of not enough time in the library; or Google, in this case.  As soon as I had it working, I discovered the original zipgrep.

All was not lost.  The original zipgrep handles a single archive using egrep and unzip, with the nice wrinkle of optional sets of filenames to include in, or exclude from, the search.  However, I liked the ability to search multiple zip archives, and grep can be converted to any of its relatives with an appropriate flag, so I decided to hang on to son of zipgrep.  All I needed was a new name: hence zargrep.
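To make the point about grep's relatives concrete, here is a small sketch of the -E and -F flags, which select the behaviour of egrep and fgrep respectively. The sample strings are invented for the demonstration, and the zargrep invocation in the final comment is a hypothetical one with a placeholder file name.

```bash
#!/bin/bash
# Sketch: one grep, three matching modes selected by flag.
# The sample lines are made up for illustration.
sample=$(printf '%s\n' 'toc.ncx' 'nav.xhtml' 'a.b' 'axb')

# Default: basic regular expressions.
printf '%s\n' "$sample" | grep 'nav'

# -E: extended regular expressions, as egrep does (alternation with |).
printf '%s\n' "$sample" | grep -E 'toc|nav'

# -F: fixed strings, as fgrep does; the dot is a literal dot here,
# so 'axb' is not matched.
printf '%s\n' "$sample" | grep -F 'a.b'

# With zargrep the same flags would go before the marker, for example:
#   zargrep -E 'toc|nav' -- book.epub
```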

You can retrieve it here. It has been tested on OS X against multiple EPUB3 files.

Because they are zip files, this should also work for jar files, but I haven’t yet tried it.
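One quick way to check whether a given .jar or .epub is something unzip can read is to run its archive test first. The following is only a sketch: is_zip_readable is a hypothetical helper, not part of zargrep, and it assumes unzip is on the PATH.

```bash
#!/bin/bash
# Sketch: check that unzip can read a file before searching it.
# is_zip_readable is a hypothetical helper, not part of zargrep.
is_zip_readable() {
  # -t tests the archive's integrity; -q keeps the output short.
  unzip -tq "$1" >/dev/null 2>&1
}

# Report on each file named on the command line.
for candidate in "$@"; do
  if is_zip_readable "$candidate"; then
    echo "ok: $candidate"
  else
    echo "not a readable zip archive: $candidate" >&2
  fi
done
```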

 #!/bin/bash
   
 # Greps files in a zip archive.  
 # Same argument sequence as for grep, except that  
 # zip file arguments must be separated from flags and  
 # patterns by --. If no -- is found in the argument list, returns error.  
   
 usage() {  
   echo Usage: >&2  
   echo "$0 <grep flags> <pattern> -- zipfiles ..." >&2
 }  
   
 declare -a args  
   
 i=0  
 for (( i=0; $# > 0; i++ ))  
 do  
   if [ "$1" != "--" ]; then  
     args[$i]="$1"  
     shift  
   else  
     filesmarked=1  
     shift  
     break  
   fi  
 done  
   
 if [ -z "$filesmarked" ]; then  
   echo "No '--' marker for zipfiles args." >&2
   usage  
   exit 1  
 fi  
   
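 # Scratch directory for the extracted archive contents.
 tmpfile=/tmp/zipgrep$$
 rm -rf "$tmpfile"
 mkdir "$tmpfile" || exit 1

 # Clean up the scratch directory however the script exits.
 trap 'rm -rf "$tmpfile"' EXIT

 wd=$(pwd)
 cd "$tmpfile" || exit 1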
   
 while [ $# -gt 0 ]; do  
   zipfile="$1"  
   zfile="$1"  
   shift  
   # If zipfile is not absolute, set it relative to wd  
   if [ "${zipfile:0:1}" != / ]; then  
     zipfile="$wd/${zipfile}"  
   fi  
   unzip "$zipfile" >/dev/null  
   result=$(find . -type f -print0|xargs -0 grep "${args[@]}")  
   if [ -n "$result" ]; then  
     echo "zip: $zfile"  
     echo "$result"  
   fi  
   # Empty the scratch directory before the next archive.
   cd "$wd" || exit 1
   rm -rf "$tmpfile"
   mkdir "$tmpfile" || exit 1
   cd "$tmpfile" || exit 1
 done  
