Monday 12 September 2011

Searching LaTeX files

I'm making the final amendments to my thesis at the moment. One of my examiners suggested a change of terminology, so I needed a convenient way to find every instance of a word in a group of LaTeX files.

Doing this with grep alone is possible, but it returns a lot of false positives from comments unless you use more complex regular expressions, which are difficult to write and maintain. Another alternative is to pipe one instance of grep into another.
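For comparison, here is a rough sketch of what the grep-to-grep approach might look like. The search_tex helper and the sample.tex file are made up for illustration, and the sketch assumes a grep that supports -H (GNU and BSD grep both do). It only drops lines that are nothing but a comment, so a trailing comment on a live line still shows up as a false positive.

```shell
# search_tex PATTERN FILE...: hypothetical helper, not part of the original workflow.
search_tex() {
    pattern=$1
    shift
    # First grep drops lines that are nothing but a comment (-H keeps the file name);
    # the second grep does the whole-word search on what is left.
    grep -H -v '^[[:space:]]*%' "$@" | grep -w -- "$pattern"
}

# Made-up sample file, only so the sketch can be run as-is.
printf '%s\n' '% colour is mentioned in a comment' 'The colour is red. % colour again' > sample.tex
search_tex colour sample.tex
```

The full-line comment is filtered out, but the second line is reported even if the word only occurs after the %, which is the weakness that a dedicated comment filter avoids.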

For what I'm doing, it seemed easier to split the task into two smaller pieces: a Perl script to remove LaTeX comments, and egrep to do the search.

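#!/usr/bin/env perl
use strict;
use warnings;

for my $arg (@ARGV) {
    # Three-argument open with error checking; skip unreadable files.
    open(my $fh, '<', $arg) or do { warn "Cannot open $arg: $!\n"; next; };
    print "$arg\n";
    while (my $line = <$fh>) {
        chomp $line;
        # Strip from the first unescaped % to the end of the line.
        # Backslash pairs such as \% and \\ are skipped as units, so a
        # literal percent sign (\%) is kept.
        $line =~ s/^((?:[^\\%]|\\.)*)%.*$/$1/;
        # Prefix each line with its file name, as grep does.
        print "$arg:\t$line\n";
    }
    close $fh;
}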

The script is called like so:

latex-comment-filter ${FILES} | egrep -w --color "${SEARCHPATTERN}"
