I am developing a program to import CSV files from my banks and
convert them into ledger format for import into the accounting. The
first program imports the BBVA Compass credit card data. Wrote the
program. Included tests for the basic operation. Included a few very
simple integration tests. All good.
Then it was time to create a program to import CSV files from the BBVA Compass checking account. Of course I simply copied the first program to the second so that I would start with a working example and then would modify it into working for the second. The original was named bbva-import-cc and the fork I named bbva-import-bank.
Next I needed to fork the tests so that there would be tests for the original bbva-import-cc and new tests for bbva-import-bank. Of course I had not named the original tests uniquely. Therefore I first needed to rename those files to associate them with the cc program.
cd ../t/
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ ls -1 *.t
help.t
invalid-options.t
load.t
transactions.t
version.t
The rename command is perfect for this. It takes a sed-like
substitution expression (it is really a perl script, so the pattern is
a Perl regular expression) to do the conversion. Let's build up the
rename by trying things without doing the rename and crafting the
command in place until it does what we want. Use the -v option to
show what it is doing. Use the -n option to say not-really, don't
actually do it, just show what would be done. Try this and see what
would result without actually doing anything.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ rename -v -n s/.t/-cc.t/ *.t
rename(help.t, help-cc.t)
rename(invalid-options.t, invalid-o-cc.tions.t)
rename(load.t, load-cc.t)
rename(transactions.t, transa-cc.tions.t)
rename(version.t, version-cc.t)
Nope. Missed. The . in a regular expression matches any character.
Therefore .t matched the "pt" in "options" and the "ct" in
"transactions". Let's backslash quote it so that it must match a
literal dot.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ rename -v -n s/\.t/-cc.t/ *.t
rename(help.t, help-cc.t)
rename(invalid-options.t, invalid-o-cc.tions.t)
rename(load.t, load-cc.t)
rename(transactions.t, transa-cc.tions.t)
rename(version.t, version-cc.t)
Nope. Missed. The \. was not quoted, and therefore the shell treated
the backslash as its own quoting character and removed it before
rename ever saw it. Must quote it. Here it does not matter if we use
single or double quotes. As there is no other expansion desired I
used single quotes.
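To see what the shell actually hands to a program in each case, here is a minimal sketch that uses printf as a stand-in for rename. The variable names are only for illustration.

```shell
# Capture each form of the argument exactly as a program would receive it.
unquoted=$(printf '%s' s/\.t/-cc.t/)    # the shell strips the backslash
single=$(printf '%s' 's/\.t/-cc.t/')    # single quotes keep it
double=$(printf '%s' "s/\.t/-cc.t/")    # double quotes keep a backslash before "."

printf 'unquoted: %s\nsingle:   %s\ndouble:   %s\n' "$unquoted" "$single" "$double"
```

Only the unquoted form loses the backslash, which is why the second attempt behaved exactly like the first.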
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ rename -v -n 's/\.t/-cc.t/' *.t
rename(help.t, help-cc.t)
rename(invalid-options.t, invalid-options-cc.t)
rename(load.t, load-cc.t)
rename(transactions.t, transactions-cc.t)
rename(version.t, version-cc.t)
That now looks correct. Therefore removed the -n option so that it will now actually do the rename.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ rename -v 's/\.t/-cc.t/' *.t
help.t renamed as help-cc.t
invalid-options.t renamed as invalid-options-cc.t
load.t renamed as load-cc.t
transactions.t renamed as transactions-cc.t
version.t renamed as version-cc.t
Looks like the renames were correct. Check it.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ ls -1 *.t
help-cc.t
invalid-options-cc.t
load-cc.t
transactions-cc.t
version-cc.t
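As an aside, not every system has this perl-based rename. Some install a different rename with a different syntax. Here is a rough sketch of the same preview-then-do approach using only a plain shell loop. It runs in a scratch directory with empty stand-in files, and assumes mktemp -d is available.

```shell
# Scratch directory with empty stand-ins for the original test files.
cd "$(mktemp -d)" || exit 1
touch help.t invalid-options.t load.t transactions.t version.t

# Preview: ${f%.t} strips a trailing ".t" only, so "options" is never touched.
for f in *.t; do echo mv "$f" "${f%.t}-cc.t"; done

# Looks right, so drop the echo and do it for real.
for f in *.t; do mv "$f" "${f%.t}-cc.t"; done
ls -1
```

Because the suffix is stripped only at the end of the name, this also behaves for a name with an earlier ".t" in it, where the unanchored s/\.t/-cc.t/ would change the first match instead.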
Perfect! All of the files are now named for the cc program. Now
let's fork those into versions for the new bank program. Let's run a
for loop on the command line on the files we want to copy. Just echo
them out initially so that we can see where we are starting. Then
hack it into what we want.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ for f in *-cc.*; do echo $f; done
help-cc.t
invalid-options-cc.t
load-cc.t
transactions-cc.t
version-cc.t
Yes. Matches the source files we want to copy. Let's modify the file names from cc to bank. Again sed is perfect for this modification.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ for f in *-cc.*; do n=$(echo $f | sed s/-cc/-bank/); echo cp $f $n; done
cp help-cc.t help-bank.t
cp invalid-options-cc.t invalid-options-bank.t
cp load-cc.t load-bank.t
cp transactions-cc.t transactions-bank.t
cp version-cc.t version-bank.t
Looks exactly like what we want to do. Do it. Remove the echo from
the cp command. That will invoke it for actually doing the copy.
Since it is nice to have some feedback that it is doing something I am
going to add -v to the cp command so that it will show us what it is
doing here. But really since we previewed it above this is not needed
as we know exactly what it is going to do. The -v option is a GNU
extension to the cp command and not available everywhere. But here
we are on the command line and will know if it works or does not.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ for f in *-cc.*; do n=$(echo $f | sed s/-cc/-bank/); cp -v $f $n; done
'help-cc.t' -> 'help-bank.t'
'invalid-options-cc.t' -> 'invalid-options-bank.t'
'load-cc.t' -> 'load-bank.t'
'transactions-cc.t' -> 'transactions-bank.t'
'version-cc.t' -> 'version-bank.t'
Looks good. Check it.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ ls -1 *.t
help-bank.t
help-cc.t
invalid-options-bank.t
invalid-options-cc.t
load-bank.t
load-cc.t
transactions-bank.t
transactions-cc.t
version-bank.t
version-cc.t
Perfect! Now we just need to modify the contents of the *-bank
files. The content of them calls over to the bbva-import-cc program
and we want them to call over to the new bbva-import-bank instead.
There are endless different ways to do this. Do not get hung up on one perfect way. This is a one-time throw away sequence of command lines to get this one-time task done. It doesn't need to be perfect. It doesn't need to be best. It just needs to be good enough. Afterwards we will have the result we need and how we got here is not important.
First I like to check the matches of the pattern I am going to use to review and verify that what I am changing is what I want to change. Let's grep the file contents and double check our matches.
Here I am using "-cc" with the starting "-" because that will be very
unique and will avoid matching accidental occurrences of "cc" in other
places. But "-cc" looks like an option to grep. Therefore I need to
use the grep -e PATTERN option to tell grep that "-cc" is a pattern
and not an option. The command grep -e -cc says to grep for "-cc" as
the pattern.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ grep -e -cc *-bank.t
help-bank.t:my $output = `perl $srcdir/../src/bbva-import-cc.pl --help`;
invalid-options-bank.t:my $output = `perl $srcdir/../src/bbva-import-cc.pl --foo 2>&1`;
invalid-options-bank.t:$output = `perl $srcdir/../src/bbva-import-cc.pl -x 2>&1`;
load-bank.t:require_ok("$srcdir/../src/bbva-import-cc.pl");
transactions-bank.t:my $output = `perl $srcdir/../src/bbva-import-cc.pl transactions1.csv`;
version-bank.t:my $output = `perl $srcdir/../src/bbva-import-cc.pl --version`;
version-bank.t:$output = `perl $srcdir/../src/bbva-import-cc --version`;
version-bank.t:like($output,qr/bbva-import-cc \d/, "version number is a number");
I review the matches. Looks like there are no confusing strings that I need to worry about. Looks like a very simple task to edit those from cc to bank.
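The -e option is one way to protect a pattern that starts with a dash. Another conventional way, assuming a grep that follows the usual option conventions, is to use "--" to mark the end of options. A small sketch against a stand-in file (sample.t is a made-up name, not one of the real tests):

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'perl bbva-import-cc.pl --help' 'no match here' > sample.t

# Both forms tell grep that "-cc" is the pattern, not an option.
with_e=$(grep -e -cc sample.t)
with_dashes=$(grep -- -cc sample.t)
printf '%s\n' "$with_e" "$with_dashes"
```

Either spelling is fine for a one-off like this. Use whichever you remember first.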
Let's use sed to change all occurrences of "-cc" to "-bank" in those
files. Let's preview the change first. I like previewing actions
before doing them when hacking something together on the command
line. Here let's substitute "-bank" for "-cc" and let's use the
'g'lobal option to change it every time it appears on a line. The
basic command would be this:
sed 's/-cc/-bank/g' # do not really run this
But if we previewed that with the files we wanted to change:
sed 's/-cc/-bank/g' ./*-bank.t # do not really run this
That would produce a lot of output. It would stream the entire file
contents to the terminal output. Everything would scroll off. So
let's not do the above. Instead, grep that output for just the lines
we want to see. This we can do and it will be small.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ sed 's/-cc/-bank/g' ./*-bank.t | grep -e -bank
my $output = `perl $srcdir/../src/bbva-import-bank.pl --help`;
my $output = `perl $srcdir/../src/bbva-import-bank.pl --foo 2>&1`;
$output = `perl $srcdir/../src/bbva-import-bank.pl -x 2>&1`;
require_ok("$srcdir/../src/bbva-import-bank.pl");
my $output = `perl $srcdir/../src/bbva-import-bank.pl transactions1.csv`;
my $output = `perl $srcdir/../src/bbva-import-bank.pl --version`;
$output = `perl $srcdir/../src/bbva-import-bank --version`;
like($output,qr/bbva-import-bank \d/, "version number is a number");
As an aside I could be fancy and do the sed+grep all in one sed
command. But that would require modifying the sed command. I would
need to use -n to tell sed not to print lines and need to add the
'p'rint flag to the substitution. Then I would need to remove both
again for the real edit. Filtering it through a later grep command
allows the sed part we want later to be exactly as it needs to be.
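For the curious, the all-in-one sed preview would look roughly like this. It is a sketch run against a one-off stand-in file in a scratch directory, not the real tests.

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'use Test::More;' 'require_ok("src/bbva-import-cc.pl");' > load-bank.t

# -n suppresses the default output; the p flag prints only the lines
# where the substitution actually changed something.
preview=$(sed -n 's/-cc/-bank/gp' load-bank.t)
printf '%s\n' "$preview"
```

It works, but both the -n and the p have to come back out before the real edit, which is one more chance to fumble the command.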
And so now let's use the GNU sed --in-place extension to edit those
files in place.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ sed --in-place 's/-cc/-bank/g' ./*-bank.t
And then check the results.
rwp@angst:~/src/bank-import-stuff/bbva-import/t$ grep -e -bank ./*-bank.t
./help-bank.t:my $output = `perl $srcdir/../src/bbva-import-bank.pl --help`;
./invalid-options-bank.t:my $output = `perl $srcdir/../src/bbva-import-bank.pl --foo 2>&1`;
./invalid-options-bank.t:$output = `perl $srcdir/../src/bbva-import-bank.pl -x 2>&1`;
./load-bank.t:require_ok("$srcdir/../src/bbva-import-bank.pl");
./transactions-bank.t:my $output = `perl $srcdir/../src/bbva-import-bank.pl transactions1.csv`;
./version-bank.t:my $output = `perl $srcdir/../src/bbva-import-bank.pl --version`;
./version-bank.t:$output = `perl $srcdir/../src/bbva-import-bank --version`;
./version-bank.t:like($output,qr/bbva-import-bank \d/, "version number is a number");
Perfect! Exactly what we wanted.
As another aside, when I originally did this for real I didn't use
sed. Instead I used perl, since perl has always had the -i
edit-in-place option. And perl is available everywhere too. Really
it is more portable than using the GNU sed --in-place option. But I
was worried that if I mentioned perl people would get scared off! I
didn't want that. But the same command above is very simply done
more portably in perl like this. The first command checks the action
and the second, with -i, invokes it for in place editing.
perl -p -e 's/-cc/-bank/g' ./*-bank.t | grep -e -bank
...
perl -pi -e 's/-cc/-bank/g' ./*-bank.t
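And if neither GNU sed nor perl is at hand, the old fallback is to write to a temporary file and move it over the original. A minimal sketch, again with a stand-in file in a scratch directory. The .tmp suffix is an arbitrary choice.

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'require_ok("src/bbva-import-cc.pl");' > load-bank.t

for f in ./*-bank.t; do
    # Only replace the original if sed succeeded.
    sed 's/-cc/-bank/g' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done
cat load-bank.t
```

Note that this replaces each file with a newly created one, so the new file gets default permissions instead of the original's. For throw-away test files that is good enough.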
Pretty cool! In just a moment of command line hacking we have forked the files that we needed and edited them to fork the content.
Maybe later if I add a third bank I will parameterize out the bank part for this basic part of the framework. But for simple tests, such as testing invalid options and version output, the tests are always the same, and the overhead of parameterizing makes them harder to understand and harder to keep updated than simply copying them.
I am sure, however, that even though this program did not exist 30 minutes ago and now exists and solves a problem that I needed solved, 10 minutes from now I will hear all kinds of comments about how I am doing it wrong! For one, I shouldn't be using a copy-paste anti-pattern. Which I do agree with. But one should not let the perfect be the enemy of the good. It's a process of continuous improvement. Remember the Rule of Optimization: Prototype before polishing. Get it working before you optimize it.