
(Sunday 21st December 2003)

Just the thing to lift my Perl spirits once again: here comes an article (thx Keith) about how the combination of Class::DBI and Template Toolkit can make your coding work real fun and effective...

Below are some quotes from that article that I find extremely useful, even if they are actually unrelated to the Class::DBI thing per se:
We start with a very simple canonicalization--stripping out vowels and collapsing repeated letters. (I've found that this can pick up about half of name misspellings found in the wild, which is pretty impressive.)
sub _canonicalise {
      my ($class, $word) = @_;
      return "" unless $word;
      $word = lc($word);
      $word =~ s/[aeiou]//g;     # remove vowels
      $word =~ s/(\w)\1+/$1/g;   # collapse doubled (or tripled, etc.) letters
      return $word;
}

(The matching method can be improved. I've found that neither Text::Soundex nor Text::Metaphone are much of an improvement over the simple approach already detailed, but Text::DoubleMetaphone is definitely worth plugging in, to catch misspellings such as Nicolas/Nicholas and Asimov/Azimof.)
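To see what that canonicalisation actually does, here is a little sketch of my own (not from the article). It assumes a stand-alone `canonicalise` function without the `$class` argument, and the list of names is made up for illustration. It groups spellings by their canonical form:

```perl
use strict;
use warnings;

# Stand-alone variant of the quoted method (assumption: plain function,
# no $class argument, so it can run outside a Class::DBI class).
sub canonicalise {
    my ($word) = @_;
    return "" unless defined $word && length $word;
    $word = lc $word;
    $word =~ s/[aeiou]//g;      # remove vowels
    $word =~ s/(\w)\1+/$1/g;    # collapse repeated letters
    return $word;
}

# Hypothetical input: a few name spellings as they might turn up in the wild.
my %groups;
for my $name (qw(Hillary Hilary Nicolas Nicholas)) {
    push @{ $groups{ canonicalise($name) } }, $name;
}

# Print each canonical key with the spellings that collapsed onto it.
for my $key (sort keys %groups) {
    print "$key: @{ $groups{$key} }\n";
}
```

Hillary and Hilary both end up as "hlry" and land in the same group, while Nicolas ("ncls") and Nicholas ("nchls") stay apart. That is exactly the kind of pair the quote says needs something like Text::DoubleMetaphone.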

I guess Perl can actually be FUN at times ;) (as if I didn't know anyway)

[ by Martin ]


Martin Spernau
© 1994-2003
