Open-source name variants database

I’ve developed an open-source name-variants database, which we’re currently using at WeRelate.org. It does a better job than Soundex at matching variant names like Ann, Anna, Annah, Anne, and Annie: measured on a set of 100,000 name pairs provided by Ancestry.com, it misses 28% fewer variants than Soundex. I’ll be giving a talk on it at RootsTech next week. The source code and database are freely available on GitHub.
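
To give a rough feel for the difference, here is a minimal Python sketch that puts standard American Soundex next to a simple table lookup. The tiny VARIANTS table, the is_variant helper, and the Catherine/Katherine pair are made up for illustration; they are not the format or contents of the actual database, and the lookup is not the project's real matching code. Soundex always keeps the first letter of a name, so it cannot match spellings that start with different letters, while a curated table can.

```python
# Standard American Soundex letter-to-digit mapping.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character Soundex code for a name."""
    name = "".join(ch for ch in name.upper() if ch.isalpha())
    if not name:
        return ""
    result = name[0]
    prev = CODES.get(name[0], "")
    for ch in name[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts
    return (result + "000")[:4]


# Hypothetical, hand-made variants table (illustration only).
VARIANTS = {
    "catherine": {"katherine", "kathryn"},
}


def is_variant(a, b):
    """True if the two names are equal or listed as variants of each other."""
    a, b = a.lower(), b.lower()
    return a == b or b in VARIANTS.get(a, set()) or a in VARIANTS.get(b, set())


if __name__ == "__main__":
    for a, b in (("Ann", "Annie"), ("Catherine", "Katherine")):
        print(a, b, soundex(a), soundex(b), is_variant(a, b))
```

In this sketch Soundex gives Ann and Annie the same code, but Catherine and Katherine get different codes because of the first letter, so only the table lookup pairs them.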
