UnixLib regex implementation

John-Mark Bell jmb202 at ecs.soton.ac.uk
Sat Jul 23 15:38:10 PDT 2005


Hi,

UnixLib's current regex implementation is horrendously slow (for 
reference, it's currently the same as the one in NetBSD libc). The 
original author of that implementation has produced another regex 
implementation which is significantly faster. This is currently used in 
Tcl and PostgreSQL. I've extracted the implementation from the PostgreSQL 
source tree and modified it for use in UnixLib. There are a couple of 
issues, however:

a) There are some really nasty #defines in regex.h, needed to fix up the APIs.
b) The implementation is theoretically wide-character aware, but my 
   changes have removed this support.

http://moose.mine.nu:6888/regex.zip contains the relevant sources in a 
form that can be merged with a UnixLib tree.

The original code can be found in PostgreSQL CVS at:

  http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/regex/

with the relevant header files at:

  http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/regex/

I'd appreciate it if someone could take a look and determine whether it's 
worth checking into CVS in its current state or whether the above issues 
should/could be solved in some way or other.
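
For anyone wanting to compare the two implementations before deciding, a 
minimal timing sketch along the following lines may help. It assumes only 
the standard POSIX regcomp/regexec/regfree interface from <regex.h>, so the 
same file should build against either the old or the new code. The function 
name time_regex and the choice of flags are illustrative only and are not 
part of regex.zip.

```c
#include <regex.h>
#include <stddef.h>
#include <time.h>

/*
 * Illustrative helper (hypothetical name, not part of UnixLib or regex.zip).
 *
 * Compile `pattern` once as an extended regex, run it `iterations` times
 * against `subject`, and return the number of successful matches, or -1
 * if the pattern does not compile. If `seconds` is non-NULL, the CPU time
 * spent in the matching loop is stored there.
 */
static long time_regex(const char *pattern, const char *subject,
                       long iterations, double *seconds)
{
    regex_t re;
    long matches = 0;
    long i;
    clock_t start, end;

    /* REG_NOSUB: only match/no-match is needed, not submatch offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    start = clock();
    for (i = 0; i < iterations; i++) {
        if (regexec(&re, subject, 0, NULL, 0) == 0)
            matches++;
    }
    end = clock();

    regfree(&re);

    if (seconds != NULL)
        *seconds = (double)(end - start) / CLOCKS_PER_SEC;
    return matches;
}
```

Calling time_regex with the same pattern, subject and iteration count in a 
build linked against each implementation gives a rough comparison. It only 
times matching, not compilation, and a single pattern says little about 
overall behaviour, so treat the numbers as indicative at best.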

The files relevant to issue (b) above are regcustom.h (the platform 
customisation options for the library) and regc_locale.h (the locale and 
main character handling code; this is named regc_locale.c in the 
PostgreSQL tree).


John.



