UnixLib regex implementation
John.Tytgat at aaug.net
Sat Jul 23 16:01:21 PDT 2005
In message <Pine.LNX.4.44.0507232314440.5119-100000 at tarrant.ecs.soton.ac.uk>
John-Mark Bell <jmb202 at ecs.soton.ac.uk> wrote:
> UnixLib's current regex implementation is horrendously slow (for
> reference, it's currently the same as the one in NetBSD libc). The
> original author of that implementation has produced another regex
> implementation which is significantly faster.
Are any figures available?
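If not, a rough comparison could be made with a small timing helper built against each implementation in turn. What follows is only a minimal sketch: it assumes the standard POSIX regcomp()/regexec()/regfree() interface from <regex.h>, and time_regexec is a hypothetical name of mine, not something in UnixLib or the PostgreSQL sources.

```c
#include <regex.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not part of UnixLib): compile `pattern` once as an
 * extended regex, then run regexec() against `subject` `iterations` times.
 * Returns the CPU time spent in the matching loop, in seconds, or -1.0 if
 * the pattern does not compile. If `matches` is non-NULL, it receives the
 * number of successful matches.
 */
double time_regexec(const char *pattern, const char *subject,
                    long iterations, long *matches)
{
    regex_t re;
    long i, hits = 0;
    clock_t start, end;

    /* REG_NOSUB: only match/no-match is needed, not submatch offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1.0;

    start = clock();
    for (i = 0; i < iterations; i++) {
        if (regexec(&re, subject, 0, NULL, 0) == 0)
            hits++;
    }
    end = clock();

    regfree(&re);
    if (matches != NULL)
        *matches = hits;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling it with the same pattern, subject and iteration count against the old and the new library would give two directly comparable numbers. Compile time is deliberately left out of the measurement; it could be timed separately in the same way.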
> This is currently used in
> Tcl and PostgreSQL. I've extracted the implementation from the PostgreSQL
> source tree and modified it for use in UnixLib. There are a couple of
> issues, however:
> a) There are some really nasty #defines in regex.h in order to fix up the APIs
> b) The implementation is theoretically wide-character aware, but my
> changes have removed this support.
Just curious: what was the reason for removing the wide-character support?
> http://moose.mine.nu:6888/regex.zip contains the relevant sources in a
> form that can be merged with a UnixLib tree.
> The original code can be found in PostgreSQL CVS at:
> with the relevant header files at:
> I'd appreciate it if someone could take a look and determine whether it's
> worth checking into CVS in its current state or whether the above issues
> should/could be solved in some way or other.
I'm not a regex expert, but from a casual inspection your work looks
fine to me.
> The relevant files associated with issue b) above are regcustom.h (the
> platform customisation options for the library) and regc_locale.h (the
> locale and main character handling code - this is named regc_locale.c in
> the PostgreSQL tree).
If you check this in, make sure to appropriately update unixlib/Docs/Copyright
(which I guess needs more licenses attached, no?).
John Tytgat, in his comfy chair at home BASS
John.Tytgat at aaug.net ARM powered, RISC OS driven