
Well... basically, it sounds like Ruby's regex engine needs some work, hmm?


No, people should know better than to write regexes like /X(.+)+X/, with gratuitous doubly-nested "+" characters. :-) This code performs fine when written as /X(.+)X/, and it matches the same set of strings.
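To make the equivalence claim concrete, here is a minimal Ruby sketch that checks both patterns give the same verdict on a few short strings. The helper names (`same_verdicts?`, `seconds_to_match`) and the sample strings are made up for illustration; only the two regexes come from the comment above.

```ruby
NESTED = /X(.+)+X/   # the doubly-nested form
SIMPLE = /X(.+)X/    # the rewritten form

# Hypothetical helper: true when both patterns agree on every sample.
def same_verdicts?(samples)
  samples.all? { |s| NESTED.match?(s) == SIMPLE.match?(s) }
end

# Hypothetical helper: wall-clock seconds spent on one match attempt.
def seconds_to_match(pattern, text)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  pattern.match?(text)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Short inputs only, so neither pattern has room to backtrack badly.
SAMPLES = ["XabcX", "XX", "X=X", "XaXbX", "no match here"]
puts same_verdicts?(SAMPLES)
```

To see the difference in cost, try `seconds_to_match` with both patterns on something like `"X" + "=" * 22` (no closing X) and grow the run of `=` a few characters at a time. On a plain backtracking engine the nested form should roughly double per extra character. Ruby 3.2 and later memoize some of this work, so the gap may not show up there.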

Regexp engines are subtle beasts, and there are a couple of different ways to implement them (DFAs vs NFAs, simple engines vs lots of clever special cases, etc.). See O'Reilly's "Mastering Regular Expressions" for an exhaustive discussion.


Interesting. Any other recommendations on how to secure regexes that take in user input in Ruby/Rails?


First of all, you should think hard before taking regexes from users. Even if you do it correctly you'll still (presumably) need to search over your entire dataset, instead of doing something more lightweight like relying on SQL indexes. Use it with care.

You should use a regex engine that's explicitly designed to take potentially hostile input. Like the Plan9 engine, or Google's re2 engine which powers Google Code Search.

You can also just use Ruby's dangerous backtracking (PCRE-style) engine if you do something like forking off another process with strict ulimits which executes the regex for you. Then you can just kill it if it starts running away with your resources. Look into how e.g. evaluation bots that work on the popular IRC channels on FreeNode are implemented. POE::Component::IRC::Plugin::Eval on the CPAN is a good example.
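A rough sketch of the fork-with-limits idea in Ruby, assuming a Unix-like system where `fork` and `Process.setrlimit` are available. The function name `match_in_child` and the default limits are made up for illustration, and a real deployment would probably want memory limits and better error reporting too.

```ruby
# Hypothetical helper: runs an untrusted pattern in a child process.
# Returns true/false for match/no match, or nil if the child was
# killed, hit its CPU limit, or the pattern did not compile.
def match_in_child(pattern_source, text, cpu_seconds: 1, wall_seconds: 2)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    # Cap the child's CPU time; the kernel terminates it past the limit.
    Process.setrlimit(Process::RLIMIT_CPU, cpu_seconds)
    begin
      matched = Regexp.new(pattern_source).match?(text)
      writer.write(matched ? "1" : "0")
    rescue RegexpError
      # Invalid pattern: write nothing, the parent reports nil.
    ensure
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
  end

  writer.close
  answer =
    if IO.select([reader], nil, nil, wall_seconds)
      reader.read # empty string if the child died without answering
    else
      Process.kill("KILL", pid) # wall-clock budget exceeded
      ""
    end
  Process.wait(pid)
  reader.close

  case answer
  when "1" then true
  when "0" then false
  end
end

p match_in_child("X(.+)X", "XabcX")
```

The parent never compiles or runs the pattern itself, so a runaway match costs at most the child's CPU budget plus the wall-clock wait.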


This assumes that PCRE doesn't still contain memory corruption flaws, despite not being heavily tested, and being in effect a programming language interpreter. Tavis Ormandy found a couple serious problems a few years ago.

I'd just scrub the hell out of strings before passing them to a regex engine.
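One possible reading of "scrub", sketched in Ruby under the assumption that users only need literal substring search rather than real regex syntax: escape every metacharacter with `Regexp.escape` before compiling. The helper name `literal_pattern` is made up for illustration.

```ruby
# Hypothetical helper: treats user text as a literal string, so
# characters like ( . + ) lose their special meaning.
def literal_pattern(user_text)
  Regexp.new(Regexp.escape(user_text))
end

pattern = literal_pattern("(.+)+")
puts pattern.match?("a(.+)+b")
puts pattern.match?("abc")
```

This gives up user-supplied regex features entirely, which may or may not be acceptable for the use case.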


Even if it does, it's a pretty remote possibility that it'll be exploitable if you limit the input to, say, 100 bytes. It's pretty hard to get a Perl or Ruby level program of that size to exploit some memory corruption at the C level.
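A minimal Ruby sketch of the size cap, assuming "the input" means the user-supplied pattern. The 100-byte figure is the one from the comment above, while `MAX_PATTERN_BYTES` and `compile_user_pattern` are names made up for illustration.

```ruby
MAX_PATTERN_BYTES = 100 # figure taken from the comment above

# Hypothetical helper: returns a Regexp, or nil if the pattern is
# too long or does not compile.
def compile_user_pattern(source)
  return nil if source.bytesize > MAX_PATTERN_BYTES
  Regexp.new(source)
rescue RegexpError
  nil
end

p compile_user_pattern("X(.+)X")
p compile_user_pattern("a" * 101)
```

Note that a length cap alone does not stop slow matches: /X(.+)+X/ is well under 100 bytes.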


Good advice, thanks you two!


It just seems to me that if Perl's regex engine handles this without a problem, and Ruby's implementation freaks out, something should be improved about the Ruby one.



