Regular Expressions do not work

emusican
Level 1

Using 4.X sensors, VMS2.2

It seems that normal regular expressions aren't being accepted as valid by CiscoWorks. Example:

If I wanted to match "Red Duck" with anywhere from 0 to 5 blank spaces between letters, I would use:

R[ ]{0-5}e[ ]{0-5}d[ ]{0-5}D[ ]{0-5}u[ ]{0-5}c[ ]{0-5}k[ ]{0-5}

That expression would match: R e d D uc k, Red D u c k and similar.

Why aren't they accepted in String.TCP?

So the question is: where can I find a list of ACCEPTED regular expressions that work with 4.X sensors? I found a short list that works with the 3.X sensors, but it didn't work at all. Any help here would be great.

Eric

11 Replies

marcabal
Cisco Employee

I am not sure if version 4.1 will accept the {0-5} regex option.

Here is the link for the version 4.1 regex options:

http://www.cisco.com/univercd/cc/td/doc/product/iaabu/csids/csids10/idmiev/swappa.htm#787101

If you would allow more than 5 spaces between letters you could also try:

R[ ]*e[ ]*d[ ]*D[ ]*u[ ]*c[ ]*k[ ]*

R[ ]*e[ ]*d[ ]*D[ ]*u[ ]*c[ ]*k[ ]* = didn't work

R[ ]+e[ ]+d[ ]+D[ ]+u[ ]+c[ ]+k[ ]+ = didn't work

Red[ ]Duck = didn't work

Red.Duck = worked, but only for this specific string.

I am having trouble getting ANY metacharacters to work except dot. Do you (or anyone) have any proven examples of what works? Are there additional settings, other than the default ones, that must be used in conjunction with metacharacters?

I'm curious how many other people who need complex string matching have the same issues.

Eric

The regular expressions available to you as a user are the same as those used for the actual Cisco signatures.

You can execute the following to see examples of regular expressions in Cisco's own signatures:

sensor# conf t

sensor(config)# serv virtual virtualSensor

sensor(config-vsc)# tune

sensor(config-vsc-virtualSensor)# show settings | include Regex

If there is a specific meta character you want an example of then change the include on the show settings command from Regex to the meta character.

For example:

show settings | include {

-----------

To further help with your issue.

I can try to replicate it here in our lab.

I would need to verify the sensor type and version.

If this is version 4.x:

Can you go through the 4.x CLI and access your signature and paste in the contents of "show settings"?

For example, if your signature were a string.tcp signature:

sensor# conf t

sensor(config)# serv virtual virtualSensor

sensor(config-vsc)# tune

sensor(config-vsc-virtualSensor)# string.tcp

sensor(config-vsc-virtualSensor-STR)# sig sig 20001

sensor(config-vsc-virtualSensor-STR-sig)# show settings

This way I can see the Regex that you are attempting.

If you are using a management tool then it may be possible that the Regex winding up on the sensor may not be the same as what you are entering on your management tool.

If this is a 3.x sensor then can you paste in the configuration file lines for the signature.

It doesn't appear that you have spaces in your brackets. It appears that you have your regex as follows:

R[]+e[]+......

Instead it should be

R[ ]+e[ ]+

The syntax for what you are actually trying to do is:

R[ ]{0,5}; however, our regex does not support this. It only supports the exact-count version, for example R[ ]{5}.
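One possible workaround for the missing range quantifier is to spell the range out by hand: "0 to 5 spaces" can be written as five optional spaces in a row. The Python sketch below generates such a pattern and checks it with Python's own `re` module. It assumes the sensor accepts the `?` (zero or one) quantifier, which this thread does not confirm, and the helper names `expand_bounded` and `spaced_word` are made up for illustration. Python's regex engine is not the sensor's engine, so treat the result as a starting point to test on the sensor.

```python
import re

def expand_bounded(atom: str, lo: int, hi: int) -> str:
    """Rewrite atom{lo,hi} using only concatenation and '?'.

    Hypothetical helper: lo mandatory copies followed by
    (hi - lo) optional copies of the same atom.
    """
    if not 0 <= lo <= hi:
        raise ValueError("need 0 <= lo <= hi")
    return atom * lo + (atom + "?") * (hi - lo)

def spaced_word(word: str, gap: str = "[ ]", max_gap: int = 5) -> str:
    """Allow between 0 and max_gap gap characters between letters."""
    sep = expand_bounded(gap, 0, max_gap)
    return sep.join(re.escape(ch) for ch in word)

pattern = spaced_word("RedDuck")
print(pattern)

# Sanity checks with Python's engine (not the sensor's).
for text in ("R e d D uc k", "Red D u c k", "R" + " " * 6 + "edDuck"):
    print(repr(text), bool(re.fullmatch(pattern, text)))
```

The generated pattern is long but uses only brackets and `?`, so it avoids the range form entirely. The last test string has six spaces after the R and is expected to be rejected.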

marcabal-

I went through the CLI and looked at what the actual Regex string was (using your instructions). Here's what I got when I did a show settings:

SIGID: 20023

SubSig: 0 default: 0

AlarmDelayTimer:

AlarmInterval:

AlarmSeverity: medium default: medium

AlarmThrottle: FireAll default: Summarize

AlarmTraits:

CapturePacket: True default: False

ChokeThreshold:

Direction: ToService

Enabled: True default: True

EndMatchOffset:

EventAction: ZERO

FlipAddr:

MaxInspectLength:

MaxTTL:

MinHits: 1

MinMatchLength:

Protocol: TCP default: TCP

RegexString: R[ ]*E[ ]*D[ ]*D[ ]*U[ ]*C[ ]*K[ ]*

ResetAfterIdle: 15

ServicePorts: 25,80,110,111,443

SigComment:

SigName: RED DUCK default: STRING.TCP

SigStringInfo:

SigVersion:

StorageKey: STREAM default: STREAM

StripTelnetOptions:

SummaryKey: AaBb

ThrottleInterval: 15

WantFrag:

It seems to be the same thing I entered on the MC:

R[ ]*E[ ]*D[ ]*D[ ]*U[ ]*C[ ]*K[ ]*

There is exactly one space in the middle of each bracket set.

I'm using a sensor running 4.1 S54; it's a Cisco 4235.

Feel free to experiment, please do. Let me know what you find out! ...and thanks for the help.

klwiley-

There was a single space between each bracket set.

R[ ]+e[ ]+ did not work, although I didn't get any errors on deployment.

R[ ]{0,5} - Is there any way to do this in your Regex? Perhaps an alternative? What good is it to have Regex strings that can't do anything more than basic string matching?

Thanks for all the help by the way, you guys are keeping me busy today.

Eric

The RegexString wildcards do indeed work. I'm concerned that the traffic isn't what we think it is. Send a network trace of the packet you are trying to match to me at anthall@cisco.com. Be sure to set the snaplen to 1600 if using tcpdump.

I will develop a regex that will match what you want.

Packet Sent...

It's an example of what the sensor sees when the words "RED DUCK" are put through Google.

Eric

A Google search for "Red Duck" goes over the wire as "Red+Duck". The space will be replaced with a plus (+) sign or %20 as it goes over HTTP (the browser does this).

The regex would need to be (also including case insensitivity):

[Rr][ +]*[Ee][ +]*[Dd][ +]*[Dd][ +]*[Uu][ +]*[Cc][ +]*[Kk]

You cannot repeat a three character pattern like [%]20.
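To see the two encodings for yourself without a packet capture, here is a minimal Python sketch using the standard library's `urllib.parse`. It only illustrates what a browser typically does to a space in a query string; the variable names are illustrative and nothing here runs on the sensor.

```python
from urllib.parse import quote, quote_plus, unquote_plus

typed = "RED DUCK"

# Form-style encoding (what a search box usually produces): space -> '+'
form_encoded = quote_plus(typed)

# Plain percent-encoding: space -> '%20'
percent_encoded = quote(typed)

print(form_encoded)
print(percent_encoded)

# Decoding the form-style value gives back what the user typed.
print(unquote_plus(form_encoded))
```

Either way, the byte on the wire is not 0x20, which is why a signature that only looks for literal spaces does not fire on this traffic.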

[Rr][ +]*[Ee][ +]*[Dd][ +]*[Dd][ +]*[Uu][ +]*[Cc][ +]*[Kk] worked. What does the [:space:+] represent? Is there a reliable list of metacharacters and/or examples of regex that work with my packets? The list linked earlier only provided a few that actually worked. I guess what I'm looking for is around 10-20 examples of complex regex that I can apply to my system.

Here is a copy of the packet I sent to anthall, it is the google search for "RED DUCK"

0: 0007 84bd ecca 0000 779b 77b4 0800 4500 ........w.w...E.

16: 028a d67a 4000 7c06 5bd7 9033 2596 d8ef ...z@.|.[..3%...

32: 3b63 0775 0050 6451 f28c ce63 9ffc 5018 ;c.u.PdQ...c..P.

48: 4308 e72c 0000 4745 5420 2f73 6561 7263 C..,..GET /searc

64: 683f 686c 3d65 6e26 6c72 3d26 6965 3d55 h?hl=en&lr=&ie=U

80: 5446 2d38 266f 653d 5554 462d 3826 713d TF-8&oe=UTF-8&q=

96: 5245 442b 4455 434b 2662 746e 473d 476f RED+DUCK&btnG=Go

112: 6f67 6c65 2b53 6561 7263 6820 4854 5450 ogle+Search HTTP

128: 2f31 2e31 0d0a 4163 6365 7074 3a20 696d /1.1..Accept: im

144: 6167 652f 6769 662c 2069 6d61 6765 2f78 age/gif, image/x

160: 2d78 6269 746d 6170 2c20 696d 6167 652f -xbitmap, image/

176: 6a70 6567 2c20 696d 6167 652f 706a 7065 jpeg, image/pjpe

192: 672c 2061 7070 6c69 6361 7469 6f6e 2f76 g, application/v

208: 6e64 2e6d 732d 706f 7765 7270 6f69 6e74 nd.ms-powerpoint

224: 2c20 6170 706c 6963 6174 696f 6e2f 766e , application/vn

240: 642e 6d73 2d65 7863 656c 2c20 6170 706c d.ms-excel, appl

256: 6963 6174 696f 6e2f 6d73 776f 7264 2c20 ication/msword,

272: 6170 706c 6963 6174 696f 6e2f 782d 7368 application/x-sh

288: 6f63 6b77 6176 652d 666c 6173 682c 202a ockwave-flash, *

304: 2f2a 0d0a 5265 6665 7265 723a 2068 7474 /*..Referer: htt

320: 703a 2f2f 7777 772e 676f 6f67 6c65 2e63 p://www.google.c

336: 6f6d 2f73 6561 7263 683f 686c 3d65 6e26 om/search?hl=en&

352: 6965 3d55 5446 2d38 266f 653d 5554 462d ie=UTF-8&oe=UTF-

368: 3826 713d 5245 442b 4455 434b 2662 746e 8&q=RED+DUCK&btn

384: 473d 476f 6f67 6c65 2b53 6561 7263 680d G=Google+Search.

400: 0a41 6363 6570 742d 4c61 6e67 7561 6765 .Accept-Language

416: 3a20 656e 2d75 730d 0a41 6363 6570 742d : en-us..Accept-

432: 456e 636f 6469 6e67 3a20 677a 6970 2c20 Encoding: gzip,

448: 6465 666c 6174 650d 0a55 7365 722d 4167 deflate..User-Ag

464: 656e 743a 204d 6f7a 696c 6c61 2f34 2e30 ent: Mozilla/4.0

480: 2028 636f 6d70 6174 6962 6c65 3b20 4d53 (compatible; MS

496: 4945 2035 2e35 3b20 5769 6e64 6f77 7320 IE 5.5; Windows

512: 4e54 2035 2e30 3b20 5433 3132 3436 3129 NT 5.0; T312461)

528: 0d0a 486f 7374 3a20 7777 772e 676f 6f67 ..Host: www.goog

544: 6c65 2e63 6f6d 0d0a 436f 6e6e 6563 7469 le.com..Connecti

560: 6f6e 3a20 4b65 6570 2d41 6c69 7665 0d0a on: Keep-Alive..

576: 436f 6f6b 6965 3a20 5052 4546 3d49 443d Cookie: PREF=ID=

592: 3334 3634 3531 3864 6661 6262 6636 3136 3464518dfabbf616

608: 3a54 4d3d 3130 3638 3832 3731 3038 3a4c :TM=1068827108:L

624: 4d3d 3130 3730 3435 3735 3538 3a54 423d M=1070457558:TB=

640: 323a 533d 4348 6445 4374 694b 7053 3559 2:S=CHdECtiKpS5Y

656: 6b39 5146 0d0a 0d0a k9QF....

Thanks for the help

Eric

If you look in the packet decode you will see that your browser has converted your space character to a plus character: "RED+DUCK" instead of "RED DUCK"

If you look in the Hex dump it has "2b" which is for the character "+". If it was a space the Hex dump would have had "20"
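If you want to check the bytes yourself, the relevant hex words from the dump (offset 96, "5245 442b 4455 434b") can be decoded with a couple of lines of Python. This is just a quick sketch for reading the capture, not anything the sensor does.

```python
# Hex words copied from offset 96 of the packet dump in this thread.
hex_words = "5245 442b 4455 434b"

raw = bytes.fromhex(hex_words)   # spaces between bytes are ignored
print(raw.decode("ascii"))       # the query term as sent on the wire

# The two byte values being compared in this post:
print(hex(ord("+")), hex(ord(" ")))
```

The decoded string contains a "+" (0x2b) between the two words, not a space (0x20).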

So your previous regular expressions looking for spaces would never match because your web browser is not sending a space, it is sending a plus.

So Tony Hall converted your regex to look for either a space or a plus character. In this new regex the plus is inside the brackets and is therefore an actual plus character and is not acting like a meta character.

The regex "[ +]*" looks for any number of spaces OR plus characters. Only the star "*" and brackets "[]" are acting as meta characters, the plus "+" is a real character to match.

The only problem with your initial regex was that the sensor does not support the {0-5} notation.

The problem with the other regexes is that they were looking for a space instead of the plus character, so they would never match on the real packet.

Web browsers are notorious for not sending exactly what you type. So always look at the real packet to see what is actually being sent on the wire.
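The difference between the patterns can be reproduced offline against the request line from the captured packet. The sketch below uses Python's `re` module as a stand-in; it is not the sensor's regex engine, so it only shows why the space-only pattern cannot match these bytes, not how the sensor evaluates signatures.

```python
import re

# Request line taken from the packet dump earlier in this thread.
payload = ("GET /search?hl=en&lr=&ie=UTF-8&oe=UTF-8"
           "&q=RED+DUCK&btnG=Google+Search HTTP/1.1")

# Pattern from the deployed signature: literal spaces only.
space_only = r"R[ ]*E[ ]*D[ ]*D[ ]*U[ ]*C[ ]*K[ ]*"

# Corrected pattern: space OR plus between letters, either case.
space_or_plus = (r"[Rr][ +]*[Ee][ +]*[Dd][ +]*[Dd][ +]*"
                 r"[Uu][ +]*[Cc][ +]*[Kk]")

for name, pattern in (("space_only", space_only),
                      ("space_or_plus", space_or_plus)):
    m = re.search(pattern, payload)
    print(name, "->", m.group(0) if m else "no match")

# The dot matches any single character, including '+', which is
# consistent with Red.Duck having worked in the earlier test.
print(bool(re.search(r"Red.Duck", "q=Red+Duck")))
```

Running a candidate pattern against a real captured payload like this is a cheap first check before deploying it to the sensor.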

Thanks for the great help, marcabal, that made sense. I took a look at some more examples of packets from different sources, and they all seem to contain either a space or a plus.

Eric