POSIX regex file expression question

  • 07-11-2002 5:19pm
    #1
    Registered Users, Registered Users 2 Posts: 14,149 ✭✭✭✭


    hey guys,

    I'm setting up some file-extension filters and am just wondering whether or not I'm getting this right

    for example:
    .*\..*vb.*
    
    will pick up filename1.vbs, or filename2.vbe, or filename.blah.vbs.txt, yes?



    similarly ....
    .*\.d*l
    
    will pick up filename1.dll and filename2.dpl, but not filename3.dogl, yes?
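
One way to check both guesses is to run the two patterns against the example filenames. The sketch below uses Python's `re` module rather than a POSIX regex library (an assumption; for patterns this simple the two should agree), and it uses `re.search`, i.e. an unanchored match, which is also an assumption about how the filter applies the pattern. The helper name `hits` is made up for the example.

```python
import re

# Patterns copied from the post above.
VB_PATTERN = r".*\..*vb.*"
DL_PATTERN = r".*\.d*l"

def hits(pattern, names):
    """Return the names in which the pattern matches somewhere (unanchored)."""
    return [n for n in names if re.search(pattern, n)]

vb_names = ["filename1.vbs", "filename2.vbe", "filename.blah.vbs.txt"]
dl_names = ["filename1.dll", "filename2.dpl", "filename3.dogl"]

print(hits(VB_PATTERN, vb_names))
print(hits(DL_PATTERN, dl_names))
```

Under those assumptions the first pattern matches all three names, but the second only matches filename1.dll: `d*` means "zero or more d", not "d followed by any one character". Something like `\.d.l$` would be closer to the intent for .dll and .dpl while still skipping .dogl.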


Comments

  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    Hmm.

    Here is the line I use to reject window$ executable attachments on mailservers if it's any help

    /etc/postfix/body_checks


    /(filename|name)=".*\.(au|bat|chm|cmd|com|css|dll|dot|exe|hlp|hta|jse|lnk|ocx|pak|pif|pps|scr|sct|shs|src|vbe|vbs|vxd|wsh)"/ REJECT


  • Registered Users, Registered Users 2 Posts: 14,149 ✭✭✭✭Lemming


    Originally posted by Typedef

    /(filename|name)=".*\.(au|bat|chm|cmd|com|css|dll|dot|exe|hlp|hta|jse|lnk|ocx|pak|pif|pps|scr|sct|shs|src|vbe|vbs|vxd|wsh)"/ REJECT

    Hmm ... would I be right in saying that that filename setup checks for filename.extension?

    If it does, whilst it's a one-stop-filter-shopping-list, what I'm trying to do is create as generic a list as possible, rather than have to list every file-type and/or filetype combination (eg. name.txt.vbs) that I want blocked (since some file extensions are similar - give or take a character) something like:
    .*\.(vbs|vbe|dll|dpl|asp|tsp|etc etc).
    

    So what I'm asking is the following:

    "Was my syntax correct from my initial post"?


  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    That'll pick up a filename.txt.vbs (I just tested it).

    /(filename|name)=".*\.(vb*)"/ REJECT

    should work for vb(x) I think.
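
It may be worth testing the `vb*` version before relying on it. The sketch below tries it in Python's `re` module (an assumption; the thread is about POSIX regexes, though these constructs should behave the same way). The header fragments in `samples` are made up for the example, and `VB_DOT` is only one possible alternative, not something from the thread.

```python
import re

# Pattern from the post above, without the surrounding /.../ and REJECT action.
VB_STAR = r'(filename|name)=".*\.(vb*)"'
# Possible alternative: "vb" followed by exactly one more character.
VB_DOT = r'(filename|name)=".*\.(vb.)"'

# Hypothetical attachment header fragments, invented for this example.
samples = [
    'filename="report.txt.vbs"',
    'filename="macro.vb"',
    'name="notes.v"',
    'name="script.vbe"',
]

def matched(pattern):
    """Return the sample fragments in which the pattern matches."""
    return [s for s in samples if re.search(pattern, s)]

print(matched(VB_STAR))
print(matched(VB_DOT))
```

Here `vb*` means "v followed by zero or more b", and the closing quote must come straight after it, so `VB_STAR` catches `.vb` and `.v` but misses `.vbs` and `.vbe`. `VB_DOT` catches the three-letter vb(x) extensions instead.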


  • Registered Users, Registered Users 2 Posts: 14,149 ✭✭✭✭Lemming


    Originally posted by Typedef
    That'll pick up a filename.txt.vbs (I just tested it).

    /(filename|name)=".*\.(vb*)"/ REJECT

    should work for vb(x) I think.

    Cheers type :)

    one last question (I hope anyway) for ye. The syntax * means one character, whilst if you have .* does this mean that you have 0 or more characters ? Or does it mean that you have one or more?

    If you follow the distinction I'm trying to make in that question?

    So, say for example I want to filter hta, htt, htm, html, would I just have to have .*\.(ht.*) ? Or would I have to specify a separate filter for .html (since it's four characters as opposed to three)?


  • Closed Accounts Posts: 96 ✭✭krinDar


    * means 0 or more occurrences of the previous RE.
    . (<period>) matches any character except newline.
    Therefore the RE '.*' matches 0 or more occurrences of any character.

    The RE you give will match what you want, but be careful as it will match *any* file that has
    '.ht' anywhere in the name, e.g. important.ht.exe

    Check out regexp(5)
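
The warning about '.ht' matching anywhere in the name can be seen with a few test names. This is a sketch in Python's `re` module (an assumption, since the thread concerns POSIX regexes), and the `ANCHORED` pattern is just one possible way to require the extension at the end of the name, not something proposed in the thread.

```python
import re

LOOSE = r".*\.(ht.*)"      # pattern from the question
ANCHORED = r"\.ht[a-z]*$"  # assumed alternative: the .ht* part must come last

names = ["index.htm", "index.html", "app.hta", "important.ht.exe", "notes.txt"]

loose_hits = [n for n in names if re.search(LOOSE, n)]
anchored_hits = [n for n in names if re.search(ANCHORED, n)]

print(loose_hits)
print(anchored_hits)
```

The loose pattern picks up important.ht.exe along with the three .ht* files. The anchored one only picks up the names that actually end in .htm, .html or .hta.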


  • Registered Users, Registered Users 2 Posts: 14,149 ✭✭✭✭Lemming


    Originally posted by krinDar
    * means 0 or more occurrences of the previous RE.
    . (<period>) matches any character except newline.
    Therefore the RE '.*' matches 0 or more occurrences of any character.

    Ok, so let me just clarify this.

    \.vb* will give me an RE that checks vb(vb n times)

    \.vb.* will give me an RE that checks all patterns with .vb in them? (eg. vbs/vbe)


    Check out regexp(5)
    MyBox# man 5 regexp
    No entry for regexp in section 5 of the manual
    

    pants ....


  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    man re_syntax

    * a sequence of 0 or more matches of the atom
    + a sequence of 1 or more matches of the atom
    ? a sequence of 0 or 1 matches of the atom
    {m} a sequence of exactly m matches of the atom

    . matches any single character
    \k (where k is a non-alphanumeric character) matches that character taken as an ordinary character, e.g. \\ matches a backslash character.

    rtfm ; )
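
The quantifiers listed above can be tried out on a short list of strings. This is a minimal sketch using Python's `re` module (an assumption; `re_syntax` describes a different regex library, though these basic operators mean the same thing in both). `candidates` and `full_matches` are names invented for the example.

```python
import re

# Strings to test each pattern against; a pattern "hits" only if it
# matches the whole string.
candidates = ["", "a", "aa", "aaa"]

def full_matches(pattern):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

print(full_matches(r"a*"))    # zero or more a
print(full_matches(r"a+"))    # one or more a
print(full_matches(r"a?"))    # zero or one a
print(full_matches(r"a{2}"))  # exactly two a
print(full_matches(r"."))     # any single character
print(re.fullmatch(r"\\", "\\") is not None)  # escaped backslash matches one backslash
```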


  • Closed Accounts Posts: 96 ✭✭krinDar


    Originally posted by Lemming
    Ok, so let me just clarify this.

    \.vb* will give me an RE that checks vb(vb n times)

    \.vb.* will give me an RE that checks all patterns with .vb in them? (eg. vbs/vbe)

    You can use either, they both do the same thing really.


  • Registered Users, Registered Users 2 Posts: 14,149 ✭✭✭✭Lemming


    Originally posted by Typedef
    man re_syntax


    rtfm ; )

    hehe ... helps if you know that said man page exists to rtfm in the first place ;)


  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    : )


  • Registered Users, Registered Users 2 Posts: 14,149 ✭✭✭✭Lemming


    Originally posted by krinDar
    You can use either, they both do the same thing really.


    Hmm .. well the understanding I had was that they don't

    .*\.vb* checks for name.vb, or name.vbvb, or name.vbvbvb(n times)

    whereas

    .*\.vb.* checks for name.vb(x), or name.vb(xy) etc.

    Going by the man page that I was so graciously told to rtfm (courtesy of Type :p ), it would appear that to do this you would actually type:

    .*\.vb. to do a search for name.vb(x), but something like .*\.vb.. or .*\.vb.* to search for anything more than .vb(x)


    yes ? No ?


  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    I think so.


  • Closed Accounts Posts: 286 ✭✭Kev


    Originally posted by Lemming


    .*\.vb* checks for name.vb, or name.vbvb, or name.vbvbvb(n times)

    that would check for name.vb or name.vbbbbb or name.v
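
The difference between the three variants discussed so far can be checked directly. The sketch below uses Python's `re.fullmatch`, so each pattern has to cover the whole filename. That is an assumption about how the filter is applied; with an unanchored search, `\.vb*` would hit anything containing ".v". The names in `names` are made up.

```python
import re

names = ["name.v", "name.vb", "name.vbbbbb", "name.vbvb", "name.vbs", "name.vbscript"]

def whole(pattern):
    """Return the names that the pattern matches from start to end."""
    return [n for n in names if re.fullmatch(pattern, n)]

star = whole(r".*\.vb*")      # v followed by zero or more b
one = whole(r".*\.vb.")       # vb followed by exactly one character
any_tail = whole(r".*\.vb.*") # vb followed by anything, including nothing

print(star)
print(one)
print(any_tail)
```

So `vb*` repeats only the `b`, `vb.` is the one that means vb(x), and `vb.*` accepts any tail after vb.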


  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    use .*\.vb.


  • Registered Users, Registered Users 2 Posts: 14,149 ✭✭✭✭Lemming


    Cheers guys :)

    now to see what other Regexp situations I can come up with ......


  • Registered Users, Registered Users 2 Posts: 14,149 ✭✭✭✭Lemming


    And she's back .........

    let's say I want to filter for .com or .cmd extensions,

    instead of putting in:
    .*\.cmd
    .*\.com
    or .*\.(cmd|com)

    can I do the following:

    .*\.c{1,}o?m{1,}d?

    to get the same effect?

    What I /think/ it will match is the 'c' character once, 'o' 0-1 times, 'm' once, and then 'd' 0-1 times, leaving me with the following possibilities:

    .cm
    .cmd
    .com
    .comd


    that right ?


  • Closed Accounts Posts: 286 ✭✭Kev


    {1,} means 1 or more times. The + modifier is a shortcut for this and looks nicer.

    So it would match multiple c's and m's.

    If you want to filter for just com and cmd:

    .*\.c[om]d$

    The $ means match the end of the string.
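
It is worth running these patterns against a few names, because the bracket version may not do quite what is intended. The sketch below uses Python's `re.fullmatch` (an assumption; the thread is about POSIX regexes, though these constructs should behave the same). The test names are invented, and `c(om|md)` is only a suggested alternative, not a pattern from the thread.

```python
import re

exts = ["name.com", "name.cmd", "name.cod", "name.ccmmd", "name.comd"]

def hits(pattern):
    """Return the names that the pattern matches from start to end."""
    return [n for n in exts if re.fullmatch(pattern, n)]

print(hits(r".*\.c{1,}o?m{1,}d?"))  # pattern from the question
print(hits(r".*\.c[om]d$"))         # bracket version from the reply
print(hits(r".*\.c(om|md)$"))       # assumed alternative: "om" or "md" after c
```

In this test the `{1,}` version accepts com, cmd, ccmmd and comd. `c[om]d` accepts cmd and cod but not com, since the bracket stands for a single character that must be followed by d. The grouped alternation accepts exactly com and cmd.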


  • Registered Users, Registered Users 2 Posts: 14,149 ✭✭✭✭Lemming


    Originally posted by Kev
    {1,} means 1 or more times. The + modifier is a shortcut for this and looks nicer.

    So it would match multiple c's and m's.

    If you want to filter for just com and cmd:

    .*\.c[om]d$

    The $ means match the end of the string.


    oops .. that should have been c{1}o?m{1}d?

    but anyway ..... doesn't the [] only allow the matching of one character? So I could match either 'o' or 'm', but not 'om' ??


  • Closed Accounts Posts: 286 ✭✭Kev


    Yes, it would only match one o or m. If you just want to match cmd or com then you only need one; for more use + or {x,y}.

    also {1} is redundant.

    c{1}o?m{1}d? will match all of

    comd
    com
    cmd
    cm
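
The point about `{1}` being redundant can be confirmed by comparing the pattern with and without the counts. This is a small sketch in Python's `re` module (an assumption, as above), and the candidate strings are chosen just for the example.

```python
import re

candidates = ["cm", "cmd", "com", "comd", "cod", "ccm", "cmm"]

# Same pattern written with and without the explicit {1} counts.
with_counts = [s for s in candidates if re.fullmatch(r"c{1}o?m{1}d?", s)]
without_counts = [s for s in candidates if re.fullmatch(r"co?md?", s)]

print(with_counts)
print(without_counts)
```

Both lists come out as cm, cmd, com and comd, so `co?md?` is an equivalent and shorter way to write it.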

