
Grep in unix

  • 10-01-2007 2:39pm
    #1
    Closed Accounts Posts: 4,237 ✭✭✭


    OK, not so much a programming question, but let's see if someone can help anyway. I've done this before, a few years ago, using egrep, but can't remember the string I used. I'm looking to pull out all occurrences of a string in a log file between two set times. For instance, I have a huge log file and I want to grep all occurrences of the word HELP between 12:01 and 12:30.

    Any takers?


Comments

  • Registered Users Posts: 441 ✭✭robfitz


    iregk wrote:
    Any takers?

    If the time and word are on the same line something like "12:[0-3][0-9].*HELP" should work.


  • Closed Accounts Posts: 7,230 ✭✭✭scojones


    What form does the file take? Does it have a timestamp on each line?


  • Closed Accounts Posts: 6,300 ✭✭✭CiaranC


    Post a few lines of the file


  • Closed Accounts Posts: 4,237 ✭✭✭iregk


    Yes, each line starts with a timestamp. I will post an example line tomorrow, but basically it's a timestamp in ISO form, then various columns of info.


  • Moderators, Technology & Internet Moderators Posts: 37,485 Mod ✭✭✭✭Khannie


    robfitz wrote:
    If the time and word are on the same line something like "12:[0-3][0-9].*HELP" should work.

    This would actually match minutes up to 39, i.e. everything from 12:00 to 12:39.

    I would use:
    grep -E "12:(([0-2][0-9])|30).*HELP" YOURFILENAMEHERE

    (grep -E can be substituted with "egrep")

    HTH

    edit: Just saw that you want it to start at 12:01. That is slightly messier.

    grep -E "12:((0[1-9])|([1-2][0-9])|30).*HELP" YOURFILENAMEHERE

    (The above assumes that you want to include matches at 12:30. Just remove the |30 if not.)
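
    To see the difference between the two patterns, here is a minimal sketch run against a made-up log. The "HH:MM:SS message" layout and every sample line are assumptions for illustration, not taken from the real file.

```shell
# Hypothetical sample log (invented): "HH:MM:SS message"
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
12:00:59 HELP too early
12:01:00 HELP first minute wanted
12:17:42 nothing to see here
12:29:10 HELP still in range
12:30:05 HELP last minute wanted
12:31:00 HELP too late
12:39:00 HELP only the looser pattern matches this
EOF

# Looser pattern: minutes 00 to 39
loose_count=$(grep -c -E "12:[0-3][0-9].*HELP" "$sample_log")

# Tighter pattern: minutes 01 to 30
tight_count=$(grep -c -E "12:((0[1-9])|([1-2][0-9])|30).*HELP" "$sample_log")
tight_matches=$(grep -E "12:((0[1-9])|([1-2][0-9])|30).*HELP" "$sample_log")

echo "loose pattern: $loose_count lines, tight pattern: $tight_count lines"
echo "$tight_matches"

rm -f "$sample_log"
```

    Neither pattern is anchored, so "12:" could also match where 12 is the minute (e.g. 09:12:05). If the time is the first thing on the line, putting ^ in front of the 12 avoids that.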


  • Closed Accounts Posts: 4,237 ✭✭✭iregk


    20070111190004640,Feed,GBPJPY=.RE_NY,J,233.92,234,,,0,L

    OK folks, there is an example of a line. Khannie, I tried your one above and it returned nothing. I checked the log and there is info there for those times.

    egrep "04:((0[1-9])|([1-2][0-9])|25).*EURGBP" 20070112000000498.RateReceivedLog


  • Moderators, Technology & Internet Moderators Posts: 37,485 Mod ✭✭✭✭Khannie


    I'll modify the query to match that later. Just for clarity, you want to match a line like that one, right?


  • Closed Accounts Posts: 4,237 ✭✭✭iregk


    Yes indeed khannie.


  • Moderators, Technology & Internet Moderators Posts: 37,485 Mod ✭✭✭✭Khannie


    Righto....I'm assuming the time format in that line is:

    YYYYMMDDHHmm (followed by five more digits, presumably seconds and milliseconds)

    So in your example, the line was at 19:00 on 11/01/2007.

    Assuming you want to match all lines between 12:01 and 12:30 (i.e. for all dates) you would use:

    grep -E "^[0-9]{8}12((0[1-9])|([1-2][0-9])|30)[0-9]{4,}.*EURGBP" YOURFILE

    I'll break this down for you....

    ^ = start of the line (so the 8 date digits are counted from the first character)
    [0-9]{8} = any digit, exactly 8 times (this covers the YYYYMMDD). If you want to match a specific date, replace this with (for example) 20070116 (today). Note that {8,} with a trailing comma would mean "8 or more", which lets the 12 match in the wrong position.
    12 = match 12 (hour is 12)
    ((0[1-9])|([1-2][0-9])|30) = match 01 to 30 (this is a bit messy but I can further explain it if you need that)
    [0-9]{4,} = any digit, 4 or more times (this isn't strictly necessary but removes the potential for false positives)
    .* = anything, any number of times
    EURGBP (your text)
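
    The breakdown above can be checked against a few sample lines. This is only a sketch, assuming the 17-digit timestamp always starts the line. Only the first sample line comes from this thread; the rest are invented for illustration. It compares an unanchored "8 or more" date part ({8,}) with an anchored "exactly 8" one (^ and {8}).

```shell
# Sample lines in the thread's format. Only the first line is from the thread;
# the others are invented for illustration.
rates_log=$(mktemp)
cat > "$rates_log" <<'EOF'
20070111190004640,Feed,GBPJPY=.RE_NY,J,233.92,234,,,0,L
20070112120000123,Feed,EURGBP=.RE_NY,J,0.66,0.67,,,0,L
20070112120100123,Feed,EURGBP=.RE_NY,J,0.66,0.67,,,0,L
20070112123059999,Feed,EURGBP=.RE_NY,J,0.66,0.67,,,0,L
20070112123100000,Feed,EURGBP=.RE_NY,J,0.66,0.67,,,0,L
20070112121500000,Feed,GBPJPY=.RE_NY,J,233.92,234,,,0,L
20070112012010000,Feed,EURGBP=.RE_NY,J,0.66,0.67,,,0,L
EOF

minutes='((0[1-9])|([1-2][0-9])|30)'

# Unanchored, "8 or more" date digits: the 12 can slide to the wrong position,
# so the 01:20:10 line is picked up as well.
loose_count=$(grep -c -E "[0-9]{8,}12${minutes}[0-9]{4,}.*EURGBP" "$rates_log")

# Anchored, exactly 8 date digits: the 12 has to be the hour.
strict_count=$(grep -c -E "^[0-9]{8}12${minutes}[0-9]{4,}.*EURGBP" "$rates_log")
strict_matches=$(grep -E "^[0-9]{8}12${minutes}[0-9]{4,}.*EURGBP" "$rates_log")

echo "loose: $loose_count lines, strict: $strict_count lines"
echo "$strict_matches"

rm -f "$rates_log"
```

    With these sample lines the strict form keeps only the EURGBP lines stamped 12:01 and 12:30, while the loose form also returns the line stamped 01:20:10.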

    Hope this helps. It's almost certainly worth your while spending an hour reading up on regular expressions. They are a massively powerful pattern matching tool.

