
Finding substrings in non-delimited text

  • 21-08-2014 11:49am
    #1
    Registered Users Posts: 1,770 ✭✭✭


    Looking for a handy way to find two substrings in a kernel message. Here's a typical line:

    Aug 21 11:44:09 server01 kernel: 0L6 D416D RT=D P=27 P=6 E=5 <7>DROP: IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00 PREC=0x00 TTL=62 ID=45176 DF PROTO=UDP SPT=32770 DPT=162 LEN=362

    I'm trying to pull out just 'SRC=192.168.23.252 DST=168.192.8.13'

    Usually I would just run cat messages | awk '{print $18,$19;}', but the field positions change from line to line, which makes this method useless.

    TIA


Comments

  • Moderators, Technology & Internet Moderators Posts: 37,485 Mod ✭✭✭✭Khannie


    A combination of grep and sed would probably achieve what you're looking for. How's your regex-ing?


  • Registered Users Posts: 1,770 ✭✭✭Sebzy


    Yeah, looking at grep now with Perl-style regex (-P) and I'm getting closer:
    grep -Po 'SRC[^LEN]*' /var/log/messages

    This gives me
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.23.252 DST1362681
    SRC=3.0.322DT162681
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.228.253 DST=192.168.8.13
    SRC=192.168.23.252 DST=192.168.8.13
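A note on the pattern above: `[^LEN]` is a character class, so it means "any single character other than L, E or N", not "anything up to the string LEN". It happens to work on clean lines because nothing between SRC and LEN contains those letters. The odd rows in the output look like they come from garbled log lines (that is an assumption, the raw lines aren't shown). A minimal Python sketch of a stricter match that skips such lines; the names `IP`, `SRC_DST` and `extract` are made up for illustration:

```python
import re

# Dotted quad; this checks the shape only, not that each octet is <= 255.
IP = r"\d{1,3}(?:\.\d{1,3}){3}"
SRC_DST = re.compile(rf"SRC=({IP})\s+DST=({IP})")

def extract(line):
    """Return 'SRC=... DST=...' for a well-formed line, else None."""
    m = SRC_DST.search(line)
    return f"SRC={m.group(1)} DST={m.group(2)}" if m else None

# Tail of the sample line from the first post.
good = ("<7>DROP: IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 "
        "SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00")
# One of the odd rows from the grep output, standing in for a garbled line.
garbled = "SRC=192.168.23.252 DST1362681"

# The character-class version stops at the first L, E or N it meets,
# so it also picks up the space before LEN.
loose = re.search(r"SRC[^LEN]*", good).group()
print(repr(loose))

print(extract(good))     # well-formed line
print(extract(garbled))  # no valid DST=, so nothing is returned
```

Requiring a literal `DST=` followed by a dotted quad is what filters out the broken rows; whether you want to drop them or log them separately depends on what you are doing with the data.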


  • Registered Users Posts: 1,477 ✭✭✭azzeretti


    This is quick and nasty (very nasty - and if I hadn't just pulled a 12 hour shift I'd say I could cut a couple of lines of it!) but should do what you want.

    EDIT: If you uncomment the "#print Dumper \@logs;" line you will get the value of the %res hash for each line. Might be handy while setting up.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $file = '/var/log/messages';
    open(my $fh, '<', $file) or die "Can't open $file: $!\n";
    my @logs;
    while (my $mess = <$fh>) {
        my %res;
        # Only tokens containing "=" are key=value pairs; ignore the rest.
        foreach my $token (split /\s+/, $mess) {
            next unless $token =~ /=/;
            # A limit of 2 keeps any "=" inside the value intact.
            my ($des, $val) = split /=/, $token, 2;
            $res{$des} = $val;
        }
        # %res now holds every key=value pair from this line;
        # keep a reference to it in the @logs array.
        push @logs, \%res;
    }
    close $fh;

    #print Dumper \@logs;

    # We can loop through the array pulling out
    # what we need, e.g. print $log->{'MY_VALUE'}
    foreach my $log (@logs) {
        # Skip lines that have no SRC/DST pair.
        next unless defined $log->{'SRC'} && defined $log->{'DST'};
        print "SRC: $log->{'SRC'} DST: $log->{'DST'}\n";
    }
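Since a Python answer comes up later in the thread, here is a rough sketch of the same split-on-whitespace idea in Python, for comparison. It is not a translation of the script above line for line; `parse_kv` is a made-up helper name, and the sample line is the one from the first post rather than a real read of /var/log/messages.

```python
def parse_kv(line):
    """Return a dict of every key=value token on the line."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # the token contained "="
            # A repeated key (e.g. LEN) keeps its last value.
            fields[key] = value
    return fields

# Sample line copied from the first post.
sample = ("Aug 21 11:44:09 server01 kernel: 0L6 D416D RT=D P=27 P=6 E=5 <7>DROP: "
          "IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 "
          "SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00 PREC=0x00 TTL=62 "
          "ID=45176 DF PROTO=UDP SPT=32770 DPT=162 LEN=362")

fields = parse_kv(sample)
print(f"SRC={fields.get('SRC')} DST={fields.get('DST')}")
```

Because the lookup is by key rather than by position, it does not matter where SRC and DST land on the line, which is the problem with the awk approach.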
    


  • Closed Accounts Posts: 18,966 ✭✭✭✭syklops


    Sebzy wrote: »
    Looking for a handy way to find two substrings in a kernel message. Here's a typical line:

    Aug 21 11:44:09 server01 kernel: 0L6 D416D RT=D P=27 P=6 E=5 <7>DROP: IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00 PREC=0x00 TTL=62 ID=45176 DF PROTO=UDP SPT=32770 DPT=162 LEN=362

    I'm trying to pull out just 'SRC=192.168.23.252 DST=168.192.8.13'

    Usually I would just run cat messages | awk '{print $18,$19;}', but the field positions change from line to line, which makes this method useless.

    TIA

    I don't suppose this is SIEM-related, is it? This is a task I do quite regularly as part of my SIEM work. I'll be back, but it will be in Python...

    Edit:

    Ok so...
    (?P<DATE>\w+\s\d+\s\d+:\d+:\d+)\s(?P<HOSTNAME>\w+)\s(?P<KERNEL>\w+:)\s(\w+\s\w+\s\D+\d+\s\D+\d\s\D+\d\s\D+\d\D\w+:\s)(?P<IN>\w+=\w+)\s\w+=\s(?P<MAC>\S+)\s\w+=(?P<SRC>\d+\.\d+\.\d+\.\d+)\s\w+=(?P<DST>\d+\.\d+\.\d+\.\d+)
    

    With the sample you have provided, that will match the hostname, source and destination. If you want to PM me some more samples, I can tweak it so it will match all message types.
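For anyone wanting to try a pattern like this from a script, here is a minimal sketch using Python's `re` module against the sample line from the first post. The pattern is split across lines only for readability. In this copy the dots in the IP groups are escaped and the group that captures the field after the empty `OUT=` is named `MAC`; treat both as adjustments made for the sketch. The prefix part is tied closely to this one sample, so other message types will likely need tweaks.

```python
import re

# Sample line copied from the first post.
LINE = ("Aug 21 11:44:09 server01 kernel: 0L6 D416D RT=D P=27 P=6 E=5 <7>DROP: "
        "IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 "
        "SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00 PREC=0x00 TTL=62 "
        "ID=45176 DF PROTO=UDP SPT=32770 DPT=162 LEN=362")

# Adjacent raw strings are concatenated into one pattern.
PATTERN = re.compile(
    r"(?P<DATE>\w+\s\d+\s\d+:\d+:\d+)\s"
    r"(?P<HOSTNAME>\w+)\s"
    r"(?P<KERNEL>\w+:)\s"
    r"(\w+\s\w+\s\D+\d+\s\D+\d\s\D+\d\s\D+\d\D\w+:\s)"
    r"(?P<IN>\w+=\w+)\s\w+=\s"
    r"(?P<MAC>\S+)\s"
    r"\w+=(?P<SRC>\d+\.\d+\.\d+\.\d+)\s"
    r"\w+=(?P<DST>\d+\.\d+\.\d+\.\d+)"
)

match = PATTERN.search(LINE)
if match:
    # groupdict() returns only the named groups, keyed by name.
    fields = match.groupdict()
    print(fields["HOSTNAME"], fields["SRC"], fields["DST"])
else:
    print("no match")
```

Named groups make the result readable by key, so the code that uses the match does not need to know the position of each field in the pattern.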

    I recommend regex101.com. It gives a visual representation of what's being matched. I used that to write the above.

    Let us know if that solves your issue, or if you need more help.

