
Finding sub-strings in non delimited text.

  • 21-08-2014 11:49AM
    #1
    Registered Users, Registered Users 2 Posts: 1,775 ✭✭✭


    Looking for a handy way to find two substrings in a kernel message. Here's a typical line:

    Aug 21 11:44:09 server01 kernel: 0L6 D416D RT=D P=27 P=6 E=5 <7>DROP: IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00 PREC=0x00 TTL=62 ID=45176 DF PROTO=UDP SPT=32770 DPT=162 LEN=362

    I'm trying to pull out just 'SRC=192.168.23.252 DST=168.192.8.13'

    Usually I would be able to just cat messages | awk '{print $18,$19;}', but the field positions change from line to line, making this method useless.

    TIA


Comments

  • Registered Users, Registered Users 2 Posts: 37,485 ✭✭✭✭Khannie


    A combination of grep and sed would probably achieve what you're looking for. How's your regex-ing?


  • Registered Users, Registered Users 2 Posts: 1,775 ✭✭✭Sebzy


    Yeah, I'm looking at grep with Perl-style regexes now and I'm getting closer:
    grep -Po 'SRC[^LEN]*' /var/log/messages

    This gives me
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.23.252 DST1362681
    SRC=3.0.322DT162681
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.23.252 DST=192.168.8.13
    SRC=192.168.228.253 DST=192.168.8.13
    SRC=192.168.23.252 DST=192.168.8.13
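
    One thing to watch with the pattern above: [^LEN] is a character class, so it stops at the first single L, E or N character rather than at the literal word LEN. A minimal sketch of an alternative, in Python, that names both tokens explicitly is shown here. The sample line is copied from the first post; the names SAMPLE, SRC_DST and extract_src_dst are made up for illustration.

```python
import re

# Sample line copied from the first post in this thread.
SAMPLE = (
    "Aug 21 11:44:09 server01 kernel: 0L6 D416D RT=D P=27 P=6 E=5 <7>DROP: "
    "IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 "
    "SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00 PREC=0x00 "
    "TTL=62 ID=45176 DF PROTO=UDP SPT=32770 DPT=162 LEN=362"
)

# Match the two tokens by name; \S+ takes everything up to the next whitespace.
SRC_DST = re.compile(r"SRC=\S+ DST=\S+")


def extract_src_dst(line):
    """Return the 'SRC=... DST=...' substring, or None if the line has none."""
    match = SRC_DST.search(line)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(extract_src_dst(SAMPLE))
```

    This assumes DST always follows SRC directly, separated by one space, as it does in the sample. The same pattern should also work with grep -Po 'SRC=\S+ DST=\S+' /var/log/messages.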


  • Registered Users, Registered Users 2 Posts: 1,477 ✭✭✭azzeretti


    This is quick and nasty (very nasty, and if I hadn't just pulled a 12-hour shift I'd say I could cut a couple of lines out of it!) but it should do what you want.

    EDIT: If you uncomment the "#print Dumper \@logs;" line you will get the value of the %res hash for each line. Might be handy while setting up.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $file = '/var/log/messages';
    open(my $fh, '<', $file) or die "Can't open $file: $!\n";

    my @logs;
    while (my $mess = <$fh>) {
        my %res;
        # Only tokens containing an "=" are of interest;
        # everything else on the line is ignored.
        foreach my $field (split /\s+/, $mess) {
            next unless $field =~ /=/;
            # A limit of 2 keeps any later "=" inside the value.
            my ($des, $val) = split /=/, $field, 2;
            $res{$des} = $val;
        }
        # %res now holds every KEY=VALUE pair from this line;
        # keep a reference to it in the @logs array.
        push @logs, \%res;
    }
    close $fh;

    #print Dumper \@logs;

    # We can loop through the array pulling out
    # what we need, e.g. print $log->{'MY_VALUE'}
    foreach my $log (@logs) {
        # Skip lines that carry no SRC/DST pair.
        next unless defined $log->{'SRC'} && defined $log->{'DST'};
        print "SRC: $log->{'SRC'} DST: $log->{'DST'}\n";
    }
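
    For comparison, the same split-on-"=" idea can be sketched in Python. This is only an illustration of the technique in the script above, not a replacement for it; the names SAMPLE and parse_fields are hypothetical, and the sample line is the one from the first post.

```python
# Sample line copied from the first post in this thread.
SAMPLE = (
    "Aug 21 11:44:09 server01 kernel: 0L6 D416D RT=D P=27 P=6 E=5 <7>DROP: "
    "IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 "
    "SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00 PREC=0x00 "
    "TTL=62 ID=45176 DF PROTO=UDP SPT=32770 DPT=162 LEN=362"
)


def parse_fields(line):
    """Collect every KEY=VALUE token on a line into a dict."""
    fields = {}
    for token in line.split():
        # partition() splits on the first "=" only; sep is "" if there is none.
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


if __name__ == "__main__":
    fields = parse_fields(SAMPLE)
    print("SRC:", fields.get("SRC"), "DST:", fields.get("DST"))
```

    Note that a key appearing twice on a line (LEN and P in the sample) keeps only its last value, which is also how the hash in the Perl version behaves.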
    


  • Closed Accounts Posts: 18,966 ✭✭✭✭syklops


    Sebzy wrote: »
    Looking for a handy way to find two substrings in a kernel message. Here's a typical line:

    Aug 21 11:44:09 server01 kernel: 0L6 D416D RT=D P=27 P=6 E=5 <7>DROP: IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00 PREC=0x00 TTL=62 ID=45176 DF PROTO=UDP SPT=32770 DPT=162 LEN=362

    I'm trying to pull out just 'SRC=192.168.23.252 DST=168.192.8.13'

    Usually I would be able to just cat messages | awk '{print $18,$19;}', but the field positions change from line to line, making this method useless.

    TIA

    I don't suppose this is SIEM-related, is it? This is a task I do quite regularly as part of my SIEM work. I'll be back, but it will be in Python...

    Edit:

    Ok so...
    (?P<DATE>\w+\s\d+\s\d+:\d+:\d+)\s(?P<HOSTNAME>\w+)\s(?P<KERNEL>\w+:)\s(\w+\s\w+\s\D+\d+\s\D+\d\s\D+\d\s\D+\d\D\w+:\s)(?P<IN>\w+=\w+)\s\w+=\s(?P<MAC>\S+)\s\w+=(?P<SRC>\d+\.\d+\.\d+\.\d+)\s\w+=(?P<DST>\d+\.\d+\.\d+\.\d+)
    

    With the sample you have provided that will match the hostname, source and destination. If you want to PM me some more samples I can tweak it so it will match all message types.
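
    As a rough sketch of how a pattern like the one above could be applied in Python: the version below has the dots in the IP addresses escaped and names the group that captures the MAC= token MAC. The names SAMPLE, LOG_PATTERN and parse_line are made up for illustration, and the pattern has only been checked against the one sample line from the first post.

```python
import re

# Sample line copied from the first post in this thread.
SAMPLE = (
    "Aug 21 11:44:09 server01 kernel: 0L6 D416D RT=D P=27 P=6 E=5 <7>DROP: "
    "IN=eth0 OUT= MAC=00:10:22:1c:88:77:ef:ba:70:42:00:80:08:00 "
    "SRC=192.168.23.252 DST=168.192.8.13 LEN=382 TOS=0x00 PREC=0x00 "
    "TTL=62 ID=45176 DF PROTO=UDP SPT=32770 DPT=162 LEN=362"
)

# Adjacent raw strings are joined into one pattern at compile time.
LOG_PATTERN = re.compile(
    r"(?P<DATE>\w+\s\d+\s\d+:\d+:\d+)\s(?P<HOSTNAME>\w+)\s(?P<KERNEL>\w+:)\s"
    r"(\w+\s\w+\s\D+\d+\s\D+\d\s\D+\d\s\D+\d\D\w+:\s)"
    r"(?P<IN>\w+=\w+)\s\w+=\s(?P<MAC>\S+)\s"
    r"\w+=(?P<SRC>\d+\.\d+\.\d+\.\d+)\s\w+=(?P<DST>\d+\.\d+\.\d+\.\d+)"
)


def parse_line(line):
    """Return a dict of the named groups, or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    parsed = parse_line(SAMPLE)
    if parsed:
        print(parsed["HOSTNAME"], parsed["SRC"], parsed["DST"])
```

    Because the pattern spells out the whole prefix of the line, any message with a different layout before IN= will return None, which is where extra samples would help.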

    I recommend regex101.com. It gives a visual representation of what's being matched. I used that to write the above.

    Let us know if that solves your issue, or if you need more help.

