Reg expression help!

  • 11-05-2015 12:14pm
    #1
    Registered Users Posts: 29


    Hi folks, total noob question here. I'd like to search for all image tags within a certain div element but just can't figure it out. I've been able to find a regular expression that will match all image tags within an HTML document, but not one that matches only those within certain div tags.

    Currently I have: <img\s+[^>]*?src=("|')([^"']+)\1 but I would like to use this to search for all image tags within the following div block (identified by its class): <div class="lower-content-inner-right-top">

    If anyone could help or point me in the right direction, I would really appreciate it!
    Thanks
    S


Comments

  • Registered Users Posts: 1,275 ✭✭✭bpmurray


    Your first expression seems a little overcooked if you're simply looking for images - this will find them:
    <img\s[^>]*>
    
    The problem with finding tags embedded in another specific tag is that a regular expression finds a pattern in text, whereas this really needs the text to be parsed if you want to do it right. A simplistic (and wrong) solution would be to assume that the img comes somewhere after the div's opening tag, without checking that it is actually inside the div:
    <div\s[^>]*class="lower-content-inner-right-top"[^>]*>.*<img\s[^>]*>
    
    In reality you want lexical analysis, i.e. a real parser rather than pattern matching.
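
    A minimal sketch in Python of how the div-then-img pattern above behaves. The sample markup and variable names are made up for illustration, and re.DOTALL is an added assumption so that .* can cross line breaks. It suggests the pattern is good enough to flag a page that has an image after the div, but not to pull out each image inside it:

```python
import re

# Hypothetical sample page; the markup is invented for illustration.
html = (
    '<div class="lower-content-inner-right-top">'
    '<img src="a.png"><p>text</p><img src="b.png">'
    '</div>'
    '<div class="footer"><img src="c.png"></div>'
)

# The pattern from the post, compiled with DOTALL (an assumption) so that
# ".*" can also span newlines in real pages.
div_then_img = re.compile(
    r'<div\s[^>]*class="lower-content-inner-right-top"[^>]*>.*<img\s[^>]*>',
    re.DOTALL,
)

# With no capture groups, findall returns the full text of each match.
matches = div_then_img.findall(html)

# The greedy ".*" swallows everything up to the LAST <img> in the input,
# so there is a single match and it ends at an image outside the target div.
print(len(matches))
print(matches[0][-20:])
```

    So it answers "does this page have an image after that div?", which may be all that is needed when looping over pages, but it cannot list the individual images within the div.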


  • Registered Users Posts: 29 SeamusHollahan


    Thanks a million bpmurray, it worked a treat. The solution you provided works perfectly in this case, as I need to find all images that have been entered onto pages using a CMS, and I know that these images exist only within the div tag I provided. I'm currently looping through all the site pages, and the pages it brings back all have images...

    Thanks again!
    S.


  • Registered Users Posts: 1,109 ✭✭✭Skrynesaver


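    WARNING: your original regex would not handle a tag that looks like <img alt="Next>" src="button.gif"/> because [^>] stops at the > inside the alt attribute, so the src is never reached. The simpler <img\s[^>]*> has the same weakness and cuts the tag off at that >.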
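
    A quick way to check how both patterns from this thread behave on that tag, sketched in Python (the variable names are illustrative only):

```python
import re

# The awkward but valid tag: the alt attribute value contains a ">".
tag = '<img alt="Next>" src="button.gif"/>'

# The original pattern from the first post.
original = re.compile(r'''<img\s+[^>]*?src=("|')([^"']+)\1''')

# The simpler pattern suggested in the first reply.
simple = re.compile(r'<img\s[^>]*>')

# [^>]*? cannot step over the ">" inside alt="Next>", so "src=" is never
# reached and the original pattern finds nothing.
original_match = original.search(tag)

# The simpler pattern does match, but it stops at the first ">", which is
# the one inside the attribute value, so the tag is cut short.
simple_match = simple.search(tag).group(0)

print(original_match)
print(simple_match)
```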

    As bpmurray points out, you should not attempt to parse HTML with a regex. HTML requires an actual parser, and your language will have an XML/HTML parsing library.
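
    As one possible illustration, here is a minimal sketch using Python's standard-library html.parser. The class name DivImageCollector, the sample markup and the depth-counting approach are assumptions made for this example, not something from the thread; other languages and libraries will look different.

```python
from html.parser import HTMLParser

TARGET_CLASS = "lower-content-inner-right-top"


class DivImageCollector(HTMLParser):
    """Collect the src of every <img> inside divs carrying a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # div nesting depth inside the target div; 0 = outside
        self.sources = []   # src values found so far

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                # A nested div inside the target div.
                self.depth += 1
            elif self.target_class in (attrs.get("class") or "").split():
                # Entering the target div itself.
                self.depth = 1
        elif tag == "img" and self.depth > 0:
            src = attrs.get("src")
            if src:
                self.sources.append(src)

    def handle_endtag(self, tag):
        # Only div end tags change the depth; img has no closing tag to track.
        if tag == "div" and self.depth > 0:
            self.depth -= 1


# Hypothetical sample markup, including the awkward alt="Next>" tag.
sample = (
    '<div class="lower-content-inner-right-top">'
    '<img alt="Next>" src="button.gif"/>'
    '<div><img src="nested.png"></div>'
    '</div>'
    '<img src="outside.png">'
)

collector = DivImageCollector(TARGET_CLASS)
collector.feed(sample)
collector.close()
image_sources = collector.sources
print(image_sources)
```

    On this sample the collector should pick up button.gif and nested.png and skip outside.png, since the parser handles the quoted > and the depth counter tracks where the target div closes. Badly unbalanced markup could still throw the counter off, so treat it as a starting point only.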

