
java regex

  • 05-02-2009 10:43pm
    #1
    Registered Users, Registered Users 2 Posts: 163 ✭✭


    I need two Java regular expressions to extract the anchor text and the URL from an HTML link like this: <a href="http://www.example.com/chapter2.html">chapter two</a>

    so what i want to be left with is
    url: http://www.example.com/chapter2.html
    anchor: chapter two

    I have something like this for the url: "http://[a-zA-Z_0-9.-&/+=]+"

    It works for simple URLs, but exotic characters mess it up. Also, I'm using the " at the end to terminate the match, and I don't think this is the best way.

    I have ">[a-zA-Z_0-9[\\W]]+<a/>" for the anchor, but they don't seem to cover every eventuality. Has anyone got a set that would handle any permutation?


Comments

  • Subscribers Posts: 4,076 ✭✭✭IRLConor


    Untested:
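    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Group 1: double-quoted href, group 2: single-quoted href, group 3: anchor text.
    // Note the doubled backslash: the regex \s must be written "\\s" in a Java string literal.
    Pattern p = Pattern.compile("<a\\s+.*?href=(?:\"(.*?)\"|'(.*?)').*?>(.*?)</a>");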
    Matcher m = p.matcher(theStringToSearch);
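    if (m.find()) {  // find() locates a link anywhere in the input; matches() only succeeds if the whole string is the link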
        String anchorText = m.group(3);
        String url = m.group(1);
        if (url == null) {
            url = m.group(2);
        }
    }
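
    A slightly fuller sketch of the same idea, in case it helps. It pulls every link out of a longer string rather than just one. The class name LinkExtractor and the method extractLinks are made up for illustration, and swapping .*? for [^>]* and [^"]* is my own tweak so a match cannot run past the end of the tag or the quoted value. It assumes the href is always quoted and that links are not nested. A regex like this is not a full HTML parser, so treat it as a starting point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {
    // Group 1: double-quoted href, group 2: single-quoted href, group 3: anchor text.
    // CASE_INSENSITIVE accepts <A HREF=...>; DOTALL lets anchor text span line breaks.
    private static final Pattern LINK = Pattern.compile(
        "<a\\s+[^>]*?href=(?:\"([^\"]*)\"|'([^']*)')[^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns one {url, anchorText} pair for every link found in html. */
    public static List<String[]> extractLinks(String html) {
        List<String[]> links = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        // find() scans forward through the input, one link per iteration.
        while (m.find()) {
            // Only one of the two href groups participates in any given match.
            String url = m.group(1) != null ? m.group(1) : m.group(2);
            links.add(new String[] { url, m.group(3) });
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://www.example.com/chapter2.html\">chapter two</a>";
        for (String[] link : extractLinks(html)) {
            System.out.println("url: " + link[0]);
            System.out.println("anchor: " + link[1]);
        }
    }
}
```

    Because the URL is captured as "everything up to the closing quote", there is no need to list the characters a URL may contain, which is what trips up the hand-written character class.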
    


  • Registered Users, Registered Users 2 Posts: 163 ✭✭stephenlane80


    Thanks, I will try it this afternoon, but it looks pretty good.

