Stripping data submitted by users of elements

  • 18-10-2008 6:40pm
    #1
    Closed Accounts Posts: 2,300 ✭✭✭


    I'm doing some work on a java/jsp site.

    Users will submit reviews which'll be posted straight up on the site, so I need to strip < > elements (HTML tags) out of them. Is that, within reason, all I need to do?

    SQL injection is a whole other kettle of fish AFAIK (isn't it?). I'm not too worried about that though: passwords are encrypted for starters, nothing very sensitive is stored, etc.


Comments

  • Registered Users, Registered Users 2 Posts: 569 ✭✭✭none


    Check here (http://www.comp.lancs.ac.uk/computing/research/cseg/projects/ariadne/ihe/web/chars.html and http://www.w3.org/TR/WD-html40-970708/sgml/entities.html) for the reserved chars, but in most cases they won't break the HTML layout even if you don't encode them. What you really need to worry about is your Java and JavaScript strings and, as you said yourself, the SQL backend, as they are all much more sensitive to the reserved chars than HTML is.
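
For reference, encoding the reserved chars is only a few lines of Java. This is a minimal sketch, not anything from the site in question: the class and method names (`HtmlEscape`, `escapeHtml`) are made up for illustration, and it only covers the five characters usually treated as reserved in HTML text and attribute values.

```java
public class HtmlEscape {

    // Replace each HTML reserved character with its entity reference.
    // Hypothetical helper for illustration only.
    public static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: &lt;script&gt;alert(&#39;hi&#39;)&lt;/script&gt;
        System.out.println(escapeHtml("<script>alert('hi')</script>"));
    }
}
```

Encoded this way, the browser shows the tags as literal text instead of interpreting them.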


  • Closed Accounts Posts: 2,300 ✭✭✭nice1franko


    All I really want them to be able to enter is plain text. So no html or script elements at all.

    The reviews will only be about 300 characters, max.
    String str = "<tr align='center'><td>This product is great..<br style='line-height:20px;'></td></tr><script>alert('im a hacker me')</script>:D";
    
    str = str.replaceAll("\\<.*?\\>", "");
    
    System.out.println(str);
    

    outputs:
    This product is great..alert('im a hacker me'):D
    

    Is that good enough do ya reckon?
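
One thing worth checking with the regex above: it only removes a tag when it finds the closing `>`. The sketch below (hypothetical class and method names, sample input invented for the demo) runs the same pattern against a review that ends in an unclosed tag, and compares it with escaping the reserved chars instead of stripping them. Whether a browser treats the leftover fragment as a tag depends on the markup that follows it in the page, so treat this as an illustration of the gap rather than a full security review.

```java
public class StripVsEscape {

    // Same pattern as in the post above: remove anything shaped like <...>.
    static String stripTags(String s) {
        return s.replaceAll("\\<.*?\\>", "");
    }

    // Alternative: keep the text but encode the reserved chars.
    // The ampersand is replaced first so later entities are not double-encoded.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    public static void main(String[] args) {
        // Invented sample review ending in a tag with no closing '>'.
        String review = "Nice <b>one</b> <img src=x onerror=alert(1)";

        // prints: Nice one <img src=x onerror=alert(1)
        System.out.println(stripTags(review));

        // prints: Nice &lt;b&gt;one&lt;/b&gt; &lt;img src=x onerror=alert(1)
        System.out.println(escapeHtml(review));
    }
}
```

The stripped version still contains a raw `<`, while the escaped version has none left, which is why escaping on output is often used alongside (or instead of) stripping.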


  • Closed Accounts Posts: 2,300 ✭✭✭nice1franko


    or possibly this one :
    str = str.replaceAll("</?\\w++[^>]*+>", "");
    


  • Registered Users, Registered Users 2 Posts: 569 ✭✭✭none


    I thought your question was about where their input may cause problems. To the end user it is almost always just plain text, but to the computer it may well be an issue, and that is what you have to watch out for. For Java and JavaScript strings it's mostly the quote, slash and CR/LF chars, but your main concern may be your SQL backend, which can have other restrictions.
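
As a rough illustration of the JavaScript side: if a review ever gets written into a JavaScript string literal in the page, the quote, backslash and CR/LF chars need escaping. The helper below is a sketch under assumptions (the name `escapeJsString` is made up, "slash" is read here as the backslash, and escaping `<` is an extra precaution so the text cannot close a surrounding script block). For the SQL backend, the usual approach is to pass user input as parameters to a `java.sql.PreparedStatement` rather than escaping it by hand.

```java
public class JsStringEscape {

    // Escape text so it can sit inside a quoted JavaScript string literal.
    // Hypothetical helper for illustration only.
    public static String escapeJsString(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break; // backslash
                case '\'': out.append("\\'");  break; // single quote
                case '"':  out.append("\\\""); break; // double quote
                case '\r': out.append("\\r");  break; // carriage return
                case '\n': out.append("\\n");  break; // line feed
                case '<':  out.append("\\x3C"); break; // hex escape for '<'
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A review containing a quote and a line break.
        System.out.println(escapeJsString("it's great\nreally"));
    }
}
```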

