
regular expression to remove all non-printable characters

  • 28-01-2013 4:16pm
    #1
    Moderators, Science, Health & Environment Moderators, Social & Fun Moderators, Society & Culture Moderators Posts: 60,110 Mod ✭✭✭✭


    I wish to remove all non-printable ASCII characters from a string while retaining invisible ones. I thought this would work because whitespace, \n, \r and \b are invisible characters rather than non-printable ones? Basically, I am getting a byte array with � characters in it (\uFFFD) and I don't want them there. So I am trying to convert it to a string and remove the � characters before using it as a byte array again.

    With the code below they are removed, but so are any occurrences of \r, \n and \b. What would be the correct regex to retain these as well? Or is there a better way than what I am doing?
    public void write(byte[] bytes, int offset, int count)
    {
        try {
            // Decode only the requested slice; invalid bytes become \uFFFD
            String str = new String(bytes, offset, count, "UTF-8");
            // Strip everything except printable ASCII, tab and newline
            String str2 = str.replaceAll("[^\\p{Print}\t\n]", "");
            GraphicsTerminalActivity.sendOverSerial(str2.getBytes("UTF-8"));
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
    

    Do I have to add some clause for ASCII control characters?
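For what it's worth, a minimal sketch of one way to express "printable ASCII plus a few control characters" is to list everything explicitly instead of relying on \p{Print}. In the standard Java regex engine \p{Print} covers only 0x20 to 0x7E, so \r and \b are stripped unless they are added to the class; other engines (Android's, for instance) may define it differently, which is why the sketch uses a plain range. The class and method names here are made up for illustration and are not from the original code.

```java
class PrintableFilter {
    // Keep printable ASCII (0x20-0x7E) plus tab, newline, carriage return
    // and backspace (0x08); everything else, including \uFFFD, is removed.
    // The explicit range avoids depending on how \p{Print} is defined.
    static String keepPrintableAndControls(String input) {
        return input.replaceAll("[^\\x20-\\x7E\\t\\n\\r\\x08]", "");
    }

    public static void main(String[] args) {
        // Sample input: a replacement character, CR, LF, backspace and BEL
        String raw = "ab\uFFFDc\r\n\bd\u0007";
        String cleaned = keepPrintableAndControls(raw);
        // The replacement character and BEL are gone; \r, \n and \b survive
        System.out.println(cleaned.equals("abc\r\n\bd"));
    }
}
```

Backspace is written as \x08 inside the character class because \b normally means a word boundary in a regex.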


Comments

  • Registered Users, Registered Users 2 Posts: 1,931 ✭✭✭PrzemoF




  • Moderators, Science, Health & Environment Moderators, Social & Fun Moderators, Society & Culture Moderators Posts: 60,110 Mod ✭✭✭✭Tar.Aldarion


    Thanks, I tried that earlier and got exactly the same behaviour as I have at the moment. Although I was trying replaceAll("\\p{C}", ""); due to ? listing all options in the terminal.


  • Moderators, Science, Health & Environment Moderators, Social & Fun Moderators, Society & Culture Moderators Posts: 60,110 Mod ✭✭✭✭Tar.Aldarion


    I tried [^\\x00-\\x7F], where \x00-\x7F is the range of ASCII characters, but then the � symbols still get through, weird.
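As a possible alternative to the String round trip, here is a rough sketch that filters the raw bytes directly. It assumes the serial side only wants ASCII, so any byte of 0x80 or above is dropped, which includes the three UTF-8 bytes of \uFFFD (EF BF BD). Decoding invalid bytes as UTF-8 is what produces \uFFFD in the first place, so skipping the decode step avoids creating it. The class and helper names are hypothetical and not part of the original code.

```java
import java.util.Arrays;

class ByteFilter {
    // Hypothetical helper: copy only printable ASCII bytes and the
    // control characters \t, \n, \r and \b from the requested slice.
    static byte[] keepAsciiBytes(byte[] bytes, int offset, int count) {
        byte[] out = new byte[count];
        int n = 0;
        for (int i = offset; i < offset + count; i++) {
            byte b = bytes[i];
            // Java bytes are signed, so values of 0x80 and above are
            // negative and fail the range check below.
            boolean printable = b >= 0x20 && b <= 0x7E;
            boolean keptControl = b == '\t' || b == '\n' || b == '\r' || b == '\b';
            if (printable || keptControl) {
                out[n++] = b;
            }
        }
        // Trim the buffer to the number of bytes actually kept
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        byte[] raw = {'o', 'k', (byte) 0xEF, (byte) 0xBF, (byte) 0xBD, '\r', '\n', 0x07};
        byte[] cleaned = keepAsciiBytes(raw, 0, raw.length);
        System.out.println(Arrays.toString(cleaned)); // prints [111, 107, 13, 10]
    }
}
```

The result could then be passed straight to something like sendOverSerial without converting to a String at all.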

