regular expression to remove all non-printable characters

  • 28-01-2013 05:16PM
    #1


    I wish to remove all non-printable ASCII characters from a string while retaining invisible ones. I thought this would work because whitespace, \n, \r and \b are invisible characters but not non-printable? Basically I am getting a byte array with � characters in it (\uFFFD) and I don't want them in it. So I am trying to convert it to a string and remove the � characters before using it as a byte array again.

    With the code below they are removed, but so are any occurrences of \r, \n and \b. What would be the correct regex to retain these as well? Or is there a better way than what I am doing?
    public void write(byte[] bytes, int offset, int count)
    {
        try {
            // Decode the incoming bytes as UTF-8; malformed input shows up as U+FFFD
            String str = new String(bytes, "UTF-8");
            // Strip everything except printable ASCII, tab and newline
            String str2 = str.replaceAll("[^\\p{Print}\t\n]", "");
            GraphicsTerminalActivity.sendOverSerial(str2.getBytes("UTF-8"));
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
    

    Do I have to add some clause for ASCII control characters?
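
    For reference, here is a minimal sketch of one way the character class could be widened so that \r and backspace survive as well. The class name ControlCharFilter and the clean methods are made up for illustration, and it assumes that tab, \n, \r and backspace are the only control characters that need to be kept. As far as I know, java.util.regex does not accept \b as backspace inside a character class, so the sketch spells it \x08. It also decodes only the offset/count slice of the buffer, which is an assumption about what write is meant to forward.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
class ControlCharFilter {

    // Matches anything that is NOT printable ASCII, tab, LF, CR or backspace.
    // \p{Print} is ASCII-only by default, so U+FFFD is matched and removed.
    private static final Pattern UNWANTED =
            Pattern.compile("[^\\p{Print}\\t\\n\\r\\x08]");

    // Remove every unwanted character from an already decoded string.
    static String clean(String s) {
        return UNWANTED.matcher(s).replaceAll("");
    }

    // Decode only the requested slice, filter it, and re-encode as UTF-8.
    static byte[] clean(byte[] bytes, int offset, int count) {
        String decoded = new String(bytes, offset, count, StandardCharsets.UTF_8);
        return clean(decoded).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String input = "ok\r\n\b\uFFFD\u0007!";
        // The replacement character and BEL are dropped; CR, LF and BS remain.
        System.out.println(clean(input).equals("ok\r\n\b!"));
    }
}
```

    Using StandardCharsets.UTF_8 instead of the "UTF-8" string also avoids the UnsupportedEncodingException catch block.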

