
Java: Divide a string (firstname,surname)

  • 29-11-2007 5:26pm
    #1
    Registered Users, Registered Users 2 Posts: 2,236 ✭✭✭


    Hi,
    I have just started strings and processing them in class. I have been presented with the following problem:

    User enters a firstname and a surname divided by a space. Then the program outputs the firstname and surname on separate lines.

    Here's what I have done and it works well:
    	public static void main(String[] args)
    	{
    	Scanner input=new Scanner(System.in);
    	
    	String name;
    	int stringLength;
    	String name1,name2;
    	int spacePos=0;
    	
    	System.out.print("Please enter your name ");name=input.nextLine();
    	
    	while (name.charAt(spacePos)!=(char)32)
    	{
    		spacePos++;
    		
    	}
    	name1=name.substring(0,spacePos);
    	name2=name.substring(spacePos+1,name.length());
    	System.out.println("Name 1: "+name1);
    	System.out.println("Name 2: "+name2);
    	
    	}
    

    My question is, is this program designed efficiently? I have a feeling that the whole idea of using a loop to find out where the space is (which is the main speedbump in the problem) is a bit daft.

    Could you guys shed some light on this please? Maybe how I could shorten it?

    Thanks..


Comments

  • Registered Users, Registered Users 2 Posts: 1,916 ✭✭✭ronivek


    First of all, you should try to avoid casting integers to characters, especially within boolean expressions. You can use character literals such as 'a', 'b', 'c', '?' and ' ' instead: just enclose whatever character you're looking for in single quotes. It makes code that manipulates Strings so much easier to read.

    In terms of using a loop to find the space; that's pretty much the only way to do it. There are Java Classes that you could use to perform the same function; but for such a trivial program and for someone who seems to be learning the language I don't believe there's much point discussing them here. Have a look at StringTokenizer Class or the split method within the Java String Class if you're interested.

    In terms of shortening it; here's my revision of what you've done. It's a little more concise but not that much more efficient.
    int delimIndex;
    // Advance delimIndex to the first space; the loop body is intentionally empty.
    for (delimIndex = 0; name.charAt(delimIndex) != ' '; delimIndex++);
    System.out.println("Name 1: " + name.substring(0, delimIndex));
    System.out.print("Name 2: " + name.substring(delimIndex + 1));
    


  • Registered Users, Registered Users 2 Posts: 4,188 ✭✭✭pH


    The only slight problem is the error condition if a space isn't found - your code continues when it probably shouldn't.

    Java Strings already have an indexOf method which you could use instead of your loop:

    int spacePos = name.indexOf(32);
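
For anyone following along, here is a minimal sketch of the indexOf approach with the "no space found" case handled. The class and method names (NameSplitDemo, splitAtFirstSpace) are made up for illustration, and returning null for input without a space is just one assumed way of signalling the error.

```java
class NameSplitDemo {
    // Returns {first, rest}, or null when the input has no space to split on.
    static String[] splitAtFirstSpace(String name) {
        int spacePos = name.indexOf(' ');   // indexOf returns -1 when no space is present
        if (spacePos == -1) {
            return null;
        }
        return new String[] { name.substring(0, spacePos), name.substring(spacePos + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitAtFirstSpace("Joe Bloggs");
        System.out.println("Name 1: " + parts[0]);
        System.out.println("Name 2: " + parts[1]);

        // Without the -1 check, substring(0, -1) would throw an exception here.
        if (splitAtFirstSpace("Madonna") == null) {
            System.out.println("No space found, please enter a firstname and a surname.");
        }
    }
}
```

The character literal ' ' is used instead of 32 purely for readability; both find the same character.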


  • Registered Users, Registered Users 2 Posts: 11,989 ✭✭✭✭Giblet


    String string = "First Last";
    String first = string.substring(0,string.indexOf(" "));
    String second = string.substring(string.indexOf(" ")+1,string.length());

    System.out.println(first);
    System.out.println(second);


  • Registered Users, Registered Users 2 Posts: 1,604 ✭✭✭kyote00


    The tokenizer class is designed for this sort of thing...

    StringTokenizer st = new StringTokenizer("Firstname LastName");
    while (st.hasMoreTokens()) {
        System.out.println(st.nextToken());
    }


  • Closed Accounts Posts: 198 ✭✭sh_o


    Have a look at the java.util.StringTokenizer class which is very useful for this type of thing.


  • Registered Users, Registered Users 2 Posts: 2,236 ✭✭✭techguy


    Thanks a million for your help lads..

    I'm after shortening it to:
    String name;
    	System.out.print("Please enter your name ");name=input.nextLine();
    	int spacePos=name.indexOf(32);
    	System.out.println("Name 1: "+name.substring(0,spacePos));
    	System.out.println("Name 2: "+name.substring(spacePos+1));
    

    What do you guys think?
    @ ronivek, did you use a for loop? I haven't covered that yet.
    @ sh_o, I haven't covered the tokenizer yet so I won't go there; I want to keep going at the same pace as my class. Thanks anyway.

    Also, is it regarded as inefficient to declare variables to store data that may only be printed once or twice in the given program?

    is it better to do:
    System.out.println("Name 1: "+name.substring(0,spacePos));
    

    instead of:
    name1=name.substring(0,spacePos);
    System.out.println("Name 1: "+name1);
    

    I seem to be doing the latter a lot. I'm sure that's a sure telltale that I'm not designing my program before I go to the keyboard :(


  • Closed Accounts Posts: 413 ✭✭sobriquet


    Dude, you haven't covered for loops yet. Relax. When you get more into coding you'll probably come across a saying: premature optimization is the root of all evil. You can't even code yet and you're optimizing! I admire and/or worry about your enthusiasm.

    Inlining something in an expression ("Name 1: " + foo.bar) is a non-issue. Performance-wise it makes no difference as far as you're concerned; the compiler may well do its own thing with it anyway. In terms of coding practice, well, it depends. You want to write maintainable code so that when you come back to it later, having forgotten what it does, it's clear and easy to debug or expand. The two examples you give are pretty much equivalent: they'll be inside a function somewhere, without side effects. Changing from one to the other would have no effect on the rest of the program. Personally I'm a fan of not declaring variables or creating functions for things that are only used once. Others may disagree. Experience will teach you what style you prefer and why.

    And designing your program before you go to the keyboard?! Yeah, I remember my lecturers banging on about that too. Your program as it stands is 5 lines long. When you integrate that into a larger program (try 100, or 1000, or 100,000 lines long), then you can refactor that code into something more maintainable. It's good that you're mindful of good programming practice, but don't let that stop you from actually writing code.


  • Registered Users, Registered Users 2 Posts: 21,264 ✭✭✭✭Hobbes


    String[] n = "Joe Bloggs".split("\\s+");  // Splits on runs of one or more whitespace characters.
    String firstName = n[0];
    String lastName = n[1];
    
    Also, is it regarded as inefficient to declare variables to store data that may only be printed once or twice in the given program?

    Not really. If you were to do what was mentioned it would not have any memory overhead. Objects marked for garbage collection rarely get collected directly after they have lost scope within the program.

    The example above would use the same memory as Strings are immutable.

    Just to add to that, if the number of spaces is not known or you need to factor in other names (eg. "Bob van Eyck") then I recommend looking at the regex stuff.
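
As a rough sketch of the multi-part name case, the two-argument form of String.split takes a limit on the number of pieces. The example below assumes that everything after the first word counts as the surname, which is a simplification and will not suit every name; the class and method names (SurnameSplitDemo, firstAndRest) are made up for illustration.

```java
class SurnameSplitDemo {
    // Splits into at most two parts: the first word, and everything after it.
    // trim() avoids an empty leading element when the input starts with spaces.
    static String[] firstAndRest(String fullName) {
        return fullName.trim().split("\\s+", 2);
    }

    public static void main(String[] args) {
        String[] parts = firstAndRest("Bob van Eyck");
        System.out.println("Name 1: " + parts[0]);
        // Guard against input with only one word.
        System.out.println("Name 2: " + (parts.length > 1 ? parts[1] : "(none given)"));
    }
}
```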


  • Closed Accounts Posts: 37 kleftangel


    You wouldn't ever do this unless you wanted to use the names separately, in which case it's just as easy to ask for them separately.

    You should look up the String API to see the methods you can call on Strings. One way to do it is to use the split(String regex) method, and check for the space as a String rather than casting. This helps because you can compare against any specific char or String you want. It works like this:

    String name = in.nextLine();
    String[] names = name.split(" ");  // a new variable name; "name" is already declared above

    This will create an array that holds each String on either side of the space. Then print them out with a for loop or whatever.

    This is also handy for users who may have more than two names, or any other split uses you can think of.


  • Registered Users, Registered Users 2 Posts: 21,264 ✭✭✭✭Hobbes


    kleftangel wrote: »
    You wouldn't ever do this unless you wanted to use the names separately, in which case it's just as easy to ask for them separately.

    Are you referring to my post? Because it is the best method to use, and you appeared to suggest the exact same thing.

    Incidentally, split(" ") splits on each single space, whereas "\\s+" matches a run of one or more whitespace characters.
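
A small sketch of the difference, using a name typed with two spaces in the middle. The class and method names here are made up for illustration.

```java
import java.util.Arrays;

class SplitDelimiterDemo {
    // Splits at every individual space character.
    static String[] splitOnSingleSpace(String s) {
        return s.split(" ");
    }

    // Treats any run of whitespace as one delimiter.
    static String[] splitOnWhitespaceRun(String s) {
        return s.split("\\s+");
    }

    public static void main(String[] args) {
        String name = "Joe  Bloggs"; // two spaces between the names
        System.out.println(Arrays.toString(splitOnSingleSpace(name)));   // [Joe, , Bloggs]
        System.out.println(Arrays.toString(splitOnWhitespaceRun(name))); // [Joe, Bloggs]
    }
}
```

With the single-space delimiter the second array element is an empty String, so n[1] would not be the surname.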


  • Registered Users, Registered Users 2 Posts: 1,311 ✭✭✭Procasinator


    Hobbes wrote: »
    Are you referring to my post? Because it is the best method to use, and you appeared to suggest the exact same thing.

    Not that it really matters, but if we are talking about performance the StringTokenizer would probably outperform String.split(). Because StringTokenizer only handles simple delimiters (i.e. no regex), there is less overhead.

    That being said, the documentation recommends String.split() instead of StringTokenizer, which is only kept around to support legacy applications, so the choice is obvious.
    StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

    There ya go, thought I'd throw in some useless rubbish instead of letting this thread die naturally. :P


  • Registered Users, Registered Users 2 Posts: 21,264 ✭✭✭✭Hobbes


    Not that it really matters, but if we are talking about performance the StringTokenizer would probably outperform String.split(). Because StringTokenizer only handles simple delimiters (i.e. no regex), there is less overhead.

    With StringTokenizer you would be making more objects, and the regex stuff actually runs faster.


  • Closed Accounts Posts: 37 kleftangel


    Hobbes wrote: »
    Are you referring to my post? Because it is the best method to use, and you appeared to suggest the exact same thing.

    Incidentally, split(" ") splits on each single space, whereas "\\s+" matches a run of one or more whitespace characters.

    Hey, no, sorry. I was saying you wouldn't ask for the first name and last name together if you had the intention of using them separately. You should ask for them separately.


  • Registered Users, Registered Users 2 Posts: 1,311 ✭✭✭Procasinator


    Hobbes wrote: »
    With StringTokenizer you would be making more objects, and the regex stuff actually runs faster.

    No point really blabbing on, but I think we are going to have to agree to disagree here (cause I ain't got the desire to benchmark it).

    Saying that, I've never seen String.split() reported faster than StringTokenizer. For example:

    http://worsethanfailure.com/Comments/AddComment.aspx?ArticleId=5074&ReplyTo=135084&Quote=Y
    http://blog.emptyway.com/2007/04/04/dont-always-trust-javadocs/

    I can't really see how StringTokenizer creates significantly more objects (they both return String objects, one in an array, one you iterate through), or how putting regex into the mix couldn't cause a performance penalty.

    That said, unless I need to shave milliseconds in a program that uses this functionality (in other words, very rarely) I too would use String.split(). It also has a lot more power and flexibility than StringTokenizer with regex.


  • Registered Users, Registered Users 2 Posts: 21,264 ✭✭✭✭Hobbes


    No point really blabbing on, but I think we are going to have to agree to disagree here (cause I ain't got the desire to benchmark it).

    Spent the last few months studying for and passing the Java5 exam (passed it last week). Trust me, split() is faster than StringTokenizer for the actions it can do.

    StringTokenizer may win out if you did simple tokens, but even the code itself is not optimized. It sends data into a Vector then into an Enumeration, and finally into an array. But StringTokenizer itself implements Enumeration.

    StringTokenizer is only in the JVM for legacy reasons. Check the 5.0 Javadoc for it.

    If you want speed in relation to regex then you should use the Pattern / Matcher Classes over the split() method. Also if you are experiencing lag in your regex expressions then you should review what regex you are using. That in itself is a science.


  • Registered Users, Registered Users 2 Posts: 1,311 ✭✭✭Procasinator


    Hobbes wrote: »
    Spent the last few months studying for and passing the Java5 exam (passed it last week). Trust me, split() is faster than StringTokenizer for the actions it can do.

    As in the first in the track? I'm a JCP in the 1.4 platform myself, and while it gives a good grounding, at the end of the day it's a piece of paper. I can't remember the certification going into that much detail.
    Hobbes wrote: »
    StringTokenizer may win out if you did simple tokens, but even the code itself is not optimized. It sends data into a Vector then into an Enumeration, and finally into an array. But StringTokenizer itself implements Enumeration.

    And simple tokens is what I am talking about. That is all StringTokenizer does.
    Hobbes wrote: »
    Stringtokenizer is also deprecated and only in the JVM for legacy reasons. Check the 5.0 Javadoc for it.

    Yeah, I know. I quoted the Java API in a previous post in this thread that said exactly that. :D:p

    At the end of the day, it doesn't really matter. I'm not recommending StringTokenizer - I'm just being pedantic. ;)


  • Registered Users, Registered Users 2 Posts: 21,264 ✭✭✭✭Hobbes


    As in the first in the track? I'm a JCP in the 1.4 platform myself, and while it gives a good grounding, at the end of the day it's a piece of paper. I can't remember the certification going into that much detail.

    SCJP5 exam. It is much harder than the 1.4 (which I have as well). It does go that deep now and they have mechanisms in place to stop question dumping.
    And simple tokens is what I am talking about. That is all StringTokenizer does.

    As I said, if you want speed then use the Pattern / Matcher classes. That way you can compile the regex expression once instead of at every split().
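
A minimal sketch of the compile-once idea: the Pattern is built a single time as a constant and then reused for every input via Pattern.split. The class, constant and method names (PrecompiledSplitDemo, WHITESPACE, splitName) are made up for illustration.

```java
import java.util.regex.Pattern;

class PrecompiledSplitDemo {
    // Compiled once when the class is loaded, then reused for every call.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String[] splitName(String name) {
        // Pattern.split does the same job as String.split("\\s+"),
        // but without recompiling the regex on each call.
        return WHITESPACE.split(name.trim());
    }

    public static void main(String[] args) {
        String[] inputs = { "Joe Bloggs", "Bob van Eyck" };
        for (String input : inputs) {
            for (String part : splitName(input)) {
                System.out.println(part);
            }
        }
    }
}
```

This only pays off when the same regex is applied many times; for a single split the difference is unlikely to be noticeable.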


  • Registered Users, Registered Users 2 Posts: 1,311 ✭✭✭Procasinator


    Hobbes wrote: »
    As I said, if you want speed then use the Pattern / Matcher classes. That way you can compile the regex expression once instead of at every split().

    Of course, if you are going to be calling split over and over again with the same regex then you should be using the Pattern/Matcher class. But we are still assuming the need for regex - something which is not needed every time.

    All I am really saying is for a string like:
    "a,b,c,d,e,f,g"

    which holds to strict formatting, StringTokenizer should outperform String.split().

    Not that it matters - no one would need to fine tune this performance out - they'd be using the wrong language.

    Anyway, I'll stop flogging the dead horse and let this thread RIP.


  • Registered Users, Registered Users 2 Posts: 21,264 ✭✭✭✭Hobbes


    Of course, if you are going to be calling split over and over again with the same regex then you should be using the Pattern/Matcher class. But we are still assuming the need for regex - something which is not needed every time.

    Which is why you compile the regex pattern. The code is also cleaner.

    I did the following blocks of code and ran them 1,000 times.
    String[] a = "a,b,c,d,e,f".split(",");
    for (String s: a);
    ...
    StringTokenizer st = new StringTokenizer("a,b,c,d,e,f",",");
    while (st.hasMoreElements()) st.nextElement();
    ...
    Pattern p = Pattern.compile(","); // OUTSIDE OF LOOP.
    
    String[] pm = p.split("a,b,c,d,e,f");
    for (String s: pm);
    

    Running it...

    by Split: 94 milliseconds
    by StringTokenizer : 31 milliseconds
    by PatternMatcher : 16 milliseconds

    I ran it a few times and the times vary, but remained consistent with above.

    So as you can see, a precompiled regex is the better way to go.
    Not that it matters - no one would need to fine tune this performance out - they'd be using the wrong language.

    Current JIT compilers are consistent with operating system speeds.
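
For anyone who wants to repeat the comparison above, here is a rough timing sketch of the three approaches. It is not the code that produced the figures in this post: the class and method names are made up, the warm-up pass is an added assumption, and a hand-rolled loop like this is easily skewed by JIT compilation and garbage collection, so a proper harness such as JMH would give more trustworthy numbers.

```java
import java.util.StringTokenizer;
import java.util.regex.Pattern;

class SplitTimingSketch {
    static final String INPUT = "a,b,c,d,e,f";
    static final Pattern COMMA = Pattern.compile(","); // compiled once, outside any loop

    static int countBySplit() {
        return INPUT.split(",").length;
    }

    static int countByTokenizer() {
        StringTokenizer st = new StringTokenizer(INPUT, ",");
        int n = 0;
        while (st.hasMoreTokens()) {
            st.nextToken();
            n++;
        }
        return n;
    }

    static int countByPattern() {
        return COMMA.split(INPUT).length;
    }

    // Runs the task the given number of times and returns the elapsed nanoseconds.
    static long timeNanos(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 1000;
        // Warm-up pass so class loading does not land in the first measurement.
        timeNanos(SplitTimingSketch::countBySplit, iterations);
        timeNanos(SplitTimingSketch::countByTokenizer, iterations);
        timeNanos(SplitTimingSketch::countByPattern, iterations);

        System.out.println("by split:           " + timeNanos(SplitTimingSketch::countBySplit, iterations) + " ns");
        System.out.println("by StringTokenizer: " + timeNanos(SplitTimingSketch::countByTokenizer, iterations) + " ns");
        System.out.println("by Pattern:         " + timeNanos(SplitTimingSketch::countByPattern, iterations) + " ns");
    }
}
```

Results will vary between machines and Java versions, so treat any single run as a hint rather than a verdict.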


  • Registered Users, Registered Users 2 Posts: 1,311 ✭✭✭Procasinator


    Yes, I actually meant on a once-off basis; if you are using it repeatedly then compiling will be better. The first and last block will perform around the same on a once-off basis. That's why I said over and over.

    Not that this matters. This keeps rolling further off-topic, and the case where you would only need to use a delimiter once is quite rare, and not very many unique delimiters are going to come up in one application.
    Hobbes wrote:
    Current JIT compilers are consistent with operating system speeds.

    They are, but if you are trying to push out performance as fine-grained as this, things like garbage collection can come up and kick you in the balls when performance is needed.

    I ain't going to get into the JIT performance debate, because it's one that has been going on for a long time. It usually is comparable, though often it has been in the middle range when compared against C compilers: better than the worst C compilers, worse than the best.

    Now, I promise to stop posting on this thread. :p

