Advertisement
If you have a new account but are having problems posting or verifying your account, please email us on hello@boards.ie for help. Thanks :)
Hello all! Please ensure that you are posting a new thread or question in the appropriate forum. The Feedback forum is overwhelmed with questions that are having to be moved elsewhere. If you need help to verify your account contact hello@boards.ie

Excel query regarding data in cells

Options
  • 29-08-2014 3:55pm
    #1
    Registered Users Posts: 610 ✭✭✭


    Mods, please move this thread if it is in the wrong section.

    I have an Excel query regarding getting data from a cell.

    Basically, I have multiple account names for an professional members organisation. Some of these are professional companies, and some are individual clients. However, there is no consistency in how the information has been entered.

    For example, I might have:

    IEIE Plc for John Smith
    Potr SA A/C Michelle deSmith
    Potr SA / Mr John Brown
    IEIE Plc - John Reynolds
    JDBN FBO NASFR

    While the list above is small, it is a sample of about 30,000 different members, so manually cleaning it up is exceptionally time consuming, and prone to errors.

    What I am looking to do is find some ways of splitting the cell into seperate cells. So for Potr SA A/C Michelle deSmith and Potr SA / Mr John Brown, I have the following command in a cell:

    =IFERROR(IF(AND(MID(E26,FIND("/",E26)-1,1)="A",MID(E26,FIND("/",E26)+1,1)="C"),FIND("/",E26)-2,FIND("/",E26)),0)

    Where E26 is the cell with the information in it. Using the return (9 for both), I cal use the Mid command to extract the information either side of the split. This will give the following information in the following cells:

    Potr SA A/C Michelle deSmith
    Potr SA Mr John Brown

    This works well for any cell with "/" in it.

    However, I am looking to get the same cell (where I have the command above) to also do a search for "-", "FBO", "for", etc. and return a value, so that I can use the MID function.

    The FIND command can be used for "-", but I can't find a way of integrating the search into the above command (it keeps returning an error). FIND can't be used for "FBO", "for", etc as they will return more than one point.

    Without using VBA, as the organisation will not allow macros (set at corporate level), is there any simple way to split/extract the information, and return a single value, without going through massive amount of cells.

    Any help would be gratefully appreciated.

    Regards.


Comments

  • Registered Users Posts: 166 ✭✭gleesonger


    Firstly VBA is exactly what I would use, you wouldn't have to save the code just to run it once in a *.xlsx then delete it.
    But accepting you must use Excel;

    I have had similar types of problems and the solution was always the same, come up with some method to solve the majority of the cases use it, save my results, remove the data items I successfully extracted and move on to coming up with a new method for the remaining data. Keep repeating until the data set is sufficiently small enough that I can manually correct it.

    Using that approach I would modify your formula to be generic moving the search variable to an input parameter.
    Splitting your formula into little chunks will allow you to modify an individual section at a whim it is also much easier to debug.
    Have a scan though the data and find the longest variables which can be used to split the data
    In this example you would first split the data with "A/C" then after these data items have been removed split by "/", that way you do not need to consider the possibility of "A/C" in the remaining data set.

    Consider what happens is a split variable like "/" is present twice, you would need a few Find functions each one linking to the outcome of the last (see argument [start_num]) then take the max over all of the results to obtain the last index of the variable.
    Also what if a name contains a split variable eg "Anne-Marie"

    In short I don't have a quick answer for you, when dealing with manually inputted data that did not conform to some rules/standards you can processes it with ease. It also is no help if the tools at your disposable are not meant to deal with this type of problem.

    2605p38.png


  • Closed Accounts Posts: 19,777 ✭✭✭✭The Corinthian


    It 'can' be done with either formulas or VBA, although I'd agree that using VBA would be a lot more convenient.

    The main problem with either approach is that there's no standard format you can rely upon, leaving you to have to take a holistic approach which is not going to be 100%.

    I'd draw up a list of potential qualifications and check for each of them, removing them from the cell string as you go along. Then parse for characters such as slashes or dashes and remove them too. Finally you should be left with just the name. Just to add, make all your sub-string searches case insensitive.

    If you do it using formulas, I recommend you use a separate sheet and split up the operation in multiple cells so that it's easier to follow and amend.

    There are always going to be problems with this or any holistic approach, of course. Typos won't get correctly handled. If you've missed out on any potential qualifications, neither will they. Some characters may get parsed from names that should not (dashes in double-barreled names, for example). I suspect the best you can do is catch 99% of them, and the rest you'll need to do by hand.


Advertisement