Advertisement
If you have a new account but are having problems posting or verifying your account, please email us on hello@boards.ie for help. Thanks :)
Hello all! Please ensure that you are posting a new thread or question in the appropriate forum. The Feedback forum is overwhelmed with questions that are having to be moved elsewhere. If you need help to verify your account contact hello@boards.ie
Hi there,
There is an issue with role permissions that is being worked on at the moment.
If you are having trouble with access or permissions on regional forums please post here to get access: https://www.boards.ie/discussion/2058365403/you-do-not-have-permission-for-that#latest

Linear Interpolation in excel

  • 26-07-2010 5:55pm
    #1
    Registered Users, Registered Users 2 Posts: 872 ✭✭✭


    Hey, I'm wondering if anyone can help me sort something out in excel.

    I've got a sheet with quite a bit of price data (>60k rows, a good few columns) and there are some gaps (marked with x's) in the data which i'm trying to get rid of. The data has time stamps so I can't just delete them, so i'm trying to linearly interpolate from one point to the next.

    I've come across a few ways to do it, the handiest seems to be to highlight from one price to the next over any gaps then go to edit>fill>series>select trend> click ok. It works fine, but there are thousands of these gaps in the sheet.

    I'm trying to set up a function that looks for the next number shown above that cell and the next number below that cell and uses these values in the interpolation formula. Then ideally run a script to change all the x's in the sheet to this formula.

    It's a long shot, but any ideas? Cheers


Comments

  • Registered Users, Registered Users 2 Posts: 1,595 ✭✭✭MathsManiac


    There might be a value in keeping your original data intact, so that you can still know which values are true observations and which have been interpolated.

    I can see a couple of options:
    1. Set up the formulae you refer to on a second sheet that duplicates the first. That is, the formula on sheet2 looks into the corresponding cell on sheet1. If the cell is non-empty, it duplicates the value; otherwise, it interpolates between the value abova and the value below. (This will only work if the values above and below are non-empty - if your gaps are sparse, that shouldn't be a major problem and you could fix any problems by hand.)

    2. Write a macro that more-or-less does the same thing as 1 above, but places values on sheet2 instead of formulae. (i.e., when you run the macro, it looks at every cell on sheet1, processes as above and inserts the relevant value into sheet 2.)

    Option 1 is easier to implement, but it has the possible disadvantage is that if your data set is truly huge, Excel might slow down a lot, because all formulae update every time you update one cell.

    In option 2, with a bit of fancier programming, you could cope with gaps larger than one cell.

    Whatever solution you find, it might also be helpful to apply a different formatting to the interpolated cells than to the others, so you can see at a glance which are the true data points.


  • Registered Users, Registered Users 2 Posts: 872 ✭✭✭gerry87


    I tried using a formula to just take the previous datapoint if there's a gap and leave it if there's no gap, same method as you said just without the interpolating. I did one column and it took about half an hour, so I don't think that's feasible. Unless I just let it churn through for a few hours then copy and paste the values getting rid of the formulae.

    VBA is another slight issue, excel 08 on mac doesn't have VBA, but I could probably find another comp to work on so I'll keep that in mind.

    It seems it might be easier to do it in matlab then excel, so I think I'll give that a shot first.

    Thanks.


  • Registered Users, Registered Users 2 Posts: 13,077 ✭✭✭✭bnt


    It's possible to calculate an overall trend line for all the data, using the SLOPE and INTERCEPT functions. Those give you the m and c parameters respectively for the trend line y = m x + c. Once you have that you can do all kinds of things, including calculating missing data or errors from the trend line.

    It seems to me that simply interpolating between adjacent data points won't tell you all that much about the data, but drawing a linear trend assumes that the data is roughly linear in the first place!

    You are the type of what the age is searching for, and what it is afraid it has found. I am so glad that you have never done anything, never carved a statue, or painted a picture, or produced anything outside of yourself! Life has been your art. You have set yourself to music. Your days are your sonnets.

    ―Oscar Wilde predicting Social Media, in The Picture of Dorian Gray



  • Registered Users, Registered Users 2 Posts: 872 ✭✭✭gerry87


    bnt wrote: »
    It's possible to calculate an overall trend line for all the data, using the SLOPE and INTERCEPT functions. Those give you the m and c parameters respectively for the trend line y = m x + c. Once you have that you can do all kinds of things, including calculating missing data or errors from the trend line.

    It seems to me that simply interpolating between adjacent data points won't tell you all that much about the data, but drawing a linear trend assumes that the data is roughly linear in the first place!

    The data is 1 minute stock price data, so a trend for the whole series wouldn't really work. There generally aren't huge gaps so I don't think it will have a huge influence on the results, I just needed some numbers in there instead of x's and I figured interpolation was better than carrying the previous price.

    I got it sorted using MATLAB's interp1 function.


Advertisement