Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. Algorithms
  4. Best fit polynomial curve to non-continuous data points [Solved]

Best fit polynomial curve to non-continuous data points [Solved]

Scheduled Pinned Locked Moved Algorithms
questioncomalgorithmsdata-structureshelp
15 Posts 5 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • I Ian Shlasko

    I'm sure there's a relatively-straightforward way to do this, but my research so far has been a bit overwhelming... Heavy statistics and calculus have never been one of my strengths... Basically, I need to calculate a best-fit curve on a scatter graph... Points are not evenly spaced on either axis, and there's no cost function to work with (These are derivations of stock/bond price histories)... Just raw (X,Y) values. I've been trying to work out an algorithm that'll start with a second-degree polynomial function, iterate through and minimize the difference, and then jump to third-degree if it's not within tolerance (And so on, up to a maximum degree). Juggling the data around is something I can do blindfolded, but there's one thing I can't figure out... Given the polynomial coefficients ((a,b,c,d) -> y = a + bx + cx^2 + dx^3) and a function to determine the total difference across the curve (Or the total square difference, or whatever's needed), how do I adjust the coefficients for the next pass? I mean, how do I choose new values? I was originally going to try to fix the highest-degree coefficient first, then drill down to the zeroth-degree one (I have a root-solving algorithm that I use for yield calculations - Was going to adapt that), but then I looked at it again with the proper amount of caffeine, and realized that solving one coefficient at a time wasn't going to work :) SOLVED: See my reply below... Linear/Polynomial Regression

    Proud to have finally moved to the A-Ark. Which one are you in?
    Author of the Guardians Saga (Sci-Fi/Fantasy novels)

    modified on Tuesday, May 24, 2011 9:41 AM

    _ Offline
    _ Offline
    _Erik_
    wrote on last edited by
    #2

    Not sure if this will help but, have you tried a spline interpolation algorithm?

    I 1 Reply Last reply
    0
    • I Ian Shlasko

      I'm sure there's a relatively-straightforward way to do this, but my research so far has been a bit overwhelming... Heavy statistics and calculus have never been one of my strengths... Basically, I need to calculate a best-fit curve on a scatter graph... Points are not evenly spaced on either axis, and there's no cost function to work with (These are derivations of stock/bond price histories)... Just raw (X,Y) values. I've been trying to work out an algorithm that'll start with a second-degree polynomial function, iterate through and minimize the difference, and then jump to third-degree if it's not within tolerance (And so on, up to a maximum degree). Juggling the data around is something I can do blindfolded, but there's one thing I can't figure out... Given the polynomial coefficients ((a,b,c,d) -> y = a + bx + cx^2 + dx^3) and a function to determine the total difference across the curve (Or the total square difference, or whatever's needed), how do I adjust the coefficients for the next pass? I mean, how do I choose new values? I was originally going to try to fix the highest-degree coefficient first, then drill down to the zeroth-degree one (I have a root-solving algorithm that I use for yield calculations - Was going to adapt that), but then I looked at it again with the proper amount of caffeine, and realized that solving one coefficient at a time wasn't going to work :) SOLVED: See my reply below... Linear/Polynomial Regression

      Proud to have finally moved to the A-Ark. Which one are you in?
      Author of the Guardians Saga (Sci-Fi/Fantasy novels)

      modified on Tuesday, May 24, 2011 9:41 AM

      L Offline
      L Offline
      Luc Pattyn
      wrote on last edited by
      #3

      Hi Ian, I don't think there are any closed formula's for what you are trying. What I would recommend is this, inspired by Newton-Raphson: 1. have some values a1,b1,c1,d1 and the corresponding value of your error function e1. 2. slightly change a1 (say by 1%) and recalculate e; find the "a-sensitivity" as (change in e)/(change in a) 3. same for b1, then c1, then d1 4. now choose new values a2,b2,c2,d2 based on the four sensitivities, maybe like so: a2=a1+coef*e/"a-sensitivity" where maybe coef is one. then iterate until convergence is reached (i.e. sufficiently accurate); warning: this might diverge right away, if so try smaller values of coef (you could halve it in each consecutive try). FWIW: with more-or-less random data, you'll never find a good fit with a single polynomial. What one normally does is a local approximation, which applies in a small region only. That is similar to a font being drawn as a sequence of Bezier curves; it pays of to have several simple Bezier curves, rather than trying to have a single very complex one (which in general will not be good enough). :)

      Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum

      Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.

      I 1 Reply Last reply
      0
      • _ _Erik_

        Not sure if this will help but, have you tried a spline interpolation algorithm?

        I Offline
        I Offline
        Ian Shlasko
        wrote on last edited by
        #4

        Kind of the opposite of where I'm going here... The goal isn't to pass through every point, but to give more of a trend-line. Generally going to be working with a few hundred data points, but only looking for maybe a second or third-degree function in most cases. Proud to have finally moved to the A-Ark. Which one are you in?
        Author of the Guardians Saga (Sci-Fi/Fantasy novels)

        _ 1 Reply Last reply
        0
        • I Ian Shlasko

          Kind of the opposite of where I'm going here... The goal isn't to pass through every point, but to give more of a trend-line. Generally going to be working with a few hundred data points, but only looking for maybe a second or third-degree function in most cases. Proud to have finally moved to the A-Ark. Which one are you in?
          Author of the Guardians Saga (Sci-Fi/Fantasy novels)

          _ Offline
          _ Offline
          _Erik_
          wrote on last edited by
          #5

          Ok, now I understand it better. Then maybe bezier curves would be a better choice.

          1 Reply Last reply
          0
          • L Luc Pattyn

            Hi Ian, I don't think there are any closed formula's for what you are trying. What I would recommend is this, inspired by Newton-Raphson: 1. have some values a1,b1,c1,d1 and the corresponding value of your error function e1. 2. slightly change a1 (say by 1%) and recalculate e; find the "a-sensitivity" as (change in e)/(change in a) 3. same for b1, then c1, then d1 4. now choose new values a2,b2,c2,d2 based on the four sensitivities, maybe like so: a2=a1+coef*e/"a-sensitivity" where maybe coef is one. then iterate until convergence is reached (i.e. sufficiently accurate); warning: this might diverge right away, if so try smaller values of coef (you could halve it in each consecutive try). FWIW: with more-or-less random data, you'll never find a good fit with a single polynomial. What one normally does is a local approximation, which applies in a small region only. That is similar to a font being drawn as a sequence of Bezier curves; it pays of to have several simple Bezier curves, rather than trying to have a single very complex one (which in general will not be good enough). :)

            Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum

            Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.

            I Offline
            I Offline
            Ian Shlasko
            wrote on last edited by
            #6

            Hmm, didn't think of the sensitivity thing... Giving that a shot now.

            FWIW: with more-or-less random data, you'll never find a good fit with a single polynomial. What one normally does is a local approximation, which applies in a small region only. That is similar to a font being drawn as a sequence of Bezier curves; it pays of to have several simple Bezier curves, rather than trying to have a single very complex one (which in general will not be good enough).

            It's not actually random data... It's graphing related values, so it should roughly cluster around y=x unless there's another factor (Which is what I'm trying to illustrate). Proud to have finally moved to the A-Ark. Which one are you in?
            Author of the Guardians Saga (Sci-Fi/Fantasy novels)

            L 1 Reply Last reply
            0
            • I Ian Shlasko

              Hmm, didn't think of the sensitivity thing... Giving that a shot now.

              FWIW: with more-or-less random data, you'll never find a good fit with a single polynomial. What one normally does is a local approximation, which applies in a small region only. That is similar to a font being drawn as a sequence of Bezier curves; it pays of to have several simple Bezier curves, rather than trying to have a single very complex one (which in general will not be good enough).

              It's not actually random data... It's graphing related values, so it should roughly cluster around y=x unless there's another factor (Which is what I'm trying to illustrate). Proud to have finally moved to the A-Ark. Which one are you in?
              Author of the Guardians Saga (Sci-Fi/Fantasy novels)

              L Offline
              L Offline
              Luc Pattyn
              wrote on last edited by
              #7

              If you suspect some periodicity, I'd suggest you first subtract x, so according to what you said the average becomes zero, and the first-order approximation would be the x-axis itself. Then consider a Fourier analysis. :)

              Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum

              Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.

              I 1 Reply Last reply
              0
              • L Luc Pattyn

                If you suspect some periodicity, I'd suggest you first subtract x, so according to what you said the average becomes zero, and the first-order approximation would be the x-axis itself. Then consider a Fourier analysis. :)

                Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum

                Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.

                I Offline
                I Offline
                Ian Shlasko
                wrote on last edited by
                #8

                Not a cyclic function, really... Bond valuations are a bit odd, so hard to explain... But dropping it to the X-axis does kinda help the initial approximation. Still needs a lot more work (Got it to approximate the first degree, and working on expanding from there), but the sensitivity thing is working like a charm. Thanks :) Proud to have finally moved to the A-Ark. Which one are you in?
                Author of the Guardians Saga (Sci-Fi/Fantasy novels)

                L 1 Reply Last reply
                0
                • I Ian Shlasko

                  Not a cyclic function, really... Bond valuations are a bit odd, so hard to explain... But dropping it to the X-axis does kinda help the initial approximation. Still needs a lot more work (Got it to approximate the first degree, and working on expanding from there), but the sensitivity thing is working like a charm. Thanks :) Proud to have finally moved to the A-Ark. Which one are you in?
                  Author of the Guardians Saga (Sci-Fi/Fantasy novels)

                  L Offline
                  L Offline
                  Luc Pattyn
                  wrote on last edited by
                  #9

                  you're welcome. :)

                  Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum

                  Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.

                  1 Reply Last reply
                  0
                  • I Ian Shlasko

                    I'm sure there's a relatively-straightforward way to do this, but my research so far has been a bit overwhelming... Heavy statistics and calculus have never been one of my strengths... Basically, I need to calculate a best-fit curve on a scatter graph... Points are not evenly spaced on either axis, and there's no cost function to work with (These are derivations of stock/bond price histories)... Just raw (X,Y) values. I've been trying to work out an algorithm that'll start with a second-degree polynomial function, iterate through and minimize the difference, and then jump to third-degree if it's not within tolerance (And so on, up to a maximum degree). Juggling the data around is something I can do blindfolded, but there's one thing I can't figure out... Given the polynomial coefficients ((a,b,c,d) -> y = a + bx + cx^2 + dx^3) and a function to determine the total difference across the curve (Or the total square difference, or whatever's needed), how do I adjust the coefficients for the next pass? I mean, how do I choose new values? I was originally going to try to fix the highest-degree coefficient first, then drill down to the zeroth-degree one (I have a root-solving algorithm that I use for yield calculations - Was going to adapt that), but then I looked at it again with the proper amount of caffeine, and realized that solving one coefficient at a time wasn't going to work :) SOLVED: See my reply below... Linear/Polynomial Regression

                    Proud to have finally moved to the A-Ark. Which one are you in?
                    Author of the Guardians Saga (Sci-Fi/Fantasy novels)

                    modified on Tuesday, May 24, 2011 9:41 AM

                    G Offline
                    G Offline
                    Graham Toal
                    wrote on last edited by
                    #10

                    Assuming that standard numerical methods such as simplex aren't going to work here, and given that you do alreaduy have a fitness function available and a few coefficients to tweak, I'm wondering if perhaps a genetic algorithm or something like simulated annealing would work here?

                    1 Reply Last reply
                    0
                    • I Ian Shlasko

                      I'm sure there's a relatively-straightforward way to do this, but my research so far has been a bit overwhelming... Heavy statistics and calculus have never been one of my strengths... Basically, I need to calculate a best-fit curve on a scatter graph... Points are not evenly spaced on either axis, and there's no cost function to work with (These are derivations of stock/bond price histories)... Just raw (X,Y) values. I've been trying to work out an algorithm that'll start with a second-degree polynomial function, iterate through and minimize the difference, and then jump to third-degree if it's not within tolerance (And so on, up to a maximum degree). Juggling the data around is something I can do blindfolded, but there's one thing I can't figure out... Given the polynomial coefficients ((a,b,c,d) -> y = a + bx + cx^2 + dx^3) and a function to determine the total difference across the curve (Or the total square difference, or whatever's needed), how do I adjust the coefficients for the next pass? I mean, how do I choose new values? I was originally going to try to fix the highest-degree coefficient first, then drill down to the zeroth-degree one (I have a root-solving algorithm that I use for yield calculations - Was going to adapt that), but then I looked at it again with the proper amount of caffeine, and realized that solving one coefficient at a time wasn't going to work :) SOLVED: See my reply below... Linear/Polynomial Regression

                      Proud to have finally moved to the A-Ark. Which one are you in?
                      Author of the Guardians Saga (Sci-Fi/Fantasy novels)

                      modified on Tuesday, May 24, 2011 9:41 AM

                      I Offline
                      I Offline
                      Ian Shlasko
                      wrote on last edited by
                      #11

                      Thanks for the input, guys... Got me thinking along the right lines, and that got me to the right google search... I was searching for things like "curve fitting" and "best fit algorithm"... Turned out the 'proper' term was "Linear Regression"... And that got me here: An Algorithm for Weighted Linear Regression[^] Figures, the solution would be right here on CP :)

                      Proud to have finally moved to the A-Ark. Which one are you in?
                      Author of the Guardians Saga (Sci-Fi/Fantasy novels)

                      1 Reply Last reply
                      0
                      • I Ian Shlasko

                        I'm sure there's a relatively-straightforward way to do this, but my research so far has been a bit overwhelming... Heavy statistics and calculus have never been one of my strengths... Basically, I need to calculate a best-fit curve on a scatter graph... Points are not evenly spaced on either axis, and there's no cost function to work with (These are derivations of stock/bond price histories)... Just raw (X,Y) values. I've been trying to work out an algorithm that'll start with a second-degree polynomial function, iterate through and minimize the difference, and then jump to third-degree if it's not within tolerance (And so on, up to a maximum degree). Juggling the data around is something I can do blindfolded, but there's one thing I can't figure out... Given the polynomial coefficients ((a,b,c,d) -> y = a + bx + cx^2 + dx^3) and a function to determine the total difference across the curve (Or the total square difference, or whatever's needed), how do I adjust the coefficients for the next pass? I mean, how do I choose new values? I was originally going to try to fix the highest-degree coefficient first, then drill down to the zeroth-degree one (I have a root-solving algorithm that I use for yield calculations - Was going to adapt that), but then I looked at it again with the proper amount of caffeine, and realized that solving one coefficient at a time wasn't going to work :) SOLVED: See my reply below... Linear/Polynomial Regression

                        Proud to have finally moved to the A-Ark. Which one are you in?
                        Author of the Guardians Saga (Sci-Fi/Fantasy novels)

                        modified on Tuesday, May 24, 2011 9:41 AM

                        A Offline
                        A Offline
                        Alan Balkany
                        wrote on last edited by
                        #12

                        I'm surprised with all those answers, no one gave you the right approach: Polynomial Regression: http://en.wikipedia.org/wiki/Polynomial_regression[^] It will give you a best-fit quadratic or cubic equation (or even higher orders if you want). The Wikipedia link defines it, but the explanation there isn't very intuitive. The basic idea is you get an equation for the error terms, take the derivative, and set it to zero. Solving that gives you the equation with the minimum error.

                        I 1 Reply Last reply
                        0
                        • A Alan Balkany

                          I'm surprised with all those answers, no one gave you the right approach: Polynomial Regression: http://en.wikipedia.org/wiki/Polynomial_regression[^] It will give you a best-fit quadratic or cubic equation (or even higher orders if you want). The Wikipedia link defines it, but the explanation there isn't very intuitive. The basic idea is you get an equation for the error terms, take the derivative, and set it to zero. Solving that gives you the equation with the minimum error.

                          I Offline
                          I Offline
                          Ian Shlasko
                          wrote on last edited by
                          #13

                          Well, to be fair, the answers got me thinking along the right lines, and led me to the right terms, which brought me right back one of Walt's articles on CP... See my above post :) Marked the original as solved

                          Proud to have finally moved to the A-Ark. Which one are you in?
                          Author of the Guardians Saga (Sci-Fi/Fantasy novels)

                          A 1 Reply Last reply
                          0
                          • I Ian Shlasko

                            Well, to be fair, the answers got me thinking along the right lines, and led me to the right terms, which brought me right back one of Walt's articles on CP... See my above post :) Marked the original as solved

                            Proud to have finally moved to the A-Ark. Which one are you in?
                            Author of the Guardians Saga (Sci-Fi/Fantasy novels)

                            A Offline
                            A Offline
                            Alan Balkany
                            wrote on last edited by
                            #14

                            But if your data set naturally has a curve, a linear model will be inaccurate.

                            I 1 Reply Last reply
                            0
                            • A Alan Balkany

                              But if your data set naturally has a curve, a linear model will be inaccurate.

                              I Offline
                              I Offline
                              Ian Shlasko
                              wrote on last edited by
                              #15

                              Well it's multiple linear regression... It lets me specify the degree. Using Walt's library, it's looking perfect... Using third-degree (x^3, so a possible S-shape in the curve), and releasing the update next week.

                              Proud to have finally moved to the A-Ark. Which one are you in?
                              Author of the Guardians Saga (Sci-Fi/Fantasy novels)

                              1 Reply Last reply
                              0
                              Reply
                              • Reply as topic
                              Log in to reply
                              • Oldest to Newest
                              • Newest to Oldest
                              • Most Votes


                              • Login

                              • Don't have an account? Register

                              • Login or register to search.
                              • First post
                                Last post
                              0
                              • Categories
                              • Recent
                              • Tags
                              • Popular
                              • World
                              • Users
                              • Groups