Linear Least Squares

Problem Description

Find the best fit linear least squares line for a given set of points defined by (X, Y) coordinates.

Background & Techniques

	X, Y Data
	1.1, 2.1
	2.3, 3.9
	3.7, 6.1
	3.9, 7.9
	4.0, 9.0
	5.0, 11.0

Best Fit Line is Y = 2.3X - 1.0, R² = 0.9384

The "least squares line" is the unique line which minimizes the sum of the squares of the vertical differences between each data point's Y value and the Y value of the line at the same X. Whew! It is harder to define the line in words than it is to calculate it!

A straight line is defined by two parameters, the "slope", M, and the "intercept", B, with the equation Y = MX + B. M indicates how much Y changes for each unit change in X. The intercept B is the Y value at which the line crosses the Y axis (that is, the value of Y when X = 0). Least squares is popular because it also allows measurement of the goodness of fit. The Correlation Coefficient, R, is one such measure. Specifically, R² has a value that ranges from zero to one and is the fraction of the variance in the two variables that is shared, in other words the fraction of the variation in Y that the line accounts for. A value near one suggests that one of X or Y is dependent on the other (or that both are dependent on some third, unmeasured variable).
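As a rough illustration of the calculation, here is a minimal sketch in Python (not the Delphi source of the demo, and not necessarily the exact form of the equations used in LinearLeastSquares). The function name linear_least_squares and its return layout are assumptions made for this example. It uses the common textbook formulas based on sums of deviations from the means, applied to the sample data shown above.

```python
def linear_least_squares(points):
    """Return (m, b, r_squared) for the least squares line y = m*x + b.

    Hypothetical helper for illustration; points is a list of (x, y) pairs.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of squared deviations and cross products about the means.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all X values are equal; the line would be vertical")
    m = sxy / sxx               # slope
    b = mean_y - m * mean_x     # intercept: the line passes through the means
    # R² is undefined (0/0) when every Y is identical; this sketch reports
    # 1.0 in that case because the horizontal line fits exactly.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return m, b, r_squared


data = [(1.1, 2.1), (2.3, 3.9), (3.7, 6.1), (3.9, 7.9), (4.0, 9.0), (5.0, 11.0)]
m, b, r2 = linear_least_squares(data)
print(f"Y = {m:.1f}X {b:+.1f}, R^2 = {r2:.4f}")
```

Rounded to the same precision, the results agree with the best fit line and R² value quoted for the sample data.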

The demo program allows users to enter an arbitrary set of data points and calculates the M, B, and R Squared values. The input data points and the best fit line are drawn.

Notes for programmers

When a viewer asked how to do it the other day, I was mildly surprised to find that I had never posted a demo about linear regression.

The function LinearLeastSquares fills that gap using equations published in many places on the web. I will include it in our Mathslib unit for the next library update.

I also finally created two other functions which deserve to be in our library, namely ScaleDataForPlot and ScalePoint. ScaleDataForPlot takes an array of TRealpoint records along with the size of the image canvas to draw on and returns an array of TPoint integer values which have been scaled to cover the range from 10% to 90% of the X and Y ranges. Y values are also inverted to account for the computer's insistence that Y increases from top to bottom. It also returns X and Y offset and scaling information in a record which can be passed to the ScalePoint function to scale individual points (for example, the end points of the regression line to be drawn in the current program).
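The scaling idea can be sketched as follows. This is a Python illustration only, not the Delphi code of ScaleDataForPlot and ScalePoint. The ScaleInfo record layout, the parameter names, and the choice to round to the nearest pixel are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ScaleInfo:
    """Hypothetical record: offsets and factors mapping data to pixels."""
    x_min: float
    y_min: float
    x_factor: float
    y_factor: float
    width: int
    height: int


def scale_point(x, y, info):
    """Map one data point to integer canvas coordinates.

    The smallest X lands at 10% of the width; the smallest Y lands at 90%
    of the height because canvas Y grows from top to bottom.
    """
    px = 0.1 * info.width + (x - info.x_min) * info.x_factor
    py = 0.9 * info.height - (y - info.y_min) * info.y_factor
    return round(px), round(py)


def scale_data_for_plot(points, width, height):
    """Scale points so the data spans 10% to 90% of the canvas in X and Y."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_range = (max(xs) - min(xs)) or 1.0  # avoid dividing by zero
    y_range = (max(ys) - min(ys)) or 1.0
    info = ScaleInfo(min(xs), min(ys),
                     0.8 * width / x_range, 0.8 * height / y_range,
                     width, height)
    return [scale_point(x, y, info) for x, y in points], info


data = [(1.1, 2.1), (2.3, 3.9), (3.7, 6.1), (3.9, 7.9), (4.0, 9.0), (5.0, 11.0)]
pixels, info = scale_data_for_plot(data, 400, 300)
print(pixels[0], pixels[-1])
```

Because the scaling record is returned along with the pixel array, any extra point, such as an end point of the regression line, can be mapped later with scale_point(x, y, info) and will line up with the plotted data.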

The final function which could be expanded into a useful library tool is GetNextNumber, which scans the input lines after the user enters data, validating the strings and converting them to real values. It handles a few of the errors which users can make but is still a little fragile and could use more work.
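One possible shape for such a scanner is sketched below in Python. It is not the Delphi GetNextNumber from the demo, and the signature, the regular expression, and the parse_points helper are all hypothetical. Like the original it is deliberately simple: it skips over any non-numeric text silently and only reports lines that do not contain exactly two numbers.

```python
import re

# Optional sign, digits with optional decimal part, optional exponent.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def get_next_number(text, start=0):
    """Return (value, next_index) for the next number at or after start.

    Returns (None, len(text)) when no further number is found.
    """
    match = NUMBER.search(text, start)
    if match is None:
        return None, len(text)
    return float(match.group()), match.end()


def parse_points(lines):
    """Read one X, Y pair per line, collecting messages for bad lines."""
    points, errors = [], []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        x, pos = get_next_number(line)
        y, pos = get_next_number(line, pos)
        extra, _ = get_next_number(line, pos)
        if x is None or y is None or extra is not None:
            errors.append(f"line {line_no}: expected exactly two numbers")
        else:
            points.append((x, y))
    return points, errors


points, errors = parse_points(["1.1, 2.1", "2.3  3.9", "", "oops, 4", "1 2 3"])
print(points)
print(errors)
```

Returning the index where scanning stopped lets the caller pull several numbers from one line without splitting the string first, at the cost of accepting sloppy separators.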

All in all, a potentially useful exercise, if I just get around to updating the library to make these new functions available for easy access.

Running/Exploring the Program

	Download source
	Download executable

Suggestions for Further Explorations

Polish and insert the key functions from this demo into our DFFLIB library units.

Original Date: January 18, 2012

Modified: May 15, 2018

Copyright © 2000-2018, Gary Darby All rights reserved.