Saturday, January 27, 2024

Analytic Geometry

Introduction

The geometry developed by Rene Descartes (1596-1650) in the 1600s is called Analytic Geometry. It is also known as Coordinate Geometry or Cartesian Geometry.

The main feature of this geometry was that it assigned a position and a value to geometrical objects. With its development, geometrical objects such as points, lines, circles and so on could be described and analysed using numbers and algebraic expressions.

To achieve this, Descartes applied algebra to the study of geometry. This contribution of Descartes is also called “the turning point of modern mathematics”. That century is also regarded as the richest century in the development of mathematics. The main foundation of the Analytic Geometry developed by Descartes is the fact that “there is one-to-one correspondence between the points of a line and real numbers”.

For example, if A is a point in a two-dimensional plane, then A can be made to correspond to a number x from the X-axis and a number y from the Y-axis. This pair of numbers (x, y) is called the coordinates of A.

In the figure above, there are two scales: one is the X-axis, which runs across the plane, and the other is the Y-axis, which is at right angles to the X-axis. This is similar to the concept of rows and columns.

Another mathematician Legendre is considered one of the fathers of modern analytic geometry, a geometry that incorporates all the inherent power of both algebra and calculus.

Coordinate geometry makes a number of things possible. Some of them are given below.

  1. Determine the distance between two points.
  2. Find the equation, midpoint, and slope of the line segment.
  3. Determine if the given lines are perpendicular or parallel.
  4. Find the perimeter and the area of the polygon formed by the points on the plane.
  5. Transform the shape by reflecting, moving and rotating it.
  6. Define the equations of ellipses, curves, and circles.



Gradient

gradient =\( \frac{\text{change in y}}{\text{change in x}} \)

We know that the gradient of a straight line is a measure of how steep it is. To calculate the gradient of a straight line, we choose any two points on it and find the run and the rise from the first point to the second point. The run is the change in the x-coordinates, and the rise is the change in the y-coordinates, as illustrated in Figure below [Figure 1].

Then,
gradient =\( \frac{\text{rise}}{\text{run}} \) (1)
The run and the rise from one point to another on a straight line can be positive, negative or zero, depending on whether the relevant coordinates increase, decrease or stay the same from the first point to the second point.
A straight-line graph that slopes down from left to right has a negative gradient, one that’s horizontal has a gradient of zero, and one that slopes up from left to right has a positive gradient.
If the run is zero, which happens when the line is vertical, then the gradient of the line is undefined, since division by zero isn’t possible. Vertical lines are the only straight lines that don’t have gradients.
If the two points that we choose to calculate the gradient of a line are
\( (x_1, y_1)\) and \( (x_2, y_2)\) as illustrated in Figure below (Figure 2).

Then
\( \text{run} = x_2 - x_1 \) and \( \text{rise} = y_2 - y_1 \).
Now,
gradient =\( \frac{y_2-y_1}{x_2-x_1} \) (2)
Remember that when we use this formula to calculate the gradient of a straight line, it doesn’t matter which point we take to be the first point, \( (x_1, y_1)\), and which we take to be the second point, \( (x_2, y_2)\), as we get the same result either way.
For example, using the formula to calculate the gradient of the line through the points (1, 8) and (5, 2), which is shown in the figure below [Figure 3], gives either

gradient =\( \frac{y_2-y_1}{x_2-x_1}=\frac{2-8}{5-1}=-\frac{3}{2} \)
or
gradient =\( \frac{y_2-y_1}{x_2-x_1}=\frac{8-2}{1-5}=-\frac{3}{2} \)
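Formula (2) can be tried out with a few lines of Python. This is only a minimal sketch: the function name `gradient` is an illustrative choice, and the points are assumed to have integer coordinates so that exact fractions can be used.

```python
from fractions import Fraction

def gradient(p, q):
    """Gradient of the line through points p and q, or None if the line is vertical."""
    (x1, y1), (x2, y2) = p, q
    run = x2 - x1    # change in the x-coordinates
    rise = y2 - y1   # change in the y-coordinates
    if run == 0:
        return None  # vertical line: the gradient is undefined
    return Fraction(rise, run)

print(gradient((1, 8), (5, 2)))  # -3/2
print(gradient((5, 2), (1, 8)))  # -3/2 (the order of the points does not matter)
print(gradient((2, 1), (2, 7)))  # None
```

Returning `None` for a vertical line is one possible design choice; raising an error would serve equally well.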
The gradient of a straight line

The gradient of a line is a measure of its slope or steepness. It is defined as the ratio of its vertical displacement (the rise) to its horizontal displacement (the run).

To determine the slope (or gradient), we can use:

  1. direct measurement, OR
  2. counting the number of units on the vertical and
    horizontal lines when the line is drawn on the Cartesian plane or a grid, OR
  3. the slope formula
    \( m=\frac{y_2-y_1}{x_2-x_1}\)
    if two points \( (x_1,y_1)\) and \( (x_2,y_2)\) are given.

Some Examples

The slope of a line can be positive, negative, zero, or undefined.
  1. Positive slope
    If \( y\) increases as \( x\) increases, then the line slopes upwards to the right. This slope will be a positive number. The line in the Figure below has a slope of about \( +0.5\) : it goes up about 0.5 for every step of 1 along the x-axis.
  2. Negative slope
    If \( y\) decreases as \( x\) increases, then the line slopes downwards to the right. This slope will be a negative number. The line in the Figure below has a slope of about \( -0.5\) : it goes down about \( 0.5\) for every step of 1 along the x-axis.
  3. Zero slope
    If \( y\) does not change as \( x\) increases or decreases, then the line is exactly horizontal. The slope of any horizontal line is always zero. The line in the Figure below goes neither up nor down as \( x\) increases, so its slope is zero.
  4. Undefined slope
    If a line is exactly vertical, it does not have a defined slope. On such a line any two points have the same \( x\) coordinate, so the run is zero. The line in the Figure below is exactly vertical, so it has no defined slope. We say "the slope is undefined".

Example 1

Find the slope of a line through the points (3, 4) and (5, 1).

Solution
We know that the slope of a line through the points \( (x_1,y_1)\) and \( (x_2,y_2)\) is
\( \text{slope}=\frac{y_2-y_1}{x_2-x_1}\)
So, using \( (3, 4)\) as \( (x_1,y_1)\) and \( (5, 1)\) as \( (x_2,y_2)\) , the slope is
\( \text{slope}=\frac{y_2-y_1}{x_2-x_1}=\frac{1-4}{5-3}=-\frac{3}{2}\)
NOTE
Using \( (5, 1)\) as \( (x_1,y_1)\) and \( (3, 4)\) as \( (x_2,y_2)\) gives the same slope:
\( \text{slope}=\frac{y_2-y_1}{x_2-x_1}=\frac{4-1}{3-5}=-\frac{3}{2}\)




Straight line

A straight line drawn on a Cartesian plane can be described by an equation. Such equations have a general form and vary depending on where the line cuts the axes and on its slope.

For a detailed study of a straight line, the following key points are essential.

Horizontal Lines

Some linear equations have only one variable. They may have just y.

Let’s consider the equation y=2. This equation has only one variable, y. The equation says that y is always equal to 2, so its value does not depend on x. Whatever the value of x, the value of y is always 2.

So to make a graph, draw a horizontal line through y=2.

Horizontal lines are all parallel to the x-axis.
We know that the equation of the \( x\) -axis is
\( y = 0\) .
This is because all points on this axis have a \( y\) -coordinate zero, regardless of their different x-coordinates. So, the horizontal line which cuts the y-axis at \( 2\) has equation
\( y = 2\) .
Also, the horizontal line that cuts the y-axis at \( -1\) has equation
\( y = -1\)
and so on.

In general, the equation of a horizontal line is
\( y=a\)
where \( a\) is a constant.
This line cuts the vertical axis at \( a\) and all points on the line have a \( y\) -coordinate of \( a\) .




Vertical Line

Some linear equations have only one variable. They may have just x.

Let’s consider the equation x=−3. This equation has only one variable, x. The equation says that x is always equal to −3, so its value does not depend on y. Whatever the value of y, the value of x is always −3.

So to make a graph, draw a vertical line through x=−3.




Vertical Lines

Vertical lines are all parallel to the \( y\) -axis. We know that the equation of the \( y\) -axis is
\( x = 0\) .

This is because all points on this axis have an \( x\) -coordinate of zero, regardless of their different y-coordinates.
So, the vertical line which cuts the x-axis at \( 2\) has equation
\( x = 2\) .
Also, the vertical line that cuts the \( x\) -axis at \( -1\) has equation
\( x =-1\)
and so on.

In general, the equation of a vertical line is
\( x=b\)
where \( b\) is a constant.
This line cuts the horizontal axis at \( b\) and all points on the line have an \( x\) -coordinate of \( b\) .




Equation of Straight Lines

Point Slope Form

Let \( l\) be a straight line passing through a point \( A=(x_1,y_1)\) with slope \( m\) , then the equation of straight line \( l\) is
\( y-y_1=m(x-x_1)\)

Proof
Given that \( l \) is a straight line passing through a point \( A(x_1,y_1)\) with slope \(m\) . Suppose that \( P(x,y)\) is an arbitrary point on the straight line \( l \) other than \( A\) , then
slope of line = slope of PA
or \( m=\frac{y_2-y_1}{x_2-x_1}\)
We take the point \( P(x,y)\) as \( (x_2,y_2)\) , thus
\( m=\frac{y-y_1}{x-x_1}\)
or \( y-y_1=m(x-x_1)\)




Two Point Form

Let \( l \) be a straight line passing through two points \( (x_1,y_1)\) and \( (x_2,y_2)\) , then the equation of straight line \( l \) is
\( y-y_1= \frac{y_2-y_1}{x_2-x_1} (x-x_1)\)

Proof

Given that \( l \) is a straight line passing through two points \( A(x_1,y_1)\) and \( B(x_2,y_2)\) .
Suppose that \( P(x,y)\) is an arbitrary point on the straight line \( l \) , then
slope of PA = slope of AB
Here, taking the point \( P(x,y)\) in place of \( (x_2,y_2)\) in the slope formula \( \frac{y_2-y_1}{x_2-x_1}\) , we get
slope of PA\( =\frac{y-y_1}{x-x_1}\) (1)
Next,
slope of \( AB=\frac{y_2-y_1}{x_2-x_1}\) (2)
Now, equating (1) and (2), we get
\( \frac{y-y_1}{x-x_1}=\frac{y_2-y_1}{x_2-x_1}\)
or \( y-y_1= \frac{y_2-y_1}{x_2-x_1} (x-x_1)\)
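The two point form can be turned into a small routine that returns the line in the form \( Ax + By + C = 0\). The sketch below is illustrative only: the name `line_through` and the choice to reduce the coefficients to lowest terms are assumptions, and the points are assumed to be distinct with integer coordinates. Cross-multiplying the two point form avoids division, so vertical lines need no special case.

```python
from math import gcd

def line_through(p, q):
    """Integer (A, B, C) with A*x + B*y + C = 0 for the line through distinct points p, q."""
    (x1, y1), (x2, y2) = p, q
    if (x1, y1) == (x2, y2):
        raise ValueError("two distinct points are needed")
    # Cross-multiplied two point form: (y2 - y1)(x - x1) - (x2 - x1)(y - y1) = 0
    A = y2 - y1
    B = -(x2 - x1)
    C = -(A * x1 + B * y1)
    g = gcd(gcd(A, B), C)            # reduce to lowest terms
    A, B, C = A // g, B // g, C // g
    if A < 0 or (A == 0 and B < 0):  # make the leading coefficient positive
        A, B, C = -A, -B, -C
    return A, B, C

print(line_through((1, 8), (5, 2)))  # (3, 2, -19), i.e. 3x + 2y - 19 = 0
print(line_through((2, 1), (2, 7)))  # (1, 0, -2), i.e. the vertical line x = 2
```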




Example 1

Find the equation of a straight line passing through (-9, 5) and inclined at an angle of \( 120^\circ\) with the positive direction of x-axis.

Solution
Here the slope of the line is
\( m = \tan 120^\circ = -\sqrt{3} \)
Now, the required equation of the straight line is
\( y - y_1 = m (x - x_1)\)
or \( y - 5 = -\sqrt{ 3}(x - (-9))\)
or \( \sqrt{ 3}x + y + 9\sqrt{ 3} - 5 = 0\)

Straight line: Example 2

Find the equation of a straight line passing through the points (2, 3) and (6, - 5).

Solution
The equation of the straight line passing through the points B(2, 3) and A(6, - 5) is
\( y-y_1 =\frac{y_2-y_1}{x_2-x_1} (x-x_1)\)
Taking \( (x_1,y_1)=(2,3)\) and \( (x_2,y_2)=(6,-5)\) , we get
\( y-3=\frac{-5-3}{6-2}(x-2)\)
or \( y - 3 = -2(x - 2)\)
or \( 2x + y - 7 = 0\)

Intercepts

In two-dimensional coordinate geometry, there are two types of intercepts: the x-intercept and the y-intercept. The x-intercept is where the straight line crosses the x-axis [in the Figure below, a is the x-intercept], and the y-intercept is where the line crosses the y-axis [in the Figure below, b is the y-intercept].

To clarify it algebraically,

  1. an x-intercept is a point on the graph where y is zero, [x-intercept is a point in the equation where the y-value is zero]
  2. a y-intercept is a point on the graph where x is zero.[y-intercept is a point in the equation where the x-value is zero]

Slope Intercept Form

Let l be a straight line with slope m and y-intercept c, then the equation of straight line l is
y=mx+c

Proof
Given that l is a straight line with slope m and y-intercept c.
Then the line intersects the y-axis at the point A(0,c).
Now, the equation of the line l is
\( y-y_1=m(x-x_1)\)
Taking the point A(0,c) as \( (x_1,y_1)\) , we get
y-c=m(x-0)
or y=mx+c

Method 2

Slope Intercept Form

Let l be a straight line with slope m and y-intercept c, then the equation of straight line l is
y=mx+c

Proof
Given that l is a straight line with slope m and y-intercept c.
Then the line intersects the y-axis at the point A(0,c).
Now, take an arbitrary point P(x,y) on l, then
slope of l = slope of AP
Taking A(0,c) as \( (x_1,y_1)\) and P(x,y) as \( (x_2,y_2)\) , we get
\( m= \frac{y_2-y_1}{x_2-x_1}\)
or \( m= \frac{y-c}{x-0}\)
or mx=y-c
or y=mx+c

Double Intercept Form

Let \( l \) be a straight line whose x-intercept is a and y-intercept is b , then the equation of straight line \( l \) is
\( \frac{x}{a}+\frac{y}{b}=1\)

Proof
Given that \( l \) is a straight line whose x-intercept is a and y-intercept is b.
Thus, the line intersects the \( x\) -axis at the point \( A(a,0)\) and the \( y\) -axis at the point \( B(0,b)\) [See Figure below]

Now, the equation of the line \( l \) is
\( (y-y_1)=\frac{y_2-y_1}{x_2-x_1} (x-x_1)\)
Taking the point \( A(a,0)\) as \( (x_1,y_1)\) , and taking the point \( B(0,b)\) as \( (x_2,y_2)\) , the equation of straight line is
\( (y-0)=\frac{b-0}{0-a} (x-a)\)
or \( y=-\frac{b}{a} (x-a)\)
or \( ay=-b(x-a)\)
or \( ay=-bx+ab\)
or \( bx+ay=ab\)
or \( \frac{x}{a}+\frac{y}{b}=1\)
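Going the other way, the intercepts of a line given as \( Ax + By + C = 0\) follow by setting \( y = 0\) and \( x = 0\) in turn. The sketch below assumes integer coefficients with \( A\), \( B\) and \( C\) all non-zero (otherwise the double intercept form does not exist); the function name `intercepts` is illustrative.

```python
from fractions import Fraction

def intercepts(A, B, C):
    """x- and y-intercepts (a, b) of A*x + B*y + C = 0, so that x/a + y/b = 1."""
    if A == 0 or B == 0 or C == 0:
        raise ValueError("double intercept form needs A, B and C all non-zero")
    a = Fraction(-C, A)  # put y = 0
    b = Fraction(-C, B)  # put x = 0
    return a, b

a, b = intercepts(3, 2, -6)  # the line 3x + 2y - 6 = 0
print(a, b)                  # 2 3, i.e. x/2 + y/3 = 1
```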

Normal Form

The equation of the straight line on which the perpendicular from the origin has length \( p\) and makes an angle \( \alpha \) with the x-axis is
\( x \cos \alpha + y \sin \alpha = p\)

Proof
Given that \( l \) is a straight line such that the perpendicular from the origin to \( l \) has length \( p\) and makes an angle \( \alpha \) with the x-axis.
Suppose the line \( l \) intersects the x-axis at \( A(a,0)\) and the y-axis at \( B(0,b)\) , so that its \( x\) intercept is \( a\) and its \( y\) intercept is \( b\) . Now from the origin \( O\) draw \( OD\) perpendicular to \( l \) , so that \( OD = p\) [See Figure below]

Thus, the equation of the straight line is
\( \frac{x}{a}+\frac{y}{b}=1\) (A)
Here, from the right-angled \( \triangle ODA\) , we get,
\( \frac{OD}{OA} = \cos \alpha \)
or \( \frac{p}{a} = \cos \alpha \)
or \( a=\frac{p}{\cos \alpha}\) (1)
Again, from the right-angled \( \triangle ODB\) , we get,
\( \frac{OD}{OB} = \cos \left(\frac{\pi}{2}-\alpha \right ) \)
or\( \frac{p}{b} = \sin \alpha \)
or \( b=\frac{p}{\sin \alpha}\) (2)
Now, using (1) and (2) in (A), the equation of the straight line is
\( \frac{x}{a}+\frac{y}{b}=1\)
or \( \frac{x}{\frac{p}{\cos \alpha}}+\frac{y}{\frac{p}{\sin \alpha}}=1\)
or \( x \cos \alpha + y \sin \alpha = p\)
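A line given as \( Ax + By + C = 0\) can be brought to normal form by dividing through by \( \pm\sqrt{A^2+B^2}\), with the sign chosen so that \( p\) is non-negative. The following is a sketch under that assumption; the function name `normal_form` is illustrative, and \( \alpha\) is returned in radians.

```python
import math

def normal_form(A, B, C):
    """Return (p, alpha) with x*cos(alpha) + y*sin(alpha) = p and p >= 0."""
    norm = math.hypot(A, B)
    if norm == 0:
        raise ValueError("A and B cannot both be zero")
    # A*x + B*y = -C; divide by +norm or -norm so that the right-hand side is >= 0.
    sign = -1.0 if C > 0 else 1.0
    cos_a = sign * A / norm
    sin_a = sign * B / norm
    p = -sign * C / norm
    return p, math.atan2(sin_a, cos_a)

p, alpha = normal_form(3, 4, -10)  # the line 3x + 4y - 10 = 0
print(round(p, 6), round(math.cos(alpha), 6), round(math.sin(alpha), 6))  # 2.0 0.6 0.8
```

For this line the x-intercept is \( p/\cos\alpha = 10/3\), in agreement with relation (1) of the proof.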




Bisector of the angles between two lines

The bisectors of the angles between two lines (or angle bisectors of two lines) are the lines which bisect the angles between the two given lines. Each angle bisector is the locus of a point which is equidistant from the two lines. In other words, every point on an angle bisector has equal perpendicular distance from the two lines.

Equation of Angle Bisector

Let us consider a pair of straight lines given by
\(l_1 : a_1x + b_1y + c_1= 0\) and \(l_2 : a_2x + b_2y + c_2= 0\)
Also let P(x, y) be a point lying on an angle bisector, then the lengths of the perpendiculars from the point P to the lines \(l_1\) and \(l_2\) are equal.
Thus
\( \frac{a_1x + b_1y + c_1}{\sqrt{a_1^2+b_1^2}}=\pm \frac{a_2x + b_2y + c_2}{\sqrt{a_2^2+b_2^2}}\)(1)
Taking the two signs in (1) in turn gives the two bisectors as required.
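Equation (1) can be checked numerically. The sketch below normalises each line by \( \sqrt{a^2+b^2}\), forms the two bisectors by subtracting and adding the normalised equations, and confirms that a point on each bisector is equidistant from both lines. The function names are illustrative, and the two lines are assumed to intersect.

```python
import math

def angle_bisectors(l1, l2):
    """Both bisectors of intersecting lines l1, l2, each given as (a, b, c)."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    n1 = math.hypot(a1, b1)
    n2 = math.hypot(a2, b2)
    u = (a1 / n1, b1 / n1, c1 / n1)
    v = (a2 / n2, b2 / n2, c2 / n2)
    plus = tuple(s - t for s, t in zip(u, v))   # '+' sign in (1)
    minus = tuple(s + t for s, t in zip(u, v))  # '-' sign in (1)
    return plus, minus

def distance(line, point):
    """Perpendicular distance from point to the line a*x + b*y + c = 0."""
    a, b, c = line
    x, y = point
    return abs(a * x + b * y + c) / math.hypot(a, b)

l1, l2 = (3, -4, 2), (5, 12, 5)
for a, b, c in angle_bisectors(l1, l2):
    point = (0.0, -c / b)  # where this bisector meets the y-axis (b != 0 here)
    print(math.isclose(distance(l1, point), distance(l2, point)))  # True
```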




General equation of second degree

Let us consider the general equation of the second degree in two variables x and y given by
\(ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0\)
where a, h, b, g, f and c are constants.

The study of this general equation of the second degree in two variables used to be a major chapter in a course on analytic geometry. The equation usually represents a pair of straight lines or a conic.
Consider
\( \Delta =abc+2fgh-af^2-bg^2-ch^2\)
The detailed classification is given below

  1. a circle if \( \Delta \ne 0, a=b,h=0\)
  2. a parabola if \( \Delta \ne 0, h^2=ab\)
  3. an ellipse if \( \Delta \ne 0, h^2 < ab\)
  4. a hyperbola if \( \Delta \ne 0, h^2 > ab\)
  5. a rectangular hyperbola if \( \Delta \ne 0, h^2 > ab, a+b=0\)


  6. a pair of straight lines if \( \Delta =0\)
  7. a pair of parallel lines if \( \Delta =0,h^2=ab\)
  8. a pair of perpendicular lines if \( \Delta =0,a+b=0\)
  9. a point if \( \Delta =0,h^2 < ab\)
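The classification above can be written as a small decision procedure. This is a sketch of the table exactly as listed, not a complete treatment: like the table, it does not separate out imaginary loci or coincident lines, and it assumes \( a\), \( h\) and \( b\) are not all zero. The function name `classify` and the returned labels are illustrative.

```python
from fractions import Fraction

def classify(a, h, b, g, f, c):
    """Classify a*x^2 + 2*h*x*y + b*y^2 + 2*g*x + 2*f*y + c = 0 using the table above."""
    a, h, b, g, f, c = (Fraction(v) for v in (a, h, b, g, f, c))
    if a == 0 and h == 0 and b == 0:
        raise ValueError("not a second degree equation")
    delta = a * b * c + 2 * f * g * h - a * f * f - b * g * g - c * h * h
    disc = h * h - a * b
    if delta != 0:
        if a == b and h == 0:
            return "circle"
        if disc == 0:
            return "parabola"
        if disc < 0:
            return "ellipse"
        return "rectangular hyperbola" if a + b == 0 else "hyperbola"
    if disc == 0:
        return "pair of parallel lines"
    if disc < 0:
        return "point"
    return "pair of perpendicular lines" if a + b == 0 else "pair of intersecting lines"

print(classify(1, 0, 1, 0, 0, -1))                    # circle (x^2 + y^2 = 1)
print(classify(0, 0, 1, -2, 0, 0))                    # parabola (y^2 = 4x)
print(classify(4, 2, 1, -3, Fraction(-3, 2), -4))     # pair of parallel lines
```

Exact fractions are used so that the test \( \Delta = 0\) is not affected by rounding.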

NOTES

  1. If the general equation of second degree \( ax^2 +2hxy + by^2 +2gx + 2fy + c = 0\) represents a pair of straight lines then, regarded as a quadratic equation in x, its discriminant must be a perfect square. The general equation is
    \( ax^2 +2hxy + by^2 +2gx + 2fy + c = 0\)
    or \( ax^2 +(2hy+2g)x + (by^2 + 2fy + c) = 0\)
    Now, the discriminant must be a perfect square, thus
    \((2hy+2g)^2-4(a)(by^2+2fy+c)\) is a perfect square
    or \((hy+g)^2-a(by^2+2fy+c)\) is a perfect square
    or \((h^2-ab)y^2+2(hg-af)y +(g^2-ac)\) is a perfect square
    Again, \((h^2-ab)y^2+2(hg-af)y +(g^2-ac)\) is a perfect square if its discriminant is zero
    Thus,
    \( 4(hg-af)^2-4(h^2-ab)(g^2-ac)=0\)
    or \( (hg-af)^2-(h^2-ab)(g^2-ac)=0\)
    or \( abc+2fgh-af^2-bg^2-ch^2=0\)

  2. If the straight lines \( ax^2 +2hxy + by^2 +2gx + 2fy + c = 0\) intersect on the X-axis, then
    \( \Delta =abc+2fgh-af^2-bg^2-ch^2=0\) and \(y=0\)
    Thus we have
    \( abc+2fgh-af^2-bg^2-ch^2=0\) and \( ax^2 +2gx + c = 0\)
    In the equation \( ax^2 +2gx + c = 0\), the discriminant must be zero, thus
    \( abc+2fgh-af^2-bg^2-ch^2=0\) and \( g^2=ac\)
    or\( abc+2fgh-af^2-bac-ch^2=0\) and \( g^2=ac\)
    or\( 2fgh-af^2-ch^2=0\) and \( g^2=ac\)
    or\( 2fgh=af^2+ch^2\) and \( g^2=ac\)

  3. If the straight lines \( ax^2 +2hxy + by^2 +2gx + 2fy + c = 0\) intersect on the Y-axis, then
    \( \Delta =abc+2fgh-af^2-bg^2-ch^2=0\) and \(x=0\)
    Thus we have
    \( abc+2fgh-af^2-bg^2-ch^2=0\) and \(by^2 + 2fy + c = 0\)
    In the equation \(by^2 + 2fy + c = 0\), the discriminant must be zero, thus
    \( abc+2fgh-af^2-bg^2-ch^2=0\) and \( f^2=bc\)
    or\( abc+2fgh-abc-bg^2-ch^2=0\) and \( f^2=bc\)
    or\( 2fgh-bg^2-ch^2=0\) and \( f^2=bc\)
    or\( 2fgh=bg^2+ch^2\) and \( f^2=bc\)




Homogeneous Equation

A polynomial is homogeneous if all its terms have the same degree. For example,
\( 7x^5y^2-3xy^6\) is homogeneous of degree 7.
Similarly, an equation is called homogeneous if all its terms have the same degree.
For example,
\( ax^2+2hxy+by^2=0\) is a homogeneous equation of degree 2.
NOTE
The general form of second degree equation in x and y is
\( ax^2+2hxy+by^2+2gx+2fy+c=0 \)
and it is non-homogeneous because its terms do not all have the same degree.




Pair of Straight Lines

Statement
Prove that \( ax^2+2hxy+by^2=0\) represents a pair of straight lines passing through the origin.
Solution
Given the equation is
\( ax^2+2hxy+by^2=0\)
or \( b \left ( \frac{y}{x}\right )^2 +2h\left ( \frac{y}{x}\right )+a=0\)
Let the roots of this quadratic equation in \( \frac{y}{x}\) be \(m_1,m_2\), then
\( \frac{y}{x}=m_1\) and \( \frac{y}{x}=m_2\)
or \( y=m_1x\) and \(y= m_2x\)
These are two straight lines passing through the origin.

Next

We know that the equation of a straight line in slope intercept form is
\( y=mx+c\) .
A line passes through the origin when \( c=0\) , so any two lines through the origin can be written as
\( y=m_1x\) and \( y=m_2x\) . (1)
Writing the equations in (1) as \( y-m_1x=0\) and \( y-m_2x=0\) and multiplying them, we get a pair of lines as
\( (y-m_1x)(y-m_2x)=0\)
The general form of this equation is
\( ax^2+2hxy+by^2=0\) (2)
The equation (2) represents a pair of straight lines.

This equation is known as the homogeneous equation of second degree. It represents a pair of straight lines passing through the origin.
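The two slopes can be recovered from the quadratic \( bm^2 + 2hm + a = 0\) in \( m = y/x\), whose roots are \( m = \frac{-h \pm \sqrt{h^2-ab}}{b}\). The sketch below assumes \( b \ne 0\) (neither line is vertical) and \( h^2 \ge ab\) (the lines are real); the function name `slopes_through_origin` is illustrative.

```python
import math

def slopes_through_origin(a, h, b):
    """Slopes of the two lines a*x^2 + 2*h*x*y + b*y^2 = 0 through the origin."""
    disc = h * h - a * b
    if b == 0 or disc < 0:
        raise ValueError("needs b != 0 and h^2 >= a*b")
    root = math.sqrt(disc)
    return (-h + root) / b, (-h - root) / b

# 2x^2 - 3xy + y^2 = 0 has a = 2, h = -3/2, b = 1 and factorises as (y - 2x)(y - x) = 0.
print(slopes_through_origin(2, -1.5, 1))  # (2.0, 1.0)
```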




Angle between two lines

The angle \( \theta\) between two straight lines \( l_1\) and \( l_2\) having slope \( m_1\) and \( m_2\) respectively is
\( \tan \theta = \pm \frac{m_1 -m_2}{1+m_1 m_2 }\)

Proof
Let \( l_1\) and \( l_2\) be two straight lines with equations \( y = m_1 x + c_1\) and \( y = m_2 x + c_2\) respectively. Then, clearly, the slopes of \( l_1\) and \( l_2\) are \(m_1\) and \(m_2\) respectively.

Also let \( l_1\) and \( l_2\) make angles \( \theta_1\) and \( \theta_2\) respectively with the positive direction of the x-axis. Then,
\( m_1 = \tan \theta_1 \) and \( m_2 = \tan \theta_2\) (1)
Also let \( l_1\) and \( l_2\) intersect at a point P and let the angle between \( l_1\) and \( l_2\) be \( \theta\) such that
\( \measuredangle APC = \theta\)
Now, we get,
\( \theta = \theta_2-\theta_1\)
Now taking tangent on both sides, we get,
\( \tan \theta = \tan (\theta_2 - \theta_1)\)
or \( \tan \theta = \frac{\tan\theta_2-\tan \theta_1}{1+\tan \theta_1 \tan\theta_2} \) , using the formula \( \tan (A - B) =\frac{\tan A-\tan B }{1+\tan A \tan B}\)
or \( \tan \theta = \frac{m_2-m_1}{1+m_1 m_2 }\)
The other angle between the lines is \( \pi - \theta\) , whose tangent is \( -\tan \theta\) . Taking both angles together,
\( \tan \theta = \pm \frac{m_1-m_2}{1+m_1 m_2 }\)
Notes:

  1. The angle between the lines \( l_1\) and \( l_2\) is acute or obtuse according as the value of \( \frac{m_2 -m_1} {1+m_1 m_2}\) is positive or negative.
  2. The angle between two intersecting straight lines means the measure of the acute angle between the lines.
  3. If two lines are parallel then \( m_1=m_2\)
  4. If two lines are perpendicular then \( m_1m_2=-1\)
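The formula and the notes above can be combined into a short helper that returns the tangent of the acute angle. This is a sketch with an illustrative function name; it takes the absolute value to select the acute angle and returns `None` when \( 1 + m_1 m_2 = 0\), that is, when the lines are perpendicular.

```python
from fractions import Fraction

def tan_acute_angle(m1, m2):
    """tan of the acute angle between lines of slopes m1 and m2 (None if perpendicular)."""
    denom = 1 + m1 * m2
    if denom == 0:
        return None  # perpendicular lines: tan(theta) is undefined
    return abs((m2 - m1) / denom)

print(tan_acute_angle(Fraction(1, 2), Fraction(7, 4)))   # 2/3
print(tan_acute_angle(Fraction(7, 4), Fraction(3, 11)))  # 1, so the acute angle is 45 degrees
print(tan_acute_angle(Fraction(2), Fraction(-1, 2)))     # None
```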

Angle between pair of lines

Given the pair of lines are
\( ax^2+2hxy+by^2=0\)
or \( b \left ( \frac{y}{x}\right )^2 +2h\left ( \frac{y}{x}\right )+a=0\)
Assume that the slopes of the lines are \(m_1,m_2\), then
\( m_1+m_2=-\frac{2h}{b}\) and \( m_1m_2=\frac{a}{b}\)
Now, the angle between the lines is given by
\( \tan \theta = \pm \frac{m_1-m_2}{1+m_1m_2} \)
or\( \tan \theta = \pm \frac{ \sqrt{(m_1+m_2)^2-4m_1m_2} }{1+m_1m_2} \)
or\( \tan \theta = \pm \frac{ \sqrt{\left ( \frac{-2h}{b} \right )^2-4 \left ( \frac{a}{b} \right )} }{1+\left ( \frac{a}{b} \right )} \)
or\( \tan \theta = \pm \frac{ 2 \sqrt{h^2-ab}}{a+b} \)

Notes:
  1. The angle between pair of lines is acute or obtuse according as the value of \( \frac{ 2 \sqrt{h^2-ab}}{a+b} \) is positive or negative.
  2. If the two lines are coincident or parallel then \( h^2=ab\)
  3. If the two lines are perpendicular then \( a+b=0\)
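The result \( \tan \theta = \pm \frac{2\sqrt{h^2-ab}}{a+b}\) can be evaluated directly from the coefficients. The sketch below returns the tangent of the acute angle and `None` for perpendicular lines; the function name is illustrative and real lines (\( h^2 \ge ab\)) are assumed.

```python
import math

def tan_angle_between_pair(a, h, b):
    """tan of the acute angle between the lines a*x^2 + 2*h*x*y + b*y^2 = 0."""
    disc = h * h - a * b
    if disc < 0:
        raise ValueError("the equation does not represent real lines")
    if a + b == 0:
        return None  # perpendicular lines
    return 2 * math.sqrt(disc) / abs(a + b)

t = tan_angle_between_pair(1, -2, 1)  # x^2 - 4xy + y^2 = 0
print(round(math.degrees(math.atan(t)), 6))  # 60.0
```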



Example 1

If A (-2, 1), B (2, 3) and C (-2, -4) are three points, find the angle between the straight lines AB and BC.

Solution
Let the slopes of the lines AB and BC be \(m_1\) and \(m_2\) respectively. Then
\( m_1 = \frac{3-1}{2-(-2)}=\frac{1}{2} \) and \( m_2 = \frac{-4-3}{-2-2}=\frac{7}{4} \)
Let \( \theta\) be the angle between AB and BC. Then,
\( \tan \theta =\frac{m_2 -m_1}{1+m_1 m_2}=\frac{\frac{7}{4}-\frac{1}{2}}{1+\frac{1}{2}\cdot \frac{7}{4}}=\frac{2}{3}\)
or \( \theta =\tan^{-1}\frac{2}{3}\)
which is the required angle.




Example 2

Find the acute angle between the lines 7x - 4y = 0 and 3x - 11y + 5 = 0.

Solution
First we need to find the slopes of both lines.
The slope of the line 7x - 4y = 0 is
\( m_1=\frac{7}{4}\) (1)
The slope of the line 3x - 11y + 5 = 0 is
\( m_2=\frac{3}{11}\) (2)
Now, let the acute angle between the given lines 7x - 4y = 0 and 3x - 11y + 5 = 0 be \( \theta\) , then
\( \tan \theta = \left| \frac{m_2 -m_1}{1+m_1 m_2} \right| = \left| \frac{\frac{3}{11}-\frac{7}{4}}{1+\frac{7}{4}\cdot \frac{3}{11}} \right| = |-1| = 1\)
or \( \theta = 45^\circ\)


Condition of Parallelism of Lines

If two lines of slopes \(m_1\) and \(m_2\) are parallel, then the angle \( \theta\) between them is \( 0^\circ\) .
Therefore,
\( \tan \theta = \tan 0\)
or \(\tan \theta = 0\)
or \( \frac{m_2 -m_1}{ 1+m_1 m_2 } = 0\)
or \( m_2 -m_1 = 0\)
or \( m_1 = m_2 \)
Thus when two lines are parallel, their slopes are equal.

Example 1

What is the value of k so that the line through (3, k) and (2, 7) is parallel to the line through (-1, 4) and (0, 6)?

Solution
Let A(3, k), B(2, 7), C(-1, 4) and D(0, 6) be the given points. Then,
\( m_1 =\) slope of the line \( AB = \frac{7-k}{2-3} =k -7\)
\( m_2 =\) slope of the line \( CD = \frac{6-4}{0-(-1)} = 2\)
Since, AB and CD are parallel, therefore
\( m_1 = m_2 \)
or k - 7 = 2
or k = 9

Example 2

A quadrilateral has the vertices at the points (-4, 2), (2, 6), (8, 5) and (9, -7). Show that the mid-points of the sides of this quadrilateral are the vertices of a parallelogram.

Solution
Let A(-4, 2), B(2, 6), C(8, 5) and D(9, -7) be the vertices of the given quadrilateral. Let P,Q, R and S be the mid-points of AB, BC, CD and DA respectively.
Then the coordinates of P, Q, R and S are P(-1, 4), Q (5, 11/2), R(17/2, -1) and S(5/2, -5/2).
In order to prove that PQRS is a parallelogram, it is sufficient to show that PQ || RS and PQ =RS .
We have,
\( m_1 =\) Slope of the side \( PQ = \frac{1}{4}\)
\( m_2 = \) Slope of the side \( RS = \frac{1}{4}\)
Clearly, \( m_1 = m_2\).
This shows that
PQ || RS. (1)
Now,
\( PQ = \sqrt{(5+1)^2+\left(\frac{11}{2}-4\right)^2}=\frac{\sqrt{153}}{2}\)
\( RS = \sqrt{\left(\frac{5}{2}-\frac{17}{2}\right)^2+\left(-\frac{5}{2}+1\right)^2}=\frac{\sqrt{153}}{2}\)
Therefore,
PQ = RS (2)
Using (1) and (2), we claim that
PQRS is a parallelogram.
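The midpoints, slopes and lengths used in this example can be checked with exact fractions. The helper names below are illustrative; `slope` assumes the segment is not vertical, and squared lengths are compared to avoid square roots.

```python
from fractions import Fraction

def midpoint(p, q):
    """Midpoint of the segment pq, with exact fractional coordinates."""
    return (Fraction(p[0] + q[0], 2), Fraction(p[1] + q[1], 2))

def slope(p, q):
    """Slope of pq; assumes the segment is not vertical."""
    return (q[1] - p[1]) / (q[0] - p[0])

def length_squared(p, q):
    """Square of the distance between p and q."""
    return (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2

A, B, C, D = (-4, 2), (2, 6), (8, 5), (9, -7)
P, Q, R, S = midpoint(A, B), midpoint(B, C), midpoint(C, D), midpoint(D, A)

print(slope(P, Q), slope(R, S))                    # 1/4 1/4
print(length_squared(P, Q), length_squared(R, S))  # 153/4 153/4
```

Equal slopes give PQ || RS, and equal squared lengths give PQ = RS, which are the two facts the solution relies on.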




Condition of Perpendicularity of Two Lines

If two lines \( l_1\) and \( l_2\) of slopes \(m_1\) and \(m_2\) are perpendicular, then the angle \( \theta\) between the lines is \( 90^\circ\) .
Therefore,
\( \cot \theta = 0\)
or \( \frac{1+m_1 m_2}{ m_2 -m_1} = 0\)
or \( 1 + m_1 m_2 = 0\)
or \( m_1 m_2 = -1\)
Thus, when two lines are perpendicular, the product of their slopes is \( -1\) .
NOTE
If m is the slope of a line, then the slope of a line perpendicular to it is \( -1/m\) .
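For lines written as \( a_1x + b_1y + c_1 = 0\) and \( a_2x + b_2y + c_2 = 0\), the slopes are \( -a_1/b_1\) and \( -a_2/b_2\), so the conditions \( m_1 = m_2\) and \( m_1m_2 = -1\) become \( a_1b_2 - a_2b_1 = 0\) and \( a_1a_2 + b_1b_2 = 0\). The sketch below uses these cross-multiplied forms, which also cover vertical lines where the slope is undefined. The function names are illustrative and exact (integer or fractional) coefficients are assumed.

```python
def are_parallel(l1, l2):
    """True if the lines (a1, b1, c1) and (a2, b2, c2) are parallel or coincident."""
    (a1, b1, _), (a2, b2, _) = l1, l2
    return a1 * b2 - a2 * b1 == 0  # same as m1 = m2 when both slopes exist

def are_perpendicular(l1, l2):
    """True if the lines (a1, b1, c1) and (a2, b2, c2) are perpendicular."""
    (a1, b1, _), (a2, b2, _) = l1, l2
    return a1 * a2 + b1 * b2 == 0  # same as m1 * m2 = -1 when both slopes exist

print(are_parallel((2, -1, 1), (4, -2, -3)))      # True: both lines have slope 2
print(are_perpendicular((2, 1, 0), (1, -2, 0)))   # True: slopes -2 and 1/2
print(are_perpendicular((1, 0, -2), (0, 1, 1)))   # True: x = 2 and y = -1
```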

Example 1

If P (6, 4) and Q (2, 12) are two points, then find slope of a line perpendicular to PQ.

Solution

Let m be the slope of PQ. Then
\( m = \frac{12-4}{2-6} = -2\)
Therefore the slope of the line perpendicular to PQ is
\( m_1=-\frac{1}{m}=\frac{1}{2}\)

Example 2

Without using the Pythagoras theorem, show that P (4, 4), Q (3, 5) and R (-1, -1) are the vertices of a right angled triangle.

Solution

In \( \triangle PQR\) , we have:
\( m_1 =\) Slope of the side PQ \( = \frac{5-4}{3-4} = -1\)
\( m_2 =\) Slope of the side PR \( = \frac{-1-4}{-1-4} = 1\)
Now clearly we see that
\( m_1 m_2 = -1\)
Therefore, the side PQ is perpendicular to PR, that is, \( \measuredangle RPQ = 90^\circ\) .
Therefore, the given points P (4, 4), Q (3, 5) and R (-1, -1) are the vertices of a right angled triangle.

Example 3

Find the ortho-centre of the triangle formed by joining the points P (- 2, -3), Q (6, 1) and R (1, 6).

Solution
The slope of the side QR of the \( \triangle PQR\) is
\( \frac{6-1}{1-6} = -1\)
Let PS be the perpendicular from P on QR. If the slope of the line PS is m, then
m = 1.
Therefore, the equation of the straight line PS is
y + 3 = 1 (x + 2)
or x - y = 1 (1)
Again, the slope of the side RP of the \( \triangle PQR \) is
\( \frac{-3-6}{-2-1} = 3\)
Let QT be the perpendicular from Q on RP. If the slope of the line QT is \( m_1\) , then
\( m_1=-\frac{1}{3}\)
Therefore, the equation of the straight line QT is
\( y-1 =-\frac{1}{3}(x-6) \)
or x + 3y = 9 (2)
Now, solving equations (1) and (2) we get
x = 3, y = 2 .
Therefore, the co-ordinates of the ortho-centre of the \( \triangle PQR\) are (3, 2).
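The same two steps, writing down two altitudes and solving them simultaneously, can be sketched in Python. Each altitude is stored as \( Ax + By = C\), using the direction of the opposite side as its normal vector, and the pair is solved by Cramer's rule. The function names are illustrative and integer vertex coordinates are assumed.

```python
from fractions import Fraction

def altitude(vertex, p, q):
    """(A, B, C) with A*x + B*y = C: the line through vertex perpendicular to side pq."""
    dx, dy = q[0] - p[0], q[1] - p[1]  # direction of pq is normal to the altitude
    return dx, dy, dx * vertex[0] + dy * vertex[1]

def orthocentre(P, Q, R):
    """Intersection of the altitudes from P and from Q of triangle PQR."""
    a1, b1, c1 = altitude(P, Q, R)
    a2, b2, c2 = altitude(Q, R, P)
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the three points are collinear")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y

x, y = orthocentre((-2, -3), (6, 1), (1, 6))
print(x, y)  # 3 2
```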

Example 4

What is the single equation of straight lines passing through the origin and perpendicular to the lines represented by \( ax^2-2hxy+by^2=0\)?
Solution
Given the pair of lines are
\( ax^2-2hxy+by^2=0\) (A)
or \( b \left ( \frac{y}{x}\right )^2 -2h\left ( \frac{y}{x}\right )+a=0\)
Assume that the slopes of the lines are \(m_1,m_2\), then
\( m_1+m_2=\frac{2h}{b}\) and \( m_1m_2=\frac{a}{b}\)
Now, the slopes of the lines perpendicular to (A) are \(\frac{-1}{m_1}\) and \(\frac{-1}{m_2}\), so the required equation is given by
\( \left ( \frac{y}{x}\right )^2 -\left ( \frac{-1}{m_1}+\frac{-1}{m_2} \right )\left ( \frac{y}{x}\right )+ \left ( \frac{-1}{m_1}.\frac{-1}{m_2} \right ) =0\)
or \( \left ( \frac{y}{x}\right )^2 +\frac{m_1+m_2}{m_1m_2}\left ( \frac{y}{x}\right )+ \frac{1}{m_1m_2} =0\)
or \( \left ( \frac{y}{x}\right )^2 +\frac{2h}{a}\left ( \frac{y}{x}\right )+ \frac{b}{a} =0\)
or \( bx^2+2hxy+ay^2=0\)




Angle Bisector




Equation of Angle Bisector of the pair of lines
  1. Equation of the angle bisectors of the pair of lines \( ax^2+2hxy+by^2=0\)
    Given the pair of lines are
    \( ax^2+2hxy+by^2=0\)
    or \( b \left ( \frac{y}{x}\right )^2 +2h\left ( \frac{y}{x}\right )+a=0\)
    Assume that the slopes of the lines \(l_1\) and \(l_2\) are \(m_1,m_2\), then
    \( m_1+m_2=-\frac{2h}{b}\) and \( m_1m_2=\frac{a}{b}\)

    Now, an angle bisector \(l\) of the two lines \(l_1\) and \(l_2\) is the locus of a point which is equidistant from \(l_1\) and \(l_2\) . In other words, it has equal perpendicular distance from the two lines. Thus, the equation of the angle bisectors is
    \( \frac{y-m_1x}{\sqrt{1+m_1^2}} = \pm \frac{y-m_2x}{\sqrt{1+m_2^2}} \)
    Squaring both sides, we get
    \( \frac{(y-m_1x)^2(1+m_2^2)-(y-m_2x)^2(1+m_1^2) }{(1+m_1^2)(1+m_2^2)}=0\)
    or \( y^2(m_2^2-m_1^2)-x^2(m_2^2-m_1^2)+2xy(m_2-m_1)-2xym_1m_2(m_2-m_1)=0\)
    Dividing by \( (m_2-m_1)\) , we get
    \( y^2(m_2+m_1)-x^2(m_2+m_1)+2xy-2xym_1m_2=0\)
    or \( y^2(m_2+m_1)-x^2(m_2+m_1)+2xy(1-m_1m_2)=0\)
    Substituting \( m_1+m_2=-\frac{2h}{b}\) and \( m_1m_2=\frac{a}{b}\) , we get
    \( \frac{x^2-y^2}{a-b}=\frac{xy}{h}\)




  2. Equation of the angle bisectors of the pair of lines \( ax^2+2hxy+by^2=0\) (Method 2)
    Given the pair of lines are
    \( ax^2+2hxy+by^2=0\)
    or \( b \left ( \frac{y}{x}\right )^2 +2h\left ( \frac{y}{x}\right )+a=0\)
    Assume that the slopes of the lines \(l_1\) and \(l_2\) are \(m_1,m_2\), then
    \( m_1+m_2=-\frac{2h}{b}\) and \( m_1m_2=\frac{a}{b}\)

    Now, let \( \theta _1\) and \( \theta _2\) be the angles made by \(l_1\) and \(l_2\) with the positive direction of the x-axis, so that \( m_1=\tan \theta _1\) and \( m_2=\tan \theta _2\) . An angle bisector \(l\) of \(l_1\) and \(l_2\) bisects the angle \( (\theta _2 - \theta _1 )\) between them. In other words, the angle bisector makes an angle \( \frac{\theta _2 - \theta _1}{2}+ \theta _1 \) with the x-axis.
    Thus, the angle bisector makes an angle \( \frac{\theta _1 + \theta _2}{2} \)
    Let us suppose that,
    \( \frac{\theta _1 + \theta _2}{2} = \phi\)
    Then,
    \( \tan 2 \phi= \tan (\theta _1 + \theta _2)\)
    or \( \frac{2 \tan \phi}{1-\tan^2 \phi}= \frac{\tan \theta _1+ \tan \theta _2}{1-\tan \theta _1 \tan \theta _2} \)
    Since the bisector passes through the origin, \( \tan \phi = \frac{y}{x}\) for any point \( (x,y)\) on it, so
    \( \frac{2 \frac{y}{x} }{1-\frac{y^2}{x^2} }= \frac{m_1+ m_2}{1-m_1 m_2} \)
    or \( \frac{2 \frac{y}{x} }{1-\frac{y^2}{x^2} }= \frac{\frac{-2h}{b}}{1-\frac{a}{b}} \)
    or \( \frac{x^2-y^2}{a-b}=\frac{xy}{h}\)
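The result can be sanity-checked on a pair whose bisectors are known. Clearing denominators, \( \frac{x^2-y^2}{a-b}=\frac{xy}{h}\) becomes \( hx^2-(a-b)xy-hy^2=0\). In the sketch below (function names illustrative) the pair \( y(y-x)=0\), that is the lines \( y=0\) and \( y=x\), is used: its bisectors should make angles of \( 22.5^\circ\) and \( 112.5^\circ\) with the x-axis.

```python
import math

def bisector_pair(a, h, b):
    """(A, B, C) with A*x^2 + B*x*y + C*y^2 = 0 for the bisectors of a*x^2 + 2*h*x*y + b*y^2 = 0."""
    # (x^2 - y^2)/(a - b) = x*y/h  <=>  h*x^2 - (a - b)*x*y - h*y^2 = 0
    return h, -(a - b), -h

def evaluate(coeffs, x, y):
    """Value of A*x^2 + B*x*y + C*y^2 at the point (x, y)."""
    A, B, C = coeffs
    return A * x * x + B * x * y + C * y * y

# y^2 - xy = 0 has a = 0, h = -1/2, b = 1.
coeffs = bisector_pair(0, -0.5, 1)
for degrees in (22.5, 112.5):
    t = math.radians(degrees)
    print(abs(evaluate(coeffs, math.cos(t), math.sin(t))) < 1e-12)  # True
```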




Solved Examples

  1. Find the combined equation of the straight lines whose separate equations are \(x-2y-3 = 0\) and \(x + y+5 = 0\).
    Solution :
    Combined equation of straight lines :
    \( (x-2y-3)(x + y + 5)=0\)
    or \(x^2 + xy + 5x-2xy - 2y^2 - 10y-3x-3y-15=0\)
    or \(x^2-xy - 2y^2 + 2x - 13y-15=0\)
  2. Show that \(4x^2 + 4xy + y^2-6x-3y-4 = 0 \)represents a pair of parallel lines.
    Solution :
    Given pair of lines is
    \( 4x^2 + 4xy + y^2-6x-3y-4 = 0\)
    By comparing the given equation with the general equation of pair of straight lines
    \( ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0\)
    We get,
    a = 4, b = 1, 2h = 4 ==> h = 2
    Now, we can see that
    \(h^2=ab\)
    since \(2^2=4 \cdot 1\)
    Therefore, the two lines represented by \(4x^2 + 4xy + y^2 - 6x - 3y - 4 = 0 \) are parallel.
  3. Show that \(2x^2 + 3xy − 2y^2 + 3x + y + 1 = 0 \)represents a pair of perpendicular lines.
    Solution :
    \(2x^2 + 3xy − 2y^2 + 3x + y + 1 = 0\)
    By comparing the given equation with the general equation of pair of straight lines
    \( ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0\)
    We get,
    a = 2, b = -2, 2h = 3 ==> h = 3/2
    Two lines are perpendicular if
    a + b = 0
    Here, 2 + (-2) = 0
    Hence the given pair of straight lines is perpendicular.
  4. Prove that the equation \(6x^2+13xy+6y^2+8x+7y+2=0\) represents a pair of straight lines.
    Solution:
    To show that this equation represents a pair of straight lines, we use the determinant condition
    \( \begin{vmatrix} a & h & g \\ h &b&f\\g&f&c \end{vmatrix}=0\)
    Here a = 6, b = 6, c = 2, 2h = 13, 2g = 8 and 2f = 7, so
    \( \begin{vmatrix} 6 &13/2&4 \\13/2&6&7/2\\4&7/2&2 \end{vmatrix} =72+182-\frac{147}{2}-96-\frac{169}{2}=0\)
    which confirms the stated assertion.
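The expansion in the first solved example can be reproduced by multiplying the two linear factors coefficient by coefficient. This is a sketch with an illustrative function name; each line \( px + qy + r = 0\) is passed as the triple `(p, q, r)` and the result is keyed by monomial.

```python
def combine(l1, l2):
    """Coefficients of (p1*x + q1*y + r1) * (p2*x + q2*y + r2), keyed by monomial."""
    (p1, q1, r1), (p2, q2, r2) = l1, l2
    return {
        "x^2": p1 * p2,
        "xy": p1 * q2 + p2 * q1,
        "y^2": q1 * q2,
        "x": p1 * r2 + p2 * r1,
        "y": q1 * r2 + q2 * r1,
        "1": r1 * r2,
    }

# (x - 2y - 3)(x + y + 5) = 0
print(combine((1, -2, -3), (1, 1, 5)))
# {'x^2': 1, 'xy': -1, 'y^2': -2, 'x': 2, 'y': -13, '1': -15}
```

In the notation of the general equation, the `xy`, `x` and `y` entries are \( 2h\), \( 2g\) and \( 2f\) respectively.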



Exercise

  1. Find the combined equation of the straight lines whose separate equations are \(x - 2y -3 = 0\) and \(x + y+5 = 0\)
  2. Show that \( 4x^2 + 4xy + y^2 - 6x - 3y - 4 = 0 \) represents a pair of parallel lines
  3. Show that \( 2x^2 + 3xy - 2y^2 + 3x + y + 1 = 0\) represents a pair of perpendicular lines.
  4. Show that the equation \( 2x^2 -xy-3y^2 -6x + 19y - 20 = 0 \) represents a pair of intersecting lines. Show further that the angle between them is \( \tan^{-1}(5) \).
  5. Prove that the equation to the straight lines through the origin, each of which makes an angle \( \alpha \) with the straight line y = x is \( x^2 - 2xy \sec 2 \alpha + y^2 = 0 \)
  6. Find the equation of the pair of straight lines passing through the point (1, 3) and perpendicular to the lines \(2x - 3y+1 = 0\) and \(5x + y - 3 = 0\)
  7. Find the separate equation of the following pair of straight lines
    1. \( 3x^2 + 2xy - y^2 = 0 \)
    2. \(6(x - 1)^2 + 5(x - 1)(y - 2) - 4(y - 2)^2 = 0\)
    3. \(2x^2 - xy - 3y^2 - 6x + 19y - 20 = 0 \)
  8. The slope of one of the straight lines \(ax^2 + 2hxy + by^2 = 0 \) is twice that of the other, show that \( 8h^2 = 9ab \) .
  9. The slope of one of the straight lines \(ax^2 + 2hxy + by^2 = 0\) is three times the other, show that \( 3h^2 = 4ab . \)
  10. A \( \triangle OPQ\) is formed by the pair of straight lines \(x^2 -4xy +y^2 = 0\) and the line PQ . The equation of PQ is x + y - 2 = 0 . Find the equation of the median of the triangle \( \triangle OPQ\) drawn from the origin O .
  11. Find p and q , if the following equation represents a pair of perpendicular lines \( 6x^2 + 5xy - py^2 + 7x + qy - 5 = 0 \)
  12. Find the value of k , if the following equation represents a pair of straight lines. Further, find whether these lines are parallel or intersecting, \( 12x^2 + 7xy - 12y^2 - x + 7y + k = 0 \)
  13. For what value of k does the equation \( 12x^2 +2kxy+2y^2 +11x-5y+2 = 0 \) represent two straight lines.



Exercise 9.1 [BCB pg 244]

  1. Find the lengths of perpendiculars drawn from
    1. (0,0) to the line \(3x+y+1=0\)
    2. (-3,0) to the line \(3x+4y+7=0\)
    3. (2,3) to the line \(8x+15y+24=0\)
  2. If p is the length of the perpendicular dropped from the origin on the line \(\frac{x}{a}+\frac{y}{b}=1\) prove that \(\frac{1}{a^2}+\frac{1}{b^2}=\frac{1}{p^2}\)
  3. Find the distance between the parallel lines
    1. \(3x+5y=11\) and \(3x+5y=-23\)
    2. \(2x-5y=6\) and \(6x-15y+11=0\)
  4. Find the equation of bisectors of the angles between the lines
    1. \(3x-4y+2=0\) and \(5x+12y+5=0\)
    2. \(3x-4y+6=0\) and \(5x+12y+10=0\)
    3. \(x-2y=0\) and \(2y-11x=6\)

    1. Find the equation of the bisectors of the angles between the lines containing the origin in each of the following cases
      1. \(4x-3y+1=0\) and \(12x-5y+7=0\)
      2. \(7x-y+11=0\) and \(x+y-15=0\)
      3. Prove that the bisectors of angles are at right angles to each other.
    2. Find the equation of the bisectors of acute angles between each of the following pair of lines
      1. \(y=x\) and \(y=7x+4\)
      2. \(x+2y=5\) and \(4x+2y+9=0\)
      3. Prove that the bisectors of angles are at right angles to each other.

    1. Are the points (1,2) and (-5,6) on the same side or on opposite side of the line \(3x+5y-8=0\)?
    2. On what side of the line \(5x-4y+6=0\) do the points (0,0) and (-1,3) lie?
    3. Show that two of the three points (0,0),(2,3) and (3,4) lie on one side and the remaining on the other side of the line \(x-3y+3=0\)

    1. The length of the perpendicular drawn from the point (a,3) on the line 3x+4y+5=0 is 4. Find the value of a.
    2. What are the points on the axis of X whose perpendicular distance from the straight line \(\frac{x}{a}+\frac{y}{b}=1\) is a ?

    1. Determine the equation and length of the altitude drawn from the vertex A to the opposite side of the triangle A(1,0), B(1,3), C(4,-2).
    2. Find the equation of two straight lines each of which is parallel to and at a distance of \(\sqrt{5}\) from the line x+2y-7=0
    3. Find the equations of the two straight lines drawn through the point (0,a) on which the perpendicular drawn from the point (2a,2a) are each of length a
    4. Find the equation of line which is at right angles to \(3x+4y=12\) such that its perpendicular distance from the origin is equal to the length of the perpendicular from (3,2) on the given line.
    5. The equation of the diagonal of a parallelogram is \(3y=5x+k\). The two opposite vertices of a parallelogram are the points (1,-2) and (-2,1). Find the value of k.
  5. If p and p' be the lengths of the perpendiculars from the origin upon the straight lines whose equations are \(x \sec \theta + y \csc \theta = a \) and \(x \cos \theta - y \sin \theta = a \cos 2 \theta \), prove that \(4p^2 + p'^2 = a^2\)
  6. Show that the product of the perpendiculars drawn from the two points \( ( \pm \sqrt{a^2-b^2},0) \) upon the line \(\frac{x}{a}\cos \theta +\frac{y}{b} \sin \theta =1\) is \(b^2\)
  7. The origin is a corner of a square and two of its sides are \(y + 2x = 0\) and \(y + 2x = 3\). Find the equation of the other two sides.

    1. A triangle is formed by lines \(x+ y= 6, 7x - y + 10 = 0\) and \(3x + 4y + 9 = 0\). Find the equation of the internal bisector of the angles between the first two sides.
    2. Find the equations of the internal bisectors of the angles of the triangle whose sides are \(4x - 3y + 2 = 0, 3x - 4y + 12 = 0\) and \(3x + 4y - 12 = 0\). Also find the incentre of the triangle.

Wednesday, January 24, 2024

Correlation and Regression

MEAN

Introduction

The data below gives marks obtained by 10 students taking exam on math and computer test.

Students A B C D E F G H I J
X Marks (Math) 15 18 3 24 9 6 6 15 12 12
Y Marks (Computer) 10 15 1 13 13 6 2 11 13 16

Is there a connection between the marks obtained by 10 students in Math and computer test? A starting point would be to plot the marks of both subjects in a scatter diagram.

Now calculating the means, we get
\(\bar{X}=\frac{120}{10}=12\)
\(\bar{Y}=\frac{100}{10}=10\)

Using these means, we divide the graph into four quadrants. It clearly shows that the bottom-right and top-left quadrants of the graph are largely vacant.
So there is a tendency for the points to run from bottom left to top right.
In this example, most of the points (1st and 3rd quadrants) give positive value of
\((x-\bar{X})(y-\bar{Y})\)

The problem is to find a way to measure how strong this tendency is. To answer this question, we proceed further.
Here
\(\frac{x-\bar{x}}{s_x}\)
gives normalized distance of each x from \(\bar{x}\) and makes it unit free.

Also
\(\frac{y-\bar{y}}{s_y}\)
gives normalized distance of each y from \(\bar{y}\) and makes it unit free.

So
\(\frac{1}{n} \displaystyle \sum \left (\frac{x-\bar{x}}{s_x} \right ) \left ( \frac{y-\bar{y}}{s_y} \right )\)
gives normalized product moment, which is the value of correlation.
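The normalized product moment above can be computed directly for the marks of the ten students. This is a minimal Python sketch; the variable names are chosen for this illustration, and the standard deviations divide by n to match the \(\frac{1}{n}\) in the formula.

```python
from math import sqrt

x = [15, 18, 3, 24, 9, 6, 6, 15, 12, 12]    # Math marks
y = [10, 15, 1, 13, 13, 6, 2, 11, 13, 16]   # Computer marks
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n

# Population standard deviations (divide by n, not n - 1)
s_x = sqrt(sum((v - mean_x) ** 2 for v in x) / n)
s_y = sqrt(sum((v - mean_y) ** 2 for v in y) / n)

# Average of the products of the standardized deviations
r = sum(((a - mean_x) / s_x) * ((b - mean_y) / s_y) for a, b in zip(x, y)) / n
print(mean_x, mean_y, round(r, 2))
```

A value of r well above zero agrees with the tendency of the points to run from bottom left to top right.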


The value of correlation (r) gives a measure of how close the points are to lying on a straight line.

  1. r=1 indicates that all the points lie exactly on a straight line with positive gradient
  2. r=-1 gives the same information with a line having negative gradient
  3. r=0 tells us that there is no linear connection between the two sets of data

The illustration shows that quantifying the relationship between variables is essential to benefit from the study of that relationship, called correlation. There are two basic methods of measuring correlation: the graphical method and the algebraic method.




Scatter Diagram: Graphic method

Scatter Diagram is graphic method of measurement of correlation. It is a diagrammatic representation of bivariate data to ascertain the relationship between two variables. Under this method the given data are plotted on a graph paper. Once the values are plotted on the graph it reveals the type of the correlation between variable X and Y. However please note that, the correlation is affected by each point.

[Figure: scatter diagrams showing Strong Positive Correlation, Low Positive Correlation, No Correlation, Low Negative Correlation, and Strong Negative Correlation]



Correlation

Correlation is a technique to measure the strength of association between two variables, say X and Y. The intensity (degree) of the correlation is expressed by a number, called the coefficient of correlation, and it is denoted by r. The value of correlation lies between -1 and 1 (inclusive).

  1. Coefficient of correlation is first introduced by Galton (1886)
  2. Formalized by Karl Pearson (1896)
  3. Developed/extended by Fisher (1935)
  4. The main idea is to compute an index (number) which reflects how much two variables are related to each other.
  5. If two variables are related such that both increase or both decrease, then the correlation is positive
  6. If increase in any one variable is associated with decrease in the other variable, the correlation is negative



Types of Correlation coefficient
There are two main types of correlation coefficients: Pearson's product moment correlation coefficient and Spearman's rank correlation coefficient. The correct usage of correlation coefficient type depends on the types of variables being used. However, the different types of correlation are given in the table below.
Variable type | Quantitative | Ordinal | Nominal
Quantitative | Pearson's | Biserial | Point Biserial
Ordinal | Biserial | Spearman rho | Rank Biserial
Nominal | Point Biserial | Rank Biserial | Phi



Interpretation of correlation coefficients

Generally, the coefficient of correlation is either positive or negative or zero. If the correlation is positive, then the variables are related such that both increase or both decrease. If the correlation is negative, then an increase in one variable is associated with a decrease in the other variable and vice versa. If the correlation is zero then the variables are not linearly related. In addition, the strength of the correlation is interpreted as follows.
A correlation coefficient whose magnitude \(|r|\) lies

between 0.8 and 1.0 is very high (perfect) correlation
between 0.6 and 0.8 is high correlation
between 0.4 and 0.6 is moderate correlation
between 0.2 and 0.4 is low correlation
between 0.0 and 0.2 is very low (no) correlation



Coefficient of determination

A correlation coefficient measures the linear relationship between two variables, but the amount of variation in one variable accounted for by the other is better measured by the square of the correlation coefficient, known as the “coefficient of determination”. It can be interpreted as the ratio of explained variance to total variance:
\(r^2 =\frac{\text{explained variance}}{\text{total variance}}\)
Similarly, Coefficient of non-determination is
\(1-r^2 \)
Thus
The square of the correlation coefficient is called the coefficient of determination. If r is obtained from two variables \(X\) and \(Y\), then \(r^2\) is the fraction of variation in \(Y\) that is explained by \(X\).

For example, if the correlation between “Math score” and “Anxiety” is \(r=-0.4\), then \(r^2=0.16\). It means 16% of the variability in Math score and Anxiety “overlaps”, with the two moving in opposite directions.
A coefficient of determination of 0.16 can be interpreted as follows: the variation in Math score explains 16% of the variation in Anxiety score. The remaining 84% represents the variation in Anxiety score explained by other variables not included in the model.




Properties of coefficient of Correlation

As correlation measure the strength of association between two variables, the major properties of such correlation coefficients can be summarized into following bullets:

  1. The correlation coefficient lies between \(-1\) and \(+1\)
  2. Independent from unit of measurement
  3. Independent of origin and scale
  4. Symmetrical i.e., \(r_{xy} = r_{yx}\)



Limitation of Correlation

A key thing to remember is that correlation does not imply that a change in one variable causes a change in another. Sales of personal computers and athletic shoes have both risen strongly over the years and there is a high correlation between them, but it cannot be assumed that buying computers causes people to buy athletic shoes (or vice versa).

The second caution is that the Pearson correlation technique works best with linear relationships: as one variable gets larger, the other gets larger (or smaller) in direct proportion. It does not work well with curvilinear relationships (in which the relationship does not follow a straight line). An example of a curvilinear relationship is age and health care. They are related, but the relationship doesn't follow a straight line. Young children and older people both tend to use much more health care than teenagers or young adults.

  1. r is a measure of linear relationship only. There may be an exact connection between X and Y, but if it is not a straight line, r may not reveal it.
  2. Correlation does not imply causality. A survey may, for example, find a strong correlation between left-footedness and skill in mental mathematics without one causing the other.
  3. An unusual (freak) result may have a strong effect on the value of r.

Pearson's product moment correlation:Algebraic Method

Karl Pearson’s method of calculating coefficient of correlation is based on the covariance of the two variables in a series. This method is widely used in practice and the coefficient of correlation is denoted by the symbol \(r\). It is used when both variables being studied are normally distributed and Quantitative in scale . For a correlation between variables \(X\) and \(Y\) the formula for calculating the Pearson's correlation coefficient is given by

Variance method: \( r=\frac{Cov(X,Y)}{\sigma_x \sigma_y}\) where \(Cov(X,Y) =\frac{1}{n} \sum (x-\bar{x})(y-\bar{y})\)

Deviation method: \( r=\frac{ \sum xy}{\sqrt{\sum x^2} \sqrt{\sum y^2}}\) where \(x=X-\bar{X}\) and \(y=Y-\bar{Y}\)

Raw method: \( r=\frac{n \sum XY-\sum X \sum Y}{\sqrt{n\sum X^2-(\sum X)^2} \sqrt{n\sum Y^2-(\sum Y)^2}}\)




Example 1
Calculate the correlation coefficient of the marks in Mathematics and Statistics for eight students as below.
Marks in Math (X)67 68 65 68 72 72 69 71
Marks in Stat (Y)65 66 67 67 68 69 70 72
Solution
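Since the correlation coefficient is independent of change of origin, we subtract \(65\) from each value of X and Y to simplify the calculation. In the table below, the third and fourth columns hold these shifted values, and the sums and the formula that follow use them as X and Y.
X Y X-65 Y-65 \(X^2\) \(Y^2\) XY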
67 65 2 0 4 0 0
68 66 3 1 9 1 3
65 67 0 2 0 4 0
68 67 3 2 9 4 6
72 68 7 3 49 9 21
72 69 7 4 49 16 28
69 70 4 5 16 25 20
71 72 6 7 36 49 42
\(\sum X=32\) \(\sum Y=24\) \(\sum X^2=172\) \(\sum Y^2=108\) \(\sum XY=120\)

Now, using the formula, the correlation coefficient is
\( r=\frac{N \sum XY-\sum X \sum Y}{\sqrt{N\sum X^2-(\sum X)^2} \sqrt{N\sum Y^2-(\sum Y)^2}}\)
or \( r=\frac{8 \times 120-32 \times 24}{\sqrt{8 \times 172-(32)^2} \sqrt{8 \times 108-(24)^2}}=0.60\)
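The calculation of Example 1 can be checked with a short Python sketch. The function names `pearson_raw` and `pearson_deviation` are hypothetical; they implement the raw method and the deviation method, and the script also confirms that subtracting 65 from every mark leaves r unchanged.

```python
from math import sqrt

X = [67, 68, 65, 68, 72, 72, 69, 71]   # Marks in Math
Y = [65, 66, 67, 67, 68, 69, 70, 72]   # Marks in Stat

def pearson_raw(xs, ys):
    """Raw method: uses sums of X, Y, X^2, Y^2 and XY."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(a * b for a, b in zip(xs, ys))
    sxx = sum(a * a for a in xs)
    syy = sum(b * b for b in ys)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))

def pearson_deviation(xs, ys):
    """Deviation method: uses deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [a - mx for a in xs]
    dy = [b - my for b in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    return num / (sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy)))

r_original = pearson_raw(X, Y)
r_shifted = pearson_raw([a - 65 for a in X], [b - 65 for b in Y])
r_dev = pearson_deviation(X, Y)
print(round(r_original, 2), round(r_shifted, 2), round(r_dev, 2))
```

All three values agree, which illustrates that the formulas are equivalent and that r is independent of the origin.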




Spearman’s Rank Correlation

When quantification of variables is difficult, as with beauty, leadership ability, or knowledge of a person, the method of rank correlation is useful. It was developed by the British psychologist Charles Edward Spearman in 1904. In this method ranks are allotted to each element either in ascending or descending order. The correlation coefficient between these two series of allotted ranks is popularly called “Spearman’s Rank Correlation” and denoted by \(\rho\). It is appropriate when one or both variables are skewed or ordinal in scale. For a correlation between variables \(X\) and \(Y\) the formula for calculating Spearman's rho correlation coefficient is given by
\(\rho=1-\frac{6\displaystyle \sum_{i=1}^n d_i^2}{n(n^2-1)}\)
where
\(d_i\)= the difference between the ranks of corresponding variables
\(n=\) number of observations

NOTE
If there are tied ranks, each tied item gets the mean of the ranks the tied items would have had if they were not tied. In this case, we use the formula below.
\(\rho=1-\frac{6 \left [\displaystyle \sum_{i=1}^n d_i^2+ \sum_k \frac{m_k(m_k^2-1)}{12}\right ]}{n(n^2-1)}\)
where
\(m_k=\) number of items sharing the \(k\)-th tied rank
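The treatment of tied ranks can be sketched in Python as follows. The data and the function names are hypothetical, and the sketch assumes the standard correction term \(\frac{m(m^2-1)}{12}\) for each group of m tied values, added once for every tie in either series.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = largest value; tied values share the mean of their ranks."""
    order = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = order.index(v) + 1      # first (1-based) position of v
        count = order.count(v)          # how many times v occurs
        ranks.append(first + (count - 1) / 2)
    return ranks

def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every group of m equal values."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values())

def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (d2 + correction) / (n * (n * n - 1))

xs = [10, 20, 20, 30]   # hypothetical data: the value 20 is repeated
ys = [1, 2, 3, 4]
print(average_ranks(xs), spearman_with_ties(xs, ys))
```

Here the two items with value 20 would occupy ranks 2 and 3, so each receives the mean rank 2.5.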




Proof of \(\rho=1-\frac{6\displaystyle \sum_{i=1}^n d_i^2}{n(n^2-1)}\)

Consider a bivariate sample \(x_i,y_i\) for \(i=1,2, \cdots,n\), then, \(x_i\) and \(y_i\) get ranks, each is a permutation of the same sequence of numbers \(1,2,3,\cdots,n\)
Thus
\(\bar{x}=\frac{\displaystyle \sum_{i=1}^n i}{n}\)
or\(\bar{x}= \frac{1+2+\cdots+n}{n}\)
or\(\bar{x}=\frac{n+1}{2}\)

Similarly,
\(s_x^2=\frac{1}{n} \displaystyle \sum_{i=1}^n (i^2)-(\bar{x})^2\)

or\(s_x^2=\frac{1}{n}\frac{n(n+1)(2n+1)}{6}-(\frac{n+1}{2})^2\)

or\(s_x^2=\frac{(n+1)(2n+1)}{6}-(\frac{n+1}{2})^2\)

or\(s_x^2=(\frac{n+1}{2}) \left [\frac{(2n+1)}{3}-\frac{n+1}{2} \right ]\)

or\(s_x^2=(\frac{n+1}{2})[\frac{n-1}{6}] \)

or\(s_x^2=\frac{n^2-1}{12}\)

So, we have
\(\bar{x}=\bar{y}=\frac{n+1}{2}\)
\(s_x^2=s_y^2=\frac{n^2-1}{12}\)
Next, since \(\bar{x}=\bar{y}\), the rank difference can be written as
\(d_i= x_i-y_i=(x_i-\bar{x})-(y_i-\bar{y})\)
Therefore
\(\displaystyle \frac{1}{n} \sum_{i=1}^n d_i^2= \frac{1}{n} \sum_{i=1}^n [(x_i-\bar{x})-(y_i-\bar{y})]^2\)

or\(\displaystyle \frac{1}{n}\sum_{i=1}^n d_i^2= s_x^2+s_y^2-2.r. s_xs_y\)

or\(\displaystyle \frac{1}{n}\sum_{i=1}^n d_i^2= 2s_x^2-2.r. s_x^2\)

or\(\displaystyle \frac{1}{n}\sum_{i=1}^n d_i^2= 2s_x^2(1-r)\)

or\(\displaystyle \frac{1}{n}\sum_{i=1}^n d_i^2= 2 \frac{n^2-1}{12}. (1-r)\)

or\(\displaystyle \frac{1}{n}\sum_{i=1}^n d_i^2= \frac{n^2-1}{6}. (1-r)\)

or\( \frac{ \displaystyle 6\sum_{i=1}^n d_i^2}{n(n^2-1)}= (1-r)\)

or \(r=1- \frac{ \displaystyle 6\sum_{i=1}^n d_i^2}{n(n^2-1)}\), where \(r\) is the correlation coefficient of the ranks, that is, \(\rho\).
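The result of the proof can be checked numerically: for untied ranks, the ordinary Pearson coefficient computed on the ranks equals the value given by the rank-difference formula. The ranks below are hypothetical, and the function names are chosen for this sketch.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient using deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sqrt(sum((a - mx) ** 2 for a in xs)) * sqrt(sum((b - my) ** 2 for b in ys))
    return num / den

def spearman_formula(rx, ry):
    """1 - 6 * sum(d^2) / (n(n^2 - 1)), valid when there are no ties."""
    n = len(rx)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

rank_x = [1, 2, 3, 4, 5]   # hypothetical ranks, each a permutation of 1..n
rank_y = [2, 1, 4, 3, 5]
n = len(rank_x)

# Variance of the ranks, which the proof shows to be (n^2 - 1)/12
var_x = sum(v * v for v in rank_x) / n - (sum(rank_x) / n) ** 2

print(pearson(rank_x, rank_y), spearman_formula(rank_x, rank_y), var_x)
```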




Example 2
Calculate the rank correlation between the scores in Mathematics and IQ for seven students as below.
Score in Math (X)52 51 53 55 54 56 57
Score in IQ (Y)61 63 62 64 67 65 66

Based on the data given above, we can simplify the values by subtracting \(50\) from each value in \(X\) and subtracting \(60\) from each value in \(Y\). This does not change the ranks.
Now, the table of calculation is given below.

X Y X-50 Y-60 Rank of X Rank of Y d = Rx-Ry Square of Rank difference: \(d^2\)
52 61 2 1 6 7 -1 1
51 63 1 3 7 5 2 4
53 62 3 2 5 6 -1 1
55 64 5 4 3 4 -1 1
54 67 4 7 4 1 3 9
56 65 6 5 2 3 -1 1
57 66 7 6 1 2 -1 1
\(\sum d^2=18\)

Now, using the formula, the correlation coefficient is
\(\rho=1-\frac{6\displaystyle \sum_{i=1}^n d_i^2}{n(n^2-1)}\)
or \(\rho=1-\frac{6 \times 18}{7(7^2-1)}=0.68\)
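The steps of Example 2 can be reproduced with a short Python sketch. The helper `ranks_descending` is a hypothetical name; it gives rank 1 to the largest score and assumes there are no ties.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value (assumes no tied values)."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

math_scores = [52, 51, 53, 55, 54, 56, 57]
iq_scores = [61, 63, 62, 64, 67, 65, 66]

rx = ranks_descending(math_scores)
ry = ranks_descending(iq_scores)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # sum of squared rank differences
n = len(rx)
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(rx, ry, d2, round(rho, 2))
```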




Regression equations of two variables

Regression analysis is a statistical tool to estimate (or predict) the unknown values of dependent variable from the known values of independent variable.
The variable that forms the basis for predicting another variable is known as the Independent Variable and the variable that is predicted is known as dependent variable.
For Example,
in \(Y=a+bX\)
one can obtain value of \(Y\) by putting the value of \(X\)
So,
X is called independent variable
Y is called dependent variable
Therefore
Regression is a technique to measure “dependence of one variable upon other variable”.




Regression Equation

Let ‘X’ be an independent variable and ‘Y’ be a dependent variable, then the regression equation of Y on X is
\(Y=a +b X\) (1)
where ‘a’ and ‘b’ are constants, where
\(a\) represents the y-intercept
\(b\) represents the slope of the line
To better understand it, just compare \(Y=b X+a\) with \(Y=m X+c\), then, it can be said that
\(a=c\) represents the y-intercept
\(b=m\) represents the slope of the line
To compute the values of these constant ‘a’ and ‘b’, corresponding normal equations are given below
Normal Equation for ‘a’ is [Taking sum of both sides, we get]
\( \sum Y=na+b \sum X \)(2)
Normal Equation for ‘b’ is [Multiply equation (1) by X and take sum of both sides we get] \( \sum XY=a \sum X+ b \sum X^2\) (3)
Solving (2) and (3) to find ‘a’ and ‘b’ , we get
\( b =\frac{\sum XY -\frac{\sum X \sum Y}{n}}{\sum X^2-\frac{(\sum X)^2}{n}}\)

or\( b =\frac{\sum XY -n \bar{X}\bar{Y}}{\sum X^2-n \bar{X}^2} \)

And
\( a =\bar{Y}-b \bar{X}\)
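Solving the two normal equations can be sketched in Python as below. The function name `fit_line` and the data are hypothetical; the slope is computed first and the intercept then follows from \(a=\bar{Y}-b\bar{X}\).

```python
def fit_line(xs, ys):
    """Return (a, b) of Y = a + bX from the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)   # slope
    a = sy / n - b * sx / n                         # intercept: mean(Y) - b * mean(X)
    return a, b

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 2))

# Both normal equations should hold for the fitted constants
n = len(xs)
lhs2, rhs2 = sum(ys), n * a + b * sum(xs)
lhs3 = sum(x * y for x, y in zip(xs, ys))
rhs3 = a * sum(xs) + b * sum(x * x for x in xs)
print(abs(lhs2 - rhs2) < 1e-9, abs(lhs3 - rhs3) < 1e-9)
```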




The Method of Least Square
Let us consider a data set given as
X 4 8 12
Y 6 1 6
Let us plot these data in a graph, and try to estimate a best fit line. The two possibilities are given below.
From the two possibilities of goodness of fit of an estimating line, summing the errors of each estimation, we get

First Graph
8-6=2
1-5=-4
6-4=2
Error=0

Second Graph
8-2=6
1-5=-4
6-8=-2
Error=0

It shows that summing the individual differences is NOT a reliable way to judge the goodness of fit of an estimating line, because positive and negative errors cancel each other.
Therefore
we next use the absolute value of each error to judge which estimating line fits best, in a new example. The result is as follows.

From the two possibilities of goodness of fit of an estimating line, summing the absolute errors of each estimation, we get

First Graph
|4-4|=0
|7-3|=4
|2-2|=0
Error=4

Second Graph
|4-5|=1
|7-4|=3
|2-3|=1
Error=5

It shows that summing the absolute values of the individual differences is also NOT a reliable way to judge the goodness of fit of an estimating line, because
According to the data, Graph 2 has the best fit
The sum of absolute values indicates that Graph 1 has the best fit
Therefore
we next sum the square of each error to judge which estimating line fits best. This is called the least square method. The result is as follows.

First Graph
\((4-4)^2=0\)
\((7-3)^2=16\)
\((2-2)^2=0\)
Error=16

Second Graph
\((4-5)^2=1\)
\((7-4)^2=9\)
\((2-3)^2=1\)
Error=11

It shows that summing the squares of the individual differences is the most reliable of the three ways to judge the goodness of fit of an estimating line, because
According to the data, Graph 2 has the best fit
The sum of squares also indicates that Graph 2 has the best fit
Therefore
The Least Square Method is the best way to judge the goodness of fit of an estimating line.
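The three ways of totalling the errors can be compared side by side in Python. The actual and estimated values are the ones appearing in the error calculations above (each difference is actual minus estimated); the function name `error_sums` is an assumption made for this sketch.

```python
def error_sums(actual, predicted):
    """Total the errors (actual - predicted) in three different ways."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return {
        "sum": sum(errors),                       # signed errors can cancel
        "absolute": sum(abs(e) for e in errors),  # sum of absolute errors
        "squared": sum(e * e for e in errors),    # least square criterion
    }

# First illustration: the signed errors cancel for both candidate lines
first_a = error_sums([8, 1, 6], [6, 5, 4])
first_b = error_sums([8, 1, 6], [2, 5, 8])

# Second illustration: absolute and squared errors rank the lines differently
graph_1 = error_sums([4, 7, 2], [4, 3, 2])
graph_2 = error_sums([4, 7, 2], [5, 4, 3])

print(first_a["sum"], first_b["sum"])
print(graph_1["absolute"], graph_2["absolute"])
print(graph_1["squared"], graph_2["squared"])
```

Squaring penalizes the single large miss of the first graph much more heavily than the three small misses of the second graph, which is why the two criteria disagree.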




Coefficient of Regression equation

Let ‘X’ be an independent variable and ‘Y’ be a dependent variable, then the regression equation of Y on X is
\(Y=a +b X\) (1)
The quantity \(a\) in the regression equation (1) is called the y-intercept or origin (threshold) coefficient of Y. Here, \(a\) is the average value of Y when X is zero.
Next, the quantity \(b\) in the regression equation (1) is called the slope coefficient.
Since there are two regression equations,
Y on X, given as \(Y=a+bX\)
X on Y, given as \(X=c+dY\)
Therefore, we have two regression coefficients.
Regression Coefficient of X on Y, symbolically written as \(b_{xy}\)
Regression Coefficient of Y on X, symbolically written as \(b_{yx}\)
Which can be summarized as below

  1. \( Y= a + b_{yx} X \)
  2. \( Y-\bar{Y}=b_{yx} (X-\bar{X})\)
  3. \( b_{yx} = \frac{Cov(X,Y)}{V(X)} \)
  4. \( b_{yx} = r \times \frac{s_y}{s_x} \)
  5. \( r = \pm \sqrt{b_{xy} \times b_{yx} } \)



Properties of Regression Coefficients
The major properties of regression coefficients can be summarized into following bullets:
  1. Regression coefficients are independent of the changes of origin but not of scale.
  2. If one of the regression coefficients is greater than unity, the other must be less than unity.
  3. Both regression coefficient have same sign with respect to the correlation coefficient.
  4. Each regression line passes through the point of means \((\bar{X}, \bar{Y})\)
  5. The two regression lines intersect at the point of means \((\bar{X}, \bar{Y})\)
  6. The two regression lines coincide if \( r=\pm 1\)
  7. The two regression lines are perpendicular if \( r=0\)
  8. The arithmetic mean of the regression coefficients is greater than or equal to the correlation coefficient r, provided that r > 0.
    \( \frac{ b_{xy} + b_{yx} }{2} \ge r \)
  9. The geometric mean of the regression coefficients is the magnitude of the correlation coefficient, and r takes the common sign of the two coefficients
    i.e. \( r = \pm \sqrt{b_{xy} \times b_{yx} } \)
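Some of these properties can be checked numerically. The sketch below reuses the marks of the eight students from Example 1 and computes both regression coefficients and r from the same sums of deviations; the variable names are chosen for this illustration.

```python
from math import sqrt

X = [67, 68, 65, 68, 72, 72, 69, 71]   # Marks in Math
Y = [65, 66, 67, 67, 68, 69, 70, 72]   # Marks in Stat
n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
sxx = sum((x - mean_x) ** 2 for x in X)
syy = sum((y - mean_y) ** 2 for y in Y)

b_yx = sxy / sxx            # regression coefficient of Y on X
b_xy = sxy / syy            # regression coefficient of X on Y
r = sxy / sqrt(sxx * syy)   # correlation coefficient

print(round(b_yx, 3), round(b_xy, 3), round(r, 3))
print(abs(b_yx * b_xy - r * r) < 1e-12)   # geometric mean property
print((b_yx + b_xy) / 2 >= r)             # arithmetic mean property
```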



Difference Between Correlation and Regression
Below mentioned are a few key differences between these two aspects.
Correlation | Regression
‘Correlation’ determines the interconnection or co-relationship between the variables. | ‘Regression’ explains how an independent variable is numerically associated with the dependent variable.
No distinction is made between the independent and the dependent variable. | The dependent and the independent variable play different roles.
The primary objective is to find a quantitative/numerical value expressing the association between the values. | The primary intent is to estimate the values of a variable based on the values of the fixed variable.
Correlation stipulates the degree to which both of the variables can move together. | Regression specifies the effect of a unit change in the known variable (X) on the evaluated variable (Y).
Correlation helps in establishing the connection between the two variables. | Regression helps in estimating a variable’s value based on another given value.
Example 3
Calculate the regression equation of Score in IQ on Score in Math from the following data.
Score in Math (X)8 10 9 12 10 11
Score in IQ (Y)2 2 3 5 5 6

Solution
Assuming \(‘X’\) as a independent variable and \(‘Y’\) as dependent variable, the regression equation of \(Y\) on \(X\) is
\(Y=a +b X\), where \(‘a’\) and \(‘b’\) are constants
To compute the values of these constant \(‘a’\) and \(‘b’\), the corresponding normal equations are
\(\sum Y=na+b \sum X\) (1)
\(\sum XY= a \sum X +b \sum X^2\) (2)
Based on the data given above, the table of calculation is given below.

X Y \(X^2\) \(Y^2\) XY
8 2 64 4 16
10 2 100 4 20
9 3 81 9 27
12 5 144 25 60
10 5 100 25 50
11 6 121 36 66
\(\sum X=60\) \(\sum Y=23\) \(\sum X^2=610\) \(\sum Y^2=103\) \(\sum XY=239\)

Based on the table of calculation, with \(n=6\), the normal equations are
\(23 = 6 a + 60b\) (3)
\(239 = 60 a + 610 b\) (4)
Solving the two equations (3) and (4), we get
\(a = -5.17, b = 0.90\)
Hence, the regression equation of \(Y\) on \(X\) is
\(Y=-5.17 +0.90 X\)
Next, assuming \(‘Y’\) as an independent variable and \(‘X’\) as dependent variable, the regression equation of \(X\) on \(Y\) is
\(X=c +d Y\), where \(c\) and \(d\) are constants
To compute the values of these constants \(c\) and \(d\), the corresponding normal equations are
\(\sum X=nc+d \sum Y\) (5)
\(\sum XY= c \sum Y +d \sum Y^2\) (6)
Based on the table of calculation, the normal equations are
\(60 = 6 c + 23 d\) (7)
\(239= 23 c + 103d \) (8)
Solving the two equations (7) and (8), we get
\(c =7.67, d=0.61\)
Hence, the regression equation of \(X\) on \(Y\) is
\(X=7.67 +0.61 Y\)
NOTE
The regression equations of X on Y and Y on X do NOT necessarily estimate the same value, as happens with a linear equation. For example,

Consider a linear equation
\(2x+3y=12\)
Here, the equation for x is
\(x=\frac{12-3y}{2}\)
If we put \(y=2\) then we get \(x=3\)
Next, the equation for y is
\(y=\frac{12-2x}{3}\)
If we put \(x=3\) then we get \(y=2\)
Here, we get the same pair of values.

But this may not happen with a pair of regression equations.
Based on the example given above, the regression equation for x is
\(X=7.67 +0.61 Y\)
If we put \(y=2\) then we get \(x=8.89\)
Next, based on the example given above, the regression equation for y is
\(Y=-5.17 +0.90 X\)
If we put \(x=8.89\) then we get \(y=2.83\)
Here, we get different pair of values.

Therefore, regression equation of X on Y and Y on X do not necessarily estimate same value
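The point of this note can be checked with a short Python sketch that fits both regression lines to the six data pairs of Example 3 (so n = 6) and then maps y = 2 to an estimate of x and back again. The function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least square line ys = intercept + slope * xs."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    slope = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)
    intercept = sy / n - slope * sx / n
    return intercept, slope

X = [8, 10, 9, 12, 10, 11]   # Score in Math
Y = [2, 2, 3, 5, 5, 6]       # Score in IQ

a, b = fit_line(X, Y)   # Y on X:  Y = a + bX
c, d = fit_line(Y, X)   # X on Y:  X = c + dY

x_hat = c + d * 2        # estimate X from Y = 2
y_back = a + b * x_hat   # estimate Y from that X
print(round(a, 2), round(b, 2), round(c, 2), round(d, 2))
print(round(x_hat, 2), round(y_back, 2))
```

The value returned for y differs from the starting value 2, because each regression line minimizes errors in a different direction and the two lines are not inverses of each other unless \(r=\pm 1\).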


Exercise

  1. Coefficient of correlation between X and Y is 0.3. Their covariance is 9. The variance of X is 16. Find the standard deviation of Y series.
  2. Find the two regression equation of X on Y and Y on X from the following data:
    X : 10 12 16 11 15 14 20 22
    Y : 15 18 23 14 20 17 25 28
  3. The data below gives marks obtained by 10 students taking exam on math and computer test.
    Students A B C D E F G H I J
    X Marks (Math) 15 18 3 24 9 6 6 15 12 12
    Y Marks (Computer) 10 15 1 13 13 6 2 11 13 16
    Is there a connection between the marks obtained by 10 students in Math and computer test?
  4. Suppose you have calculated two months of attendance of five randomly selected students. Their unit test results are presented in a table, which is given below:
    X 10 20 30 40 50
    Y 20 30 40 30 50
    Calculate the Pearson's correlation coefficient of the above data and interpret the correlation. What conclusion do you draw from the correlation coefficient?
  5. In a laboratory experiment on correlation research study, the equations of the two regression lines were found to be 2X-Y+1=0 and 3X-2Y+7=0. Find the means of X and Y. Also work out the values of the regression coefficients and the coefficient of correlation between the two variables X and Y. Given variance of X=9 find the standard deviation of Y.
  6. The coefficient of rank correlation of the marks obtained by 10 students in statistics and accountancy was found to be 0.8. It was later discovered that the difference in ranks in the two subjects obtained by one of the students was wrongly taken as 7 instead of 9. Find the correct coefficient of rank correlation.
  7. Find the correlation between X and Y.
    X Good Excellent Good Excellent Excellent Excellent
    Y Poor Good Poor Excellent Very Good Good
  8. Imagine you are a secondary level teacher conducting research on the correlation between students' self-reported study habits and their academic achievement. Determine whether you would use Pearson's correlation or Spearman's rank correlation for this study, and justify your choice. Discuss potential benefits and challenges associated with your chosen method, emphasizing how the results could inform teaching strategies.
