$\begingroup$

I have received this confusing task: You have two variables 𝑥 and 𝑦, where 𝑦 is a response variable that can be written as an explicit linear function of 𝑥. However, the technique used for measuring 𝑥 is twice as precise as the one used for measuring 𝑦 in terms of error variance, i.e. the variance of the error in 𝑥 is half the variance of the error in 𝑦. The task is to model 𝑦 as a function of 𝑥.

How should I go about this? When I fit a basic linear regression to the data, this was the result:

enter image description here

When I removed the outliers, this was the result:

enter image description here

The second fit looks pretty good, but I think I am missing something: this is supposedly data that has "measurement error". What should I do instead?

$\endgroup$
  • $\begingroup$ en.wikipedia.org/wiki/Deming_regression ? $\endgroup$ Commented Dec 3, 2022 at 18:29
  • $\begingroup$ @seanv507 I thought about it but it gives me a very similar regression line... $\endgroup$ Commented Dec 3, 2022 at 18:37
  • $\begingroup$ This looks like two separate problems: (1) what to do about possible outliers; (2) how to perform regression that takes into account variance in both x and y, once you decide which are the valid observations. Don't confuse those problems. $\endgroup$ Commented Dec 3, 2022 at 18:57
  • $\begingroup$ @EdM For the first problem, I removed the 95% outliers, which gave the result you saw above, so I thought this was pretty much resolved. What about the second one? I don't know exactly how to go about it. $\endgroup$ Commented Dec 3, 2022 at 19:02
  • $\begingroup$ Good search keywords include "errors in variables regression" and "Deming regression." "Least angle regression" might be useful, too. $\endgroup$ Commented Dec 3, 2022 at 20:23
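
For reference, the Deming regression suggested in the comments can be sketched in a few lines. This is a minimal illustration, not a vetted implementation: the function name `deming_fit` and its `delta` parameter are made up for this example, and it assumes `delta` is the ratio of the error variance in y to the error variance in x, which the task statement puts at 2. It uses the standard closed-form slope estimate and does nothing about outliers, which is a separate decision.

```python
import math


def deming_fit(x, y, delta=2.0):
    """Fit y = intercept + slope * x by Deming regression.

    delta is assumed to be var(error in y) / var(error in x).
    The task says the x error variance is half the y error variance,
    so delta = 2 here. (Hypothetical helper, for illustration only.)
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Sample variances and covariance of the observed values.
    s_xx = sum((xi - x_mean) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - y_mean) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)

    if s_xy == 0:
        raise ValueError("sample covariance is zero; slope is undefined")

    # Closed-form Deming slope; delta = 1 would give orthogonal regression.
    diff = s_yy - delta * s_xx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * s_xy ** 2)) / (2 * s_xy)
    intercept = y_mean - slope * x_mean
    return slope, intercept
```

Calling `deming_fit(x, y, delta=2.0)` on the cleaned observations returns the slope and intercept. When the measurement noise is small compared with the spread of x, the resulting line will be close to the ordinary least squares line, so a "very similar regression line" is not by itself a sign that something went wrong.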
