
I have one table that is a list of parts that are going to be unavailable and I want to find replacements for each one of the parts. I made another table of potential replacement parts that I want to compare with the first table and find the best matching alternative.

To demonstrate my issue in a much smaller scale, I threw together the following:

End of Life Products

The above table is a list of end of life parts I want to replace as well as their respective attributes. (Note, I did not look up this data, so forgive me if it's absurd)

Alternative Products

Then the next table is a list of alternative products with the same attribute columns.

Is there a way in Excel to compare all the specs of each orange to the specs of each apple and select the most similar apple for each orange (and thus fill in the "Similar Apple" column on the orange table)? I do not need exact matches for any of the numbers, but I want to effectively score the data in each column of each apple against each orange and choose the apple with the highest score. Preferably with weighting, so that some attributes (columns) count for more than others.

I'm not too familiar with VB but I'm open to using it if I need to. If Excel can't do this, is there an alternative route that anyone would recommend?

Most solutions I found online were rather simple VLOOKUP() or INDEX() formulas which would search for a specific value in a single column or search for part of a string rather than the closest number value in multiple columns.

  • You should have a look at the Excel Solver add-in
    – Black cat
    Commented Apr 24 at 15:33
  • You need to think about how you are going to measure "similarity". First think in terms of just having one attribute to compare (rather than the five different ones) and maybe measure "difference" rather than "similarity". Once you have a similarity/difference measure for each attribute you will then need to think about how to combine these measures to get an overall similarity/difference measure between each orange and each of the possible replacement apples. Excel is the easy part, the hard part is devising appropriate mathematical rules to measure and combine similarity/difference.
    – DMM
    Commented Apr 24 at 16:27
  • @Blackcat Excel Solver looks like a way to solve for parameters to meet a certain outcome, if I'm understanding it right. In my case, I need my parameters to stay constant and I want to compare many possible solutions
    – jkHeat
    Commented Apr 24 at 16:54
  • Solver parameters are not fixed. They are formulas (created by you) based on the table values weighted in a way. You calculate a reachable value using the same formula for a type of orange. Solver can return the appropriate apple which has the minimum difference in the weights targeting the orange's value. But this is not a simple task.
    – Black cat
    Commented Apr 24 at 17:50
  • You are right that it can be solved with formulas alone, but that does not seem simple to create either.
    – Black cat
    Commented Apr 24 at 18:01
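
As a rough illustration of DMM's suggestion above (measure a difference per attribute, then combine), here is a minimal VBA sketch. The function name WeightedDiff and its array arguments are hypothetical and not part of the original workbook, and the numbers in the demo are made up. It covers only the numeric attributes and assumes the three arrays have the same size; a lower result means a closer match.

```vba
'Hypothetical helper: weighted sum of absolute differences between two parts.
'target, candidate and weights are 1-D arrays of equal size,
'one entry per numeric attribute.
'A smaller result means the candidate is closer to the target.
Function WeightedDiff(target As Variant, candidate As Variant, weights As Variant) As Double
    Dim i As Long
    Dim total As Double
    total = 0
    For i = LBound(target) To UBound(target)
        total = total + weights(i) * Abs(candidate(i) - target(i))
    Next i
    WeightedDiff = total
End Function

'Example call with made-up values in the order hue, sweetness, weight, price
Sub DemoWeightedDiff()
    Dim d As Double
    d = WeightedDiff(Array(30, 7, 150, 1.2), _
                     Array(10, 6, 170, 1), _
                     Array(0.2, 0.3, 0.3, 0.2))
    Debug.Print "Weighted difference: " & d
End Sub
```

With raw values like these, attributes measured in large units (weight in grams, say) contribute far more to the total than attributes measured in small units (price), which is why the choice of measure and weights is the hard part.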

1 Answer


OK, I decided to take the time to learn some VBA, as I realized the algorithm wouldn't be too bad if I could figure out how to implement it in Excel.

I'm still open to any feedback, as I'm only guessing that the math is right.

Here's what I decided to write:

Sub FindApples()

'Weights for calculating scores
Const wgtShape As Double = 0.5
Const wgtHue As Double = 0.1
Const wgtSweet As Double = 0.2
Const wgtWeight As Double = 0.1
Const wgtPrice As Double = 0.1

'Set up tables
Dim oranges As ListObject
Set oranges = Sheets("Sheet1").ListObjects("Table1")
Dim apples As ListObject
Set apples = Sheets("Sheet1").ListObjects("Table2")

'Loop variables, declared explicitly
Dim orange As Range
Dim apple As Range

'Step through orange table
For Each orange In oranges.DataBodyRange.Rows

    'Collect Orange row data
    Dim shpe As String
    Dim hue As Double
    Dim sweet As Double
    Dim weight As Double
    Dim price As Double
    shpe = orange.Cells(1, 2).Text
    hue = orange.Cells(1, 3).Value
    sweet = orange.Cells(1, 4).Value
    weight = orange.Cells(1, 5).Value
    price = orange.Cells(1, 6).Value
    Debug.Print (orange.Cells(1, 1).Text)
    Debug.Print ("- shape:  " & shpe)
    Debug.Print ("- hue:    " & hue)
    Debug.Print ("- sweet:  " & sweet)
    Debug.Print ("- weight: " & weight)
    Debug.Print ("- price:  " & price)
    
    'Set up scoring
    Dim highScore As Double
    Dim matchName As String
    highScore = 0
    matchName = "NA"
    
    'Step through apple table
    For Each apple In apples.DataBodyRange.Rows
    
        'Collect Apple row data
        Dim appShape As String
        Dim appHue As Double
        Dim appSweet As Double
        Dim appWeight As Double
        Dim appPrice As Double
        appShape = apple.Cells(1, 2).Text
        appHue = apple.Cells(1, 3).Value
        appSweet = apple.Cells(1, 4).Value
        appWeight = apple.Cells(1, 5).Value
        appPrice = apple.Cells(1, 6).Value
        
        Dim score As Double
        score = 0
        
        'Calculate and sum up the scores
        If shpe = appShape Then
            score = score + 100 * wgtShape
        End If
        If hue > 0 Then
            score = score + PercentDiff(appHue, hue) * wgtHue
        End If
        If sweet > 0 Then
            score = score + PercentDiff(appSweet, sweet) * wgtSweet
        End If
        If weight > 0 Then
            score = score + PercentDiff(appWeight, weight) * wgtWeight
        End If
        If price > 0 Then
            score = score + PercentDiff(appPrice, price) * wgtPrice
        End If
        
        'Update score if a new high score is reached
        If score > highScore Then
            highScore = score
            matchName = apple.Cells(1, 1).Value
        End If
        
        Debug.Print ("|- " & apple.Cells(1, 1).Value & " | " & score)
    Next apple
    
    If matchName <> "" Then
        orange.Cells(1, 7).Value = matchName
    End If
    
Next orange

End Sub

'std: standard value to compare against
'comp: compared value
Function PercentDiff(std As Double, comp As Double) As Double
    PercentDiff = (WorksheetFunction.Min(std, comp) / WorksheetFunction.Max(std, comp)) * 100
End Function
  • The percent similarity measure (misnomered as PercentDiff) is unconventional from the perspective of statistical analysis. (a,o)=(1,2) has same similarity as (a,o)=(80,40) but both are very dissimilar to (39,40). Conventionally difference rather than similarity measures are used. Measures such as Abs(a-o) and (a-o)^2 overcome the problems of positives and negatives cancelling in the weighted average and, of course, with difference measures the best substitute is the one that minimises the measure.
    – DMM
    Commented Apr 27 at 0:30
  • Other than the unconventional measure used, the code looks fine (though I have not attempted to test). It also suggests that you are solving 9 independent problems: one for each orange. This has two implications. First there are no apple constraints - ie if, for example, the best substitute for each orange turned out to be the same apple variety then you will be able to source sufficient of those apples. Second, it seems likely that you could solve each individual problem using formulas, maybe by adding apple columns to the orange table determining the chosen measure.
    – DMM
    Commented Apr 27 at 18:33
  • Thank you for the feedback on the statistics, as it's obviously not my strong suit. What I understand you to be saying is that a better strategy would be to use Abs(a-o) (which gives a higher number when the values are more different), multiply that by my weight value, then average across all the categories. The apple with the lowest average difference would be the best alternative for the orange. Is there a strategy for selecting a good weight, then? Otherwise I'd be worried about the sweetness easily drowning out the price difference.
    – jkHeat
    Commented 2 days ago
  • Broadly your understanding is correct, though averaging is not strictly necessary. If you are interested in learning further a related problem is how to put the line of best fit on an (x,y) scatterplot, where the usual approach is to minimise the differences between the estimates of y from the line and the actual y values and the measure of difference is (in your terms) (a-o)^2. A good search term might be "estimation of parameters in simple linear regression".
    – DMM
    Commented 2 days ago
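
On the worry about one attribute drowning out another: one common approach is to rescale each difference by the spread of that attribute before weighting, so the weights express importance rather than compensating for units. The sketch below is only one way to do it. The function name ScaledDiff is hypothetical, and using the range (max minus min) of the candidate column as the scale is an assumption; the standard deviation would be another reasonable choice.

```vba
'Hypothetical helper: absolute difference divided by the spread of the
'candidate column, so attributes in different units become comparable.
'col is the range holding this attribute for every candidate
'(for example one column of the apple table).
Function ScaledDiff(targetVal As Double, candVal As Double, col As Range) As Double
    Dim spread As Double
    spread = WorksheetFunction.Max(col) - WorksheetFunction.Min(col)
    If spread = 0 Then
        'Every candidate has the same value, so this attribute
        'cannot separate them; contribute nothing to the total
        ScaledDiff = 0
    Else
        ScaledDiff = Abs(candVal - targetVal) / spread
    End If
End Function
```

In the answer's apple loop, a term such as wgtHue * ScaledDiff(hue, appHue, apples.ListColumns(3).DataBodyRange) could then replace the corresponding PercentDiff term. Because this is a difference rather than a similarity, the comparison would need to be flipped to keep the apple with the lowest total, and the running best would need to start from a large value instead of 0.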
