0

I wonder how to compare two almost identical strings and output another string containing both the old and the new value. It has to work word by word. It's exactly what you can see in the edit history on this page.

Example:

$string1 = 'A very, very nice day today.';
$string2 = 'An almost nice day today';

$output = compare_strings($string1, $string2);

Outputs: "A very, very An almost nice day today. today"

I know there is the text_diff package, but I want to know exactly how to do it manually.


I thought of creating an array with one entry per word and then comparing the arrays, but I don't know how best to compare them.
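
The word-array idea could start like this. It is only a minimal sketch, assuming words are separated by whitespace; `split_words` is a hypothetical helper name, not a built-in:

```php
<?php
// Hypothetical helper: split a string into words on runs of whitespace.
function split_words(string $text): array
{
    // PREG_SPLIT_NO_EMPTY drops empty entries caused by leading or trailing spaces.
    return preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

$words1 = split_words('A very, very nice day today.');
$words2 = split_words('An almost nice day today');
// Punctuation stays attached to its word, so "today." and "today" are different entries.
```

With this split, `$words1` has 6 entries and `$words2` has 5, so the two arrays cannot simply be compared index by index.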

  • Take a look at that: php.net/manual/en/function.levenshtein.php edit: yep, a duplicate ;) Commented May 23, 2014 at 8:41
  • en.wikipedia.org/wiki/Longest_common_subsequence_problem - the algorithms are plentiful and well documented. Commented May 23, 2014 at 8:42
  • @SatishSharma - As I wrote at the end, I thought about it for a long time but got stuck on comparing the strings. Commented May 23, 2014 at 8:42
  • Your example doesn't quite work wordwise. "A very, very" is 3 words; "An almost" is 2. So what you're trying to do is pretty complicated. As everyone else says, you need to look at diff algorithms, which have a long history. Commented May 23, 2014 at 8:43
  • First decide the rule for getting the expected output; then we can provide a solution for it. Commented May 23, 2014 at 8:44

2 Answers

2

Well, this is not a simple task; it requires more effort. You may know about sentiment analysis and linguistic analysis.

This is part of that, and it is not easy. You can write a program for it, but it can't be 100% accurate.

For linguistic analysis, see: http://en.wikipedia.org/wiki/Linguistics

And for sentiment analysis: http://en.wikipedia.org/wiki/Sentiment_analysis

My suggestion:

A. Read up on linguistic analysis.

B. A very simple way: write a program that stores every word in an array, compares the two arrays, and uses a threshold (maybe 70%) to decide whether the strings are similar.

C. Make rules using positive and negative words, then map your sentences to them.
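
Option B could be sketched roughly as follows. This is only an illustration under assumptions: `word_overlap` is a hypothetical helper, words are split on whitespace, and "similar" is taken to mean that at least 70% of the first string's words also occur in the second.

```php
<?php
// Hypothetical helper: fraction of the words in $a that also occur in $b.
function word_overlap(string $a, string $b): float
{
    $wordsA = preg_split('/\s+/', strtolower(trim($a)), -1, PREG_SPLIT_NO_EMPTY);
    $wordsB = preg_split('/\s+/', strtolower(trim($b)), -1, PREG_SPLIT_NO_EMPTY);
    if (count($wordsA) === 0) {
        return 0.0; // avoid division by zero for an empty first string
    }
    // array_intersect keeps every entry of $wordsA whose value appears in $wordsB.
    $shared = array_intersect($wordsA, $wordsB);
    return count($shared) / count($wordsA);
}

$ratio = word_overlap('An almost nice day today', 'A very, very nice day today.');
$similar = $ratio >= 0.7; // 0.7 is the assumed threshold, an arbitrary choice
```

For the strings from the question the ratio is 0.4 (only "nice" and "day" match, because "today." keeps its period), so they would not count as similar. Note that this only gives a similarity score; it does not produce the merged old/new output.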


0

This really requires an algorithm and can be, depending on your skill level, difficult or at least tricky. The diff algorithm is explained in many places on the internet; I suggest you take a look at the Wikipedia article on diff.
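
To give an idea of what a manual word-wise diff can look like, here is a minimal sketch based on the longest common subsequence (LCS) of the two word lists. It is one simple approach, not the algorithm any particular diff tool or package uses. The `<del>`/`<ins>` markers are an assumption for illustration, since the question does not say how old and new words should be marked, and words are assumed to be separated by whitespace.

```php
<?php
// Sketch of a word-wise diff based on the longest common subsequence (LCS).
function compare_strings(string $old, string $new): string
{
    $a = preg_split('/\s+/', trim($old), -1, PREG_SPLIT_NO_EMPTY);
    $b = preg_split('/\s+/', trim($new), -1, PREG_SPLIT_NO_EMPTY);
    $n = count($a);
    $m = count($b);

    // $len[$i][$j] = LCS length of the word lists $a[$i..] and $b[$j..].
    $len = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; $i--) {
        for ($j = $m - 1; $j >= 0; $j--) {
            $len[$i][$j] = ($a[$i] === $b[$j])
                ? $len[$i + 1][$j + 1] + 1
                : max($len[$i + 1][$j], $len[$i][$j + 1]);
        }
    }

    // Walk the table from the start, emitting kept, deleted and inserted words.
    $out = [];
    $i = $j = 0;
    while ($i < $n && $j < $m) {
        if ($a[$i] === $b[$j]) {
            $out[] = $a[$i]; // word is unchanged
            $i++;
            $j++;
        } elseif ($len[$i + 1][$j] >= $len[$i][$j + 1]) {
            $out[] = '<del>' . $a[$i++] . '</del>'; // word only in the old string
        } else {
            $out[] = '<ins>' . $b[$j++] . '</ins>'; // word only in the new string
        }
    }
    while ($i < $n) {
        $out[] = '<del>' . $a[$i++] . '</del>';
    }
    while ($j < $m) {
        $out[] = '<ins>' . $b[$j++] . '</ins>';
    }
    return implode(' ', $out);
}

echo compare_strings('A very, very nice day today.', 'An almost nice day today');
```

For the example from the question this prints `<del>A</del> <del>very,</del> <del>very</del> <ins>An</ins> <ins>almost</ins> nice day <del>today.</del> <ins>today</ins>`. The table needs memory proportional to the product of the two word counts, which is fine for sentences but not for very long texts.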
