13
\$\begingroup\$

Input a list of strings a and a string s for search keyword. Find out all strings in a which contains s as subsequence. And sort them in the following order:

  1. Exactly equals to s
  2. Starts with s
  3. Contains s as substring (continuous subsequence)
  4. Contains s as subsequence

Detail

  • When two strings belongs to the same sorting group, you may sort them in any order you prefer.
  • String matching is case sensitive. "A" and "a" are different characters.
  • All strings will only contain printable ASCII (#32~#126).
  • All strings will not have leading or trailing whitespaces.
  • All strings will be non-empty.
  • List a does not contain duplicate strings.

Example

When the list is ["center","encounter","enter","enterprise","event"], and the search target is "enter", output should be ["enter","enterprise","center","encounter"]. "event" is not included in the output as it doesn't contain "enter" as subsequence.

Test cases

["center","encounter","enter","enterprise","event"]
"enter"
-> ["enter","enterprise","center","encounter"]

["celebration","cooperation","generation","operation","ratio"]
"ratio"
-> ["ratio","celebration","cooperation","generation","operation"]

["combination","explanation","international","nation","national","nomination","notation"]
"nation"
-> ["nation","national","combination","explanation","international","nomination","notation"]

["ever","every","here","very","where"]
"everywhere"
-> []

["interaction","traditional","train","training","transformation"]
"train"
-> ["train","training","interaction","traditional","transformation"]

["condition","confusion","construction","contribution","information","organization","recommendation","transportation"]
"onion"
-> ["condition","confusion","construction","contribution","organization","recommendation"]

["...","---",".-.-.-","..--","-..-"]
"--"
-> ["---","..--",".-.-.-","-..-"]

["#","##","###","####","#####"]
"####"
-> ["####","#####"]

["Another", "example", "with spaces", "and also", "question marks", "...??"]
"a"
-> ["and also", "example", "with spaces", "question marks"]

["/.\\", "...", "><", "[[]]", "~.~", ".1.2", "_[(("]
"."
-> ["...", ".1.2", "/.\\", "~.~"]

["(())", "()()", "((()))", "(())()", "()(())", "()()()"]
"(())"
-> ["(())", "(())()", "((()))", "()(())", "()()()"]

["]["]
"]["
-> ["]["]

["\\", "\\\\", "\\\\\\"] # Input is encoded as JSON, while "\\" means a string with a single backslash
"\\"
-> ["\\", "\\\\", "\\\\\\"]

Output from your program may be different from above test cases, as the order of words in same group is not required.

Rules

Input / Output

Input / Output are flexible. For example, you may use any reasonable ways including but not limited to:

  • You may I/O string as
    • Your languages built-in string in ASCII or any ASCII compatible encoding (e.g. UTF-8);
    • Your languages built-in string in any codepage that supports all printable ASCII characters (e.g. UTF-16);
    • NUL terminated array of characters;
    • array of integers, each integer is the ASCII value of character
    • 0 terminated integer array;
  • You may I/O the array of string as
    • A collection (OrderedSet, LinkedList, Array, ...; or HashSet only for input) of strings
    • A character (or ASCII value) matrix with NUL (0) padding at the ending to each short ones;
      • Output matrix may have unnecessarily extra 0 padding;
    • Line break (CR / LF / CRLF) separated single string;
    • JSON encoded array of string
\$\endgroup\$
2
  • \$\begingroup\$ I think a regex based solution would pass all test cases. Yet it would be invalid unless special characters are escaped properly. So maybe you should either add a test case that makes something like that fail ... or restrict the input to letters only. \$\endgroup\$ Commented Dec 15, 2022 at 8:10
  • \$\begingroup\$ @Arnauld I had added another testcases. \$\endgroup\$ Commented Dec 15, 2022 at 8:32

11 Answers 11

4
\$\begingroup\$

Vyxal, 13 bytes

≬ṗ⁰cFsµ₌ǎṗJ⁰ḟ

Try it online! Input as a then s. Inefficient approach.

≬             # a 3-element lambda:
 ṗ            #   get all subsets
   c          #   does it contain
  ⁰           #   the last input (s)?
    F         # filter according to lambda
     s        # sort alphabetically (handles ordering of 1 and 2)
      µ       # a sorting lambda:
       ₌      #   parallel apply:
        ǎ     #     substrings (handles 1, 2, and 3)
         ṗ    #     subsets (handles 4)
          J   #   join together
            ḟ #   find the first occurrence of
           ⁰  #   last input
\$\endgroup\$
3
\$\begingroup\$

Japt, 16 bytes

Outputs a 2D-array; add 1 byte if that isn't permitted. Works in theory for all tests cases but there is a bug in Japt that prevents it from being able to handle mismatched square brackets in strings within an input array.

fÈà øVÃüøV ÔËñbV

Try it

fÈà øVÃüøV ÔËñbV     :Implicit input of array U & string V
f                    :Filter U by
 È                   :Passing each element through the following function
  à                  :  Combinations
    øV               :  Contains V?
      Ã              :End filter
       ü             :Group & sort by
        øV           :  Contains V?
           Ô         :Reverse
            Ë        :Map
             ñ       :  Sort by
              bV     :    First index of V
\$\endgroup\$
2
\$\begingroup\$

Retina, 85 bytes

~(L$`^.+
$&¶mN$$`^(.$*$\$&)?.$*¶-$$#1¶mO$$`^(.$*?)$\$&.$*¶$$1
\G.
$\$&.$*
^
0A`¶O`¶G`

Try it online! Takes the keyword as the first line and the search strings on the remaining lines. Explanation:

~(`

Evaluate the output of the rest of the program on the original input.

L$`^.+

Match just the keyword but replace the entire input with...

$&¶mN$$`^(.$*$\$&)?.$*¶-$$#1¶mO$$`^(.$*?)$\$&.$*¶$$1

... the keyword followed by two sort commands (see below).

\G.
$\$&.$*

Escape the keyword and insert .* between each escaped character.

^
0A`¶O`¶G`

Prefix two commands to the keyword and turn the keyword into a filter command.

For the example of the input keyword of train, the program that gets evaluated is as follows:

0A`
O`
G`t.*r.*a.*i.*n.*
mN$`^(.*train)?.*
-$#1
mN$`^(.*?)train.*
$.1

Explanation:

0A`

Delete the keyword.

O`

Sort the input. (This ensures that train appears before training.)

G`t.*r.*a.*i.*n.*

Filter for strings that have train as a subsequence.

mN$`^(.*train)?.*
-$#1

Sort strings that contain train first.

mO$`^(.*?)train.*
$1

Sort strings that contain train by the (length of the) prefix of the word train.

\$\endgroup\$
2
\$\begingroup\$

Python 3, 104 bytes

lambda a,s:sorted([i for i in a if all(j in i for j in s)],key=lambda x:(x==s)-(x[:len(s)]==s)-(x in s))

I used sorted to order the result of the nested list comprehension.

Try it online!

\$\endgroup\$
8
  • \$\begingroup\$ This doesn't seem to give the correct output. For example, the first string in the first example should be enter instead of center. \$\endgroup\$ Commented Dec 20, 2022 at 3:53
  • \$\begingroup\$ @chunes It is in different order. But the results are the same. \$\endgroup\$ Commented Dec 20, 2022 at 3:54
  • \$\begingroup\$ @chunes The OP mentioned in his question that "Output from your program may be different from above test cases, as the order of words in same group is not required." \$\endgroup\$ Commented Dec 20, 2022 at 3:55
  • 1
    \$\begingroup\$ Words within the same group can be in any order, but the groups themselves must be in a particular order. enter should come before center because enter is in the 'exactly equal` group whereas center is in the substring group. \$\endgroup\$ Commented Dec 20, 2022 at 3:58
  • \$\begingroup\$ @chunes Edited mine, it's now by order :) \$\endgroup\$ Commented Dec 20, 2022 at 4:11
1
\$\begingroup\$

Charcoal, 50 bytes

Fθ«≔ηζFιF¬⌕ζκ≔Φζνζ¿¬ζ⊞υ⟦¬№ιη⌕ιηι⟧»≔⟦⟧εW⁻υε⊞ε⌊ιEε⊟ι

Try it online! Link is to verbose version of code. Explanation:

Fθ«

Loop through the list of strings.

≔ηζ

Start with none of the letters of the search keyword found.

Fι

Loop through the letters in the current string.

F¬⌕ζκ

If the search keyword starts with the current letter, then...

≔Φζνζ

... remove the first letter from the copy of the search keyword.

¿¬ζ

If the search keyword is now empty, then...

⊞υ⟦¬№ιη⌕ιηι⟧

... push a list of three terms to the predefined empty list: i) a flag that is zero if the keyword is contained in the string rather than just a subsequence ii) the position of the keyword in the string iii) the string.

»≔⟦⟧εW⁻υε⊞ε⌊ι

Sort the matches.

Eε⊟ι

Output just the strings.

\$\endgroup\$
1
\$\begingroup\$

JavaScript (ES10), 126 bytes

-7 thanks to @tsh

Expects (array)(string).

a=>s=>[`^${q=s.replace(/\W/g,'\\$&')}$`,"^"+q,q,q.replace(/\\?./g,'.*$&')].flatMap(e=>a.filter(s=>a[0+s]?0:a[0+s]=s.match(e)))

Try it online!

Escaping characters in the regular expressions costs  41  34 bytes ... :'(

\$\endgroup\$
3
  • 1
    \$\begingroup\$ a=>s=>[`^${q=s.replace(/\W/g,'\\$&')}$`,"^"+q,q,q.replace(/\\?./g,'.*$&')].flatMap(e=>a.filter(s=>a[s]?0:a[s]=s.match(e))) \$\endgroup\$ Commented Dec 16, 2022 at 9:30
  • 1
    \$\begingroup\$ Fail ["0"], "0" \$\endgroup\$ Commented Dec 20, 2022 at 4:26
  • 1
    \$\begingroup\$ And maybe can be fixed by adding 4 bytes a[s] -> a[0+s] \$\endgroup\$ Commented Dec 20, 2022 at 6:24
1
\$\begingroup\$

JavaScript (Node.js), 121 bytes

s=>a=>a.filter(a=>g(a,s),g=([c,...x],y)=>c?g(x,y.slice(c==y[0])):!y).sort((x,y)=>(g=i=>s==i?-1:i.indexOf(s)>>>0)(x)-g(y))

Try it online!

no regex is shorter

JavaScript (Node.js), 133 126 125 bytes

s=>a=>a.filter(x=>x.match(s.replace(/(\w)|./g,(c,d)=>'.*'+[d||'\\'+c]))).sort((x,y)=>(g=i=>s==i?-1:i.indexOf(s)>>>0)(x)-g(y))

Try it online!

Sort equality then the first occurrence

\$\endgroup\$
1
  • \$\begingroup\$ @tsh Outputting [] should be intended \$\endgroup\$ Commented Dec 15, 2022 at 9:03
1
\$\begingroup\$

05AB1E, 16 bytes

ʒæIå}ΣηR¬œ€Œ˜«Ik

Try it online. (Way too slow for a test suite of all test cases.)

Explanation:

ʒ             # Filter the first (implicit) input-list by:
 æ            #  Get the powerset of the current word
  Iå          #  Check if the second input is in this list
}Σ            # After the filter, sort by:
  η           #  Get the prefixes of the current word
   R          #  Reverse this list
    ¬œ        #  Get the permutations of the current word as well
      €Œ      #  Get the substrings of each permutation
        ˜     #  Flatten this list
         «    #  Merge the two lists together
          Ik  #  Get the index of the second input in this list
              # (after which the filtered and sorted list of words is output implicitly)

Try just ηR¬œ€Œ˜« online to see how its order corresponds to the order of \$1\$ through \$4\$.

\$\endgroup\$
0
\$\begingroup\$

Haskell, 137 bytes

import Data.List
o s a=map snd.filter((<0).fst)$sort[(-length(
 takeWhile($x)[isSubsequenceOf s,isInfixOf s,isPrefixOf s,(==)s]),x)|x<-a]

Try it online!

\$\endgroup\$
0
\$\begingroup\$

Raku, 98 bytes

{$^s;@^a.grep(/<{S:g[(\W)|.]='\\'x?$0~$/~'.*'
given $s}>/).sort:{first :k,$_~~*,$s,/^$s/,/$s/,!0}}

Try it online!

I'm dismayed to learn that Raku has lost Perl 5's quotemeta function and \Q quoting construct. Reproducing the concept cost many bytes.

Anyway...

  • The functions arguments are passed in the placeholder variables @^a (the list of strings) and $^s (the search keyword).
  • The grep call on @^a filters the list to those elements in which the search keyword appears as a subsequence. This is done by transforming the search keyword into a new string by prefixing every non-word character with a backslash and appending a .* to every character.
  • Then sort sorts the remaining strings. The sort function is first :k, $_ ~~ *, $s, /^$s/, /$s/, !0. This returns the index (due to the :k flag) of the first of $s, /^$s/, /$s/, or !0 (a shorter way to write True) against which the current word smart-matches. The current word matches the string $s if it's the same as the search keyword, matches the regex /^$s/ if the search keyword appears at the start of the string, matches the regex /$s/ if the search keyword appears anywhere inside the string, and always matches True.
\$\endgroup\$
0
\$\begingroup\$

Wolfram Language (Mathematica), 113 bytes

(s=#2;SortBy[Select[#1,StringContainsQ[#,s]&],(d=Last@LongestCommonSequencePositions[s,#];{Length@d,First@d})&])&

Try it online!

Thank you for the interesting challenge, golf it on Mathematica is hard, but I tried (-_-)
LongestCommonSequencePositions is one of longest function names in WM 8)

\$\endgroup\$

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.