Newest 'string-matching' Questions

Advice

0 votes

4 replies

88 views

How to create stable person identifiers when names vary across years

I am working with a university faculty salary dataset where the same person appears across many years, but their name strings are inconsistent. The dataset has about 8,000 unique people and years from ...

Mengyang Cao

13

asked Nov 17 at 6:26

-1 votes

2 answers

167 views

Java regex - Optional Match Capturing Group [closed]

I'd like to process some input queries in 3 possible ways: query: select * from People; query: select * from People exclude addresses; query: select * from People include department. I have two regex1 ...

DayaMoon

364

asked Oct 22 at 13:04

0 votes

2 answers

87 views

Automatically map messy column names to a standard schema in pandas

I'm working with many tabular datasets (Excel, CSV) that contain inconsistent or messy column names due to typos, different naming conventions, spacing, punctuation, etc. I have a standard schema (as ...

Ste347789

11

asked Jun 25 at 15:04

0 votes

1 answer

56 views

expect-5.45.4 shows unexpected spawn output, causing string match to fail; is it a bug?

In SLES15 SP6 on x86_64 I'm using a bash script and expect-5.45.4 to do automated program testing. Basically I'm checking whether the program to test (./pwg.pl) outputs a specific string. Starting to ...

U. Windl

4,748

asked Jun 13 at 8:43

-2 votes

1 answer

116 views

How to match German province names between 2 data sets in R?

I'm working with two datasets for German NUTS-3 level regions: A shapefile from Eurostat via the giscoR package: > library(giscoR) > nuts3_germany <- gisco_get_nuts(country = "Germany"...

Saïd Maanan

811

asked Jun 5 at 13:56

4 votes

4 answers

169 views

Match start of line in multiline string in lua?

Let's say I want to match any sequence of the hash sign # at the start of a string; so I'd want to match ## here: local mystr = "##First line\nSecond line\nThird line" ... and ### here: ...

sdbbs

5,948

asked May 7 at 21:51

2 votes

3 answers

123 views

Pandas DataFrame column partial match and extract matching value

I have a column in a pandas DataFrame (Names) with a large collection of names. Another DataFrame (Title) has a text column in which the names from the Names frame appear within the text. What would be the ...

Totura

167

asked Apr 9 at 0:53

2 votes

0 answers

88 views

Find Substrings In A Dynamic Collection Of String

This question is a little complicated, so I'll try to describe it through an example. First, we get a string foo, and put it into collection S. Then we get a string sample, and put it into S too. Next, ...

differentrain

81

asked Feb 8 at 5:44

1 vote

1 answer

71 views

Match similar names [duplicate]

I have a database with three columns: name, occupation, and organization. In these columns, I have duplicates with slightly different names. For example, Anne Sue Frank and Anne S. Frank refer to the ...

Vitoria Sanchez

21

asked Jan 30 at 18:25

0 votes

2 answers

86 views

How to match cross-referenced names from table without duplicates

Participants of an event will sign up and, aside from their personal details, either provide a duo partner's name or leave that blank. So, I will have two columns, ...

Lex Plantenga

1

asked Jan 5 at 6:03

1 vote

3 answers

96 views

Find str.contains in two large Pandas DataFrames

I have large pandas DataFrames like the ones below. import pandas as pd import numpy as np df = pd.DataFrame( [ ("1", "Dixon Street", "Auckland"), ("2"...

Totura

167

asked Nov 14, 2024 at 1:40

0 votes

1 answer

90 views

Full string matching in Pandas dataframes comparison

This seems like it should be an easy problem to solve, but I've been battling with it and cannot seem to find a solution. I have two dataframes of different sizes and different column names. I am ...

Rose_Trojan

117

asked Sep 30, 2024 at 15:09

1 vote

1 answer

79 views

How to match a function but exclude object methods without negative lookbehind

I'm trying to write a regex that matches every occurrence of some_function(...), but it should not match when it's part of an object method like my.some_function(...) or if it is a substring of ...

JVS

2,682

asked Sep 13, 2024 at 7:27

2 votes

2 answers

88 views

Do Kotlin's List/Array data structures have a findSublist method analogous to String.indexOf(CharSequence)?

Do Kotlin's List/Array data structures have a findSublist method analogous to String.indexOf(CharSequence), that takes a List/Array/Sequence to match against the list?

tpdi

35.3k

asked Sep 10, 2024 at 15:44

1 vote

0 answers

78 views

Trying to fix names in my database with fuzzywuzzy

What I'm trying to do is find and correct similar names in my database, like 'Patrick Maxwell' and 'Patrick Maxwel.' However, the issue I'm facing is that the best match for each name is often itself, ...

Kauan Randall Oliveira Ferreir

11

asked Sep 3, 2024 at 12:48
