
I'm pulling my hair out trying to use multiline regex. I have very little experience with PowerShell, and while the examples I've tried work, as soon as I adjust them for what I need they never return any results.

An example from my text file is below:

CLO*5000000Z115240057*598.50***94>0
DGP*115*G8*20161024~
DGP*096*G8*20161024~
DI*ABC>121~
QM1*BN*1*~
QM2*H2*1*~
QM1397*2*~
Q3*~
Q4*~
TX*1~
SQU*HV>01480>AB>1S>1>2>3>4~
0T1*472*D8*20160915~
RBF*6R*374196~
TX*2~

There are maybe 200 (at most) of these blocks in the same text file. I'm searching for the lines that start with 'SQU' and contain 1>2>3>4 at the end; only a handful do. I can find all of those SQU lines with the code example I found below, but I also need the 'CLO' line that comes above each one.

$fpath = 'C:\myfile.txt'
$opath = 'C:\logoutput.txt'
$regx = 'SQU.*1>2>3>4.*'
Get-Content $fpath | % { if($_ -match $regx) {add-content $opath $_}}

I've tried dozens of $'s, ^'s and ()'s all over the example below, in every combination I could think of. I also don't really understand how to get the result into logoutput.txt.

$fileContent = [io.file]::ReadAllText($fpath)
$filecontent | Select-String '(?ms)CLO.*SQU.*1>2>3>4.*' -AllMatches | %{ $_.Matches } | %{ $_.Value } 

I also tried this one without the >1>2>3>4, just to see if I could get anything, but no luck:

$stringmatch = Get-Content -raw $fpath
if (Select-String -inputobject $stringmatch -pattern '(?smi)CLO.*SQU.*'){
$matches[1]
} 

I only need the CLO and SQU lines (and only when the SQU line has the 1>2>3>4), but honestly at this point I'll take the entire block if that's easier. Any help would be appreciated.

  • is the CLO line always 10 lines away from SQU? Commented Oct 24, 2016 at 18:37
  • No, it's not. Some of these lines are missing from some blocks, and sometimes there are additional lines. Every block does begin with a CLO line and contain an SQU line, though (hope that made sense)
    – jalexander
    Commented Oct 24, 2016 at 18:41

2 Answers

$fileContent = [io.file]::ReadAllText($fpath)

# Match lines beginning with CLO, and lines beginning with SQU
$m = [regex]::Matches($fileContent,
                      '(?<clo>^CLO.*?$).*?(?<squ>^SQU.*?$)',
                      [System.Text.RegularExpressions.RegexOptions]('Multiline', 'Singleline')) 

# Filter out only the pairs where the SQU lines also have the right ending
$m | Where-Object { $_.Groups['squ'].Value -match "1>2>3>4~" } | 
     ForEach-Object { 
        $_.Groups['clo'].Value
        $_.Groups['squ'].Value
} 
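
The question also asks how to get the result into logoutput.txt. A minimal sketch of one way to do that with the same pattern, assuming a small inline sample in place of the real file and a temp-folder output path (both are placeholders, not from the original post). The TrimEnd call is an addition that drops the carriage return a Windows line ending would otherwise leave at the end of each captured line:

```powershell
# Inline sample standing in for the real file. The second block is made up
# so that one SQU line does NOT end in 1>2>3>4~.
$fileContent = @'
CLO*5000000Z115240057*598.50***94>0
DI*ABC>121~
SQU*HV>01480>AB>1S>1>2>3>4~
TX*2~
CLO*5000000Z115240058*100.00***94>0
SQU*HV>01480>AB>1S>9>9>9>9~
TX*2~
'@

# Placeholder output path; the question uses 'C:\logoutput.txt'.
$opath = Join-Path ([System.IO.Path]::GetTempPath()) 'logoutput.txt'

# Multiline: ^ and $ match at line boundaries. Singleline: . also matches newlines.
$options = [System.Text.RegularExpressions.RegexOptions]'Multiline, Singleline'
$m = [regex]::Matches($fileContent, '(?<clo>^CLO.*?$).*?(?<squ>^SQU.*?$)', $options)

# Keep only the pairs whose SQU line has the wanted ending, then emit both lines.
$result = $m |
    Where-Object { $_.Groups['squ'].Value -match '1>2>3>4~' } |
    ForEach-Object {
        $_.Groups['clo'].Value.TrimEnd("`r")
        $_.Groups['squ'].Value.TrimEnd("`r")
    }

# Write the collected lines to the output file, one per line.
$result | Set-Content -Path $opath
```

With the real data, $fileContent would come from [io.file]::ReadAllText($fpath) as above, and $opath would be the path from the question.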
  • MatthewG : for some reason when running this from powershell ISE it just hangs forever. Thank you for explaining it for me though
    – jalexander
    Commented Oct 24, 2016 at 19:06
  • TessellatingHeckler : This did work for me , and thank you for explaining it also. I appreciate all the help from both of you, maybe I can calm down now.
    – jalexander
    Commented Oct 24, 2016 at 19:08
  • (?sm) may be used to set singleline + multi-line mode: quick reference.
    – woxxom
    Commented Oct 25, 2016 at 1:15

In your second example, you read the text file, match the regex, and then print the value. That value is the entire match, which includes everything the .* consumed between CLO and SQU. Instead, use capture groups (parentheses) around the lines you care about and print only the groups' values.

I also modified the regex to use non-greedy matching so that multiple matches work properly. The other change is that the CLO and SQU groups should not match past the end of their lines, so each group ends with the end-of-line anchor $.

$fileContent = [io.file]::ReadAllText($fpath)
$filecontent | Select-String '(?ms)(CLO[^\n]*?$).*?(SQU.*?1>2>3>4[^\n]*$)' -AllMatches | %{ $_.Matches } | %{ $_.Groups[1].Value; $_.Groups[2].Value }
  • If there is an SQU line which doesn't end with 1>2>3>4, then your SQU.*?1>2>3>4 match will reach across the subsequent lines, over the next CLO line, and right as far as the next line which does end that way. Commented Oct 24, 2016 at 18:59
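
One way to sidestep the cross-block problem described in the comment above is to skip the multiline regex and scan line by line, remembering the most recent CLO line. This is a minimal sketch rather than either answerer's method. It assumes every block starts with a CLO line (as the asker confirms in the comments) and uses a made-up inline sample in which the first block's SQU line does not have the wanted ending:

```powershell
# Inline sample standing in for Get-Content $fpath. The IDs and values in the
# second block are invented for illustration.
$lines = @(
    'CLO*5000000Z115240057*598.50***94>0'
    'SQU*HV>01480>AB>1S>9>9>9>9~'
    'TX*2~'
    'CLO*5000000Z115240058*100.00***94>0'
    'DI*ABC>121~'
    'SQU*HV>01480>AB>1S>1>2>3>4~'
    'TX*2~'
)

$lastClo = $null
$result = foreach ($line in $lines) {
    if ($line -match '^CLO') {
        # Remember the CLO line that opens the current block.
        $lastClo = $line
    }
    elseif ($line -match '^SQU.*1>2>3>4~$') {
        # Emit the block's CLO line followed by the matching SQU line.
        $lastClo
        $line
    }
}

$result
```

Because each SQU line is paired with the CLO line of its own block, the non-matching SQU line in the first block cannot pull the first CLO line into the output. With the real file, $lines could be replaced by Get-Content $fpath and $result piped to Set-Content $opath.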
