4

I would like to use a Perl one-liner to modify numeric values in a text file. My data are stored in a text file:

0, 0, (1.263566e+02, -5.062154e+02)
0, 1, (1.069488e+02, -1.636887e+02)
0, 2, (-2.281294e-01, -7.787449e-01)
0, 3, (5.492424e+00, -4.145492e+01)
0, 4, (-7.961223e-01, 2.740912e+01)

These are complex numbers with their respective i and j coordinates: i, j, (real, imag). I would like to modify the coordinates, to shift them from zero-based to one-based indexing. In other words I would like to add one to each i and each j. I can correctly capture the i and j, but I'm struggling to treat them as numbers not as strings. This is the one-liner I'm using:

perl -p -i.bak -w -e 's/^(\d+), (\d+)/$1+1, $2+1/' complex.txt

How do I tell Perl to treat $1 and $2 as numbers?

My expected output would be:

1, 1, (1.263566e+02, -5.062154e+02)
1, 2, (1.069488e+02, -1.636887e+02)
1, 3, (-2.281294e-01, -7.787449e-01)
1, 4, (5.492424e+00, -4.145492e+01)
1, 5, (-7.961223e-01, 2.740912e+01)
8
  • 2
    After the battle, a suggestion: perl -i.bak -pwe's/\G[, ]*\K\d+/$&+1/ge' complex.txt (using a glue anchor: it's only coquetry to make it shorter) Commented Feb 7 at 19:45
  • 1
    @CasimiretHippolyte Won't that change all the numbers on the line, and not just the first two? Commented Feb 8 at 16:25
  • 2
    @TLP: all consecutive numbers separated by spaces and commas from the start of the line. In other words, the first two, since there's an opening round bracket that breaks the contiguity. Commented Feb 8 at 21:35
  • 1
    @CasimiretHippolyte It might make an interesting answer, if you can explain the use of \G. Commented Feb 8 at 23:26
  • 1
    There is the regex inline code construct (?{..}) that does num++ -> "" as well. Try this perl -pe "s/^(\d+)(?{$r=$^N+1}), (\d+)(?{$r.=', '.($^N+1)})/$r/" complex.txt onlinegdb.com/2-gl2jAln Commented Feb 11 at 22:00

4 Answers

4

If you're sure the i and j will always be integers, the simplest way is to use /e:

perl -pwe 's/^(\d+),\s*(\d+)/($1+1).", ".($2+1)/e' ./raw

gives

1, 1, (1.263566e+02, -5.062154e+02)
1, 2, (1.069488e+02, -1.636887e+02)
1, 3, (-2.281294e-01, -7.787449e-01)
1, 4, (5.492424e+00, -4.145492e+01)
1, 5, (-7.961223e-01, 2.740912e+01)

The /e enables right side expression evaluation: perlop

e   Evaluate the right side as an expression.
ee  Evaluate the right side as a string then eval the
    result.
r   Return substitution and leave the original string
    untouched.

The .", ". is a simple string concatenation that puts the ", " separator, which the match consumed, back between i and j.

8 Comments

Can you please elaborate on, what is /e and .", ".?
e makes the RHS a Perl expression that's called to generate the replacement value. In other words, s{...}{...} is short for s{...}{ qq{...} }e
Can you please explain the s{..}{ qq{...} }e. Does it mean, that by default the RHS in the s/.../.../ syntax treats everything as quoted words? And if I add an /e the RHS will be treated as an expression?
You're confusing qq and qw. qq is a double-quoted string. See perlop
Yes, without /e it is a double quoted string, with interpolation. With /e it is evaluated as a Perl expression. I'm not sure if it will enforce string context, in cases where expressions return different things depending on context, but the end result would be a string to be inserted.
Better with /a.
What is /a? What does it do?
Restricts things like \d from matching non ASCII characters. Otherwise it can match lots of digits that aren't 0-9, such as ෯ or ႘
2

The following discussion was first posted as comments, but it ran too long for that format and probably does constitute an answer, so here it is, expanded.

The problem is not Perl's string-vs-number treatment but the fact that the whole replacement side in s/// is processed as a double-quoted string, yielding a direct replacement for what is matched. The dollar variables, if populated by captures, do get interpolated. So you get the value of $1, followed by the literal characters + and 1, then the comma and the space, then the value of $2, again followed by the characters + and 1.

Nothing in that string gets computed, even though the + was meant as an operation. With the /e modifier the replacement side is evaluated as code, so $1 + 1 is an addition. But then the whole replacement side is one piece of code, and its result replaces everything that was matched. Hence the need for the concatenation operator . in that code, to reproduce the comma and the space that were matched. We must still capture on the matching side whatever should be available in a dollar variable.

So apart from needing to add $1 and 1 you also need to then tack a comma and a space onto it, and then concatenate the other piece, $2 + 1, so

s/^([0-9]+), ([0-9]+)/($1+1) . ", " . ($2+1)/e

Thus the answer by 0stone0


On the other hand, Perl very readily converts between strings and numbers as needed; one could say that it is even too eager. So if you say on the command line

perl -wE'my $v = "a"; say $v + 1'   # warns, converts 'a' to 0, adds

it will treat $v as a number because of that +, and will warn that it isn't numeric (if you enable warnings, which you should). It still goes ahead: 'a' gets converted to 0 and 1 is added to it.

The capital E there is -e with "features" enabled, so that we can use say. One may want to avoid it in one-liners meant for long-term use, because the language changes and -E enables all of them, so one may end up using a feature one has never even heard of but which at some point affects the code.


Search for the word modifier in perlretut: Simple word matching, for example. The full, more strict, reference is perlre.

1

Your data is well structured, so I don't see any benefit in using a regex here.

You can just parse the file, parse the lines, get first two items from each line, add 1 to them and write back.

In my opinion, a regex is too implicit here. I always prefer an explicit implementation unless the implicit option gives me great benefits.

Here's a simple example:

#!/usr/bin/env perl
use strict;
use warnings;

my $data = <<'END_DATA';
0, 0, (1.263566e+02, -5.062154e+02)
0, 1, (1.069488e+02, -1.636887e+02)
0, 2, (-2.281294e-01, -7.787449e-01)
0, 3, (5.492424e+00, -4.145492e+01)
0, 4, (-7.961223e-01, 2.740912e+01)
END_DATA

for my $line (split /\n/, $data) {
    my @fields = split /, /, $line, 3;   # i, j, "(real, imag)"
    $fields[0]++;
    $fields[1]++;
    print join(", ", @fields), "\n";
}

1

Given the nice structure of your file (CSV) and its simplicity (only contains numbers which use . as decimal separator), I think that it would be easier to use autosplit (-a) rather than a regex:

perl -F, -ane '$F[0]++; $F[1]++; print join ",", @F'

(the -F, flag tells Perl to split on commas instead of spaces)
