
The sample input is:

<bre rt="1600" et="1550794901464" st="1550794899864" tid="8390500116294391399" mh="N" cn="" lc="" ts="N/A" cidc="" IDC="" eidc="BRE-S-TRA-0085418501"/>
    <r1>
        <gr1>
            <a="1" b="smaple data with spaces" c="Created TrasctionInfo" d="1550794901228"/>
            <e="INITIAL" f="2" g="INITIAL_LEGACY" h="1550794901228" i="LegacyToggle is off. Follow Legacy flow"/>
            <lx ets="2019-02-22T00:21:41.228Z" trxn="smaple data with spaces 2 record" rn="Derive data" abc="COT def" def="Season occur" trxn="smaple data with spaces 3rd record" den="andys and others" trxn="smaple data with spaces 4th record" kit="Theater - Span day"
             rns="Span day" trxn="smaple data with spaces 5th record" off="|"/>
            <cwl wc="2.0766" tot="16" act="116.28960000000001" CSE="CHE-CSFL" wg1.0" high="1" </cwl>
                </gr1>
            </r1>
</bre>
<bre rt="1234" et="1234794901464" st="1234794899864" tid="2345500116294391399" mh="Y" cn="At123" lc="" ts="NA" cidc="" IDC="some text value" eidc="abc-def-gh-2385418501"/>
    <r1>
        <gr1>
            <a="1" trxn="other data with spaces" c="Created Info" d="3434794545228"/>
            <e="begin" f="2" g="INITIAL_LEGACY" h="1234709901228" i="Toggle hig. Follow toggle flow"/>
            <lx ets="2017-02-22T00:21:41.228Z" trxn="another record data" rn="Derive data" abc="COT def" trxn="smaple data with spaces record" def="Season occur" den="andys and others" trxn="smaple data with spaces 4th record" kit="Theater - Span day"
             rns="Span day" trxn="data with spaces" off="|"/>
            <cwl wc="2.0766" tot="16" act="116.28960000000001" CSE="CHE-CSFL" wg1.0" high="1" </cwl>
                </gr1>
            </r1>
</bre>
<bre rt="1234" et="1234794901464" st="1234794899864" tid="2345500116294391399" mh="Y" cn="At123" lc="" ts="NA" cidc="" IDC="some text value" eidc="abc-def-gh-2385418501"/>
    <r1>
        <gr1>
            <a="1" c="Created transaction" b="3434794545228"/>
            <e="begin" f="2" g="INITIAL_LEGACY" h="1234709901228" i="Toggle hig. Follow toggle flow"/>
            <lx ets="2017-02-22T00:21:41.228Z" rn="Derive data" abc="COT def" def="Season occur" den="andys and others" kit="Theater - Span day"
             rns="Span day" off="|"/>
            <cwl wc="2.0766" tot="16" act="116.28960000000001" CSE="CHE-CSFL" wg1.0" high="1" </cwl>
                </gr1>
            </r1>
</bre>

The output should be:

tid="8390500116294391399"
ts="N/A"
ets="2019-02-22T00:21:41.228Z" 
trxn="smaple data with spaces 2 record"
trxn="smaple data with spaces 3rd record"
trxn="smaple data with spaces 5th record"
tid="2345500116294391399"
ts="NA"
ets="2017-02-22T00:21:41.228Z" 
trxn="other data with spaces"
trxn="another record data"
trxn="smaple data with spaces record"
trxn="data with spaces"
tid="2345500116294391399"
ts="NA"
ets="2017-02-22T00:21:41.228Z"

I tried the following:

sed -e 's/trxn=/\ntrxn=/g' -e 's/tid=/\ntid=/g' -e 's/ts=/\nts=/g'

while IFS= read -r var
do
    if grep -Fxq "$trxn" temp2.txt
    then
      awk -F"=" '/tid/{print VAL=$i} /ts/{print VAL=$i} /ets/{print VAL=$i} /trxn/{print VAL=$i} /tid/{print VAL=$i;next}' temp2.txt >> out.txt
    else
      awk -F"=" '/tid/{print VAL=$i} /ts/{print VAL=$i} /ets/{print VAL=$i} /tid/{print VAL=$i;next}' temp2.txt >> out.txt
    fi
done < "$input"

3 Answers


Or with grep:

$ grep -Eo '(ets|tid|trxn|ts)="[^"]+"' file
tid="8390500116294391399"
ts="N/A"
ets="2019-02-22T00:21:41.228Z"
trxn="smaple data with spaces 2 record"
trxn="smaple data with spaces 3rd record"
trxn="smaple data with spaces 4th record"
trxn="smaple data with spaces 5th record"
tid="2345500116294391399"
ts="NA"
trxn="other data with spaces"
ets="2017-02-22T00:21:41.228Z"
trxn="another record data"
trxn="smaple data with spaces record"
trxn="smaple data with spaces 4th record"
trxn="data with spaces"
tid="2345500116294391399"
ts="NA"
ets="2017-02-22T00:21:41.228Z"
  • My bad, I didn't mention the file size. Since the file is 50+ GB, breaking it into new lines makes it even larger. All the responses are correct. Thanks to all. @Freddy, I validated the code; it works fine and suits this case.
    – BNRINBOX
    Commented Apr 15, 2019 at 7:05
  • It fails when null ("") values appear.
    – BNRINBOX
    Commented Apr 15, 2019 at 12:30
  • @BNRINBOX Replace + with *.
    – Freddy
    Commented Apr 15, 2019 at 12:37
  • It's working as expected. Thanks a lot, Freddy.
    – BNRINBOX
    Commented Apr 16, 2019 at 4:37
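The difference between + and * discussed in the comments can be seen on a small example. This is only a sketch: the one-line sample and its attribute values are made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep do).

```shell
# Hypothetical one-line sample with empty values (cn="", lc="" and ts="").
sample=$(mktemp)
printf '%s\n' '<bre rt="1" tid="42" cn="" lc="" ts="" eidc="x"/>' > "$sample"

# "+" needs at least one character between the quotes, so ts="" is skipped.
plus=$(grep -Eo '(ets|tid|trxn|ts)="[^"]+"' "$sample")

# "*" also accepts an empty value, so ts="" is reported.
star=$(grep -Eo '(ets|tid|trxn|ts)="[^"]*"' "$sample")

printf 'with +:\n%s\nwith *:\n%s\n' "$plus" "$star"
rm -f "$sample"
```

Because grep -o only prints the matched parts and reads the file as a stream, this approach does not need an intermediate copy of the input, which matters for a file of the size mentioned above.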

Try this:

sed -e 's/trxn=/\ntrxn=/g' -e 's/tid=/\ntid=/g' -e 's/ets=/\nets=/g' input | awk -F '"' '$1~/ets|trx|tid/{print $1"\""$2"\""}'


tid="8390500116294391399"
ets="2019-02-22T00:21:41.228Z"
trxn="smaple data with spaces 2 record"
trxn="smaple data with spaces 3rd record"
trxn="smaple data with spaces 4th record"
trxn="smaple data with spaces 5th record"
tid="2345500116294391399"
trxn="other data with spaces"
ets="2017-02-22T00:21:41.228Z"
trxn="another record data"
trxn="smaple data with spaces record"
trxn="smaple data with spaces 4th record"
trxn="data with spaces"
tid="2345500116294391399"
ets="2017-02-22T00:21:41.228Z"
sed -e "s#\" #\"\n#g;s#.*<lx ##" filename  | grep -E "tid=|ts=|ets=|trxn"

This replaces every double quote that is followed by a space with a double quote plus a newline, strips everything up to "<lx ", and then greps for the required attribute names.
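To see what the sed step does before grep filters it, here is a small sketch on a single made-up line. It assumes GNU sed, since \n in the replacement text is not portable to every sed; the sample attribute values are hypothetical.

```shell
# Hypothetical sample line; the attribute values are made up.
line='<lx ets="2019-02-22T00:21:41.228Z" trxn="a b" rn="Derive data" off="|"/>'

# Step 1: put each attribute on its own line and drop the leading "<lx ".
# (GNU sed assumed: \n in the replacement is a GNU extension.)
split=$(printf '%s\n' "$line" | sed -e "s#\" #\"\n#g;s#.*<lx ##")

# Step 2: keep only the wanted attributes.
picked=$(printf '%s\n' "$split" | grep -E "tid=|ts=|ets=|trxn")

printf '%s\n' "$picked"
```

Spaces inside a value are left alone because only a quote followed by a space triggers the split.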

$ awk -F\" '{for(i=1;i<=NF;i++)if($i~/tid=|ts=|ets=|trxn/){gsub(".* ","",$i);print $i""$(i+1)}}' filename
tid=8390500116294391399
ts=N/A
ets=2019-02-22T00:21:41.228Z
trxn=smaple data with spaces 2 record
trxn=smaple data with spaces 3rd record
trxn=smaple data with spaces 4th record
trxn=smaple data with spaces 5th record
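The awk output above drops the double quotes that the question asks for. A possible variant that puts them back is sketched below; it is not the answer's original command. It looks only at odd-numbered fields, on the assumption that quotes are balanced on each line so that attribute names always land there, and the two sample lines are made up.

```shell
# Hypothetical two-line sample; names and values are made up.
quoted=$(printf '%s\n' \
  '<bre rt="1" tid="42" ts="N/A"/>' \
  '<lx ets="2019-02-22T00:21:41.228Z" trxn="a b" rn="ts= inside a value"/>' |
awk -F'"' '{
  # Split on the double quote: odd fields end in name=, even fields are values.
  for (i = 1; i < NF; i += 2) {
    name = $i
    sub(/.*[ <]/, "", name)              # keep only the attribute name and "="
    if (name ~ /^(tid|ts|ets|trxn)=$/)   # exact match on the wanted names
      print name "\"" $(i + 1) "\""
  }
}')
printf '%s\n' "$quoted"
```

Checking only the name fields means a value that happens to contain text such as ts= is not mistaken for an attribute.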
