I have a file with specific dates and I want to convert them to UTC format, so I prepared a small script.

I get the errors below:

date: option requires an argument -- 'd'
Try `date --help' for more information.
date: extra operand `16:16:53'

The content of my file looks like this:

20191014161653042nmd
20191014161653052egc
20191004081901490egc
20191004081901493nex
20191004081901497nex
20191004081902531nex
20191004081902534ksd

My code looks like this:

for i in $(cut -f1 Firstfile)
do
echo "$i" > tmpfile
TIME=$(awk '{print substr($1,1,14)}' tmpfile | sed -re 's/^([0-9]{8})([0-9]{2})([0-9]{2})([0-9]{2})$/\1\\ \2:\3:\4/' | xargs date +@%s -d | xargs date -u +"%Y-%m-%dT%H:%M:%S" -d)
MSec=$(awk '{print substr($1,15,3)}' tmpfile)
Msg=$(awk '{print substr($1,18,3)}' tmpfile)
echo -e "$TIME.$MSec $Msg" >> ResultFile
done

When I use the command individually it works fine and I get the desired result.

awk '{print substr($1,1,14)}' tmpfile | sed -re 's/^([0-9]{8})([0-9]{2})([0-9]{2})([0-9]{2})$/\1\\ \2:\3:\4/' | xargs date +@%s -d | xargs date -u +"%Y-%m-%dT%H:%M:%S" -d

What mistake am I making in this script? Why does it not work when I run it inside a for loop?

Expected Result:

2019-10-14T20:16:52.042 nmd
2019-10-14T20:16:52.052 egc
2019-10-04T12:19:01.490 egc

and so on

  • Do you have GNU awk (gawk)? If so, you can probably simplify the conversion significantly using the builtin mktime and strftime.
    Commented Oct 17, 2019 at 1:28
  • Your script works here, I can't reproduce the problem. But you could remove the loop and tmpfile and use xargs -n1 to pass one argument after another to date. A more compact version would be sed -re 's/^([0-9]{8})([0-9]{2})([0-9]{2})([0-9]{2}).*/\1\\ \2:\3:\4/' Firstfile | xargs -n1 date +@%s -d | xargs -n1 date -u +"%FT%T" -d >> ResultFile
    – Freddy
    Commented Oct 17, 2019 at 1:51
  • I appreciate your response. How do I get the rest of the content? I want my output in this format 2019-10-14T20:16:52.042 nmd
    – Paul
    Commented Oct 17, 2019 at 4:52

2 Answers

Your issue is that you are passing too much data to date using xargs. Also, you don't pass the extra text string at the end to make it part of the output.
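
As a minimal illustration of how an "extra operand" error can arise, the sketch below shows how xargs splits one input line into arguments. It assumes GNU xargs (for the -d option) and uses printf in place of date so that the argument boundaries are visible; the variable names are only for illustration.

```shell
#!/usr/bin/env bash
# Show how xargs splits its input into arguments (GNU xargs assumed for -d).
line='20191014 16:16:53'

# Unescaped space: xargs passes two arguments, so a command such as
# "date -d" would receive the time as an extra operand.
split=$(printf '%s\n' "$line" | xargs printf '[%s]')

# Newline as the only delimiter: the whole line stays a single argument.
whole=$(printf '%s\n' "$line" | xargs -d '\n' printf '[%s]')

printf '%s\n%s\n' "$split" "$whole"
```

Each bracketed group in the output is one argument as the invoked command would see it.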

It would be better to do the whole operation in an awk script. Both GNU awk and mawk have functions for doing basic timestamp manipulations:

{
        YYYY    = substr($1, 1, 4)   # year
        mm      = substr($1, 5, 2)   # month
        dd      = substr($1, 7, 2)   # day
        HH      = substr($1, 9, 2)   # hour
        MM      = substr($1, 11, 2)  # minute
        SS      = substr($1, 13, 2)  # seconds
        sss     = substr($1, 15, 3)  # fractional seconds

        text    = substr($1, 18)     # the rest

        tm = mktime(sprintf("%s %s %s %s %s %s", YYYY, mm, dd, HH, MM, SS))
        printf("%s.%s %s\n", strftime("%Y-%m-%dT%H:%M:%S", tm, 1), sss, text)
}

This extracts the components of the input timestamp into separate variables using substr(). A Unix time is then calculated using mktime() (it's assumed that the input time is in the local time zone) and this is converted to a (UTC) timestamp string in the appropriate format using strftime().

Note that the fractional seconds (sss in the code) are never part of the time computations and are instead just transferred as is from the input to the output.

Running it:

$ awk -f script.awk file
2019-10-14T14:16:53.042 nmd
2019-10-14T14:16:53.052 egc
2019-10-04T06:19:01.490 egc
2019-10-04T06:19:01.493 nex
2019-10-04T06:19:01.497 nex
2019-10-04T06:19:02.531 nex
2019-10-04T06:19:02.534 ksd

See the documentation for mktime() and strftime() in your awk manual.
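
For comparison, the same steps (split the record, compute a Unix time from the local time, format it as UTC) can be sketched in plain bash without awk or xargs. This is only a sketch under some assumptions: GNU date is available, every record has the fixed layout shown in the question, and the input times are in the zone named by INPUT_TZ. The function name convert_record, the variable INPUT_TZ, and the default zone are hypothetical choices, not part of the original script.

```shell
#!/usr/bin/env bash
# convert_record (hypothetical helper): turn "YYYYmmddHHMMSSsss<text>"
# into "<UTC timestamp>.sss <text>". Assumes GNU date.
# INPUT_TZ (our name) is the zone of the input times; the default is an assumption.
convert_record() {
    local rec=$1 zone=${INPUT_TZ:-America/New_York}
    local day=${rec:0:8} hh=${rec:8:2} mm=${rec:10:2} ss=${rec:12:2}
    local msec=${rec:14:3} text=${rec:17}
    local epoch utc

    # Interpret the time in the input zone and get seconds since the epoch.
    epoch=$(TZ=$zone date -d "$day $hh:$mm:$ss" +%s) || return
    # Format that instant in UTC.
    utc=$(date -u -d "@$epoch" +%Y-%m-%dT%H:%M:%S) || return

    # The milliseconds and trailing text are copied through unchanged.
    printf '%s.%s %s\n' "$utc" "$msec" "$text"
}

# Example run on two records from the question.
while IFS= read -r rec; do
    convert_record "$rec"
done <<'EOF'
20191014161653042nmd
20191004081901490egc
EOF
```

This calls date twice per record, so it will be noticeably slower than the awk script on large files; its main appeal is that each record is handled as a single quoted string, so no word splitting can occur.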

This is a version using sed and bash:

 sed -re 's/^([0-9]{8})([0-9]{2})([0-9]{2})([0-9]{2})(([0-9]{3,}){0,1})([a-z].*)$/D="\1\ \2:\3:\4";M="\5";E="\7"/' /tmp/tmpfile  |
  xargs -d'\n' -I=:= bash -c  '(=:=;echo $(date -u +"%Y-%m-%dT%H:%M:%S.$M" -d "TZ=\"America/New_York\" $D" )  $E)'

I used a single sed step to avoid combining awk and sed, and I removed the extra backslash escaping from the second (replacement) part of the sed expression.

I use the -I option of xargs to specify where the argument is substituted, like {} in find.

To avoid calling date twice, I use a TZ="..." specification inside the -d argument of date.

Here I use America/New_York as the time zone of the input; you must change it to the right value.

  • I am trying to get this format. Your version excludes the characters other than the time. How do I get this format? I appreciate your response. 2019-10-14T20:16:52.042 nmd
    – Paul
    Commented Oct 17, 2019 at 4:54
