Problem
We have the following text file.
$ cat ads_log_small.csv ADS_ID,DEVICE_OS,NUM_IMPRESSION,NUM_CONVERSION 32, Android, 1, 0 32, Android, 1, 0 32, Android, 1, 0 32, Android, 2, 0 32, Android, 1, 0 32, Android, 2, 0 32, Android, 1, 0 32, Android, 3, 0 32, Android, 2, 0
If we load records in the text file into a database table, the original order of records in the file is lost. In the post SAS data set vs. relational database table, we mentioned that,Unlike SAS data set, a database table is a set where records no order (unless we explicitly sort them by keys). If we want to preserve the original sequence of the record, how do we add a line number to the original file?
Solution
We can use Linux (or Cygwin on Windows) command awk to add a line number to the file.
$ cat ads_log_small.csv | awk '{if (NR==1) print "LINE_NUMBER" ,",",$0; else print NR-1,",",$0}' LINE_NUMBER , ADS_ID,DEVICE_OS,NUM_IMPRESSION,NUM_CONVERSION 1 , 32, Android, 1, 0 2 , 32, Android, 1, 0 3 , 32, Android, 1, 0 4 , 32, Android, 2, 0 5 , 32, Android, 1, 0 6 , 32, Android, 2, 0 7 , 32, Android, 1, 0 8 , 32, Android, 3, 0 9 , 32, Android, 2, 0We can write the output to a new file.
$ cat ads_log_small.csv | awk '{if (NR==1) print "LINE_NUMBER" ,",",$0; else print NR-1,",",$0}' > ads_log_new.csv
No comments:
Post a Comment