unix - Bash: Huge file size processing issue in vim mode

- February 15, 2014

I have a huge file, about 500 MB in size. Each line holds data like the sample below.

# vim results.txt
{"count": 8, "time_first": 1450801456, "record": "a", "domain": "api.ai.", "ip": "54.240.166.223", "time_last": 1458561052}
{"count": 9, "time_first": 1450801456, "record": "a", "domain": "cnn.com.", "ip": "54.240.166.223", "time_last": 1458561052}
.........

25 million lines in total.

Now I want to keep the results.txt file, but with each line reduced to its values:

8,1450801456,a,api.ai,54.240.166.223,1458561052
9,1450801456,a,cnn.com,54.240.166.223,1458561052
....

That is, by removing the unwanted strings count, time_first, record, domain, ip and time_last.

Right now, in vim, I'm removing each string one at a time, for example: %s/{"count": //g

Even for a single string, the replacement takes a long time.

I'm a beginner in bash/shell. How can I do this using sed or awk? Any suggestions, please?

With sed:

sed -E 's/[{ ]*"[^"]*": *|["}]//g' file
#    ^    ^    ^         ^^---- remaining double quotes and closing bracket
#    |    |    |         '----- OR
#    |    |    '--------------- key enclosed between double quotes
#    |    '-------------------- leading opening curly bracket and spaces
#    '------------------------- use ERE (Extended Regular Expression) syntax
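To see what the substitution does, here is a small sketch that runs it on the two sample records from the question instead of the real file. The second -e expression is not part of the answer: it is an assumption added here because the desired output shows the domain without its trailing dot (api.ai rather than api.ai.). The variable names sample and result are only for illustration.

```shell
# Two sample records in the format from the question (stand-in for results.txt).
sample='{"count": 8, "time_first": 1450801456, "record": "a", "domain": "api.ai.", "ip": "54.240.166.223", "time_last": 1458561052}
{"count": 9, "time_first": 1450801456, "record": "a", "domain": "cnn.com.", "ip": "54.240.166.223", "time_last": 1458561052}'

# First expression: the substitution from the answer (strips keys, quotes, braces).
# Second expression (an addition, not from the answer): drop a dot that sits
# directly before a comma, which in this data is the trailing dot of the domain.
result=$(printf '%s\n' "$sample" |
  sed -E -e 's/[{ ]*"[^"]*": *|["}]//g' -e 's/\.,/,/')

printf '%s\n' "$result"
```

On the real file, the same expressions could be pointed at results.txt with the output redirected to a new file, which keeps the original intact while you check the result.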

Another way: use xidel, which includes a JSON parser:

xidel -q file -e '$json/*' | sed 'N;N;N;N;N;y/\n/,/'
#     ^           ^     ^         ^         ^---- translate newlines to commas
#     |           |     |         '-------------- append the next 5 lines
#     |           |     '------------------------ all values
#     |           '------------------------------ for each json string
#     '------------------------------------------ quiet mode

A shorter way, suggested by @benibela, that doesn't need sed to join the fields together:

xidel -q file -e '$json/join(*,",")'
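Since the question also asks about awk, here is a rough awk sketch of the same text-based idea as the sed answer. It is not from the original answers. It assumes every record sits on one line and that no value contains a double quote, a closing brace, or a quoted string followed by a colon. As in the desired output, the trailing dot of the domain is trimmed; the names sample and awk_result are only for illustration.

```shell
# Two sample records in the format from the question (stand-in for results.txt).
sample='{"count": 8, "time_first": 1450801456, "record": "a", "domain": "api.ai.", "ip": "54.240.166.223", "time_last": 1458561052}
{"count": 9, "time_first": 1450801456, "record": "a", "domain": "cnn.com.", "ip": "54.240.166.223", "time_last": 1458561052}'

awk_result=$(printf '%s\n' "$sample" | awk '{
  gsub(/[{ ]*"[^"]*": */, "")   # drop each key with its colon and nearby spaces
  gsub(/["}]/, "")              # drop leftover quotes and the closing brace
  sub(/\.,/, ",")               # assumption: trim the trailing dot of the domain
  print
}')

printf '%s\n' "$awk_result"
```

Like sed, awk streams the file line by line, so it does not need to load all 25 million lines into an editor buffer the way vim does.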
