unix - Bash: Huge file size processing issue in vim mode


I have a huge file of 500 MB; each line holds data in the format shown below.

    # vim results.txt
    {"count": 8, "time_first": 1450801456, "record": "a", "domain": "api.ai.", "ip": "54.240.166.223", "time_last": 1458561052}
    {"count": 9, "time_first": 1450801456, "record": "a", "domain": "cnn.com.", "ip": "54.240.166.223", "time_last": 1458561052}
    .........

25 million lines in total.

Now, how do I keep the results.txt file in this form:

    8,1450801456,a,api.ai,54.240.166.223,1458561052
    9,1450801456,a,cnn.com,54.240.166.223,1458561052
    ....

by removing the unwanted strings count, time_first, record, domain, ip and time_last?

Right now, in vim, I'm removing each and every string by hand, for example: %s/{"count": //g

Even for a single string, the replacement takes a long time.

I'm a beginner in bash/shell. How can I do this using sed / awk? Any suggestions, please?

With sed:

    sed -E 's/[{ ]*"[^"]*": *|["}]//g' file
    #    ^    ^    ^         ^^---- remaining double quotes and closing bracket
    #    |    |    |         '----- OR
    #    |    |    '--------------- key enclosed between double quotes
    #    |    '-------------------- leading opening curly bracket and spaces
    #    '------------------------- use ERE (Extended Regular Expression) syntax
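Before running this on all 25 million lines, it may be worth trying it on a couple of lines. The sketch below is only an illustration: sample.txt and the function name strip_keys are made up, and it uses -E because the | alternation in the pattern needs extended regular expressions. The input domains end with a dot (api.ai.) while the desired output has none, so the sketch adds one extra expression that drops a dot sitting directly before a closing quote; this assumes such a dot only ever occurs in the domain field.

```shell
# Hypothetical two-line sample in the same format as results.txt
cat > sample.txt <<'EOF'
{"count": 8, "time_first": 1450801456, "record": "a", "domain": "api.ai.", "ip": "54.240.166.223", "time_last": 1458561052}
{"count": 9, "time_first": 1450801456, "record": "a", "domain": "cnn.com.", "ip": "54.240.166.223", "time_last": 1458561052}
EOF

# strip_keys is an illustrative helper name, not part of the original answer.
# 1st expression: drop a dot right before a closing quote (trailing dot of the domain)
# 2nd expression: remove the keys, then the leftover quotes and closing brace
strip_keys() {
    sed -E -e 's/\."/"/g' -e 's/[{ ]*"[^"]*": *|["}]//g' "$1"
}

strip_keys sample.txt
```

On the real file the output would normally be redirected to a new file, for example strip_keys results.txt > results.csv (results.csv is an arbitrary name), instead of opening the 500 MB file in an editor.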

Another way: using xidel, which includes a JSON parser:

    xidel -q file -e '$json/*' | sed 'N;N;N;N;N;y/\n/,/'
    #     ^           ^     ^         ^         ^---- translate newlines to commas
    #     |           |     |         '-------------- append the next 5 lines
    #     |           |     '------------------------ the values
    #     |           '------------------------------ for each JSON string
    #     '------------------------------------------ quiet mode
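The sed stage of that pipeline can be tried on its own, without xidel installed. The sketch below assumes, as the answer implies, that xidel prints each value on its own line; it fakes that output with printf. The function name join_six is made up for illustration.

```shell
# join_six is an illustrative name for the sed stage of the pipeline above:
# N appends the next input line to the pattern space (five times, giving six lines),
# then y/\n/,/ turns the embedded newlines into commas.
join_six() {
    sed 'N;N;N;N;N;y/\n/,/'
}

# Simulated xidel output: six values per record, one value per line
printf '%s\n' 8 1450801456 a api.ai. 54.240.166.223 1458561052 \
              9 1450801456 a cnn.com. 54.240.166.223 1458561052 | join_six
```

This only lines up when every record has exactly six values; a record with a missing key would shift all the lines after it.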

A shorter way, from @benibela, that doesn't need sed to join the fields together:

    xidel -q file -e '$json/join(*,",")'
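Since the question also asks about awk, here is a rough sketch of one possible awk approach; it is not from the answers above. It assumes every line has the six keys in the fixed order shown in the question and that no value contains a colon or a comma. The names sample_awk.txt and to_csv are made up for illustration.

```shell
# Hypothetical one-line sample in the same format as results.txt
cat > sample_awk.txt <<'EOF'
{"count": 8, "time_first": 1450801456, "record": "a", "domain": "api.ai.", "ip": "54.240.166.223", "time_last": 1458561052}
EOF

# to_csv is an illustrative helper name.
# The field separator is ':' or ',' followed by optional spaces, so keys land in
# the odd-numbered fields and values in the even-numbered ones.
to_csv() {
    awk -F'[:,] *' '{
        out = ""
        for (i = 2; i <= NF; i += 2) {
            v = $i
            gsub(/["}]/, "", v)          # drop quotes and the closing brace
            sub(/\.$/, "", v)            # drop a trailing dot (domain field)
            sep = (i > 2) ? "," : ""
            out = out sep v
        }
        print out
    }' "$1"
}

to_csv sample_awk.txt
```

Because it relies on field positions rather than parsing JSON, this breaks if a value ever contains ":" or ","; the xidel variants are the safer choice for data that is less regular.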
