unix - Bash: Huge file processing issue in vim
I have a huge file (500 MB); each line holds data like the samples below.

#vim results.txt
{"count": 8, "time_first": 1450801456, "record": "a", "domain": "api.ai.", "ip": "54.240.166.223", "time_last": 1458561052}
{"count": 9, "time_first": 1450801456, "record": "a", "domain": "cnn.com.", "ip": "54.240.166.223", "time_last": 1458561052}
.........
25 million lines in total.
Now, keeping results.txt, I want each line reduced to just the values:

8,1450801456,a,api.ai,54.240.166.223,1458561052
9,1450801456,a,cnn.com,54.240.166.223,1458561052
....

i.e. with the unwanted strings count, time_first, record, domain, ip, time_last removed.
Right now, in vim, I am removing the strings one at a time, for example %s/{"count": //g, and each substitution takes a long time on a file this size.
I am a beginner with bash/shell. How can I do this with sed or awk? Suggestions, please?
With sed:
sed -E 's/[{ ]*"[^"]*": *|["}]//g' file
#   ^      ^    ^         ^^---- remaining double quotes and the closing bracket
#   |      |    |         '----- or
#   |      |    '--------------- key enclosed in double quotes
#   |      '-------------------- leading opening curly bracket and spaces
#   '--------------------------- use ERE (extended regular expression) syntax

Another way: use xidel, which includes a JSON parser:
xidel -q file -e '$json/*' | sed 'N;N;N;N;N;y/\n/,/'
#     ^           ^     ^         ^         ^---- translate newlines to commas
#     |           |     |         '-------------- append the next 5 lines
#     |           |     '------------------------ the values
#     |           '------------------------------ of each JSON string
#     '------------------------------------------ quiet mode

A shorter way, from @BeniBela, which doesn't need sed to join the fields:
xidel -q file -e '$json/join(*,",")'
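Since the question also asks about awk, here is a quick sanity check of the sed answer on one sample record, plus a minimal awk sketch. The awk version is an assumption: it splits on JSON punctuation and picks fixed field positions ($3, $5, ...), so it only works if every line has the same six keys in the same order, as in the samples shown.

```shell
# One sample record in the format shown in the question
sample='{"count": 8, "time_first": 1450801456, "record": "a", "domain": "api.ai.", "ip": "54.240.166.223", "time_last": 1458561052}'

# The sed answer: strip `"key": ` prefixes, then leftover quotes and braces
printf '%s\n' "$sample" | sed -E 's/[{ ]*"[^"]*": *|["}]//g'
# -> 8,1450801456,a,api.ai.,54.240.166.223,1458561052

# An awk alternative (assumed, not from the answer): treat quotes, colons,
# commas, braces and spaces as field separators and print the value fields.
# Brittle if the keys ever change order or a value contains those characters.
printf '%s\n' "$sample" | awk -F'[":,{} ]+' '{print $3","$5","$7","$9","$11","$13}'
# -> 8,1450801456,a,api.ai.,54.240.166.223,1458561052
```

On the real 500 MB file you would replace the printf with the file name, e.g. `sed -E '...' results.txt > results.csv`.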