I have data in which every record starts with <SUBBEGIN and ends with <SUBEND. Sample content:
<SUBBEGIN
SUBSCRIBERIDENTIFIER=803838478;
PAIDTYPE=0;
SUBSCRIPTION=TOOMUCH&73337E0380B4B30F&1&AAA&BBB&CCC&1&1&FFFFFFFFFFFFFFFF&255&1&255&256&FFFFFFFFFFFFFFFF&0&0&128&1&255&255&FFFFFFFFFFFFFF&FFFFFFFFFFFFFF&0&0&0&1&0&0&1;
SUBSCRIPTION=TASKS&E7CC601262AB3535&1&DDD&EEE&FFF&2&1&FFFFFFFFFFFFFFFF&255&0&255&256&FFFFFFFFFFFFFFFF&0&0&128&1&255&255&FFFFFFFFFFFFFF&FFFFFFFFFFFFFF&0&21&0&1&0&0&1;
<SUBEND
<SUBBEGIN
SUBSCRIBERIDENTIFIER=705959905;
PAIDTYPE=254;
SUBSCRIPTION=REALLY&73337E0380B4B30F&1&GGG&HHH&LLL&1&1&FFFFFFFFFFFFFFFF&255&1&255&256&FFFFFFFFFFFFFFFF&0&0&128&1&255&255&FFFFFFFFFFFFFF&FFFFFFFFFFFFFF&0&0&0&1&0&0&1;
SUBSCRIPTION=TIRED&E7CC601262AB3535&1&MMM&NNN&PPP&2&1&FFFFFFFFFFFFFFFF&255&0&255&256&FFFFFFFFFFFFFFFF&0&0&128&1&255&255&FFFFFFFFFFFFFF&FFFFFFFFFFFFFF&0&21&0&1&0&0&1;
<SUBEND
I plan to produce a horizontal (flattened) version that uses only some of the fields. The planned header of the result is:
SUBSCRIBERIDENTIFIER,,,PAIDTYPE,,1,255,SERVICENAME,SUBSCRIBEDATETIME,VALIDFROMDATETIME,EXPIREDDATETIME,,,,,
Based on the sample data, the fields map as follows:
SUBSCRIBERIDENTIFIER: 803838478 (the value of the SUBSCRIBERIDENTIFIER line)
PAIDTYPE: 0 (the value of the PAIDTYPE line)
SERVICENAME: TOOMUCH (1st &-separated field of a SUBSCRIPTION line)
SUBSCRIBEDATETIME: AAA (4th &-separated field of a SUBSCRIPTION line)
VALIDFROMDATETIME: BBB (5th &-separated field of a SUBSCRIPTION line)
EXPIREDDATETIME: CCC (6th &-separated field of a SUBSCRIPTION line)
So the expected result, with one output line per SUBSCRIPTION line, is:
803838478,,,0,,1,255,TOOMUCH,AAA,BBB,CCC,,,,,
803838478,,,0,,1,255,TASKS,DDD,EEE,FFF,,,,,
705959905,,,254,,1,255,REALLY,GGG,HHH,LLL,,,,,
705959905,,,254,,1,255,TIRED,MMM,NNN,PPP,,,,,
I tried this script:
awk -F"&" '/^<SUBBEGIN$/{a=1} a && /^[[:blank:]]+(SUBSCRIBERIDENTIFIER|PAIDTYPE|SUBSCRIPTION)/{l=l OFS $1} a && /^<SUBEND$/ {print l; a=l=""}' sample.txt
But the result is not what I expected:
SUBSCRIBERIDENTIFIER=803838478; PAIDTYPE=0; SUBSCRIPTION=TOOMUCH SUBSCRIPTION=TASKS
SUBSCRIBERIDENTIFIER=705959905; PAIDTYPE=254; SUBSCRIPTION=REALLY SUBSCRIPTION=TIRED
I would appreciate your suggestions. Thank you.
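
For reference, the attempt above prints one joined line per block, and with -F"&" its $1 is only the text before the first &, so the date fields are never reached. One possible direction is sketched below; it is a minimal sketch, not a definitive solution. The function name `flatten_subs` is hypothetical. It assumes that SUBSCRIBERIDENTIFIER and PAIDTYPE always appear before the SUBSCRIPTION lines of the same block, that lines may or may not be indented, and that the `1` and `255` columns are literal constants, as they are written in the planned header.

```shell
# Hypothetical helper: prints one CSV line per SUBSCRIPTION line.
flatten_subs() {
    awk '
    {
        sub(/^[ \t]+/, "")       # drop optional leading indentation
        sub(/;[ \t\r]*$/, "")    # drop the trailing ";" (and any CR)
    }
    /^<SUBBEGIN/ { id = paid = ""; next }   # new record: reset state
    /^SUBSCRIBERIDENTIFIER=/ { id   = substr($0, index($0, "=") + 1); next }
    /^PAIDTYPE=/             { paid = substr($0, index($0, "=") + 1); next }
    /^SUBSCRIPTION=/ {
        # Split the value on "&": f[1] is the service name,
        # f[4], f[5], f[6] are the three date-time fields.
        split(substr($0, index($0, "=") + 1), f, "&")
        # "1" and "255" are assumed to be literal constants.
        print id ",,," paid ",,1,255," f[1] "," f[4] "," f[5] "," f[6] ",,,,,"
    }
    ' "$@"
}

# Usage (reads the named files, or standard input if none are given):
#   flatten_subs sample.txt
```

Keeping the identifier and paid type in variables and printing inside the SUBSCRIPTION rule is what yields one output line per subscription rather than one per block.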