Q: Extracting words from a log file

Stack Overflow user

Asked on 2018-06-15 18:58:02

3 answers · 152 views · 0 followers · 0 votes

I am trying to extract job IDs from a log file, and I am having trouble extracting them in bash. I have tried using sed.

My log file looks like this:

> 2018-06-16 02:39:39,331 INFO  org.apache.flink.client.cli.CliFrontend 
> - Running 'list' command.
> 2018-06-16 02:39:39,641 INFO  org.apache.flink.runtime.rest.RestClient                      
> - Rest client endpoint started.
> 2018-06-16 02:39:39,741 INFO  org.apache.flink.client.cli.CliFrontend                       
> - Waiting for response...
>  Waiting for response...
> 2018-06-16 02:39:39,953 INFO  org.apache.flink.client.cli.CliFrontend                       
> - Successfully retrieved list of jobs
> ------------------ Running/Restarting Jobs -------------------
> 15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : jETTY HTTP Server -> servlet with content decompress -> pull from
> collections -> CSV to Avro encode -> Kafka publish (RUNNING)
> 16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : jETTY HTTP Server -> servlet with content decompress -> pull from
> collections -> CSV to Avro encode ->  Kafka publish (RUNNING)
> --------------------------------------------------------------
> 2018-06-16 02:39:39,956 INFO  org.apache.flink.runtime.rest.RestClient                      
> - Shutting down rest endpoint.
> 2018-06-16 02:39:39,957 INFO  org.apache.flink.runtime.rest.RestClient                      
> - Rest endpoint shutdown complete.

I use the following code to extract the lines that contain the job IDs:

extractRestResponse=`cat logFile.txt`
echo "extractRestResponse: "$extractRestResponse

w1="------------------ Running/Restarting Jobs -------------------"
w2="--------------------------------------------------------------"
extractRunningJobs="sed -e 's/.*'"$w1"'\(.*\)'"$w2"'.*/\1/' <<< $extractRestResponse"
runningJobs=`eval $extractRunningJobs`
echo "running jobs :"$runningJobs

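However, this gives me no result. I also noticed that when I print the extractRestResponse variable, all the newlines are lost.

I also tried this command, but it gave no result either: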

extractRestResponse="sed -n '/"$w1"/,/"$w2"/{//!p}' logFile.txt"

sed

bash

awk

3 Answers

Stack Overflow user

Accepted answer

Posted on 2018-06-15 19:31:26

Using sed:

sed -n '/^-* Running\/Restarting Jobs -*/,/^--*/{//!p;}' logFile.txt

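Explanation:

By default, sed echoes each input line to standard output after applying the commands. The -n flag suppresses this behavior.
/^-* Running\/Restarting Jobs -*/,/^--*/: matches the lines from ^-* Running\/Restarting Jobs -* through ^--* (inclusive).
//!p;: prints each line in that range except the lines that match the addresses themselves.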
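As a minimal sketch of how this range could be combined with a second step to get just the job IDs, the snippet below runs the same sed command on a shortened, hypothetical sample (the job names are abbreviated and the temp file stands in for logFile.txt). It assumes the real log has no leading "> " quote markers and that the ID is always the second " : "-separated field, as in the lines shown in the question.

```shell
# Hypothetical, shortened sample that mimics the job-list section of the log.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
2018-06-16 02:39:39,953 INFO  Successfully retrieved list of jobs
------------------ Running/Restarting Jobs -------------------
15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : job one (RUNNING)
16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : job two (RUNNING)
--------------------------------------------------------------
2018-06-16 02:39:39,956 INFO  Shutting down rest endpoint.
EOF

# Step 1: keep only the lines strictly between the two delimiter lines.
job_lines=$(sed -n '/^-* Running\/Restarting Jobs -*/,/^--*/{//!p;}' "$sample_log")

# Step 2: split each line on " : " and keep the second field (the job ID).
job_ids=$(printf '%s\n' "$job_lines" | awk -F' : ' '{print $2}')

printf '%s\n' "$job_ids"
rm -f "$sample_log"
```

Quoting "$job_lines" and "$job_ids" keeps one ID per line, so the result can be fed to a while-read loop.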

Votes: 1

Stack Overflow user

Posted on 2018-06-15 19:16:01

awk to the rescue!

awk '/^-+$/{f=0} f; /^-+ Running\/Restarting Jobs -+$/{f=1}' logfile
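The one-liner relies on a flag variable f and on the order of its three rules. A commented expansion of the same program, run on a hypothetical shortened sample supplied through a here-document, might look like this (the sample text and the variable name running_jobs are illustrative, not from the original post):

```shell
running_jobs=$(awk '
  # A line made only of dashes closes the section.
  /^-+$/ { f = 0 }
  # While the flag is set, print the current line (default action).
  f
  # The header opens the section; it is tested after the print rule,
  # so the header line itself is never printed.
  /^-+ Running\/Restarting Jobs -+$/ { f = 1 }
' <<'EOF'
2018-06-16 02:39:39,953 INFO  Successfully retrieved list of jobs
------------------ Running/Restarting Jobs -------------------
15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : job one (RUNNING)
16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : job two (RUNNING)
--------------------------------------------------------------
2018-06-16 02:39:39,956 INFO  Shutting down rest endpoint.
EOF
)

printf '%s\n' "$running_jobs"
```

Because both patterns are anchored with $, trailing whitespace on the delimiter lines of a real log would prevent a match; in that case the anchors would need to be loosened.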

Votes: 1

Stack Overflow user

Posted on 2018-06-15 19:55:11

You can improve your original substitution:

sed -e 's/.*'"$w1"'\(.*\)'"$w2"'.*/\1/' <<< $extractRestResponse

by using @ as the delimiter:

sed -n "s@.*$w1\(.*\)$w2.*@\1@p" <<< $extractRestResponse

The output is the text between $w1 and $w2.

> 15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : jETTY HTTP Server -> servlet with content decompress -> pull from > collections -> CSV to Avro encode -> Kafka publish (RUNNING) > 16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : jETTY HTTP Server -> servlet with content decompress -> pull from > collections -> CSV to Avro encode -> Kafka publish (RUNNING) >
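
The output above is a single line, which ties in with the question's remark that the newlines disappear when the variable is printed. In bash with the default IFS, an unquoted expansion passed to echo is word-split, so each newline ends up as a single space; quoting the expansion preserves them. A small sketch with made-up text (the variable names are illustrative):

```shell
# Two-line sample text; command substitution strips only the trailing newline.
text=$(printf 'first line\nsecond line\n')

# Unquoted: the shell splits on whitespace and echo joins the words with spaces.
unquoted=$(echo $text)

# Quoted: the value is passed as one argument, newlines intact.
quoted=$(echo "$text")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"
```

The line-oriented solutions in the other answers read logFile.txt directly, so they do not depend on how the variable is quoted.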