
Try with <a href="http://www.gnu.org/software/parallel/" rel="nofollow noreferrer">GNU parallel</a>, which includes <a href="https://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallel-grep" rel="nofollow noreferrer">an example of how to use it with <code>grep</code></a>:

<blockquote>
 <code>grep -r</code> greps recursively through directories. On multicore CPUs GNU
 <code>parallel</code> can often speed this up.

<pre><code>find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}
</code></pre>
 
 This will run 1.5 job per core, and give 1000 arguments to <code>grep</code>.
</blockquote>

For big files, it can split the input into several chunks with the <code>--pipe</code> and <code>--block</code> arguments:

<pre><code>parallel --pipe --block 2M grep foo &lt; bigfile
</code></pre>

You could also run it on several different machines through SSH (ssh-agent needed to avoid passwords):

<pre><code>parallel --pipe --sshlogin server.example.com,server2.example.net grep foo &lt; bigfile
</code></pre>
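Where GNU parallel isn't installed, a rough stand-in for the recursive search at the top of this answer is <code>find</code> piped into <code>xargs -P</code>. This is a minimal sketch that swaps in xargs rather than anything from the parallel manual. The scratch directory, the file names, and the search string <code>needle</code> are made up for illustration. NUL-delimited names (<code>-print0</code> / <code>-0</code>) keep file names with spaces intact; GNU and BSD find/xargs support them, though POSIX does not require them.

```shell
#!/bin/sh
# Build a tiny scratch tree so the sketch is self-contained (hypothetical data).
workdir=$(mktemp -d)
mkdir -p "$workdir/sub dir"
printf 'needle here\nnothing\n' > "$workdir/a.txt"
printf 'no match\n'             > "$workdir/b.txt"
printf 'another needle\n'       > "$workdir/sub dir/c.txt"

# Run up to one grep per online CPU, each handed a batch of up to 1000 files.
jobs=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)
matches=$(find "$workdir" -type f -print0 |
  xargs -0 -n 1000 -P "$jobs" grep -H -n -F 'needle')

printf '%s\n' "$matches"
rm -rf "$workdir"
```

Unlike <code>parallel -k</code>, xargs makes no promise about output order once more than one grep is running.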


If you're searching very large files, then setting your locale can really help.

GNU grep goes a lot faster in the C locale than with UTF-8.

<pre><code>export LC_ALL=C
</code></pre>
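The variable can also be set for a single command instead of being exported for the whole session. A small sketch, assuming a throwaway file and the search string <code>needle</code> (both invented here); prefix the grep with <code>time</code> on your own data to see whether the C locale helps in your case.

```shell
#!/bin/sh
# Hypothetical sample file; substitute your own large file.
sample=$(mktemp)
printf 'needle one\nhay\nneedle two\n' > "$sample"

# LC_ALL=C applies only to this grep invocation; the shell's locale is untouched.
count=$(LC_ALL=C grep -c -F 'needle' "$sample")
echo "matching lines: $count"

rm -f "$sample"
```

Keep in mind that in the C locale grep treats input as bytes, so <code>-i</code> and character classes won't handle non-ASCII text the way a UTF-8 locale would.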


Ripgrep claims to now be the fastest.

<a href="https://github.com/BurntSushi/ripgrep" rel="noreferrer">https://github.com/BurntSushi/ripgrep</a>

It also includes parallelism by default:

<pre><code> -j, --threads ARG
 The number of threads to use. Defaults to the number of logical CPUs (capped at 6). [default: 0]
</code></pre>

From the README

<blockquote>
 It is built on top of Rust's regex engine. Rust's regex engine uses
 finite automata, SIMD and aggressive literal optimizations to make
 searching very fast.
</blockquote>


Apparently using --mmap can help on some systems:

<a href="http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html" rel="noreferrer">http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html</a>


Not strictly a code improvement but something I found helpful after running grep on 2+ million files.

I moved the operation onto a cheap SSD drive (120GB). At about $100, it's an affordable option if you are crunching lots of files regularly.


If you don't care which file contains the string, you might want to separate reading and grepping into two jobs, since it can be costly to spawn <code>grep</code> many times, once for each small file.

<ol>
<li>If you've one very large file:

<code>parallel -j100% --pipepart --block 100M -a &lt;very large SEEKABLE file&gt; grep &lt;...&gt;</code></li>
<li>Many small compressed files (sorted by inode)

<code>ls -i | sort -n | cut -d' ' -f2 | fgrep \.gz | parallel -j80% --group "gzcat {}" | parallel -j50% --pipe --round-robin -u -N1000 grep &lt;..&gt;</code></li>
</ol>

I usually compress my files with lz4 for maximum throughput.

<ol start="3">
<li>If you want just the filename with the match:

<code>ls -i | sort -n | cut -d' ' -f2 | fgrep \.gz | parallel -j100% --group "gzcat {} | grep -lq &lt;..&gt; &amp;&amp; echo {}"</code></li>
</ol>


Building on the response by Sandro I looked at the reference he provided <a href="http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html" rel="nofollow">here</a> and played around with BSD grep vs. GNU grep. My quick benchmark results showed: GNU grep is way, way faster.

So my recommendation to the original question "fastest possible grep": Make sure you are using GNU grep rather than BSD grep (which is the default on MacOS for example).
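To check which grep is on your PATH, the version banner is usually enough. This is a rough sketch: the substrings matched below are assumptions about typical banners (GNU grep normally prints something like <code>grep (GNU grep) …</code>, BSD grep something like <code>grep (BSD grep, GNU compatible) …</code>), so treat an <code>unknown</code> result as "look at it yourself".

```shell
#!/bin/sh
# Classify the grep on PATH from the first line of its version banner.
# The patterns are assumptions about common banners, not a guaranteed format.
banner=$(grep --version 2>/dev/null | head -n 1)
case "$banner" in
  *BSD*)        flavor=bsd ;;      # checked first: BSD banners may also say "GNU compatible"
  *"GNU grep"*) flavor=gnu ;;
  *)            flavor=unknown ;;
esac
echo "grep flavor: $flavor ($banner)"
```

On macOS, package managers typically install GNU grep under a separate name such as <code>ggrep</code>, so it may be present even when plain <code>grep</code> is the BSD one.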


I personally use ag (The Silver Searcher) instead of grep, and it's way faster. You can also combine it with parallel and its <code>--pipe</code> and <code>--block</code> options.

<a href="https://github.com/ggreer/the_silver_searcher" rel="nofollow noreferrer">https://github.com/ggreer/the_silver_searcher</a>

Update: 
I now use <a href="https://github.com/BurntSushi/ripgrep" rel="nofollow noreferrer">https://github.com/BurntSushi/ripgrep</a> which is faster than ag depending on your use case.


One thing I've found faster when using grep to search a single big file (especially with changing patterns) is to use split + grep + xargs with its parallel flag. For instance:

Say you have a file of IDs you want to search for, called my_ids.txt, and a big file to search in, called bigfile.txt.

Use split to split the file into parts:

<pre><code># Use split to break the file into x pieces. Consider your big file's size
# and try to stay at or under 26 pieces so the names keep the simple
# xa[a-z] form. In my example bigfile.txt has 10 million rows.
split -l 1000000 bigfile.txt
# Produces output files named xa[a-j] (10 pieces of 1 million rows each)

# Now loop over the IDs and use xargs to launch parallel greps over the pieces.
# Reading with "while read" and quoting "$id" keeps IDs with spaces or glob
# characters intact; -e protects IDs that start with a dash.
while IFS= read -r id; do
  ls xa* | xargs -n 1 -P 20 grep -e "$id" &gt;&gt; matches.txt
done &lt; my_ids.txt
# Tune the number of parallel greps with -P; in my case I am being greedy.
# There's no point in allocating more greps than there are pieces.
</code></pre>

In my case this cut what would have been a 17-hour job down to 1 hour 20 minutes. I'm sure there's some sort of bell curve on efficiency here, and obviously going over the number of available cores won't do you any good, but for my requirements this was a much better solution than any of the comments above. It also has an added benefit over the parallel script in that it uses mostly native (Linux) tools.
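If the IDs are fixed strings, a variation on the loop above is to hand every ID to each grep at once with <code>-F -f</code>, so each piece is read once rather than once per ID. This is a swapped-in technique, not the exact method described above. The generated data, the <code>piece_</code> prefix, and the job count of 4 are assumptions chosen to make the sketch runnable.

```shell
#!/bin/sh
# Work in a scratch directory with synthetic data (1000 rows, 3 IDs).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
awk 'BEGIN { for (i = 1; i <= 1000; i++) printf "id%04d payload\n", i }' > bigfile.txt
printf 'id0007\nid0512\nid0999\n' > my_ids.txt

# Split into 4 pieces of 250 lines each: piece_aa .. piece_ad.
split -l 250 bigfile.txt piece_

# One grep per piece, up to 4 at a time; -F -f loads all IDs as fixed strings.
# xargs exits non-zero when any piece has no match, hence "|| true".
printf '%s\n' piece_* | xargs -n 1 -P 4 grep -F -f my_ids.txt > matches.txt || true

# Parallel greps finish in no fixed order, so sort before using the result.
result=$(sort matches.txt)
printf '%s\n' "$result"

cd / && rm -rf "$workdir"
```

With a long ID list this can matter more than the parallelism itself, since the number of passes over the data no longer grows with the number of IDs.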


cgrep, if it's available, can be orders of magnitude faster than grep.


MCE 1.508 includes a dual chunk-level {file, list} wrapper script supporting many C binaries; agrep, grep, egrep, fgrep, and tre-agrep.

<a href="https://metacpan.org/source/MARIOROY/MCE-1.509/bin/mce_grep" rel="nofollow">https://metacpan.org/source/MARIOROY/MCE-1.509/bin/mce_grep</a>

<a href="https://metacpan.org/release/MCE" rel="nofollow">https://metacpan.org/release/MCE</a>

One does not need to convert to lowercase when wanting -i to run fast. Simply pass --lang=C to mce_grep.

Output order is preserved. The -n and -b output is also correct. Unfortunately, that is not the case for GNU parallel mentioned on this page. I was really hoping for GNU Parallel to work here. In addition, mce_grep does not sub-shell (sh -c /path/to/grep) when calling the binary.

Another alternative is the MCE::Grep module included with MCE.


A slight deviation from the original topic: the indexed search command line utilities from the Google Code Search project are way faster than grep: <a href="https://github.com/google/codesearch" rel="nofollow noreferrer">https://github.com/google/codesearch</a>

Once you compile it (the <a href="https://golang.org/doc/install" rel="nofollow noreferrer">golang</a> package is needed), you can index a folder with:



<pre class="lang-bash prettyprint-override"><code># index current folder
cindex .
</code></pre>

The index will be created under <code>~/.csearchindex</code>

Now you can search:

<pre class="lang-bash prettyprint-override"><code># search folders previously indexed with cindex
csearch eggs
</code></pre>

I'm still piping the results through grep to get colorized matches.

I'd like to know if there is any tip to make <code>grep</code> as fast as possible. I have a rather large base of text files to search in the quickest possible way. I've made them all lowercase, so that I could get rid of <code>-i</code> option. This makes the search much faster.

Also, I've found out that <code>-F</code> and <code>-P</code> modes are quicker than the default one. I use the former when the search string is not a regular expression (just plain text), the latter if regex is involved.
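For reference, a minimal sketch of what <code>-F</code> changes, using two made-up sample lines: the pattern is taken as a fixed string, so regex metacharacters such as <code>.</code> lose their special meaning.

```shell
#!/bin/sh
# Two sample lines: only the first contains the literal text "a.b".
sample=$(printf 'a.b\naxb\n')

# Default (basic regex): "." matches any character, so both lines match.
regex_count=$(printf '%s\n' "$sample" | grep -c 'a.b')
# -F: the pattern is a fixed string, so only the literal "a.b" matches.
fixed_count=$(printf '%s\n' "$sample" | grep -c -F 'a.b')

echo "regex: $regex_count, fixed: $fixed_count"
```

This prints <code>regex: 2, fixed: 1</code>, so switching to <code>-F</code> is only safe when the search string really is meant literally.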

Does anyone have any experience in speeding up <code>grep</code>? Maybe compile it from scratch with some particular flag (I'm on Linux CentOS), organize the files in a certain fashion or maybe make the search parallel in some way?

Fastest possible grep

CentOS

Linux

MacOS
