
If you want to get the most common leading two digits per row, then you can use:
<pre><code>WITH data_rows(id, cpv_values) AS (
    VALUES (1, ARRAY ['45331110', '50721000', '45251250','42160000','39715000','45315000', '09323000','71321200','45331100', '50720000'])
         , (2, ARRAY ['50721000']) -- second test case
)
SELECT id, leading_two_digits
FROM data_rows
-- for every row in `data_rows` (your table),
-- select the most common `leading_two_digits` (through GROUP BY/ORDER BY/LIMIT 1)
JOIN LATERAL (
    SELECT left(code, 2) AS leading_two_digits
    FROM unnest(cpv_values) AS f(code)
    GROUP BY left(code, 2)
    ORDER BY COUNT(*) DESC
    LIMIT 1
) s ON true
</code></pre>
This returns:
<pre><code>+--+------------------+
|id|leading_two_digits|
+--+------------------+
|1 |45                |
|2 |50                |
+--+------------------+
</code></pre>
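Note that if two prefixes occur equally often within a row, <code>ORDER BY COUNT(*) DESC LIMIT 1</code> returns whichever of them happens to come first. A minimal sketch of a deterministic variant, assuming the smallest prefix should win ties (that tie-break rule is my assumption, not something the question specifies):
<pre><code>WITH data_rows(id, cpv_values) AS (
    -- sample row where 45 and 50 each occur twice
    VALUES (1, ARRAY ['45331110', '50721000', '45251250', '50720000'])
)
SELECT id, leading_two_digits
FROM data_rows
JOIN LATERAL (
    SELECT left(code, 2) AS leading_two_digits
    FROM unnest(cpv_values) AS f(code)
    GROUP BY left(code, 2)
    -- second sort key breaks ties between equally common prefixes
    ORDER BY COUNT(*) DESC, left(code, 2)
    LIMIT 1
) s ON true
</code></pre>
For this sample row the tie between 45 and 50 should resolve to 45.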
<hr />
If you want to get the most common leading two digits across all rows, you can use:
<pre><code>WITH data_rows(cpv_values) AS (
    VALUES (ARRAY ['45331110', '50721000', '45251250','42160000','39715000','45315000', '09323000','71321200','45331100', '50720000']),
           (ARRAY ['45'])
)
SELECT left(code, 2) AS leading_two_digits
FROM data_rows, unnest(cpv_values) AS f(code)
GROUP BY left(code, 2)
ORDER BY COUNT(*) DESC
LIMIT 1
</code></pre>


This query does what you need.
<pre class="lang-sql prettyprint-override"><code>select substr(t, 1, 2) mc
 from unnest(array['45331110', '50721000', '45251250', '42160000', '39715000', '45315000', '09323000', '71321200', '45331100', '50720000']) t 
 group by mc
 order by count(1) desc
 limit 1;
</code></pre>
Result:
<pre><code>Name|Value|
----|-----|
mc  |45   |
</code></pre>
You may use the query above as a subquery to extract the most common substring per row.
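As a rough sketch of that subquery form, assuming a table shaped like <code>data_rows(id, cpv_values)</code> (both names are placeholders for your own table and column, and the sample arrays are shortened):
<pre class="lang-sql prettyprint-override"><code>with data_rows(id, cpv_values) as (
    -- placeholder rows standing in for your table
    values (1, array['45331110', '50721000', '45251250', '45315000'])
         , (2, array['50721000'])
)
select d.id,
       -- correlated scalar subquery: evaluated once per row of data_rows
       (select substr(t, 1, 2)
          from unnest(d.cpv_values) t
         group by substr(t, 1, 2)
         order by count(1) desc
         limit 1) as mc
  from data_rows d
 order by d.id;
</code></pre>
With these sample rows it should give <code>45</code> for id 1 and <code>50</code> for id 2. A row whose array is empty or NULL yields NULL for <code>mc</code>.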

I need to find a way to determine the most common substring from within an array in PostgreSQL.
I've got a one-dimensional array column in PostgreSQL that stores CPV values (a nested classification vocabulary - <a href="https://simap.ted.europa.eu/cpv" rel="nofollow noreferrer">https://simap.ted.europa.eu/cpv</a>). The codes are made up of numeric characters, but they are stored as varchar because some records have a leading zero, like this:
<code>[&quot;45331110&quot;, &quot;50721000&quot;, &quot;45251250&quot;, &quot;42160000&quot;, &quot;39715000&quot;, &quot;45315000&quot;, &quot;09323000&quot;, &quot;71321200&quot;, &quot;45331100&quot;, &quot;50720000&quot;]</code>
I want to extract the most common leading two digits from this array using PostgreSQL, which in the example case would be <code>45</code>.

PostgreSQL - Finding the most common substring() in an array
