我有一个表项名称,如下所示:
Microsoft Word
Adobe Premiere
Paint
Mozila Firefox
Adobe Photoshop CS7
Windows Movie Maker
我想选择如下数据(表产品、列名):
Microsoft
Word
Microsoft Word
Adobe
PremiereF
Adobe Premier
Paint
Mozila firefox
Adobe
Photoshop
CS7
Adobe Photoshop
Photoshop CS7
Windows
Movie
Maker
我正在使用Postgres...有没有可能这样做呢?
发布于 2018-09-10 22:26:31
我真的不清楚你的预期结果是什么。
对于Adobe Photoshop CS7
,您的结果是:
Adobe
Photoshop
CS7
Adobe Photoshop
Photoshop CS7
那么原始字符串Adobe Photoshop CS7
呢?对于解决方案,我期待你想要所有的子短语在正确的顺序。因此,解决方案应该包括Adobe Photoshop CS7
结果。这是由您的其他结果指示的,其中包括原始字符串。
(1)第一步:从头开始获取所有子短语:
String: A B C D E
A
A B
A B C
A B C D
A B C D E
查询
WITH single_words AS (
SELECT *, row_number() OVER (PARTITION BY id) AS nth_word FROM ( -- B
SELECT id, regexp_split_to_table(phrase, '\s') as word FROM phrases -- A
)s
)
SELECT
array_agg(word) OVER (PARTITION BY id ORDER BY nth_word) as phrase_part -- C
FROM single_words;
答:WITH
查询使查询更简单,只需编写同一个子查询一次(在(2)中使用)。regexp_split_to_table
函数在空格处拆分字符串,并将每个单词放入一行中。
B:窗口函数row_number
向单词添加一个计数器,该计数器指示原始字符串(https://www.postgresql.org/docs/current/static/tutorial-window.html)中的起始位置。
C:窗口函数array_agg() OVER (... ORDER BY nth_word)
将单词聚合到一个列表中。ORDER BY
用于获取由原始单词位置指示的升序单词列表(如果没有ORDER BY
,array_agg
将添加短语中的所有单词,从而获得所有word
行的原始字符串)
(2)第二步:从所有起始点获取所有子短语:
String: A B C D E
A
B
C
D
E
A B
B C
C D
D E
A B C
B C D
C D E
A B C D
B C D E
A B C D E
查询
WITH single_words AS ( -- A
SELECT *, row_number() OVER (PARTITION BY id) AS nth_word FROM (
SELECT id, regexp_split_to_table(phrase, '\s') as word FROM phrases
)s
)
SELECT
*,
array_agg(b.word) OVER (PARTITION BY a.id, a.nth_word ORDER BY a.id, a.nth_word, b.nth_word) as phrase_part -- C
FROM single_words a -- B
JOIN single_words b
ON (a.id = b.id AND a.nth_word <= b.nth_word)
A:同(1)
B:将这些短语与它们自己交叉连接;更好地说:将同一短语的每个后面的单词连接起来
C:这个窗口函数将短语单词聚合到给定的结果。
如果您不喜欢该数组,可以使用函数array_to_string(phrase_part, ' ')
将结果转换为字符串
发布于 2017-11-02 01:11:28
您可以使用regexp_split_to_array
CREATE TABLE s(c TEXT);
INSERT INTO s(c) VALUES('Microsoft Word'), ('Adobe Premiere');
SELECT unnest(regexp_split_to_array(s.c, '\s+'))
FROM s
UNION ALL
SELECT c
FROM s;
编辑:
要获得您可以使用的每种组合:
WITH src AS (
SELECT id,name, rn::int, (MAX(rn) OVER(PARTITION BY id))::int AS m_rn
FROM s,
unnest(regexp_split_to_array(s.c, '\s+')) WITH ORDINALITY AS sub(name,rn)
)
SELECT id, string_agg(b.Name ,' ' ORDER BY rn) AS combination
FROM (SELECT p.id, p.Name, p.rn, RIGHT(o.n::bit(16)::text, m_rn) AS bitmap
FROM src AS p
CROSS JOIN generate_series(1, 100000) AS o(n)
WHERE o.n < 2 ^ m_rn) b
WHERE SUBSTRING(b.bitmap, b.rn, 1) = '1'
GROUP BY b.id, b.bitmap
ORDER BY id, b.bitmap;
https://stackoverflow.com/questions/47060138
复制相似问题