I have an array of string values that sometimes form a repeating pattern of values ('a', 'b', 'c', 'd').
$array = array(
'a', 'b', 'c', 'd',
'a', 'b', 'c', 'd',
'c', 'd',
);
I want to find the repeated patterns based on the array order and group them in that same order (to preserve it).
$patterns = array(
array('number' => 2, 'values' => array('a', 'b', 'c', 'd')),
array('number' => 1, 'values' => array('c')),
array('number' => 1, 'values' => array('d'))
);
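Note that [a,b], [b,c], and [c,d] are not patterns by themselves because they sit inside the larger [a,b,c,d] pattern. The last [c,d] set appears only once, so it is not a pattern either; it is just the individual values 'c' and 'd'.
Another example: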
$array = array(
'x', 'x', 'y', 'x', 'b', 'x', 'b', 'a'
//[.......] [.] [[......] [......]] [.]
);
This would produce:
$patterns = array(
array('number' => 2, 'values' => array('x')),
array('number' => 1, 'values' => array('y')),
array('number' => 2, 'values' => array('x', 'b')),
array('number' => 1, 'values' => array('a'))
);
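How can I do this?
Answered on 2016-01-26 05:36:28
An array of characters is just a string, and regular expressions are the king of string pattern matching. This relies on every value being a single character. Add recursion and the solution is quite elegant, even with the conversion back and forth from character arrays: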
function findPattern($str){
$results = array();
if(is_array($str)){
$str = implode($str);
}
if(strlen($str) == 0){ //reached the end
return $results;
}
if(preg_match_all('/^(.+)\1+(.*?)$/',$str,$matches)){ //pattern found
$results[] = array('number' => (strlen($str) - strlen($matches[2][0])) / strlen($matches[1][0]), 'values' => str_split($matches[1][0]));
return array_merge($results,findPattern($matches[2][0]));
}
//no pattern found
$results[] = array('number' => 1, 'values' => array(substr($str, 0, 1)));
return array_merge($results,findPattern(substr($str, 1)));
}
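To make the regular expression above easier to follow, here is a minimal sketch of a single matching step. The helper name `leadingRepeat` is made up for illustration and is not part of the answer; it applies the same pattern with `preg_match` and reports the repeated unit, the repeat count, and the unmatched remainder. Like the answer's code, it assumes every value is a single one-byte character, since `implode` and `str_split` cannot tell 'ab' apart from 'a', 'b'.

```php
<?php
// Hypothetical helper: one step of the regex approach.
// Returns null when the string does not start with a repeated unit.
function leadingRepeat(string $str): ?array
{
    // (.+)   greedy: the longest unit that is immediately repeated
    // \1+    one or more further copies of that unit
    // (.*?)  whatever is left over after the repeats
    if (!preg_match('/^(.+)\1+(.*?)$/', $str, $m)) {
        return null;
    }
    $rest = $m[2] ?? ''; // empty when the repeats reach the end of the string
    $unitLength = strlen($m[1]);
    $repeatedLength = strlen($str) - strlen($rest);
    return array(
        'number' => intdiv($repeatedLength, $unitLength),
        'values' => str_split($m[1]),
        'rest'   => $rest,
    );
}

var_dump(leadingRepeat('abcdabcdcd')); // number 2, values a b c d, rest "cd"
var_dump(leadingRepeat('cd'));         // NULL: nothing repeats at the start
```

The recursive function in the answer repeats this step on the remainder, and falls back to emitting the first character as a group of one when the step returns no match.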
Answered on 2014-01-23 07:19:56
If c and d can be grouped together, here is my code:
<?php
$array = array(
'a', 'b', 'c', 'd',
'a', 'b', 'c', 'd',
'c', 'd',
);
$res = array();
foreach ($array AS $value) {
if (!isset($res[$value])) {
$res[$value] = 0;
}
$res[$value]++;
}
foreach ($res AS $key => $value) {
$fArray[$value][] = $key;
for ($i = $value - 1; $i > 0; $i--) {
$fArray[$i][] = $key;
}
}
$res = array();
foreach($fArray AS $key => $value) {
if (!isset($res[serialize($value)])) {
$res[serialize($value)] = 0;
}
$res[serialize($value)]++;
}
$fArray = array();
foreach($res AS $key => $value) {
$fArray[] = array('number' => $value, 'values' => unserialize($key));
}
echo '<pre>';
var_dump($fArray);
echo '</pre>';
The final result is:
array (size=2)
0 =>
array (size=2)
'number' => int 2
'values' =>
array (size=4)
0 => string 'a' (length=1)
1 => string 'b' (length=1)
2 => string 'c' (length=1)
3 => string 'd' (length=1)
1 =>
array (size=2)
'number' => int 1
'values' =>
array (size=2)
0 => string 'c' (length=1)
1 => string 'd' (length=1)
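The first loop in the code above is a frequency count, which PHP's built-in `array_count_values` also produces. The sketch below runs it on the question's array and on a made-up rearrangement of the same values (`$shuffled` is an illustration, not from the question). It suggests that this approach works from how often each value occurs rather than from which values sit next to each other.

```php
<?php
$array = array(
    'a', 'b', 'c', 'd',
    'a', 'b', 'c', 'd',
    'c', 'd',
);
// Same values, but with no repeating a-b-c-d sequence at all.
$shuffled = array('a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd');

$counts = array_count_values($array);
print_r($counts); // a => 2, b => 2, c => 3, d => 3

// The counts are identical, so a count-based grouping cannot tell the two apart.
var_dump($counts === array_count_values($shuffled)); // bool(true)
```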
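Answered on 2016-01-25 08:09:57
OK, here is my take. The code below splits the whole original array into the longest adjacent non-overlapping chunks.
So in a situation like this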
'a', 'b', 'a', 'b', 'a', 'b', 'a', 'b', 'c', 'd'
[ ] [ ] [ ] [ ] <-- use 2 long groups
[ ] [ ] [ ] [ ] [ ] [ ] <-- and not 4 short
it will prefer 2 long groups over 4 shorter ones.
Update: I also tested it with the examples from another answer, and it works for these cases too:
one, two, one, two, one, two, one, two
[one two one two], [one two one two]
'one' 'two' 'one' 'two' 'three' 'four' 'one' 'two' 'three' 'four'
['one'] ['two'] ['one' 'two' 'three' 'four'] ['one' 'two' 'three' 'four']
Here is the code, followed by the tests:
<?php
/*
* Splits an $array into chunks of $chunk_size.
* Returns number of repeats, start index and chunk which has
* max number of adjacent repeats.
*/
function getRepeatCount($array, $chunk_size) {
$parts = array_chunk($array, $chunk_size);
$maxRepeats = 1;
$maxIdx = 0;
$repeats = 1;
$len = count($parts);
for ($i = 0; $i < $len-1; $i++) {
if ($parts[$i] === $parts[$i+1]) {
$repeats += 1;
if ($repeats > $maxRepeats) {
$maxRepeats = $repeats;
$maxIdx = $i - ($repeats-2);
}
} else {
$repeats = 1;
}
}
return array($maxRepeats, $maxIdx*$chunk_size, $parts[$maxIdx]);
}
/*
* Finds longest pattern in the $array.
* Returns number of repeats, start index and pattern itself.
*/
function findLongestPattern($array) {
$len = count($array);
// Cast to int so array_chunk() receives an integer chunk size.
for ($window = (int) floor($len/2); $window >= 1; $window--) {
// Offsets 0..$window-1 cover every possible chunk alignment.
for ($i = 0; $i < $window; $i++) {
list($repeats, $idx, $pattern) = getRepeatCount(
array_slice($array, $i), $window
);
if ($repeats > 1) {
return array($repeats, $idx+$i, $pattern);
}
}
}
return array(1, 0, [$array[0]]);
}
/*
* Splits $array into longest adjacent non-overlapping parts.
*/
function splitToPatterns($array) {
if (count($array) < 1) {
return $array;
}
list($repeats, $start, $pattern) = findLongestPattern($array);
$end = $start + count($pattern) * $repeats;
return array_merge(
splitToPatterns(array_slice($array, 0, $start)),
array(
array('number'=>$repeats, 'values' => $pattern)
),
splitToPatterns(array_slice($array, $end))
);
}
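The core step of `getRepeatCount` is cutting the array into fixed-size chunks and comparing neighbouring chunks with `===`. Here is a minimal sketch of that step on its own. The helper name `countLeadingRepeats` is made up for illustration, it only looks at the start of the array, and it assumes a non-empty array and a chunk size of at least 1.

```php
<?php
// Hypothetical helper: how many times does the first chunk of $size
// elements repeat back to back at the start of $values?
function countLeadingRepeats(array $values, int $size): int
{
    $chunks = array_chunk($values, $size);
    $repeats = 1;
    // === on arrays requires the same keys, values, order and types,
    // so a shorter trailing chunk never equals a full one.
    for ($i = 1; $i < count($chunks) && $chunks[$i] === $chunks[0]; $i++) {
        $repeats++;
    }
    return $repeats;
}

print_r(array_chunk(array('a', 'b', 'a', 'b', 'c'), 2)); // [a,b], [a,b], [c]
echo countLeadingRepeats(array('a', 'b', 'a', 'b', 'c'), 2), PHP_EOL; // 2
echo countLeadingRepeats(array('a', 'b', 'a', 'b', 'c'), 3), PHP_EOL; // 1
```

The answer's version goes further: it tracks the longest run anywhere in the chunk list, and `findLongestPattern` retries with shifted start offsets so that runs not aligned to index 0 are also found.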
Tests:
function isEquals($expected, $actual) {
$exp_str = json_encode($expected);
$act_str = json_encode($actual);
$equals = $exp_str === $act_str;
if (!$equals) {
echo 'Equals check failed'.PHP_EOL;
echo 'expected: '.$exp_str.PHP_EOL;
echo 'actual : '.$act_str.PHP_EOL;
}
return $equals;
}
assert(isEquals(
array(1, 0, ['a']), getRepeatCount(['a','b','c'], 1)
));
assert(isEquals(
array(1, 0, ['a']), getRepeatCount(['a','b','a','b','c'], 1)
));
assert(isEquals(
array(2, 0, ['a','b']), getRepeatCount(['a','b','a','b','c'], 2)
));
assert(isEquals(
array(1, 0, ['a','b','a']), getRepeatCount(['a','b','a','b','c'], 3)
));
assert(isEquals(
array(3, 0, ['a','b']), getRepeatCount(['a','b','a','b','a','b','a'], 2)
));
assert(isEquals(
array(2, 2, ['a','c']), getRepeatCount(['x','c','a','c','a','c'], 2)
));
assert(isEquals(
array(1, 0, ['x','c','a']), getRepeatCount(['x','c','a','c','a','c'], 3)
));
assert(isEquals(
array(2, 0, ['a','b','c','d']),
getRepeatCount(['a','b','c','d','a','b','c','d','c','d'],4)
));
assert(isEquals(
array(2, 2, ['a','c']), findLongestPattern(['x','c','a','c','a','c'])
));
assert(isEquals(
array(1, 0, ['a']), findLongestPattern(['a','b','c'])
));
assert(isEquals(
array(2, 2, ['c','a']),
findLongestPattern(['a','b','c','a','c','a'])
));
assert(isEquals(
array(2, 0, ['a','b','c','d']),
findLongestPattern(['a','b','c','d','a','b','c','d','c','d'])
));
// Find longest adjacent non-overlapping patterns
assert(isEquals(
array(
array('number'=>1, 'values'=>array('a')),
array('number'=>1, 'values'=>array('b')),
array('number'=>1, 'values'=>array('c')),
),
splitToPatterns(['a','b','c'])
));
assert(isEquals(
array(
array('number'=>1, 'values'=>array('a')),
array('number'=>1, 'values'=>array('b')),
array('number'=>2, 'values'=>array('c','a')),
),
splitToPatterns(['a','b','c','a','c','a'])
));
assert(isEquals(
array(
array('number'=>2, 'values'=>array('a','b','c','d')),
array('number'=>1, 'values'=>array('c')),
array('number'=>1, 'values'=>array('d')),
),
splitToPatterns(['a','b','c','d','a','b','c','d','c','d'])
));
/* 'a', 'b', 'a', 'b', 'a', 'b', 'a', 'b', 'c', 'd', */
/* [ ] [ ] [ ] [ ] */
/* NOT [ ] [ ] [ ] [ ] [ ] [ ] */
assert(isEquals(
array(
array('number'=>2, 'values'=>array('a','b','a','b')),
array('number'=>1, 'values'=>array('c')),
array('number'=>1, 'values'=>array('d')),
),
splitToPatterns(['a','b','a','b','a','b','a','b','c','d'])
));
/* 'x', 'x', 'y', 'x', 'b', 'x', 'b', 'a' */
/* // [ ] [ ] [ ] [ ] [ ] [ ] */
assert(isEquals(
array(
array('number'=>2, 'values'=>array('x')),
array('number'=>1, 'values'=>array('y')),
array('number'=>2, 'values'=>array('x','b')),
array('number'=>1, 'values'=>array('a')),
),
splitToPatterns(['x','x','y','x','b','x','b','a'])
));
// one, two, one, two, one, two, one, two
// [ ] [ ]
assert(isEquals(
array(
array('number'=>2, 'values'=>array('one', 'two', 'one', 'two')),
),
splitToPatterns(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])
));
// 'one', 'two', 'one', 'two', 'three', 'four', 'one', 'two', 'three', 'four'
// [ ] [ ] [ ] [ ]
assert(isEquals(
array(
array('number'=>1, 'values'=>array('one')),
array('number'=>1, 'values'=>array('two')),
array('number'=>2, 'values'=>array('one','two','three','four')),
),
splitToPatterns(['one', 'two', 'one', 'two', 'three', 'four', 'one', 'two', 'three','four'])
));
/* 'a', 'a', 'b', 'a', 'b', 'a', 'b', 'a', 'b', 'c', */
/* [ ] [ ] [ ] [ ] */
assert(isEquals(
array(
array('number'=>1, 'values'=>array('a')),
array('number'=>2, 'values'=>array('a','b','a','b')),
array('number'=>1, 'values'=>array('c')),
),
splitToPatterns(['a','a','b','a','b','a','b','a','b','c'])
));
/* 'a', 'b', 'a', 'b', 'c', 'd', 'a', 'b', 'a', 'b', 'a', 'b' */
// [ ] [ ] [ ] [ ] [ ] [ ] [ ]
assert(isEquals(
array(
array('number'=>2, 'values'=>array('a', 'b')),
array('number'=>1, 'values'=>array('c')),
array('number'=>1, 'values'=>array('d')),
array('number'=>3, 'values'=>array('a','b')),
),
splitToPatterns(['a', 'b', 'a', 'b', 'c', 'd', 'a', 'b', 'a', 'b', 'a', 'b'])
));
/* 'a', 'c', 'd', 'a', 'b', 'a', 'b', 'a', 'b', 'a', 'b', 'c', */
/* [ ] [ ] [ ] [ ] [ ] [ ] */
assert(isEquals(
array(
array('number'=>1, 'values'=>array('a')),
array('number'=>2, 'values'=>array('a','b','a','b')),
array('number'=>1, 'values'=>array('c')),
),
splitToPatterns(['a','a','b','a','b','a','b','a','b','c'])
));
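One caveat when running the tests above: on PHP 7 and later, `assert()` is only evaluated when the `zend.assertions` ini setting is 1, so on a production-style configuration the checks can be skipped silently. A minimal sketch of a check that does not depend on that setting follows. The helper name `expectSame` is made up for illustration, and it compares via `json_encode` in the same way as `isEquals` above.

```php
<?php
// Hypothetical helper: fails loudly regardless of the zend.assertions setting.
function expectSame($expected, $actual): void
{
    $expectedJson = json_encode($expected);
    $actualJson = json_encode($actual);
    if ($expectedJson !== $actualJson) {
        throw new RuntimeException(
            'expected ' . $expectedJson . ', got ' . $actualJson
        );
    }
}

expectSame(array(1, 2), array(1, 2)); // passes silently
```

Each `assert(isEquals($expected, $actual))` call could then be written as `expectSame($expected, $actual)`.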
https://stackoverflow.com/questions/21295384