Here is a very simple example:
$Test = @('ae','æ')
$Test | Select-Object -Unique
Output:
ae
What is happening here, and how can I avoid it? Obviously, I don't want "ae" to be treated as equal to "æ".
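Answered 2022-06-15 10:01:17
As mentioned in the comments, your current culture settings treat ae and æ as equal, so Select-Object -Unique returns only the first of the two that it finds in the input array.
If you reverse the order, you get æ instead: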
$Test = @('æ','ae')
$Test | Select-Object -Unique
# æ
You can check which culture PowerShell is using with the following:
PS> Get-Culture
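LCID             Name             DisplayName
----             ----             -----------
2057             en-GB            English (United Kingdom)
Note, however, that according to @mklement0's comment, PowerShell does not use this culture consistently:
As it turns out, the current culture does apply to Select-Object (which is currently also, unexpectedly, always case-sensitive). PowerShell seems to have a split personality with respect to culture invariance: string conversions, string interpolation and (with one exception) string-related operators use the invariant culture, whereas cmdlets use the current culture.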
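To see which comparison flavors treat the two strings as equal on a given machine, a small probe along the following lines may help. This is only a sketch: `Test-StringEquality` is a hypothetical helper name, not something from the answers, and the results for the culture-aware rows depend on the culture and on the PowerShell edition, so no output is shown for them.

```powershell
# Hypothetical helper: reports whether two strings compare as equal under
# one or more cultures, the invariant culture, and an ordinal comparison.
function Test-StringEquality {
    param(
        [string] $Left,
        [string] $Right,
        # Assumption: defaults to the session's current culture only.
        [cultureinfo[]] $Culture = @([cultureinfo]::CurrentCulture)
    )

    foreach ($c in $Culture) {
        [pscustomobject]@{
            Comparison = "Culture: $($c.Name)"
            Equal      = ($c.CompareInfo.Compare($Left, $Right) -eq 0)
        }
    }

    [pscustomobject]@{
        Comparison = 'Invariant'
        Equal      = ([cultureinfo]::InvariantCulture.CompareInfo.Compare($Left, $Right) -eq 0)
    }

    [pscustomobject]@{
        Comparison = 'Ordinal'
        Equal      = [string]::Equals($Left, $Right, [System.StringComparison]::Ordinal)
    }
}

Test-StringEquality 'ae' 'æ'
```

Other cultures can be probed by passing their names, for example `-Culture 'en-GB', 'da-DK'`. Only the ordinal row is expected to report `False` everywhere.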
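In any case, for more details see the documentation on ordinal string operations. Rather than a culture-aware comparison, it sounds like what you are after is an "ordinal" comparison:
An ordinal comparison is a string comparison in which each byte of each string is compared without linguistic interpretation; for example, "windows" does not match "Windows".
(and, by extension, ae does not equal æ)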
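As a quick illustration of that definition, the .NET `String.Equals` overload that takes a `StringComparison` value can be called directly from PowerShell. The variable names below are only illustrative.

```powershell
# Ordinal comparison: no linguistic interpretation, so neither the ligature
# nor a difference in case counts as a match.
$ligature    = [string]::Equals('ae', 'æ', [System.StringComparison]::Ordinal)
$case        = [string]::Equals('Windows', 'windows', [System.StringComparison]::Ordinal)

# OrdinalIgnoreCase still ignores linguistic rules, but folds case first.
$caseIgnored = [string]::Equals('Windows', 'windows', [System.StringComparison]::OrdinalIgnoreCase)

"ae vs æ (Ordinal):                       $ligature"     # False
"Windows vs windows (Ordinal):            $case"         # False
"Windows vs windows (OrdinalIgnoreCase):  $caseIgnored"  # True
```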
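I couldn't find an idiomatic way to do this in PowerShell (you can change the culture with Set-Culture, but everything I tried still treated ae and æ as equal). If you want more control over how values are compared, though, you can drop down to LINQ like this: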
PS> $data = @( "ae", "æ" )
PS> [System.Linq.Enumerable]::Distinct([string[]]$data, [System.StringComparer]::Ordinal )
ae
æ
That gives you a number of different ways to compare strings:
https://learn.microsoft.com/en-us/dotnet/api/system.stringcomparer?view=net-6.0#properties
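If a pipeline-friendly stand-in for `Select-Object -Unique` is preferred over calling LINQ directly, the same comparers can be plugged into a `HashSet[string]`. The following is a minimal sketch; `Get-OrdinalUnique` is a hypothetical function name, and it assumes PowerShell 5 or later for the `::new()` syntax.

```powershell
# Hypothetical helper: emits each string the first time it is seen,
# using an ordinal (culture-independent, case-sensitive) comparison.
function Get-OrdinalUnique {
    param(
        [Parameter(ValueFromPipeline)]
        [string] $InputObject
    )
    begin {
        # Swap in another StringComparer property (e.g. OrdinalIgnoreCase)
        # to change what counts as a duplicate.
        $seen = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::Ordinal)
    }
    process {
        # HashSet.Add returns $true only if the item was not already present.
        if ($seen.Add($InputObject)) { $InputObject }
    }
}

'ae', 'æ', 'ae' | Get-OrdinalUnique
# ae
# æ
```

Because items are emitted as they arrive, the input order is preserved and the function streams rather than collecting everything first.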
You can even implement your own:
class FirstLetterComparer : System.Collections.Generic.IEqualityComparer[string] {
[bool]Equals([string]$x, [string]$y) { return $x[0] -eq $y[0]; }
[int]GetHashCode([string] $x) { return $x[0].GetHashCode(); }
}
# returns the first item in the list that starts with each distinct character.
# note that "abb" is omitted because it starts with the same first letter as "aaa"
# so it's not "first letter distinct".
$data = @( "aaa", "abb", "bbb" )
[System.Linq.Enumerable]::Distinct([string[]]$data, [FirstLetterComparer]::new() )
# aaa
# bbb
Answered 2022-06-16 00:11:02
To add to mclayton's helpful answer, here is some background information:
- `Select-Object -Unique` does indeed use the current culture, but there are contexts in which the invariant culture is used instead, notably with PowerShell operators such as `-eq`; see this answer.
- The outcome also depends on the PowerShell edition, because the two editions obtain their culture-specific information from different sources:
  - _Windows PowerShell_, the legacy, ships-with-Windows edition, whose latest and final version is 5.1, which builds on the legacy, Windows-only _.NET Framework_, which uses [**NLS (National Language Support)**](https://learn.microsoft.com/en-us/windows/win32/intl/national-language-support) for culture-specific information.
  - [_PowerShell (Core) 7+_](https://github.com/PowerShell/PowerShell/blob/master/README.md), which builds on the cross-platform _.NET 5+_ edition, which now uses the [**ICU (International Components for Unicode) library**](https://icu.unicode.org/) _by default_, though [_on Windows_ you can opt into still using NLS](https://learn.microsoft.com/en-us/dotnet/core/extensions/globalization-icu).

Read on for details.
With NLS (Windows PowerShell):

- **The ligature `æ` is considered _equivalent_ to the sequence of its _constituent letters_ in _most_ cultures**, _except_ in those:
  - where `æ` is in use as a character in its own right ...
  - _and_ is _not_ considered equivalent to the sequence of its constituent letters.
- These exceptions are (only the so-called _neutral_ (non-nation-specific) cultures are listed, not also their national varieties):
  - da (Danish)
  - is (Icelandic)
  - kl (Kalaallisut)
  - nb (Norwegian Bokmål)
  - nn (Norwegian Nynorsk)
  - no (Norwegian)
  - se (Northern Sami)
  - sma (Sami (Southern))
  - smj (Sami (Lule))
  - smn (Sami (Inari))
  - sms (Sami (Skolt))
- **Other ligatures have multi-letter equivalents in _all_ cultures**, such as [`œ`](https://en.wikipedia.org/wiki/%C5%92) vs. `oe`; there are also ligatures whose multi-letter equivalent is _not_ the sequence of its constituent letters, but a modern equivalent, e.g., German [`ß`](https://en.wikipedia.org/wiki/%C3%9F) (which originated from `sz`) is considered equivalent to `ss`.

With ICU (PowerShell (Core) 7+):

- At least as of the ICU version that underlies PowerShell 7.2.4, **ligatures _in general_ are seemingly _never_ considered equivalent to their constituent letters** in string comparisons.

https://stackoverflow.com/questions/72626425