Currently I can build an array of the alphabet like this:
[[NSArray alloc]initWithObjects:@"A",@"B",@"C",@"D",@"E",@"F",@"G",@"H",@"I",@"J",@"K",@"L",@"M",@"N",@"O",@"P",@"Q",@"R",@"S",@"T",@"U",@"V",@"W",@"X",@"Y",@"Z",nil];
I know the same set of letters is available through
[NSCharacterSet uppercaseLetterCharacterSet]
How can I create an array from it?
Posted on 2018-09-02 11:03:17
Inspired by Satachito's answer, here is an efficient way to generate an array from a CharacterSet using bitmapRepresentation:
import Foundation

extension CharacterSet {
    func characters() -> [Character] {
        // A Unicode scalar is any Unicode code point in the range U+0000 to U+D7FF inclusive or U+E000 to U+10FFFF inclusive.
        return codePoints().compactMap { UnicodeScalar($0) }.map { Character($0) }
    }

    func codePoints() -> [Int] {
        var result: [Int] = []
        var plane = 0
        // following documentation at https://developer.apple.com/documentation/foundation/nscharacterset/1417719-bitmaprepresentation
        for (i, w) in bitmapRepresentation.enumerated() {
            // each plane after the first is stored as 0x2000 bitmap bytes preceded by one plane index byte
            let k = i % 0x2001
            if k == 0x2000 {
                // plane index byte
                plane = Int(w) << 13
                continue
            }
            let base = (plane + k) << 3
            for j in 0 ..< 8 where w & 1 << j != 0 {
                result.append(base + j)
            }
        }
        return result
    }
}
Example with uppercaseLetters:
let charset = CharacterSet.uppercaseLetters
let chars = charset.characters()
print(chars.count) // 1733
print(chars) // ["A", "B", "C", ... "]
Example with non-contiguous planes:
let charset = CharacterSet(charactersIn: "\u{1D6A8}\u{CC791}") // U+1D6A8 = 120488, U+CC791 = 837521
let codePoints = charset.codePoints()
print(codePoints) // [120488, 837521]
Performance
Very good: built in release, this solution using bitmapRepresentation appears to be 3 to 10 times faster than Martin R's solution using contains or Oliver Atkinson's solution using longCharacterIsMember.
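Posted on 2013-04-01 18:29:01
Since characters have a finite (and not too wide) range, you can simply test which characters are members of a given character set (brute force):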
#include <limits.h> // for CHAR_BIT

// UNICHAR_MAX doesn't seem to be available, so define it here
#define UNICHAR_MAX (1ull << (CHAR_BIT * sizeof(unichar)))

NSData *data = [[NSCharacterSet uppercaseLetterCharacterSet] bitmapRepresentation];
const uint8_t *ptr = [data bytes];
NSMutableArray *allCharsInSet = [NSMutableArray array];
// following from Apple's sample code
// the counter is wider than unichar: a unichar counter would wrap around at 0xFFFF and the loop would never end
for (uint32_t i = 0; i < UNICHAR_MAX; i++) {
    if (ptr[i >> 3] & (1u << (i & 7))) {
        unichar c = (unichar)i;
        [allCharsInSet addObject:[NSString stringWithCharacters:&c length:1]];
    }
}
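Note: Because of the size of unichar and the structure of the additional segments in bitmapRepresentation, this solution only works for characters <= 0xFFFF and is not suitable for higher planes.
Posted on 2015-11-25 21:24:30
I created a Swift (v2.1) version of Martin R's algorithm: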
let charset = NSCharacterSet.URLPathAllowedCharacterSet();
for var plane : UInt8 in 0...16 {
    if charset.hasMemberInPlane( plane ) {
        var c : UTF32Char;
        for var c : UInt32 = UInt32( plane ) << 16; c < (UInt32(plane)+1) << 16; c++ {
            if charset.longCharacterIsMember(c) {
                var c1 = c.littleEndian // To make it byte-order safe
                let s = NSString(bytes: &c1, length: 4, encoding: NSUTF32LittleEndianStringEncoding);
                NSLog("Char: \(s)");
            }
        }
    }
}
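The C-style for loop and the ++ operator used above were removed in Swift 3, so that code no longer compiles with current toolchains. Below is a minimal sketch of the same brute-force idea, assuming Swift 5 and the CharacterSet value type from Foundation. The method name scalarsByBruteForce is hypothetical and is not taken from any of the answers.

```swift
import Foundation

extension CharacterSet {
    // Hypothetical helper name, chosen for illustration only.
    // Tests every code point in each of the 17 Unicode planes that has at least one member.
    func scalarsByBruteForce() -> [Unicode.Scalar] {
        var result: [Unicode.Scalar] = []
        for plane: UInt8 in 0...16 where hasMember(inPlane: plane) {
            let start = UInt32(plane) << 16
            for value in start ..< start + 0x10000 {
                // Unicode.Scalar(_:) returns nil for surrogate code points, which are skipped
                if let scalar = Unicode.Scalar(value), contains(scalar) {
                    result.append(scalar)
                }
            }
        }
        return result
    }
}

// Usage: collect the members of a small set as strings, in code point order
let letters = CharacterSet(charactersIn: "ABC").scalarsByBruteForce().map { String($0) }
print(letters)
```

Skipping empty planes with hasMember(inPlane:) keeps the number of membership tests down, but this approach still checks 65536 code points per populated plane, so reading bitmapRepresentation directly is likely to be faster for large sets.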
https://stackoverflow.com/questions/15741631