[C++] "Beautiful Characters": The string Class

User 11456817

Published 2025-05-20 09:39:49

Included in the column: 学习学习

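string is a class. The underlying class template is named basic_string ("basic string").

In everyday code we rarely use basic_string directly.

What we use is string, which is basic_string with char passed as the first template argument. It is a typedef of that instantiation, and it is the one used day to day.

u16string is basic_string<char16_t>: each code unit is a 16-bit character (two bytes).

u32string is basic_string<char32_t>: each code unit is a 32-bit character (four bytes).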
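The relationship between these aliases can be checked at compile time. The following is a minimal sketch; code_unit_size is a hypothetical helper introduced only for this illustration, and the byte counts assume a platform with 8-bit bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Each alias is just basic_string instantiated with a different character type.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "string is basic_string<char>");
static_assert(std::is_same<std::u16string, std::basic_string<char16_t>>::value,
              "u16string is basic_string<char16_t>");
static_assert(std::is_same<std::u32string, std::basic_string<char32_t>>::value,
              "u32string is basic_string<char32_t>");

// code_unit_size is a hypothetical helper: bytes per character of a string type.
template <class String>
constexpr std::size_t code_unit_size() {
    return sizeof(typename String::value_type);
}
```

If the file compiles, all three static_assert checks have passed; code_unit_size<std::u16string>() then reports how many bytes one character of that string type occupies.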

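Why learn the string class:

In C, a string is a sequence of characters terminated by '\0'. For convenience, the C standard library provides the str family of functions, but these functions are separate from the string data itself, which does not fit the OOP style. The caller also has to manage the underlying memory, and a small slip can lead to an out-of-bounds access.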

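What is string:

string is the C++ type for text and the usual replacement for raw char arrays. It wraps common operations as member functions, and its storage grows dynamically as needed, which avoids a lot of unnecessary overhead.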

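Differences between string and char*

char* is a pointer.
string is a class that wraps a char* internally; in other words, string is a container of char.
string manages the memory behind that char* itself, so copies and assignments cannot overrun a fixed-size buffer the way the C functions can.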
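The difference in memory management shows up as soon as two strings are joined. This is a rough sketch; join_c and join_cpp are hypothetical helper names, not library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// C style: the caller must size the buffer and keep the '\0' in mind.
// Returns false instead of overflowing when dst is too small.
bool join_c(char* dst, std::size_t dst_size, const char* a, const char* b) {
    if (std::strlen(a) + std::strlen(b) + 1 > dst_size) return false;
    std::strcpy(dst, a);
    std::strcat(dst, b);
    return true;
}

// C++ style: string grows its own storage, so no size bookkeeping is needed.
std::string join_cpp(const std::string& a, const std::string& b) {
    return a + b;
}
```

With the C version, forgetting the size check is exactly the kind of slip that causes an out-of-bounds write; the string version has no buffer size to get wrong.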

The string class in the standard library:

http://www.cplusplus.com/reference/string/string/?kw=string

typedef basic_string<char> string;

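Under the hood, string is essentially a sequential list, that is, a dynamically sized contiguous array of characters.

Because the storage is contiguous, the characters of a string object can be accessed with the subscript operator: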

string s1("hello world");
cout<<s1[1]<<endl;//prints e

class string
{
public:
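  char& operator[](size_t pos)//returns a reference so the caller can modify the character in place
  {
     assert(pos<_size);//guard against out-of-bounds access
     return _str[pos];
  }//strictly speaking there are two overloads: a const version and a non-const version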
private:
  char*_str;
  size_t _size;
  size_t _capacity;
};
--------------------------------------------
#include<iostream>
#include<string>
using namespace std;
int main()
{
	string s2("111111",3);
    cout << s2 << endl;

	s2[0] = 'x';//the first character of s2 becomes 'x'; this is really a function call: s2.operator[](0)='x';
	cout << s2 << endl;
   
	return 0;
}
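The class sketch above mentions that operator[] really comes in a const and a non-const version. Below is one way this could look. MiniString is a hypothetical, stripped-down class written for this illustration; it is not how any particular standard library implements string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// MiniString is a hypothetical, stripped-down stand-in for string.
class MiniString {
public:
    explicit MiniString(const char* s)
        : _size(std::strlen(s)), _str(new char[_size + 1]) {
        std::memcpy(_str, s, _size + 1);  // copy the characters and the '\0'
    }
    ~MiniString() { delete[] _str; }
    MiniString(const MiniString&) = delete;             // copying omitted for brevity
    MiniString& operator=(const MiniString&) = delete;

    // Non-const overload: returns char&, so s[i] = 'x' writes into the buffer.
    char& operator[](std::size_t pos) {
        assert(pos < _size);
        return _str[pos];
    }
    // Const overload: chosen for const objects, read-only access.
    const char& operator[](std::size_t pos) const {
        assert(pos < _size);
        return _str[pos];
    }
    std::size_t size() const { return _size; }

private:
    std::size_t _size;  // declared first so it is initialized before _str
    char* _str;
};
```

A const MiniString can still be read with [], but an assignment such as c[0] = 'x' on it fails to compile, which is the point of having both overloads.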

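auto and range-based for:

The auto keyword

1. auto is a type specifier that tells the compiler to deduce the type: a variable declared with auto gets its type from its initializer at compile time.

2. When declaring a pointer, auto and auto* behave the same; when declaring a reference, the & is required (auto&).

3. When several variables are declared on one line, they must all deduce to the same type.

4. Before C++20, auto cannot be used as a function parameter type. It can be used as a return type (since C++14), but use that sparingly.

5. auto cannot be used directly to declare an array.

Why auto cannot be a function parameter (before C++20): a parameter's type has to be fixed when the function is declared, whereas auto deduces its type from an initializer. A parameter has no initializer at the point of declaration, so the compiler cannot determine the function's signature or type-check calls against it. C++20 lifts the restriction by treating such a function as a template.
auto cannot be used to declare an array:

int arr[]={1,2,3};
auto anotherArr=arr;//this is not a new array; anotherArr is a pointer to the first element of arr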
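The pointer, reference, and array rules above can be verified with type traits. This is a small sketch; auto_rules_hold is a hypothetical function name used only here.

```cpp
#include <cassert>
#include <type_traits>

// Returns true if every deduction below matches the rule it illustrates.
bool auto_rules_hold() {
    int x = 10;

    auto  p1 = &x;   // deduced as int*
    auto* p2 = &x;   // also int*: auto and auto* agree for pointers
    auto  v  = x;    // int: an independent copy of x
    auto& r  = x;    // int&: the & must be written explicitly
    r = 20;          // writes through to x; v still holds 10

    int arr[] = {1, 2, 3};
    auto first = arr;  // the array decays: first is int*, not int[3]

    auto a = 1, b = 2; // fine: both deduce int (auto a = 1, b = 2.0; would not compile)

    return std::is_same<decltype(p1), int*>::value
        && std::is_same<decltype(p2), int*>::value
        && std::is_same<decltype(first), int*>::value
        && x == 20 && v == 10 && a + b == 3 && *first == 1;
}
```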

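Range-based for

The range-based for loop is a piece of "syntactic sugar".

For a collection whose bounds are already known, having the programmer spell out the loop range is redundant and error-prone. C++11 therefore introduced the range-based for loop.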

int arr[5]={1,2,3,4,5};
for(auto x:arr)
{
  cout<<x<<" ";
}

string s1("hello world");
for(auto s:s1)
{
  cout<<s<<" ";
}
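In the loops above each element is copied into the loop variable. To change the elements, the loop variable has to be a reference. A short sketch, with count_char and to_upper_ascii as hypothetical helper names (the second one assumes plain ASCII letters):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Read-only pass: each ch is a copy, so changing it would not affect s.
std::size_t count_char(const std::string& s, char target) {
    std::size_t n = 0;
    for (auto ch : s) {
        if (ch == target) ++n;
    }
    return n;
}

// Modifying pass: auto& binds to the element itself.
std::string to_upper_ascii(std::string s) {
    for (auto& ch : s) {
        if (ch >= 'a' && ch <= 'z') ch = static_cast<char>(ch - 'a' + 'A');
    }
    return s;
}
```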

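How string is laid out in VS and g++:

Layout of string in VS

(The sizes below assume a 32-bit build, where a pointer takes 4 bytes.) First, string holds a union that stores the characters:

1. When the string is shorter than 16 characters, it is stored in a fixed-size internal character array.

2. When the string has 16 or more characters, space is allocated on the heap.

Next, one size_t field stores the string length and another size_t field stores the total capacity allocated on the heap. Finally, there is one more pointer used for other bookkeeping. The total is therefore 16+4+4+4=28 bytes.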

union _Bxty
 {   
// storage for small buffer or pointer to larger one
 value_type _Buf[_BUF_SIZE];
 pointer _Ptr;
 char _Alias[_BUF_SIZE]; // to permit aliasing
 } _Bx;

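Layout of string in g++

In g++ (the older copy-on-write implementation in libstdc++, the default before GCC 5 changed its ABI), a string object takes 4 bytes on a 32-bit build. It contains a single pointer, which points to a block of heap memory holding the following:

Total capacity
Length of the valid characters
Reference count
A pointer to the heap space that stores the characters.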

struct _Rep_base
 {
 size_type               _M_length;
 size_type               _M_capacity;
 _Atomic_word            _M_refcount;
};
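Whether a given string currently lives in the small internal buffer or on the heap can be probed by checking where its character data sits. The sketch below is only an experiment aid: stored_inline is a hypothetical helper, and comparing addresses as integers is assumed to behave sensibly, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// stored_inline is a hypothetical helper: true when the character buffer lies
// inside the string object itself, i.e. no separate heap block is in use.
bool stored_inline(const std::string& s) {
    const auto obj   = reinterpret_cast<std::uintptr_t>(&s);
    const auto chars = reinterpret_cast<std::uintptr_t>(s.data());
    return chars >= obj && chars < obj + sizeof(std::string);
}
```

Printing sizeof(std::string) together with stored_inline(std::string("short")) on your own compiler shows which layout you have; the answers differ between VS, older g++, and newer g++, and between 32-bit and 64-bit builds. A very long string is never stored inline.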

string interfaces:

Construct from a C string: string (const char* s);

#include<iostream>
#include<string>
using namespace std;
int main()
{
	string s1("111111");
	cout << s1 << endl;
	return 0;
}

Copy constructor: string (const string& str);

#include<iostream>
#include<string>
using namespace std;
int main()
{
	string s1("111111");
	string s2(s1);//copy-construct s2 from s1
	cout << s2 << endl;
	return 0;
}

Fill constructor, n copies of the character c: string (size_t n, char c);

#include<iostream>
#include<string>
using namespace std;
int main()
{
	string s3(100,'x');
	cout << s3 << endl;
  
	return 0;
}

Substring constructor: string (const string& str, size_t pos, size_t len = npos);

//Copies the portion of str that begins at the character position pos and spans len characters (or until the end of str, if either str is too short or if len is string::npos).
//If len is omitted or is npos, the copy runs to the end of the string
//If len is, say, 20 but fewer characters remain, the copy also stops at the end of the string
#include<iostream>
#include<string>
using namespace std;
int main()
{
    string s4("11112222");
    string s5(s4,3,3);

    cout << s5 << endl;
    return 0;
}
------------------------------------
#include<iostream>
#include<string>
using namespace std;
int main()
{
    string s4("11112222");
    string s6(s4,3);

    cout << s6 << endl;
    return 0;
}
---------------------------------------
#include<iostream>
#include<string>
using namespace std;
int main()
{
    string s4("11112222");
    string s7(s4,3,20);//only 5 characters remain from position 3, so s7 is 12222

    cout << s7 << endl;
    return 0;
}

Default constructor: string();

#include<iostream>
#include<string>
using namespace std;
int main()
{
  string s1;//default construction: s1 is an empty string
  return 0;
}

To visit every character in a string, call its size() member function, which returns the number of characters:

#include<iostream>
#include<string>
using namespace std;
int main()
{
	string s2("111111",3);
    cout << s2 << endl;//prints 111

    for(size_t i=0;i<s2.size();i++)
    {
      cout<<++s2[i]<<endl;//prints 2 on each of three lines
    }
   
	return 0;
}

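Indexing a string gives you some protection against out-of-bounds access: debug builds of common implementations (VS, for example) assert inside operator[] and stop the program. The standard does not require this check, so in general an out-of-range operator[] is undefined behavior; at() is the checked alternative and throws std::out_of_range.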
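The checked access can be sketched as follows. char_or is a hypothetical helper name; at() and std::out_of_range are the standard facilities.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// char_or is a hypothetical helper: the character at pos, or fallback when
// pos is out of range. at() performs the bounds check and throws on failure.
char char_or(const std::string& s, std::size_t pos, char fallback) {
    try {
        return s.at(pos);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

For example, char_or("hello", 1, '?') yields 'e', while position 5 is past the last character and yields the fallback.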

int main()
{
	string s1("hello world");
	for (int i=0;i<s1.size();i++)
	{
		cout << s1[i] << " ";
	}
	cout << endl;
  
	for (int i = 0; i < s1.size(); i++)
	{
		cout << ++s1[i] << " ";
	}
	cout << endl;

	string::iterator it1 = s1.begin();//begin() points at the first character; end() points one past the last character
	while (it1!=s1.end())
	{
		cout << *it1 << " ";
		++it1;
	}
	cout << endl;
  
	it1 = s1.begin();
	while (it1 != s1.end())
	{
		cout << --(*it1) << " ";
		it1++;
	}
	cout << endl;

	return 0;
}
************************
int main()
{
	string s1("hello world");
	//string::reverse_iterator it = s1.rbegin();//reverse iterator
	auto it = s1.rbegin();
	while (it!=s1.rend())
	{
		cout << *it << " ";
		++it;
	}
	return 0;
}
**************************
int main()
{
	const string s5("hello world");
	string::const_iterator it = s5.begin();//const iterator
	while (it!=s5.end())
	{
		cout << *it << " ";
		++it;
	}

	return 0;
}

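If subscripts with [] already work on string, why learn iterators?

Because subscripting only suits containers such as string and vector, whose elements sit in one contiguous block of memory. The containers covered later are not laid out that way, so they need iterators. (Iterators are the one access method that every container supports.)

Think of an iterator as something that behaves like a pointer, but do not assume it actually is a pointer.

Take the list container as an example: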

int main()
{
	list<int> lt;
	lt.push_back(1);
	lt.push_back(2);
	lt.push_back(3);
	lt.push_back(4);

	list<int>::iterator it = lt.begin();
	while (it!=lt.end())
	{
		cout << *it << " ";
		++it;
	}
	return 0;
}
//list is a linked list, so subscripting with [] does not fit it
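Because every container offers the same iterator interface, one piece of traversal code can serve both a contiguous container and a linked one. A minimal sketch, with join_digits as a hypothetical helper name:

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// join_digits is a hypothetical helper. It touches the container only through
// begin(), end(), ++ and *, so any standard container of int works.
template <class Container>
std::string join_digits(const Container& c) {
    std::string out;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it) {
        out += std::to_string(*it);
        out += ' ';
    }
    return out;
}
```

The same template body compiles for vector<int> and list<int>; a version written with c[i] would only compile for the former.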

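C++ has the auto keyword, which deduces a variable's type automatically from its initializer.

From an int initializer, auto deduces int.

From a double initializer, auto deduces double.

From a reference initializer, auto deduces the type of the value the reference refers to; the reference itself is dropped.

To get a reference type, write the & explicitly, as in auto& r = x;.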
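The deduction rules just listed can be checked directly. This is a small sketch; reference_deduction_holds is a hypothetical function name used only for the illustration.

```cpp
#include <cassert>
#include <type_traits>

bool reference_deduction_holds() {
    int i = 1;
    int& ref = i;

    auto  copy  = ref;  // int: the reference is dropped, copy is a new object
    auto& alias = ref;  // int&: alias refers to i itself
    auto  d     = 1.5;  // double

    copy  = 100;        // does not touch i
    alias = 200;        // changes i

    return std::is_same<decltype(copy), int>::value
        && std::is_same<decltype(alias), int&>::value
        && std::is_same<decltype(d), double>::value
        && i == 200 && copy == 100;
}
```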

auto can also be used as a function's return type:

auto fun(int a,int b)
{
	return a + b;
}

int main()
{
	int a = 1;
	int b = 2;
	cout << fun(a,b) << endl;
	return 0;
}

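Where auto cannot be used:

auto cannot be a function parameter type (before C++20), because the compiler has nothing to deduce from: the argument is only known at the call site, so when the function itself is compiled the parameter's type, the space it needs, and the size of the stack frame cannot be determined.
auto cannot be used to declare an array.

auto (supported since C++11) must have an initializer, because on its own auto has no type; the type is deduced from the initializer you give it. So where does auto really pay off? (Features like auto are also called syntactic sugar.)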

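#include<list>
using namespace std;
int main()
{
 list<int>lt;
  //without auto, the declaration looks like this
  list<int>::iterator it=lt.begin();
  //with auto, it looks like this
  auto it2=lt.begin();//named it2 because it is already declared above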
  
  return 0;
}

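Range-based for is another piece of syntactic sugar like auto:

//compared with iterators or subscripts, range-based for needs no hand-written loop logic
//it detects the end and advances automatically
//it fetches each element and assigns it to the variable on the left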
int main()
{

	string s1("hello world");

	for (char ch :s1)
	{
		cout << ch << " ";
	}
	cout << endl;
	return 0;
}
*******************************************
int main()
{

	string s1("hello world");

	for (char ch :s1)//each element is copied into ch; char here could also be written as auto
	{
		char c = ch;//then ch is copied into c; char here could also be written as auto
		cout << c << " ";
	}
	cout << endl;
	return 0;
}

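Under the hood, range-based for is built on iterators: it works with any type that provides begin() and end(), which is why every container supports it. (Built-in arrays are supported as well.)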
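To make the connection with iterators concrete, here is the same loop written both ways. The second function is only an approximation of what a compiler generates for the first, and both function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Version 1: range-based for.
std::string spaced_range_for(const std::string& s) {
    std::string out;
    for (char ch : s) {
        out += ch;
        out += ' ';
    }
    return out;
}

// Version 2: roughly what the compiler generates for version 1.
std::string spaced_iterators(const std::string& s) {
    std::string out;
    auto first = s.begin();
    auto last  = s.end();   // end() is evaluated once, up front
    for (; first != last; ++first) {
        char ch = *first;   // "fetch the element and assign it to ch"
        out += ch;
        out += ' ';
    }
    return out;
}
```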

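string's storage resembles a sequential list, but compared with a plain sequential list it (in the VS implementation) carries one extra array, buff[16]: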

class string
{
  private:
    char buff[16];
    char*_str;
    size_t _capacity;
    size_t _size;
};

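Commonly used string interfaces:

reserve(): reserves space for the string ahead of time, which reduces the cost of repeated reallocation. In VS the initial capacity is 15. If the requested amount is smaller than the current capacity, nothing changes (in VS, neither size nor capacity). If it is larger, the string reallocates and capacity() grows, possibly to more than you asked for; size is not affected.
reverse(): reverses the string. This is not a member function of string; it is the std::reverse algorithm from <algorithm>.
size(): returns the number of valid characters (not counting the '\0').
length(): same as size(). It survives for historical reasons; prefer size().
capacity(): returns the total allocated space. In VS the initial capacity is reported as 15; the actual buffer is 16 because of the trailing '\0'. Different compilers give different initial capacities.
empty(): checks whether the string is empty; returns true if it is and false otherwise. (It only checks; it does not clear the string.)
clear(): removes the valid characters.
resize(): a. changes the number of valid characters to n. b. If n is larger than the current capacity, the capacity grows. c. If n is not larger than the current capacity, no reallocation happens. d. If n is smaller than the current size, the trailing characters are removed. e. If n is larger than the current size, the new positions are filled with '\0', or with the character you pass explicitly as the second argument.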

string s1("helloworld");
cout << s1 << endl;
cout << s1.size() << endl;
cout << s1.capacity() << endl;

cout << endl;


s1.resize(5);// changes the number of valid characters to 5, which removes the characters after them
cout << s1.size() << endl;
cout << s1.capacity() << endl;
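The example above only shrinks the string. The growing cases of reserve() and resize() can be sketched as follows; reserve_then_resize is a hypothetical helper, and only the guarantees the standard makes are asserted (the exact capacity after reserve is left open because it varies by compiler).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper that exercises reserve() and resize() on a copy.
std::string reserve_then_resize(std::string s) {
    const std::size_t old_size = s.size();

    s.reserve(100);               // capacity grows to at least 100...
    assert(s.capacity() >= 100);
    assert(s.size() == old_size); // ...but the valid characters are untouched

    s.resize(5);                  // shrink: keeps only the first 5 characters
    s.resize(8, 'x');             // grow: the 3 new positions are filled with 'x'
    return s;
}
```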

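reverse(): reverses the string. string has no reverse member function; use the std::reverse algorithm from <algorithm> with the string's iterators.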
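A minimal sketch of reversing a string this way; reversed is a hypothetical wrapper name, while std::reverse is the standard algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Returns a reversed copy; std::reverse works in place on the iterator range.
std::string reversed(std::string s) {
    std::reverse(s.begin(), s.end());
    return s;
}
```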

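operator[] (overloaded): used exactly like an array subscript.

rbegin() refers to the last character of the string; rend() refers to the position before the first character, which you can think of as position -1.

The return type of rbegin() is reverse_iterator, which is worth keeping in mind when you use it.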

push_back(): appends a single character to the end of the string

string str1("hello");
str1.push_back(' ');
str1.push_back('w');
cout << str1 << endl;//prints hello w

append(): appends a string to the end of the string

string str1("hello");
str1.append(" world");
cout << str1 << endl;//prints hello world

operator+= (overloaded): appends a string to the end of the string

string str1("hello");
str1 += " world";
cout << str1 << endl;//prints hello world

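size_t find(): str1.find(c) searches str1 for the string c; if found, it returns the index of the first character of the first match, otherwise npos. str1.find(c, n) starts the search at position n and otherwise behaves the same way.

  string str1("abcdefg");
  string c("ef");
  size_t index = str1.find(c);//use size_t rather than int: npos does not fit in an int
  if (index!=string::npos)
	{
		cout << "found: " <<index<< endl;//prints found: 4
	}
	else
	{
		cout << "not found" << endl;
	}
**************************
	string str1("abcdefg");
	string c("ef");
	size_t index = str1.find(c,5);//str1.find(c,n); searching from position 5 misses the match at 4
	if (index!=string::npos)
	{
		cout << "found: " <<index<< endl;
	}
	else
	{
		cout << "not found" << endl;//this branch runs
	}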
****************************************
string s1("hello world");
string s2("sw");
size_t ret=s1.find_first_of(s2,7);
if (ret != string::npos)
	cout << ret;
else
	cout << "no"; 

size_t find_first_of(): searches s1 for the first character that matches any of the characters in s2; a single matching character is enough.

string (1)	
size_t find_first_of (const string& str, size_t pos = 0) const;
c-string (2)	
size_t find_first_of (const char* s, size_t pos = 0) const;
buffer (3)	
size_t find_first_of (const char* s, size_t pos, size_t n) const;
character (4)	
size_t find_first_of (char c, size_t pos = 0) const;

Difference between find and find_first_of: find has to match the whole string, whereas find_first_of only needs one of the characters to match.
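The difference is easiest to see by running both on the same input. In this sketch, contains_substring and first_of_any are hypothetical wrapper names around the two member functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True when needle occurs in text as a whole substring.
bool contains_substring(const std::string& text, const std::string& needle) {
    return text.find(needle) != std::string::npos;
}

// Index of the first character of text that appears anywhere in chars,
// or std::string::npos when none does.
std::size_t first_of_any(const std::string& text, const std::string& chars) {
    return text.find_first_of(chars);
}
```

With text "hello world" and the pattern "sw", find reports no match because "sw" never appears as a unit, while find_first_of returns 6, the index of the 'w'.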

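const char* c_str(): the return type is const char*. It returns the contents as a C-style string, that is, the address of the first character of the stored, '\0'-terminated string.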

const char* c_str() const;

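string str1("hello");//to copy the contents of str1 into the char array ch, use c_str()
char ch[10];
strcpy(ch,str1.c_str());//needs <cstring>; ch must have room for the characters plus the '\0'
cout << ch << endl;
//cout has an overload for const char*, so it recognizes the type and prints the characters rather than the address
//to print the address instead, cast ch to void*
cout<<(void*)ch<<endl;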

substr(): copies len characters starting at position pos (the character at pos is included); if len is omitted, it copies to the end of the string.

string str1("hello world");
string str2 = str1.substr(0);
cout << str2 << endl;//prints hello world
//string str2 = str1.substr(0,5);
//cout << str2 << endl;//prints hello
*******************************
string s1("hello world");
string s2(s1.substr(2,6));
cout << s2;//prints llo wo

max_size(): returns the maximum number of characters a string can hold

string s1("hello world");
cout << s1.max_size();

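compare(): compares two strings lexicographically and returns an int: 0 if they are equal, a negative value if s1 orders before s2, and a positive value if s1 orders after s2.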

	string s1("h");
	string s2("h");
	int ret=s1.compare(s2);
	if (ret>0)
	{
		cout << "y";
	}
	else if (ret < 0)
	{
		cout << "n";
	}
	else
	{
		cout << "same";
	}
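Only the sign of compare()'s result is meaningful, so it is common to reduce it to three cases. A small sketch, with compare_sign as a hypothetical helper name:

```cpp
#include <cassert>
#include <string>

// Reduces compare()'s result to -1, 0 or 1; only the sign is specified.
int compare_sign(const std::string& a, const std::string& b) {
    const int r = a.compare(b);
    return (r > 0) - (r < 0);
}
```

"apple" orders before "banana", so the result is negative; "hello" orders after its own prefix "hell", so the result is positive.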

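getline(): reading with cin >> stops at the first space; to read the space as well, use getline(), which reads the whole line.

string s1;
getline(cin,s1);//for the input "hello world", cin >> s1 would read only hello, whereas getline reads the whole line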
cout << s1;
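The two behaviors can be compared without typing anything by reading from a string stream instead of cin. In this sketch read_word and read_line are hypothetical helper names; std::istringstream stands in for keyboard input.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// operator>> skips leading whitespace and stops at the next whitespace.
std::string read_word(std::istream& in) {
    std::string word;
    in >> word;
    return word;
}

// getline reads up to (and discards) the newline, keeping spaces.
std::string read_line(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return line;
}
```

Both functions accept any input stream, so passing cin works the same way as passing a std::istringstream.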

insert(): inserts a character or string before position pos

string s1("hello");
s1.insert(0,"world");
//inserts world in front of the h, giving worldhello

assign(): assigns a new value to the string

************1*************
string s1("hello world");
string s2("xxxxx");
s1.assign(s2);//assigns the value of s2 to s1
cout << s1;
************2*************
string s1("hello world");
string s2("yyyxxxxx");
s1.assign(s2,0,3);//assigns to s1 the 3 characters of s2 starting at position 0 (inclusive), i.e. yyy
cout << s1;

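In C++, two swap functions are provided for string, and the standard library also provides a generic one.

string has a member function swap and a non-member (namespace-scope) swap overload for string; the generic version is the std::swap function template.

Why does string get two of its own? When we call swap, overload resolution prefers an existing function that matches over instantiating one from the generic template. The non-member swap for string exists so that a plain swap(s1, s2) call lands on it instead of on the generic std::swap, which is more efficient.

The generic std::swap can exchange objects of user-defined types as well.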

void test4()
{
	string s1("hello world");
	string s2("xxxxxxxxxxxxxxx");

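	swap(s1,s2);//with string arguments, overload resolution picks string's own non-member swap rather than the generic template
	cout << s1 << endl;
	cout << s2 << endl;
}
//the generic swap in its C++98 form exchanges through a temporary using deep copies, which is costly for strings: extra allocations and deallocations
//for built-in types there is no allocation or deallocation to worry about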

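swap in the string class

The swap provided by the class is more efficient than a copy-based generic swap.

//the member function swaps the objects' pointers (and sizes), avoiding the repeated deallocation and allocation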
	void string::swap(string& s)
	{
		std::swap(_str,s._str);
		std::swap(_size,s._size);
		std::swap(_capacity,s._capacity);
	}
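From the caller's side, the two string-specific swaps can be used as sketched below. The function names exchange and exchange_member are hypothetical; the "using std::swap; swap(a, b);" pattern is a common idiom that lets overload resolution pick the most specific swap available.

```cpp
#include <cassert>
#include <string>
#include <utility>

// The usual idiom: bring std::swap into scope, then call swap unqualified.
// For std::string, overload resolution selects the string-specific overload.
void exchange(std::string& a, std::string& b) {
    using std::swap;
    swap(a, b);
}

// Member form; same effect.
void exchange_member(std::string& a, std::string& b) {
    a.swap(b);
}
```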

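Encoding:

In essence: how text is represented

At the lowest level a computer stores only 0s and 1s.

Text is stored indirectly: what is actually stored is still 0s and 1s, interpreted with the help of an encoding table.

Encoding table:

A mapping between bit patterns and symbols. For example, a character is stored as 8 binary digits, and when it is read back the encoding table is used to show the corresponding character on screen.

The table designed for American English text, and its worldwide successor:

ascii (American Standard Code for Information Interchange)
unicode: the universal character set

a. UTF-8: variable-length encoding, compatible with ascii

i. A byte whose first bit is 0 is an ascii character

ii. Common Chinese characters take three bytes

b. UTF-16: uses two bytes per character for most characters (four bytes for the rest); it has a number of drawbacks

c. UTF-32: uses 4 bytes per character; uniform, but wastes space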
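The variable-length nature of UTF-8 is visible from string itself, because size() counts bytes rather than characters. The sketch below counts characters by skipping continuation bytes; utf8_length is a hypothetical helper and assumes its input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the characters (code points) in a UTF-8 string, assuming the input
// is valid UTF-8. Continuation bytes have the bit pattern 10xxxxxx, so every
// other byte starts a new character.
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

For example, the character 中 is encoded in UTF-8 as the three bytes E4 B8 AD, so a string holding only that character has size() 3 while utf8_length reports 1.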

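This article is part of the Tencent Cloud self-media syndication program and is shared from the author's personal site/blog.

Originally published: 2025-05-19. In case of infringement, contact cloudcommunity@tencent.com for removal.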
