tidb本身是一个分布式数据库,省去了分库分表的麻烦,对于mysql分库分表后数据库,如果不是分表,分库key维度的查询,需要跨多个表很麻烦,tidb是选择之一。如何把mysql数据同步到tidb?dm是tidb官方提供的存量、增量数据同步工具。本文将介绍如何在mac上实现服务搭建。
dm是经典的主从架构,dm-master负责元数据的管理和接受用户的dmctl发来的请求,dm-master自己实现了高可用主备切换,dm-worker 类似于canal伪装成mysql的slave,消费mysql的binlog,处理发送给tidb。一个mysql 只能对应一个dm worker,如果worker数量多于mysql实例的数量,worker将处于空闲状态。详细可以参考tidb的文档:https://docs.pingcap.com/zh/tidb/stable/dm-overview
首先安装并启动tidb
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
source /Users/xiazemin/.zshrc
tiup playground
CLUSTER START SUCCESSFULLY, Enjoy it ^-^
To connect TiDB: mysql --comments --host 127.0.0.1 --port 4000 -u root -p (no password)
To view the dashboard: http://127.0.0.1:2379/dashboard
PD client endpoints: [127.0.0.1:2379]
To view the Prometheus: http://127.0.0.1:9090
To view the Grafana: http://127.0.0.1:3000
按照官方的文档安装dm,默认是安装在linux上的,mac我测试的过程中遇到了如下问题:
tiup install dm dmctl
Error: component dmctl doesn't support platform darwin/amd64
没有办法只能下载源码编译
git clone https://github.com/pingcap/dm
make build
ls ./cmd
dm-ctl dm-master dm-portal dm-syncer dm-worker
尝试按照官方方式通过模板部署
tiup dm template > topology.yaml
修改配置文件后遇到了ssh连不上的问题
tiup dm deploy dm-mac-master v6.3.0 topology.yaml -u root
tiup is checking updates for component dm ...
Starting component `dm`: /Users/xiazemin/.tiup/components/dm/v1.11.0/tiup-dm deploy dm-mac-master v6.3.0 topology.yaml -u root
+ Detect CPU Arch Name
- Detecting node 127.0.0.1 Arch info ... Error
Error: failed to fetch cpu-arch or kernel-name: executor.ssh.execute_failed: Failed to execute command over SSH for 'root@127.0.0.1:22' {ssh_stderr: , ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin uname -m}, cause: dial tcp 127.0.0.1:22: connect: connection refused
Verbose debug logs has been written to /Users/xiazemin/.tiup/logs/tiup-dm-debug-2022-10-29-15-19-53.log.
因为本机sshd服务没有起来,起sshd服务
% sudo launchctl load -w /System/Library/LaunchDaemons/ssh.plist
% sudo launchctl list | grep ssh
- 0 com.openssh.sshd
尝试仍然失败。于是选择手动部署的方式。也是参考官方的文档
https://github.com/pingcap/docs-dm/blob/master/zh/quick-start-with-dm.md
启动master和worker
nohup dm/bin/dm-master --master-addr='127.0.0.1:8261' --log-file=/tmp/dm-master.log --name="master1" >> /tmp/dm-master.log 2>&1 &
nohup dm/bin/dm-worker --worker-addr='127.0.0.1:8262' --log-file=/tmp/dm-worker.log --join='127.0.0.1:8261' --name="worker1" >> /tmp/dm-worker.log 2>&1 &
检查下是否启动成功
dm/bin/dmctl --master-addr=127.0.0.1:8261 list-member
{
"result": true,
"msg": "",
"members": [
{
"leader": {
"msg": "",
"name": "master1",
"addr": "127.0.0.1:8261"
}
},
{
"master": {
"msg": "",
"masters": [
{
"name": "master1",
"memberID": "11007177379717700053",
"alive": true,
"peerURLs": [
"http://127.0.0.1:8291"
],
"clientURLs": [
"http://127.0.0.1:8261"
]
}
]
}
},
{
"worker": {
"msg": "",
"workers": [
{
"name": "worker1",
"addr": "127.0.0.1:8262",
"stage": "free",
"source": ""
}
]
}
}
]
}
定义mysql数据源mysql-source-conf.yaml
# MySQL Configuration.
source-id: "mysql-replica-01"
from:
host: "127.0.0.1"
user: "root"
password: ""
port: 3306
定义同步任务testdm-task.yaml
---
name: testdm
task-mode: all
target-database:
host: "127.0.0.1"
port: 4000
user: "root"
password: "" # 如果密码不为空,则推荐使用经过 dmctl 加密的密文
mysql-instances:
- source-id: "mysql-replica-01"
block-allow-list: "ba-rule1" # 黑白名单配置名称,如果 DM 版本早于 v2.0.0-beta.2 则使用 black-white-list
filter-rules: ["filter-rule-1"] # 过滤数据源特定操作的规则,可以配置多个过滤规则
route-rules: ["route-rule-1"] # 数据源表迁移到目标 TiDB 表的路由规则,可以定义多个规则
block-allow-list:
ba-rule1:
do-dbs: ["testdm"]
do-tables:
- db-name: "testdm"
tbl-name: "t1"
- db-name: "testdm"
tbl-name: "t2"
filters: # 定义过滤数据源特定操作的规则,可以定义多个规则
filter-rule-1: # 规则名称
schema-pattern: "test*" # 匹配数据源的库名,支持通配符 "*" 和 "?"
table-pattern: "t*" # 匹配数据源的表名,支持通配符 "*" 和 "?"
events: ["truncate table"] # 匹配上 schema-pattern 和 table-pattern 的库或者表的操作类型
action: Ignore # 迁移(Do
routes:
route-rule-1:
schema-pattern: "testdm"
table-pattern: "t*"
target-schema: "test"
target-table: "t"
# route-rule-2:
# schema-pattern: "testdm"
# target-schema: "testdm"
其实这个配置文件和canal的配置文件很像,包括数据源的连接方式,实例列表,拦截、过滤、路由规则。我们可以把我们的两个分表通过路由规则合并到tidb中的一个表t中
schema-pattern: "testdm"
table-pattern: "t*"
target-schema: "test"
target-table: "t"
写完配置文件,准备启动mysql
% brew services start mysql
==> Successfully started `mysql` (label: homebrew.mxcl.mysql)
xiazemin@xiazemindeMacBook-Pro ~ % mysql -uroot
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.28 Homebrew
Copyright (c) 2000, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
创建源数据表
drop database if exists `testdm`;
create database `testdm` DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
use `testdm`;
create table t1 (id bigint, uid int, name varchar(80), info varchar(100), primary key (`id`), unique key(`uid`)) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
create table t2 (id bigint, uid int, name varchar(80), info varchar(100), primary key (`id`), unique key(`uid`)) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
insert into t1 (id, uid, name) values (1, 10001, 'Gabriel García Márquez'), (2, 10002, 'Cien años de soledad');
insert into t2 (id, uid, name) values (3, 20001, 'José Arcadio Buendía'), (4, 20002, 'Úrsula Iguarán'), (5, 20003, 'José Arcadio');
创建dm的数据源
% ./dm/bin/dmctl --master-addr=127.0.0.1:8261 operate-source create mysql-source-conf.yaml
{
"result": true,
"msg": "",
"sources": [
{
"result": true,
"msg": "",
"source": "mysql-replica-01",
"worker": "worker1"
}
]
}
建数据迁移任务
% ./dm/bin/dmctl --master-addr 127.0.0.1:8261 start-task testdm-task.yaml
{
"result": true,
"msg": "",
"sources": [
{
"result": true,
"msg": "",
"source": "mysql-replica-01",
"worker": "worker1"
}
]
}
查看迁移任务状态
./dm/bin/dmctl --master-addr 127.0.0.1:8261 query-status
{
"result": true,
"msg": "",
"tasks": [
{
"taskName": "testdm",
"taskStatus": "Error - Some error occurred in subtask. Please run `query-status testdm` to get more details.",
"sources": [
"mysql-replica-01"
]
}
]
}
% ./dm/bin/dmctl --master-addr 127.0.0.1:8261 query-status testdm
{
"result": true,
"msg": "",
"sources": [
{
"result": true,
"msg": "",
"sourceStatus": {
"source": "mysql-replica-01",
"worker": "worker1",
"result": null,
"relayStatus": null
},
"subTaskStatus": [
{
"name": "testdm",
"stage": "Paused",
"unit": "Load",
"result": {
"isCanceled": false,
"errors": [
{
"ErrCode": 10006,
"ErrClass": "database",
"ErrScope": "downstream",
"ErrLevel": "high",
"Message": "run db schema failed - dbfile ./dumped_data.testdm/testdm-schema-create.sql: execute statement failed: CREATE DATABASE `testdm` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci */ /*!80016 DEFAULT ENCRYPTION='N' */;",
"RawCause": "Error 1273: Unsupported collation when new collation is enabled: 'utf8mb4_0900_ai_ci'",
"Workaround": ""
}
],
"detail": null
},
"unresolvedDDLLockID": "",
"load": {
"finishedBytes": "0",
"totalBytes": "351",
"progress": "0.00 %",
"metaBinlog": "(mysql-bin.000010, 1893)",
"metaBinlogGTID": ""
}
}
]
}
]
}
发现报错了,原因是tidb默认支持的是mysql5.7,然而我本机的mysql是8.0版本,默认的字符排序方式是
utf8mb4_0900_ai_ci
tidb不支持,所以需要把库和表上的默认字符排序方式改下
CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
重启任务
% ./dm/bin/dmctl --master-addr 127.0.0.1:8261 stop-task testdm-task.yaml
{
"op": "Stop",
"result": true,
"msg": "",
"sources": [
{
"result": true,
"msg": "",
"source": "mysql-replica-01",
"worker": "worker1"
}
]
}
% ./dm/bin/dmctl --master-addr 127.0.0.1:8261 start-task testdm-task.yaml
{
"result": true,
"msg": "",
"sources": [
{
"result": true,
"msg": "",
"source": "mysql-replica-01",
"worker": "worker1"
}
]
}
中间还是遇到了一个错误
{
"result": true,
"msg": "",
"sources": [
{
"result": false,
"msg": "[code=38032:class=dm-master:scope=internal:level=high], Message: some error occurs in dm-worker: ErrCode:10006 ErrClass:\"database\" ErrScope:\"not-set\" ErrLevel:\"high\" Message:\"startLocation: [position: (, 0), gtid-set: ], endLocation: [position: (mysql-bin.000001, 2610), gtid-set: ]: execute statement failed: INSERT INTO `test`.`t` (`id`,`uid`,`name`,`info`) VALUES (?,?,?,?) ON DUPLICATE KEY UPDATE `id`=VALUES(`id`),`uid`=VALUES(`uid`),`name`=VALUES(`name`),`info`=VALUES(`info`)\" RawCause:\"Error 1146: Table 'test.t' doesn't exist\" , Workaround: Please execute `query-status` to check status.",
"source": "mysql-replica-01",
"worker": "worker1"
}
]
}
这是因为你改变了表名和库名,tidb在识别ddl的时候没有结合配置规则做对应的改变,导致在tidb上创建目标表失败,如果库名和表名一样就没有这个问题。
连接tidb
% mysql --comments --host 127.0.0.1 --port 4000 -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 407
Server version: 5.7.25-TiDB-v6.3.0 TiDB Server (Apache License 2.0) Community Edition, MySQL 5.7 compatible
Copyright (c) 2000, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
drop database if exists `test`;
create database `test` DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
use test;
create table t (id bigint, uid int, name varchar(80), info varchar(100), primary key (`id`), unique key(`uid`)) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
然后再重启下我们的任务,就成功了。检查下发明目标表里有我们两个表的数据
select * from t;
+----+--------+--------------------------+------+
| id | uid | name | info |
+----+--------+--------------------------+------+
| 1 | 10001 | Gabriel García Márquez | NULL |
| 2 | 10002 | Cien años de soledad | NULL |
| 3 | 20001 | José Arcadio Buendía | NULL |
| 4 | 20002 | Úrsula Iguarán | NULL |
| 5 | 20003 | José Arcadio | NULL |
| 14 | 200012 | José Arcadio Buendía | NULL |
+----+--------+--------------------------+------+
6 rows in set (0.00 sec)
监控的话可以看下tidb自带的监控,里面有binlog延迟的监控
http://127.0.0.1:3000/d/RDdDTFvZz/test-cluster-binlog?orgId=1&refresh=10s
本文分享自 golang算法架构leetcode技术php 微信公众号,前往查看
如有侵权,请联系 cloudcommunity@tencent.com 删除。
本文参与 腾讯云自媒体同步曝光计划 ,欢迎热爱写作的你一起参与!