DataX
February 15, 2025 07:16
admin
#Github

https://github.com/alibaba/DataX/blob/master/userGuid.md

---

#Installation package

https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202309/datax.tar.gz

---

#Tool deployment

####After downloading, extract the archive to a local directory and change into the bin directory to run sync jobs:

    cd /opt/
    tar -zxvf datax.tar.gz
    cd /opt/datax/bin
    python datax.py {YOUR_JOB.json}

---

####Self-check script:

    python /opt/datax/bin/datax.py /opt/datax/job/job.json

---

#Configuration example

##Read data from stream and print it to the console

####1. Create the job configuration file (JSON format)

#####The following command prints a configuration template:

    python datax.py -r {YOUR_READER} -w {YOUR_WRITER}

---

    cd {YOUR_DATAX_HOME}/bin
    #View the stream template
    python datax.py -r streamreader -w streamwriter

---

    DataX (UNKNOWN_DATAX_VERSION), From Alibaba !
    Copyright (C) 2010-2015, Alibaba Group. All Rights Reserved.
    Please refer to the streamreader document:
        https://github.com/alibaba/DataX/blob/master/streamreader/doc/streamreader.md
    Please refer to the streamwriter document:
        https://github.com/alibaba/DataX/blob/master/streamwriter/doc/streamwriter.md
    Please save the following configuration as a json file and use
        python {DATAX_HOME}/bin/datax.py {JSON_FILE_NAME}.json
    to run the job.

---

    {
        "job": {
            "content": [
                {
                    "reader": {
                        "name": "streamreader",
                        "parameter": {
                            "column": [],
                            "sliceRecordCount": ""
                        }
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {
                            "encoding": "",
                            "print": true
                        }
                    }
                }
            ],
            "setting": {
                "speed": {
                    "channel": ""
                }
            }
        }
    }

---

#####Based on the template, fill in the JSON as follows:

    vim /opt/datax/job/stream2stream.json

---

    #stream2stream.json
    {
        "job": {
            "content": [
                {
                    "reader": {
                        "name": "streamreader",
                        "parameter": {
                            "sliceRecordCount": 10,
                            "column": [
                                {
                                    "type": "long",
                                    "value": "10"
                                },
                                {
                                    "type": "string",
                                    "value": "hello,你好,世界-DataX"
                                }
                            ]
                        }
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {
                            "encoding": "UTF-8",
                            "print": true
                        }
                    }
                }
            ],
            "setting": {
                "speed": {
                    "channel": 5
                }
            }
        }
    }

---

####2. Start DataX

    cd /opt/datax/bin
    python datax.py /opt/datax/job/stream2stream.json

---

#When the sync finishes, the log shows the following:

    ...
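    2015-12-17 11:20:25.263 [job-0] INFO JobContainer -
    任务启动时刻 : 2015-12-17 11:20:15
    任务结束时刻 : 2015-12-17 11:20:25
    任务总计耗时 : 10s
    任务平均流量 : 205B/s
    记录写入速度 : 5rec/s
    读出记录总数 : 50
    读写失败总数 : 0

The summary fields are, in order: job start time, job end time, total elapsed time, average throughput, record write speed, total records read, and total read/write failures.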
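The job file and the summary log can be cross-checked with a few lines of Python. The sketch below is only illustrative: `build_stream_job`, `expected_records`, and `parse_summary` are hypothetical helper names, not part of DataX. It assumes that each channel emits `sliceRecordCount` records, which is consistent with the log on this page (10 records × 5 channels = 50 records read), and that the summary labels appear exactly as printed in that log.

```python
import json
import re


def build_stream_job(slice_record_count=10, channel=5):
    """Return a job dict mirroring stream2stream.json from this page."""
    return {
        "job": {
            "content": [
                {
                    "reader": {
                        "name": "streamreader",
                        "parameter": {
                            "sliceRecordCount": slice_record_count,
                            "column": [
                                {"type": "long", "value": "10"},
                                {"type": "string", "value": "hello,你好,世界-DataX"},
                            ],
                        },
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {"encoding": "UTF-8", "print": True},
                    },
                }
            ],
            "setting": {"speed": {"channel": channel}},
        }
    }


def expected_records(job):
    """Assumption: every channel emits sliceRecordCount records."""
    reader = job["job"]["content"][0]["reader"]
    per_channel = reader["parameter"]["sliceRecordCount"]
    channels = job["job"]["setting"]["speed"]["channel"]
    return per_channel * channels


def parse_summary(log_text):
    """Extract the two integer counters from the final summary log."""
    labels = ["读出记录总数", "读写失败总数"]  # records read, read/write failures
    result = {}
    for label in labels:
        match = re.search(re.escape(label) + r"\s*:\s*(\d+)", log_text)
        if match:
            result[label] = int(match.group(1))
    return result


if __name__ == "__main__":
    job = build_stream_job()
    # ensure_ascii=False keeps the Chinese sample value readable in the output
    print(json.dumps(job, ensure_ascii=False, indent=4))
    print("expected records:", expected_records(job))
```

Redirecting the printed JSON into `/opt/datax/job/stream2stream.json` gives the same job definition as the hand-written file, and comparing `expected_records(job)` with the parsed `读出记录总数` value is a quick sanity check after a run.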