Azkaban Workflow Scheduler
2020-06-09 09:45:11
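Architecture
  Azkaban Web Server
    Provides the web UI and is the main manager of Azkaban
    Handles project management, authentication, scheduling, and monitoring of workflow execution
  Azkaban Executor Server
    Handles the actual scheduling and submission of workflows and jobs
  MySQL
    Stores projects, logs, execution plans, and similar information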
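Run modes
  solo server mode (standalone)
    Uses the embedded H2 database
    The web server and executor server run in the same process
  two server mode
    Uses MySQL as the database
    The web server and executor server run in separate processes
  multiple executor mode
    Runs multiple executor servers
    MySQL (master-slave setup)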
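For multiple executor mode, the web server's azkaban.properties might look roughly like the fragment below. This is a sketch using property names from the Azkaban configuration documentation. The host, port, database name, and credentials are placeholder assumptions, not values from this outline.

```properties
# conf/azkaban.properties on the web server (all values are placeholders)
database.type=mysql
mysql.host=localhost
mysql.port=3306
mysql.database=azkaban
mysql.user=azkaban
mysql.password=azkaban
mysql.numconnections=100

# Let the web server dispatch flows to more than one executor
azkaban.use.multiple.executors=true
# Checks an executor must pass before a flow is assigned to it
azkaban.executorselector.filters=StaticRemainingFlowSize,MinimumFreeMemory,CpuStatus
```

Each executor server would point at the same MySQL instance, so that the web server and all executors share one view of projects and executions.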
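Hands-on
  Single-type job
    type=command
    command=echo 'hello azkaban......'
    Package the job resource files into a zip file
    # This job depends on start1. Keep the comment on its own line: .job files use
    # Java properties syntax, where a trailing # is read as part of the value.
    dependencies=start1
    If a job depends on several jobs, separate their names with commas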
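The dependency rule above can be sketched with two job files packaged into the same zip. The file names start1.job and start2.job are assumptions inferred from the `dependencies=start1` line, and the echo text is purely illustrative. The block shows two separate files, not one.

```properties
# start1.job (hypothetical file name; has no dependencies, so it runs first)
type=command
command=echo 'start1 running'

# start2.job (a separate file; runs only after start1 has succeeded)
type=command
dependencies=start1
command=echo 'start2 running'
```

A job that waits for both would list them on one line, for example `dependencies=start1,start2`.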
  HDFS operation job
    type=command
    command=echo "start execute"
    command.1=/opt/bigdata/hadoop/bin/hdfs dfs -mkdir /azkaban
    command.2=/opt/bigdata/hadoop/bin/hdfs dfs -put /home/hadoop/source.txt /azkaban
  MapReduce job
    type=command
    command=/opt/bigdata/hadoop/bin/hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /wordcount/in /wordcount/out
  Hive script job
    use default;
    create table if not exists test_azkaban(id int,name string,address string) row format delimited fields terminated by ',';
    load data local inpath '/home/hadoop/azkaban/test.txt' into table test_azkaban;
    create table if not exists countaddress as select address,count(*) as num from test_azkaban group by address ;
    insert overwrite local directory '/home/hadoop/azkaban/out' select * from countaddress;
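The Hive statements above still need a job file to run them. A minimal sketch, assuming the statements are saved as test.sql and zipped alongside the job file, and that Hive is installed under /opt/bigdata/hive. The job file name, the script name, and the install path are all assumptions, not taken from this outline.

```properties
# hive.job (hypothetical file name)
type=command
# hive -f executes the statements in the given script file
command=/opt/bigdata/hive/bin/hive -f 'test.sql'
```

Using a relative script path relies on the command running from the directory where the zip's contents are unpacked. An absolute path avoids that assumption.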
  Scheduled job execution