做门户网站服务器选择,石狮住房和城乡建设网站,网站设计前景怎样,上海外贸业务员招聘必备知识
数据类型
基本类型
类型写法字符char, varchar, string✔整数tinyint, smallint, int✔, bigint✔小数float, double, numeric(m,n), decimal(m,n)✔布尔值boolean✔时间date✔, timestamp✔
复杂类型(集合类型)
1、数组#xff1a;arrayT 面向用户提供…必备知识
数据类型
基本类型
类型写法字符char, varchar, string✔整数tinyint, smallint, int✔, bigint✔小数float, double, numeric(m,n), decimal(m,n)✔布尔值boolean✔时间date✔, timestamp✔
复杂类型(集合类型)
1、数组arrayT 面向用户提供的原始数据做结构化映射样例 [] / |156,1778,42,138| 描述同一个维度数据2、键值对mapK,V 样例 |LogicJava:88,mysql:89|3、结构体structname1:value1,name2:value2,....样例 类json格式【以{}开头结尾且结构稳定】 结构化数据【创建】表操作
一hive建表【基本语法】
语法组成
组成一建表 基本格式 行格式 额外处理
组成二上传数据*基本格式
create table if not exists TABLE_NAME(FIELD_NAME DATA_TYPE,FIELD_NAME DATA_TYPE,....
)[comment 描述备注]*行格式
形式一row format delimited
1、应用场景面向文本非结构化与半结构化数据2、模拟数据123,张三,16853210211116,true,26238.5,阅读;跑步;唱歌,java:98;mysql:54,province:南京;city:江宁3、案例演示create table if not exists TABLE_NAME(id int,name string,time bigint,isPartyMember boolean,hobby arraystring,scores mapstring,int,address structprovince:string,city:string)row format delimitedfields terminated by ,collection items terminated by ;map keys terminated by :lines terminated by \n4、讲解fields terminated by , 列分隔符【字段: id,name...】collection items terminated by ; 集合项内部间的分隔符map keys terminated by : 键值对[map]分隔符lines terminated by \n 行分隔符【默认一般可以省略】形式二row format serde ‘CLASS_PATH’
1、应用场景面向结构化数据即结构清晰的数据2、CLASS_PATH有以下几种选择选择一CSV【简单类型】数据呈现1,2,Football2,2,Soccer3,2,Baseball Softball代码create table if not exists TABLE_NAME(id string,page string,word string)row format serde org.apache.hadoop.hive.serde2.OpenCSVSerdewith serdeproperties(separatorChar,,quoteChar,escapeChar\\)选择二regex【正则】数据呈现123,张三,16853210211116,true,26238.5,阅读;跑步;唱歌,java:98;mysql:54,province:南京;city:江宁代码create table if not exists TABLE_NAME(id int,name string,time bigint,isPartyMember boolean,hobby arraystring,scores mapstring,int,address structprovince:string,city:string)row format serde org.apache.hadoop.hive.serde2.RegexSerDewith serdeproperties(input.regex^(//d),(.*?),(//d),(true|false),(\\d\\.?\\d?)$)选择三JsonSerDe数据呈现{name:henry,age:22,gender:male,phone:18014499655}代码create table if not exists json(name string,age int,gender string,phone string)row format serde org.apache.hive.hcatalog.data.JsonSerDe*额外处理
1、store【存储】基本语法stored as 存储格式存储格式textfile✔,orc,parquet,sequencefile,...案例stored as textfile2、tblproperties【表属性】(通用)案例【实际情况具体分析】tblproperties(skip.header.line.count1 【跳过表头即第一行】...)*上传数据入表
方法一【不建议用】hdfs dfs -put employee.txt /hive312/warehouse/yb12211.db/inner_table_employee方法二【有校验过程】✔需知local 表示数据在虚拟机本地缺少local 表示数据在hdfs上overload 覆盖缺少overload 追加第一种【本地虚拟机】load data local inpath /root/file/employee.txtoverwrite into table yb12211.inner_table_employee;第二种【hdfs】load data inpath /hive_data/hive_cha01/employee/employee.txtoverwrite into table yb12211.inner_table_employee;方法三【只用于【外部表】】✔基本格式location hdfs中存放文件的【目录】的路径 外部挂载针对性实践操作
案例一/*1|henry|1.81|1995-03-18|江苏,南京,玄武,北京东路68号|logicjava:88,javaoop:76,mysql:80,ssm:82|beauty,money,joke2|arill|1.59|1996-7-30|安徽,芜湖,南山,西湖东路68号|logicjava:79,javaoop:58,mysql:65,ssm:85|beauty,power,sleeping3|mary|1.72|1995-09-02|山东,青岛,长虹,天山东路68*/drop table if exists students;create table if not exists students(number int,name string,height decimal(3,2),birthday date,house structprovince:string,city:string,district:string,street:string,scores mapstring,int,hobby arraystring)row format delimitedfields terminated by |collection items terminated by ,map keys terminated by :stored as textfile;load data inpath /zhou/students.txtoverwrite into table zhou.students;案例二/*user_id,auction_id,cat_id,cat1,property,buy_mount,day
786295544,41098319944,50014866,50022520,21458:86755362;13023209:3593274;10984217:21985;122217965:3227750;21477:28695579;22061:30912;122217803:3230095,2,123434123*/drop table if exists sam_mum_baby_trade;create external table if not exists sam_mum_baby_trade(user_id bigint,auction_id bigint,cat_id bigint,cat1 bigint,property mapbigint,bigint,buy_mount int,day bigint)row format delimitedfields terminated by ,collection items terminated by ;map keys terminated by :stored as textfiletblproperties (skip.header.line.count1);load data inpath /zhou/sam_mum_baby_trade.csvinto table zhou.sam_mum_baby_trade;案例三:/*1,2,Football2,2,Soccer3,2,Baseball Softball*/drop table if exists categories;create table if not exists categories(id string,page string,word string)row format serde org.apache.hadoop.hive.serde2.OpenCSVSerdewith serdeproperties(separatorChar,,quoteChar,escapeChar\\)stored as textfile;load data inpath /zhou/categories.csvoverwrite into table zhou.categories;select * from categories;案例四/*{name:henry,age:22,gender:male,phone:18014499655}*///Jsondrop table if exists json;create table if not exists json(name string,age int,gender string,phone string)row format serde org.apache.hive.hcatalog.data.JsonSerDestored as textfile;load data inpath /zhou/json.logoverwrite into table zhou.json;案例五/*125;男;2015-9-7 1:52:22;1521.84883;男;2014-9-18 5:24:42;6391.45652;女;2014-5-4 5:56:45;9603.79*/create external table if not exists test1w(user_id int,user_gender string,order_time timestamp,order_amount decimal(6,2))row format serde org.apache.hadoop.hive.serde2.RegexSerDewith serdeproperties(input.regex(\\d);(.*?);(\\d{4}-\\d{1,2}-\\d{1,2} \\d{1,2}:\\d{1,2}:\\d{1,2});(\\d\.?\\d?))stored as textfilelocation /zhou/test1w;select * from test1w;二hive建表【高阶语法】
1CTAS
【本质】在原有表的基础上查询并创建新表
基本语法create table if not exists NEW_TABLE_NAME as select ... from OLD_TABLE_NAME ...
案例原有的表hive_ext_regex_test1w语句create table if not exists hive_ext_test_before2015 asselect * from hive_ext_regex_test1wwhere year(order_time)2015;2CTE
【本质】对表进行层层筛选最终形成新表
基本语法as with....select...
案例场景2015年之前的所有数据 以及 2015年之后男性5个以上订单数或5w以上订单总额的订单数据。原有的表hive_ext_regex_test1w语句create table hive_test_before2015_and_male_over5or5w aswithbefore2015 as (select * from hive_ext_regex_test1wwhere year(order_time)2015),agg_male_over5or5w as (select user_idfrom hive_ext_regex_test1wwhere year(order_time)2015 and user_gender男group by user_idhaving count(*)5 or sum(order_amount)50000),male_over5or5w as (select A.*from hive_ext_regex_test1w Ainner join agg_male_over5or5w Bon year(A.order_time)2015 and A.user_idB.user_id)select * from before2015union all 【注意union all 将表并在一起且不去重】select * from male_over5or5w;3CTL
【本质】复制原表的表结构
基本语法create table NEW_TABLE_NAME like OLD_TABLE_NAME;
案例create table hive_test1w_like like hive_ext_regex_test1w;【修改】表操作
提前需知
1、查看表字段基本信息desc 表名;2、查看表字段详细信息desc formatted 表名; 由此可查看表中可修改的属性3、查看建表流程show create 表名;基本语法
alter table TABLE_NAMErename to NEW_NAME;set tblproperties(keyvalue) -- 修改表属性包括各种分隔符,SerDe,...ser fileformat FORMAT; -- 修改文件格式change old_name new_name TYPE; -- 修改字段名column(field_name TYPE) -- 添加列