Hive Static Partitions (Single-Level & Two-Level) & Dynamic Partitions | Meet a Better You, Every Day

1. Create a partitioned table

hive> create table testpartition(ordernumber string,event_time string)
    > partitioned by(event_month string)
    > row format delimited fields terminated by '\t';

2. Create order_created.txt in the /opt/modules/hive-1.2.1/demo directory and load its data into the table

[root@node1 demo]# vi order_created.txt

10703007267488	2015-06-01 06:01:12.334+01
10101043505096	2015-06-01 07:28:12.342+01
10103043509747	2015-06-01 07:50:12.33+01
10103043501575	2015-06-01 09:27:12.33+01
10104043514061	2015-06-01 09:03:12.324+01

hive> load data local inpath '/opt/modules/hive-1.2.1/demo/order_created.txt'
    > overwrite into table testpartition
    > partition(event_month='2015-06');
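The same static-partition syntax extends to a two-level (nested) partition by listing a second partition column. A minimal sketch, assuming a hypothetical table testpartition_2level and a hypothetical second partition column step, neither of which is part of the steps in this walkthrough:

```sql
-- Hypothetical two-level partitioned table: event_month is the first level,
-- step (an assumed column name) is the second level.
create table testpartition_2level(ordernumber string, event_time string)
partitioned by (event_month string, step string)
row format delimited fields terminated by '\t';

-- A static load has to name a value for every partition column.
load data local inpath '/opt/modules/hive-1.2.1/demo/order_created.txt'
overwrite into table testpartition_2level
partition (event_month='2015-06', step='1');
```

With two levels, each partition should map to a nested directory such as event_month=2015-06/step=1 under the table directory.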

3. Query all the data in the testpartition table

A plain select * does not launch a MapReduce job; Hive reads the table files directly.

hive> select * from testpartition;
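The partition column behaves like an ordinary column in queries, so it can be used in a where clause to restrict the scan to one partition directory. A short sketch using the table above; whether a given query avoids MapReduce depends on the hive.fetch.task.conversion setting, so the comments below assume the defaults of a Hive 1.2.x installation:

```sql
-- Only the event_month=2015-06 directory is read (partition pruning).
select * from testpartition where event_month = '2015-06';

-- The partition value appears as a third column in the result,
-- even though it is not stored inside the data file itself.
select ordernumber, event_time, event_month from testpartition;

-- An aggregate such as count(*) normally does launch a MapReduce job.
select count(*) from testpartition where event_month = '2015-06';
```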

4. Inspect the structure of the testpartition table

hive> desc formatted testpartition;
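desc formatted lists the partition column, but not which partitions currently exist. A brief sketch of companion commands that can fill that gap; the dfs path assumes the warehouse directory is /usr/hive-1.2.1/warehouse, which may differ on another installation:

```sql
-- List the partitions registered in the metastore for this table.
show partitions testpartition;

-- Show the details (location, parameters) of a single partition.
desc formatted testpartition partition (event_month='2015-06');

-- Hive CLI only: list the partition directories on HDFS
-- (assumed warehouse location, adjust to your configuration).
dfs -ls /usr/hive-1.2.1/warehouse/testpartition;
```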

5. Load data that already sits on HDFS into the partitioned table

5.1. Create the order_created2.txt file that will be uploaded to HDFS

[root@node1 demo]# vi order_created2.txt

10703007267488	2015-07-01 06:01:12.334+01
10101043505096	2015-07-11 07:28:12.342+01
10103043509747	2015-07-21 07:50:12.33+01
10103043501575	2015-07-31 09:27:12.33+01
10104043514061	2015-07-11 09:03:12.324+01

5.2. Create the partition directory on HDFS

[root@node1 hadoop-2.5.1]# ./bin/hdfs dfs -mkdir -p /usr/hive-1.2.1/warehouse/testpartition/event_month=2015-07

5.3. Upload order_created2.txt to that directory on HDFS

[root@node1 hadoop-2.5.1]# ./bin/hdfs dfs -put /opt/modules/hive-1.2.1/demo/order_created2.txt /usr/hive-1.2.1/warehouse/testpartition/event_month=2015-07

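5.4. Query testpartition: only the 2015-06 records are returned, not the 2015-07 ones, because the new directory is not yet registered as a partition in the metastore

hive> select * from testpartition;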

5.5. Sync the manually created partition into the Hive metastore (stored in MySQL here)

hive> msck repair table testpartition;
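msck repair scans the whole table directory for unregistered partitions. When only one known partition needs to be registered, it can also be added explicitly. A minimal sketch of that alternative, reusing the directory created in step 5.2:

```sql
-- Register a single partition directory in the metastore by hand.
-- "if not exists" makes the statement safe to run after msck repair.
alter table testpartition add if not exists
partition (event_month='2015-07')
location '/usr/hive-1.2.1/warehouse/testpartition/event_month=2015-07';

-- Confirm that the metastore now knows about the partition.
show partitions testpartition;
```

Either approach only updates metadata; the data file itself stays where it was uploaded.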

5.6. Query testpartition again: the 2015-07 records now appear

hive> select * from testpartition;

6. Add data to a partitioned table with insert into / insert overwrite

6.1. Create an unpartitioned source table

hive> create table testpartition2(ordername string,event_time string)
    > row format delimited fields terminated by '\t';

6.2. Load data into the testpartition2 table

hive> load data local inpath '/opt/modules/hive-1.2.1/demo/order_created.txt' overwrite into table testpartition2;

6.3. Query the data in the testpartition2 table

hive> select * from testpartition2;

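Insert the rows of the unpartitioned testpartition2 table into a partition of testpartition:

hive> insert into table testpartition partition(event_month='2015-06') select * from testpartition2;

Running this statement a second time doubles the order_created records in that partition, because insert into appends to the existing data instead of replacing it.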
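The two variants named in the heading of section 6 differ in whether existing partition data is kept, and a dynamic partition insert lets Hive derive the partition value from the data. A hedged sketch, assuming testpartition is the partitioned target table from step 1 and testpartition2 is the unpartitioned source table from step 6.1; deriving the month with substr is an illustrative choice, not something taken from the steps here:

```sql
-- Static partition, overwrite: replaces the partition contents,
-- so re-running it does not duplicate rows.
insert overwrite table testpartition partition (event_month='2015-06')
select ordername, event_time from testpartition2;

-- Dynamic partition: the partition value comes from the last selected column.
-- nonstrict mode is needed because no partition column is given statically.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

insert into table testpartition partition (event_month)
select ordername,
       event_time,
       substr(event_time, 1, 7) as event_month  -- e.g. '2015-06' (assumed derivation)
from testpartition2;
```

The dynamic form is convenient when the source rows span many months, since Hive creates one partition per distinct value instead of requiring one statement per partition.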