# Prerequisites

It is assumed that Hadoop is already installed and configured correctly, and that MySQL is correctly installed.

# Create the metastore database

To support Hive's multi-user, multi-session requirements, the metadata must be kept in a standalone database. Here MySQL is chosen to store Hive's metadata. Create the metastore database for Hive:

```sql
mysql> create database hive;
mysql> create user 'hive' identified by '123456';
mysql> grant all privileges on *.* to 'hive'@'%' with grant option;
mysql> flush privileges;    -- reload the grant tables
```

or, granting privileges on the metastore database only:

```sql
mysql> create database metastore_db;
mysql> create user 'hive'@'localhost' identified by 'hive';
mysql> grant all on metastore_db.* to 'hive'@'localhost';
mysql> flush privileges;
```

# Hive configuration

Hive only needs to be installed and configured on the Master node.

First add Hive to the system profile so that the shell can find it, as is needed for most software installed from a tarball. Edit /etc/profile:

```shell
#export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE HISTCONTROL
export JAVA_HOME=/usr/java/jdk1.7.0_51     # point at the JDK install directory
export HIVE_HOME=/usr/hive/hive-0.11.0     # point at the Hive install directory
export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
```

(Note: HIVE_HOME must be exported before it is used in PATH, and it is $HIVE_HOME/bin, not the Hive root directory, that belongs on the PATH.)

Next, add the Hadoop path to Hive's configuration so that Hive can find Hadoop: Hive runs on top of Hadoop, which in turn runs on the Java virtual machine. Edit the hive-config.sh file in the bin directory:

```shell
HIVE_CONF_DIR="${HIVE_CONF_DIR:-$HIVE_HOME/conf}"
export HADOOP_HOME=/home/hadoop/hadoop/hadoop-1.2.1
export HIVE_CONF_DIR=$HIVE_CONF_DIR
export HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH
```

In addition, before creating any tables in Hive you must create /tmp and /user/hive/warehouse on HDFS and make them group-writable (chmod g+w). From the Hadoop install directory, operate on the HDFS filesystem:

```shell
cd /usr/local/hadoop
bin/hadoop fs -mkdir /tmp
bin/hadoop fs -mkdir /user/hive/warehouse
bin/hadoop fs -chmod g+w /tmp
bin/hadoop fs -chmod g+w /user/hive/warehouse
```

Then create hive-site.xml from the template and edit it:

```shell
cp conf/hive-default.xml.template conf/hive-site.xml
vi conf/hive-site.xml
```

```xml
<configuration>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore_db?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
```

# MySQL driver package

Download the mysql-connector-java-5.1.18-bin.jar file and place it in the $HIVE_HOME/lib directory.

# Testing Hive (run hive from the bin directory to enter the Hive shell)

```
[root@sysimages bin]# hive
Logging initialized using configuration in jar:file:/opt/hive-0.9.0/lib/hive-common-0.9.0.jar!/hive-log4j.properties
Hive history file=/tmp/root/hive_job_log_root_201211061923_2139056245.txt
hive> show tables;
OK
Time taken: 4.837 seconds
```

# Creating a table, loading data, and querying in Hive

```
hive> create table records (year string, temperature int, quality int)   -- create the table
    > row format delimited
    > fields terminated by ',';
OK
Time taken: 0.292 seconds
hive> load data local inpath '/opt/hive-0.9.0/temp.txt'                  -- load the data
    > overwrite into table records;
Copying data from file:/opt/hive-0.9.0/temp.txt
Copying file: file:/opt/hive-0.9.0/temp.txt
Loading data to table default.records
Deleted hdfs://sysimages:9100/user/hive/warehouse/records
OK
Time taken: 0.486 seconds
hive> select year, max(temperature)                                      -- query the data
    > from records
    > where temperature != 9999
    > group by year;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201210151705_0018, Tracking URL = http://sysimages:50030/jobdetails.jsp?jobid=job_201210151705_0018
Kill Command = /opt/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=sysimages:9200 -kill job_201210151705_0018
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2012-11-06 19:43:23,897 Stage-1 map = 0%, reduce = 0%
2012-11-06 19:43:26,923 Stage-1 map = 100%, reduce = 0%
2012-11-06 19:43:35,978 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201210151705_0018
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1  HDFS Read: 79  HDFS Write: 53  SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
drfdf    234234
drwe     234234
rwerwer  42342
sdfsdf   234234
Time taken: 590.055 seconds
hive>
```
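As a local sanity check of the query logic (this is not part of the Hive session above), the same filter-and-max-per-year computation can be sketched with awk over a small comma-delimited file in the records layout (year, temperature, quality). The file name and sample rows below are made up for illustration, not the contents of temp.txt:

```shell
# Hypothetical sample in the same comma-delimited layout as the records table
cat > /tmp/records_sample.txt <<'EOF'
1949,111,1
1949,78,1
1950,0,1
1950,22,1
1950,9999,1
EOF

# Same logic as the Hive query: drop 9999 sentinel readings,
# then keep the maximum temperature seen for each year
awk -F, '$2 != 9999 { if (!($1 in m) || $2 > m[$1]) m[$1] = $2 }
         END { for (y in m) print y "\t" m[y] }' /tmp/records_sample.txt | sort
# prints:
# 1949    111
# 1950    22
```

The `sort` at the end makes the output order deterministic, since awk's `for (y in m)` iterates the array in no defined order.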