I wrote the following statement to create an external table:
CREATE EXTERNAL TABLE IF NOT EXISTS carss (
  maker STRING,
  model STRING,
  mileage FLOAT,
  manufacture_year INT,
  engine_displacement FLOAT,
  engine_power STRING,
  body_type STRING,
  color_slug STRING,
  stk_year FLOAT,
  transmission STRING,
  door_count INT,
  seat_count INT,
  fuel_type STRING,
  date_created DATE,
  date_last_seen DATE,
  price_eur FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/BigData/Project2/'
TBLPROPERTIES ('skip.header.line.count'='1', 'creator'='Janina', 'created_on'='2020-11-05', 'description'='dataset for classified Ads of cars in Germany and Czech Republic');
When I run the query SELECT * FROM carss LIMIT 1, it returns NULL values and strange characters:
hive> SELECT * FROM carss LIMIT 1;
OK
��T�;��9fu7z�C�WHqd��Y�P�c��/�^�B�4���*G���� [unreadable output continues] NULL
Time taken: 0.059 seconds, Fetched: 1 row(s)
hive>
Posted on 2020-11-07 10:22:31
Are the files under /BigData/Project2/ encoded in UTF-8? If not, you may need to specify the encoding of the underlying files when creating the external table:
create external table carss
...
row format
serde 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
with serdeproperties("serialization.encoding"='WINDOWS-1252')
location
...
This article may be helpful.
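Combining the question's schema with the SerDe clause above, the full statement might look like the sketch below. Two points are assumptions rather than facts from the thread: WINDOWS-1252 is only the example value from this answer, so substitute whatever charset the files really use, and the 'field.delim' property is added here because a ROW FORMAT SERDE clause takes the place of ROW FORMAT DELIMITED FIELDS TERMINATED BY, so the comma delimiter has to be passed as a SerDe property instead.

```sql
-- Sketch only: WINDOWS-1252 is the answer's example value, not a confirmed
-- property of the files. Replace it with the charset the files actually use.

-- CREATE ... IF NOT EXISTS does nothing when carss already exists, so the old
-- definition is dropped first. By default, dropping an external table removes
-- only the metadata and leaves the files in place.
DROP TABLE IF EXISTS carss;

CREATE EXTERNAL TABLE IF NOT EXISTS carss (
  maker STRING,
  model STRING,
  mileage FLOAT,
  manufacture_year INT,
  engine_displacement FLOAT,
  engine_power STRING,
  body_type STRING,
  color_slug STRING,
  stk_year FLOAT,
  transmission STRING,
  door_count INT,
  seat_count INT,
  fuel_type STRING,
  date_created DATE,
  date_last_seen DATE,
  price_eur FLOAT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim' = ',',                        -- replaces FIELDS TERMINATED BY ','
  'serialization.encoding' = 'WINDOWS-1252'   -- assumed charset, see note above
)
STORED AS TEXTFILE
LOCATION '/BigData/Project2/'
TBLPROPERTIES ('skip.header.line.count'='1');

-- Re-run the original check against the new definition.
SELECT * FROM carss LIMIT 1;
```

If the output is still unreadable after this, the encoding is probably not the cause, and it would be worth checking what kind of files actually sit in the table location.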
https://stackoverflow.com/questions/64709225