开发者

how to convert unix epoch time to date string in hive

I have a log file which contains timestamp column. The timestamp is in unix epoch time format.

I want to create a partition based on a timestamp with partitions year, month and day.

So far I开发者_运维技巧 have done this but it is throwing an error.

PARSE ERROR cannot recognize input '(' in column type

Here is my code.

from (
      from raw_data
            MAP  ${PREFIX}raw_data.line
            USING 's3://scripts/clean.py'
            AS (timestamp STRING, name STRING)
      ) map_out
INSERT OVERWRITE TABLE date_base_data_temp PARTITION(year(timestamp), month(timestamp)), day(timestamp))) 
    select map_out.name;


Oof, that looks ugly. Try using this function in Hive:

SELECT from_unixtime(unix_timestamp) as new_timestamp from raw_data ...

Or if timestamp is in ms instead of seconds:

SELECT from_unixtime(unix_timestamp DIV 1000) as new_timestamp from raw_data ...

That converts a unix timestamp into a YYYY-MM-DD HH:MM:SS format, then you can use the following functions to get the year, month, and day:

SELECT year(new_timestamp) as year, month(new_timestamp) as month, day(new_timestamp) as day ...


With more recent releases of Hive and SparkSQL, data type of date and type casting options are available. Following should work in Hive as well as Spark SQL

SELECT cast(from_unixtime(epoch_datetime) as date) from myHiveTable


If you need to convert the date in custom format, use this:

select date_format(from_unixtime(epoch_datetime),'yyyyMM') as formatted_date from myHiveTable;


which will return the date as yearMonth e.g. 201708


Adding this query to the list where the timestamp needs to be converted to date string yyyy-MM-dd for a string partition:

hive> select date_format(from_unixtime(epoch_datetime), 'yyyy-MM-dd') as day from table_name limit 20;

-- If required, remove the millis precision for timestamps
hive> select date_format(from_unixtime(cast(epoch_datetime/1000 as bigint)), 'yyyy-MM-dd') as day from table_name limit 20;


select order_id, date_format(from_unixtime(order_date/1000),'yyy-MM-dd') as order_date ,order_customer_id,order_status
from orders

or if you see any error on the same , try to use select order_id, date_format(from_unixtime(order_date DIV 1000),'yyy-MM-dd') as order_date ,order_customer_id,order_status from orders

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜