Channel: My Tech Notes

impala-shell may emit a control sequence in its output

Assume you have a table with two rows. What ends up in the output file /tmp/my_table_count after running the command below? Just 2? Actually, no. On my terminal the file also contains the control sequence "ESC[?1034h" in front of the count.

$ impala-shell -B -q "select count(1) from my_table" > /tmp/my_table_count
$ xxd /tmp/my_table_count
0000000: 1b5b 3f31 3033 3468 320a .[?1034h2.
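
To see how the stray bytes affect a script, here is a minimal sketch that recreates the captured value with printf (so it runs without impala-shell) and checks whether it is a plain integer. The helper name is_count is hypothetical and not part of impala-shell.

```shell
# Recreate the bytes from the hex dump above; $(...) drops the trailing newline.
polluted=$(printf '\033[?1034h2\n')

# Hypothetical helper: succeeds only if the argument consists of decimal digits.
is_count() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

if is_count "$polluted"; then
  echo "clean count: $polluted"
else
  # The value is 9 characters long instead of 1: ESC [ ? 1 0 3 4 h 2
  echo "not a plain integer (${#polluted} characters)"
fi
```

A check like this could run before the value is passed to any later statement, so that a polluted result stops the script instead of being written into table properties.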
This control sequence causes a problem when a script uses the result, for example to update a partition's numRows in Impala:

# Inside a shell function ("local" is only valid there)
local a=$(impala-shell -B -q "select count(1) from my_table where part_col='2014-08-28'")
impala-shell -q "alter table my_table partition(part_col='2014-08-28') set tblproperties('numRows'='$a')"
If you run this script, #Rows ends up with the wrong value because of the escape control sequence:

Query: show table stats my_table
+------------+-------+--------+--------+--------------+---------+
| part_col   | #Rows | #Files | Size   | Bytes Cached | Format  |
+------------+-------+--------+--------+--------------+---------+
| 2014-08-28 | -1    | 2      | 2.87KB | NOT CACHED   | PARQUET |
| Total      | -1    | 2      | 2.87KB | 0B           |         |
+------------+-------+--------+--------+--------------+---------+
Returned 2 row(s) in 0.06s
You can fix it by running impala-shell with TERM set to an empty value, like this:

local a=$(TERM= impala-shell -B -q "select count(1) from my_table where part_col='2014-08-28'")
impala-shell -q "alter table my_table partition(part_col='2014-08-28') set tblproperties('numRows'='$a')"
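
If overriding TERM is not an option, another approach is to strip the sequence from the captured output. The following is only a sketch under that assumption: strip_csi is a hypothetical helper, and its pattern covers CSI-style sequences such as ESC[?1034h, not every possible terminal escape.

```shell
# Hypothetical helper: remove CSI sequences (ESC [ params final-letter) from stdin.
strip_csi() {
  # printf supplies the literal ESC byte so the pattern does not rely on
  # sed-specific escapes such as \x1b.
  sed "s/$(printf '\033')\[[0-9;?]*[A-Za-z]//g"
}

# Demonstration with the same bytes as in the hex dump, no impala-shell needed.
clean=$(printf '\033[?1034h2\n' | strip_csi)
echo "$clean"   # prints 2

# Possible use in the script (assumes impala-shell is on PATH):
# a=$(impala-shell -B -q "select count(1) from my_table where part_col='2014-08-28'" | strip_csi)
```

Clearing TERM avoids the sequence at the source, while filtering only cleans up afterwards, so the filter is better treated as a safety net than as a replacement.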
