Avro

It took me a while to figure out how to write an Avro file which can be imported into Hive and Impala.

Avro 1.7.3 ships several output formats: AvroOutputFormat, AvroKeyOutputFormat, AvroKeyValueOutputFormat and AvroSequenceFileOutputFormat. Which one produces files that can be imported into Hive? Use AvroKeyOutputFormat in your MapReduce job to write Avro container files.
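One quick way to confirm that a job really wrote an Avro container file (and not, say, a SequenceFile) is to look at the first four bytes, which the Avro specification defines as `O`, `b`, `j`, 1. The following is only a minimal sketch: the class name `AvroMagicCheck` is made up for illustration, and it assumes the part file has already been copied to the local filesystem (for example with `hadoop fs -get`).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of Avro or Hadoop.
final class AvroMagicCheck {
    // Avro object container files start with the bytes 'O', 'b', 'j', 1.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    // True if the given bytes begin with the Avro container magic.
    static boolean hasAvroMagic(byte[] header) {
        return header.length >= AVRO_MAGIC.length
                && Arrays.equals(Arrays.copyOf(header, AVRO_MAGIC.length), AVRO_MAGIC);
    }

    // Reads the first four bytes of a local file and checks them.
    static boolean isAvroContainerFile(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] header = new byte[AVRO_MAGIC.length];
            int read = 0;
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    return false; // file is shorter than the magic
                }
                read += n;
            }
            return hasAvroMagic(header);
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a path to a local copy of a job output file.
        for (String arg : args) {
            boolean ok = isAvroContainerFile(Paths.get(arg));
            System.out.println(arg + ": " + (ok ? "Avro container file" : "not an Avro container file"));
        }
    }
}
```

A file written by AvroSequenceFileOutputFormat would fail this check, because a Hadoop SequenceFile starts with `SEQ` instead.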
You cannot specify any of the above output formats in the "stored as" clause of a Hive create table statement, because they don't implement HiveOutputFormat.
Follow the example on this page: https://cwiki.apache.org/confluence/display/Hive/AvroSerDe. Unfortunately, if you search for "Avro Hive", Google shows you this page instead: https://cwiki.apache.org/Hive/avroserde-working-with-avro-from-hive.html. It has a wrong example, and you will get an error message like:
```
FAILED: Error in metadata: Cannot validate serde: org.apache.hadoop.hive.serde2.AvroSerDe
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
```
What's wrong? The SerDe name should be org.apache.hadoop.hive.serde2.avro.AvroSerDe.
You don't have to define the columns, because Hive can get them from the Avro schema.
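To make the corrected statement concrete, here is a rough sketch that assembles a create table statement with the right SerDe name and no column list. It is written in Java only so it can be printed or handed to whatever client you use. The class `HiveAvroDdl`, the table name, the location and the schema URL are all placeholders I made up; the input and output format class names and the `avro.schema.url` property are the ones used on the AvroSerDe wiki page, and making the table external is my assumption for pointing Hive at files a MapReduce job already wrote.

```java
// Hypothetical helper that only builds a string; it does not talk to Hive.
final class HiveAvroDdl {
    // Note the ".avro." package segment, which the wrong example leaves out.
    static final String AVRO_SERDE =
            "org.apache.hadoop.hive.serde2.avro.AvroSerDe";
    static final String INPUT_FORMAT =
            "org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat";
    static final String OUTPUT_FORMAT =
            "org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat";

    // Builds the statement. No columns are listed; Hive derives them from
    // the Avro schema found at schemaUrl. Arguments are not escaped, so
    // this is not meant for untrusted input.
    static String createExternalTable(String table, String location, String schemaUrl) {
        return String.join("\n",
                "CREATE EXTERNAL TABLE " + table,
                "ROW FORMAT SERDE '" + AVRO_SERDE + "'",
                "STORED AS",
                "  INPUTFORMAT '" + INPUT_FORMAT + "'",
                "  OUTPUTFORMAT '" + OUTPUT_FORMAT + "'",
                "LOCATION '" + location + "'",
                "TBLPROPERTIES ('avro.schema.url'='" + schemaUrl + "');");
    }

    public static void main(String[] args) {
        // Placeholder names; replace them with your own table, output
        // directory and schema file.
        System.out.println(createExternalTable(
                "events",
                "/user/me/events_avro",
                "hdfs:///user/me/schemas/event.avsc"));
    }
}
```

The printed statement can be pasted into the Hive CLI. The location should be the directory that the AvroKeyOutputFormat job wrote to.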