Run MapReduce in Play development mode
When you want to invoke a MapReduce job in Play development mode, the first problem you will hit is that the jar file has not been generated yet, so job.setJarByClass(classOf[Mapper]) won't work. You can...
Install Impala 1.0 in Cloudera Manager 4.5.0
If you don't want to upgrade to 4.5.2, you can change the Impala parcel URL to get the Impala parcel: http://archive.cloudera.com/impala/parcels/latest/
How does Play set up the Ivy repository?
Play sets the Ivy repository to ${PLAY_HOME}/repository instead of using the default Ivy home $HOME/.ivy2. Check out the file ${PLAY_HOME}/play and you will find this at the end: "$JAVA"...
Play 2.1.1 Jdbc is not compatible with hive-jdbc-0.10.0-cdh4.2.0
If you want to access a Hive/Impala server in Play 2.1.1, you will encounter a strange error saying "Cannot connect to database [xxx]. Unfortunately no further information can help you identify where...
Play run in DEV mode and "ClassNotFoundException"
The Play "~run" command makes development much easier. You can change the code and test it without building, packaging and deploying. But it also causes an annoying "ClassNotFoundException". Our application...
Avro Schema Evolution
When I added a new nullable field to my Avro schema, Avro reported an AvroTypeException:
    SchemaEvolutionSpec: Avro can read the file using the old schema - when adding a new field *** FAILED ***...
Build boost for Impala in CentOS 6.3
CentOS 6.3 had only an RPM for boost_1.41.0 at the time I made the build, so I had to build boost from source myself. Clean up the old installation. Find all boost installations, then remove all old...
Impala build steps on CentOS 6.3
Build boost-1.42.0. Before building Impala, change be/CMakeLists.txt. I removed all boost RPMs and built the boost libraries from source. boost_date_time will be /usr/local/lib/libboost_date_time-mt.*. The...
Fix "value parse is not a member of object org.joda.time.DateTime"
When you search for this error message, you will find the answer that joda-convert is missing from your dependencies. The answer is correct.
    [error] MySpec.scala:27: value parse is not a member of object...
Avro
It took me a while to figure out how to write an Avro file that can be imported into Hive and Impala. There are many OutputFormat classes in Avro 1.7.3: AvroOutputFormat, AvroKeyOutputFormat,...
Avro-mapred-1.7.3-hadoop2 for AvroMultipleOutputs
I got the following error message in my MapReduce job when I ran it on a CDH 4.2.0 cluster:
    2013-06-18 12:50:11,095 FATAL org.apache.hadoop.mapred.Child: Error running child :...
Output a stream into multiple files in the specified percentages.
I recently finished a project which outputs JDBC results randomly into multiple files in the specified percentages. For example, I want to generate three files, which have 10%, 20%, and 70% of the...
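One way to think about this kind of split is to route each record independently with a random draw against cumulative percentage bounds. The following is only a minimal sketch under that assumption, not the implementation from the article; `PercentageSplitter`, `pickBucket`, and `split` are hypothetical names, and in-memory vectors stand in for the output files.

```scala
import scala.util.Random

object PercentageSplitter {
  /** Picks a bucket index for a draw in [0, 100), given percentages that sum to 100. */
  def pickBucket(percentages: Seq[Int], draw: Int): Int = {
    require(percentages.sum == 100, "percentages must sum to 100")
    require(draw >= 0 && draw < 100, "draw must be in [0, 100)")
    // Cumulative upper bounds: Seq(10, 20, 70) becomes Seq(10, 30, 100).
    val bounds = percentages.scanLeft(0)(_ + _).tail
    bounds.indexWhere(draw < _)
  }

  /** Routes every record to exactly one bucket; the resulting shares are approximate. */
  def split[A](records: Iterator[A], percentages: Seq[Int], rng: Random): Vector[Vector[A]] = {
    // One builder per target; in a real job each would be a file writer (assumption).
    val buckets = Vector.fill(percentages.size)(Vector.newBuilder[A])
    records.foreach(r => buckets(pickBucket(percentages, rng.nextInt(100))) += r)
    buckets.map(_.result())
  }
}
```

Because each record is drawn independently, the shares only approach the requested percentages as the number of records grows; an exact split would need counting instead of random draws.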
Don't call filesystem.close in Hadoop
Recently the MapReduce jobs in my project suddenly failed. It turned out that my colleague had added a line to the code that closes the filesystem.
    val path = new Path("/user/bewang/data")
    val fs =...
Scala and Java collection interoperability in MapReduce job
In Scala, you can import scala.collection.JavaConversions._ to make collections interoperable between Scala and Java, for example scala.collection.Iterable <=> java.lang.Iterable. Usually I...
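As a small illustration of the Iterable conversion mentioned above, here is a hedged sketch that sums values arriving as a java.lang.Iterable, which is the shape a reducer receives. It deliberately swaps in the explicit `scala.collection.JavaConverters` (`asScala`) rather than the implicit `JavaConversions` import, because the implicit version is no longer available in newer Scala releases; `IterableInterop` and `sum` are hypothetical names, and no Hadoop classes are involved.

```scala
import scala.collection.JavaConverters._

object IterableInterop {
  /** Sums boxed integers handed over as a java.lang.Iterable. */
  def sum(values: java.lang.Iterable[java.lang.Integer]): Int =
    // asScala wraps the Java Iterable as a Scala Iterable without copying it.
    values.asScala.map(_.intValue).sum
}
```

With `JavaConversions._` imported instead, the same method body could call `values.map(...)` directly, at the cost of the conversion being invisible at the call site.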
Fedora 19 XBMC Autologin in XFCE
Create user xbmc and set its password:
    useradd -g media xbmc
    passwd xbmc
Log on as xbmc and choose XBMC as the session instead of the Xfce session. Modify /etc/lightdm/lightdm.conf; I added the following into section...
Make Eclipse Font Smaller in Gnome/Xfce
There are a lot of posts on the Internet. Here is the one that worked for me. To make the menu font smaller, create ~/.gtkrc-eclipse:
    style "eclipse" { font_name = "Sans 9" }
    class "GtkWidget" style "eclipse"
If...
Deploy LZO for YARN in CDH4
The page "Using the LZO Parcel" covers only MRv1, not YARN. It took me a while to figure out how to set up LZO in YARN correctly. You may experience different error messages if you do not configure...
Convert a bag of key-value pairs to map in Pig
Here is my input file, data.txt:
    id_1 key_1 v_i1_a1
    id_1 key_2 v_i1_a2
    id_2 key_1 v_i2_a1
    id_1 key_3 v_i1_a3
    id_2 key_3 v_i2_a3
    id_1 key_4 v_i1_a4
I want to get a map of key-value pairs for an ID like this:...
Maven Test
Skip the unit tests and run the integration tests directly:
    $ mvn test-compile failsafe:integration-test
Run a single integration test in a multi-module project:
    $ mvn -am -pl :sub-module...
Start screen with different windows with different working directories
Create ~/.screenrc with the following lines:
    chdir $HOME/workspace/core-project
    screen -t core
    chdir $HOME/workspace/hadoop-project
    screen -t hdp
When you run screen, there are two windows named "core" and...