Hive Convert Array to String
It is simple if it is an array of strings: there is a UDF called concat_ws(SEP, array<string>). But concat_ws can only handle array<string>, not arrays of other element types. Hive doesn't support casting such an array to array<string> or to string. You cannot...
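The general idea behind any workaround, turning each element into a string before joining, can be illustrated outside Hive. The Python sketch below is only an analogy of what concat_ws does for string arrays; it is not Hive code, and join_as_strings is a hypothetical name.

```python
from typing import Iterable


def join_as_strings(sep: str, items: Iterable[object]) -> str:
    """Join items with sep, converting each element to a string first."""
    # str.join accepts only strings, much like concat_ws accepts only
    # array<string>, so every element is converted explicitly.
    return sep.join(str(item) for item in items)


print(join_as_strings(",", [1, 2, 3]))
```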
Mark down test
Hellotopic 1abc def taste-profile { recommend.users = 20 join.on.artist.tasks = 600 distinct.user.pair.tasks = 600 top.artists.tasks = 800 count.artists.tasks = 64 groupby.user.tasks = 64...
Using HiveContext to read Hive Tables
I just tried using Spark's HiveContext to read tables in the Hive Metastore. It is very simple, but many things are still not clearly documented. Here is what I recorded to make it work. I...
Using Hadoop distcp to copy files from an SFTP server
Here are the steps I use to copy files from an SFTP server to HDFS with "hadoop distcp": Clone hadoop-filesystem-sftp at https://github.com/wnagele/hadoop-filesystem-sftp.git. There is a bug in...
How to resolve spark-cassandra-connector Guava version conflicts in Yarn...
When you use spark-cassandra-connector, you will run into the “Guava version conflicts” problem when you submit your job in Yarn cluster mode. spark-cassandra-connector usually uses the latest...
How to resolve spark-cassandra-connector's Guava version conflict in spark-shell
In my post How to resolve spark-cassandra-connector Guava version conflicts in Yarn cluster mode, I explained how to resolve the Guava version issue in Yarn cluster mode. This post covers how to do it in...
Spark Cassandra Connector and DataFrame
When you write a DataFrame to a Cassandra table, be careful with SaveMode.Overwrite. In spark-cassandra-connector-1.6.0-M2, TRUNCATE $keyspace.$table will be called. See the code in...
How to make @timestamp use GMT when using Fluentd, Elasticsearch and Kibana?
My log is a JSON one-liner output by a Node.js application; it has a field called “time” that holds GMT time: { "req": {}, "time":"2016-05-12T19:18:38.123Z"}. I want to keep the timestamp in GMT in...
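As a minimal sketch of the input side only, and not the Fluentd configuration the post goes on to describe, the Python below parses a log line of the shape shown above and treats the trailing Z as marking the "time" field as GMT/UTC. The function name parse_log_time is hypothetical.

```python
import json
from datetime import datetime, timezone


def parse_log_time(line: str) -> datetime:
    """Return the "time" field of a JSON log line as a UTC-aware datetime."""
    record = json.loads(line)
    # The literal "Z" suffix means UTC; strptime yields a naive datetime,
    # so the UTC timezone is attached explicitly afterwards.
    naive = datetime.strptime(record["time"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return naive.replace(tzinfo=timezone.utc)


sample = '{ "req": {}, "time":"2016-05-12T19:18:38.123Z"}'
ts = parse_log_time(sample)
print(ts.isoformat())
```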
How to create .epub and .mobi versions of the Gradle User Guide?
The Gradle User Guide is written in DocBook, and the Gradle build already produces single-page HTML and PDF versions. But I really want to load it onto my Kindle. Because DocBook supports converting DocBook to epub and...
Bring back google-chrome after upgrading to CentOS 6.8 and Chrome 51.
I don’t know which one is the root cause: upgrading to CentOS 6.8 or to Chrome 51. I used install_chrome.sh from http://chrome.richardlloyd.org.uk/ to install Google Chrome on my CentOS 6 VirtualBox VM. It...
Make Spark read Teradata directly.
Spark SQL supports reading from JDBC sources, but the paragraph in the latest 2.0 documentation is not really helpful: The JDBC driver class must be visible to the primordial class loader on the client session and...
Record Request and Response of HTTP Samplers in JMeter
Add “Simple Data Writer”. Give the file name, like “result.xml”. Click the Configure button. Check the radio buttons for Save As XML, Save URL, and Save Response Data (XML). Start the JMeter test. Check the XML file. If you...
Gradle show dependencies of the specified configuration
I record the method here in case I forget again how to show the dependencies for ONLY one configuration, such as runtime. Show dependencies for one configuration ONLY: ./gradlew dependencies --configuration...
Don't set limit nofile to unlimited
After I set nofile to unlimited in /etc/security/limits.d/90-nproc.conf like this,
* hard nproc unlimited
* hard nofile unlimited
* soft nproc unlimited
* soft nofile unlimited
I could not boot into Gnome...
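Before editing a limits file, it can help to see what the current process actually has. This is a minimal sketch, assuming a Unix system where Python's standard resource module is available; the helper describe_limit is a hypothetical name. It prints the soft and hard nofile limits of the running process:

```python
import resource


def describe_limit(value: int) -> str:
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


# RLIMIT_NOFILE is the per-process cap on open file descriptors,
# which is what the "nofile" item in limits.conf controls.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("nofile soft:", describe_limit(soft))
print("nofile hard:", describe_limit(hard))
```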
VNC Viewer
I recently built a development desktop in my company’s CORP network. Using VNC, I can access the same Gnome session even from home. Once I connect to the VPN and start a VNC viewer, I can...
Build Git RPM on CentOS 7
I want to use core.hooksPath, which has been supported since Git 2.9, but the default Git version on CentOS 7.3 is still 1.8.3:
$ yum list git
Available Packages
git.x86_64 1.8.3.1-6.el7_2.1 base
I have to compile from...
Install Docker Toolbox for Windows Automatically
Docker Toolbox for Windows has command-line arguments that allow you to install it without user involvement. You can run this command in Windows Command Prompt to get those arguments: >...
Install Cygwin in a script
I built an IntelliJ Docker image based on CentOS 7. To let my colleagues use it on Windows, I need to help them set up an X Window System. This post describes how to set up Cygwin/X using a script...
Screen in docker with error "Must be connected to a terminal"
I have a running Docker container started with the command /bin/bash --login. When I run the command
$ docker exec -it devsh /bin/bash --login
I can access the container’s Bash, but when I run screen, I get this...
Gnome Terminal Profile Export
I usually run screen on a remote server in the PROD environment. Whenever I want to access the server, I SSH to that host and run screen -D -R to resume the session I left earlier. Using screen, you don’t...