Custom JSON serialization for enums using Jackson

Jackson is an easy-to-use library for serializing classes into, well, JSON. No surprises there. However, you can also customize how an enum gets serialized, which makes it quite powerful.

Now why would you want to customize the serialization of enums? Well, if you are working with a lot of data, the serialized output is going to take up a lot of space. For example, consider working with a document store like MongoDB where you can have millions of documents. Shortening the Strings used by a few characters here and there can save a ton of space, which in turn lets you fit more documents into memory or work with smaller SSDs.
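To put a rough number on that, here is a back-of-the-envelope sketch. The document count is purely hypothetical, and the class and method names (EnumSpaceEstimate, utf8Length, bytesSaved) are made up for illustration; it only counts the bytes of the enum name itself and ignores any compression the store may apply.

```java
import java.nio.charset.StandardCharsets;

class EnumSpaceEstimate {

    // Number of bytes a String occupies once encoded as UTF-8, as it would be inside a JSON document.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes saved when every occurrence of longName is written as shortName instead.
    static long bytesSaved(String longName, String shortName, long occurrences) {
        return (long) (utf8Length(longName) - utf8Length(shortName)) * occurrences;
    }

    public static void main(String[] args) {
        long documents = 10_000_000L; // hypothetical: one enum value per document
        System.out.println(bytesSaved("HELLO", "h", documents) + " bytes saved");
        // prints "40000000 bytes saved"
    }
}
```

Four bytes per value does not sound like much, but it is multiplied by every document and every enum field in it.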

There are a few bits that you need to worry about when adding this customization to your enum:

  1. Add an attribute to the enum to contain the shortened String value; this value should have fewer characters than the enum value it represents ;-)
  2. Override the enum’s toString() to return the shortened String
  3. Use the @JsonCreator annotation on a static method that will handle deserializing the shortened String into the correct enum value

As per usual, below is a code sample:

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.codehaus.jackson.annotate.JsonCreator;
import org.codehaus.jackson.map.ObjectMapper;
import org.codehaus.jackson.type.TypeReference;

enum HelloEnum {
    HELLO("h"),
    WORLD("w");

    private String shortName;

    HelloEnum (String shortName) {
        this.shortName = shortName;
    }

    @Override
    public String toString() {
        return shortName;
    }

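    // Jackson calls this factory method to turn the short String back into an enum value.
    @JsonCreator
    public static HelloEnum create (String value) {
        if(value == null) {
            throw new IllegalArgumentException("value must not be null");
        }
        for(HelloEnum v : values()) {
            if(value.equals(v.getShortName())) {
                return v;
            }
        }
        throw new IllegalArgumentException("Unknown short name: " + value);
    }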

    public String getShortName() {
        return shortName;
    }
}

public class ExampleHelloEnum {
    public static void main(String args[]) throws IOException {
        ObjectMapper objectMapper = new ObjectMapper();

        Map<HelloEnum,String> testMap = new HashMap<HelloEnum,String>();
        testMap.put(HelloEnum.HELLO, "hello string");
        testMap.put(HelloEnum.WORLD, "world string");

        String json = objectMapper.writeValueAsString(testMap);
        System.out.println(json);

        Map<HelloEnum,String> newTestMap = objectMapper.readValue(json, new TypeReference<Map<HelloEnum,String>>() {});

        System.out.println(newTestMap.get(HelloEnum.HELLO));
        System.out.println(newTestMap.get(HelloEnum.WORLD));
    }
}

And here is the output:

{"h":"hello string","w":"world string"}
hello string
world string
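The linear scan in create() is perfectly fine for two values. For an enum with many values, one possible variation is to build the lookup table once up front. The sketch below is plain Java without the Jackson annotation so that it runs on its own; the names Greeting, BY_SHORT_NAME, and fromShortName are made up for illustration, and in the Jackson version the lookup method would presumably carry @JsonCreator just like create() does.

```java
import java.util.HashMap;
import java.util.Map;

enum Greeting {
    HELLO("h"),
    WORLD("w");

    // Built once when the enum class is loaded; maps each short name to its enum value.
    private static final Map<String, Greeting> BY_SHORT_NAME = new HashMap<>();

    static {
        for (Greeting g : values()) {
            BY_SHORT_NAME.put(g.shortName, g);
        }
    }

    private final String shortName;

    Greeting(String shortName) {
        this.shortName = shortName;
    }

    // Looks up the enum value for a short name; null and unknown names are rejected.
    static Greeting fromShortName(String value) {
        Greeting g = BY_SHORT_NAME.get(value);
        if (g == null) {
            throw new IllegalArgumentException("Unknown short name: " + value);
        }
        return g;
    }
}

class GreetingDemo {
    public static void main(String[] args) {
        System.out.println(Greeting.fromShortName("w")); // prints WORLD
    }
}
```

Note that this sketch does not override toString(), so printing the value shows the enum name rather than the short name.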

Using Apache commons lang ToStringBuilder to generate tab delimited Strings

toString methods tend to be tedious to create. In fact, that is one of the reasons why a lot of IDEs like Eclipse provide tools for generating and maintaining them. The problem is that the result is still code you have to carry around, and it can be pretty bloated. ToStringBuilder pretty much eliminates the need to hand-write a toString method.

You can use ToStringBuilder right out of the box to generate a basic String representation of your objects. If you want, though, it is possible to tailor those representations to look how you want by applying a ToStringStyle. In the example below, I have created an instance of StandardToStringStyle that enables the printing of a tab delimited String; these instances are meant to be used as singletons. ToStringBuilder is a pretty useful utility if you have to generate TSV or CSV files from POJOs.

import org.apache.commons.lang3.builder.StandardToStringStyle;
import org.apache.commons.lang3.builder.ToStringBuilder;

public class SampleBean {

    private String a;
    private int b;
    private String c;

    private static StandardToStringStyle toStringStyle;

    static {
        toStringStyle = new StandardToStringStyle();
        toStringStyle.setUseClassName(false);
        toStringStyle.setUseIdentityHashCode(false);
        toStringStyle.setContentStart("");
        toStringStyle.setContentEnd("");
        toStringStyle.setFieldSeparator("\t");
    }

    public static void main (String args[]) {
        SampleBean sampleBean = new SampleBean();

        sampleBean.setA("hello");
        sampleBean.setB(42);
        sampleBean.setC("world");

        System.out.println(sampleBean);

        System.out.println(sampleBean.tabbedToString());
    }

    @Override
    public String toString() {
        return ToStringBuilder.reflectionToString(this, toStringStyle);
    }

    public String tabbedToString() {
        return new ToStringBuilder(this, toStringStyle)
            .append(c)
            .append(a)
            .toString();
    }

    public void setA(String a) {
        this.a = a;
    }

    public void setB(int b) {
        this.b = b;
    }

    public void setC(String c) {
        this.c = c;
    }

}

The output from the above code should look like the following (the fields are separated by tab characters):

a=hello    b=42    c=world
world    hello
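One thing to watch for when writing real TSV files: as far as I can tell, the style above writes field values as they are, so a value that itself contains a tab or a newline would break the row. The sketch below shows one way to guard against that. It uses plain String.join rather than ToStringBuilder so that it runs without Commons Lang, the names TsvRow, clean, and join are made up for illustration, and replacing the offending characters with a space is just one assumed convention.

```java
import java.util.ArrayList;
import java.util.List;

class TsvRow {

    // Turns a value into a single TSV cell: null becomes empty, tabs and line breaks become spaces.
    static String clean(Object value) {
        if (value == null) {
            return "";
        }
        return value.toString()
                .replace('\t', ' ')
                .replace('\n', ' ')
                .replace('\r', ' ');
    }

    // Joins the cleaned values with tab characters to form one row.
    static String join(Object... values) {
        List<String> cells = new ArrayList<>();
        for (Object value : values) {
            cells.add(clean(value));
        }
        return String.join("\t", cells);
    }

    public static void main(String[] args) {
        System.out.println(join("hello", 42, "world"));
        System.out.println(join("tab\tinside", "next"));
    }
}
```

The same cleaning step could be applied to each value before handing it to ToStringBuilder.append().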

Static hosting using AWS S3

Oh boy, it has been a while since I posted here. Amazon has had me really busy over the past year. Let’s see if I can get back to posting fun bits of tech wisdom.

I recently started hosting my static content (images, PDFs, and tarballs) on S3. It is pretty easy to set up. I basically just followed the example in the AWS documentation. It is nice because if your site has low traffic (like mine) then the hosting is free. Even if you have a lot of traffic, the cost of hosting on S3 is mere peanuts compared to what you would pay for a virtual server.

How to create a kiosk with a Mac mini

It has been a while since I last posted. The new city and new job have certainly kept me busy but things are starting to get back to normal.

An interesting little hack I put together for my team at Amazon is a Mac mini kiosk. We use it to display dashboards, trouble ticket queues, and feature demos in our area. It is a great little thing to have as it keeps the team aware of what is going on and gives people an interesting platform to show off features that they have been developing. It is particularly good for people who pass by because they get to see these cool things too.

Setting up a kiosk is not that hard but there are a few things you should have in order to make it safe and easy to use. In this post, I will be outlining some of the bits that we have in ours.

A Mac mini, TV, and TV stand

For the hardware, we use a Mac mini. It is cheap and small and can run forever. It is also fairly easy to secure one with a transparent screensaver. You do not need to use a Mac mini, however; it is just easier with one.

A transparent screensaver

In order to secure your kiosk, you will need to install a transparent screensaver or at least one that acts like it. Pellucid is an open source screensaver that captures and displays screenshots of a Mac’s desktop. You might need to modify the code though. Screenshots are taken every tenth of a second (line 22) and there is a grey transparency (lines 75-79) that you probably want to remove. Aside from that, it is pretty much ready to go. After you make your desired changes, compile it with Xcode and you should be able to install it as a screensaver.

http://code.google.com/p/kaincode/source/browse/trunk/Pellucid/?r=127

Firefox

We display content on our kiosk with Firefox. It is pretty convenient to use a Web browser as, generally speaking, your ticket queues and dashboards are accessible over the Web, internal or otherwise. Firefox has a series of extensions that allow you to rotate through a set of Web pages and customize them.

Tab Slideshow

To rotate through Web pages, or rather tabs, you can use the extension Tab Slideshow. With this extension, you can set the rate at which tabs are rotated and have them refreshed just before being displayed.

https://addons.mozilla.org/en-US/firefox/addon/tab-slideshow/

GreaseMonkey

Oftentimes the demo or dashboard you want to display could use some tweaking, or you might want to combine it with another page. GreaseMonkey is a great extension for doing just that. With a little JavaScript, you can make custom dashboards or mash-ups and not have to host them.

https://addons.mozilla.org/en-US/firefox/addon/greasemonkey/

After having worked at Gilt for a couple years, I think I can safely say that Seattle has a different sense of fashion.

So I have been in Seattle for about 24 hours now and I am starting to figure out the quirks of where I am living. There are drastically fewer people here than in New York, which is great during the day but makes me worried about zombies at night (thank you AMC Walking Dead). I discovered this morning while shopping for home goods that there is a monorail that goes between the downtown core and my place. Magnets now hurtle me between where I sleep and where I shop.

Configuring Apache Tika’s HtmlParser

So in my previous post about Apache Tika, I showed off a small Hello World program that demonstrated how you can quickly use it to parse HTML files. One of the first issues you will probably encounter using Tika though is that its HtmlParser does not handle all tags out of the box. For example, the code tag is not recognized. To deal with that, you need to create a custom HtmlMapper. In the code example below, I created an HtmlMapper that accepts all tags. In addition to expanding the number of tags that the HtmlParser can handle, custom HtmlMappers are great for isolating specific blocks that you are interested in by discarding the ones that you do not care about.

import java.io.InputStream;
import java.net.URL;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.html.HtmlMapper;
import org.apache.tika.parser.html.HtmlParser;
import org.apache.tika.sax.ToHTMLContentHandler;

public class HelloApacheTika2 {
    public static void main (String args[]) throws Exception {
        URL url = new URL("http://chrisjordan.ca/post/15345467825/configuring-apache-tikas-htmlparser");
        InputStream input = url.openStream();

        ToHTMLContentHandler toHTMLHandler = new ToHTMLContentHandler();
        Metadata metadata = new Metadata();
        ParseContext parseContext = new ParseContext();
        parseContext.set(HtmlMapper.class, new AllTagMapper());
        HtmlParser parser = new HtmlParser();

        parser.parse(input, toHTMLHandler, metadata, parseContext);
        System.out.println(toHTMLHandler.toString());
    }
}

/**
 * An HtmlMapper that accepts all tags and attributes.
 */
class AllTagMapper implements HtmlMapper {
 
    @Override
    public String mapSafeElement(String name) {
        return name.toLowerCase();
    }

    @Override
    public boolean isDiscardElement(String name) {
        return false;
    }

    @Override
    public String mapSafeAttribute(String elementName, String attributeName) {
        return attributeName.toLowerCase();
    }

}
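For the "discard what you do not care about" case, a mapper could look roughly like the sketch below. To keep it runnable without Tika on the classpath, the class deliberately does not import anything from Tika; it only mirrors the three methods that AllTagMapper implements above, and the assumption is that adding "implements HtmlMapper" is all that is needed to plug it into the ParseContext. The class name DiscardingTagMapper and the choice of nav, footer, and script as unwanted tags are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Mirrors the three HtmlMapper methods; add "implements HtmlMapper" to use it with Tika (assumption).
class DiscardingTagMapper {

    // Hypothetical list of elements whose content should be dropped from the output.
    private static final Set<String> DISCARDED =
            new HashSet<>(Arrays.asList("nav", "footer", "script"));

    // Accept every element, normalized to lower case.
    public String mapSafeElement(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    // Discard only the elements named in DISCARDED, regardless of case.
    public boolean isDiscardElement(String name) {
        return DISCARDED.contains(name.toLowerCase(Locale.ROOT));
    }

    // Accept every attribute, normalized to lower case.
    public String mapSafeAttribute(String elementName, String attributeName) {
        return attributeName.toLowerCase(Locale.ROOT);
    }
}

class DiscardingTagMapperDemo {
    public static void main(String[] args) {
        DiscardingTagMapper mapper = new DiscardingTagMapper();
        System.out.println(mapper.isDiscardElement("NAV"));  // prints true
        System.out.println(mapper.isDiscardElement("code")); // prints false
    }
}
```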

Parsing HTML with Apache Tika

Every now and then, I have to parse some HTML files. There are a lot of ways you can go about doing that. Recently, I have started using Apache Tika and it does a pretty reasonable job (i.e. better than what I have done before). There is not a lot of documentation on Tika so I had to do a bit of hacking to get my head around it.

A good start is this quick Hello World Tika program I put together. It parses this article. The TeeContentHandler passes data from the HtmlParser to the ContentHandlers that it has been initialized with. For the purposes of this example, I am showing off three different handlers. The LinkContentHandler is great for extracting links; useful for crawlers. The BodyContentHandler extracts all the text on a page; useful for indexers. The ToHTMLContentHandler produces XHTML; useful for extracting specific blocks of text, which is also good for indexers. One thing to be aware of when using the HtmlParser is that natively, it does not support all tags. For example, it currently skips over the code tag. My next post will explain how to configure the HtmlParser to not do that :-)

import java.io.InputStream;
import java.net.URL;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.html.HtmlParser;
import org.apache.tika.sax.BodyContentHandler;
import org.apache.tika.sax.LinkContentHandler;
import org.apache.tika.sax.TeeContentHandler;
import org.apache.tika.sax.ToHTMLContentHandler;
import org.xml.sax.ContentHandler;

public class HelloApacheTika {

    public static void main (String args[]) throws Exception {
        URL url = new URL("http://chrisjordan.ca/post/15219674437/parsing-html-with-apache-tika");
        InputStream input = url.openStream();
        LinkContentHandler linkHandler = new LinkContentHandler();
        ContentHandler textHandler = new BodyContentHandler();
        ToHTMLContentHandler toHTMLHandler = new ToHTMLContentHandler();
        TeeContentHandler teeHandler = new TeeContentHandler(linkHandler, textHandler, toHTMLHandler);
        Metadata metadata = new Metadata();
        ParseContext parseContext = new ParseContext();
        HtmlParser parser = new HtmlParser();
        parser.parse(input, teeHandler, metadata, parseContext);
        System.out.println("title:\n" + metadata.get("title"));
        System.out.println("links:\n" + linkHandler.getLinks());
        System.out.println("text:\n" + textHandler.toString());
        System.out.println("html:\n" + toHTMLHandler.toString());
    }
}

If you are using Maven, you need to add the following dependencies:

<!-- Apache Tika -->
<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>1.0</version>
</dependency>

<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>1.0</version>
</dependency>
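If the tee idea is new to you, the sketch below shows the pattern using only the SAX classes that ship with the JDK, so it runs without Tika. It is not how TeeContentHandler is actually implemented; MiniTeeHandler, LinkCollector, TextCollector, and MiniTeeDemo are made-up names, only two SAX events are forwarded, and the input has to be well-formed XHTML because the JDK parser is an XML parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Forwards each SAX event it receives to every delegate (only the two events used here).
class MiniTeeHandler extends DefaultHandler {
    private final DefaultHandler[] delegates;

    MiniTeeHandler(DefaultHandler... delegates) {
        this.delegates = delegates;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
        for (DefaultHandler d : delegates) {
            d.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        for (DefaultHandler d : delegates) {
            d.characters(ch, start, length);
        }
    }
}

// Collects the href attribute of every <a> element.
class LinkCollector extends DefaultHandler {
    final List<String> links = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("a".equals(qName) && atts.getValue("href") != null) {
            links.add(atts.getValue("href"));
        }
    }
}

// Collects all character data, ignoring the markup.
class TextCollector extends DefaultHandler {
    final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }
}

class MiniTeeDemo {
    // Parses a well-formed XHTML snippet and feeds the events to the given handler.
    static void parse(String xhtml, DefaultHandler handler) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
    }

    public static void main(String[] args) throws Exception {
        LinkCollector links = new LinkCollector();
        TextCollector text = new TextCollector();
        parse("<p>Hello <a href=\"http://example.com/\">Tika</a></p>", new MiniTeeHandler(links, text));
        System.out.println(links.links); // prints [http://example.com/]
        System.out.println(text.text);   // prints Hello Tika
    }
}
```

The document is parsed once, and each collector sees the same stream of events, which is the same division of labor as the link, text, and HTML handlers in the Tika program.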