January 2012
6 posts
1 tag
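Get the size of a PostgreSQL table →
From time to time, you need to figure out how much data is in a table. This bit of SQL, specific to Postgres, returns the on-disk size of the table you name, in human-readable form. Note that pg_relation_size counts only the table's own data, not its indexes.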
SELECT pg_size_pretty(pg_relation_size('your_table'));
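If you also care about the space taken by indexes, a variation along these lines may help. It is only a sketch: 'your_table' is the same placeholder as above, the column aliases are made up for readability, and it assumes PostgreSQL 9.0 or later, where pg_indexes_size is available.

```sql
-- 'your_table' is a placeholder; substitute your own table name.
-- The aliases (table_only, indexes, total) are illustrative only.
SELECT
    pg_size_pretty(pg_relation_size('your_table'))       AS table_only,  -- main table data only
    pg_size_pretty(pg_indexes_size('your_table'))        AS indexes,     -- all indexes on the table
    pg_size_pretty(pg_total_relation_size('your_table')) AS total;       -- table + indexes + TOAST
```

The total is usually the number you want when you are asking how much disk a table really costs you.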
3 tags
Configuring Apache Tika's HtmlParser
In my previous post about Apache Tika, I showed off a small Hello World program that demonstrated how you can quickly use it to parse HTML files. One of the first issues you will probably run into with Tika, though, is that its HtmlParser does not handle all tags out of the box. For example, the code tag is not recognized. To deal with that, you need to create a custom HtmlMapper. In the code...
3 tags
Parsing HTML with Apache Tika
Every now and then, I have to parse some HTML files. There are a lot of ways you can go about doing that. Recently, I started using Apache Tika, and it does a pretty reasonable job (i.e. better than what I have done before). There is not a lot of documentation on Tika, so I had to do a bit of hacking to get my head around it.
A good start is this quick Hello World Tika program I put together....
2 tags
Characteristics of slow SQL Queries →
I actually forgot I posted this answer on Stack Overflow until today when my cousin complimented me on it :-)