Using Dependency Injection To Incorporate A/B Testing Into Your Applications
I posted this article to my company’s tech blog a few weeks back. I am reposting it here because… well… I wrote it :-)
http://tech.gilt.com/post/8391205906/using-dependency-injection-to-incorporate-a-b-testing
If you work at an e-commerce company, chances are you have come across the term “A/B testing”. Most of us know that it has something to do with trying out new features on users and seeing which ones are “better”. More precisely, A/B testing is the practice of comparing the effect that a feature (a “treatment” in statistical terms) has on different groups of users. For instance, you might be developing a new navigation bar for your website. You can see if it increases how long users stay on your site by testing it on one group of users (A) and comparing them to users in another group (B) who are using the original navigation bar. Again in statistical lingo, group B would be considered your “control group”. It’s important to note that you can test several navigation bars at once without violating the principles behind A/B testing, since your treatment is the type of bar you are exposing.
Now you are probably wondering how dependency injection (DI) factors into A/B testing. DI is the design principle where your objects are assigned references to their dependencies at runtime. Generally speaking, objects using DI refer to their dependencies through interfaces or super classes so they are indifferent to how those dependencies are actually implemented and instantiated. There are a few frameworks in Java that leverage DI. For the work that we do at Gilt, we use Java Spring. The example in this post is in Spring but it’s general enough that it can be applied to other frameworks.
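To make the idea concrete, here is a minimal sketch of constructor injection in plain Java, with no framework involved. The NavBar interface, the HomePage class, and the render methods are made up for illustration; BaselineNavBar borrows a class name used later in this post, but its body here is an assumption.

```java
// Hypothetical interface: the page only knows about this abstraction.
interface NavBar {
    String render();
}

// One possible implementation; the markup it returns is invented for the example.
class BaselineNavBar implements NavBar {
    @Override
    public String render() {
        return "<nav>baseline</nav>";
    }
}

// HomePage never instantiates a NavBar itself; whoever builds it supplies one.
class HomePage {
    private final NavBar navBar;

    HomePage(NavBar navBar) {
        this.navBar = navBar;
    }

    String render() {
        return navBar.render() + "<main>content</main>";
    }
}

class DependencyInjectionSketch {
    public static void main(String[] args) {
        // The wiring happens here, at runtime. A framework such as Spring
        // would perform this step for you based on its configuration.
        HomePage page = new HomePage(new BaselineNavBar());
        System.out.println(page.render());
    }
}
```

Because HomePage depends only on the interface, swapping in a different navigation bar means changing the wiring, not the page.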
So now that we know what DI is, let’s apply it to make A/B testing a part of your applications. In A/B testing, groups of users are exposed to a given treatment. That means we need to have a mapping between groups of users and the treatment that they are receiving. Java Spring makes that very easy; in your Spring configuration, you can create such a mapping using the map tag. For example:
<util:map id="ABTestMapping">
    <entry>
        <key>
            <value>group 1</value>
        </key>
        <ref bean="baseline_nav" />
    </entry>
    <entry>
        <key>
            <value>group 2</value>
        </key>
        <ref bean="test_nav1" />
    </entry>
    <entry>
        <key>
            <value>group 3</value>
        </key>
        <ref bean="test_nav2" />
    </entry>
</util:map>
<bean id="baseline_nav" class="com.gilt.examples.BaselineNavBar"/>
<bean id="test_nav1" class="com.gilt.examples.NewNavBar1"/>
<bean id="test_nav2" class="com.gilt.examples.NewNavBar2"/>
So in this map, our keys are the different test groups of users and their corresponding values are the different navigation bars we will expose to them. This example is a bit contrived but it illustrates the point quite well. Instead of having to code A/B testing into your application, you can externalize it in a configuration. Java Spring instantiates this map and your application can use it to determine which navigation bar a user should see. A nifty trick in Spring is to use the import tag in your configuration.
<import resource="file:test_buckets.xml"/>
This tag allows you to put your A/B testing configuration into a separate file which is very powerful. Doing so allows you to change which groups receive various treatments without having to recompile or redeploy your code.
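On the application side, using the mapping could look something like the sketch below. It is plain Java so that it runs without Spring: the map is built by hand in main, where Spring would instead inject the ABTestMapping bean. The NavBar interface, the NavBarSelector class, its forGroup method, and the fallback to the baseline for unknown groups are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface shared by every navigation bar under test.
interface NavBar {
    String name();
}

class BaselineNavBar implements NavBar {
    public String name() { return "baseline"; }
}

class NewNavBar1 implements NavBar {
    public String name() { return "new-1"; }
}

// Receives the group-to-treatment map from outside instead of building it.
class NavBarSelector {
    private final Map<String, NavBar> abTestMapping;
    private final NavBar fallback;

    NavBarSelector(Map<String, NavBar> abTestMapping, NavBar fallback) {
        this.abTestMapping = abTestMapping;
        this.fallback = fallback;
    }

    // Users whose group is not in the map get the fallback (assumed to be the control).
    NavBar forGroup(String group) {
        NavBar navBar = abTestMapping.get(group);
        return navBar != null ? navBar : fallback;
    }
}

class ABTestSketch {
    public static void main(String[] args) {
        NavBar baseline = new BaselineNavBar();
        // Stand-in for the map that Spring would build from the XML configuration.
        Map<String, NavBar> mapping = new LinkedHashMap<>();
        mapping.put("group 1", baseline);
        mapping.put("group 2", new NewNavBar1());

        NavBarSelector selector = new NavBarSelector(mapping, baseline);
        System.out.println(selector.forGroup("group 2").name());
        System.out.println(selector.forGroup("unknown group").name());
    }
}
```

The selector contains no test-specific logic, so changing which group sees which bar only requires editing the configuration.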
Simple. Fast. Very Useful.
Now in Java 1.7 - Support for Strings in switch statements
With the release of Java 1.7, some cool new features have been added to the language. One of them is support for Strings in switch statements (finally, eh!). Prior to 1.7, case labels could be of type byte, short, char, or int, or their corresponding wrapper classes. Enum types are supported too, which makes it possible to give primitive values more meaningful names. Sometimes, though, you just want to use String case labels, and now you finally can.
http://download.oracle.com/javase/tutorial/java/nutsandbolts/switch.html
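Here is a small sketch of what a String switch looks like. The class name, the method, and the day categories are invented for the example.

```java
class StringSwitchExample {
    // Maps a day name to a rough category using String case labels.
    static String kindOfDay(String day) {
        switch (day) {
            case "Saturday":
            case "Sunday":
                return "weekend";
            case "Friday":
                return "almost weekend";
            default:
                return "weekday";
        }
    }

    public static void main(String[] args) {
        System.out.println(kindOfDay("Sunday"));
        System.out.println(kindOfDay("Tuesday"));
    }
}
```

Two things to keep in mind: the labels are matched as if by String.equals, so the comparison is case sensitive, and switching on a null String throws a NullPointerException.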
Product Recommendations at Gilt
A while back, I posted an article on the Gilt Groupe tech blog. It is about the recommendation engine I developed. I am pretty proud of it as it is based on some of my PhD work. Below is a re-blog. The original article with images can be found here.
Product recommendations at Gilt work a little differently than they do at other companies. Amazon, for example, enjoys the benefit of a relatively static and large inventory, so it can do things like collaborative filtering, where you recommend a product based on what other people have bought or looked at. Gilt is unique because our inventory is in constant flux. The products we have one week are gone the next, and there is a chance we won’t have them again.
At Gilt, we’ve employed a technique called contextual retrieval. Contextual retrieval is a search method where we take elements from the user context to help them conduct a search. When someone sees a product that is sold out and they decide to waitlist it, we can infer a lot of things about the user. One is that they really want that product. Another is that we know everything about the product they are interested in. In fact, we use that product as a search query to find related ones that we have in our inventory whether they are currently on sale or not.
So the steps involved in setting up a contextual retrieval based recommendation service are as follows:
· First, you are going to have to create a search index of your products.
Apache Lucene is a great piece of open source technology for doing just that. You are going to want to create an index that contains all the fields that you have describing your products. Those fields are basically the metadata that you need to find similar ones.
· The second step is search query manufacturing.
This is where the recommendation magic really happens. When you construct your search from a product, not all fields are going to be equally important. In fact, it is very likely that different genres of products will have different fields that are going to be more valuable than others. To that end, you need to devise a weighting scheme where you boost the value of some fields over others.
· The third step is caching those recommendations.
You can quite easily get away with caching all your recommendations because unlike a search engine, you already know all the search queries that you are going to encounter – they are the products you have for sale. At Gilt, since we have new products every day, we generate and cache our recommendations once a day.
· The fourth (and most rewarding) step is using those recommendations in ways to help your users.
One of our customer pain points is that products at Gilt sell out quickly. When a user has decided that they want an item so much that they are willing to sign up for it on a waitlist, we want to do everything in our power to try to find them another product that they will be happy with.
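To give a feel for the query manufacturing step, here is a rough sketch in plain Java. It turns a product's fields into a query string using Lucene's classic query syntax, where a trailing ^number boosts a clause. This is not the weighting scheme we use at Gilt: the field names, the weights, and the buildQuery helper are all made up for illustration, and proper escaping of Lucene's special characters is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class QueryManufacturingSketch {
    // Builds a query of the form field:"value"^boost OR field:"value" ...
    // Fields without a configured boost are left unboosted.
    static String buildQuery(Map<String, String> productFields, Map<String, Float> boosts) {
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> field : productFields.entrySet()) {
            if (query.length() > 0) {
                query.append(" OR ");
            }
            query.append(field.getKey())
                 .append(":\"")
                 // Only double quotes are escaped here; this is a simplification.
                 .append(field.getValue().replace("\"", "\\\""))
                 .append('"');
            Float boost = boosts.get(field.getKey());
            if (boost != null) {
                query.append('^').append(boost);
            }
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // The waitlisted product acts as the search query (hypothetical fields).
        Map<String, String> product = new LinkedHashMap<>();
        product.put("brand", "Acme");
        product.put("category", "handbag");
        product.put("color", "red");

        // Hypothetical weights: brand matters most, then category.
        Map<String, Float> boosts = new LinkedHashMap<>();
        boosts.put("brand", 3.0f);
        boosts.put("category", 2.0f);

        System.out.println(buildQuery(product, boosts));
    }
}
```

In practice you would likely keep a separate set of weights per genre of product, since the fields that matter for a handbag are not the ones that matter for a hotel stay.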
That’s pretty much what we aim to achieve at Gilt – Simple, Fast and Fun!
Woo hoo - my first Mahout contributions
So ya, it has been crazy busy at work and I have been neglecting my blog :-/ Tragic I know for all of the 3 people that actually check it (me, myself, and I). Anyhow, I thought I would throw up a quick post here and hopefully have a big one over the weekend.
I made my first contributions to Apache Mahout this week. I have been using it at work to do our text mining and there were a couple things I developed to make it more Java framework friendly. Here are my contributions for the interested three:
https://issues.apache.org/jira/browse/MAHOUT-671
https://issues.apache.org/jira/browse/MAHOUT-675
Setting up postfix to relay through your gmail account
So this is kind of an odd topic. Why would you want to set up postfix to relay through your gmail? Well, if you are using Verizon as an ISP for your home internet, you will find that they do not allow you to send email from a locally running SMTP server, like postfix on your linux box or mac. They just block it. It is probably to prevent spammers. However, if you need to write some email code, you will have to set up a relay through an SMTP server that you have access to, like gmail.
Here are the steps involved:
1. Create a Simple Authentication and Security Layer (SASL) password file at /etc/postfix/sasl_passwd. Enter your gmail account info as shown below:
smtp.gmail.com:587 enter_account@gmail.com:enter_password
2. Create a Postfix lookup table for your password file:
sudo postmap /etc/postfix/sasl_passwd
3. Add the following to your /etc/postfix/main.cf. Depending on your installation, these settings may already be present but commented out, in which case all you need to do is uncomment them and enter the appropriate values:
# Minimum Postfix-specific configurations.
mydomain_fallback = localhost
mail_owner = _postfix
setgid_group = _postdrop
relayhost=smtp.gmail.com:587
# Enable SASL authentication in the Postfix SMTP client.
smtp_sasl_auth_enable=yes
smtp_sasl_password_maps=hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options=
# Enable Transport Layer Security (TLS), i.e. SSL.
smtp_use_tls=yes
smtp_tls_security_level=encrypt
tls_random_source=dev:/dev/urandom
4. Start postfix:
sudo postfix start
You can test your configuration by sending yourself a quick email from the command line:
echo "Hello World" | mail -s Hello enter_your_address@here.com
If you do not get the email, you can check whether it has actually been sent or is still sitting in the queue using the following command:
mailq
To clear your mail queue, use this command:
sudo postsuper -d ALL
Using the ClassLoader to access files in your classpath
Here is a quick tidbit about how to access files in Java that are in your classpath. For example, suppose you have a text file that you want to parse and it is packaged in your jar. You can access it through the ClassLoader using either getResource(), which returns an instance of URL, or getResourceAsStream(), which returns an instance of InputStream. Below is a simple coding example:
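import java.io.InputStream;
import java.util.Scanner;
public class ExampleFileLoader {
    public static void main (String args[]) {
        // getResourceAsStream() returns null when the file is not on the classpath.
        InputStream in = ExampleFileLoader.class.getClassLoader().getResourceAsStream("test_file.txt");
        if (in == null) {
            throw new IllegalStateException("test_file.txt not found on the classpath");
        }
        try (Scanner scan = new Scanner(in)) {
            while (scan.hasNextLine()) {
                System.out.println(scan.nextLine());
            }
        }
    }
}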
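For completeness, here is a sketch of the getResource() variant. It is handy when you want to know where a file lives, or when you need to hand its location to another API. The ExampleUrlLoader class and its locate helper are names I made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

class ExampleUrlLoader {
    // Returns the URL of a classpath resource, or null when it cannot be found.
    static URL locate(String resourceName) {
        return ExampleUrlLoader.class.getClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) throws IOException {
        URL url = locate("test_file.txt");
        if (url == null) {
            System.err.println("test_file.txt is not on the classpath");
            return;
        }
        System.out.println("Found at " + url);
        // A URL can still be opened as a stream when the contents are needed.
        try (InputStream in = url.openStream()) {
            System.out.println("First byte: " + in.read());
        }
    }
}
```

Note that the resource name is relative to the root of the classpath, so a file inside a package needs its full path, such as com/example/test_file.txt.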
JDBC fetch size and Postgresql
Every now and then, you need to pull a massive amount of data from a database, more than can reasonably fit into memory. To this end, you can set the fetch size for your statement so that the database driver will pull back more manageable chunks of data. For example, if you set the fetch size to 100, the driver should pull back chunks of 100 rows. A new chunk is pulled when needed as you iterate over the corresponding result set.
For most databases, setting the fetch size should be sufficient. Not for Postgresql, however. If you want to enable fetching, you also have to turn auto commit off. The reason has to do with how Postgresql fetches chunks of data; it requires a transaction block, and if auto commit is on, it cannot get one. You would think that the Postgresql JDBC driver would be smart enough to turn auto commit off if the fetch size is set to anything other than 0. Anyhow, as per usual, some example code is below:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;
public class BigQueryExample {
    public static void main (String args[]) {
        Properties connectionProps = new Properties();
        connectionProps.put("user", "enter username");
        connectionProps.put("password", "enter password");
        // try-with-resources closes the connection, statement, and result set, even on error.
        try (Connection conn = DriverManager.getConnection("enter url", connectionProps)) {
            // Postgresql only honors the fetch size inside a transaction, so auto commit must be off.
            conn.setAutoCommit(false);
            try (Statement statement = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
                // Ask the driver to pull rows back in chunks of 100.
                statement.setFetchSize(100);
                try (ResultSet rs = statement.executeQuery("SELECT name FROM giant_table_of_users")) {
                    while (rs.next()) {
                        System.out.println("Hi " + rs.getString("name"));
                    }
                }
            }
        }
        catch (SQLException se) {
            System.err.println("some sort of jdbc error encountered");
            throw new RuntimeException("some sort of jdbc error encountered", se);
        }
    }
}
How to set up SSL on Apache
Setting up SSL on your Apache server is a pretty good idea even if you are just hosting your own website with a CMS like Drupal. With SSL enabled, you can securely log in, make updates, and post blog entries like me :-). Here is what you have to do:
Step 1. Generate an SSL certificate. All you really need is a self signed certificate unless of course you are doing something for work. The command below should do the trick. Just fill out the fields that it prompts you for.
apache2-ssl-certificate
Step 2. Enable the SSL module. The following command should do it:
a2enmod ssl
Step 3. Configure SSL in your virtual host. You will need to add these two lines to your virtual host configuration:
SSLEngine on
SSLCertificateFile /path/to/your.pem
Here is an example of a virtual host configuration using SSL:
NameVirtualHost *:443
NameVirtualHost *:80
<VirtualHost *:80>
    ServerName your.domain.com
    DocumentRoot /var/www/
    ErrorLog /var/log/apache2/error.log
    CustomLog /var/log/apache2/access.log combined
</VirtualHost>
<VirtualHost *:443>
    ServerName your.domain.com
    DocumentRoot /var/www/
    ErrorLog /var/log/apache2/error.log
    CustomLog /var/log/apache2/access.log combined
    SSLEngine on
    SSLCertificateFile /path/to/your.pem
</VirtualHost>
External Spring Config Files
Most of the time when you are creating a Spring app, you end up packaging the XML config files with the war/jar. Sometimes though, it is quite beneficial to have a configuration file external to your built package. That allows you to configure your Spring app without having to rebuild or redeploy it; trust me, your system engineers/admins will love you for that. Using an external spring config is quite easy. You can use the import tag.
For example:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">
    <import resource="file:/path/to/external/config.xml"/>
</beans>
The above Spring config will import /path/to/external/config.xml. Having an external file allows you to configure beans without having to rebuild your main war/jar. Furthermore, an external Spring config allows you to create map configurations, something that you cannot do in traditional Unix-style property files. Below is an example map instance you can create in your external config:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
                           http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util-3.0.xsd">
    <util:map id="sampleMap">
        <entry>
            <key>
                <value>Hello</value>
            </key>
            <value>World</value>
        </entry>
    </util:map>
</beans>
Programmatically logging a user out in Spring Security
So I use Spring Security to handle user authentication in most of my Web applications. Every now and then, you need to log a user out programmatically. For example, a user performs some sort of operation that redirects to a success page and logs them out. Logging a user out is quite simple. You need to call the logout method on each of the relevant LogoutHandlers in your application. You will always have to use the SecurityContextLogoutHandler. I generally use the “remember me” token, so I also have to use the PersistentTokenBasedRememberMeServices. Below is a sample method that you could have in your controller.
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.security.Authentication;
import org.springframework.security.context.SecurityContextHolder;
import org.springframework.security.ui.logout.SecurityContextLogoutHandler;
import org.springframework.security.ui.rememberme.PersistentTokenBasedRememberMeServices;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
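@Controller
public class SampleController {
    @RequestMapping(value="/some_page.htm")
    public String somePage (HttpServletRequest request, HttpServletResponse response) {
        /* some business logic code */
        Authentication auth = SecurityContextHolder.getContext().getAuthentication();
        if (auth != null){
            // Clears the security context for the current user.
            new SecurityContextLogoutHandler().logout(request, response, auth);
            // Also log out of the remember-me service so its cookie is cancelled.
            new PersistentTokenBasedRememberMeServices().logout(request, response, auth);
        }
        // Placeholder view name; return whatever view your application uses for the success page.
        return "some_page";
    }
}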