DB Technology Review (with pretty charts too!)

Recently I’ve been working on loading a set of data (51 GB) into a rather simple DB that is used to generate data for later on down the application pipeline. Speed was very important as we only have a certain time window to process the information. To store the data we originally looked at an Oracle solution (since we have a good bit of it already in-house) but decided against it because of cost. The decision was then made to go with SQL Server.

Let me go off on a tangent here and state that in many corporations the choice of things such as the DB is outside the scope of the development team because the product has to be easily supported by the corporate infrastructure. This does make a lot of sense in that you want people who already know what they are doing to be able to support the system. Hiring more people or retraining is sometimes just not an option. We wanted to go with something free (MySQL, PostgreSQL, etc.) but we had no one to support it.

Ok, back to the story. So we wrote the code, optimized the heck out of it (caching with EhCache, pre-loading things, staging tables, etc.) and the performance was decent. Our bottleneck at that point became the DB itself. We were only getting about 1000 rows a second inserted. The reads were decent, but mainly because we optimized the reads so that they occur infrequently.

Well, what other options do we have? And how should we test?

The Test

For my test I did the following:

  • All tests were run locally to avoid network latency
  • Writes were tested by loading 36,000 rows of data in 1000-row batches
  • Reads were tested by reading one record and by reading all records
  • Tests were conducted on an Intel i5 2.5GHz Windows 7 PC with 4GB of RAM
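
A minimal sketch of how six timed executions like the ones reported below could be collected. The real test harness is not shown in this post, so the `TimedRuns` class, its method names, and the stand-in workload are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical timing helper; the post's actual harness is not shown.
class TimedRuns {

    // Runs the task once and returns the elapsed wall-clock time in seconds.
    static double timeSeconds(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1e9;
    }

    // Repeats the task and collects one timing per execution.
    static List<Double> timeRuns(int runs, Runnable task) {
        List<Double> seconds = new ArrayList<Double>(runs);
        for (int i = 0; i < runs; i++) {
            seconds.add(timeSeconds(task));
        }
        return seconds;
    }

    public static void main(String[] args) {
        // Stand-in workload; swap in the real insert or select.
        Runnable work = new Runnable() {
            @Override
            public void run() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 36000; i++) {
                    sb.append(i);
                }
            }
        };
        for (double s : timeRuns(6, work)) {
            System.out.println(s);
        }
    }
}
```

Timing each execution separately, rather than averaging up front, is what makes effects like a slow first read visible.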


SQL Server

Not much to say here really. A standard RDBMS that’s more about one beefy server than scalability and fault tolerance across distributed machines.

The Install

Easy. Download the free developer version and then just run the install program. There were some minor issues with the port configuration but some quick Googling took care of it.

The API

JDBC (via SQL or JPA). Both are very well supported and easy to use. I used straight SQL (in a Hibernate Session.doWork()) for the test.

Basic code:

Session session = ((Session)entityManagerFactory.createEntityManager().getDelegate());
session.doWork(new Work() {
  @Override
  public void execute(Connection conn) throws SQLException {
    PreparedStatement stmt = null;
    try {
      stmt = conn.prepareStatement(SQL_INSERT);
      for (Data l : items) {
        stmt.setString(1, l.getA());
        stmt.setString(2, l.getB());
        stmt.setString(3, l.getC());
        stmt.setString(4, l.getD());
        stmt.setInt(5, l.getE());
        stmt.setLong(6, l.getF());
        stmt.addBatch();
      }
      stmt.executeBatch();
    }
    finally {
      try {stmt.close();} catch (Exception e) {}
    }
  }
});
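
The test loaded rows in 1000-row batches, while the snippet above shows a single batch. One way the slicing might look is sketched here; `BatchChunks.chunk` is a hypothetical helper, not part of the original test code, and each chunk it returns would be handed to a doWork() call like the one above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for slicing the rows into fixed-size batches.
class BatchChunks {

    // Splits items into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> chunk(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<List<T>>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            chunks.add(items.subList(from, to)); // a view, not a copy
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<Integer>();
        for (int i = 0; i < 36000; i++) {
            rows.add(i);
        }
        System.out.println(chunk(rows, 1000).size()); // prints 36
    }
}
```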

The Times (in seconds)

Run  Writes (36,000)  Reads (1)    Reads (36,000)
1    19.21605219      0.077489734  0.077756419
2    18.62159318      0.004414084  0.069109466
3    20.66406651      0.004225884  0.070978322
4    19.03778229      0.004325326  0.067906708
5    18.85667787      0.004428877  0.072227103
6    19.12387419      0.004171643  0.078686327

Notes:

SQL Server has something going on behind the scenes: if you don’t hit the DB for 20 minutes or so, the first read is significantly slower than the following reads (as is evident above in the first read of a single row). I was able to replicate this behavior multiple times so it wasn’t a fluke.


Cassandra

Used by many big shops (Netflix, Twitter, Digg, etc.). Scalable, fault tolerant, and perhaps the leader in NoSQL-based DBs.

The Install

Simple. Extract it, follow the basic instructions online, you’re good to go. I did have to modify a config file as well as do some research to tweak the performance.

The API

Cassandra does not come with a standard API for Java, as it’s more about the open source DB side of things than an all-in-one solution that gives you specific libraries to use.

I found two libraries that appeared to be popular: Hector and Pelops, so I tried them both 🙂

Pelops

Pelops was fairly straightforward to use. Nothing too complex.


Basic code:

Cluster cluster = new Cluster("localhost", 9160);
Pelops.addPool(POOL, cluster, KEY_SPACE);
Mutator mutator = Pelops.createMutator(POOL);
for (Data l : items) {
  mutator.writeColumns(COLUMN_FAMILY, l.getA(),
    mutator.newColumnList(
      mutator.newColumn(B, l.getB()),
      mutator.newColumn(C, l.getC()),
      mutator.newColumn(D, l.getD()),
      mutator.newColumn(E, l.getE()),
      mutator.newColumn(F, l.getF())
      ));
}
mutator.execute(ConsistencyLevel.ANY);

The Times (in seconds) – Pelops

Run  Writes (36,000)  Reads (1)
1    5.620654231      0.189712743
2    5.225606865      0.117286345
3    4.825454661      0.22210995
4    4.4617883        0.863588697
5    4.17149797       0.072843069
6    5.599271688      0.126848003

Reads (36,000): all runs took over 40 seconds. Not worth mentioning.

Hector

Hector had the same basic idea as Pelops. I liked this API a bit more because it followed a more true-to-form Builder methodology.

Basic code:

Cluster cluster = HFactory.getOrCreateCluster(POOL,"localhost:9160");
Keyspace ksp = HFactory.createKeyspace(KEY_SPACE, cluster);
ksp.setConsistencyLevelPolicy(new ConsistencyLevelPolicy() {
  @Override
  public HConsistencyLevel get(OperationType op) {
    return HConsistencyLevel.ONE;
  }
  @Override
  public HConsistencyLevel get(OperationType op, String cfName) {
    return HConsistencyLevel.ONE;
  }
});
ColumnFamilyTemplate<String, String> template = new ThriftColumnFamilyTemplate<String, String>(
  ksp, COLUMN_FAMILY, StringSerializer.get(), StringSerializer.get());
Mutator<String> mutator = template.createMutator();
for (Data l : items) {
  String key =  l.getA();
  // E and F are numeric (int and long), so convert them before creating string columns
  mutator.addInsertion(key, COLUMN_FAMILY, HFactory.createStringColumn(B, l.getB()))
    .addInsertion(key, COLUMN_FAMILY, HFactory.createStringColumn(C, l.getC()))
    .addInsertion(key, COLUMN_FAMILY, HFactory.createStringColumn(D, l.getD()))
    .addInsertion(key, COLUMN_FAMILY, HFactory.createStringColumn(E, String.valueOf(l.getE())))
    .addInsertion(key, COLUMN_FAMILY, HFactory.createStringColumn(F, String.valueOf(l.getF())));
}
mutator.execute();

The Times (in seconds) – Hector

Run  Writes (36,000)  Reads (1)    Reads (36,000)
1    5.620654231      0.009469611  3.090225733
2    5.225606865      0.012270841  3.369627963
3    4.825454661      0.009479885  2.376282785
4    4.4617883        0.009816426  2.047866225
5    4.17149797       0.014133534  2.054561727
6    5.599271688      0.010328019  2.149591917

Notes:

Pelops performed so poorly against Hector that I tried my best to figure out what it was doing differently. Honestly, I couldn’t find any real difference. Pelops generated way more activity on the server than Hector but I couldn’t tell you why.


Mongo DB

Mongo DB is a document-based DB. Think of it (and I’m really simplifying it) as a store where the value is a JSON object behind the scenes. Its main advantage over Cassandra, from what I could find, is that it supports any number of indexes while Cassandra only supports a max of 2 (the Column and SuperColumn I believe).

The Install

Simple. Download, install.

The API

I have to say that out of all the API’s I think I like Mongo’s the best. It’s very straightforward and very easy to follow.


Basic code:

Mongo m = new Mongo();
DB db = m.getDB(DB_NAME);
DBCollection dbColl = db.getCollection(TABLE_NAME);
List<DBObject> dbos = new ArrayList<DBObject>(1000);
for (Data l : items) {
  BasicDBObject doc = new BasicDBObject();
  doc.put(A, l.getA());
  doc.put(B, l.getB());
  doc.put(C, l.getC());
  doc.put(D, l.getD());
  doc.put(E, l.getE());
  doc.put(F, l.getF());
  dbos.add(doc);
}
dbColl.insert(dbos);

The Times (in seconds)

Run  Writes (36,000)  Reads (1)    Reads (36,000)
1    0.802098477      0.021040248  1.766084368
2    0.793614659      0.019918441  1.627224814
3    0.880601124      0.019541629  1.541414397
4    0.9565569        0.022051107  1.770766781
5    0.964250933      0.019392465  1.570970515
6    0.73751281       0.020577965  1.761228957

Notes:

Wow does it write fast. It was so fast that before I even did my read tests I went into the console to make sure that the data was actually in there.


Redis

Redis is a true key/value pair store. They did just introduce hashes for storing things a little more conveniently but really you could think of it as a disk-backed cache.

The Install

Another easy one to install. Download, install. Nice and easy.

The API

Like Cassandra, Redis has a couple of different APIs to choose from. The one I found that seemed to have the most support and was the most mature was Jedis.

I have to say that the API methods and the Redis method names themselves are not very straightforward. The names are things like hget, mget, get and while get is straightforward, you would be guessing at what hget and mget do without looking at the API docs. Maybe it’s just me but I like to be able to tell what something is doing by just looking at it when possible.

I should note now that I couldn’t find a really good way to easily get all the records for the read-all test. I wanted to treat it as something that would not just have one type of data in it and the only way to do that is via the way you name the keys (since it is a true key/value only system). This meant that the only way I could retrieve all the data for my data set was to look up all the keys that matched my key pattern and then get the individual data per key.

Basic code:

JedisPool pool = new JedisPool(new JedisPoolConfig(), "localhost");
Jedis jedis = pool.getResource();
BinaryTransaction transaction = jedis.multi();
String key = null;
for (Data l : items) {
  key = "data:"+l.getA();
  // A, B, etc are actually static final strings with .getBytes() called in this case
  transaction.hset(key.getBytes(), A, l.getA().getBytes());
  transaction.hset(key.getBytes(), B, l.getB().getBytes());
  transaction.hset(key.getBytes(), C, l.getC().getBytes());
  transaction.hset(key.getBytes(), D, l.getD().getBytes());
  transaction.hset(key.getBytes(), E, String.valueOf(l.getE()).getBytes());
  transaction.hset(key.getBytes(), F, String.valueOf(l.getF()).getBytes());
}
transaction.exec();

The Times (in seconds)

Run  Writes (36,000)  Reads (1)    Reads (36,000)
1    6.935773064      0.001316993  5.980350711
2    6.965057977      0.00140945   5.426398786
3    6.406715391      0.001447665  5.43906904
4    6.725023361      0.001058525  5.381988384
5    6.408184013      0.00127097   5.891128439
6    6.597113956      0.001395068  6.083566804

Notes:

As stated above I had to come up with a funky way to do the read-all test. This may have impacted the performance: as you can see, the single read was amazingly fast but the read-all was rather slow. I did my research and that was the only way I could see it being done. Hopefully in a future version (or if someone tells me another way) they will address this.
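
The read-all pattern described here (find every key that matches a pattern, then fetch the data for each key) can be modeled in a few lines. This is only a sketch that uses a plain in-memory map as a stand-in for Redis; `PrefixedStore` and its methods are made-up names and are not the Jedis API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a key/value store; this is NOT the Jedis API.
class PrefixedStore {
    private final Map<String, Map<String, String>> store =
            new HashMap<String, Map<String, String>>();

    // Stores one field under a namespaced key such as "data:42".
    void hset(String key, String field, String value) {
        Map<String, String> hash = store.get(key);
        if (hash == null) {
            hash = new HashMap<String, String>();
            store.put(key, hash);
        }
        hash.put(field, value);
    }

    // Step 1 of read-all: scan every key and keep the ones with our prefix.
    List<String> keysWithPrefix(String prefix) {
        List<String> matches = new ArrayList<String>();
        for (String key : store.keySet()) {
            if (key.startsWith(prefix)) {
                matches.add(key);
            }
        }
        return matches;
    }

    // Step 2 of read-all: one lookup per matching key.
    List<Map<String, String>> readAll(String prefix) {
        List<Map<String, String>> rows = new ArrayList<Map<String, String>>();
        for (String key : keysWithPrefix(prefix)) {
            rows.add(store.get(key));
        }
        return rows;
    }
}
```

Against a real server, step 2 is one request per key, which is a plausible reason the read-all numbers are so far from the single-read numbers.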

Couch DB

A Document oriented DB like Mongo.

The Install

This was nightmarish. First, the distro is basically just the source and you have to build it yourself. I’m on a Windows box and don’t have a compiler so I had to search for a distro I could use. I found two. The first one didn’t even work. The second worked but geez, because CouchDB is written in Erlang, it installs a lot (so does SQL Server to be fair). Once I got it up and running it gives you a rather nice Web interface to work with it though.

The API

I gave up. Seriously. I spent a few hours just trying to get my demo code to even connect to the server. The documentation was light about setup and the examples I found online were not very helpful. The code was a bit confusing and overly complex as well. Can you tell it just rubbed me the wrong way?

Notes

I wanted to give this a shot as I had heard a lot of good things about it but it was so painful that I just stopped. I also didn’t like the idea of having to make views and register them with the DB so we can use them in code. This is supposed to be radically different than an RDBMS but I’m not so sure.

The Results

I’d like to say there was a clear winner…… but there wasn’t.

This is what the tests showed (Pelops is not listed; it was just too far out there). In each chart the Y axis is seconds and the X axis is the execution number.

[Chart: Writes (36,000)]

[Chart: Reads (1)]

Note: Notice that first read with SQL Server….

[Chart: Reads (All)]

From the above we can see that:

  • Mongo had the fastest writes
  • Redis had the fastest read for one record
  • SQL Server had the fastest read for all records (after the initial first read)
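
To put the write numbers in the same units as the original "rows a second" problem, the run times can be averaged and turned into throughput. A small sketch, using the SQL Server and Mongo write times from the tables above; the `Throughput` class and its method names are just illustrative.

```java
// Illustrative helper that converts run times into rows per second.
class Throughput {

    // Arithmetic mean of the run times, in seconds.
    static double mean(double[] seconds) {
        double sum = 0;
        for (double s : seconds) {
            sum += s;
        }
        return sum / seconds.length;
    }

    // Rows per second for a load of the given size.
    static double rowsPerSecond(int rows, double[] seconds) {
        return rows / mean(seconds);
    }

    public static void main(String[] args) {
        // Write times copied from the SQL Server and Mongo tables in this post.
        double[] sqlServer = {19.21605219, 18.62159318, 20.66406651,
                              19.03778229, 18.85667787, 19.12387419};
        double[] mongo = {0.802098477, 0.793614659, 0.880601124,
                          0.9565569, 0.964250933, 0.73751281};
        System.out.printf("SQL Server: %.0f rows/s%n", rowsPerSecond(36000, sqlServer));
        System.out.printf("Mongo: %.0f rows/s%n", rowsPerSecond(36000, mongo));
    }
}
```

On these numbers SQL Server works out to roughly 1,900 rows per second and Mongo to roughly 42,000.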

Which one would I go with? Mongo. It was great for writing, decent for reads, and very easy to use. The API was very straightforward, the install was simple, and the command-line tools were easy to use and understand.

It really comes down to what you are going to use the system for. If the data needs to be very well organized and strict, stick with an RDBMS like SQL Server. If you have some more freedom with your data then I think it comes down to determining if you are going to be reading or writing more. If you are mostly reading and you are going to be getting specific things (such as data based on a user id) then Redis may be the best solution. If you are going to be grabbing more robust data and it will vary between single or multiple records, Mongo or Cassandra would work (I would favor Mongo).

Posted in Database, Java | 2 Comments

Arc Collision

I’ve been working on an Android game for the past few months and at one point I needed to detect if a point (P) fell within the confines of an arc. I was surprised to find that Android’s APIs didn’t have anything to support this. After some Googling I still couldn’t find anything so I figured I’d post how I do it (partially in the hope that someone else might present a faster way ;)).

First, let’s look at our arc:

You can see that in my representation I actually have a cone (2D, not 3D) and I need to detect collision in the cone. So, not only do I need to check if P falls on the arc, but also if it is within the cone. This means I have two things to check:

  • If P is within the distance from the start of the cone (O) to the arc
  • If P is between the start and end of the arc (A and B)

Range Check

Well, the distance part is rather easy, since I know position A and I know position O, I know the radius. Now I just need to remember that an arc in my case is just a part of a uniform circle so the distance is really just the radius (r) of the circle. Basic math here:

  • Known Diameter: r = D / 2
  • Known Circumference: r = C / (2 * PI)

So now, if the distance from O to P is > 0 and <= r, I know that P is in range.
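
As a sketch of the range check, assuming plain x/y coordinates for O and P (the `RangeCheck` class and its parameter names are made up for illustration and are not from the game's code):

```java
// Illustrative range check for a point P against a cone starting at O.
class RangeCheck {

    // True when P lies within radius r of O, excluding O itself
    // (this mirrors the "> 0 and <= r" test in the text).
    static boolean inRange(double ox, double oy, double px, double py, double r) {
        double distance = Math.hypot(px - ox, py - oy);
        return distance > 0 && distance <= r;
    }
}
```

If this runs for many points per frame, comparing squared distances against r * r gives the same answer without the square root.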

Arc Collision

Next comes the check to see if P is between A and B. For this we again need to remember that the arc is really just a slice of a circle. So, the first thing we need to determine is the angle, in degrees, at which P sits relative to O (let’s call it dP).

  • Take the position of O
  • Take the position of P
  • Use Math.atan2(P.y-O.y, P.x-O.x), which returns radians, and convert the result to degrees in the 0 to 360 range (please note that I’m using a GFX library for this part, so this may not be 100% correct as I haven’t tested it)
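
In plain Java the two-argument arctangent is Math.atan2(y, x), which returns radians between -PI and PI, so the result has to be converted to degrees and shifted into the 0 to 360 range before it can be compared with A and B. A minimal sketch, with `AngleTo` as a made-up helper name:

```java
// Illustrative helper for the angle of P as seen from O.
class AngleTo {

    // Angle of P relative to O, in degrees, normalized to [0, 360).
    static double degrees(double ox, double oy, double px, double py) {
        // atan2 yields radians in (-PI, PI], so this is in (-180, 180]
        double deg = Math.toDegrees(Math.atan2(py - oy, px - ox));
        return deg < 0 ? deg + 360.0 : deg;
    }
}
```

Keep in mind that screen coordinates usually have y growing downward, so angles computed this way run clockwise on screen; A and B need to follow the same convention.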

Now we know where P actually falls, degree wise, in relation to O. All we have to test now is if P is between A and B. But, that’s not as easy as you think. Sure, if A = 5 and B = 20 and dP = 10, we know it is in between. What happens when A = 320 (remember that a circle goes to 360), B = 20, and dP = 10? Well, the A <= dP check would fail because A > dP, but for us dP is still inside the arc.

This is what I came up with and, to be honest, where I’m hoping someone can tell me a better way (if there is one):

if (((B < A) &&  ((A <= dP && dP < 360) || (dP <= B && dP >= 0)))
    ||  ( A <= dP && dP <= B))
{
    return true;
}
return false;                

Basically, if B < A that means that B has crossed over the 360/0 point of the circle since A is greater than B. If this has occurred then check if A <= dP while dP < 360 OR check if dP <= B while dP >= 0. This covers any issues around the 360/0 point on the circle. If we don’t have the 360/0 point issue it’s just a simple compare of A <= dP and dP <= B.
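
One possible alternative, offered as a sketch rather than a proven improvement: measure both the arc width and dP as offsets from A, wrapped into the 0 to 360 range, so the 360/0 crossing needs no special case. `ArcCheck`, `normalize`, and `inArc` are made-up names.

```java
// Illustrative wrap-around check using offsets measured from A.
class ArcCheck {

    // Maps any angle in degrees onto [0, 360).
    static double normalize(double deg) {
        double m = deg % 360.0;
        return m < 0 ? m + 360.0 : m;
    }

    // True when dP lies on the arc that sweeps from a to b
    // in the direction of increasing angle.
    static boolean inArc(double a, double b, double dP) {
        double sweep = normalize(b - a);    // arc width measured from a
        double offset = normalize(dP - a);  // how far dP sits past a
        return offset <= sweep;
    }
}
```

For A = 320, B = 20, dP = 10 the sweep is 60 and the offset is 50, so the point is inside. For angles already in the 0 to 360 range this gives the same answers as the condition above. Like that condition, it treats A equal to B as a zero-width arc, not a full circle.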

You now know if P is within your cone!

Posted in Programming | 2 Comments

Map usage optimization

You know the situation. You have a data structure that is a Map and the key is a simple object and the value is some complex object (let’s say an ArrayList). You can’t just get() the ArrayList and start using it because it might not be there. First you need to check to see if it exists and then use it.

For years, I was doing this like so:

if (!map.containsKey(key)) {
    map.put(key, new ArrayList<String>());
}
map.get(key).addAll(someList);

It’s pretty basic. If the map doesn’t contain the key, put in the new ArrayList then use it. If the Map already contains the key, it just uses it.

A week or so ago I was writing a chunk of code just like this and stopped and thought: Is this really the best way to do this?

Let’s break it down:

First Run:

  • 1 method call to containsKey
  • key is hashed
  • 1 object creation of ArrayList
  • 1 method call to put
  • key is hashed
  • 1 internal lookup for an existing entry (done by the put)
  • 1 method call to get
  • key is hashed
  • 1 method call to addAll

Second Run:

  • 1 method call to containsKey
  • key is hashed
  • 1 method call to get
  • key is hashed
  • 1 method call to addAll

Wow, that’s actually a good bit of work (not even including the Map’s underlying implementation when hashes collide and it does things like traverse a LinkedList in the underlying array or has to allocate more space).

So, what would be a better way? Well, I decided that I would try a small experiment. I’d post to StackOverflow. I’ve used the site in the past (it’s very helpful for Android development) and I was curious what they would make of my question.

The answer, provided by Kublai Khan, was quite simple and a bit of an “oh yeah, duh.” to me.

The solution:

List<String> existingList = map.get(key);
if (existingList == null){
   existingList = new ArrayList<String>();
   map.put(key, existingList);
}
existingList.addAll(someList);

Let’s break it down:

First Run:

  • 1 method call to get
  • key is hashed
  • existingList reference is assigned (null on the first run)
  • 1 null check (faster than a method call)
  • 1 object creation of ArrayList
  • 1 method call to put
  • key is hashed
  • 1 internal lookup for an existing entry (done by the put)
  • 1 method call to addAll

Second Run:

  • 1 method call to get
  • key is hashed
  • 1 method call to addAll

Notice that the first time in we are doing about the same work, but from then on there is significantly less work. Now, imagine if (like in my code) this happens a few thousand times each time the program is run. This little change can really add up time wise.

Where was my DOH! moment? Well, what was suggested leverages the fact that Java variables hold references to objects. Since we placed the object in the map, we can still work with it even though we didn’t go and get it again from the map. We have the reference from when we created it, so we can just use that.
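
On Java 8 or later (newer than the code in this post), the same get-or-create step can be written with Map.computeIfAbsent, which returns the existing value or stores and returns a new one. A minimal sketch; `MultiMapAdd` is just an illustrative wrapper name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative wrapper around the get-or-create idiom (requires Java 8+).
class MultiMapAdd {

    // Appends values to the list stored under key, creating the list on first use.
    static void addAll(Map<String, List<String>> map, String key, List<String> values) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).addAll(values);
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new HashMap<>();
        addAll(map, "a", Arrays.asList("x", "y"));
        addAll(map, "a", Arrays.asList("z"));
        System.out.println(map); // prints {a=[x, y, z]}
    }
}
```

It relies on the same idea as the solution above: the list handed back is the very object stored in the map, so there is no need to fetch it again.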

Posted in Java, Programming

PGP Encryption/Decryption in Java

A new version of the code (and a new post) has been posted at:
http://sloanseaman.com/wordpress/2012/05/13/revisited-pgp-encryptiondecryption-in-java/
Please note that if you are using Bouncy Castle libraries older than 1.47 you should use the code in this post.

I have a requirement where I need to decrypt a file that is PGP encrypted for a project I am working on. At the same time I need a unit test for my code so I need to encrypt a file as well.

I started digging around and found that there is really no documentation I could find that covered the whole process. Some postings only covered encryption, some decryption, but none covered both as well as how to make the public/private keys.

So, here you go 🙂

The Downloads/Installs

Before you even start coding we need to discuss Java Security. The JRE ships by default with a cryptography framework called the Java Cryptography Extension (JCE). The JCE will support PGP but because import/export of cryptography can be sketchy, it only supports weak keys by default (think keys that wouldn’t be too hard to crack). PGP won’t work under this environment so you need to first address this.

JCE Update

Credit where due: This part is taken mainly from this article on jce policy files

Go to the Java download page and go all the way to the bottom and download the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files. The number at the end (currently 6 or 7) is just the JRE version.

Extract the files and copy the two jars to your JRE’s /lib/security directory. This will overwrite two existing jars and will allow you to use strong keys in security.
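
A quick way to confirm the policy files took effect is to ask the JCE for the largest key size it will allow. This sketch uses the standard javax.crypto.Cipher.getMaxAllowedKeyLength call; the `JcePolicyCheck` class name and the choice of AES as the probe algorithm are assumptions for illustration.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Illustrative check of the installed JCE jurisdiction policy.
class JcePolicyCheck {

    // Largest AES key size, in bits, that the installed policy files allow.
    static int maxAesKeyBits() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("AES is not available in this JRE", e);
        }
    }

    public static void main(String[] args) {
        int bits = maxAesKeyBits();
        System.out.println("Max AES key length: " + bits);
        System.out.println(bits > 128
                ? "Unlimited-strength policy is active"
                : "Still on the default (limited) policy");
    }
}
```

If this still reports 128 after copying the jars, the JRE your program runs on is probably not the one you updated.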

PGP Tools

Next you need to create your public and private key files. Since Symantec bought PGP in 2010 most of the freeware stuff has disappeared and most links lead to Symantec’s website. I found an old freeware version of PGP (Windows only) at http://www.pgpi.org/products/pgp/versions/freeware/win32/6.5.8/. Please note that newer versions either don’t work on Windows 7 or are owned by Symantec.

Install the program and then launch it by running PGPKeys.exe. To use it:

  1. Select Keys -> New Key
  2. Click Next
  3. Enter your information
  4. Click Next
  5. Use the Diffie-Hellman/DSS key Pair Type
  6. Click Next
  7. Select your Key Pair Size (I just used 1024 since I’m not that worried about it for now)
  8. Click Next
  9. Leave your Key Expiration as never expires
  10. Click Next
  11. Enter the passphrase (password) you want to use
  12. Click Next
  13. If you entered less than 8 characters you will be warned. Click Next
  14. Key will be generated
  15. Click Next
  16. Click Finish

The files that were generated and that you will need to use in your code can be found by going to Edit -> Options -> Files.
Find these files and copy them to someplace where your code can reach them.

Side note: If you just need a basic public/private key you can generate one at https://www.igolder.com/pgp/generate-key/

BouncyCastle’s PGP Libraries

I’m using the PGP library from Bouncy Castle. Honestly it’s poorly documented and rather complex but it’s the best and most common one that I could find.

Either download the latest from http://www.bouncycastle.org/latest_releases.html or use maven like so:

    <dependency>
        <groupId>org.bouncycastle</groupId>
        <artifactId>bcpg-jdk16</artifactId>
        <version>1.45</version>
    </dependency>

You are now (finally) ready to actually code something!

The Code

First let’s write a general util that will handle any InputStreams (shouldn’t we all be using Readers by now?) and then let’s write one that wraps the first one and will handle files for us.

Here is the first util which will handle any type of InputStream:

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.NoSuchProviderException;
import java.security.SecureRandom;
import java.security.Security;
import java.util.Iterator;

import org.bouncycastle.bcpg.ArmoredOutputStream;
import org.bouncycastle.jce.provider.BouncyCastleProvider;
import org.bouncycastle.openpgp.PGPCompressedData;
import org.bouncycastle.openpgp.PGPCompressedDataGenerator;
import org.bouncycastle.openpgp.PGPEncryptedData;
import org.bouncycastle.openpgp.PGPEncryptedDataGenerator;
import org.bouncycastle.openpgp.PGPEncryptedDataList;
import org.bouncycastle.openpgp.PGPException;
import org.bouncycastle.openpgp.PGPLiteralData;
import org.bouncycastle.openpgp.PGPObjectFactory;
import org.bouncycastle.openpgp.PGPOnePassSignatureList;
import org.bouncycastle.openpgp.PGPPrivateKey;
import org.bouncycastle.openpgp.PGPPublicKey;
import org.bouncycastle.openpgp.PGPPublicKeyEncryptedData;
import org.bouncycastle.openpgp.PGPPublicKeyRing;
import org.bouncycastle.openpgp.PGPPublicKeyRingCollection;
import org.bouncycastle.openpgp.PGPSecretKey;
import org.bouncycastle.openpgp.PGPSecretKeyRingCollection;

/**
 * Taken from org.bouncycastle.openpgp.examples
 *
 * @author seamans
 *
 */
public class PGPUtil {

	@SuppressWarnings("unchecked")
	public static PGPPublicKey readPublicKey(InputStream in) throws IOException, PGPException {
        in = org.bouncycastle.openpgp.PGPUtil.getDecoderStream(in);

        PGPPublicKeyRingCollection pgpPub = new PGPPublicKeyRingCollection(in);

        //
        // we just loop through the collection till we find a key suitable for encryption, in the real
        // world you would probably want to be a bit smarter about this.
        //
        PGPPublicKey key = null;

        //
        // iterate through the key rings.
        //
        Iterator<PGPPublicKeyRing> rIt = pgpPub.getKeyRings();

        while (key == null && rIt.hasNext()) {
            PGPPublicKeyRing kRing = rIt.next();
            Iterator<PGPPublicKey> kIt = kRing.getPublicKeys();
            while (key == null && kIt.hasNext()) {
                PGPPublicKey k = kIt.next();

                if (k.isEncryptionKey()) {
                    key = k;
                }
            }
        }

        if (key == null) {
            throw new IllegalArgumentException("Can't find encryption key in key ring.");
        }

        return key;
    }

    /**
     * Load a secret key ring collection from keyIn and find the secret key corresponding to
     * keyID if it exists.
     *
     * @param keyIn input stream representing a key ring collection.
     * @param keyID keyID we want.
     * @param pass passphrase to decrypt secret key with.
     * @return the private key, or null if no secret key matches keyID
     * @throws IOException
     * @throws PGPException
     * @throws NoSuchProviderException
     */
    private static PGPPrivateKey findSecretKey(InputStream keyIn, long keyID, char[] pass)
    	throws IOException, PGPException, NoSuchProviderException
    {
        PGPSecretKeyRingCollection pgpSec = new PGPSecretKeyRingCollection(
        	org.bouncycastle.openpgp.PGPUtil.getDecoderStream(keyIn));

        PGPSecretKey pgpSecKey = pgpSec.getSecretKey(keyID);

        if (pgpSecKey == null) {
            return null;
        }

        return pgpSecKey.extractPrivateKey(pass, "BC");
    }

    /**
     * decrypt the passed in message stream
     */
    @SuppressWarnings("unchecked")
	public static void decryptFile(InputStream in, OutputStream out, InputStream keyIn, char[] passwd)
    	throws Exception
    {
    	Security.addProvider(new BouncyCastleProvider());

        in = org.bouncycastle.openpgp.PGPUtil.getDecoderStream(in);

        PGPObjectFactory pgpF = new PGPObjectFactory(in);
        PGPEncryptedDataList enc;

        Object o = pgpF.nextObject();
        //
        // the first object might be a PGP marker packet.
        //
        if (o instanceof  PGPEncryptedDataList) {
            enc = (PGPEncryptedDataList) o;
        } else {
            enc = (PGPEncryptedDataList) pgpF.nextObject();
        }

        //
        // find the secret key
        //
        Iterator<PGPPublicKeyEncryptedData> it = enc.getEncryptedDataObjects();
        PGPPrivateKey sKey = null;
        PGPPublicKeyEncryptedData pbe = null;

        while (sKey == null && it.hasNext()) {
            pbe = it.next();

            sKey = findSecretKey(keyIn, pbe.getKeyID(), passwd);
        }

        if (sKey == null) {
            throw new IllegalArgumentException("Secret key for message not found.");
        }

        InputStream clear = pbe.getDataStream(sKey, "BC");

        PGPObjectFactory plainFact = new PGPObjectFactory(clear);

        Object message = plainFact.nextObject();

        if (message instanceof  PGPCompressedData) {
            PGPCompressedData cData = (PGPCompressedData) message;
            PGPObjectFactory pgpFact = new PGPObjectFactory(cData.getDataStream());

            message = pgpFact.nextObject();
        }

        if (message instanceof  PGPLiteralData) {
            PGPLiteralData ld = (PGPLiteralData) message;

            InputStream unc = ld.getInputStream();
            int ch;

            while ((ch = unc.read()) >= 0) {
                out.write(ch);
            }
        } else if (message instanceof  PGPOnePassSignatureList) {
            throw new PGPException("Encrypted message contains a signed message - not literal data.");
        } else {
            throw new PGPException("Message is not a simple encrypted file - type unknown.");
        }

        if (pbe.isIntegrityProtected()) {
            if (!pbe.verify()) {
            	throw new PGPException("Message failed integrity check");
            }
        }
    }

    public static void encryptFile(OutputStream out, String fileName,
        PGPPublicKey encKey, boolean armor, boolean withIntegrityCheck)
        throws IOException, NoSuchProviderException, PGPException
    {
    	Security.addProvider(new BouncyCastleProvider());

        if (armor) {
            out = new ArmoredOutputStream(out);
        }

        ByteArrayOutputStream bOut = new ByteArrayOutputStream();

        PGPCompressedDataGenerator comData = new PGPCompressedDataGenerator(
            PGPCompressedData.ZIP);

        org.bouncycastle.openpgp.PGPUtil.writeFileToLiteralData(comData.open(bOut),
            PGPLiteralData.BINARY, new File(fileName));

        comData.close();

        PGPEncryptedDataGenerator cPk = new PGPEncryptedDataGenerator(
            PGPEncryptedData.CAST5, withIntegrityCheck,
            new SecureRandom(), "BC");

        cPk.addMethod(encKey);

        byte[] bytes = bOut.toByteArray();

        OutputStream cOut = cPk.open(out, bytes.length);

        cOut.write(bytes);

        cOut.close();

        out.close();
    }

}

I’m not going to go over it line-by-line because, honestly, I haven’t looked at it that hard so I’m not sure what it’s all doing 😉 Sorry.

Next let’s write a util that allows us to easily work with files (and Spring, if you are into that sort of thing :))

import java.io.FileInputStream;
import java.io.FileOutputStream;

public class PGPFileProcessor {

	private String passphrase;

	private String keyFile;

	private String inputFile;

	private String outputFile;

	private boolean asciiArmored = false;

	private boolean integrityCheck = true;

	public boolean encrypt() throws Exception {
		FileInputStream keyIn = new FileInputStream(keyFile);
		try {
			FileOutputStream out = new FileOutputStream(outputFile);
			try {
				PGPUtil.encryptFile(out, inputFile, PGPUtil.readPublicKey(keyIn),
					asciiArmored, integrityCheck);
			} finally {
				// close the streams even if encryption fails
				out.close();
			}
		} finally {
			keyIn.close();
		}
		return true;
	}

	public boolean decrypt() throws Exception {
		FileInputStream in = new FileInputStream(inputFile);
		try {
			FileInputStream keyIn = new FileInputStream(keyFile);
			try {
				FileOutputStream out = new FileOutputStream(outputFile);
				try {
					PGPUtil.decryptFile(in, out, keyIn, passphrase.toCharArray());
				} finally {
					// close the streams even if decryption fails
					out.close();
				}
			} finally {
				keyIn.close();
			}
		} finally {
			in.close();
		}
		return true;
	}

	public boolean isAsciiArmored() {
		return asciiArmored;
	}

	public void setAsciiArmored(boolean asciiArmored) {
		this.asciiArmored = asciiArmored;
	}

	public boolean isIntegrityCheck() {
		return integrityCheck;
	}

	public void setIntegrityCheck(boolean integrityCheck) {
		this.integrityCheck = integrityCheck;
	}

	public String getPassphrase() {
		return passphrase;
	}

	public void setPassphrase(String passphrase) {
		this.passphrase = passphrase;
	}

	public String getKeyFile() {
		return keyFile;
	}

	public void setKeyFile(String keyFile) {
		this.keyFile = keyFile;
	}

	public String getInputFile() {
		return inputFile;
	}

	public void setInputFile(String inputFile) {
		this.inputFile = inputFile;
	}

	public String getOutputFile() {
		return outputFile;
	}

	public void setOutputFile(String outputFile) {
		this.outputFile = outputFile;
	}

}

Remember that if you use the PGPFileProcessor in Spring and intend on changing things programmatically via the getters/setters, you need to make the scope of the bean “prototype”; otherwise you will have issues in a multi-threaded environment.

That should do it! You should be able to encrypt and decrypt files (streams really) using PGP!

Support for signed files

If you wish to support signed files, Mateusz Klos posted an excellent comment below that modifies the above code to support signed files.

I didn’t incorporate it into my code because I think he should get full credit for the modifications 🙂

Common Errors

Illegal key size or default parameters

It’s yelling at you because the default JRE security can’t handle PGP.
You didn’t follow the instructions (above) on updating the JCE jars in your JRE’s lib/security directory.

Secret key for message not found

Regenerate your pubring and secring using the PGP tool. Odds are something is wrong with them.

The Comments

(Note: Added 4/2/2010)
There has been a lot of great activity in the comments, as well as some very kind readers who have posted code snippets to do things that my initial library did not do. I highly suggest you dig through the comments if what you are looking for is not in my original post.

Posted in Java, Programming | 66 Comments

STS Dashboard behind a Proxy

Just a quick little note:

If you are ever behind a proxy and can’t get an Extension from the STS (SpringSource Tool Suite) dashboard (there is a bug in the code where it won’t use your proxy settings), you can still get the extensions. Go to Help -> Install New Software, select SpringSource Update Site for Eclipse, then open Extensions / STS and select the extension there.

I needed it to install the GRAILS plugin and it was a pain to find so I thought someone might need it.

Posted in Uncategorized | 2 Comments