Saturday, July 18, 2009

String Permutations II

I thought of a few ways I could optimise the permutation routines from yesterday, which were relatively simple, so I thought I'd try them out. I'm always interested in ways I can make code faster, and enjoy testing things like this out. Most of the improvements this time round were to the iterative method:
 1: public static List<String> getAllPermutationsIteratively(String s) {
 2:    int length = s.length();
 3:    if(length <= 1) {
 4:       return Arrays.asList(s);
 5:    }
 6: 
 7:    int numPermutations = factorial(length);
 8: 
 9:    List<String> permutations = new ArrayList<String>(numPermutations);
10: 
11:    for(int permNo = 0; permNo < numPermutations; permNo += 2) {
12:       char[] charsLeft = s.toCharArray();
13:       
14:       int size = length;
15:       
16:       char[] charArray = new char[length];
17:       
18:       int i, factorialSize = factorial(size);
19:       for(i = 0; i < length - 2; i++) {
20:          // The division means that the index only increases after so many iterations
21:          // The remainder reduces the number range to the number of iterations the index stayed
22:          // the same for in the previous division
23:          int factorialSizeMinusOne = factorial(--size);
24:          int index = (permNo % factorialSize) / factorialSizeMinusOne;
25:          factorialSize = factorialSizeMinusOne;
26:          
27:          charArray[i] = charsLeft[index];
28:          
29:          // Copy all elements from (index+1) onwards back one place, i.e. remove the element at
30:          // index. Note that size was decremented earlier.
31:          System.arraycopy(charsLeft, index+1, charsLeft, index, size - index);
32:       }
33:       
34:       charArray[i] = charsLeft[0];
35:       charArray[i + 1] = charsLeft[1];
36:       permutations.add(new String(charArray));
37: 
38:       charArray[i] = charsLeft[1];
39:       charArray[i + 1] = charsLeft[0];
40:       permutations.add(new String(charArray));
41:    }
42:    return permutations;
43: }
I got a new source code to html formatter by the way which I think gives better results. The first change isn't for performance, but just ensures it works properly for strings of length 0 (they could just as well throw an exception) and 1, which would've worked in my old iterative version, but not in my old recursive version (I think).

Next, I changed the loop so it increments by two each time, and did the final two letters e.g. AB and BA manually, which was a big performance improvement. The next one was saving the calculated factorial values, which saves one factorial calculation through iteration of the inner loop, which is again nice. I tried various ways of making this faster, but couldn't make it faster than it is here. Then instead of using an ArrayList of Characters, I changed it to use an array of chars, which performs better, but is a little trickier. This brought the speed of this version to be roughly equivalent to the recursive version.

Now I changed the recursive version to:
 1: public static List<String> getAllPermutationsRecursively(String s) {
 2:    int length = s.length();
 3:    if(length <= 1) {
 4:       return Arrays.asList(s);
 5:    }
 6: 
 7:    List<String> permutations = new ArrayList<String>(factorial(length));
 8: 
 9:    char[][] charArray = getAllPermutationsRecursively(s.toCharArray(), length, length - 1);
10:    for(int i = 0; i < charArray.length; i++) {
11:       permutations.add(new String(charArray[i]));
12:       charArray[i] = null; // Let the gc be able to clear this up
13:    }
14: 
15:    return permutations;
16: }
17: 
18: // length is the length if the original String
19: // pos is the current position in the char[] which a character will be added to. On the first
20: // call of this it should be one less than length
21: private static char[][] getAllPermutationsRecursively(char[] chars, int length, int pos) {
22:    char[][] toReturn = new char[factorial(pos + 1)][];
23: 
24:    // Though the == 1 case is simpler, it's faster if this is the base case
25:    if(chars.length == 2) {
26:       // There are two combinations, [0, 1] and [1, 0]
27:       toReturn[0] = new char[length];
28:       toReturn[1] = new char[length];
29: 
30:       toReturn[0][0] = toReturn[1][1] = chars[1];
31:       toReturn[0][1] = toReturn[1][0] = chars[0];
32:    } else {
33:       int toReturnPos = 0;
34:       for(int i = 0; i < chars.length; i++) {
35:          char c = chars[i]; // The current char
36: 
37:          // Make a copy of the array without c in it
38:          char[] newChars = new char[chars.length - 1];
39:          System.arraycopy(chars, 0, newChars, 0, i);
40:          System.arraycopy(chars, i + 1, newChars, i, chars.length - i - 1);
41: 
42:          // Add c to each of the returned arrays of chars
43:          for(char[] charArray : getAllPermutationsRecursively(newChars, length, pos - 1)) {
44:             charArray[pos] = c;
45:             toReturn[toReturnPos++] = charArray;
46:          }
47:       }
48:    }
49:    return toReturn;
50: }
This one only has a couple of minor changes, nothing major, it seems I did it pretty well the first time.

While doing this process I burnt my fingers a little with relatively shoddy benchmarking. At first I was getting wildly erratic results, even though I would run each version say 5 times, the time would vary by 50% (or more) between runs, which I thought was pretty crazy. The first thing I thought of doing to combat this was set my cpu governor to performance rather than dynamic, which means that it'll sit at full speed all the time, rather than throttling down. This made a remarkable difference, it seems there was a bit of lag in it going to full speed. I also have backround programs running, which you shouldn't really, but as I only run this in one thread, it shouldn't make too much of a difference.

The second main thing was that the JVM wasn't getting enough memory, so the one running second would run slower as the gc would run a lot more. By setting the starting and finishing memory to 512 MB (more than it should need, but I wanted to be on the safe size), I eliminated this discrepancy to a large degree. This also speed up the speed of execution by as much as two times, which is pretty remarkable. What brought this to my attention was running on "PARTICLEX" it would crash with an out of memory error when I changed the code and ran the iterative version first. It must've been right on the cusp of running out of memory before.

For a bit of analysis - the performance of the two in terms of speed is very close now. However in terms of memory, the iterative version should use less, but after the couple of changes I made to the recursive one that shouldn't be too bad. The iterative version is slightly shorter, so would probably get a slight preference by me, but hey. (Not that I can see any practical purpose for this).

No comments:

Post a Comment