Finding every possible combination of array entries from multiple lists with unknown bounds in Java

I am sure this post will resonate with pretty much any developer that reads it. Have you ever had a problem where you had to really step back and think about for a long time? A problem where the wealth of the internet helps a little but just isn’t what you were looking for? One big problem is I had recursion on my brain and I was convinced for a long time recursion would be needed. Seeing many similar problems solved with recursion I thought it was the way to go.. “Much to learn you still have…my old padawan.” – Yoda.

A long time for me is hours or even days. Usually I have some level of experience in the past that helps me quickly figure out coding tasks. Well for some reason this thing got me thinking for almost an entire day. And when I say entire day I mean I still do other things but its embedded in the back of my brain. I had my notebook with me the entire day drawing out boxes, arrows, and visualizing walking through loops. It wasn’t until later that night when I realized what I had to do. This was that kind of problem.

Head shots thoughtful, thinking, finding solution man

Continue reading

Advertisements

Java sorting – Part 3

I had not intended to post about the Java sort again but after working on a project today where I had to create a Comparator within my class I decided I would add this to the madness.   The problem is, my sort was messed up so the bug must have been in my compareTo() method.  Some background, I have a class that implements Comparator<TreeEntry> and in the compareTo() method I simply compare the labels in a tree to sort them alphabetically.  Btw, I love generics in Java!

Well, when I was debugging my compareTo() method I noticed the call stack and I thought, hhm, this would be some good information to share since I just wrote about sorting arrays!

Hitting my break point in the debugger I quickly noticed the “Arrays.mergeSort()” calls in the stack.

Check out the stack:

Arrays.sort() Java Callstack

So this is interesting!  It looks like Java, under the covers, uses some variation of the Merge sort algorithm.  Now, once again, I have not looked at the Java code but looking at the signature of the “merge” method it is identical to the classic Merge sort I referenced in the original post.

Which Java sort is faster? Part 2

After taking some feedback and hearing further arguments I went ahead and automated a program that runs many permutations of the sort comparisons.  For these results I ran against an array that has 150,000 random strings that are sent into each different sort approach as I showed in the previous post.   I then wrapped these calls into a loop and ran that loop 50 times.  The results are a little different than the first set of tests but in the end the Java sort is much faster.  Maybe the next exercise would be to code a search that is as fast as the Java sort – without looking at the Java sort code…

Here are the average numbers.  Each set below is the average for a 150,000 element array running 50 times.  I actually ran this twice just to get another set of numbers.

Set 1

150,000 elements run 50 times
Average Merge Time = 176
Average Quick Time = 190
Average Java Time = 161

Set 2

150,000 elements run 50 times
Average Merge Time = 182
Average Quick Time = 200
Average Java Time = 165

Which Java sort is faster – MergeSort, QuickSort or Arrays.sort?

I had a discussion the other day about writing your own sorting code versus using the built in Java sort.  So tonight I decided to do a little geek research for the fun of it.  Now, I will say, I have rarely ever rolled my own sorting algorithm in Java (if ever).  In C, yes, C++, maybe – until STL came around.  So I decided to do a test.  I scoured the internet (ok, a couple of Google searches) for a good sample of a Quick Sort(link here) and a Merge Sort(from here) and then I compared them to the built in Java sort (Arrays.sort).  The two code samples came from Java Tips, great site, check it out.

I did see some interesting numbers when I changed the number of elements or the kind of elements I sorted.  I ended up using a string concatenated with the current time in milliseconds to fill the array.  I ended up running a series of tests where the sample ranged from 10,000 entries to 200,000 entries.  Is this likely in the real world? Probably not, but it makes for some good geek discussion!

Code to fill the array:

private static void fillArray(List<String> array) {
     Random generator2 = new Random( System.currentTimeMillis() );

     for (int x = 0; x<50000; x++){
         array.add("Value = " + generator2.nextInt());
     }
}

The following is a series of results  after running the code 10 times.  The results are the time it took to make each sort call in milliseconds.

I think the key takeaway from this table is you can consistently see the Java (Arrays.sort) consistently faster across the different element counts.

I do not know this for sure, I could look into the Java source to verify but I think the Arrays.sort must be using some kind of Introsort algorithm where it recognizes the recursion depth and dynamically switches to a heap sort when the recursion depth gets too high.  One issue that is not shown here, is the Quick Sort and Merge Sort actually run out of memory on sets that are over a million (in a basic configuration).  So this recursion optimization is actually a really big deal with very large sets and in short the Arrays.sort should probably be used in most cases.

The reality is, most data has some kind of sort out of the box and is rarely this random.  The key however, is to really understand the various things going on within your sort code and running a few tests like this may in the end save you a lot of programming time.