Welcome

My name is Jason and I am a software developer in the Bay Area. For fun I like to hike, run, shoot some photos, and take advantage of the many other activities California state has to offer. To the right you will see my resume.

Recent Books
  • Head First Design Patterns
    Head First Design Patterns
    by Elisabeth Freeman, Eric Freeman, Bert Bates, Kathy Sierra, Elisabeth Robson
Sunday
Sep012013

Using a Class Factory With Singletons

A singleton is an object that there is only 1 instance of. The way I learned how to pull this off was by making the constructor private and then having a static public method that would return the instance like so.

Code: C#

public class FooClass
{
  private static FooClass _fooClass = null;

  // Public static method
  public static FooClass GetInstance()
  {
    if (_fooClass == null)
      _fooClass = new FooClass();
    return _fooClass;
  }

  // Private constructor
  private FooClass()
  {
  }
}

But I was recently reading The Productive Programmer by Neal Ford and he discussed how this is no longer considered the best practice. The issue here is that a class should only be doing one thing. With the above pattern the class is doing two things; it's defining the behavior of FooClass and it's instantiating an instance of itself that it keeps track of. The second part is more commonly done with a class factory. So that's the current best practice. Define a singleton class with a private constructor and then create a class factory for it that handles the instance.

... But wait. The constructor is private? Yes. You'll use reflection to gain access to the constructor. I know this is not as easy, and depending on what you're doing you may still want to use the older method, but it's good to know, doesn't really add that much more code, and creates clear separation of functionality.

I went looking for examples of this in C# and didn't find any. So I threw the following example together for others that may be looking for the same. This isn't thread safe and any time I'm doing reflection I feel there should be try/catch logic surrounding it, but this is just a simple example to get you going.

public class FooClass
{
  private FooClass()
  {
  }
}

public static class FooClassFactory
{
  private static FooClass _fooClass = null;
  public static FooClass GetInstance()
  {
    if (_fooClass == null)
    {
      // Indicates that we're searching for a non-public instance constructor
      BindingFlags bindingFlags = BindingFlags.Instance | BindingFlags.NonPublic;
      // Using null causes the DefaultBinder to be used
      Binder binder = null;
      // The types of the constructor's arguments, which there are none with this example
      Type[] argumentTypes = new Type[] {};
      // The DefaultBinder ignores the array of ParameterModifiers provided, so we'll just use null
      ParameterModifier[] parameterModifiers = null;

      Type fooClassType = typeof(FooClass);
      ConstructorInfo ctorInfo =
        fooClassType.GetConstructor(bindingFlags,
                                    binder,
                                    argumentTypes,
                                    parameterModifiers);
      _fooClass = (FooClass)ctorInfo.Invoke(new object[] {});
    }
    return _fooClass;
  }
}

The actual Type.GetConstructor() call could be done in a single line instead of all the setup I did with the BindingFlag, Binder, Type[], and ParameterModifier variables. I broke it down this way just to help understand what was happening.

I hope you found this useful. How would you change it to make it thread safe? Would you add a try/catch block? Leave a comment with your thoughts!

Sunday
Apr212013

Quick Fix: File count of a directory

Recently I had put together a DOS batch script that would monitor the number of PDFs in a directory. Once that count got to or above 500 it would then turn on some services to process the PDFs. Now there's probably numerous ways to determine the number of files in a given path, but I wanted to do it in a single line and ideally exercise existing DOS functionality where possible. That's how I came up with the following single line command chain.

for /f "tokens=1" %%A in ('dir %pdfDir%\*.pdf 2^>nul ^| find /i "File(s)" ^|^| echo 0') do (set fileCount=%%A)

First, let me explain why all the carrots “^” are present. If you didn't know, this is how you escape special characters, like pipes “|” and redirects “>”  in DOS. Since we're using the FOR command, we want to escape these characters because the FOR command will take the entire string given and run it internally to process the output. Before this happens though DOS will read the entire line, process/parse any variables and/or special characters, like pipes and redirects, and then run the command. This parsing can get confused when it sees the pipes and redirects without carrots and thinks you'll want to execute these actions before passing them to the FOR command. This is the same as to why you have to use double percent signs “%%” in a script for the FOR command, but not when using it on the command line. The double percent signs is how you escape percent signs so the DOS batch file doesn't treat it like a variable it should process, but let's the FOR command process it instead.

C:\TechBlog>echo Redirect = ^>  Pipe = ^|  Double pipe = ^|^|
Redirect = >  Pipe = |  Double pipe = ||

If you don't understand the “2>nul” syntax, please see my earlier post “Redirection of the output streams in DOS scripts”.

So let's break down what's happening here. I'm going to simplify the full command to the following to help explain what's going on and later describe why the “echo 0” is present.

for /f "tokens=1" %%A in ('dir %pdfDir%\*.pdf 2^>nul ^| find /i "File(s)"') do (set fileCount=%%A)

When you do a DIR on a directory you get output like the following.

C:\TechBlog>dir
 Volume in drive C has no label.
 Volume Serial Number is A48F-FE2B

 Directory of C:\TechBlog

04/21/2013  12:00 PM    <DIR>          .
04/21/2013  12:00 PM    <DIR>          ..
04/21/2013  12:00 PM               391 PdfFile1.pdf
04/21/2013  12:00 PM               444 PdfFile2.pdf
04/21/2013  12:00 PM               338 TextFile.txt
               3 File(s)          1,173 bytes
               2 Dir(s)  16,957,591,552 bytes free

Most of this information we don't care about except for the line stating the file count. We want to isolate this line so that we can use the FOR command to parse it. To do this we pipe the output of the DIR into the FIND command. The FIND is setup to only return lines that have the text “File(s)” in them. This line of text is returned and the FOR command parses it. Because we're using the FOR command with “tokens=1” we're saying that after the FOR command parses the line, which by default is by whitespaces, we only want the first token found. That token will be the number of files. So now when the FOR command runs the command (set fileCount=%%A) it's setting fileCount to the number of files found in the directory. We have our file count in a variable and ready to use!

Ah, but there's a catch. There's always a catch. Sadly when you do a DIR on something specific, like “dir *.doc” and there are no DOCs present it doesn't simply return “0 File(s)”, but the message “File Not Found” which isn't really useful for what we're trying to accomplish. We need some error handling.

C:\TechBlog>dir *.doc
 Volume in drive C has no label.
 Volume Serial Number is A48F-FE2B

 Directory of C:\TechBlog

File Not Found

This is where the “echo 0” comes in. Going back to the original full command, when DIR doesn't pipe any text into FIND that contains “File(s)”, the FIND command returns a failed (non-zero) exit code. Adding “||” to the end of a command means only run the following command if the previous has failed. Since FIND has failed, the “echo 0” command is ran returning the text “0” which FOR then parse. This then resolves our “File Not Found” scenario and we're good to go!

Do you know of another single line command/command chain to determine the number of files in a path? If so, share it! I'd love to see how others have accomplished this.

Sunday
Nov112012

Interview question: Find the most used character using a Hashtable

In my last post I discussed how to find the most used character in an ASCII string using only a INT[]. An interviewer would mostly ask you about that scenario to test your knowledge of programming to see if you knew of how to implement the algorithm using the least amount of memory. In this example, the immediate resources are not the biggest concern and we want to be able to handle Unicode characters. The interviewer is wanting to test your knowledge of the types the framework provides to you. 

The method for doing this will be similar to the previous post accept we're using a Hashtable this time. An advantage of this will be that we won't have to determine the number of keys before we start parsing the string as the Hashtable object will handle this automatically. This may mean that more memory will be wasted and that there maybe more overhead needed when a resizing of the table is needed. But, this isn't the concern right now. Just make sure to mention this and that a way to keep the resizing overhead to a minimum would be to indicate the initial number of keys needed when initializing the Hashtable. In this example we're only worried about English alphabetical characters, so a size of 32 would be large enough (*why 32 was used i explained below).

Overall the algorithm will function the same as the FindMostUsedUnicodeChar() method we wrote in the last post. We're going to have 2 FOR loops with the first parsing the string and counting the number of instances for each character. The second loop will go through the Hashtable to determine the character that had the most instances. In the end, the character with the most hits will be returned.

/**
 * Determines the first alphabetical character that has the most instances within the string.
 * The search is case insensitive.
 * @param givenString The string.
 * @return The character or null if none are found.
 * @throws IllegalArgumentException
 */
public static char FindMostUsedUnicodeChar(final String givenString)
      throws IllegalArgumentException {

    // Verifying the string is not NULL.
    if (givenString == null)
      throw new IllegalArgumentException("The parameter givenString cannot be null.");

    // At this time we're only concerned with the letters A-Z., the 26 letter in the English
    // alphabet, for this example. This is why we're initializing the hashtable to a size of 32;
    // the smallest 2^Nth value that's greater than 26.
    Hashtable<Character, Integer> hashtable = new Hashtable<Character, Integer>(32);

    // Looping through the string and counting the number of times a alphabetical character
    // appears. We're doing a case insensitive search.
    final char givenChars[] = givenString.toUpperCase().toCharArray();

    for (int index = 0; index < givenChars.length; index++) {
      final char key = givenChars[index];
      if (key < 'A' || 'Z' < key)  continue;

      int charCount = 1;
      if (hashtable.containsKey(key)) charCount = hashtable.get(key) + 1;
      hashtable.put(key, charCount);
    }

    // Finding the first letter to have the highest use count by the order that they appeared.
    char mostUsedChar = '0';
    int highestCharCount = 0;
    for (int index = 0; index < givenChars.length; index++) {
      char key = givenChars[index];
      if (key < 'A' || 'Z' < key) continue;

      int currentCount = hashtable.get(key);
      if (highestCharCount < currentCount) {
        mostUsedChar = key;
        highestCharCount = currentCount;
      }
    }

    // Done
    return mostUsedChar;
}

As before, you may want to ask what behavior is expected if there are an equal number of 'a' and 'c' characters. This implementation, like the last, will return the first letter found in the string with the highest count. You will need to know that this algorithm is O(n) because the FOR loops are in series, not nest like you would see with O(n^2) solutions. Some test cases you will want to consider are how to handle a null string, an empty string, and a string with no alphabetical characters.

Is there anything you would change in this method? Have any tips from interviews you've had in which this question was asked? If so, share your experience and leave a comment!

Full source code here.

* I used 32 instead of 26 because memory is allocated in blocks of 2^Nth. It's the smallest 2^Nth value that's large enough to contain the 26 keys.

Sunday
Jan012012

Interview question: Finding the most used ASCII character

This question has come up in a few interviews and there's 2 ways to solve it depending on if you're using ASCII or Unicode character sets. This first post is to example how to solve the problem for the ASCII character set. 

The problems goes like this: You're to make a method that will return the most used character in a given string. The character set of the string is ASCII. How would you go about doing this? 

Hearing this you know you'll have some type of FOR loop happening here as you move through the string counting each character. Now how do you plan on storing the count of each character? The easiest way would be to use some type of hash table where the character is the key and the value is the number of times the character has been found. 

Great! You've got a plan. But, there should be something nagging you. Why did they emphasizing that the string consists of ASCII characters? This is a hint on how they want you to solve the problem and they're testing your knowledge a bit here. In most languages a CHAR can be used as a INT and vice versa. This means that you can create a simple hash table using an INT[] where the position in the array, or “key”, you want is determined by the value of the current CHAR. So if you're currently looking an A character, whose INT value is 65, you would increment the 65th element of the INT[]. Any time you found the letter D, you would increment the 68th element of the INT[]. And so on. 

About here is where you should ask if this algorithm should be case sensitive or not. Either way, it shouldn't affect your implementation too much, but it just shows you're thinking ahead for test cases. You should also know how the language you're using handles the creation of INT[]. Does it zero out all the elements on creation? Or do you have to do this manually? Knowing things like this shows your mastery of the language. 

So this is what the algorithm looks like. Just to make is a little more interesting I made it case insensitive and only counting alphabetical characters. 

public static char FindMostUsedAsciiChar(final String givenString)
      throws IllegalArgumentException {

   // Verifying the string is not NULL.
   if (givenString == null) {
       throw new IllegalArgumentException("The parameter givenString cannot be null.");
   }

   // To reduce the size of the int[] that's storing the count information I'm only making it
   // 26 elements in size. This means that any time the key char is used that its value must
   // be adjusted accordingly.
   int charHash[] = new int[26];

   // Looping through the string and counting the number of times a alphabetical character
   // appears. We're doing a case insensitive search.
   final char givenChars[] = givenString.toUpperCase().toCharArray();
   for (int index = 0; index < givenChars.length; index++) {
      int key = givenChars[index];
      if (65 <= key && key <= 90) {
         charHash[key - 65]++;
      }
   }

   // Finding the first letter to have the highest use count by the order that they appeared.
   int mostUsedChar = 0;
   int highestCharCount = 0;
   for (int index = 0; index < givenChars.length; index++) {
      int key = givenChars[index];
      if (key < 65 || 90 < key) {
         continue;
      }
      int currentCount = charHash[key - 65];
      if (highestCharCount < currentCount) {
         mostUsedChar = key;
         highestCharCount = currentCount;
      }
   }

   // Done
   return (char)mostUsedChar;
}

Things to possibly take into consideration is how the most used character is found in this scenario. I'm returning the first largest value. So if the string used was “caca” with both 'a' and 'c' having the same number of instances, 'c' would be returned as the character with the most hits. Your interviewer may want the results returned by alphabetical order, meaning the result would be 'a' instead of 'c'. But generally speaking, they don't care. They're just seeing if you understand how your algorithm works. If this comes up it's for the purpose of how to test the method and the expected results in this scenario. 

Another question that may come up, or is at least good to bring to up to once again show your knowledge, is what Big O notation this performs at. Although we are accessing the INT[] in O(1) fashion, the bottle necks are the FOR loops. We do have 2 of them, but since they're not nested our algorithm is O(n). 

In the next post I'll explain how to implement this using the Unicode character set. 

Would you have written this any differently? Are there any possible optimizations that you would have added?

Full source code here.

Sunday
Dec112011

Quick Fix: “Invalid number” error in DOS batch script

I ran into this the other day. After a little searching around I found the answer and thought I'd share the way I resolved it. Hopefully this will shorten someone else's search time.

The error message I was getting was the following: 

Invalid number. Numeric constants are either decimal (17),
hexadecimal (0x11), or octal (021).

It turns out this is happening on the script line that's doing some math with the “set /a” command. It could be happening in other commands as well, but this is where I've seen it.

So here's what you need to know to resolve this. As the error message indicates, it treats values that have leading 0's as octal. Each digit of a octal value is represented with the numbers 0-7. For example octal value 022 is the same as decimal value 18. Why you're getting the error message is because you have a value that has a leading zero, indicating an octal value, but contains at least a digit that is either an 8 or 9. Neither number conforms with how an octal value is represented and this is the source of the error.

Here's the quick solution I came up with. Since I was dealing with dates (months and nth day of the month) I knew that I would have values that are at most 2 digits long and always positive. So before doing any arithmetic on these values I would do a quick check for a leading zero and remove it if present. 

@echo off
(set month=08)
echo The month should be 08: %month%
 
REM The error message should appear. Bummer!
(set /a nextMonth=%month% + 1)
echo The next month should be 9: %nextMonth%
 
REM Checking for leading zero and removing it if present.
if "%month:~0,1%"=="0" (set month=%month:~1%)
echo The month should now be 8: %month%
 
REM No error message. Hooray!
(set /a nextMonth=%month% + 1)
echo The next month should be 9: %nextMonth% 

To break it down, what's happening in the first part of the IF statement is we're obtaining a substring from month whose index starts at 0 and we want the substring to be 1 character long. This means that the code “%month:~0,1%” is just returning the first character of month, which in this case is a zero. Because the IF statement is true, we now set the value of month so that it equals the substring starting from the index of 1 and then all following characters. This means that “%month:~1%” will return the 8. And now the leading zero has been removed and we're good to go!

If you know you're values are only going to be 2 digits long, you could get away with an IF statement like the following:

if %month% lss 10 (set month=%month:~-1%)

I prefer the substring method I initially showed because you could later add some looping logic for removing multiple leading zeros if you were dealing with values like 00789.

Remember, this solution does not handle negative values or values that have more than 1 leading zero. Please take that into consideration when implementing this workaround in your script.

I hope this helps unblock you and clears up any issues you may have been having. If you have another way you handle related scenarios, post it in the comments. I'd love you hear other ways to accomplish this!