TranSystems Software Developer Tips and Tricks: 2008

Tuesday, September 30, 2008

Using Java “enums” in AnyLogic

Editor's note: A huge Thank You to Kevin Bennett, Senior Analyst in our San Diego office, for authoring this month's blog entry!

We all use constants in our code and models. In our simulation models, we typically use an all-uppercase variable name and assign it an integer value, so we can use it as an index into an array or in comparisons. Here’s an example of what we might declare currently:


public static final int STATE_IDLE = 1;
public static final int STATE_CLASSIFYING = 2;
public static final int STATE_INSPECTING = 3;

This structure makes it easy to index into an array. For example, if you want to track statistics on a track, you could do this:


dTrackTimeInState[ STATE_IDLE ] += dTimeInState;

So what’s the problem? Well, constants have some basic issues when used this way:

Not type-safe - Since a State is just an integer you can pass in any other integer value where a State is required, or add two States together (which makes no sense).

No namespace - You frequently have to prefix constants with a string (in this case STATE_) to avoid collisions with other constants (i.e., TRACKSTATE_IDLE, ENGINESTATE_IDLE, CRANE_IDLE, etc.).

Printed values are uninformative - Because they are just integers, if you print one out all you get is a number, which tells you nothing about what it represents, or even what type it is.
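
Here is a minimal sketch of those problems in action. Only the STATE_ constants come from the declaration above; the class name IntConstantPitfalls, the describe() helper, and the value of CRANE_IDLE are made up for illustration.

```java
// Illustrative only: class name, describe(), and CRANE_IDLE's value are assumptions.
class IntConstantPitfalls {
    static final int STATE_IDLE = 1;
    static final int STATE_CLASSIFYING = 2;
    static final int STATE_INSPECTING = 3;

    // A constant from some other (hypothetical) set that happens to share a value.
    static final int CRANE_IDLE = 1;

    // A "state" parameter is only an int, so the compiler accepts any int here.
    static String describe(int state) {
        return "state = " + state;
    }

    public static void main(String[] args) {
        System.out.println(describe(STATE_IDLE));                     // prints "state = 1"
        System.out.println(describe(CRANE_IDLE));                     // wrong kind of constant, still compiles
        System.out.println(describe(STATE_IDLE + STATE_CLASSIFYING)); // prints "state = 3", same as INSPECTING
        System.out.println(describe(42));                             // not a state at all, still compiles
    }
}
```

Every one of those calls compiles and runs, and none of the output tells you which state (if any) was meant.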

Well, in the eternal search for something new and better, I started looking into Java enums. Similar to constants, but a bit more structured. What I found was that Java enums are very powerful!
The simple case would be to replicate what we did above:


public enum TrackStates
{
    IDLE,
    INSPECTING,
    CLASSIFYING;
};

OK, so that doesn’t look very different, and doesn’t act too differently either. You can assign variables equal to the enums as before, although since the variable has to be of the enum type, it’s a little better documented and type-safe:


TrackStates iCurrentState;
iCurrentState = TrackStates.IDLE;
iCurrentState = 1;   // FAILS TO COMPILE, BECAUSE iCurrentState IS NOT AN INTEGER.

So why use enums? Well, sometimes the constants don’t work quite the way we want them to. For example, let’s say that you wanted to print out the state of each track. One way of doing this would be like this:


for( int i = 0; i < MAX_TRACKS; i++ )
{
    switch( iCurrentState[ i ] )
    {
        case STATE_IDLE:
            System.out.println( "Track " + i + ", state = IDLE");
            break;

        case STATE_CLASSIFYING:
            System.out.println( "Track " + i + ", state = CLASSIFYING");
            break;

        case STATE_INSPECTING:
            System.out.println( "Track " + i + ", state = INSPECTING ");
            break;
    }
}

OK, it works, but not too elegant. And what if you have 10 or 20 constants? Ugh. Let’s see what happens with enums:


for( int i = 0; i < MAX_TRACKS; i++ )
{
    System.out.println( "Track " + i + ", state = " + iCurrentState[ i ] );
}

As Keanu Reeves would say, "Whoa." If iCurrentState[] is declared as a "TrackStates" variable array, then the enums know how to print their own value! And it won’t just print 1, 2, or 3, it will print the string "IDLE", "CLASSIFYING", or "INSPECTING"!

What if you don’t want to print the constant name? Suppose you want to print something different. That’s easy; just define the enum slightly differently:


public enum TrackStates
{
    IDLE( "Track is Idle"),
    INSPECTING( "Track is Inspecting"),
    CLASSIFYING( "Track is Classifying");

    String sName;
    TrackStates (String name) { sName = name; }
    public String toString() { return sName; }

};

OK, so you’ve added 3 more lines, but you only have to do that once, wherever you define your enums. Now when you print the state "IDLE", it will show up as "Track is Idle".

This also hints at an important fact about enums: they are actual Java classes. This means anything you can do in a Java class, you can do in an enum. That toString() method could easily be expanded to print anything you want based on system conditions.
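
One consequence worth seeing in code: overriding toString() changes what gets printed, but the declared constant name is still available. The sketch below reuses the TrackStates enum from above, nested inside a hypothetical wrapper class (EnumIsAClass) so that it is self-contained.

```java
// EnumIsAClass is a hypothetical wrapper so the example compiles on its own.
class EnumIsAClass {
    enum TrackStates {
        IDLE("Track is Idle"),
        INSPECTING("Track is Inspecting"),
        CLASSIFYING("Track is Classifying");

        private final String sName;

        TrackStates(String name) { sName = name; }

        @Override
        public String toString() { return sName; }
    }

    public static void main(String[] args) {
        TrackStates state = TrackStates.IDLE;
        System.out.println(state);        // prints "Track is Idle" (overridden toString())
        System.out.println(state.name()); // prints "IDLE" (name() is final, always the declared name)
        System.out.println(TrackStates.valueOf("IDLE") == state); // prints "true"
    }
}
```

So a friendly label for reports and the exact constant name for log files or lookups can coexist in the same enum.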

It also means that you can add other properties and methods to your enums. Here’s a very important example. One downside to using enums is that, since an enum is a class, there is no default integer value for each element. So when you use an enum as an index into an array, you need to convert it to an integer value. There is a built-in method called ordinal() which returns the zero-based position of the constant in its enum declaration. Unfortunately, it doesn’t look quite as clean as using a standard constant:


dTrackTimeInState[ IDLE.ordinal() ] += dTimeInState;

One thing that I’ve done in my last project is to make a new method called id(). This, at least, shortens the name I have to use when accessing arrays. In the enum, it looks like this:


public enum TrackStates
{
    IDLE( "Track is Idle"),
    INSPECTING( "Track is Inspecting"),
    CLASSIFYING( "Track is Classifying");

    String sName;
    TrackStates (String name) { sName = name; }
    public String toString() { return sName; }
    public int id() { return this.ordinal(); }

};

dTrackTimeInState[ IDLE.id() ] += dTimeInState;
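
A quick sketch of what id() actually returns may help here. The wrapper class OrdinalDemo is hypothetical, and the enum is trimmed to just the constants and id().

```java
// OrdinalDemo is a hypothetical wrapper; the enum is a trimmed copy of TrackStates.
class OrdinalDemo {
    enum TrackStates {
        IDLE, INSPECTING, CLASSIFYING;

        int id() { return ordinal(); }
    }

    public static void main(String[] args) {
        for (TrackStates state : TrackStates.values()) {
            System.out.println(state + " -> " + state.id());
        }
        // prints, one per line: IDLE -> 0, INSPECTING -> 1, CLASSIFYING -> 2
    }
}
```

Two things to watch: the values start at 0 (the integer constants earlier started at 1), and they follow declaration order, so reordering the constants or inserting one in the middle shifts the indices.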

Here’s one more neat example. Suppose you have your Track States already defined in your model, and used everywhere. Maybe you need to initialize all of the Track State times at the beginning of the model. Previously you might loop through all the constants like this:


for(  int i = 0; i < MAX_TRACK_STATES; i++ )
{
    iTrackStateTime[ i ] = 0;
}

This is OK, but if you add a Track State, you need to make sure to go back and update the MAX_TRACK_STATES constant. Using enums, you can "short circuit" the For-loop, and there’s no need to update any constants:


for( TrackStates state : TrackStates.values())
{
    iTrackStateTime[ state.id() ] = 0;
}

Not any shorter, but nothing to update if you add constants (other than the array size, which you’d have to do in either case). This also demonstrates the use of the Java equivalent of For-Each. The VB.NET version of that statement would look like this:


For Each state In TrackStates
    iTrackStateTime( state.id ) = 0
Next

This turns out to be quite useful to allow you to iterate through enums easily.
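
One possible refinement, offered as a sketch rather than as what the project actually does: the array itself can be sized from the enum, so that adding a state needs no other update. The wrapper class TrackStateTimes and the reset() helper are hypothetical names.

```java
// TrackStateTimes and reset() are hypothetical names for illustration.
class TrackStateTimes {
    enum TrackStates {
        IDLE, INSPECTING, CLASSIFYING;

        int id() { return ordinal(); }
    }

    // Sized from the enum, so a newly added state gets a slot automatically.
    static final double[] dTrackTimeInState = new double[TrackStates.values().length];

    static void reset() {
        for (TrackStates state : TrackStates.values()) {
            dTrackTimeInState[state.id()] = 0.0;
        }
    }

    public static void main(String[] args) {
        dTrackTimeInState[TrackStates.IDLE.id()] += 12.5;
        reset();
        System.out.println(dTrackTimeInState.length); // prints 3
    }
}
```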

This is just the start. Java enums can be very powerful, and almost limitless in their application. Try it in your AnyLogic model, and let us all know what new uses you come up with for enums!

Jim adds: Thanks Kevin! Enumerated types have been a useful programming construct -- it’s a shame that not all of the simulation languages have them. For example, while we’ve historically used Nicknames within Arena as a substitute for constants, we’ve had to rely on naming conventions as a substitute for true enumerated types.

I recommend using the enumerated type primarily to create a type-safe list of categorical values, and trying to limit the number of extra methods and member variables within the class itself. One thing Kevin and I discussed: Are there alternate ways of defining the dTrackTimeInState[] structure such that you don’t need to use the IDLE.id() method which converts it to an integer? Should we really be using the enum as an index into an array, or is there a better way?
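
One candidate answer is java.util.EnumMap from the standard library, which is keyed directly by the enum constant, so no conversion to an integer is needed. The sketch below is only one way to do it; the class TrackTimeByState and its add() and get() methods are hypothetical names, not anything from an existing model.

```java
import java.util.EnumMap;
import java.util.Map;

// TrackTimeByState, add(), and get() are hypothetical names for illustration.
class TrackTimeByState {
    enum TrackStates { IDLE, INSPECTING, CLASSIFYING }

    // Keyed by the enum constant itself; no ordinal() or id() call required.
    private final Map<TrackStates, Double> timeInState =
            new EnumMap<TrackStates, Double>(TrackStates.class);

    TrackTimeByState() {
        // Start every state at zero so get() never sees a missing key.
        for (TrackStates state : TrackStates.values()) {
            timeInState.put(state, 0.0);
        }
    }

    void add(TrackStates state, double dTimeInState) {
        timeInState.put(state, timeInState.get(state) + dTimeInState);
    }

    double get(TrackStates state) {
        return timeInState.get(state);
    }

    public static void main(String[] args) {
        TrackTimeByState times = new TrackTimeByState();
        times.add(TrackStates.IDLE, 4.0);
        times.add(TrackStates.IDLE, 1.5);
        System.out.println(times.get(TrackStates.IDLE));       // prints 5.5
        System.out.println(times.get(TrackStates.INSPECTING)); // prints 0.0
    }
}
```

The trade-off is that the values are boxed Double objects rather than primitives, so a plain double[] may still be preferable in a hot loop.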

For further background on enumerated types and their benefits, see http://en.wikipedia.org/wiki/Enumerated_type.

Thursday, August 21, 2008

Excel Compatibility Checker and the Modeling Studio

Back from my not-so-brief blogging vacation!

If the Modeling Studio just appears to “hang” when loading reports, it may have to do with Excel’s Compatibility Checker. This just happened to poor Amy, and it’s worth remembering for everyone using Modeling Studio on your projects. (This is all of you, right?) Here's what's going on.

By now you’ve probably gotten used to seeing (and probably ignoring) this warning dialog when you save an Excel workbook:

Sometimes this can be useful, if you're using some advanced feature of Excel 2007 that just won't work in previous versions of Excel. However, most of the time it means that you've used formatting or colors that may show up differently on your customer's machine.

The kicker about this dialog is that it won't show you exactly where the problems are. It must know, right? I mean it was somehow able to count 28 instances. "Ha ha, we know where all the problems are but we won't tell you!" Maybe this feature will come with a future Service Pack or something. But I digress...

The Modeling Studio also needs to save the Excel outputs workbook whenever it loads new reports. This happens when you either check Load Reports within the Simulation Run Control dialog, or when you define a link type as LoadReportsAndOpenExcel. The Modeling Studio engine uses the same functionality as you would by clicking the Save button, so if the compatibility checker dialog pops up for you, it’ll also pop up for the Modeling Studio.

Unfortunately, as reports are being loaded, Excel is invisible – so you might not even see that dialog pop up. This has the sad side effect of making it appear that your Modeling Studio is “hanging” and your reports never get loaded.

What’s happening is that the invisible Excel process is waiting for a response from you. If you ALT-Tab, you’ll be able to find this dialog, probably behind all of your other windows, and click Continue.

Obviously this is not ideal! But we haven’t yet figured out a way to always suppress this dialog in the code. If anyone knows of any tricks, please let our Modeling Studio team know. As I mentioned to Brian (remember him?), there are two choices here to solve the problem:

Find and fix the compatibility issue.
Disable the compatibility checker for this specific workbook.

You can disable the Compatibility Checker by clearing the checkbox in the dialog, then saving the workbook. If you ever want to run it again, you can do so by clicking the Office Ribbon icon (top left corner) / Prepare / Run Compatibility Checker.

Note that our customers would never see this issue, unless they are using Office 2007 as well.

Friday, May 9, 2008

Standard vs. Custom Modeling Studio

Sometimes weird bugs pop up in the Modeling Studio that just completely baffle us, like this one:

"Call was rejected by callee"? What the heck does that even mean? I didn't callee anything, I just clicked on this link here to open Excel.

When you contact your friendly neighborhood Modeling Studio Development Team for support, one of the first things we'll ask you is: "Is this a standard or a custom Modeling Studio?"

And you might ask me, "What's the difference?" So let's clear up any confusion.

Standard Modeling Studio

A “standard” Modeling Studio is a project that is based on the files located in the Project Template folder in the SDProjects SourceSafe database. This folder always contains the latest official release of the Modeling Studio. As of May 2008, the current version is 1.3. You can check this by clicking on the TranSystems logo in the bottom left corner of the Modeling Studio. An About box pops up where you can see the version number you're using.

A developer / project manager gets the latest version, uses the Modeling Studio Admin utility to configure the links to their specific Excel inputs and outputs workbooks, and specifies images unique to their customer. The Modeling Studio functionality itself runs without modification. This is the recommended (and fastest and easiest and most maintainable) way to set up new projects.

Custom Modeling Studio

Sometimes, you may need or want added functionality in your application. You may want to add code that automatically upgrades a scenario as your project versions change. You may want to add custom forms within the Modeling Studio user interface that store data in a database. You may want some unique project-specific code to run when the user clicks on a link. To do this, you'd need to create a custom Modeling Studio. Sometimes we call this a "Level 2" Modeling Studio for mysterious reasons which cannot be disclosed at this time.

The Modeling Studio was designed to give you this level of flexibility as well. A developer gets the latest version from SourceSafe, this time from the AAI.ModelingStudio/Modeling Studio Client Template folder, and opens it within Visual Studio. The developer adds in whatever special code they want, and recompiles the executable application within Visual Studio. The resulting executable is customized for the project, yet still contains the basic Modeling Studio functionality like scenario management and simulation run control. Examples of this are the OCD AutoVue project and the IBM / IRS project.

How do I decide on standard vs. custom for my project?

Basically, the default should be to always use a standard Modeling Studio if at all possible. There are several advantages to this:

It's the fastest way to set up a project.
You don't need Visual Studio on your computer.
You don't need to know anything about .NET programming.
You can upgrade your project quickly when a new Modeling Studio version comes out, just by copying new files into your project's System folder.
We can give you technical support more easily if a problem arises.
We can provide bug fixes or patches for your project as they become available.

But sometimes, there are reasons to build your own custom Modeling Studio. Common reasons include:

I don't like Excel launching as a separate application, I want my user interface within the Modeling Studio itself.
I want to store my data in a relational database and use forms and grid controls to access it.
I want to automatically upgrade my client's scenarios as I release a new version of my project, and I want to do this within the Scenario Manager when the user tries to open an older scenario.
I like to be different from others. (You know who I'm talking to!)

While these are good reasons, keep in mind they increase the burden on your project team, because you now have to provide technical support for the user interface part of your project as well. More code = more things that could go wrong.

As long as you've planned for both the development time and the support cost for this, a custom Modeling Studio can be a good way of setting up a user interface for your project. It was designed from the outset to support both styles of working.

Hope this helps clear up any confusion on the issue!

Thursday, February 28, 2008

Wise Tip #1

(as in Wise Installer of course)

I learned something new about relative paths in Wise and thought you might find it useful.

When you're working on an installation package in Wise Installer, it remembers the location of the source files you added, so that you can quickly recompile the package whenever one of the files changes. Usually convenient, but there are a couple of cases where this causes problems:

You branch the project to a new milestone and now it's in a folder named "Milestone2" instead of "Milestone1" (You are using milestones in your project, right?)
You're working with another developer who doesn't have the exact same folder structure as you.

One solution is that everyone working on the project should have identical folder structures. In general, this is a good idea and can eliminate confusion on the development team. Our SourceSafe databases are mostly structured in a Client Name/Project Name format that seems to have worked well for us over the years, and a lot of us have set up our systems to use D:/Projects/Client Name/Project Name.

But sometimes this isn't practical. For instance, some folks don't have a "D:\" partition for project files, so they have to use "C:\Projects" instead.

The solution

Wise has another solution for us: Select Tools menu > Convert Source Paths. This lets you clean up the paths to all of the source files you added.

The trick is that you can define all paths to source files relative to the location of the WSI. So the best option is to put the .WSI in the root folder of your project. That way, all of your project folders are automatically relative, e.g. .\Model, .\Storage, .\System, .\Template, .\Working.
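
Wise does this resolution itself, but the idea is easy to see in a few lines of Java. This is only an illustration of why relative paths survive a move: the class RelativeSourcePaths, the resolve() helper, and the file name model.alp are all hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; this is not how Wise is implemented.
class RelativeSourcePaths {
    // Resolve a stored relative path against the folder that holds the installer file.
    static Path resolve(Path installerFolder, String storedRelativePath) {
        return installerFolder.resolve(storedRelativePath).normalize();
    }

    public static void main(String[] args) {
        String stored = "./Model/model.alp"; // hypothetical source file, stored relative to the .WSI
        Path milestone1 = Paths.get("Projects", "Client Name", "Project Name", "Milestone1");
        Path milestone2 = Paths.get("Projects", "Client Name", "Project Name", "Milestone2");

        // The same stored string finds the right file under either milestone folder.
        System.out.println(resolve(milestone1, stored));
        System.out.println(resolve(milestone2, stored));
    }
}
```

The stored string never mentions a drive letter or a milestone name, so neither a branch nor a different developer's folder layout invalidates it.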

As you branch to new milestones, the .WSI comes with you. You're probably updating the version number or adding new files anyway, so modifications to the .WSI would also be expected in the new milestone folder.

Hope this works for you!

Thursday, February 14, 2008

YAGNI

"You Aren't Gonna Need It."

We don't use this term much in our offices, but maybe we should.

YAGNI comes from the Agile / Extreme Programming (XP) development mindset. The idea is that for every feature request or "cool idea" that you or your manager or your customer might come up with, you ask yourself:

Do I really really need this now?
If you're sure you really really need this right now, then do it.
If you're considering it because you think there's a chance you might need it later, don't do it!

Sounds simple, right? Don't build code that you don't need right now. Or more accurately -- don't build code that does not serve to satisfy your customer's need right now.

But how often are we tempted to add in "cool ideas" to our models and projects? Or how often do we come up with a super-abstract "generic" version of something that we might need one day for another scenario?

Perhaps we add in a lot of output statistics to our model that we may think somebody will be interested in seeing one day. Or we quest for the ultimate generic reusable flexible abstract module for X (sorters, process flow, trains moving, etc. etc. etc.) that we know we're going to need for some future project that we may sell someday.

That's what YAGNI is for. It trains you to resist this temptation. Focus on the code that you need right now. Focus on the deliverable that you promised to your customer right now. Everything else can likely be put off until a later milestone - if at all.

Do you really really need this now?

Originally this principle was targeted at programmers who would try to put their computer science education to work unnecessarily -- designing complex object hierarchies, layers of abstraction, "generic" versions of functions that would work with future to-be-determined classes.

For example, Ron Jeffries, one of the founding fathers of XP, writes:

You find that you need a getter for some instance variable. Fine, write it. Don’t write the setter because "we’re going to need it". Don’t write getters for other instance variables because "we’re going to need them".

Sometimes I am in the habit of writing getters and setters for my private member variables in my AnyLogic Java classes. But why am I spending the time to do this? Because I might need it someday?

YAGNI.

Balance

YAGNI is different from "don't design anything" or blindly ignoring all feature requests from your project manager / creative team / customer. You definitely want to make sure you're looking at the big picture, knowing the end goal of your simulation model or software project.

There are other concerns you need to balance where you may choose to write code that initially seems unneeded: things like refactoring for readability, or designing for upcoming (known) milestones. And it's certainly good practice to design (and comment, and code) as if someone else might reuse your code someday. Reusability is good in this sense.

For example, this article talks about a developer who implemented serialization of his objects to flat files, because it was the fastest and easiest approach to get the milestone done. The problem was that in the very next milestone, these objects were supposed to be serialized to a database, and the mechanisms to write to that database had already been developed. By focusing blindly on what he considered to be the simplest way to solve the problem, he created additional work for the rest of the team down the road to rewrite his ostensibly "simpler" version.

That's Bad YAGNI.

The Big Bad COM Scenario Manager

On one of the intermediate releases of the Excel UI, Jonathan (whom some of you may remember) and I developed a super-abstract version of the Data Set Manager.

It had COM interface classes neatly separated from its implementation classes so that we could re-implement data set management within a different application if we ever wanted to. At the time we had been experimenting with Microsoft Access-based solutions, and we thought we might need to migrate our data set management utilities into Access VBA.

It had custom type-safe collection classes instead of native data structures so that we could use our DataSet class in other projects if we ever wanted to. At its core, a DataSet is just a zipped up collection of files, something we'd probably want to manage in other projects.

It separated the zip utility library functions from scenario management functions so that we could use the zip library independently in other applications if we ever wanted to.

It was a textbook approach toward good COM development techniques.

It was also totally unreadable and unmaintainable. Completely over-designed. When a bug popped up, it could take hours just to trace through all of the layers of abstraction to figure out what was going on. If I ever asked our dev team to investigate an open bug in the Scenario Manager, they'd immediately find 5 excuses as to why they were too busy with other project work.

Last year I finally bit the bullet and refactored this back into a single ScenarioManager assembly in the current Modeling Studio. We got rid of about 15 classes. Reduced it from 2 DLLs to 1. And you know what? You can read the thing now. It could be even cleaner, but it's a lot better than where it was.

You guys would still probably run scared if asked to debug it... but let me assure you that's for legacy reasons.

"Wouldn't it be cool if...?"

We're all creative folks. We have lots of cool ideas about how to make our models and applications better. Now how would you feel if your cool ideas were never ever used by a user, such that building them was actually a total waste of time?

Like this guy said:

The thought that you're (sic) really cool feature idea might be a complete waste of time takes some of the wind out of your sails, doesn't it? I know it does mine. Now imagine that you apply YAGNI to every aspiring new feature? Yeah, sort of depressing.

A recent study showed that 64% of software features are rarely or never used. (Hello, Outlook "Journal"!) That's not just some of your cool ideas -- that's most of them.

So why invest the time to write it, if you don't need it now and it may not be used anyway?

Conclusion

Geoff Skipton was the one who got me thinking about this the other day in New Jersey. In describing the temptation of feature creep, he used this analogy:

Say you're modeling a person walking from point A to point B. You could simply model that person moving between A and B.

Or you could consider the fact that there might be a puddle in between A and B, so the person needs to walk around the puddle, and they might trip in doing it so we need to capture the probability of tripping in that situation, and what if they need to stop and tie their shoe along the way, we might need to model that.....

YAGNI.

Finally, Paul B. MacCready, inventor of human-powered and solar-powered aircraft, sums it up well, though he wasn't talking specifically about software at the time:

Treat every problem as if it can be solved with ridiculous simplicity. The time you save on the 98% of problems for which this is true, will give you ridiculous resources to apply to the other 2%.

"stop yagni" sign borrowed from
http://www.rimmkaufman.com/rkgblog/2007/10/16/rule-of-three/

Friday, January 18, 2008

Resource Utilization and AnyLogic and You

A weird thing just happened to me when calculating resource utilization. Thought it was worth sharing. Our AnyLogic gurus probably already know this, but I didn't....

AnyLogic 6 will automatically collect statistics for you on your resources. Each ResourcePool object has an "enableStats" checkbox. Checking this will create an internal statsUtilization object, which is of type StatisticsContinuous.

The continuous statistics are used for things like queue utilization or resource utilization, where the value persists in continuous time but only changes at discrete time moments (like an entity entering or exiting the queue, or a resource turning on or off). This works like a "DSTAT" in Arena.

You can then call myResource.statsUtilization.mean() to return the average utilization over time.

But! You'd better double-check to make sure that you get the value you expect!

Let's do an example.

For instance, let's say that the total run length is 100 minutes. The resource turns on at time 30, and turns off at time 50, for a total of 20 minutes of use. The expected utilization is 20 / 100 = 20% busy. This is illustrated in the diagram below.

|--- total run length ------------|

|-------[ busy ]------------------|

But lo and behold, if you run this exact setup, you will see that statsUtilization.mean() = 40%, twice as high as you expected! Why is that?

It turns out that the statsUtilization is only updated when the statistics change. That's shown by the width of the |---stats---| below. The statistics were collected over a shorter period of time than I expected. In our example, the last time the statistics changed was at time 50, when the resource was released. So the utilization reports 20 / 50 = 40% busy.

|--- total run length ------------|

|-------[ busy ]------------------|

|--- stats ----|
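
To make the arithmetic concrete, here is a minimal sketch of a time-weighted statistic that behaves the way described above. It is a guess at the behavior, not AnyLogic's actual StatisticsContinuous code; the class TimeWeightedStat and its update() method are hypothetical.

```java
// Hypothetical stand-in for a continuous-time statistic; not AnyLogic's implementation.
class TimeWeightedStat {
    private double lastValue;   // value held since lastTime
    private double lastTime;    // time of the most recent update
    private double weightedSum; // integral of the value from time 0 to lastTime

    // Record that the value changed to `value` at simulation time `time`.
    void update(double value, double time) {
        weightedSum += lastValue * (time - lastTime);
        lastValue = value;
        lastTime = time;
    }

    // The mean only covers the period up to the last update.
    double mean() {
        return lastTime > 0 ? weightedSum / lastTime : 0.0;
    }

    public static void main(String[] args) {
        TimeWeightedStat utilization = new TimeWeightedStat();
        utilization.update(1.0, 30.0);  // resource turns on at time 30
        utilization.update(0.0, 50.0);  // resource turns off at time 50
        System.out.println(utilization.mean()); // prints 0.4 (20 / 50)

        utilization.update(0.0, 100.0); // one more update at the end of the run
        System.out.println(utilization.mean()); // prints 0.2 (20 / 100)
    }
}
```

In this sketch, the fix is a final update at the end-of-run time before reading the mean. Whether AnyLogic offers an equivalent call is something to check in its documentation.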

Now, it's possible that AnyLogic has some internal logic that will correctly calculate these statistics at the end of the run, by updating the denominator to reflect the total run time. However, I was calling a function writeStatistics() within my fModelComplete() function that writes out statsUtilization.mean() to the summary report. At the time I called the method, the statsUtilization had not updated the denominator.

So until I figure out where's the proper place to call my writeStatistics() function, I'm resorting to collecting the utilization the old-fashioned way: total busy time / total run time.
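
For reference, the old-fashioned way might look something like this sketch. It assumes a resource with a single unit that is either busy or idle; the class BusyTimeTracker and its seize(), release(), and utilization() methods are hypothetical names, not AnyLogic API.

```java
// Hypothetical helper for a single-unit resource; not part of AnyLogic.
class BusyTimeTracker {
    private double busySince = -1.0; // start of the current busy period, or -1 when idle
    private double totalBusy;        // busy time accumulated from completed periods

    void seize(double now) {
        busySince = now;
    }

    void release(double now) {
        totalBusy += now - busySince;
        busySince = -1.0;
    }

    // Total busy time divided by total run time, evaluated at time `now`.
    double utilization(double now) {
        double busy = totalBusy + (busySince >= 0 ? now - busySince : 0.0);
        return now > 0 ? busy / now : 0.0;
    }

    public static void main(String[] args) {
        BusyTimeTracker tracker = new BusyTimeTracker();
        tracker.seize(30.0);
        tracker.release(50.0);
        System.out.println(tracker.utilization(100.0)); // prints 0.2
    }
}
```

Because the denominator is passed in explicitly, it does not matter when the last state change happened.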

Point of all this: Double check your outputs!
