≡ Menu

How To Compare and Sort Revision IDs

Share the knowledge

One piece of code I find myself writing over and over is a function that compares two revision IDs to decide which one is the higher rev. The revision sequence we use is too complicated for a standard lexicographical comparison to work. I’ve found several poor ways of implementing it. I think I’ve finally found a decent solution. Perhaps it’s something that will help some of you.

When Do I need to Compare and Sort?

Here are a couple of examples of where I’ve had to do a comparison of revision IDs.

1. Checking that revisions are created in order

Our rules for CAD models allows revisions to be skipped — typically when multiple RNs are being incorporated at once. What we don’t allow is for users to go back and create a lower rev than the latest. So, A → C → E is fine, skipping revs B and D, but A → C → B is not okay. Once C has been created the users can’t go back and create rev B.

We prevent this with a custom pre-condition on item rev creation that takes the existing rev IDs, sorts them in order, and then verifies that the proposed rev ID for the newly created revision is greater than the last revision in the sorted list.

[box type=”note”]
It’s true that there is a ITEM_ask_latest_rev() function, but that looks at the creation date to determine what the latest rev is and I don’t entirely trust creation dates. Revs could have been created out of order before we implemented the pre-condition that checks for that or during a poorly handled data migration.

2. During data migrations

The other common reason this comes up is data migrations. When migrating data into TC I want to check if each component I’m importing is newer than what’s currently in Teamcenter or not. If it is, I want to migrate it, if it isn’t, I don’t want to migrate it and I want the migrated assembly to use the latest rev that’s already in TC.

What’s So Hard About that?

It’s difficult because of the revision sequence we’re using. If your sequence is simple then you may not have any problems.

The revision sequence we use is,

  1. Numeric revs (01–99) for preliminary work
  2. Revision “dash” — a literal “-” character, for the initial release.
  3. Single character Alpha revisions, A–Y, for approved changes (note that “Z” is an illegal character for revisions)
  4. Two digit Alpha revisions, AA–YY, for when we run out of single digit alpha revisions

If you did a simple text based comparison of the revision IDs it would almost work, but not quite. Rev “-” compares as less than both the numeric and the alpha revisions, and the one and two digit alpha revs don’t compare correctly:

revs = ["-", "91", "01", "AA", "B"]
sorted(revs) == ["01", "99", "-", "B", "AA"]
# Correct sorting

But what we get is,

sorted(revs) == ["-", "01", "99", "AA", "B"]
# Incorrect sorting
# The ASCII value for '-' is 45 while the ASCII value or '0' is 48
# so '-' comes before '01'.
# "AA" comes before "B", alphabetically.

On top of that, legacy alpha revisions were zero-padded to always have two characters, “0A” instead of “A”. That causes more problems. And then on top of that rev “-” used to be entered into Teamcenter (and nowhere else) as “00” (please don’t ask why).

So the correct sorting would be,

revs = ["01", "99", "00", "-", "0A", "A", "0B", "B", "Y", "AA", "YY"]
sorted(revs) == ["01", "99", "00", "-", "0A", "A", "0B", "B", "Y", "AA", "YY"]
# Correct sorting

But instead we would get,

sorted(revs) == ['-', '00', '01', '0A', '0B', '99', 'A', 'AA', 'B', 'Y', 'YY']
# Naive and incorrect sorting.

A Revision Comparison and Sorting Recipe

The process I follow can be called, normalize, decorate, sort

1. Normalize

The first step is to get a normalized version of the rev IDs:

string normalize(const string &rev_id);
// normalize("0A") == "A"
// normalize("00") == "-"
// normalize("A") == "A" // unchanged by the normalization

This function converts rev IDs to a common format for the comparisons so we don’t have to insert all sorts of special handling code into our actual comparisons. I’ll leave it to you to figure out the implementation (it’s not terribly interesting — you’ll probably end up using isalpha() and isdigit() a lot).

2. Decorate each rev ID with its rev type

First, define an enum that lists the different types of rev IDs in order:

typedef enum
    ALPHA, // one digit alpha, A, B, C, etc.
    DOUBLEALPHA // two digit alpha, AA, AB, etc.
} rev_type_t;

Next, define a function that looks at a rev ID and returns its type:

rev_type_t get_rev_type(const string &normalized_rev);
// get_rev_type("01") == NUMERIC
// get_rev_type("-") == DASH
// get_rev_type("A") == ALPHA
// get_rev_type("AA") == DOUBLEALPHA

Again, I’ll leave the implementation as an exercise.

Finally, combine (decorate) each rev ID with its type:

#include <utility> // std::pair<>, std::make_pair()
std::pair<rev_type_t, string> decorate_revid(const string &rev_id)
    const string normalized_rev = normalize(rev_id);
    const rev_type_t rev_type = get_rev_type(normalized_rev);
    return std::make_pair(rev_type, normalized_rev);
// decorate_revid("01") == pair(NUMERIC, "01")
// decorate_revid("-")  == pair(DASH, "-")
// decorate_revid("00") == pair(DASH, "-")  // "00" normalized to "-"
// decorate_revid("A")  == pair(ALPHA, "A")
// decorate_revid("0B") == pair(ALPHA, "B") // "0B" normalized to "B"
// decorate_revid("AA") == pair(DOUBLEALPHA, "AA)

3. Compare and Sort

Now that you’ve decorated the rev IDs you can compare them reliably. Instead of comparing solely by the revision IDs themselves, or even their normalized forms, we’re now comparing the objects of type pair<rev_type_t, string>. The first element of each pair, rev_type_t will ensure that the relative classes of revision IDs compare correctly relative to each other. All Numerics will be less than all Dashes which will be less than all single-digit alphas which will be less than all two-digit alphas. And the normalized revision ID in the second element will ensure that within each type the rev IDs sort correctly.

// given two rev IDs return the greater.
// If they are equivalent, return the first argument
// (preserve weak ordering)
//     max_revid("0A", "A") == "0A"
//     max_revid("A", "0A") == "A"
string max_revid(const string &first_rev, const string &second_rev)
    if( decorate_revid(first_rev) >= decorate_revid(second_rev) )
        return first_rev;
    return second_rev;

And once you can compare them, you can sort them:

#import <algorithm> // std::sort()
#import <vector>
// A comparison function for use by std::sort(). 
// Returns true if the first rev should be sorted before the second, 
//   false otherwise
bool compare_revids(const string &first_rev, const string &second_rev)
    return( max_revid(first_rev, second_rev) == second_rev );
string unsorted_revids[] = {"0A", "99", "AA", "A", "-", "B", "00", "01"}
vector<string> revids(unsorted_revids, unsorted_revids + 8)
std::sort(revids.begin(), revids.end(), compare_revids);
// revids now sorted: ["01", "99", "-", "00", "0A", "A", "B", "AA"]


Now that I have max_revid() and compare_revid() I can do implement my revision ordering pre-condition easily:

  1. Get all current revision IDs for the item
  2. Sort them in order using std::sort() with compare_revids() as the compare function
  3. Compare the last revision in the sorted list of rev IDs to the rev ID for the new revision using max_revid(). If the new rev ID isn’t greater than the current rev, return an error code.
  • Don Knab

    Looks great Scott. Have you ever considered normalizing the odd Rev ID’s already in Teamcenter (“00″, and 0A” revs) …a database cleanup sort of thing?

    • http://plmdojo.com/ Scott Pigman

      Thanks, Don. Yes, I am looking at doing a database cleanup. I need to run some test runs first. We did a similar cleanup earlier to fix item IDs and discovered some problems exporting assemblies that contained renumbered components if the assemblies hadn’t been re-saved since the renumbering. So we want to check if we’ll run into the same sort of issue after normalizing rev IDs.

      • Don Knab

        Agreed. I can think of one issue we often had trouble with when futzing with names outside of the control of the TC/NX interface….Upper/Lower case conflicts!

  • Pingback: Ninja Tricks for Migrating NX CAD Data – The PLM Dojo()

Optimization WordPress Plugins & Solutions by W3 EDGE