How will combining duplicates work?

This is a question concerning the chain of events that occurs when a TD suspects a duplicate identity for a player, and calls attention to the duplicate while submitting a crosstable on-line for rating.

I suspect, but am not sure, that the chain of events is something like this:


  1. The TD marks a suspected duplicate on the crosstable, giving both ID numbers.

  2. This triggers an investigation by the office.

  3. If the office decides there is, in fact, a duplicate identity, one of the records is marked “duplicate – see XXXXXXXX”, and the other one becomes the “correct” record to use.

  4. The crosstable submitted by the TD is changed to the “correct” ID, if it is not already listed that way.

  5. The next time there is a re-rate, all crosstables containing the duplicate ID are changed to the correct ID, and the ratings are computed accordingly.

  6. Eventually, the duplicate record is deleted, or left in as a “dead” record, but is never again used (or permitted) in any future crosstables.


I’m pretty sure about steps 1 through 3, but how automatic is step 4? Does the software do this automatically as a result of the office’s action in step 3? Or must the office do step 4 by hand? Or, is an email sent to the TD asking him to revise and re-submit the crosstable?

Same questions about step 5. Does the re-rate process automatically take care of this? If so, is the rating that originally appeared in the “dead” record changed to unrated, or is it made equal to that in the “correct” record, or is it left alone? (I guess the answer to this question is not important, if that rating is never used again anyway.)

The above steps, if I have the concept correct, should present no problems as long as the duplication (i.e. the divergence of the two records) occurred after the date the re-rate goes back to. What happens, though, if the duplication has existed for years, since well before the earliest date a re-rate can reach, and both records have ratings in them? It seems as though some kind of fudging would be necessary in this case.

Bill Smythe

Pretty decent analysis, Bill. Let’s bore the heck out of everyone else by taking it even further. :slight_smile:

There is no place in the rating report file to enter a duplicated ID. That is true both for events submitted online and for those submitted on diskette, which means that 80% or more of all rated games have no reporting mechanism for duplicate IDs.

Writing that information on the crosstable didn’t work well as a reporting mechanism under the old ratings system (those notes weren’t always seen), and it doesn’t work any better under the new system.

For events submitted online, the ‘USCF office action required’ membership exception request should provide a trackable way of reporting that information.

I don’t have a good solution for reporting duplicates on events mailed to the office, other than a BIG note somewhere that the membership department will see. If anyone has a better idea, speak up!

During validation, if an ID flagged as a duplicate is used, it is treated as an invalid ID which must be corrected before the event passes validation.
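To make that validation rule concrete, here is a minimal sketch in Python. It is an illustration only, not the actual ratings software: the function name `validate_ids`, the `duplicate_of` mapping, and the sample ID numbers are all made up for the example.

```python
# Hypothetical data: flagged duplicate ID -> the "correct" ID to use instead.
DUPLICATE_OF = {"12345678": "87654321"}
# Hypothetical set of every ID on file (duplicates are still on file, just flagged).
KNOWN_IDS = {"12345678", "87654321", "11112222"}


def validate_ids(crosstable_ids, known_ids, duplicate_of):
    """Return a list of (player_id, problem) pairs; an empty list means the IDs pass."""
    problems = []
    for pid in crosstable_ids:
        if pid in duplicate_of:
            # A flagged duplicate is rejected just like any other invalid ID,
            # but the message points at the record that should be used.
            problems.append((pid, f"duplicate - see {duplicate_of[pid]}"))
        elif pid not in known_ids:
            problems.append((pid, "unknown ID"))
    return problems


problems = validate_ids(["12345678", "11112222"], KNOWN_IDS, DUPLICATE_OF)
```

In this sketch the event fails validation whenever `problems` is non-empty, and whoever submitted it has to substitute the correct ID before trying again.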

We now have nearly 700 duplicate IDs flagged in the system. For events rated in 2004 and 2005, there were about 375 cases where a duplicate-flagged ID was used in a crosstable. Those have been recoded to the correct ID, though that doesn’t always resolve the issue if the duplicate ID had events from before then.

Duplicates reported after an event is rated will be treated like any other incorrect ID in a crosstable, which in many or most cases will trigger a rerating.

There is a small window of ambiguity here. An event can pass validation before the duplicate is flagged and be rated just afterwards with the duplicated ID still in use. Those cases need to be caught during the ratings process, but that’s a bit tricky to do without requiring a few other validation checks as well, so recoding them will probably become part of the work that precedes each rerate pass, along with resequencing the sections to be rerated into event-ending order.
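The pre-rerate housekeeping described above could look roughly like the following Python sketch. It assumes a simple in-memory `Section` record and a `duplicate_of` mapping, both invented for the example. The tie-break on section ID for sections ending the same day is also an assumption, not something the real system is known to do.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Section:
    section_id: str   # hypothetical identifier for a rated section
    end_date: date    # date the event ended
    player_ids: list  # IDs as they currently stand in the crosstable


def prepare_rerate(sections, duplicate_of):
    """Recode flagged duplicate IDs, then order sections by event-ending date."""
    recoded = [
        Section(
            s.section_id,
            s.end_date,
            # Swap any flagged duplicate for its correct ID; leave others alone.
            [duplicate_of.get(pid, pid) for pid in s.player_ids],
        )
        for s in sections
    ]
    # Event-ending order; section_id is only an assumed tie-breaker.
    return sorted(recoded, key=lambda s: (s.end_date, s.section_id))


queue = prepare_rerate(
    [
        Section("B", date(2005, 3, 6), ["12345678", "11112222"]),
        Section("A", date(2005, 2, 20), ["87654321"]),
    ],
    {"12345678": "87654321"},
)
```

Here `queue` comes back with section "A" first, and section "B" now lists the correct ID in place of the flagged duplicate.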

I’m still not sure what I want to do about the historical record for a duplicated ID, though I think I should have a solution for that based on how I’m handling ID changes reported after an event is rated. That’s a complicated issue, involving a number of security concerns regarding who is permitted to make changes and when. I need to track any changes made, who made them, and when, just in case someone tries to cause trouble by altering rated events.

I keep the original ID and original results even if either or both are changed, so that I always have a record of how the event stood when it was first rated.

If a duplicate is treated as the equivalent of an incorrect ID, then all I need to do is update the ‘latest rating’ ID field, leaving the historical ‘original’ ID intact. (I’m a data purist, I hate to throw ANY data away or change it irrevocably.)

And if that’s only done in the interval preceding a rerate, when no other changes to the live ratings system data will be permitted, then I shouldn’t have any conflicting update or semaphore issues.
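For anyone who wants to picture the "original" versus "latest rating" ID idea along with the change tracking, here is a small Python sketch. It is only an illustration under assumed names (`CrosstableEntry`, `recode`, `history`); the real tables and fields may look nothing like this.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CrosstableEntry:
    original_id: str   # ID as first rated; never overwritten
    latest_id: str = ""  # ID used for the latest rating pass
    history: list = field(default_factory=list)  # (old, new, who, when) tuples

    def __post_init__(self):
        # Until someone recodes the entry, the latest ID is the original one.
        if not self.latest_id:
            self.latest_id = self.original_id

    def recode(self, new_id, changed_by, when=None):
        """Point the entry at a different ID and log who changed it and when."""
        when = when or datetime.now(timezone.utc)
        self.history.append((self.latest_id, new_id, changed_by, when))
        self.latest_id = new_id


entry = CrosstableEntry("12345678")
entry.recode("87654321", changed_by="office")
```

After the call, `entry.original_id` is still "12345678", `entry.latest_id` is "87654321", and `entry.history` holds one record of the change, so nothing is thrown away.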