
While the misinformation being tossed around in cyberspace regarding LOTW is almost too wild to answer, let me try and respond to some of the more relevant issues.

Currently, LOTW is up and running as of 7:30 am EST with a processing queue of approximately 48 hours.

What happened, despite what others may think, involved human error in not recognizing that the space allocated to the LOTW database was filling up. There were monitors on LOTW admin screens within the system but no one recognized the impending condition. This has been rectified and automated alarms have been added to the database. If we had recognized it earlier, expanding the storage to currently available space, it would have been done with no need to take the system down. And as has already been noted, there was no data lost. The system does, given the size of the database, take time to march through the recovery process and that's what was happening over the past 3 days.

In the 2012 Plan (Capital Expenditure section) we recognized the need for more storage and got approval to add an additional 2.8 terabytes at a cost of $25,000 to allocate to various HQ functions including LOTW. This was in place.

And to respond to a couple of other questions, the system is built on a commercial database product from SAP. Of course there is a lot of custom programming for the actual LOTW functions as there is no product waiting to be pulled off the shelf for this application. In addition, there is unique, custom programming to connect the LOTW software to the SAP database.

The Chairman has assured me that this issue will be discussed further with the Administration and Finance Committee at their meeting Saturday, November 17th.

73,

Barry J. Shelley, N1VXY
Chief Financial Officer
ARRL, Inc.
The National Association for Amateur Radio
(860) 594-0212
www.arrl.org
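The automated alarm described in the message above can be pictured as a simple threshold check on storage usage. The following is only a minimal sketch under stated assumptions: the function names, the 80% and 90% thresholds, and the use of a filesystem path are all hypothetical, and nothing here reflects how the actual LOTW database or its SAP storage is monitored.

```python
import shutil


def classify_usage(used, total, warn_fraction=0.80, critical_fraction=0.90):
    """Return "ok", "warning", or "critical" for a used/total pair.

    The thresholds are illustrative assumptions, not values from ARRL.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    fraction = used / total
    if fraction >= critical_fraction:
        return "critical"
    if fraction >= warn_fraction:
        return "warning"
    return "ok"


def check_storage(path, warn_fraction=0.80, critical_fraction=0.90):
    """Classify the filesystem holding `path` (a hypothetical data directory)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return classify_usage(usage.used, usage.total, warn_fraction, critical_fraction)


if __name__ == "__main__":
    # Run from a scheduler and alert on anything other than "ok",
    # so that nobody has to be watching an admin screen.
    print(check_storage("."))
```

The point of the sketch is the design choice rather than the code: a check that runs on a schedule and raises an alert removes the dependence on someone noticing a monitor in time.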

Barry,

Thanks for the update and further explanation. Without getting into the "whys" or "hows" of the actual situation, I have one question regarding its management.

My question is why did we not send out a broadcast notice that described the situation and what was being done to rectify it as soon as the problem was recognized? Doing this would most certainly have headed off much of the ". . . misinformation being tossed around in cyberspace . . ." As the old adage says, "An ounce of prevention is worth a pound of cure."

FYI, after the situation had been described to Bernie, as I recall, I sent a message to the Great Lakes Division to let members know what had happened, and more importantly, what had not happened (i.e., that the misinformation was not correct). A number of members replied. While a number of these were displeased over what they viewed as the failure by IT to perform a basic function, they were as pleased that finally someone told them what was going on. Conversely, several of them expressed considerable displeasure that by its inaction, HQ "once again" demonstrated (to paraphrase) "it is a self-serving operation that neither respects nor appreciates the members."

My major concern is that just about the time "we" seem to recover from a problem that became inflated by insufficient communication (and "our" status is nearly repaired with the members), "we" drop the ball once again and undermine "our" prior good work by failing to acknowledge a situation exists or in delaying the acknowledgement until after the ARRL-naysayers have had free rein to establish the tone.

Jim

Jim Weaver, K8JE
Director, Great Lakes Division
5065 Bethany Rd.
Mason, OH 45040
Tel. 513-459-1661; e-mail K8JE@arrl.org
ARRL: The reason Amateur Radio Is
Members: The reason ARRL is
_____
From: arrl-odv-bounces@reflector.arrl.org [mailto:arrl-odv-bounces@reflector.arrl.org] On Behalf Of Shelley, Barry, N1VXY
Sent: 09 November, 2012 2:35 PM
To: arrl-odv
Subject: [arrl-odv:21203] LOTW Update

Jim,

Greetings from the airport in Ho Chi Minh City where I am sitting with Rod Stafford awaiting our flight to Hong Kong.

I agree with you that information was too slow to be made available. I became aware of the problem on Tuesday morning here, but of course it was Monday night in Connecticut and we do not staff for 24/7 IT coverage. Of course it took a while for an assessment of the situation on Tuesday and IT's first priority was to fix it, but we should have had an explanation posted sooner. I have made that point to staff.

LOTW is to some extent a victim of its own success. It was designed originally to provide a faster, cheaper, and less labor-intensive alternative to paper QSLs. It has accomplished that and more, at least with respect to the awards programs it supports. But while we certainly don't want to have the system unavailable for days at a time, it was not designed to be a real-time confirmation system. One of the key design objectives was to preserve the integrity of the DXCC program. Achieving that required more complexity than, say, Club Log.

They're about to call our flight so let me close with one thought. Many of our members are passionate about their particular interest within Amateur Radio. Sometimes that leads them to lose perspective, especially if other aspects of their lives are not going as well as they would like. One of the reasons people have hobbies is to be able to retreat from other frustrations in life, which causes problems such as this to become magnified. But it's when our members stop being passionate that we should really worry.

73,
Dave K1ZZ

From: arrl-odv-bounces@reflector.arrl.org on behalf of Jim Weaver K8JE
Sent: Fri 11/9/2012 8:34 PM
To: Shelley, Barry, N1VXY; arrl-odv
Subject: [arrl-odv:21204] Re: LOTW Update

My IT experience was with the network of a large utility. There were applications in use which contained millions and millions of records. They ran 24/7 with the exception of downtime for nightly back-ups. None of them took more than several minutes to restore service to the users after those back-ups.

Below is a quote of mail received from one of my members who has made a very good living in the field. I believe that he knows of what he speaks.

73,
Bob Vallio -- W6RGG

------------------------------------

ARRL promotes LoTW as a "service product"; a "feature", something they are proud of, something they offer to the amateur community, and something that is supposedly a benefit of ARRL membership (e.g., the ability to get rewards without handling paper QSLs). It is NOT like many other amateur radio computer products in the realm of "I built this for my own use. You are welcome to use it, but no guarantees."

I have designed and built many PRODUCTS, including hardware and software. Some of them have shipped in quantities of tens-of-millions. I don't ship any "works in progress." That doesn't mean that my products have never had problems that needed fixing, only that they were thoroughly tested BEFORE release, and subjected to extensive design review BEFORE even being built. They are the result of my best tradeoffs among features, cost, reliability, and time-to-market. I put them out there, and am willing to be measured by their success or failure.

It is fairly clear, both from discussions I have had with ARRL and others, and from experience, that LoTW began as a "weekend project" by some ambitious programmer, and that it caught on and grew. In fact, it grew way beyond the capabilities of the original "design." (I believe a lot of the problems with LoTW have to do with the choice of database, and to a certain extent with the amount of resources that ARRL is willing to devote to the project.)

Good designers build prototypes, and test them extensively. Then they start from scratch to build a PRODUCT from what was learned in the prototype stage. LoTW appears to be a "prototype" that was "shipped". Of course, there will be problems. However, ARRL has indicated many times that they are either oblivious to them (e.g., log processing delays: "What problem?") or they consider turnaround times of days or weeks to be an acceptable design decision. (In contrast, log processing response times on both eQSL and Clublog are nearly instantaneous, even after a big contest weekend.)

They also don't consider it important to provide any feedback to their user community on the progress of repairs, or the progress of log processing. (A simple indicator of the number of logs in the queue, or the expected processing time, would alleviate a lot of angst.) A message that "The system is down. We're working on it." is not very reassuring. Neither is the cryptic error message from their database engine that occasionally comes up when you try to log into the system. (Sort of like hitting the UP elevator button from the lobby of a beautiful hotel, and instead getting dumped in the basement with the steam pipes.)

Yes, there may be times when the system must be brought down for major reconfiguration. But any reasonable service provider would inform their user community of such an event BEFORE it happened, and not respond only after users complain about the disappearance of the service.

Now, if the problems were the result of something totally out of ARRL's control, then all bets are off. For example, the IEEE last week had major problems with their e-mail forwarding system, due to outages caused by Sandy. Other than not incorporating a hot standby backup system, that is not a problem with the IEEE's system design or operation. But many of us believe that LoTW is in need of a re-DESIGN, and that is from a rather knowledgeable user community. So far, ARRL has been non-responsive.

That said, I agree that *anything* is better than nothing. But as long as ARRL touts their system as a "real" service, then they deserve the criticism that goes with that. If they want to call it an "extended beta test," that's fine too, and the expectations will be much lower. They have not chosen to do that, though.
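The "simple indicator" suggested in the quoted mail (logs in the queue, or expected processing time) comes down to one division. The sketch below is purely illustrative: the function names and the sample figures (4800 queued logs, 100 logs per hour) are hypothetical and are not taken from LoTW; they were chosen only so that the result matches the roughly 48-hour queue mentioned earlier in the thread.

```python
from datetime import timedelta


def estimate_wait(logs_in_queue, logs_per_hour):
    """Estimate time to drain the queue at a steady processing rate."""
    if logs_per_hour <= 0:
        raise ValueError("logs_per_hour must be positive")
    return timedelta(hours=logs_in_queue / logs_per_hour)


def status_line(logs_in_queue, logs_per_hour):
    """Format a one-line status message suitable for a public status page."""
    hours = estimate_wait(logs_in_queue, logs_per_hour).total_seconds() / 3600
    return (
        f"{logs_in_queue} logs queued; "
        f"estimated processing time about {hours:.1f} hours"
    )


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    print(status_line(4800, 100))
```

A steady processing rate is itself an assumption; a real indicator would more likely use a recent measured throughput.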
participants (4)
- Bob Vallio
- Jim Weaver K8JE
- Shelley, Barry, N1VXY
- Sumner, Dave, K1ZZ