Dan Fisher

With over 35 years in the financial industry, Dan M. Fisher has proven himself a leader in the field. His past roles include director of the Federal Reserve Bank of Minneapolis and Chairman of the ABA Payment Committee.

Surprise! You didn’t know your own weak points

It is amazing how often unexpected things happen within the technology infrastructure. Sometimes they result in an event that takes the whole shebang (data center, critical function, or application) offline for an extended period. As the staff scrambles to restore order, you ponder what occurred and why, and you look for causes and people to blame. The cause is soon identified as a single point of failure. You are surprised and angry that there was a service interruption. But you are also irked: you know that if you had known about that weak spot, the situation would have been entirely preventable. You, as the technology executive, CIO, or IT manager, are the one person everyone depends on. You are never thanked when it works, and you are always called when it doesn’t. So, drawing on some experiences of my own, here are some hints that could save you from your own weak points:

Routers, switches, and ATMs

When the power blinks or is interrupted, a router, communications switch, or ATM will go offline. Many of these devices will not restore themselves after tripping offline. Consequently, the branch location, office, or ATM will sit there, idle, until the device is rebooted or reset.

To keep a blink or short power outage from causing that undesirable outcome, a small UPS will suffice. (That’s short for uninterruptible power supply, and you should get one with a duration rating of four hours.) Just be sure that when you install the UPS, you plug the device into the outlet labeled “battery and surge” protection.

Note: Every router, switch, and ATM should have a UPS battery back-up, along with a schedule for replacing the battery.
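As a rough illustration of the note above, here is a minimal sketch of a check that flags devices with no UPS or with a battery that is due for replacement. The device names, the `Device` record, and the three-year replacement interval are all assumptions made for the example, not figures from this article. Use the interval your UPS vendor recommends.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumed replacement interval for illustration only; check your vendor's guidance.
BATTERY_LIFE = timedelta(days=3 * 365)


@dataclass
class Device:
    name: str                          # hypothetical device label
    kind: str                          # "router", "switch", or "ATM"
    battery_installed: Optional[date]  # None means the device has no UPS


def needs_attention(devices, today):
    """Return (device name, reason) pairs for devices that break the rule."""
    issues = []
    for device in devices:
        if device.battery_installed is None:
            issues.append((device.name, "no UPS"))
        elif today - device.battery_installed >= BATTERY_LIFE:
            issues.append((device.name, "battery due for replacement"))
    return issues


if __name__ == "__main__":
    inventory = [
        Device("branch-router", "router", date(2020, 1, 15)),
        Device("lobby-atm", "ATM", None),
        Device("core-switch", "switch", date(2023, 3, 1)),
    ]
    for name, reason in needs_attention(inventory, today=date(2024, 6, 1)):
        print(f"{name}: {reason}")
```

Even a spreadsheet can do the same job. What matters is that every device appears on the list and that each battery has a date attached to it.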

Virtualized servers and services

Using a server for multiple applications is a cost-effective tool. Too often, however, the concept of virtualization is misunderstood.

Having multiple servers and virtualized applications does not translate to easy restoration if the hardware fails.

More importantly, systems that are configured to fail over automatically may not be able to handle the increased volume.

Just to make sure, use the Wombat’s Rule of 300!

Here is how it works. Add up the total computing demand of all of the applications you are running. Now multiply that by 300%. Each server should have that capacity, so that if any server fails, any one of the remaining servers can still manage the load and give you time to find a replacement without creating a cascading failure.

This situation should be monitored constantly as applications grow. A virtual failover event can create a catastrophic, cascading incident that takes out all of your servers when the remaining server is unable to handle the load.
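The Rule of 300 arithmetic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the application names, server names, and the capacity unit are hypothetical, and demand and capacity are assumed to be measured in the same unit (CPU cores or memory in GB, for example).

```python
from dataclasses import dataclass

# "Multiply that by 300%", as the rule is stated in this article.
RULE_OF_300_FACTOR = 3.0


@dataclass
class Server:
    name: str        # hypothetical host name
    capacity: float  # same (assumed) unit as the application demand figures


def required_capacity(app_demands):
    """Total demand of all applications, multiplied by 300%."""
    return sum(app_demands.values()) * RULE_OF_300_FACTOR


def undersized_servers(servers, app_demands):
    """Names of servers that could not carry the full load under the rule."""
    needed = required_capacity(app_demands)
    return [server.name for server in servers if server.capacity < needed]


if __name__ == "__main__":
    # Hypothetical demand figures, for illustration only.
    demands = {"core-banking": 40.0, "online-banking": 25.0, "email": 10.0}
    servers = [Server("host-a", 256.0), Server("host-b", 200.0)]

    print(required_capacity(demands))            # 225.0
    print(undersized_servers(servers, demands))  # ['host-b']
```

Rerunning a check like this whenever an application is added or grows is one way to keep the monitoring constant rather than occasional.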

Simple solutions

Finally, almost every major event can be avoided by implementing a simple solution. It is just a matter of talking about it with your staff.

Start by being philosophical:

How can our system go down?

Let me count the ways!

In other words, knowing what you don’t know!

—The Wombat!

