Data Center Management: Cleaning out the stove pipe
Updated: Jan 8, 2021
December is upon us and in the northern parts of the US it is the annual return to colder temperatures, hats and sweaters, snow tires, and cleaning out the stove pipes. Anyone familiar with burning wood is familiar with cleaning out the stove pipe so that waste can be efficiently carried away and there is much less risk of burning down the house.
Organizational stove pipes can be equally prone to causing problems and deserve at least as much attention. Stove pipe organizations largely run up through the organization's structure with little interaction before they reach the roof, or at least the corner offices of the C-level executives. This phenomenon is particularly challenging in data centers, where the delivery of IT services requires close integration in the design, deployment, and operation of the facility infrastructure, IT hardware, and software. These systems need to be planned and operated in concert, but they often sit under wholly separate management chains.
A commonly used buzz phrase today when selecting and deploying IT network and compute hardware is “Application-Centric”. As a model for operating a data center, this notion often falls short of its full usefulness. Being Application-centric means looking at every process in the data center in terms of how it impacts the ultimate mission of the data center.
An Application-centric approach drives an urgency for improved coordination and planning between IT and Facilities to assure the effective use of resources while maximizing the use of the company’s technology investments. The most dramatic examples of a lack of coordination are clients whose facilities equipment wastes space and energy because they use only 10-15% of the capacity it was designed for, and clients who have spent millions of dollars on new IT equipment that sits idle on their loading docks because there is no space, power, or cooling capacity left to support it.
The above examples result in undesirable outcomes for both organizations. For the clients who have invested in new hardware that cannot be installed, the outcome reflects poorly on the Facilities team, whose data center does not have the capacity or condition to support the new equipment, and on the IT team, whose equipment cannot be put online. For the clients with underutilized space, the struggles are different. The Facilities organization will encounter issues such as maintaining or improving the data center's energy efficiency, and sometimes dealing with equipment that may not function well at very light loads.
The facility staff often evaluates options like shutting down unused capacity or operating with reduced redundancy, but they are often challenged to implement changes without knowing what the real IT requirements are for today, next month, or even the next year.
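The two failure modes described above, very low utilization and a deployment that no longer fits, come down to simple arithmetic once both teams share their numbers. The following is a minimal sketch of that check, not a description of any particular tool. The `Capacity` fields, the function names, and all of the figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Capacity:
    """Hypothetical facility figures; field names are illustrative only."""
    power_kw: float
    cooling_kw: float
    rack_units: int


def utilization(used: Capacity, design: Capacity) -> dict[str, float]:
    """Return the fraction of design capacity in use for each resource."""
    return {
        "power": used.power_kw / design.power_kw,
        "cooling": used.cooling_kw / design.cooling_kw,
        "space": used.rack_units / design.rack_units,
    }


def shortfalls(used: Capacity, planned: Capacity, design: Capacity) -> list[str]:
    """List the resources a planned IT deployment would push past design capacity."""
    after = utilization(
        Capacity(
            used.power_kw + planned.power_kw,
            used.cooling_kw + planned.cooling_kw,
            used.rack_units + planned.rack_units,
        ),
        design,
    )
    return [name for name, fraction in after.items() if fraction > 1.0]


# Made-up example: a lightly loaded facility and a large planned purchase.
design = Capacity(power_kw=1000, cooling_kw=1000, rack_units=4200)
in_use = Capacity(power_kw=120, cooling_kw=120, rack_units=500)
planned = Capacity(power_kw=950, cooling_kw=900, rack_units=300)

print(utilization(in_use, design))          # power and cooling at 12% of design
print(shortfalls(in_use, planned, design))  # ['power', 'cooling']
```

In this made-up case the facility runs at 12% of its design power today, yet the planned purchase would still exceed both power and cooling capacity. Neither team is likely to see that kind of gap when working from its own numbers alone.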
Data centers are much better served when the teams responsible for the day-to-day delivery of power, cooling, and information technology report to a common manager. When the coordination between Facilities and IT resides with the CTO or the CFO, major changes are coordinated, but most day-to-day operations decisions take place without any effective coordination. Where the management structure is fixed, it is strongly recommended to establish working teams across both organizations to coordinate these day-to-day activities, down to the details of selecting where to install a new server or when to perform maintenance on a cooling unit.
This is one of many operational risks that are frequently identified and mitigated during an operational risk assessment or an evaluation of the condition and capacity of a client's data center facilities. It is highly recommended to look at the operational risks posed by the installed hardware and by the people and processes that operate and maintain it.
At the end of the day, the goal is to support the availability and reliability of the applications at the lowest operating cost.