Artificial intelligence (AI) policy: ASHRAE prohibits the entry of content from any ASHRAE publication or related ASHRAE intellectual property (IP) into any AI tool, including but not limited to ChatGPT. Additionally, creating derivative works of ASHRAE IP using AI is also prohibited without express written permission from ASHRAE. For the full AI policy, click here. 


AI Data Center Energy Performance Framework

Commissioning and Performance Validation   

 Impact

As a result of the AI arms race, AI data center design and deployment are occurring at a record pace. Speed to compute has become one of the key metrics of successful projects and companies.

Traditional project delivery models are being significantly strained and tested; in particular, the feedback from commissioning to design and operations must accelerate. AI data centers are deploying new technology at a rate never before seen in the industry, as compute, power, cooling, and all the supporting infrastructure are in a never-ending state of research and development. As such, facilities are commencing construction with partially completed design documents that are constantly being tested by the release of new technical information from IT, power, and cooling manufacturers. Often this approach to rapid construction limits the impact a Design-Phase Commissioning review can provide.

Because of the rapid rate of deployment, traditional cycles of manufacturer research and development, as well as developer proofs of concept, can no longer be fully vetted before the first scaled-up project commences. This puts the pressure for facility and infrastructure design and continuous improvement on the owner, the design team, and the commissioning agent. When engaged through a traditional workflow, the commissioning agent (CxA) is often challenged to participate sufficiently in the design process to provide meaningful insight. Integrating the commissioning process (CxP) into the real-time design and review process can maximize the CxA’s value.

Likewise, the penalties from end users to the owner (whether internal or external) for delays or failures are increasing because of the commercial pressures for speed to compute. The projects must be built faster, with new technologies, while simultaneously retaining the quality we have come to expect in mission-critical facilities. This requires closer coordination and collaboration between all parties in the project delivery chain, with additional skilled workforce on-site to support the combination of speed and technical complexity. The combination of rapid deployment of new technologies with partially completed designs puts additional uncertainty in project execution schedules and inevitably pressures those activities at the tail end of the project, namely commissioning, to be adjusted to make contractual delivery timelines.

The data center industry has always had some form of feedback from commissioning back to design to enable iterative improvement of those designs. However, the current speed of deployments, combined with the dollars involved, requires faster feedback that flows among all members of the team.

In traditional data center construction, large portions of a data hall or data center would be turned over from the construction team to the commissioning team after extensive quality assurance and quality control (QA/QC) activities had been executed. Commissioning occurs in phases, known as levels. Level 1 (L1) focuses on factory acceptance testing to ensure components meet design expectations before shipment, while Level 2 (L2) verifies proper delivery and installation on‑site. Level 3 (L3) conducts pre‑functional checks to confirm individual components and subsystems are ready for operation, followed by Level 4 (L4), which tests full functional performance of each system under various conditions. Finally, Level 5 (L5) performs integrated systems testing, ensuring all systems work together seamlessly under real‑world and failure scenarios to validate overall facility reliability and readiness for operation. Downtime or inactivity on certain systems between L2 and L3, and between L3 and L4, was expected and normal while waiting for large tranches of infrastructure to be ready to test.
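
The level sequence described above can be sketched as a simple gating rule. This is only an illustration, not a data model defined by the framework: the `CxLevel` enum, the `next_level` and `may_start` helpers, and the assumption that levels are strictly sequential for a given block of infrastructure are all hypothetical choices made for the example.

```python
from enum import IntEnum
from typing import Optional


class CxLevel(IntEnum):
    """Commissioning levels as described in the text (member names are illustrative)."""
    L1_FACTORY_ACCEPTANCE = 1
    L2_DELIVERY_AND_INSTALLATION = 2
    L3_PRE_FUNCTIONAL = 3
    L4_FUNCTIONAL_PERFORMANCE = 4
    L5_INTEGRATED_SYSTEMS = 5


def may_start(level: CxLevel, completed: set[CxLevel]) -> bool:
    """Return True only if every lower level is already complete.

    Assumption: levels are strictly sequential for one block of infrastructure.
    """
    return all(lower in completed for lower in CxLevel if lower < level)


def next_level(completed: set[CxLevel]) -> Optional[CxLevel]:
    """Return the lowest level not yet completed, or None when all are done."""
    for level in CxLevel:  # enum members iterate in definition order
        if level not in completed:
            return level
    return None
```

Tracking this state per block rather than per data hall is one way to represent the smaller turnover packages that accelerated schedules tend to require.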

Now, more non-traditional approaches to the sequence of construction and commissioning should be considered, including scheduling and turnover of smaller blocks of infrastructure to accelerate testing. This can speed up the diagnosis of systemic issues and help prioritize certain construction and QA/QC activities. However, this also requires a more dynamic approach to the scheduling and handover process, as well as very close coordination and collaboration between the construction and commissioning teams. Specifically, careful consideration should be given to systems or activities that involve new technologies, products, or construction methodologies in determining which activities are low risk to streamline when pressed.


Highlights

  1. The pace and complexity of AI data center deployment, combined with the business impacts of failures, drive the need for larger commissioning teams focused on quality, speed, and accelerated feedback to the team.

  2. Design-phase commissioning becomes more critical as the accelerated pace leaves fewer opportunities for adjustments if the CxA is not engaged to review and provide feedback on the design in parallel with development of that design.

  3. The pace of construction and high penalties for late delivery are forcing construction and commissioning teams to reimagine the process and milestones for handover from construction to commissioning. More granular milestones, which can be harder to schedule and track, enable earlier testing and commissioning but require much closer coordination between teams.

  4. Earlier focus on construction and commissioning of monitoring and control systems helps to enable faster commissioning by streamlining data capture and analysis during L4 and L5 testing.

  5. Measurement and Verification (M&V) Plans, which are often treated as almost an afterthought, must start meaningful development at the start of design. Specifically, early implementation of this infrastructure can aid in the identification of issues and determination of corrective actions for new technologies and infrastructure designs.

  6. Especially in today’s public relations climate regarding AI data centers, understanding, quantifying, and managing impacts to the surrounding community are critical. Matters such as influencing grid power quality, managing power demand, and optimizing water use profiles are all more critical than ever.


Discussion

Throughout the design, construction, commissioning, and operation of a data center, there must be seamless integration of requirements and execution for the project to realize its full value proposition. Historically, many data centers have been built as one-offs, or as part of a small fleet with significant time between the occupancy of one and the design of the next. In today’s AI factory development market, data centers are frequently designed and built with a modular concept from a prototype that is then repeated numerous times on the same campus or across a corporate portfolio.

To optimize the performance of the portfolio, a rapid feedback loop from commissioning and operations to the design and construction is critical. Lessons learned cannot necessarily wait months for discussion and arbitration to be applied to the next phase, as many times the next phase itself is trailing by only a few months. Larger teams of experienced professionals are often needed to accelerate the feedback loop while continuing to progress the construction, deployment, commissioning, and operations of the facilities.

AI factories, especially liquid-cooled deployments, are particularly susceptible to equipment fouling due to insufficient cleanliness and preparation during construction and startup. Proper cleaning, flushing, and passivation of hydronic systems – especially technology cooling systems (TCS) – should never be compromised in the name of speed. Failure to maintain proper fluid cleanliness can result in extensive project delays if information technology equipment (ITE) cold plates become fouled during deployment or if ITE is exposed to fluid leaks due to insufficient rigor during commissioning.

The speed of design, construction, commissioning, and turnover is straining the traditional approach to commissioning, in which construction turns over large blocks of infrastructure at a given time. Smaller blocks of infrastructure on accelerated timelines, with QA/QC teams barely ahead of commissioning, are becoming more common. To optimize the speed of overall delivery without sacrificing quality, teams are increasingly adopting more granular milestones with more aggressive tracking and more intensive coordination between construction and commissioning teams.

The Instrumentation and Controls Systems that execute the M&V Plan are often the last systems fully commissioned. However, if they are brought online early enough in the process, they can provide the best value for troubleshooting issues as well as for recording baseline acceptable performance of the systems during L4 and L5 testing. Having a trended and recorded data set from L4 and L5 commissioning on the Historian server is extremely valuable to the Operations Team’s ability to validate, during occupancy and commercial operations, whether the systems in the facility are operating per design. The more baseline data that can be recorded and stored, the better positioned the Operations Team is to provide highly reliable systems and data center operation.
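
As a rough illustration of how a recorded commissioning baseline might later be used in operations, the sketch below averages trended samples into a baseline and checks a live reading against it. Everything here is an assumption for the example: the function names, the use of a simple mean, and the percentage tolerance are hypothetical and are not prescribed by the framework or by any particular Historian product.

```python
from statistics import mean


def baseline_from_trend(samples: list[float]) -> float:
    """Reduce trended L4/L5 samples to a single baseline value (simple mean).

    Assumption: a mean is adequate for the illustration; a real M&V Plan
    may define baselines per operating condition.
    """
    if not samples:
        raise ValueError("baseline requires at least one sample")
    return mean(samples)


def within_baseline(reading: float, baseline: float, tolerance_pct: float) -> bool:
    """Return True if the reading deviates from the baseline by no more
    than tolerance_pct percent of the baseline value."""
    if baseline == 0:
        raise ValueError("percentage deviation is undefined for a zero baseline")
    deviation_pct = abs(reading - baseline) / abs(baseline) * 100.0
    return deviation_pct <= tolerance_pct
```

For example, with a hypothetical baseline of 18.0 and a 5 percent tolerance, a reading of 18.5 would pass and a reading of 20.0 would be flagged for review.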

Operator training, which is typically left for the end of the project when L5 commissioning is largely complete, often cannot be performed at that point because of project delivery timelines to the end customer. Methods of Procedure (MOPs) are often either not drafted or not validated prior to commercial operations, introducing risk into their execution in a live environment. L4 and L5 commissioning activities are typically the single best opportunity for facility operators to see the equipment performing properly and responding to failure scenarios, to troubleshoot issues, and to validate MOPs without significant risk. No number of training videos can replace the value of witnessing L4 and L5 testing, working with the startup technicians, and walking through MOPs ahead of commercial operations. The integration of the Operations Team into the commissioning process and the validation of MOPs during commissioning require commitment from the owner or operator for staffing and resource allocation ahead of the start of revenue-generating operations. However, viewing such commitments through the lens of risk mitigation rather than overhead cost can ease the approval process.


Recommended Practices

  • 1. Integrate Commissioning Throughout Design

    Integrate the commissioning agent with the design process to provide real-time review and feedback, rather than soliciting review at specific design milestones. This also enables more seamless understanding of the design and development of testing procedures.

  • 2. Deploy On-Site Subject Matter Experts

    Third-party commissioning teams should be staffed with discipline-specific subject matter experts (SMEs) on-site every day of commissioning activities. Generalists can support significant portions of the testing, but SMEs are often required for quick troubleshooting of complex technical issues.

  • 3. Assign a Dedicated Commissioning Scheduler

    The commissioning team should have a dedicated scheduler who can keep up with the pace of construction and changes in priorities, and help advise the commissioning and construction teams about the equipment and systems.

  • 4. Establish Startup and Troubleshooting Tiger Teams

    Manufacturer and contractor startup efforts should include dedicated teams for startup and for troubleshooting – “Tiger Teams” – that are responsible for resolving identified issues while testing continues ahead.

  • 5. Implement Centralized Document Control

    Document control is critical when commissioning at speed. Centralized file management through a platform where individuals do not become bottlenecks and new team members can easily find information saves time and improves quality.

  • 6. Use Field-Generated Testing Documentation

    Speed causes challenges on many fronts. The traditional approach of waiting on formal office-prepared reports slows progress in the field. Establish acceptable protocols for preliminary field-generated test documentation to enable testing to progress. Do not, however, proceed on notification alone that the testing has been completed. Receipt and review of testing reports should remain a hold-point in the process to avoid careless oversights that could endanger equipment or people.

  • 7. Maintain Daily Cross-Team Collaboration

    Ensure the commissioning team is working hand in hand with the construction team and design team daily to evaluate what is working and what is not, and look for opportunities to improve design, process, and operations in a real-time collaboration.

  • 8. Engage Commissioning in Troubleshooting

    The commissioning team can best support the project completion and design evolution through engagement as an active participant in the troubleshooting process, rather than simply highlighting nonconformance and leaving it to the Contractor and Engineer of Record to resolve.

  • 9. Protect Quality While Moving at Speed

    Do not allow the push for speed to deploy to compromise the known quality requirements for equipment and systems. Failure to uphold standards will end up costing time and money and potentially risking safety, rather than improving speed.

  • 10. Prioritize Controls and M&V Validation

    Early focus on completion of the Instrumentation and Controls System and validation of the M&V Plan provides data logs both for troubleshooting during commissioning and for establishing baseline performance during operations.

  • 11. Integrate Operations During Commissioning

    Integration of the Operations Team during L4 and L5 commissioning, as well as validation of MOPs during commissioning, significantly improves team training and de-risks post-occupancy maintenance and operations.
