Skills required in a sustenance operation
What do hospitals, network monitoring stations & the neighborhood landscaper have in common?
They ensure that people, infrastructure, or yards are monitored, maintained and kept in good health. Each of them is a sustenance operation.
The ongoing maintenance and remediation activities are the more visible part of any sustenance operation. However, a lot of design and development work precedes them to ensure that the right processes, tools, and infrastructure are in place to enable the downstream operations.
Those involved in the initial design activities need to possess expertise and skills to envision and construct something new. Personnel engaged in the run operation, on the other hand, need problem resolution skills, empathy, and the discipline to follow processes.
1. Design & build
The initial setup phase follows the same steps as any new development program – requirements gathering, prototyping, design, and construction.
Requirements – performance benchmarks & operational requirements
The expectations of performance for an application, infrastructure or product are outlined in its design & release documents. These performance expectations in turn define the service levels.
It is important to know the likely points of failure, the impact of different failure types, what should be monitored, how, and with what level of prioritization. Familiarity with the product or platform, and with the context in which it is being used, is therefore essential to understanding and defining service levels.
The requirements stage is also the time to understand the needs of the monitor and resolve teams. What tools, infrastructure and enablers do they require in order to be effective in their respective activities? The ability to engage with them and relate to their priorities is important.
Design - the process book and automation
Having recorded the performance measures and needs of the operational teams, it’s time to decide on the appropriate design, tools, infrastructure, & resources for the sustenance activity.
Getting this right is critical to the success of the sustenance process, so this is the time to involve experienced personnel who have designed similar sustenance operations: designers who can ensure a robust, functional design that can be taken into implementation.
The process book and automation are key facets of the design.
The process book
What will be the process to follow for monitoring, logging incidents, and then resolving them? Which resources will be involved at each stage? How will a resolution be tracked and measured?
These guidelines are detailed in a process book which is the play book for the monitor, flag, and resolve activities.
Knowledge and expertise on likely defects, their severity, and resolution methodologies are required when writing the process book.
The team which originally developed or constructed the application, infrastructure or product best understands the design, materials, and interdependencies. They need to lead the writing of the process book, along with the key personnel who will lead the monitoring and resolution activities.
A monitoring activity requires high uptime and consistency, with minimal variability over extended hours. It can be a strenuous, often mindless activity with long hours.
The use of automation & tools like sensors, logging devices, chat bots, AI & process automation can enable this consistency & uptime. How and where they can be incorporated are important design considerations.
An associated benefit of automation is that, while capital intensive up front, it can often have lower lifecycle costs than having humans perform the same activity.
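The lifecycle cost argument can be made concrete with a small calculation. The sketch below is illustrative only: the cost figures, the flat annual cost model, and the function names are all assumptions, not numbers from any real operation.

```python
def cumulative_cost(upfront, annual, years):
    """Total cost after `years`, assuming a flat annual running cost."""
    return upfront + annual * years


def break_even_year(auto_upfront, auto_annual, manual_upfront, manual_annual, horizon=20):
    """First whole year in which automation costs no more than manual work.

    Returns None if automation does not catch up within `horizon` years.
    """
    for year in range(1, horizon + 1):
        automated = cumulative_cost(auto_upfront, auto_annual, year)
        manual = cumulative_cost(manual_upfront, manual_annual, year)
        if automated <= manual:
            return year
    return None


# Hypothetical figures: automation costs more up front but less to run each year.
year = break_even_year(
    auto_upfront=500_000, auto_annual=50_000,
    manual_upfront=50_000, manual_annual=200_000,
)
print(year)  # 3
```

With these assumed figures the two options cost the same after three years, and automation is cheaper from then on. A real comparison would also need to account for maintenance, upgrades, and staffing changes over time.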
Knowledge and experience of different tools and automation methodologies is a highly valued skill in the design process.
With an approved design, it’s time to set up the monitoring infrastructure, tools and facilities.
This is the stage with the highest spending and budgets. It’s time again to involve experienced personnel who can effectively & reliably get the job done – on spec, time & budget.
Given the many moving parts, project and program managers play a vital role in providing oversight and course correction to keep schedules, resources, & budgets on track. Specialists from resource supply chain & infrastructure, compliance, environment, and legal will provide expertise & resources in their areas of specialization.
The ability to collaborate and work constructively with these multiple groups is a key skill required of those leading and participating in this construction stage.
2. The run operation – monitor, resolve & improve
Monitor and track – follow the process book
The primary goal of the monitoring activity is to spot likely issues and failures and enable actions to prevent breakdowns and downtime.
Anyone participating in the monitoring activity needs to know how to work with the monitoring tools and the processes to be followed for information capture and diagnostics.
Secondly, it is crucial to have the discipline to follow the process book. Taking the right steps in time, logging incidents, and bringing them to the attention of the appropriate resolution teams can significantly improve outcomes.
If incident monitoring requires interaction with impacted people reporting issues, empathy and good listening become critical skills for those manning the incident desk. They are engaging with people in pain. The ability to demonstrate concern and work constructively towards a resolution is vital.
A skill required of all involved in a monitoring operation is the ability to handle stress. Every incident or call is a situation flagging trouble or pain somewhere. The ability to handle it with a calm demeanor and to follow the recommended processes is essential, not only to a timely resolution, but also for the mental and physical health of those engaged in the monitoring activity.
Resolve and repair – minimal handoffs
Timely and consistent resolutions with minimal handoffs are key benchmarks for the resolution activity.
Deploying appropriately skilled and trained resources, supported by tools and workflows that assist them during issue resolution, can enable this.
The discipline to follow the process book is a trait also required in those who are engaged in the resolution process. This ensures that issues get funneled to teams with the expertise to resolve them, and they in turn follow the best path to resolution.
The expertise required for incident resolution varies based on the complexity of the issue.
Low severity incidents can be resolved by those manning the incident desk, provided they are trained and the training is supplemented with workflows and knowledge databases as guides.
The resolution of low severity, less complex, and often repetitive incidents is also the area most amenable to automation. Chatbots, process automation and advanced tools like self-healing algorithms are increasingly being used to aid speedy and consistent resolution of these high volume, low severity incidents.
Complex incidents on the other hand require greater expertise to resolve. This requires deeper knowledge of the product/infrastructure and of the design complexities. Members from the group which originally designed and built the product, application or infrastructure are often the best resources to participate and take ownership of resolution of these more complex issues.
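The routing described above can be sketched as a simple rule set. This is a minimal illustration, not a prescribed design: the `Incident` fields, the destination names, and the fallback to a general resolution team are all hypothetical, and a real process book would define its own categories.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    incident_id: str
    low_severity: bool
    is_complex: bool = False
    repetitive: bool = False  # seen before, with a known fix


def route(incident):
    """Pick a resolution path so that handoffs are kept to a minimum."""
    if incident.is_complex:
        # Complex issues go to members of the group that designed and built the product.
        return "design_and_build_team"
    if incident.low_severity and incident.repetitive:
        # High volume, low severity, repetitive issues are candidates for automation.
        return "automation"
    if incident.low_severity:
        # Trained desk staff resolve these using workflows and knowledge databases.
        return "incident_desk"
    # Assumed fallback for everything else.
    return "resolution_team"


print(route(Incident("INC-1", low_severity=True, repetitive=True)))    # automation
print(route(Incident("INC-2", low_severity=True)))                     # incident_desk
print(route(Incident("INC-3", low_severity=False, is_complex=True)))   # design_and_build_team
```

Checking complexity first means a complex incident reaches the experts directly instead of passing through the desk, which is one way to keep handoffs low.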
When something is broken, and the clock is ticking to get it fixed, it’s a stressful situation. The ability to work productively in these stressful situations and with empathy towards those affected by it is required in all involved in the resolution process too.
Enhance and improve
Information on what is working well and what is not is captured in, and available from, the defect tracking systems. This data is the foundation for any activity to improve the product or service.
Analysis of incidents, their frequency, success in resolution and the resources required to ensure resolution can point to areas of improvement in the sustenance operation – what needs more attention, how can the process be improved, what additional expertise is required, and where are the redundancies.
The data may also throw up high volume or hard to resolve incidents. These are input to the design and development team on improvements which need to be incorporated into future redesign and releases to prevent these chronic defects.
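The kind of analysis described above can be sketched in a few lines. The record layout, category names, and thresholds below are assumptions chosen for illustration; real defect tracking systems will export different fields.

```python
from collections import defaultdict


def summarize(incidents):
    """Per-category count, resolution rate, and average hours spent."""
    totals = defaultdict(lambda: {"count": 0, "resolved": 0, "hours": 0.0})
    for inc in incidents:
        t = totals[inc["category"]]
        t["count"] += 1
        t["resolved"] += 1 if inc["resolved"] else 0
        t["hours"] += inc["hours"]
    return {
        cat: {
            "count": t["count"],
            "resolution_rate": t["resolved"] / t["count"],
            "avg_hours": t["hours"] / t["count"],
        }
        for cat, t in totals.items()
    }


def chronic_categories(summary, min_count=3, max_resolution_rate=0.5):
    """Categories that are high volume or hard to resolve (assumed thresholds)."""
    return sorted(
        cat for cat, s in summary.items()
        if s["count"] >= min_count or s["resolution_rate"] <= max_resolution_rate
    )


# Hypothetical incident records.
incidents = [
    {"category": "login_failure", "resolved": True, "hours": 1.0},
    {"category": "login_failure", "resolved": True, "hours": 1.0},
    {"category": "login_failure", "resolved": True, "hours": 1.0},
    {"category": "disk_full", "resolved": True, "hours": 2.0},
    {"category": "data_sync", "resolved": False, "hours": 6.0},
    {"category": "data_sync", "resolved": False, "hours": 4.0},
]

summary = summarize(incidents)
print(chronic_categories(summary))  # ['data_sync', 'login_failure']
```

Here `login_failure` is flagged for volume and `data_sync` for its low resolution rate. Both would be candidates to pass back to the design and development team.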
A sustenance operation starts with the understanding of performance expectations and therefore what defects or incidents need to be monitored and prioritized for resolution.
Following this are the design & operationalization of the infrastructure and resources required for monitoring & resolution. This then moves to the run or operational phases of monitoring and resolution of defects and of continuous improvement based on analysis of what is working and where there is scope for improvement.
Each of these stages requires unique expertise and skills. To ensure success at each stage of the sustenance activity, it is important to recognize the key skills required and then to deploy them.
Some of the key skills required during the lifecycle of a sustenance operation are outlined in the table below.
You may also want to read Skills required when building something new