Fault Management

Perform advanced fault analysis and streamline fault handling processes

dpCom Fault Management targets problems in the physical network. When it is integrated with an OSS or a monitoring system, it also gives you access to faults in the active part of the network.

The basis for dpCom Fault Management is the documented as-built physical network together with connections, services, and customers. This can be combined with information on leased lines in order to trace the physical connections end-to-end.

Managing planned outages for maximum uptime and customer service

Reconfigurations, equipment replacements, equipment upgrades, and other maintenance activities in the network can result in loss of service for customers. dpCom uses information about the connected network, including connections, services, and customers, to analyze the impact of the planned changes. This analysis highlights the SLA and risk classification for the affected services.
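The kind of impact analysis described above can be illustrated with a minimal sketch. It is not dpCom's implementation: the `Service` record, its field names, and the convention that a higher `risk_class` number means a more critical service are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    # Hypothetical record; field names are illustrative, not dpCom's data model.
    service_id: str
    customer: str
    sla: str         # e.g. "gold" or "silver"
    risk_class: int  # assumption: higher number = more critical
    path: tuple      # IDs of the network objects the service traverses


def impact_of(planned_objects, services):
    """Return the services whose documented path touches any planned object."""
    planned = set(planned_objects)
    hit = [s for s in services if planned & set(s.path)]
    # List the most critical services first so they stand out in the review.
    return sorted(hit, key=lambda s: (-s.risk_class, s.service_id))
```

Calling `impact_of(["cable-A"], services)` would list every service routed through `cable-A`, together with its SLA and risk class, before the change is approved.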

Once the plans for the changes are approved, you define them in dpCom as planned disturbances or outages with planned start and restoration times. You can then create a communication plan and use the media of your choice (email, SMS, letter, and so on) to send template-based information to affected customers.

Messages in your communication plan can include:

  • Initial plan notices and details
  • Reminders about upcoming planned work
  • Warnings that the plan is about to be executed
  • Progress and status reports
  • Alerts on modifications to the planned restoration time
  • Notice of service restoration
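As a rough illustration of template-based messages, the sketch below uses Python's standard `string.Template`. The template names, wording, and placeholder fields are invented for the example and do not reflect dpCom's actual templates.

```python
from string import Template

# Hypothetical templates keyed by message type; the wording is illustrative only.
TEMPLATES = {
    "initial_notice": Template("Planned work on $service: $start to $end."),
    "reminder": Template("Reminder: planned work on $service starts $start."),
    "restored": Template("Service $service has been restored."),
}


def render(kind, **fields):
    """Fill the chosen template; raises KeyError if a placeholder is missing."""
    return TEMPLATES[kind].substitute(**fields)
```

For example, `render("reminder", service="Fiber 100", start="22:00")` yields a reminder text that can be handed to whichever delivery channel the customer prefers.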

Network owners often do not have all the end customers as direct customers. Some end customers might belong to another operator, an ISP, or another player in the value chain. By integrating dpCom with the OSS/BSS, trouble-ticket, or customer-care systems of these service providers, communications can travel through the value chain and reach all affected parties.

Fault detection and correction

Faults share many aspects of a planned disturbance or outage. A cut cable is a cut cable, regardless of whether it was cut by accident or as part of planned network modifications. The main difference lies in how the event is initiated.

Especially for faults in the passive part of the network, you need to identify the faulty object. This is easy in some cases, such as when a field report tells you that a cable has been cut by accident. The NOC can then quickly make an initial estimate of which cable is involved and identify it as the source of the fault. In other cases, you need a bit more detective work.

Fault information on a set of services normally serves as input for this detective work. Because dpCom contains full information about the complete path a service takes in the network, the system’s fault analysis function can use this information for a set of services and look for patterns or common denominators, such as “15 out of 16 services are present in a certain cable.”
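The common-denominator search can be sketched as a simple count of how many faulty services pass through each network object. This is an illustration under assumed data shapes (a mapping from a service ID to the list of object IDs on its path), not dpCom's actual analysis function.

```python
from collections import Counter


def rank_suspects(faulty_service_paths):
    """Rank network objects by how many faulty services traverse them.

    faulty_service_paths: dict mapping service ID -> list of object IDs
    (a hypothetical input shape chosen for this example).
    Returns (object_id, hits, total_services) tuples, strongest suspect first.
    """
    total = len(faulty_service_paths)
    counts = Counter()
    for path in faulty_service_paths.values():
        counts.update(set(path))  # count each object once per service
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(obj, hits, total) for obj, hits in ranked]
```

An object shared by 15 of 16 faulty services appears at the top of the list as `(object_id, 15, 16)`, which mirrors the pattern described above.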

OTDR measurements can be used to check for cuts in the fiber-level network. dpCom calculates the position that corresponds to the measured distance and shows it geographically. The calculation takes the documented lengths of cable slack loops into account. This information provides a reliable reference when determining where to send your field crew.
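To show why documented slack matters, here is a minimal sketch that converts an OTDR distance into a position along a cable route. It rests on simplifying assumptions that are not taken from dpCom: the route is a polyline in planar coordinates measured in metres, slack is stored only at route vertices, and fiber length equals cable length.

```python
import math


def locate_fault(route, slack_at_vertex, otdr_distance_m):
    """Walk the route from the measurement end and return the fault position.

    route: list of (x, y) points in metres (assumed planar coordinates).
    slack_at_vertex: dict {vertex index: metres of cable stored at that vertex}.
    """
    remaining = otdr_distance_m
    for i in range(len(route) - 1):
        (x0, y0), (x1, y1) = route[i], route[i + 1]
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if remaining <= seg_len:
            t = remaining / seg_len if seg_len else 0.0
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        remaining -= seg_len
        slack = slack_at_vertex.get(i + 1, 0.0)
        if remaining <= slack:
            return (x1, y1)  # the fault lies inside the slack stored here
        remaining -= slack
    raise ValueError("measured distance exceeds documented cable length")
```

With 20 m of slack documented at a vertex, every fault beyond that vertex is placed 20 m earlier along the route than a calculation that ignores slack would suggest.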

Just as for a planned outage, information about affected customers and services, including SLA levels and risk classifications, is available for the fault. This helps the NOC to set priorities and urgency levels. You can also customize this information to implement government-imposed fault classifications, such as “If the number of affected customers multiplied by downtime exceeds a certain value, send a notification to a public outage portal.”
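A rule of the quoted kind could be expressed as follows. The function name, the unit (customer-hours), and the threshold value are hypothetical; real thresholds come from the applicable regulation.

```python
from datetime import datetime


def must_notify_portal(affected_customers, started, restored, threshold_customer_hours):
    """Return True when customers multiplied by downtime exceeds the threshold.

    The threshold and its unit (customer-hours) are assumptions for this sketch.
    """
    downtime_hours = (restored - started).total_seconds() / 3600
    return affected_customers * downtime_hours > threshold_customer_hours


# Usage: 500 customers down for 3 hours against a 1000 customer-hour limit.
notify = must_notify_portal(
    500,
    datetime(2024, 1, 1, 10, 0),
    datetime(2024, 1, 1, 13, 0),
    threshold_customer_hours=1000,
)
```

For an ongoing fault, the same check can be run with the planned restoration time in place of the actual one.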

Predefined workflow templates can be used for fault-finding tasks and repair work. These templates can be standardized or, for faults in objects with a higher risk classification, taken from the objects themselves. Object-specific templates can, for instance, state that the initial action is to execute a prepared redundancy plan. As more information about the problem arrives, the fault can be moved from one object to another and its scope increased or decreased.

The same communication plan mechanism described for planned outages can be used to keep affected customers informed about the outage and restoration.

Customer error reports – clear and instant communication

dpCom provides an intuitive user interface that customer-facing staff will find easy to use. It is mainly used for managing error reports from customers. When an error report is received, it is checked against active faults or disturbances. This allows for immediate feedback if dpCom is aware of a fault and the reporting customer is probably affected by that fault. In these cases, you can instantly inform customers about the fault and the planned restoration time.

If no existing fault matches the reported error, the error will be filed and can be used together with other reported errors for a fault analysis. As soon as a corresponding fault is registered, the customer will be included in the communication plan for that fault and begin receiving updates on the fault and restoration status.
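The matching flow described in this section can be sketched as a small in-memory model. The class, its attributes, and the reply strings are hypothetical and serve only to illustrate the logic: check a report against active faults, file it if nothing matches, and attach it once a matching fault is registered.

```python
class ReportDesk:
    """Illustrative model of error-report handling; not dpCom's API."""

    def __init__(self):
        self.active_faults = {}  # fault_id -> {"customers": set, "eta": str}
        self.subscribers = {}    # fault_id -> customers receiving updates
        self.unmatched = []      # customers whose reports match no fault yet

    def report(self, customer_id):
        for fault_id, fault in self.active_faults.items():
            if customer_id in fault["customers"]:
                self.subscribers[fault_id].add(customer_id)
                return f"Known fault {fault_id}, planned restoration {fault['eta']}"
        self.unmatched.append(customer_id)
        return "No known fault; report filed for analysis"

    def register_fault(self, fault_id, customers, eta):
        affected = set(customers)
        self.active_faults[fault_id] = {"customers": affected, "eta": eta}
        # Earlier unmatched reports from affected customers join the plan.
        self.subscribers[fault_id] = {c for c in self.unmatched if c in affected}
        self.unmatched = [c for c in self.unmatched if c not in affected]
```

A report from a customer before the fault is known is filed; when the fault is later registered with that customer among those affected, the customer is subscribed to its updates automatically.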

Fault and outage statistics

Information on faults and outages is stored in a database and is available for use in various reports, including reports needed for regulatory compliance. When interacting with customers, the customer service staff thus gets access to the complete event history – useful information for determining conformance to SLA terms. The data also serves as valuable input to general maintenance analysis when you need to create upgrade or replacement plans for equipment involved in faults.

