Article - Issue 26, March 2006
Protecting Phone Systems
Dr William Webb FREng
Simplified schematic showing how fixed, mobile and other networks interconnect – many elements have been omitted for clarity. Schematic by Simon Roulestone
Whenever a disaster occurs there is a critical need for maintaining lines of communication. However, the telecommunications networks themselves are often unable to cope with such extraordinary conditions and pressure. Dr William Webb FREng examines how phone systems coped with two emergencies last year and looks at the technical and operational solutions available to deal with future incidents in the US and UK.
Whenever a disaster such as Hurricane Katrina, or a terrorist attack such as the London bombings, occurs, there is a critical need to maintain lines of communication. Primarily, the emergency services need to be able to communicate in order to coordinate their relief efforts. But there is also a strong need for the public to be able to communicate, either with the emergency services or with their relatives – the latter contact reduces panic and discourages others from travelling into the disaster area to find loved ones. However, the telecommunications networks themselves are often unable to cope, despite the increasing range of technical and operational solutions available.
Broadly speaking, wireless systems survive better than hard-wired systems (including optical fibres) in emergencies because their infrastructure is more dispersed. Only a direct hit on a base station (ie where the mast is sited) or a control centre will disable it. However, wireless systems that rely on hard-wired connections to link base stations back to the control centre are more vulnerable, because damage to those links can cut off otherwise intact base stations. Wireless systems, just like hard-wired ones, are also dependent on power. While some base stations are provided with back-up generators, they typically only have fuel for a few days.
Satellite systems tend to survive best. The satellite itself will be unaffected in an emergency and, providing the user terminals are undamaged and the satellite ground station is functioning, communications can continue to be provided. However, few people have access to satellite phones; they tend not to work well in dense urban environments, and the capacity of satellite networks is relatively low so this system can only work for a minority.
In most cases, the emergency services have their own wireless networks. They may have their own hard-wired connections to key points such as police stations but rarely own the entire wired network, which includes the control centre. The wireless networks operate in the same way as cellular networks but with some specific modifications to allow, for example, group calling. Emergency services may aim for higher reliability than cellular systems with techniques such as extra hardware to allow for breakdowns, increased back-up power supplies and wider coverage than strictly necessary, so that the system continues working if parts of it fail. Emergency service systems may also have different coverage patterns, with, for example, better coverage in tunnels. Some emergency systems also enable terminals to communicate directly with other nearby terminals, bypassing the base stations and control centres. Through a combination of these different techniques, emergency networks will continue to work longer and in more severe conditions than cellular networks. Even emergency networks, however, can fail under extreme enough conditions.
In an ideal world
Networks could be made more resilient simply through better deployment. For example, by employing wireless as well as wired connections in the system so that one could replace the other automatically, the chances of connectivity being lost would be substantially reduced. Base stations and control centres could be ‘hardened’ – made waterproof and hurricane proof with greater fuel supplies. More base stations could be installed so that if one were disabled another could cover the same area. The network capacity could be enlarged by building substantially more cell sites so that they would not become overloaded in a disaster. But these improvements are expensive.
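The benefit of pairing a wired link with a wireless one can be illustrated with a rough calculation. The sketch below assumes the two links fail independently, which is an optimistic simplification (a flood or power cut can disable both at once), and the failure probabilities are hypothetical figures chosen for illustration, not measurements.

```python
def outage_probability(p_wired_fails, p_wireless_fails):
    """Chance that both links are down at once, ASSUMING independent failures."""
    for p in (p_wired_fails, p_wireless_fails):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_wired_fails * p_wireless_fails


# Hypothetical figures: each link has a 10% chance of failing in a disaster.
single_link = 0.10
both_links = outage_probability(0.10, 0.10)
print(f"one link only: {single_link:.0%} chance of losing connectivity")
print(f"wired plus wireless: {both_links:.0%} chance of losing connectivity")
```

If the failures are correlated, as they are when one event damages both links, the real improvement will be smaller than this simple product suggests.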
The cost would vary considerably depending on the network deployment and availability of local infrastructure. ‘Hardening’ the network could add 20% to the total costs faced by a cellular operator, while increasing the capacity to a level able to cope with all imaginable traffic peaks might add up to 100% depending on the level of excess capacity deemed relevant. These costs would need to be passed through to consumers. Many would judge that adding this cost to all users of mobile phones, to provide resilience for a small percentage of users on the very rare occasions that a disaster occurs, would be unreasonable.
Technology can help in a number of ways. It can make the network more resilient, and it can also be used, for example, to reduce the capacity needed for each call when the network is nearing congestion.
Case study 1: Hurricane Katrina
After Hurricane Katrina had passed through New Orleans, the city lost virtually all its communications capability. Three million phone lines were disabled in Alabama, Louisiana and Mississippi by the wind and water. It took months in some cases to fully restore service. Although no full report has yet been made, it appears from anecdotal remarks that most networks survived the hurricane itself reasonably well. It was the subsequent flooding that caused the lasting damage.
Few base stations, control centres or other infrastructure components are sufficiently protected to survive prolonged immersion in water. Hence, shortly after the flooding started, key switches began failing. This quickly removed most of the fixed line functionality and, since most mobile networks use fixed networks to connect their base stations to their switches, or have their own switches in the vicinity, the knock-on effects quickly took out many other networks including emergency service infrastructure.
For the equipment that survived the hurricane, avoided submersion, and remained connected to working switches, the next problem became one of power. Most base stations have back-up generators, but these typically only carry fuel for a few days. With the city evacuated and telecommunication maintenance teams prevented from entering the city, there was no way to replenish fuel so these base stations failed after a few days. At that point there were few people left in the city and those remaining had little charge left in their cell phones.
Some residents of the affected areas managed to reach help using other phone services including text messaging over their wireless phones, Voice over Internet Protocol (VoIP) and satellite telephony. The lesson for consumers was that no service can currently be counted on to survive a large-scale calamity.
Most cell phones cannot be located in an emergency – advanced technology is needed to provide a caller’s location, and more than 50% of US emergency calls are now made from mobiles. The Federal Communications Commission, which regulates US interstate calls, declared afterwards that by 2006 wireless emergency number callers must be locatable to within a few hundred feet.
Another result of Hurricane Katrina’s devastation has been greater interest in emergency community notification programmes. These use automated outgoing phone calls, email, and text messages to tell residents about evacuations, environmental threats or missing persons. This is usually referred to as Reverse 911.
Other lessons learned involved the management of people during and after the storms. Much of the telecommunications equipment could have been brought back up to working order relatively quickly but engineers were not permitted to enter the submerged city for three days. Distinguishing potential life-savers from looters will always be problematic but formulating a solution may be a beneficial by-product of the Katrina catastrophe.
Further information: Visit www.consumerreports.org
Case study 2: London bombings
The London suicide bombings on 7 July 2005 were different to the New Orleans emergency. No damage was reported to any of the communications networks, but there was a widespread perception that the cellular networks had failed. Others claimed that they had been switched off to prevent them being used to trigger further bombs. None of this was true; the networks were simply highly congested.
The first bomb exploded at approximately 8.50am during the morning rush hour. Traffic levels then rose on both fixed and mobile networks. Traffic on the fixed network peaked at around 11am, when nearly twice as many call attempts as normal were being made. High traffic levels were managed by discarding a percentage of calls – a process known as call-gapping.
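The idea of discarding a fixed percentage of call attempts can be sketched as follows. This is an illustrative model only, not the algorithm any operator used: the function name `gap_calls` and the percentages are assumptions, and real call-gapping schemes are often based on a minimum time interval between admitted calls rather than a simple ratio.

```python
def gap_calls(attempts, admit_percent):
    """Split call attempts into (admitted, rejected), admitting a fixed share.

    An integer credit counter spreads the admitted calls evenly through the
    stream and avoids floating-point rounding surprises.
    """
    if not 0 <= admit_percent <= 100:
        raise ValueError("admit_percent must be between 0 and 100")
    admitted, rejected = [], []
    credit = 0
    for call in attempts:
        credit += admit_percent
        if credit >= 100:
            credit -= 100
            admitted.append(call)
        else:
            rejected.append(call)
    return admitted, rejected


# Hypothetical scenario: call attempts run at twice the level the network is
# sized for, so half of them are discarded to keep the switches stable.
attempts = [f"call-{n}" for n in range(20)]
admitted, rejected = gap_calls(attempts, admit_percent=50)
print(len(admitted), "admitted,", len(rejected), "rejected")
```

Rejecting calls early, before they consume switching resources, is what allows the calls that are admitted to complete normally.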
Mobile networks also experienced high levels of traffic. As with the fixed networks, collapse was avoided by reducing the traffic load. This was done in three ways: incoming traffic from other networks was reduced by applying call-gapping; a percentage of mobile-originated calls was blocked (at one point up to 70% of all call requests were being rejected); and, where feasible, speech quality was reduced to increase capacity. High demand also resulted in severe delays, of up to two and a half hours, in the delivery of SMS (text) messages.
A facility exists whereby emergency services can invoke priority access to mobile networks. This was used in London that morning, but only for one mobile network and only in the close vicinity of one of the bombsites. One of the reasons for not invoking priority access more widely was that some of the emergency services were making widespread use of the mobile networks, using phones that were not identified as owned by the emergency services.
A few satellite phones were available but it proved too difficult to obtain coverage in built-up urban environments, and hence these had very limited utility. This suggests that the ability of satellite phones to play a key communications role in urban disasters is limited.
The police in London utilised a new emergency service network based on TETRA (TErrestrial Trunked RAdio), an advanced digital technology developed specifically with emergency services in mind. This is provided by a commercial entity known as Airwave and owned by one of the cellular operators. This network was reported as having performed well, with demand not exceeding supply and hence no congestion.
As an aside, mobile phones also played a new role in London. Many are now equipped with cameras and news services made widespread use of still and video footage captured by those involved in the disaster. This type of ‘mobile reporter’ role can be expected to grow in the future. However, it doesn’t require a working telecommunications network. Looking forward, emergency services may make increased use of mobile phone location data to track victims, especially in collapsed buildings.
One way to increase resilience is to allow cell phones to switch automatically between different communication modes, such as WiFi or satellite, so that if one route fails another (working) one will be selected. However, the consequence is that all the congestion moves onto the few remaining networks.
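Automatic switching, and the way it concentrates load, can be sketched with a simple preference list. The bearer names, the ordering and the helper functions `select_bearer` and `load_per_bearer` are hypothetical, introduced only for this illustration; a real handset would also weigh signal quality, cost and battery drain.

```python
from collections import Counter


def select_bearer(preferred_order, working):
    """Return the first bearer in the handset's preference list that works."""
    for bearer in preferred_order:
        if bearer in working:
            return bearer
    return None  # no route available at all


def load_per_bearer(handsets, working):
    """Count how many handsets end up on each working bearer."""
    load = Counter()
    for preferred_order in handsets:
        bearer = select_bearer(preferred_order, working)
        if bearer is not None:
            load[bearer] += 1
    return load


# Hypothetical population: 1,000 handsets that all prefer cellular, then WiFi.
handsets = [["cellular", "wifi", "satellite"]] * 1000

print(load_per_bearer(handsets, working={"cellular", "wifi", "satellite"}))
# If the cellular network fails, every handset falls back to the same network.
print(load_per_bearer(handsets, working={"wifi", "satellite"}))
```

Each handset still finds a route, but the whole load lands on a single fallback network that was probably not sized to carry it.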
Another approach is to use mesh technology (see Figure 2). This technique is at the heart of the design of the internet and similar approaches are now being applied to wireless. These types of systems have been proposed to provide WiFi (radio transmission working at 2.4 or 5 GHz) coverage across cities. These WiFi systems normally have a range of a few hundred metres, and a mesh protocol is the technology that allows the systems to connect to each other or to core networks.
Using a mesh approach, each handset or base station might communicate with other nearby devices at the same level in the hierarchy, eg mobile to mobile, or base station to base station. This may be sufficient to route local calls to their destinations. For longer routes, the call must eventually be sent further up the hierarchy, although there is much flexibility about the point where that occurs. However, mesh systems often have long delays, making them less appropriate for voice communications, and may have limited capacity.
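How a mesh finds a path around a failed node can be sketched with a breadth-first search over the links between devices. This is a minimal illustration, not a real mesh protocol: the four-node topology and the function `find_route` are assumptions, and practical protocols must also cope with moving nodes, radio interference and limited link capacity.

```python
from collections import deque


def find_route(links, source, destination, failed=frozenset()):
    """Return the route with the fewest hops that avoids failed nodes, or None."""
    if source in failed or destination in failed:
        return None
    previous = {source: None}  # records how each node was first reached
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            route = []
            while node is not None:  # walk back to the source
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbour in links.get(node, ()):
            if neighbour not in previous and neighbour not in failed:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical mesh: A can reach D through either B or C.
links = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}

print(find_route(links, "A", "D"))                     # normal operation
print(find_route(links, "A", "D", failed={"B"}))       # reroutes through C
print(find_route(links, "A", "D", failed={"B", "C"}))  # no route remains
```

Every extra hop adds delay, which is one reason long mesh routes suit text or data better than voice.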
Congestion can be reduced by sacrificing call quality for a lower data rate. For example, GSM networks can halve the bandwidth required (half-rate coding), resulting in some loss of voice quality. More extreme measures could further reduce the bandwidth, or even switch to SMS-only (text messaging) mode, which requires less bandwidth. In fact, moving to half-rate coding was adopted by some networks during the London bombings but it only succeeded in moving the congestion to the switching centres. However, increasing their capacity is reasonably simple.
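The effect of half-rate coding, and the way it shifts the bottleneck, can be shown with simple arithmetic. The sketch below treats each radio traffic slot as carrying one full-rate call or two half-rate calls; the slot count, the demand and the switch capacity are hypothetical figures chosen for illustration.

```python
def radio_capacity(traffic_slots, half_rate_slots=0):
    """Calls the radio link can carry: one per full-rate slot, two per half-rate slot."""
    if not 0 <= half_rate_slots <= traffic_slots:
        raise ValueError("half_rate_slots must be between 0 and traffic_slots")
    return (traffic_slots - half_rate_slots) + 2 * half_rate_slots


def carried_calls(offered, radio, switch):
    """Return (calls carried, name of the stage that limits them)."""
    stages = {"demand": offered, "radio": radio, "switch": switch}
    limit = min(stages, key=stages.get)
    return stages[limit], limit


# Hypothetical cell: 14 traffic slots, 30 simultaneous call attempts, and a
# switching centre able to handle 20 calls from this cell.
full_rate = radio_capacity(14)
half_rate = radio_capacity(14, half_rate_slots=14)

print(carried_calls(offered=30, radio=full_rate, switch=20))
print(carried_calls(offered=30, radio=half_rate, switch=20))
```

With these figures the radio link limits traffic under full-rate coding, while under half-rate coding the switch becomes the limiting stage, which matches the pattern reported for London.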
Communications networks are vital in the aftermath of a disaster, but they do not always survive the disaster itself. Even when they do, they can rapidly become congested in the aftermath. This can be rectified at a cost, and has to be balanced against the likelihood of a city- or country-wide disaster. This is a political and economic problem, rather than a technical one. The technology is available and is developing rapidly, although each potential solution has its own shortcomings. It is up to society to decide to what extent it wishes to invest in order to ensure network resilience in a disaster.
Dr William Webb FREng
William Webb joined Ofcom as Head of Research and Development and Senior Technologist in 2003. William has published eight books, is a Visiting Professor at the University of Surrey and a Vice President and Fellow of the IEE. He was elected a Fellow of The Royal Academy of Engineering in 2005.