Monitoring a totally automated, driverless subway line
The centralized control room (CCR) of a subway line is a hub of critical activity. The operators managing the line must react immediately to the slightest incident. The idea of an IT failure compromising the operators’ view of the line’s activity is totally unthinkable.
The operators monitor train movements on the tracks, handle automatic door closure incidents using video cameras and give safety information to passengers over the PA systems. Windows and Unix gateways connect to the equipment (tracks, video, radio, etc.); a core and a DBMS manage and store the information collected, and a web front-end presents it to the operators.
A built-in high availability solution
RATP’s project management had high availability requirements for the CCR supervision system for line 1. The project manager, Stéphane Guilmin, faced major constraints: the schedule was strict, and most of the applications to be protected were not originally designed for high availability.
Among the possible solutions, the SafeKit software had already proved its value to RATP, having been implemented in the CCR of line 4 in 2000. However, RATP’s new automation project for line 1 had additional constraints: the subway line to be monitored would be totally automated and driverless.
Hardware solutions (shared disk on a SAN, load balancing network box) were quickly eliminated because they were too expensive, difficult to configure and relatively inflexible. SafeKit was chosen because it appeared simpler to implement on standard servers, easy to administer (the same console manages both Windows and Unix servers) and versatile. It provides load balancing, automatic application recovery and real time data replication.
“Automation of line 1 of the Paris subway is a major project for RATP, requiring a centralized command room (CCR) designed to resist IT failures.
With SafeKit, we have three distinct advantages to meet this need.
Firstly, SafeKit is a purely software solution that does not demand the use of shared disks on a SAN and network boxes for load balancing. It is very simple to separate our servers into separate machine rooms.
Moreover, this clustering solution is homogeneous for our Windows and Unix platforms.
SafeKit provides the three functions that we needed: load balancing between servers, automatic failover after an incident and real time data replication.”
Implementation of SafeKit clusters at the RATP
In practice, the deployment of SafeKit consisted of a functional definition stage, followed by a SafeKit configuration for each protected application. The applications themselves were not modified. SafeKit does not require any physical modification of the servers it manages. RATP was able to use standard hardware, thereby reducing its costs and minimizing the impact of project implementation.
A SafeKit application cluster consists of two servers with the same operating system. The application is installed on both servers, and both run SafeKit with one or more “application modules”.
An application module is a high availability and load balancing configuration package, adapted to the characteristics of a given application. Several application modules can run in parallel on the same cluster to make full use of the capacity of both servers.
RATP uses two types of clustering on line 1’s CCR:
- The active / active farm cluster, which provides network load balancing and automatic failover in the event of a failure
- The active / passive mirror cluster, which provides real time data replication and automatic failover in the event of a failure
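The difference between the two cluster types can be illustrated with a small Python model. This is only a conceptual sketch and not SafeKit’s implementation or API: the class names (Server, FarmCluster, MirrorCluster), the hash-based routing rule and the in-memory dictionary standing in for replicated data are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for one of the two servers in a cluster."""
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)


class FarmCluster:
    """Active/active: every healthy server takes a share of the clients."""

    def __init__(self, servers):
        self.servers = servers

    def route(self, client_id: str) -> Server:
        # Only healthy servers are candidates, so a failed server's
        # clients are redistributed automatically (failover).
        alive = [s for s in self.servers if s.up]
        if not alive:
            raise RuntimeError("no server available")
        # Assumed balancing rule: a stable hash of the client identifier.
        digest = hashlib.sha256(client_id.encode()).digest()
        return alive[digest[0] % len(alive)]


class MirrorCluster:
    """Active/passive: the primary serves, the secondary holds a replica."""

    def __init__(self, primary: Server, secondary: Server):
        self.primary = primary
        self.secondary = secondary

    def write(self, key, value):
        if not self.primary.up:
            self.failover()
        self.primary.data[key] = value
        if self.secondary.up:
            # Replicate as part of the write so the backup stays current.
            self.secondary.data[key] = value

    def failover(self):
        # Promote the secondary, which already holds the replicated data.
        if not self.secondary.up:
            raise RuntimeError("no server available")
        self.primary, self.secondary = self.secondary, self.primary
```

In this model the farm suits stateless front-ends, since any healthy server can answer any client, while the mirror suits applications that own data, since the backup server must already hold an up-to-date copy at the moment it takes over.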
The applications for the CCR of line 1 were delivered one after another according to a pre-defined schedule. Evidian’s professional services team developed the SafeKit application modules for each of the applications. RATP validated the deliveries on its three infrastructures: test, pre-production and production. This organization minimized dependencies during the project and had no impact on application development.
Daily use of the solution
The IT operators of the line 1 CCR can now administer all their application clusters using the same tool, the same configuration, the same online commands and the same administration console. SafeKit works in the same way on Windows and Unix, and uses the same console to monitor load balancing, high availability and data replication.
The backup servers of the numerous critical applications are in a separate machine room and run on separate power supplies. In the event of an incident, operation can continue without loss of service for the passengers of the Paris subway.