The present invention relates to estimation of disasters in infrastructures, such as computer networks.
Risk analysis predicts likelihood of disasters, such as severe failures of an Information Technology (“IT”) infrastructure, that an organization may face, and the consequences of such failures. IT disasters, such as an e-mail server failure or other computer network failure, can impact the organization's ability to operate efficiently.
Known cindynic theory (science of danger) is applicable in different domains. For example, cindynics has been used to detect industrial risks and can also be used in the area of computer network (including computer hardware and software) risks. According to the modern theory of description, a hazardous situation (cindynic situation) has been defined if the field of the “hazards study” is clearly identified by limits in time (life span), limits in space (boundaries), and limits in the participants' networks involved and by the perspective of the observer studying the system. At this stage of the known development of the sciences of hazards, the perspective can follow five main dimensions.
A first dimension comprises memory, history and statistics (a space of statistics). The first dimension consists of all the information contained in databases of large institutions constituting feedback from experience (for example, Electricité de France power plants, Air France flight incidents, forest fires monitored by the Sophia Antipolis center of the Ecole des Mines de Paris, and claims data gathered by insurers and reinsurers).
A second dimension comprises representations and models drawn from the facts (a space of models). The second dimension is the scientific body of knowledge that allows computation of possible effects using physical principles, chemical principles, material resistance, propagation, contagion, explosion and geo-cindynic principles (for example, inundation, volcanic eruptions, earthquakes, landslides, tornadoes and hurricanes).
A third dimension comprises goals & objectives (a space of goals). The third dimension requires a precise definition by all the participants and networks involved in the cindynic situation of their reasons for living, acting and working. It is arduous to clearly express why participants act as they do and what motivates them. For example, there are two common objectives for risk management—“survival” and “continuity of customer (public) service”. These two objectives lead to fundamentally different cindynic attitudes. The organization, or its environment, will have to harmonize these two conflicting goals.
A fourth dimension comprises norms, laws, rules, standards, deontology, compulsory or voluntary, controls, etc. (a space of rules). The fourth dimension comprises the whole normative set of rules that makes life possible in a given society. For example, society determined a need for a traffic code when there were enough automobiles to make it impossible to rely on the courtesy of each individual driver; the code is compulsory and makes driving on the road reasonably safe and predictable. The rules for behaving in society are aimed at reducing the risk of injuring other people and establishing a society. On the other hand, there are situations in which the codification is not yet clarified. For example, skiers on the same ski slope may have different skiing techniques and endanger each other. In addition, some skiers use equipment not necessarily compatible with the safety of others (cross-country skis, mono-skis, etc.).
A fifth dimension comprises value systems (a space of values). The fifth dimension is the set of fundamental objectives and values shared by a group of individuals or other collective participants involved in a cindynic situation. For example, protection of a nation from an invader was a fundamental objective and value, and meant protection of the physical resources as well as the shared heritage or values. Protection of such values may lead the population to accept heavy sacrifices.
A number of general principles, called axioms, have been developed within cindynics. The cindynic axioms explain the emergence of dissonances and deficits.
CINDYNIC AXIOM 1—RELATIVITY: The perception of danger varies according to each participant's situation.
Therefore, there is no “objective” measure of danger. This principle is the basis for the concept of situation.
CINDYNIC AXIOM 2—CONVENTION: The measures of risk (traditionally measured by the vector Frequency-Severity) depend on convention between participants.
CINDYNIC AXIOM 3—GOALS DEPENDENCY: Goals can directly impact the assessment of risks. The participants may have conflicting perceived objectives. It is essential to try to define and prioritise the goals of the various participants involved in the situation. Insufficient clarification of goals is a common pitfall in complex systems.
CINDYNIC AXIOM 4—AMBIGUITY: There is usually a lack of clarity in the five dimensions previously mentioned. A major task of prevention is to reduce these ambiguities.
CINDYNIC AXIOM 5—AMBIGUITY REDUCTION: Accidents and catastrophes are accompanied by brutal transformations in the five dimensions. The reduction of ambiguity (or contradictions) of the content of the five dimensions will happen when they are excessive. This reduction can be involuntary and brutal, resulting in an accident, or voluntary and progressive achieved through a prevention process.
CINDYNIC AXIOM 6—CRISIS: A crisis results from a tear in the social cloth. This means a dysfunction in the networks of the participants involved in a given situation. Crisis management may comprise an emergency reconstitution of networks.
CINDYNIC AXIOM 7—AGO-ANTAGONISTIC CONFLICT: Any therapy is inherently dangerous. Human actions and medications are accompanied by inherent dangers. There is always a curing aspect, reducing danger (cindynolitic), and an aggravating factor, creating new danger (cindynogenetic).
The main utility of these principles is to reduce time lost in unproductive discussions on the following subjects:
FIG. 1 shows a known “Farmer” curve where disasters are placed on a graph showing the relationship between probability and damage.
Disaster study is a part of Risk Analysis; its aim is to follow the disaster evolution. Damages are rated in terms of cost or rate, with time. Let “d” denote the damage of a given disaster and “f” denote the frequency of such a disaster. From a quantitative point of view, it is common to define a rating “R” of the associated risk as: R=d×f. In practice, the perception of risk is often such that the relevance given to the damaging consequences “d” is far greater than that given to the probability of occurrence “f”, so that the rating “R=d×f” is slightly modified to: R=d^{k}×f with k>1. So, numerically larger values of risk are associated with larger consequences.
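The effect of the exponent k can be illustrated with a minimal sketch. The function name `risk_rating`, the choice k=2 and the two sample disasters are hypothetical and serve only as an illustration; they are not taken from the known art described above.

```python
def risk_rating(damage: float, frequency: float, k: float = 1.0) -> float:
    """Return R = damage**k * frequency; k = 1 gives the plain R = d x f."""
    if damage < 0 or frequency < 0:
        raise ValueError("damage and frequency must be non-negative")
    if k < 1:
        raise ValueError("k must be at least 1")
    return (damage ** k) * frequency


# Two hypothetical disasters given as (damage, frequency).
# Both have the same plain rating d x f.
rare_severe = (10.0, 1.0)
frequent_minor = (1.0, 10.0)

plain = [risk_rating(d, f) for d, f in (rare_severe, frequent_minor)]
weighted = [risk_rating(d, f, k=2.0) for d, f in (rare_severe, frequent_minor)]
```

With k=1 the two disasters receive the same rating, whereas with k>1 the disaster of larger damage receives the larger rating.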
Disasters are normally identified by IT infrastructure components. These components follow rules or parameters and may generate log traces. Typically, disaster information is represented in the form of log files. The disaster rating and scale are relative rather than absolute. The scale may be, for example, values between “1” and “10”: “1” being a minor disaster of minimal impact to the disaster data group and “10” being a major disaster having widespread impact. The logging function depends on the needs of monitoring systems, on data volumes and, in some cases, on delays due to legal obligations.
The known Risk Analysis uses a simple comparison between values found by the foregoing operations, in order to extract statistics. Also, a full Risk Analysis of an IT infrastructure requires a one-to-one analysis of all the data held on disasters. By comparing each disaster with each of the other disasters, it is possible to calculate the likelihood of further disasters. This process is computationally expensive and also requires a significant amount of a computer's Random Access Memory (RAM).
An object of the present invention is to estimate risk of disaster of an infrastructure.
Another object of the present invention is to facilitate estimation of risk of disaster of an infrastructure.
The present invention is directed to a method, system and computer program for estimating risk of a future disaster of an infrastructure. Times of previous, respective disasters of the infrastructure are identified. Respective severities of the previous disasters are determined. Risk of a future disaster of the infrastructure is estimated by determining a relationship between the previous disasters, their respective severities and their respective times of occurrence.
In accordance with a feature of the present invention, the risk is estimated by generating a polynomial linking severity and time of occurrence of each of the previous disasters. The polynomial can be generated by approximating a Tchebychev polynomial.
In accordance with other features of the present invention, the risk is also estimated by modifying the polynomial by extracting peaks in a curve representing the polynomial, regenerating the polynomial using the extracted peaks and repeating the modifying step until a number of extracted peaks is less than or equal to a predetermined value.
FIG. 1 illustrates an example of a prior art Farmer's curve.
FIG. 2 illustrates the result of using the Tchebychev polynomial approximation.
FIG. 3 illustrates two polynomial curves showing the collected disaster information from a first origin and a second origin.
FIG. 4 illustrates the combining of the polynomial curves of FIG. 3 according to an embodiment of the invention.
FIG. 5 is a flow diagram, including a flowchart and a block diagram, illustrating a program and system for generating polynomials according to the present invention.
FIG. 6 illustrates a system according to the present invention for estimating risk of disaster of an infrastructure.
The present invention will now be described in detail with reference to the Figures. A Tchebychev analysis program 500 (shown in FIGS. 5 and 6) executing in a risk estimation computer 20 generates a continuous polynomial curve with a corresponding polynomial equation. Program 500 takes derivatives of the polynomial equation. Where the derivative of the continuous curve is null and the curve is at a peak, the risk reaches a local maximum. The construction of the polynomial equation is shown below.
For i≧1 and j≧1, a Tchebychev polynomial having “n” points is given by:
For example, to calculate the polynomial between two points, Point1 and Point2, having coordinates (x_{1}, y_{1}) and (x_{2}, y_{2}) respectively in space (x, y), the formula is: n=2,
where P_{2}(x_{1})=y_{1} and P_{2}(x_{2})=y_{2}.
To calculate the polynomial between 3 points, Point1 (x_{1}, y_{1}), Point2 (x_{2}, y_{2}) and Point3 (x_{3}, y_{3}), the formula is: n=3,
where P_{3}(x_{1})=y_{1}, P_{3}(x_{2})=y_{2} and P_{3}(x_{3})=y_{3}.
The Tchebychev polynomial is a continuous curve between “n” points.
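The conditions stated above (the curve passes through each of the “n” points) can be illustrated by the following minimal sketch. It assumes that the polynomial is the interpolating polynomial of degree n−1 through the points, evaluated here in the Lagrange form; this form is an assumption made for illustration. The function name `interpolate` and the sample points are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y), for example (time, severity)


def interpolate(points: Sequence[Point], x: float) -> float:
    """Evaluate at x the polynomial of degree len(points) - 1 through all points.

    Assumption: Lagrange form. The x values of the points must be distinct.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)  # basis factor: 1 at xi, 0 at xj
        total += term
    return total


# Hypothetical sample points.
samples = [(0.0, 2.0), (1.0, 7.0), (3.0, 4.0)]

# The curve reproduces each y value at its own x value (P(x_i) = y_i).
values_at_nodes = [interpolate(samples, x) for x, _ in samples]

# With n = 2 the curve is the straight line through the two points.
midpoint = interpolate([(0.0, 0.0), (2.0, 4.0)], 1.0)
```

Because the curve is a single polynomial, it is continuous and differentiable between the points, which is what allows derivatives to be taken in the subsequent steps.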
Referring to FIG. 5, Tchebychev analysis program 500 receives identified disasters data 510 from an infrastructure, which is then input to a Tchebychev approximation module 520. The Tchebychev module 520 calculates a polynomial from the identified disasters data 510. The polynomial is input to a derivative module 530. The derivative module 530 identifies peaks and troughs by identifying points which have a null derivative. The points having a null derivative are forwarded to a peaks (or tops) module 540. The peaks module 540 identifies the peaks by studying the sign of the derivative before and after each of the identified points. Where the sign of the derivative is positive before and negative after an identified point, a peak has been found. A filter module 550 counts the number of identified peaks and compares this number to a predetermined maximum. If there are more identified peaks than the maximum, the identified peaks are input to the Tchebychev module 520 and the process is repeated. If the number of peaks is less than or equal to the maximum, the process stops (step 560).
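The loop of FIG. 5 can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of program 500: the curve is assumed to be the interpolating polynomial through the points (Lagrange form), the null derivative test is replaced by a comparison of neighbouring values on a dense sampling grid, and the start and end points are kept at each refit. The names `interpolate`, `find_peaks` and `reduce_to_peaks`, the grid size and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (time, severity)


def interpolate(points: Sequence[Point], x: float) -> float:
    """Evaluate the polynomial through all points (Lagrange form) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def find_peaks(points: Sequence[Point], samples: int = 2001) -> List[Point]:
    """Locate local maxima of the fitted curve on a dense grid.

    A grid point counts as a peak when the curve rises into it and does not
    rise out of it. This is a sampled stand-in for "derivative positive
    before, negative after"; points must be sorted with distinct times.
    """
    x_start, x_end = points[0][0], points[-1][0]
    xs = [x_start + (x_end - x_start) * i / (samples - 1) for i in range(samples)]
    ys = [interpolate(points, x) for x in xs]
    return [
        (xs[i], ys[i])
        for i in range(1, samples - 1)
        if ys[i] > ys[i - 1] and ys[i] >= ys[i + 1]
    ]


def reduce_to_peaks(points: Sequence[Point], max_peaks: int) -> List[Point]:
    """Refit the curve on the extracted peaks until at most max_peaks remain."""
    ordered = sorted(points)
    start, end = ordered[0], ordered[-1]
    peaks = find_peaks(ordered)
    while len(peaks) > max_peaks:
        refit = find_peaks([start, *peaks, end])
        if len(refit) >= len(peaks):  # guard: no further reduction possible
            break
        peaks = refit
    return peaks


# Hypothetical disaster data: severity on a 1 to 10 scale against time.
disasters = [(0.0, 1.0), (1.0, 4.0), (2.0, 2.0), (3.0, 6.0), (4.0, 1.0)]
high_risk = reduce_to_peaks(disasters, max_peaks=1)
```

The guard inside the loop is a design choice of this sketch: it ends the iteration if a refit does not reduce the number of peaks, so that the loop cannot run indefinitely.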
FIG. 2 illustrates an example of results produced by program 500. An identified disasters trace 210 plots the severity of each disaster against its time of occurrence. Program 500 then generates an approximation of Tchebychev's polynomials to obtain a first polynomial equation represented by a first polynomial curve 220. Program 500 then takes derivatives of the first polynomial equation to identify the points at which the derivative is equal to zero. Null derivative points 230 correspond to peaks and troughs on the polynomial curve. Program 500 identifies peaks by analyzing each null derivative point 230. If the values of the polynomial 220 before and after a null derivative point 230 are lower than the polynomial value at this point, a peak is identified. In this example, program 500 also identifies the extracted peaks 240 from the polynomial 220 through comparison with the identified disasters trace 210. Where a null derivative point 230 is identified as a peak, program 500 compares the null derivative point 230 to the value of identified disasters trace 210 before and after the null derivative point 230. Thus, program 500 identifies the extracted peaks 240 in FIG. 2. For example, point A is one of extracted peaks 240, B is the null derivative point 230 preceding A, and C is the null derivative point 230 following A. If the derivative is positive between B and A, and negative between A and C, point A is a peak. Furthermore, the values of the identified disasters trace 210 before and after point A are less than the value at point A. Therefore point A is an extracted peak 240.
Program 500 then uses an approximation of Tchebychev's polynomials to create a modified polynomial 250 using points which have been identified as peaks and the start and end point. Program 500 further modifies polynomial 250 by repeating the process described above to identify peaks. In this case, there would be no further improvement but in other cases the process will preserve only the highest peaks.
Referring now to FIG. 3, polynomial curves 340 show two collections of disaster information for two organizations (first origin and second origin) with each disaster 310 shown as a point on the polynomial curve 340. Program 500 identifies represented peaks 320 by the process described above to identify peaks from recovered data points. Each polynomial curve 340 has ends 330.
Referring now to FIG. 4, the polynomial curves 450 represent the two polynomial curves 340 of FIG. 3. The first origin has disaster points 420 and the second origin has disaster points 430. Program 500 identifies peaks and ends of each of the polynomial curves 450 and extracts represented peaks. The new ends 440 are the ends from either of the polynomial curves 450 which are of greater gravity or greater extremity of time. Program 500 then uses the represented peaks from each polynomial curve 450 along with the new ends 440 to generate a merged polynomial 460 which represents disaster risk from the combined information of the first and second origins.
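The selection of points for the merged polynomial can be sketched as follows. The sketch only assembles the points from which a merged polynomial would be generated; it does not generate the polynomial itself. It assumes one possible reading of the rule for the new ends: the new start is the earliest end, the new end is the latest end, and where two points share the same time the point of greater gravity is kept. The function name `merge_control_points` and the sample values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (time, gravity)


def merge_control_points(
    peaks_a: Sequence[Point],
    ends_a: Tuple[Point, Point],
    peaks_b: Sequence[Point],
    ends_b: Tuple[Point, Point],
) -> List[Point]:
    """Combine the peaks of two curves with a single pair of new ends.

    Assumed rule: earliest start and latest end are kept as the new ends;
    on equal times the point of greater gravity wins.
    """
    new_start = min(ends_a[0], ends_b[0], key=lambda p: (p[0], -p[1]))
    new_end = max(ends_a[1], ends_b[1], key=lambda p: (p[0], p[1]))

    by_time: Dict[float, float] = {}
    for t, g in [new_start, *peaks_a, *peaks_b, new_end]:
        # Keep one point per time so the times stay distinct.
        by_time[t] = max(g, by_time.get(t, g))
    return sorted(by_time.items())


# Hypothetical first and second origins.
ends_first = ((0.0, 2.0), (10.0, 3.0))
peaks_first = [(4.0, 8.0)]
ends_second = ((1.0, 1.0), (12.0, 5.0))
peaks_second = [(4.0, 6.0), (9.0, 7.0)]

merged_points = merge_control_points(
    peaks_first, ends_first, peaks_second, ends_second
)
```

Keeping a single point per time of occurrence is a choice of this sketch; it ensures that a polynomial can afterwards be passed through the merged points.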
Referring now to FIG. 6, a data logger 602 enables information, typically consisting of logged events, to be collected from an infrastructure network 604. The information from the data logger 602 is stored in a data storage 606. A disaster identification program 608 assesses each logged event to determine whether the event is deemed a disaster. For example, if the logged event indicates a failure of system hardware or software, it may be logged as a disaster. A disaster gravity program 610 assesses each identified disaster, generating disaster data. For example, as described previously, a disaster may be assigned a value between “1” and “10” corresponding to its level of impact on the infrastructure 604. The disaster data is then input to Tchebychev analysis program 500 as described previously. The Tchebychev analysis program 500 generates a risk analysis equation or data. Program 500 then analyzes the risk analysis data to identify one or more high risk disaster events. For example, after the Tchebychev analysis program 500 has completed the risk analysis, program 500 typically identifies a number of peaks corresponding to high risk events 612. These peaks/events can be identified as disasters which generate significant risk to the infrastructure 604. Measures can then be taken, automatically or otherwise, to minimise further risk. For example, the computer system 20 could instigate additional services on other computers or servers of the network 604 to provide additional redundancy to cope with a particular high risk event. The high risk events 612 can also be displayed on a computer screen, or any type of visual display unit, to allow a user to view and obtain more information about the high risk events 612. In this manner, a disaster of greatest potential risk can be identified automatically.
The present invention may be embodied in a computer program (including program modules 608, 610, 500 and 612) comprising instructions which, when executed in computer 20, perform the functions of the system or method as described above. The computer 20 includes a standard CPU 12, operating system 14, RAM 16 and ROM 18. The program modules 608, 610, 500 and 612 can be loaded into computer 20 from a computer readable medium such as a magnetic disk or tape, optical medium, DVD, or network download media (for example, via a TCP/IP adapter card 21).
Improvements and modifications may be incorporated without departing from the scope of the present invention.