Y. Y. KAGAN EARTHQUAKE CLUSTERING AND FORECASTING: GLOBAL PERSPECTIVE. These comments to some degree are prompted by the recent email message exchange at the SCEC RELM project, and discussion of the Schorlemmer et al. ms (2004). Recently, in addition to me and my colleagues at UCLA as well as Yosi Ogata and his colleagues (see references below), many researchers (Console, Felzer, Gerstenberger, Helmstetter, McGuire, Schorlemmer, Sornette, Wiemer, and others) started to apply the theory of stochastic point processes to analyze earthquake occurrence and in particular its clustering. The major impetus of these investigations was application of statistical methods for earthquake forecasting, both long- and short-term. I hope that these comments will help to organize the research activities and lead to a better solution of the practical problem -- estimation of seismic hazard. I will briefly review the available methods for earthquake occurrence analysis and their application for earthquake forecasting and then discuss how these methods can be improved by analyzing global seismicity. I. Stochastic processes and earthquake occurrence. Almost any earthquake forecast requires proper accounting for earthquake clustering, mainly for aftershock occurrence, although foreshocks, if present, may be used to calculate a mainshock probability. Even if we are mainly interested in a long-term earthquake forecast, the influence of earthquake clustering on the results should be estimated. Moreover, faithful modeling of earthquake clustering is needed for any short-term earthquake forecasting. This clustering presents a special challenge since modern local catalogs have a magnitude range extending over several units: in California and Japan, the lower magnitude threshold is close to 1.0, whereas the largest earthquake may exceed 8.0. In such catalogs one should expect the aftershock numbers approaching or even exceeding millions after a very strong event. Handling these earthquakes and accounting for various systematic and random effects presents a serious problem. We cannot simply claim that the great majority of these events are irrelevant to seismic hazard; small earthquakes may turn out to be foreshocks, thus they need to be properly processed. Aftershock sequences have traditionally been taken into account by catalog declustering. Majority of researchers would agree that declustering can be used only as a preliminary step of seismicity analysis: it is subjective; many techniques exist, but they are not optimized and rigorously tested. Thus declustering can be used only in qualitative preliminary studies. Quantitative statistical methods need to be employed to rigorously describe earthquake clustering. Only an application of the stochastic point process theory may provide a rigorous solution to the problem. However, the multidimensional nature of earthquake occurrence, fractal or power-law properties of earthquake statistical distributions, and inhomogeneities of earthquake distributions makes creation of stochastic models and their statistical analysis an extremely difficult task. Over the years several stochastic models of earthquake occurrence have been proposed. All these models are based on the theory of branching processes. Branching is supposed to model the well-known property of primary and secondary clustering for aftershock sequences, i.e., a strong aftershock (or foreshock) tends to have its own sequence of dependent events. These multidimensional models are: (A) Point process branching along magnitude axis (Kagan, 1973a;b -- PPBM). (B) Continuum state process branching along time axis (Kagan and Knopoff, 1981; Kagan, 1982 -- CSBT). (C) Point process branching along time axis (Hawkes and Adamopoulos, 1973; Hawkes and Oakes, 1974; Kagan and Knopoff, 1987; Ogata, 1988 -- called Hawkes self-exciting process, PPBT, or ETAS). Hawkes and Tukey (see discussion section in Kagan, 1973b) debate the difference between branching in earthquake size and in time. All these models employ in one form or another the major known statistical properties of earthquake occurrence: the Gutenberg-Richter (GR) relation and Omori's law. Model (A) reproduces the GR relation as the result of branching in magnitude and uses Omori's law to describe earthquake clustering in time. Model (B) obtains the GR law as the consequence of critical branching (Vere-Jones, 1976); it applies Omori's law to temporal distribution of micro-dislocations, and simulates the position and orientation of dislocations to reproduce the entire earthquake process. Model (C) combines the GR relation and Omori's law in a fairly empirical fashion to approximate seismicity. Although as I discuss below, other models may have certain advantages in earthquake forecasting and the representation of seismicity, the model (C) is now almost exclusively used to statistically analyze earthquake occurrence and to simulate it (Kagan and Jackson, 2000; Ogata, 2004). The models (A-C) can be parameterized to analyze earthquake catalogs. The optimal parameter values can then be found by the maximum likelihood method (Kagan, 1991; Ogata, 2004). To account for earthquake clustering one may put the obtained parameter values back into the model and find probabilities of each event to be foreshock/mainshock/aftershock (Kagan and Knopoff, 1976; Zhuang et al., 2004). If these probabilities are known, a catalog can be either declustered in an objective manner, or dependent events can be taken into account. II. Several problems with the forecast program. (1) Seismicity spatial distribution is very complex: the depth inhomogeneity, the fractal character of the spatial pattern, and various hypocenter location errors make the model parameterization difficult and create various biases in the parameter estimation. This makes it questionable whether the results are reproducible. Recent applications of stochastic point processes for seismicity analysis often yielded the results which are incompatible or unstable -- slight variations in the data, assumptions, or processing techniques yield significantly different parameter values. It is difficult to see whether these contradictions are caused by biases of analysis, data defects, or differences in parametrization. Such diversity and ambiguity of the results leaves an independent observer in a state of bewilderment. (2) A critical and careful analysis of errors in the earthquake catalogs needs to be performed before each statistical analysis. I've read recently somewhere that 90% of the substantial part of a paper in natural sciences should consist of error analysis and the rest should report results. Otherwise, unless the effect being studied is very strong, the results are almost surely artifacts. The problem is that most of errors in the earthquake data are caused not by random effects but by systematic effects, hence they are more difficult to identify and to correct. (3) There is no effective statistical tool to select proper models and check whether they fit the data. Likelihood methods and the Akaike Information Criterion dependent on them (AIC -- see Ogata, 2004; Daley and Vere-Jones, 2004) apparently work properly only for regular processes -- quasi-Gaussian in a continuous case and quasi-Poisson for discrete (point) processes. However, an earthquake occurrence is controlled by scale-invariant, fractal distributions, diverging to infinity. Although these infinities can be regularized by using renormalization procedures, similar to what model (B) attempts to do, statistical tests applicable to such distributions have not yet been developed. Calculating the likelihood function for aftershock sequences illustrates this point: the rate of aftershock occurrence after a strong earthquake increases by a factor of thousands. Log(1000) = 6.9, hence one close aftershock yields a contribution to the likelihood function analogous to about 7 free parameters. (4) What can be done in the present situation to obtain reliable statistical results? The number of model degrees of freedom should be kept as small as feasible; the new adjustable parameters are to be introduced only if they are critically tested against the data in various catalogs and different tectonic environments. Dyson (2004) says about Enrico Fermi advising him "... My friend Johnny von Neumann used to say, with four parameters I can fit an elephant ..." Let's not have elephants on the loose in our earthquake garden! (5) Earthquake catalogs are incomplete in a wake of strong events, they are also incomplete in general for small earthquakes. Both of these effects need to be carefully accounted for. (6) Until now only worldwide seismicity or seismicity in certain seismic zones have been analyzed. Several tectonic provinces have not been investigated sufficiently: deep earthquakes, oceanic earthquakes, earthquakes in stable continental areas, volcanic earthquakes. Dependence of earthquake clustering on the rate of tectonic deformation should also be investigated: for example, in continental areas (and specifically in California) aftershock sequences occur in zones of fast and slow deformation rate. Are the clustering properties of earthquakes the same in these conditions? A study of earthquake occurrence in these tectonic environments should yield important information on general properties of seismicity. (7) Apparently all of the statistical models based on Omori's law fail to capture the properties of a long-term earthquake clustering. Kagan and Jackson (1991) argued that, in addition to the short-term clustering, which manifests itself in foreshock/mainshock/aftershock shallow earthquake sequences, there is a long-term clustering. The latter phenomenon is common both to shallow and deep earthquakes. Kagan and Jackson (1991) conjectured that the short-term clustering is the result of stress redistribution in a brittle crust, whereas the long-term clustering is most likely due to mantle convection. Moreover, volcanic seismicity is often characterized by earthquake swarms when all major earthquakes are of approximately equal size. It is likely that magma motion is the cause of this clustering. Earthquake occurrence at mid-ocean ridge transform faults (RTFs) also seems to require different tools to describe its clustering features (McGuire et al., 2004). Although reports on one tectonic earthquake triggering another by viscoelastic stress relaxation of the lower crust and upper mantle are often published, statistical relationship is rarely if ever investigated. And yet, only statistics can give a proof of this long-term interaction. (8) The earthquake probabilities calculated using model (C) have a serious defect -- if a strong earthquake is preceded by a foreshock or foreshocks, this event is considered as a dependent event with a low probability of being an independent or mainshock event. Model (A) does not present this difficulty -- the largest event in a cluster would always have the probability 1.0 (mainshock). (9) Point models by definition provide only a point forecast, each future earthquake is characterized by its location, magnitude, time, and possibly by its focal mechanism. In reality, earthquakes are spatially extended and they are not instantaneous, this fact is especially important for large events. Therefore, to compute seismic hazard, a point forecast needs to be supplemented by an extended source model. In contrast, model (B) is in principle a continuum model, hence it can simulate realistic complex earthquake rupture process extended in time, space, and fault orientation. Kagan and Knopoff (1984) provide an example of such a forecast. In principle, if appropriate Green's functions are available, this model can generate a set of seismograms for each synthetic sequence. III. Statistical analysis of seismicity: Challenges. The major difficulty in handling earthquake clustering is defects of the stochastic modeling of earthquake occurrence, discussed above. What can be done to improve the models? The best chance is going globally. Studying the worldwide earthquake occurrence would benefit forecasting efforts in many ways. We can investigate earthquakes in various tectonic provinces, depth ranges, study the dependence of clustering on the rate of tectonic deformation. This may help to find optimal spatial smoothing kernels, and discover the differences and similarities in the seismicity of various zones. Additionally, as Peter Bird indicates (message of 18-OCT-2004 09:05:36.96), testing of earthquake forecasts would be much easier and faster if done globally. Bird and Kagan (2004) analyzed the global seismicity to find parameters of the GR law in different tectonic zones. They suggest that the 20th century seismic moment data can be explained by a tapered GR relation which has a universal b-value. The maximum or the corner moment magnitude (Mc), however, is specific for tectonic provinces. They specify that for subduction zones Mc is >9+, in most of other continental areas it is >8+, whereas for most of mid-oceanic zones Mc=6-6.5. The models (A-C) are scale-invariant, since both GR and Omori's laws are fractal distributions. However, the GR relation should be limited from above (Kagan, 2002, section 2.2) and this introduces an upper scale in the description of seismicity. In the study of the continental seismicity we do not encounter many earthquakes with the magnitude close to Mc; this is perhaps why the branching models work reasonably well. But for mid-oceanic earthquakes the upper magnitude limit is reached by many events, hence both the classical GR and Omori's laws start to break down. Similar conditions characterize volcanic seismicity -- the maximum magnitude is often small for these earthquakes due to a limited rock volume subjected to critical stress (Kagan, 2002, p. 539). Therefore, we often observe earthquake swarms in volcanic areas where no event is distinguished by its size. Swarm temporal features are not likely to be well approximated by Omori's law; a different model is needed to describe its clustering. We may not observe earthquake swarms among the tectonic continental earthquakes because the recurrence time scales (hundreds and thousands years) of the largest earthquakes (Mc>=8-9) exceeds the temporal scale of the available observational data by a large margin. If very long earthquake catalogs for continental areas were available, we might see clusters/swarms of very strong earthquakes (like those of New Madrid, Landers-Hector-Mine, see also Dawson et al., 2003, and references therein). It is possible that for forecasting long-term clustering or swarm activity, new types of models would be needed, such as Neyman-Scott clusters (like shown in Fig. 2b by Kagan, 1973b, but without a main event in a cluster). The relative abundance of foreshocks for mid-ocean earthquakes which are close to the upper Mc limit (McGuire et al. 2004) can be explained by the arguments proposed by Knopoff et al. (1982): when the magnitude range of earthquakes is small, there is little difference between foreshocks, mainshocks, and aftershocks. The stochastic point process would have an almost symmetrical correlation function -- the number of foreshocks would be close to the number of aftershocks. Taking into account the uncertainty of magnitude determination, any event may become a foreshock/mainshock/aftershock if the magnitude range is comparable to the accuracy. Because of its complexity and scope, the program sketched above cannot be executed by an individual scientist, or even by a small group of collaborators. Therefore, I propose to combine our efforts in joint seismicity analysis and the application of the obtained results to improve current earthquake forecast methods and create new more advanced techniques to predict future seismic activity. Among the issues which need attention in future research, I would mention the following (this is, of course, not a complete list): 1. New statistical programs should be robust is application to various seismic regions, the cause of difference in model parameter values or results, if present, should be well understood. 2. What conditions are imposed on parameter values (or in general on model parametrization) by the scale-invariance of earthquake occurrence below Mc. 3. The scale-invariance is obviously broken down for the largest earthquakes. Once again, what can be theoretically inferred from this fact? 4. Are there any other violations of scale-invariance? For example, the thickness of the seismogenic zone has been often invoked as a possible geometrical scale. Until now investigations failed to show unambiguously any influence of this scale on earthquake distributions. IV. References: Many of my publications, especially recent ones, listed below are available from my Web site: see http://scec.ess.ucla.edu/ykagan.html or http://scec.ess.ucla.edu/~ykagan/pub_index.html Bird, P., and Y. Y. Kagan, 2004. Plate-Tectonic Analysis of Shallow Seismicity: Apparent Boundary Width, beta-Value, Corner Magnitude, Coupled Lithosphere Thickness, and Coupling in 7 Tectonic Settings, Bull. Seismol. Soc. Amer., accepted, http://scec.ess.ucla.edu/~ykagan/plates_index.html Daley, D. J., & Vere-Jones, D., 2004. Scoring probability forecasts for point processes: The entropy score and information gain, J. Applied Probability, 41A: 297-312, (Spec. Issue) http://moho.ess.ucla.edu/~kagan/DJD_DVJ_04.pdf Dawson, T. E., S. F. McGill, and T. K. Rockwell, 2003. Irregular recurrence of paleoearthquakes along the central Garlock fault near El Paso Peaks, California, J. Geophys. Res., 108(B7), 2356, doi:10.1029/2001JB001744. Dyson, F., 2004. A meeting with Enrico Fermi - How one intuitive physicist rescued a team from fruitless research, Nature, 427(6972), 297, http://moho.ess.ucla.edu/~kagan/dyson.pdf Hawkes, A. G., and L. Adamopoulos, 1973. Cluster models for earthquakes - Regional comparisons, Bull. Int. Statist. Inst., 45(3), 454-461. Hawkes, A. G. and Oakes, D., 1974. A cluster process representation of a self-exciting process, J. Appl. Prob., 11, 493-503. Kagan, Y. Y., 1973a. A probabilistic description of the seismic regime, Izv. Acad. Sci. USSR, Phys. Solid Earth, 213-219, (English translation). Kagan, Y. Y., 1973b. Statistical methods in the study of the seismic process (with discussion: comments by M. S. Bartlett, A. G. Hawkes, and J. W. Tukey), Bull. Int. Statist. Inst., 45(3), 437-453. Kagan, Y. Y., 1982. Stochastic model of earthquake fault geometry, Geophys. J. R. astr. Soc., 71, 659-691. Kagan, Y. Y., 1991. Likelihood analysis of earthquake catalogues, Geophys. J. Int., 106, 135-148. Kagan, Y. Y., 2002. Seismic moment distribution revisited: I. Statistical results, Geophys. J. Int., 148, 521-542. Kagan, Y. Y., and D. D. Jackson, 1991. Long-term earthquake clustering, Geophys. J. Int., 104, 117-133. Kagan, Y. Y., and D. D. Jackson, 2000. Probabilistic forecasting of earthquakes, (Leon Knopoff's Festschrift), Geophys. J. Int., 143, 438-453. Kagan, Y., and Knopoff, L., 1976. Statistical search for non-random features of the seismicity of strong earthquakes, Phys. Earth Planet. Inter., 12, 291-318. Kagan, Y. Y., and Knopoff, L., 1981. Stochastic synthesis of earthquake catalogs, J. Geophys. Res., 86, 2853-2862. Kagan, Y. Y., and Knopoff, L., 1984. A stochastic model of earthquake occurrence, Proc. 8-th Int. Conf. Earthq. Eng., 1, 295-302. Kagan, Y. Y., and Knopoff, L., 1987. Statistical short-term earthquake prediction, Science, 236, 1563-1567. Knopoff, L., Kagan, Y. Y., and Knopoff, R., 1982. b-values for fore- and aftershocks in real and simulated earthquake sequences. Bull. Seismol. Soc. Am., 72, 1663-1676. McGuire, J. J., Boettcher, M. S., and Jordan, T. H., 2004. Foreshock Sequences and Short-Term Earthquake Predictability on East Pacific Rise Transform Faults, Eos Trans. AGU, 85(46), Fall Meet. Suppl., Abstract NG24B-03. Ogata, Y., 1988. Statistical models for earthquake occurrence and residual analysis for point processes, J. Amer. Statist. Assoc., 83, 9-27. Ogata, Y., 2004. Space-time model for regional seismicity and detection of crustal stress changes, J. Geophys. Res., 109(B3), Art. No. B03308. Correction J. Geophys. Res., 109(B6), Art. No. B06308. Schorlemmer, D., M. Gerstenberger, S. Wiemer, and D. Jackson, 2004. Earthquake likelihood model testing, ms, http://moho.ess.ucla.edu/~kagan/sjg.pdf . Vere-Jones, D., 1976. A branching model for crack propagation, Pure Appl. Geophys., 114, 711-725. Zhuang, J. C., Y. Ogata, and D. Vere-Jones, 2004. Analyzing earthquake clustering features by using stochastic reconstruction, J. Geophys. Res., 109(B5), Art. No. B05301. CLUST.TXT;88 43/45 14-NOV-2004 12:05:34.03