WHY, AND HOW, SHOULD GEOLOGISTS USE COMPOSITIONAL DATA ANALYSIS
A Step-by-Step Guide for the Field Geologists
Special Edition for Wikibooks
Ricardo A. Valls, P. Geo., M. Sc Hector Nuñez Dr. Jorge Cruz Martin
January 1st, 2008
Compositional data arise naturally in several branches of science, including geology. In geochemistry, for example, these constrained data seem to occur typically, when one normalizes raw data or when one obtains the output from a constrained estimation procedure, such as parts per one, percentages, ppm, ppb, molar concentrations, etc.
Compositional data have proved difficult to handle statistically because of the awkward constraint that the components of each vector must sum to unity. The special property of compositional data (the fact that the determinations on each specimen sum to a constant) means that the variables involved in the study occur in constrained space defined by the simplex, a restricted part of real space.
Pearson was the first to point out the dangers that may befall the analyst who attempts to interpret correlations between ratios whose numerators and denominators contain common parts. More recently, Aitchison, Pawlowsky-Glahn, S. Thió, and other statisticians have developed the concept of Compositional Data Analysis, pointing out the dangers of misinterpreting closed data when treated with “normal” statistical methods.
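The effect of closure can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the three "components" are made-up, independent random numbers (not the data of this book), and the helper names `pearson` and `closure` are hypothetical. Before closure the first two components are essentially uncorrelated; after each sample is rescaled to sum to 100, a negative correlation appears that was never put into the data.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def closure(parts, total=100.0):
    """Rescale one sample so that its parts sum to `total` (a closed system)."""
    s = sum(parts)
    return [total * p / s for p in parts]

random.seed(0)  # fixed seed so the run is repeatable
# Three independent, made-up components per sample (assumption for the demo).
raw = [[random.uniform(1.0, 10.0) for _ in range(3)] for _ in range(2000)]
closed = [closure(row) for row in raw]

r_raw = pearson([row[0] for row in raw], [row[1] for row in raw])
r_closed = pearson([row[0] for row in closed], [row[1] for row in closed])
print(f"raw r = {r_raw:+.3f}, closed r = {r_closed:+.3f}")
```

The negative value obtained after closure is a product of the constant-sum constraint alone, which is the kind of spurious correlation discussed in this book.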
It is important for geochemists and geologists in general to be aware that the usual multivariate statistical techniques are not applicable to constrained data. It is also important for us to have access to appropriate techniques as they become available. This is the principal aim of this book.
From a hypothetical model of a copper mineralization associated with a felsic intrusive, with specific relationships between certain elements, I will show how “normal” correlation methods fail to identify some of these embedded relationships and how they can produce spurious correlations. From there, I will test the same model after transforming the data using the CLR, ALR, and ILR transformations with the aid of the CoDaPack software.
Since I addressed this publication to geologists and geoscientists in general, I have kept the mathematical formulae to a minimum and did not include any theoretical demonstrations. The “mathematically curious geologist”, if such a category exists, can find all of those in the list of recommended sources in the reference section.
So let us start by introducing the model of mineralization that we will be testing.
Figure 1 shows a very simplistic version of a copper mineralization associated with a zone of fractures within a granodiorite intrusive.
Table 1 shows the results of a random sampling using existing access to the top of the intrusive.
Table 1. Chemical composition of the sampling of the granodiorite intrusive.
The initial data respond to the following pre-established conditions:
- It is a closed system, meaning that the sum of all the values is equal or very close to 100%.
- There are no zero values present (I will deal with zero values in a separate example to avoid complicating the initial model).
- There are no statistical outliers (hurricane values) present.
- There is a big difference between the concentrations of the major oxides with respect to the trace elements.
- As shown in Figs. 2-4, there are embedded positive and significant correlations between Cu and As, Ni and Co, and MgO and K2O.
- Finally, I introduced significant negative correlations between K2O and CaO (Fig. 5), SiO2 and Al2O3 (Fig 6), K2O and Cu (Fig. 7), Na2O and K2O (Fig. 8), and Co with Cu (Fig 9).
According to these embedded conditions, any correlation analysis must give us two coefficients (Range Correlation Coefficient or RCC):
- Equation 1
- Range Correlation Coefficient type A for the initial data, according to the embedded correlations.
- Equation 2
- Range Correlation Coefficient type B for the initial data, according to the embedded correlations.
Normal Processing of the Data
For this case, I will process the initial dataset as a whole, without differentiating between major and trace elements. Since we established that there are no statistical outliers and no zero values within the data, the first step was to determine their distribution, using the kurtosis and skewness test as described by Kashdan et al. (1979). As one can see from Table 2, all elements followed a normal distribution, except for CaO and Na2O, so the next step was to transform those values into logarithms before testing their correlations.
Table 2. Results of the analysis of kurtosis and skewness.
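A check of this kind can be sketched in a few lines. The exact procedure of the cited reference is not reproduced in this text, so the version below is an assumption: it uses the moment-based skewness and excess kurtosis, their approximate standard errors sqrt(6/n) and sqrt(24/n), and treats a distribution as compatible with normality when both ratios stay within a threshold of 3. The function name and the sample values are hypothetical.

```python
import math

def skew_kurtosis_check(values, threshold=3.0):
    """Return (skewness, excess kurtosis, looks_normal) for a list of numbers.

    Assumed rule (not taken from the source): the data are accepted as
    normal when |skewness| / sqrt(6/n) and |excess kurtosis| / sqrt(24/n)
    are both at or below `threshold`.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    se_skew = math.sqrt(6.0 / n)
    se_kurt = math.sqrt(24.0 / n)
    looks_normal = (abs(skew) / se_skew <= threshold
                    and abs(excess_kurt) / se_kurt <= threshold)
    return skew, excess_kurt, looks_normal

# Made-up, strongly right-skewed values (45 low readings, 5 high ones).
skewed = [1.0] * 45 + [50.0] * 5
print(skew_kurtosis_check(skewed))
```

An element that fails the check would then be log-transformed and tested again before computing correlations, following the workflow described above.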
Using Excel data analysis capabilities, I then determined the correlation analysis of the data (Table 3).
Table 3. Correlation analysis of the initial dataset.
To determine the significance of the obtained correlations, I then proceeded to calculate the critical value of Student using equation 3 (Table 4).
- Equation 3
- Critical value of Student to determine the significance of the obtained correlations.
Where:
- tc is the critical value of Student
- r is the correlation coefficient
- n is the number of data points
Table 4. Critical values of Student for the correlation analysis of the initial dataset.
It is common practice in geology that, for n > 30 and a significance level of 0.05 (95% confidence), the correlation is significant if tc > 3. Table 5 shows which correlations are significant from the initial dataset.
Table 5. Results of the significant correlation analysis of the initial dataset.
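The significance screening can be sketched as follows. Equation 3 is not reproduced in this version of the text, so the formula used here is an assumption: the textbook Student statistic for a Pearson correlation, t = |r|·sqrt(n − 2)/sqrt(1 − r²), which may differ from the author's expression. The function names are hypothetical; the threshold of 3 is the rule of thumb quoted above.

```python
import math

def student_t(r, n):
    """Textbook Student statistic for a Pearson correlation r from n samples.

    Assumption: this is the standard form, not necessarily Equation 3.
    """
    if n < 3:
        raise ValueError("need at least 3 samples")
    if abs(r) >= 1.0:
        return math.inf  # a perfect correlation is always significant
    return abs(r) * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def is_significant(r, n, threshold=3.0):
    """Apply the 'tc > 3' rule of thumb to one correlation coefficient."""
    return student_t(r, n) > threshold

# Made-up coefficients for a dataset of 32 samples.
for r in (0.30, 0.50, -0.92):
    print(r, round(student_t(r, 32), 3), is_significant(r, 32))
```

Applied to every cell of a correlation matrix, this yields the kind of pass/fail table of significant correlations used in the rest of the analysis.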
Selecting the proper RCCs
Since the use of Range Correlation Coefficients (RCC) is not a common practice, I will explain in more detail the methodology for their selection from the significant correlations.
a. We start by ranking all correlations from the highest positive to the lowest negative correlation.
b. Start at the bottom of the list and select all the negative significant correlations first to create the initial coefficient.
c. After the first time you use a correlation pair, every time you get the same element, put a dot on top as shown in equation 4.
- Equation 4
- Example of the calculation of the multiplicative factor on a RCC.
This example means that you had a significant positive correlation between Cu and As and two significant negative correlations between Co and Cu and Co and As.
d. Once you finish with the negative significant correlations, go back to the top of the list and repeat the same process for the positive correlations.
e. If possible, combine the obtained coefficients.
f. If you get a contradictory result, e.g. an element that has “conflicting correlations” with previous elements, take those elements out of the coefficient and start a new one.
You can reduce the size of the obtained RCC by eliminating the less frequent elements and subtracting their influence from the overall coefficient. For example, let us assume that we obtained the RCC represented in equation 5 as follows:
- Equation 5
- Hypothetical RCC to demonstrate the reduction process.
If we would like to eliminate the Sc and the L.O.I., we first eliminate the L.O.I. and subtract 2 from every element from equation 5.
- Equation 6
- Hypothetical RCC without the L.O.I. component.
Then we would subtract the remaining Sc as shown in equation 7.
- Equation 7
- Hypothetical RCC without the remaining Sc.
RCC from the initial dataset
According to Table 5, and using the methodology just described, I obtained the RCC shown in equation 8.
- Equation 8
- RCC1 for the initial dataset.
In addition, because L.O.I. has a conflicting correlation with some elements from RCC1, I created a separate RCC for this case as shown in equation 9.
- Equation 9
- RCC 2 for the initial dataset.
Before we use SURFER v. 8.0 to graphically plot these coefficients over our model of mineralization, let us graphically analyze these coefficients using Grapher v. 7.0, also from the Golden Software suite of programs (www.goldensoftware.com).
Analysis of the RCCs from the initial dataset.
The objective of the processing of the data should be to obtain an RCC that is as similar as possible to our theoretical ones represented by equations 1 and, especially, 2. We can see that we obtained all the embedded correlations, but “masked” by the presence of spurious (nonexistent) ones. For example:
- There is no correlation whatsoever between Al2O3 and the other elements (Fig. 10)
- Same situation with the Fe2O3 (Fig. 11).
- Same situation with the Sc (Fig 12).
- Same situation with the TiO2 (Fig. 13).
- Same situation with the L.O.I. (Fig. 14).
Note, however, that if we choose only the strongest correlations (r>0.92), then we get an RCC with just the embedded correlations.
A more common way to study this kind of data will be to separate the major oxides from the trace elements and treat them separately. It is often intuitively clear for geologists that when mixing percentages with ppm or ppb, the existing correlation between the trace elements is masked or eliminated by the relationships between the major oxides. So I proceeded to separate the initial dataset into major oxides and trace elements (Tables 7 and 8 in the file Initial data.xls [worksheet “Processing”], located in the attached CD).
Table 6. Correlation for the major oxides from the initial dataset.
Table 7. Critical value of Student for the correlations of the major oxides of the original dataset.
Table 8. Significant correlations for the major oxides from the initial dataset.
Analysis of the RCCs from the major oxides of the initial dataset.
As was the case when we processed the whole dataset, here we obtain an RCC that contains the hypothetical one that we are looking for (equation 1), but it is masked by the presence of other elements, many of them without real correlations between them (equation 10).
Equation 10. RCC3 for the major oxides of the initial dataset.
Table 9. Correlation analysis of the trace elements of the initial dataset.
Table 10. Critical value of Student for the correlations of the trace elements of the original dataset.
Table 11. Significant correlations for the trace elements from the initial dataset.
Analysis of the RCCs from the Trace Elements of the initial dataset.
As was the case when we processed the whole dataset, here we obtain an RCC that contains the hypothetical one that we are looking for (equation 2), but it is masked by the presence of other elements, many of them without real correlations between them (equation 11).
Equation 11. RCC 4 for the trace elements of the initial dataset.
Graphical representation of the RCCs
Using SURFER v.8 from Golden Software Inc., (you can download the demos from www.goldensoftware.com), I obtained Figures 15 – 18.
RCC1 only maps the southwestern border of the mineralized target in a much-dispersed fan that makes it impossible to use as a targeting tool. This was our best RCC from the analysis of the whole dataset.
The only reason why RCC2 partially covers the ore body is the strong and real correlations between Ni, Co, and K2O. None of these elements, however, has a correlation with L.O.I.; therefore, this is a classic example of the formation of a spurious correlation because we applied correlation analysis to a “closed” dataset.
RCC3 contains one of the embedded correlations (SiO2 vs. Al2O3), but it also contains several spurious correlations and, since it is a petrographic association, has little to do with the location of the ore body.
Finally, RCC4 is mostly our main embedded correlation, and although it also contains some spurious components (e.g. correlation with Sc), it is not surprising that it maps the ore body perfectly.
Conclusions and recommendations from the processing of the initial dataset
Closed systems do provoke spurious correlations that mask the effectiveness of the established RCC. This is especially true when processing datasets that contain a combination of major oxides and trace elements. In those cases, I recommend using only the extremely intense correlations.
A more useful solution is to separate major oxides from trace elements, and concentrate again only on the intense correlations. The disadvantage here is that we do not use the combined information of both groups of elements.
Will the transformation of the data be more efficient in the creation of RCCs that will help us target the mineralized zone within the granodiorite intrusive?
Compositional Data Analysis
The CoDaPack software (included in the attached CD, with its user guide presented in Appendix 1) offers three types of transformation: the Centered Log-Ratio transformation (CLR), the Additive Log-Ratio transformation (ALR), and the Isometric Log-Ratio transformation (ILR). The last two require a column with the residual (100 minus the sum of all the other components).
Centered Log-Ratio Transformation (CLR)
Appendix 1 contains the instructions on how to use the CoDaPack software. Since there are no zero values in our dataset, we can proceed directly to the CLR transformation (see Table 12).
Table 12. Results of the CLR transformation of the initial dataset.
As Table 12 shows, the dataset is now “open”, since the sum of all the components is equal to zero, not 100%.
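For readers who want to see what the transformation does to a single sample, here is a minimal sketch of the standard CLR definition: the logarithm of each part minus the mean of the logarithms of all parts (equivalently, the log of each part divided by the geometric mean). It is not CoDaPack's code, the function name is hypothetical, and the sample values are made up for illustration.

```python
import math

def clr(composition):
    """Centered log-ratio of one sample; every part must be strictly positive."""
    if any(p <= 0 for p in composition):
        raise ValueError("CLR needs strictly positive parts (no zeros)")
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

# Made-up sample in percent (major oxides plus a residual), summing to 100.
sample = [65.0, 15.0, 5.0, 4.0, 3.0, 8.0]
transformed = clr(sample)
print([round(v, 4) for v in transformed])
print("sum of CLR values:", round(sum(transformed), 12))
```

The CLR values of each sample sum to zero (up to rounding), and the result does not change if the sample is expressed in parts per one instead of percent, because only ratios between parts enter the calculation.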
Once I achieved this transformation, I processed the data following the same steps as with the initial dataset. Tables 13 to 15 show the results of this process.
Table 13. Correlation analysis of the CLR transformed data.
Table 14. Critical value of Student of the CLR transformed data.
Table 15. Significant correlations of the CLR transformed data.
Using SYSTAT SPSS 10.0 for Windows I constructed a matrix of scatter plots (Fig. 19) to confirm the results from table 15, as well as some individual graphics using Grapher 7.0 which clearly show that all the correlations now are real (Figs. 20 – 23).
Figure 19. Matrix of scatter plots for the CLR transformed data.
Equation 12 shows the RCC determined for the CLR transformed data and equation 13 shows the same RCC, but reduced by eliminating the SiO2 and the L.O.I.
Equation 12. RCC5 for the CLR transformed data.
Equation 13. RCC5a for the CLR transformed data after reducing the SiO2 and the L.O.I.
Fig. 24 shows that RCC5a can effectively target the copper mineralization within the granodiorite intrusive.
Figure 24. The RCC5a can effectively target the copper mineralization within the granodiorite intrusive.
Now, if we use only the strongest correlations (|r| > 0.95), we obtain RCC6 and RCC7 (equations 14 and 15).
Equation 14. RCC 6 for correlations stronger than ±0.95 for the CLR transformed data.
Equation 15. RCC 7 for correlations stronger than ±0.95 for the CLR transformed data.
As one can see from Fig. 25, RCC 6 is an almost perfect match with the location of the ore body. RCC 7 (Fig. 26) is similar to RCC 3 and represents a petrologic association.
Figure 25. Almost perfect correspondence between the RCC 6 and the location of the ore body.
Figure 26. RCC 7 represents a petrologic association of major oxides.
Additive Log-Ratio Transformation (ALR)
Table 16 shows the results of the transformation of the original dataset. Tables 17 through 19 show the results of the correlation analysis.
Table 16. ALR transformed data.
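The ALR transformation, together with the residual column it requires, can be sketched as follows. This is an illustration of the standard definition (the logarithm of each part divided by one chosen reference part), not CoDaPack's implementation. Using the residual as the divisor is an assumption made for this example, and the function names and sample values are hypothetical.

```python
import math

def add_residual(parts, total=100.0):
    """Append the residual (total minus the sum of the measured parts)."""
    residual = total - sum(parts)
    if residual <= 0:
        raise ValueError("measured parts already reach or exceed the total")
    return list(parts) + [residual]

def alr(composition):
    """Additive log-ratio, using the LAST part as the divisor (assumption)."""
    if any(p <= 0 for p in composition):
        raise ValueError("ALR needs strictly positive parts (no zeros)")
    denominator = composition[-1]
    return [math.log(p / denominator) for p in composition[:-1]]

# Made-up sample: two measured parts in percent; the residual becomes 10.
sample = add_residual([60.0, 30.0])
print(sample)
print([round(v, 4) for v in alr(sample)])
```

A sample with D parts gives D − 1 ALR values, and the result depends on which part is chosen as the divisor, so the choice should be kept the same for the whole dataset.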
Table 17. Correlation analysis for the ALR transformed data.
I used SYSTAT SPSS 10.0 for Windows to construct a matrix of scatter plots (Fig. 27) to confirm the results from table 17.
Figure 27. Matrix of scatter plots for the ALR transformed data.
Table 18. Critical values of Student for the ALR transformed data.
Table 19. Significant correlations of the ALR transformed data.
Equations 16 and 17 show the RCCs determined for the ALR transformed data. It is interesting to note that all the correlations here are positive.
Equation 16. RCC 8 of the ALR transformed data.
Equation 17. RCC 9 of the ALR transformed data.
Figures 28 and 29 show the spatial behavior of these RCCs with respect to the location of the ore body. If we combine both RCCs, we obtain equation 18 (Fig. 30).
Equation 18. Combination of RCC 9 and RCC 8 for the ALR transformed data.
Isometric Log-Ratio Transformation (ILR)
Table 20 shows the results of the transformation of the original dataset. Tables 21 through 23 show the results of the correlation analysis.
Table 20. ILR transformed data.
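An ILR transformation needs a choice of orthonormal basis, and this text does not say which one CoDaPack applied. The sketch below therefore uses one common choice as an assumption, the so-called pivot coordinates, in which each part is compared with the geometric mean of the parts that follow it. The function name and the sample values are hypothetical, so the numbers will not necessarily match those in the table above.

```python
import math

def ilr_pivot(composition):
    """Isometric log-ratio in pivot coordinates (one possible ILR basis).

    For a sample with D parts, coordinate i (i = 1 .. D-1) is
    sqrt((D - i) / (D - i + 1)) * ln(x_i / geometric mean of x_(i+1) .. x_D).
    """
    if any(p <= 0 for p in composition):
        raise ValueError("ILR needs strictly positive parts (no zeros)")
    logs = [math.log(p) for p in composition]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]            # logs of the parts after part i
        k = len(rest)                  # number of remaining parts
        mean_log_rest = sum(rest) / k  # log of their geometric mean
        coords.append(math.sqrt(k / (k + 1)) * (logs[i] - mean_log_rest))
    return coords

# Made-up three-part sample in percent.
print([round(v, 4) for v in ilr_pivot([60.0, 30.0, 10.0])])
```

Whatever basis is used, a sample with D parts yields D − 1 coordinates, and the order in which the parts are listed changes the individual coordinates, which is one reason why ILR results need interpretation rather than blind plotting.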
Table 21. Correlation analysis for the ILR transformed data.
Figure 31 shows the result of a matrix of scatter plots constructed with SYSTAT SPSS 10.0 for Windows to test the results from table 21.
Figure 31. Matrix of scatter plots for the ILR transformed data.
Table 22. Critical value of Student of the ILR transformed data.
Table 23. Significant correlations of the ILR transformed data.
Equations 19 through 22 show the RCCs determined for the ILR transformed data.
Equation 19. RCC 10 for the ILR transformed data.
Equation 20. RCC 11 for the ILR transformed data.
Equation 21. RCC 12 for the ILR transformed data.
Equation 22. RCC 13 for the ILR transformed data.
Figures 32 – 35 show the result of the use of these RCCs as targeting tools.
Conclusions and Recommendations from the Compositional Data Analysis
One very important effect of “opening” a dataset by using any of these transformations is that we get rid of all spurious correlations. The transformed data do contain unexpected correlations, but they are real.
Another important point is that we do not need to process the data separately (e.g. separating major oxides from trace elements), but can process the whole dataset taking advantage of the information contained in both groups.
Of all the RCCs obtained so far, RCC 8, RCC 9, and especially RCC9/8 (ALR) were by far the most efficient ones for targeting the copper mineralization.
The CLR transformed data also provided useful RCCs, especially if we concentrate on the higher correlations.
Finally, the ILR transformed data were effective as long as the geochemist “interprets” the coefficients rather than plotting them blindly. For example, a geochemist should know that elements like Pb and Co usually concentrate below the ore body (inframinerals), and therefore, while using RCC 13, the investigator should concentrate on the lower values as an indication of the location of the ore body. We have a similar situation with RCC 11. The investigator should know that a common effect of sodic metasomatism is the leaching (lixiviation) of MgO and K2O; therefore, the geochemist should be looking for lower values of the RCC.
In general, I can state that transformed data are more effective for the location of the mineralized targets than the non-transformed dataset, and that the ALR method seems to be the most effective for processing this type of data. However, the geochemist should always use his background knowledge to help decide the most efficient RCC for the studied area.
I recommend the use of the software CoDaPack for the processing of any type of “closed” dataset.
- Remember to add a column of the residuals to the original dataset.
Factor analysis is a statistical data reduction technique used to explain variability among observed random variables in terms of fewer unobserved random variables called factors. It is useful to reduce the number of variables, by combining two or more variables into a single factor, thus “simplifying” the original dataset.
Factor analysis (FA) is especially useful in geochemistry when one has a known target or some other way to understand the meaning of the obtained associations. Failing this, the geologist is usually forced to “plot and see”, and then to select the FA that he believes is the most useful for the studied area.
I processed both the initial dataset and the three transformed versions using SYSTAT SPSS 10.0 for Windows, but you can use any other statistical program capable of factor analysis.
Factor Analysis for the Initial Dataset
Figure 36 shows the scree plot for the initial dataset, while table 24 shows the principal components defined by the software.
Figure 36. Scree plot for the initial dataset.
Table 24. Principal component analysis (PCA) for the initial dataset.
Equations 23 – 25 show the three FA components for the initial dataset.
Equation 23. FA 1 for the initial dataset.
Equation 24. FA 2 for the initial dataset.
Equation 25. FA 3 for the initial dataset.
Figures 37 – 39 show the effectiveness of these FA as a targeting tool for our ore body.
Conclusions and recommendations on the use of FA for the initial dataset
As long as we have a known target to test the obtained FA, this method offers better results than the RCC. It also allows for the combined study of all the elements together.
FA1 and FA2 do contain the embedded correlations I introduced in the initial dataset, thus their effectiveness, especially FA 1, in mapping the location of the ore body.
The next question will be: Will the transformed data be any more effective in helping us locate our target?
Factor Analysis for the CLR Transformed Dataset
Figure 40 shows the scree plot for the CLR transformed dataset, while table 25 shows the principal components defined by SYSTAT.
Figure 40. Scree plot for the CLR transformed dataset.
Table 25. Principal component analysis for the CLR transformed dataset.
Equations 26 – 28 show the three FA components for the CLR transformed dataset.
Equation 26. FA 4 for the CLR transformed dataset.
Equation 27. FA5 for the CLR transformed dataset.
Equation 28. FA6 for the CLR transformed dataset.
Figures 41 – 43 show the effectiveness of these FA as a targeting tool for our ore body.
Factor Analysis for the ALR Transformed Dataset
Figure 44 shows the scree plot for the ALR transformed dataset, while table 26 shows the principal components defined by SYSTAT.
Figure 44. Scree plot for the ALR transformed dataset.
Table 26. Principal component analysis for the ALR transformed dataset.
Although table 26 shows two components, I will analyze only the second, which is a coefficient as shown in equation 29.
Equation 29. FA7 for the ALR transformed dataset.
This factor contains the embedded relationship from the initial dataset, but because of the presence of other elements, its usefulness as a targeting tool is more limited, as shown in Figure 45.
Figure 45. FA7 covers mostly the southeastern part of the ore body.
Factor Analysis of the ILR Transformed Dataset
Figure 46 shows the scree plot for the ILR transformed dataset, while table 27 shows the principal components defined by SYSTAT.
Figure 46. Scree plot for the ILR transformed dataset.
Table 27. Principal component analysis for the ILR transformed dataset.
The fact that we have so many components as the result of the P.C.A. is an indication that we will not get good results this time. Equations 30 through 33 show the obtained factors.
Equation 30. FA8 for the ILR transformed dataset.
Equation 31. FA9 for the ILR transformed dataset.
Equation 32. FA10 for the ILR transformed dataset.
Equation 33. FA11 for the ILR transformed dataset.
Figures 47 through 50 show the spatial distribution of these factors with respect to the location of our ore body.
Conclusions and Recommendations on the Use of FA for the Transformed Datasets
As I mentioned earlier, for FA to be most useful, one needs to have a known target to calibrate it. The factor analysis applied to the CLR transformed data gave us three factors, but only one (FA5) was useful for targeting the ore body.
The factor analysis of the ALR transformed data (Factor 7) was good in general, but the best factors were obtained from the ILR transformed data, especially Factor 9, which not only gave the exact location of the ore body, but also its internal structure. Another efficient factor was FA11, but it definitely required calibration based on a known target.
So, answering the question posed earlier: yes, the factor analysis of the ILR transformed data is more effective than the factor analysis of the raw data as a tool for locating the ore deposit.
Dealing With Zero Values
Conclusions and Recommendations
The treatment of “closed” datasets by normal statistical methods does create spurious correlations that lower the effectiveness of the obtained results. While there are ways to minimize this problem, like processing major oxides independently from trace elements, or using only the strongest correlations in the composition of the RCCs, I believe that the transformation of the initial dataset presents a better solution for the processing and interpretation of geological data.
For the estimation of the most efficient RCC, I propose the use of the ALR transformation, although the CLR is also effective.
When a target for the testing of the effectiveness of our coefficient is available, then we should use the factor analysis preferentially. I recommend the use of the ILR transformation to define the most effective combination of factors.
Finally, I introduced here a method for dealing with zero values. This method’s main advantage is that we do not obtain a fixed value for the “Rounded Zeros”, but one that depends on the real value of the other variable. The proposed method depends on the geological characteristics of the data, and therefore is less biased or random than other methods. It also presents a viable alternative to amalgamation and an effective way to deal with “Essential Zeros” in a population.
The sequence of the method is as follows:
- We transform the data using CoDaPack or other similar software.
- We select the lower quartile of real data for the element with the below-detection-limit (b.d.l.) values.
- Within this dataset, we test the relationship between the element with the b.d.l. values and one (or more) elements without b.d.l. values. In most cases, these elements will correspond to a well-established geological relationship, such as between Pb and Zn in polymetallic deposits, between Au and Pb in hydrothermal deposits, or between Cu and Mo in porphyry deposits, as in the case I presented here.
- We establish the regression equation.
- We then substitute the b.d.l. values by those estimated with the obtained equation of regression.
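The sequence above can be sketched in code. This is a simplified illustration and not the worked example of this book: it assumes a single predictor element, a straight-line least-squares regression, b.d.l. values marked as `None`, and at least two detected values in the lower quartile. The function names and the numbers are hypothetical, and the data are assumed to be already on the scale on which the regression is to be fitted.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("predictor has no spread; cannot fit a line")
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def impute_bdl(target, predictor):
    """Replace None entries in `target` using a regression on `predictor`.

    The regression is fitted only on the lower quartile of the detected
    target values (at least two points), as in the sequence above.
    """
    detected = sorted((t, p) for t, p in zip(target, predictor) if t is not None)
    k = max(2, len(detected) // 4)  # size of the lower quartile
    low = detected[:k]
    intercept, slope = fit_line([p for _, p in low], [t for t, _ in low])
    return [t if t is not None else intercept + slope * p
            for t, p in zip(target, predictor)]

# Made-up values: the first target reading is below the detection limit.
target = [None, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0]
predictor = [0.5, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(impute_bdl(target, predictor))
```

Because each replaced value depends on the predictor measured in the same sample, the substituted values are not a single fixed number, which is the main advantage claimed for the method. In practice the strength of the relationship should be tested before the regression is trusted.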
We can apply this method to any type of data, provided we establish first their correlation dependency. I would also like to see this method included as an option for dealing with zeros in the next version of CoDaPack.
Used in this text
- Aitchison, J, 2003 (2nd ed.). The statistical analysis of compositional data. The Blackburn Press, Caldwell, NJ (USA). 435 p. ISBN 1-930665-78-4.
- Aitchison, J. and J.W. Kay, 2003. Possible solutions of some essential zero problems in compositional data analysis. Proceedings of Compositional Data Analysis Workshop, 2003.
- Bacon-Shone, J., 2005. Modeling structural zeros in compositional data. Proceedings of Compositional Data Analysis Workshop, 2003.
- Kashan, A.B., Guskov, O.I. and A.A. Shimonsky, 1979. Mathematical modeling in prospection (original in Russian). Moscow: Nedra.
- Trusova, I. F and V. I. Chernov. Petrography of magmatic and metamorphic rocks (original in Russian). Moscow: Nedra.
3rd Compositional Data Analysis Workshop
Bias arising from missing data in predictive models
Compositional Data Analysis and Zeros in Micro Data.
Compositional Time Series: Past and Present
Homepage of "compositions", an R package for compositional data analysis.
Modeling Zeroes in Microdata.
New Features of CoDaPack. An User-friendly Compositional Data Package
Tectonic discrimination of basalts with classification trees.
Santiago Thió Fernández de Henestrosa.
Simcluster: clustering enumeration gene expression data on the simplex space
Some numerical considerations in the geochemical analysis of distal microtephra
Statistical empirical index of chemical weathering in igneous rocks: A new tool for evaluating the degree of weathering
The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.
2. VERBATIM COPYING
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display copies.
3. COPYING IN QUANTITY
If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.
4. MODIFICATIONS
You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
- Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.
- List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.
- State on the Title page the name of the publisher of the Modified Version, as the publisher.
- Preserve all the copyright notices of the Document.
- Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.
- Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.
- Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document's license notice.
- Include an unaltered copy of this License.
- Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.
- Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.
- For any section Entitled "Acknowledgements" or "Dedications", Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.
- Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.
- Delete any section Entitled "Endorsements". Such a section may not be included in the Modified Version.
- Do not retitle any existing section to be Entitled "Endorsements" or to conflict in title with any Invariant Section.
- Preserve any Warranty Disclaimers.
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles.
You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties—for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
5. COMBINING DOCUMENTS
You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections Entitled "History" in the various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgements", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements".
6. COLLECTIONS OF DOCUMENTS
You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.
7. AGGREGATION WITH INDEPENDENT WORKS
A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation's users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document's Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.
8. TRANSLATION
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled "Acknowledgements", "Dedications", or "History", the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.
9. TERMINATION
You may not copy, modify, sublicense, or distribute the Document except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, or distribute it is void, and will automatically terminate your rights under this License.
However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.
Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, receipt of a copy of some or all of the same material does not give you any rights to use it.
10. FUTURE REVISIONS OF THIS LICENSE
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. If the Document specifies that a proxy can decide which future versions of this License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Document.
11. RELICENSING
"Massive Multiauthor Collaboration Site" (or "MMC Site") means any World Wide Web server that publishes copyrightable works and also provides prominent facilities for anybody to edit those works. A public wiki that anybody can edit is an example of such a server. A "Massive Multiauthor Collaboration" (or "MMC") contained in the site means any set of copyrightable works thus published on the MMC site.
"CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0 license published by Creative Commons Corporation, a not-for-profit corporation with a principal place of business in San Francisco, California, as well as future copyleft versions of that license published by that same organization.
"Incorporate" means to publish or republish a Document, in whole or in part, as part of another Document.
An MMC is "eligible for relicensing" if it is licensed under this License, and if all works that were first published under this License somewhere other than this MMC, and subsequently incorporated in whole or in part into the MMC, (1) had no cover texts or invariant sections, and (2) were thus incorporated prior to November 1, 2008.
The operator of an MMC Site may republish an MMC contained in the site under CC-BY-SA on the same site at any time before August 1, 2009, provided the MMC is eligible for relicensing.
ADDENDUM: How to use this License for your documents
To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:
    Copyright (c) YEAR YOUR NAME.
    Permission is granted to copy, distribute and/or modify this document
    under the terms of the GNU Free Documentation License, Version 1.3
    or any later version published by the Free Software Foundation;
    with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
    A copy of the license is included in the section entitled "GNU
    Free Documentation License".
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the "with...Texts." line with this:
    with the Invariant Sections being LIST THEIR TITLES, with the
    Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.