Errata (2nd Edition)

This page lists known errors in the second edition of the book "Data Mining: Practical Machine Learning Tools and Techniques" by Ian H. Witten and Eibe Frank.

A list of errors in the first edition can be found here.

Table of Contents

Page xi
    Line 12 : The entries "Backpropagation 227" and "Radial basis function networks 234" are missing
    (reported by Lin Dong, October 2005)
Page xxvii
    Line 20 : "to" should be "too"
    (reported by Derrf Seitz, February 2007)

Chapter 1

Page 31
    Line -16 : "for weather" should be "for windy"
    (reported by Quan Qiu, October 2005)
Page 37
    Line -15 : "Cendrowska (1998)" should be "Cendrowska (1987)"
    (reported by Johannes Fürnkranz, December 2005)

Chapter 2

Page 51
    Line 2 : "example of an ordinal quantity" should be "example of a nominal quantity"
    (reported by André Coelho, November 2008)
Page 52
    Line -17 : "is often of" should be "is often"
    (reported by Michael C. Harris, July 2009)

Chapter 4

Page 93
    Lines 5 and 9 : the exponent is missing a minus sign before the fraction, i.e. the superscript of e should be preceded by a minus sign
    (reported by Lin Dong, January 2006)
Page 94
    Line 8 : "0.0221" should be "0.0279" and "0.000108" should be "0.000137"
    (reported by Stephen Jelfs, July 2007)
Page 94
    Line 10 : "0.000108" should be "0.000137" and "25.0%" should be "20.8%"
    (reported by Stephen Jelfs, July 2007)
Page 94
    Line 11 : "0.000108" should be "0.000137" in both cases and "75.0%" should be "79.2%"
    (reported by Stephen Jelfs, July 2007)
Page 100
    Figure 4.3 : the left rectangle in (c) should contain only one "yes"
    (reported by Derrf Seitz, February 2007)
Page 110
    Line -10 : "covers three" should be "covers two", and "two of which are" should be "one of which is"
    (reported by Derrf Seitz, February 2007)
Page 111
    Line -6 : "third line from" should be "line at"
    (reported by Lin Dong, September 2005)
Page 113
    Line -12 : "first row" should be "first rows", and "shows" should be "show"
    (reported by Lin Dong, September 2005)
Page 113
    Line -10 : "mild" should be "hot"
    (reported by Lin Dong, September 2005)
Page 115
    Line 7 : "4/12" should be "4/14"
    (reported by André Coelho, November 2008)
Page 117
    Line -6 : "are not candidate three-item sets" should be "are not three-item sets with minimum coverage"
    (reported by André Coelho, November 2008)
Page 121
    Line -7 : insert an opening square bracket "[" between "log" and "(", and a closing square bracket "]" between ")" and ".".
    (reported by André Coelho, November 2008)
Page 122
    Line -5 : "w_0 = 0.5 and w_1 = 1" should be "w_0 = -1.25 and w_1 = 0.5".
    (reported by André Coelho, November 2008)
Page 130
    Lines -5 and -6 : "left child is empty, and its right child" should be "right child is empty, and its left child"
    (reported by Quan Qiu, September 2005)
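
The three Page 94 corrections above are linked: once "0.000108" becomes "0.000137", the two percentages change with it. The following is a minimal arithmetic check, not taken from the book. It assumes that each percentage is one likelihood divided by the sum of both, and it recovers the unchanged "yes" likelihood from the originally printed pair (0.000108 and 25.0%). The variable names are illustrative only.

```python
# Values quoted in the Page 94 errata entries.
old_no = 0.000108      # printed likelihood, to be replaced
new_no = 0.000137      # corrected likelihood
old_yes_share = 0.25   # the printed "25.0%"

# Assumption: share_yes = yes / (yes + no). Solving for the "yes"
# likelihood, which the errata leave unchanged, from the old numbers:
yes = old_yes_share * old_no / (1 - old_yes_share)

# Renormalize with the corrected "no" likelihood.
new_yes_share = yes / (yes + new_no)
new_no_share = new_no / (yes + new_no)

print(f"yes likelihood: {yes:.6f}")    # 0.000036
print(f"yes: {new_yes_share:.1%}")     # 20.8%
print(f"no:  {new_no_share:.1%}")      # 79.2%
```

Under that assumption the corrected figures of 20.8% and 79.2% follow directly from the corrected likelihood.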

Chapter 5

Page 156
    Line -9 : the number "1" in the formula should be the letter "l"
    (reported by Lin Dong, December 2005)
Page 156
    Line -11 : the second subscript "1" in this line should be the letter "l", i.e. the second "y subscript 1" should be "y subscript l"
    (reported by Lin Dong, December 2005)
Page 167
    Line -10 : "lift factor of four" should be "lift factor of about 2.4"
    (reported by Lin Dong, September 2005)
Page 172
    Line 1 : "5%" should be "20%"
    (reported by Lin Dong, December 2005)
Page 175
    Line -5 : "C[+|-]" should be "C[-|+]" (in both cases) and "C[-|+]" should be "C[+|-]" (just one case)
    (reported by Lucila Ohno-Machado, June 2006)
Page 178
    Table 5.8 : the mean of the observed target values in the training data should be used in the relative absolute error and the root relative squared error; in contrast, in the correlation coefficient, the mean of the observed target values in the test data should be used.
    (reported by Petra Kralj Novak, February 2020)
Page 179
    Line 6 : "figure compensate" should be "figure compensates"
    (reported by Dror Baron, November 2006)
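
The Table 5.8 correction above concerns which mean enters each measure. The sketch below illustrates the distinction using the usual definitions of these measures, which are assumed here rather than copied from the table. The function names and the small data set are made up for illustration.

```python
import math

def relative_absolute_error(predicted, actual, train_mean):
    # Total absolute error, relative to always predicting the TRAINING mean.
    num = sum(abs(p - a) for p, a in zip(predicted, actual))
    den = sum(abs(a - train_mean) for a in actual)
    return num / den

def root_relative_squared_error(predicted, actual, train_mean):
    # Total squared error, relative to always predicting the TRAINING mean.
    num = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    den = sum((a - train_mean) ** 2 for a in actual)
    return math.sqrt(num / den)

def correlation_coefficient(predicted, actual):
    # Both means here are taken over the TEST data.
    n = len(actual)
    p_mean = sum(predicted) / n
    a_mean = sum(actual) / n
    s_pa = sum((p - p_mean) * (a - a_mean) for p, a in zip(predicted, actual))
    s_p = sum((p - p_mean) ** 2 for p in predicted)
    s_a = sum((a - a_mean) ** 2 for a in actual)
    return s_pa / math.sqrt(s_p * s_a)

# Made-up numbers, for illustration only.
train_targets = [1.0, 2.0, 3.0, 4.0]
train_mean = sum(train_targets) / len(train_targets)
actual = [2.0, 4.0, 6.0]       # observed test targets
predicted = [2.5, 3.5, 6.5]    # predictions on the test set

rae = relative_absolute_error(predicted, actual, train_mean)
rrse = root_relative_squared_error(predicted, actual, train_mean)
corr = correlation_coefficient(predicted, actual)
print(rae, rrse, corr)
```

With the training mean in the denominators, a predictor that always outputs the training mean scores exactly 1 on both relative measures, which is what makes it a natural baseline.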

Chapter 6

Page 204
    Line 11 : "33.0" should be "33.3"
    (reported by Dror Baron, November 2006)
Page 207 [Fixed in 2nd and later print runs]
    Line -8 : "building it incrementally by adding conjunctions" should be "pruning a rule incrementally by removing conjunctions"
    (reported by Lin Dong, July 2005)
Page 208 [Fixed in 2nd and later print runs]
    Lines 19-20 : "However, if during backtracking a node is encountered all of whose children are not leaves" should be "However, if during backtracking a node is encountered not all of whose children expanded so far are leaves"
    (reported by Lin Dong, July 2005)
Page 215 [Fixed in 2nd and later print runs]
    Line 20 : "eight weights" should be "ten weights"
    (reported by Lin Dong, May 2005)
Page 225 [Fixed in 2nd and later print runs]
    Figure 6.10 (e) : The weight for the connection from attribute a1 to node B should be "-1" instead of "1", and the weight for the connection from attribute a2 to node A should be "1" instead of "-1".
    (reported by Lin Dong, May 2005)
Page 225
    Figure 6.10 (h) : The weight for the connection from the bias node should be "0.5" instead of "-0.5", and the name of the attribute in the other node should be "a_1" instead of "a_i".
Page 226 [Fixed in 2nd and later print runs]
    Line 14 : "a3" should be "a2".
    (reported by Lin Dong, May 2005)
Page 230
    Lines 14 and -6 : "y - f(x)" should be "f(x) - y".
Page 231
    Lines -6 and -3 : "y - f(x)" should be "f(x) - y".
Page 232
    Line 4 : "a_i" should be "a_j".
    (reported by Olivier Pauplin, October 2010)
Page 232
    Line 7 : "y - f(x)" should be "f(x) - y".
Page 234
    Lines -17 and -18 : ", without looking at the class labels of the training instances at all" should be deleted.
    (reported by E. Jiang, June 2006)
Page 239
    Line -11 : "exemplars is necessary" should be "exemplars it is necessary"
    (reported by Dror Baron, November 2006)
Page 254
    Line -12 : delete "using cross-validation".
Page 254
    Line -11 : delete "and cross-validation makes it even slower".
Page 254
    Line -6 : delete ", or use cross-validation".
Page 257
    Line 7 : "differs from e" should be "differs from f"
    (reported by Dror Baron, November 2006)
Page 261
    Line -5 : the exponent is missing a minus sign before the fraction, i.e. change "exp(" to "exp(-"
    (reported by Lin Dong, January 2006)
Page 262
    Line 19 : "minimum" should be "maximum".
    (reported by Lin Dong, July 2005)
Page 262 [Fixed in 2nd and later print runs]
    Line 20 : "minimum" should be "maximum".
    (reported by Lin Dong, July 2005)
Page 264
    Line -7 : the exponent is missing a minus sign before the fraction, i.e. the superscript of e should be preceded by a minus sign
    (reported by Jennifer Rihn, December 2005)
Page 265
    Lines -10 and -11 : "101" should be "263"
    (reported by Lin Dong, September 2005)
Page 275
    Line -17 : "any other ancestors" should be "any other set of non-descendants"
    (reported by John MacCormick, September 2007)
Page 275
    Line -16 : "In other words, ancestors" should be "In other words, other sets of non-descendants"
    (reported by John MacCormick, September 2007)
Page 275
    Line -13 : "Pr[node|ancestors]" should be "Pr[node|parents plus any other set of non-descendants]"
    (reported by John MacCormick, September 2007)
Page 275
    Line -9 : "and so on" should be "and any other set of non-descendants"
    (reported by John MacCormick, September 2007)
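
The sign corrections on pages 230 to 232 above ("y - f(x)" should be "f(x) - y") can be sanity-checked numerically. The sketch below is not the book's derivation. It assumes a single sigmoid unit and a squared-error loss of the form 0.5 * (y - f(x))^2, and it compares the gradient written with "f(x) - y" against a finite-difference estimate. All names and numbers are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def squared_error(weights, inputs, y):
    # Assumed loss: E = 0.5 * (y - f(x))^2, with x the weighted sum of inputs.
    x = sum(w * a for w, a in zip(weights, inputs))
    return 0.5 * (y - sigmoid(x)) ** 2

def analytic_gradient(weights, inputs, y):
    # dE/dw_i = (f(x) - y) * f'(x) * a_i, where f'(x) = f(x) * (1 - f(x)).
    x = sum(w * a for w, a in zip(weights, inputs))
    f = sigmoid(x)
    return [(f - y) * f * (1.0 - f) * a for a in inputs]

def numeric_gradient(weights, inputs, y, eps=1e-6):
    # Central finite differences, one weight at a time.
    grads = []
    for i in range(len(weights)):
        up = list(weights)
        down = list(weights)
        up[i] += eps
        down[i] -= eps
        diff = squared_error(up, inputs, y) - squared_error(down, inputs, y)
        grads.append(diff / (2.0 * eps))
    return grads

# Made-up weights, inputs, and target.
weights = [0.3, -0.7, 0.1]
inputs = [1.0, 2.0, -1.5]
y = 1.0

analytic = analytic_gradient(weights, inputs, y)
numeric = numeric_gradient(weights, inputs, y)
print(analytic)
print(numeric)
```

Under these assumptions the two gradients agree closely, whereas writing the factor as "y - f(x)" would flip the sign of every component and turn a descent step into an ascent step.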

Chapter 7

Page 291
    Line 7 : "highest-ranked one" should be "lowest-ranked one"
    (reported by Mark Hall, October 2010)
Page 293
    Figure 7.1 : The fourth node in the third row (outlook windy) should have an incoming connection from the first node in the second row (outlook). Also, the second node in the fourth row should be (outlook temperature windy) rather than (outlook temperature humidity).
    (reported by Carlos Bustamante, November 2007)
Page 297 [Fixed in 2nd and later print runs]
    Lines -17 and -18 : " and true otherwise" should be deleted and "are set to false" should be "are set to true"
    (reported by Lin Dong, August 2005)
Page 334
    Line -19 : "(page 123)" should be "(page 121)"
    (reported by Lijun Zhang, June 2008)

Chapter 10

Page 384
    Line -7 : " has been loaded" should be ", with an additional attribute indicating the vendor, has been loaded"
    (reported by Dror Baron, November 2006)
Page 392
    Line -8 : "mumber" should be "number"
    (reported by Milan Simonovic, April 2008)
Page 410
    Line -12 : "gaurd" should be "guard"
    (reported by Zhu Xiaofei, November 2005)

Chapter 11

    Figure 11.1 : The "DataSources" tab should be the rightmost tab in the figure.
    (reported by Dror Baron, November 2006)

Chapter 12

    Figures 12.1-12.3 : "Analyse" should be "Analyze"
    (reported by Dror Baron, November 2006)

Chapter 14

Page 466 [Fixed in 2nd and later print runs]
    Line -12 : "'o'" should be "'t'"
    (reported by Peter Reutemann, September 2005)

References

Page 488
    Line 4 : "1998" should be "1987"
    (reported by Johannes Fürnkranz, December 2005)
Page 492
    Lines -8, -10, and -12 : "krantz" should be "kranz"
    (reported by Johannes Fürnkranz, December 2005)