[Data Mining] Classic Data Mining Algorithms (repost)
Classification
==============
#1. C4.5
Quinlan, J. R. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc.
#2. CART
L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Wadsworth, Belmont, CA, 1984.
#3. K Nearest Neighbours (kNN)
Hastie, T. and Tibshirani, R. 1996. Discriminant Adaptive Nearest Neighbor Classification. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI). 18, 6 (Jun. 1996), 607-616. DOI= http://dx.doi.org/10.1109/34.506411
#4. Naive Bayes
Hand, D.J., Yu, K., 2001. Idiot's Bayes: Not So Stupid After All? Internat. Statist. Rev. 69, 385-398.
Statistical Learning
====================
#5. SVM
Vapnik, V. N. 1995. The Nature of Statistical Learning Theory. Springer-Verlag New York, Inc.
#6. EM
McLachlan, G. and Peel, D. (2000). Finite Mixture Models. J. Wiley, New York.
Association Analysis
====================
#7. Apriori
Rakesh Agrawal and Ramakrishnan Srikant. Fast Algorithms for Mining Association Rules. In Proc. of the 20th Int'l Conference on Very Large Databases (VLDB '94), Santiago, Chile, September 1994. http://citeseer.comp.nus.edu.sg/agrawal94fast.html
#8. FP-Tree
Han, J., Pei, J., and Yin, Y. 2000. Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM SIGMOD international Conference on Management of Data (Dallas, Texas, United States, May 15 - 18, 2000). SIGMOD '00. ACM Press, New York, NY, 1-12. DOI= http://doi.acm.org/10.1145/342009.335372
Link Mining
===========
#9. PageRank
Brin, S. and Page, L. 1998. The anatomy of a large-scale hypertextual Web search engine. In Proceedings of the Seventh international Conference on World Wide Web (WWW-7) (Brisbane, Australia). P. H. Enslow and A. Ellis, Eds. Elsevier Science Publishers B. V., Amsterdam, The Netherlands, 107-117. DOI= http://dx.doi.org/10.1016/S0169-7552(98)00110-X
#10. HITS
Kleinberg, J. M. 1998. Authoritative sources in a hyperlinked environment. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (San Francisco, California, United States, January 25 - 27, 1998). Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, 668-677.
Clustering
==========
#11. K-Means
MacQueen, J. B., Some methods for classification and analysis of multivariate observations, in Proc. 5th Berkeley Symp. Mathematical Statistics and Probability, 1967, pp. 281-297.
#12. BIRCH
Zhang, T., Ramakrishnan, R., and Livny, M. 1996. BIRCH: an efficient data clustering method for very large databases. In Proceedings of the 1996 ACM SIGMOD international Conference on Management of Data (Montreal, Quebec, Canada, June 04 - 06, 1996). J. Widom, Ed. SIGMOD '96. ACM Press, New York, NY, 103-114. DOI= http://doi.acm.org/10.1145/233269.233324
Bagging and Boosting
====================
#13. AdaBoost
Freund, Y. and Schapire, R. E. 1997. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 1 (Aug. 1997), 119-139. DOI= http://dx.doi.org/10.1006/jcss.1997.1504
Sequential Patterns
===================
#14. GSP
Srikant, R. and Agrawal, R. 1996. Mining Sequential Patterns: Generalizations and Performance Improvements. In Proceedings of the 5th international Conference on Extending Database Technology: Advances in Database Technology (March 25 - 29, 1996). P. M. Apers, M. Bouzeghoub, and G. Gardarin, Eds. Lecture Notes In Computer Science, vol. 1057. Springer-Verlag, London, 3-17.
#15. PrefixSpan
J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal and M-C. Hsu. PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth. In Proceedings of the 17th international Conference on Data Engineering (April 02 - 06, 2001). ICDE '01. IEEE Computer Society, Washington, DC.
Integrated Mining
=================
#16. CBA
Liu, B., Hsu, W. and Ma, Y. M. Integrating classification and association rule mining. KDD-98, 1998, pp. 80-86. http://citeseer.comp.nus.edu.sg/liu98integrating.html
Rough Sets
==========
#17. Finding reduct
Zdzislaw Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, Norwell, MA, 1992
http://blogger.org.cn/blog/more.asp?name=DMman&id=30496
Graph Mining
============
#18. gSpan
Yan, X. and Han, J. 2002. gSpan: Graph-Based Substructure Pattern Mining. In Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM '02) (December 09 - 12, 2002). IEEE Computer Society, Washington, DC.