Fuzzy Association Rule Mining for the Analysis of Historical Process Data
Keywords:
fuzzy logic, classification, association rules, knowledge discovery, polymerization
Abstract
Process data collected during the operation of complex production processes can be used for system identification, process monitoring and optimization. This work presents a new algorithm that extracts useful knowledge from such data in the form of association rules. Association rule mining finds interesting association or correlation relationships among a large set of data items. The large (frequent) itemsets can be related to the frequent events of a process, which is useful for detecting unknown relationships among the process variables, reducing the models of the system, estimating the product quality and building a classifier. The proposed method is based on the Apriori algorithm, but its main idea is to incorporate fuzziness: fuzzy logic increases the interpretability of the model and its tolerance to measurement noise and uncertainty. The general applicability and efficiency of the developed tool are shown by an application study, a general example of the feature (input) selection problem, and the analysis of polymerization process data. Moreover, the proposed classifier is applied to three commonly used classification problems.
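To make the idea of fuzzy association rules more concrete, the following minimal sketch computes a fuzzy support and confidence for one rule. It is not the algorithm of this paper: the triangular membership functions, the product as the conjunction operator, the variable names and the sample values are all assumptions chosen for illustration only.

```python
import numpy as np

def triangular(x, a, b, c):
    """Triangular membership function with feet a, c and peak b (assumed shape)."""
    x = np.asarray(x, dtype=float)
    left = (x - a) / (b - a)
    right = (c - x) / (c - b)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

def fuzzy_support(memberships):
    """Mean over samples of the product of the items' membership degrees."""
    return float(np.mean(np.prod(memberships, axis=0)))

def fuzzy_confidence(antecedent, consequent):
    """Support of antecedent and consequent together, divided by antecedent support."""
    antecedent_support = fuzzy_support(antecedent)
    if antecedent_support == 0.0:
        return 0.0  # the rule never fires on this data
    return fuzzy_support(antecedent + consequent) / antecedent_support

# Hypothetical process measurements (illustrative values, not from the paper).
temperature = [60.0, 70.0, 80.0, 90.0]
viscosity = [1.0, 2.0, 3.0, 4.0]

# Fuzzy items "temperature is high" and "viscosity is high".
temp_high = triangular(temperature, 70.0, 90.0, 110.0)
visc_high = triangular(viscosity, 2.0, 4.0, 6.0)

support = fuzzy_support([temp_high, visc_high])
confidence = fuzzy_confidence([temp_high], [visc_high])
print(f"support={support:.4f}, confidence={confidence:.4f}")
```

In an Apriori-style search, itemsets whose fuzzy support falls below a user-chosen threshold would be pruned before longer itemsets are generated; the threshold and the pruning loop are omitted here.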
License
Copyright (c) 2006 Pach Ferenc Péter, Gyenesei Attila, Németh Sándor, Árva Péter, Abonyi János

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

