Fuzzy Association Rule Mining for the Analysis of Historical Process Data

Ferenc Péter Pach; Attila Gyenesei; Sándor Németh; Péter Árva; János Abonyi

Authors

Ferenc Péter Pach, University of Veszprém, Department of Process Engineering, Veszprém, Hungary
Attila Gyenesei, Department of Knowledge and Data Analysis, Unilever Research Vlaardingen, The Netherlands
Sándor Németh, University of Veszprém, Department of Process Engineering, Veszprém, Hungary
Péter Árva, University of Veszprém, Department of Process Engineering, Veszprém, Hungary
János Abonyi, University of Veszprém, Department of Process Engineering, Veszprém, Hungary

Keywords:

fuzzy logic, classification, association rules, knowledge discovery, polymerization

Abstract

Process data collected during the operation of complex production processes can be used for system identification, process monitoring and optimization. This work presents a new algorithm that extracts useful knowledge from such data in the form of association rules. Association rule mining finds interesting association or correlation relationships among a large set of data items. The large (frequent) itemsets can be related to the frequent events of a process, which is useful for detecting unknown relationships among the process variables, reducing the models of the system, estimating the product quality and building a classifier. The proposed method is based on the Apriori algorithm, but its main idea is to incorporate fuzziness: fuzzy logic increases the interpretability of the model and its tolerance to measurement noise and uncertainty. The general applicability and efficiency of the developed tool are shown by an application study consisting of a general example of the feature (input) selection problem and the analysis of data from a polymerization process. Moreover, the proposed classifier is applied to three commonly used classification problems.
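The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of Apriori-style mining over fuzzy items, not the authors' implementation. It assumes triangular membership functions, the product as the fuzzy "and", and support defined as the mean membership degree over all records. The variable names (`temperature`, `quality`), the two-set partition (`low`, `high`) and the threshold of 0.3 are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def triangular(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(record, partitions):
    """Map a crisp record to membership degrees of (variable, label) items."""
    return {
        (var, label): triangular(value, *abc)
        for var, value in record.items()
        for label, abc in partitions[var].items()
    }


def fuzzy_support(fuzzy_records, itemset):
    """Mean over records of the product of item memberships (assumed t-norm)."""
    total = 0.0
    for rec in fuzzy_records:
        degree = 1.0
        for item in itemset:
            degree *= rec[item]
        total += degree
    return total / len(fuzzy_records)


def fuzzy_confidence(fuzzy_records, antecedent, consequent):
    """Confidence of antecedent -> consequent as a ratio of fuzzy supports."""
    return (fuzzy_support(fuzzy_records, antecedent + consequent)
            / fuzzy_support(fuzzy_records, antecedent))


def frequent_itemsets(fuzzy_records, min_support, max_size=2):
    """Level-wise (Apriori-style) search for frequent fuzzy itemsets."""
    level = [(item,) for item in sorted(fuzzy_records[0])]
    frequent = {}
    for size in range(1, max_size + 1):
        current = {}
        for cand in level:
            support = fuzzy_support(fuzzy_records, cand)
            if support >= min_support:
                current[cand] = support
        frequent.update(current)
        # Join step: combine surviving items, at most one fuzzy set per
        # variable, and keep a candidate only if all its subsets are frequent.
        survivors = sorted({item for cand in current for item in cand})
        level = [
            cand for cand in combinations(survivors, size + 1)
            if len({var for var, _ in cand}) == size + 1
            and all(sub in current for sub in combinations(cand, size))
        ]
    return frequent


# Hypothetical, normalized process records (illustration only).
records = [
    {"temperature": 0.9, "quality": 0.8},
    {"temperature": 1.0, "quality": 1.0},
    {"temperature": 0.1, "quality": 0.2},
    {"temperature": 0.0, "quality": 0.0},
]
two_sets = {"low": (-1.0, 0.0, 1.0), "high": (0.0, 1.0, 2.0)}
partitions = {"temperature": two_sets, "quality": two_sets}

fuzzy_records = [fuzzify(r, partitions) for r in records]
frequent = frequent_itemsets(fuzzy_records, min_support=0.3)
rule_confidence = fuzzy_confidence(
    fuzzy_records, (("temperature", "high"),), (("quality", "high"),)
)
```

On this toy data the rule "temperature is high, therefore quality is high" has a confidence of roughly 0.87. Because each record contributes a graded degree rather than a hard 0 or 1, a small amount of measurement noise changes the support only slightly, which is the tolerance to noise the abstract refers to.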

Author Biography

János Abonyi, University of Veszprém, Department of Process Engineering, Veszprém, Hungary

corresponding author
abonyij@fmt.uni-pannon.hu

