creators_name: Turney, Peter
creators_id: 2175
type: confpaper
datestamp: 2003-04-16
lastmod: 2011-03-11 08:55:15
metadata_visibility: show
title: Data Engineering for the Analysis of Semiconductor Manufacturing Data
ispublished: pub
subjects: comp-sci-mach-learn
subjects: comp-sci-art-intel
full_text_status: public
abstract: We have analyzed manufacturing data from several different semiconductor manufacturing plants, using decision tree induction software called Q-YIELD. The software generates rules for predicting when a given product should be rejected. The rules are intended to help the process engineers improve the yield of the product, by helping them to discover the causes of rejection. Experience with Q-YIELD has taught us the importance of data engineering -- preprocessing the data to enable or facilitate decision tree induction. This paper discusses some of the data engineering problems we have encountered with semiconductor manufacturing data. The paper deals with two broad classes of problems: engineering the features in a feature vector representation and engineering the definition of the target concept (the classes). Manufacturing process data present special problems for feature engineering, since the data have multiple levels of granularity (detail, resolution). Engineering the target concept is important, due to our focus on understanding the past, as opposed to the more common focus in machine learning on predicting the future.
date: 1995
date_type: published
pagerange: 50-59
refereed: TRUE
referencetext: Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. California: Wadsworth.
Famili, A. and Turney, P.D. (1991), Intelligently helping the human planner in industrial process planning, Artificial Intelligence for Engineering Design, Analysis, and Manufacturing, Vol. 5, No. 2, pp. 109-124.
Famili, A. and Turney, P.D. (1992), Application of machine learning to industrial planning and decision making, in Artificial Intelligence Applications in Manufacturing, edited by A. Famili, S. Kim, and D. Nau, MIT Press, Cambridge, MA, pp. 1-16.
Lavrac, N., & Dzeroski, S. (1994). Inductive Logic Programming: Techniques and Applications. New York: Ellis Horwood.
Van Zant, P. (1986). Microchip Fabrication: A Practical Guide to Semiconductor Processing. California: Semiconductor Services.
citation: Turney, Peter (1995) Data Engineering for the Analysis of Semiconductor Manufacturing Data. [Conference Paper]
document_url: http://cogprints.org/2891/1/NRC-39163.pdf