Semantic Text Analytics Technique for Classification of Manufacturing Suppliers




Sabbagh, Ramin




Most of the information available in the manufacturing industry is in an unstructured, natural language format. This unstructured data can contain useful information for decision makers across different phases of the product lifecycle. However, because of its unstructured nature, the information embedded in plain text is often difficult to use effectively. Manufacturing capability data is one type of data that is often presented in unstructured form on the websites of manufacturing firms. If manufacturing capability data is parsed, organized, and analyzed properly, it can be used for supplier evaluation and selection during the supply chain formation process. To arrive at an efficient method of capability analysis, it is important to identify the main characteristics of manufacturing capability. Its aspects include manufacturing processes, industry coverage, and engineering, organizational, and quality capabilities. Several methods can be used to extract information from text, and data mining is one of the most powerful methods currently used for knowledge extraction. This research presents a method for manufacturing capability analysis and modeling that applies different supervised and unsupervised text mining methods, using the unstructured text on suppliers' websites as the input. For supervised text mining, Naïve Bayes, KNN, SVM, and Random Forest are used as the classification techniques. The objective is to classify suppliers into pre-labeled classes based on the textual description of their capabilities. For unsupervised text mining, two popular methods, namely clustering and topic modeling, are used to split the diverse suppliers into several groups and then find the appropriate characterizations associated with each group.
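As a rough illustration of the supervised step described above, the following sketch trains a multinomial Naïve Bayes classifier on a few made-up capability descriptions. The example texts, the class labels ("machining", "fabrication"), the whitespace tokenizer, the add-one smoothing, and the function names are all assumptions chosen for illustration; none of them is taken from the thesis.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen during training
            # add-one (Laplace) smoothed log likelihood
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, invented supplier descriptions; real input would be website text.
docs = [
    "cnc milling and turning of aluminum parts",
    "precision turning and milling services",
    "sheet metal stamping and welding",
    "welding and laser cutting of sheet metal",
]
labels = ["machining", "machining", "fabrication", "fabrication"]

model = train_naive_bayes(docs, labels)
print(predict(model, "cnc turning of steel parts"))
print(predict(model, "laser cutting and welding"))
```

The same pre-labeled data could be passed to a KNN, SVM, or Random Forest classifier instead; Naïve Bayes is shown here only because it fits in a few lines without external libraries.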
The proposed methods are evaluated experimentally using real capability data collected from the webpages of manufacturers in the contract machining industry. Precision, recall, and F-measure are used as the metrics for evaluating the accuracy of the results.
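The three metrics named above can be computed from a classifier's predictions as in the minimal sketch below. It treats one class as the positive class and uses the balanced F-measure (F1). The function name, the label values, and the sample predictions are hypothetical and are not results from the thesis.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Invented labels for five suppliers (illustrative only).
y_true = ["machining", "machining", "fabrication", "fabrication", "machining"]
y_pred = ["machining", "fabrication", "fabrication", "machining", "machining"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="machining")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

For a multi-class problem, these per-class values would typically be averaged across classes; which averaging scheme the thesis uses is not stated in this abstract.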



Manufacturing, Supply chain, Text mining, Classification, Clustering, Topic modeling, Machine learning


Sabbagh, R. (2018). Semantic text analytics technique for classification of manufacturing suppliers (Unpublished thesis). Texas State University, San Marcos, Texas.

