Capability Language Processing (CLP): Classification and Ranking of Manufacturing Suppliers Based on Unstructured Capability Data




Zandbiglari, Kimia

In the manufacturing industry, data is available in both structured and unstructured forms. Although unstructured data expressed in natural language text contains valuable information and knowledge, processing it effectively for information retrieval and knowledge extraction remains a challenge. Manufacturing capability data is an example of unstructured data widely used to describe the technological capabilities of manufacturing companies. The objective of this research is to use a set of text analytics techniques to enable automated classification and ranking of manufacturing companies based on the capability narratives available on their websites. For this purpose, a supervised classification method is used in conjunction with a semantic similarity measurement method. A formal thesaurus in Simple Knowledge Organization System (SKOS) format provides the structural and lexical semantics that support classification and ranking. To measure semantic similarity, an edge-based method is combined with the Normalized Google Distance (NGD) technique to create a weighted edge-based method for measuring the similarity between manufacturers’ capabilities and the capabilities queried by customers. The proposed framework is validated experimentally using a hypothetical search scenario. The results indicate that the generated ranked list is highly correlated with human judgment, especially when the query model and the supplier capability model belong to the same class. However, the correlation decreases when multiple overlapping classes of suppliers are mixed. The findings of this research can be used to improve the precision and reliability of Capability Language Processing (CLP) tools and methods and to enhance the intelligence of supplier discovery and capability mapping platforms.
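
For readers unfamiliar with Normalized Google Distance, the following is a minimal sketch of the standard NGD formula computed from search hit counts. It is not the thesis implementation: the function name, its parameters, and the example counts are hypothetical, and the abstract does not specify how NGD values are turned into edge weights in the weighted edge-based method.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Standard NGD from hit counts (illustrative sketch, not the thesis code).

    fx, fy : number of pages containing term x and term y, respectively
    fxy    : number of pages containing both terms
    n      : total number of indexed pages (must exceed fx and fy)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # A term that never occurs, or terms that never co-occur,
        # are treated as infinitely distant.
        return math.inf
    if n <= min(fx, fy):
        raise ValueError("n must be larger than the individual hit counts")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical counts: two capability terms that frequently co-occur
# should come out closer than two that rarely do.
close = normalized_google_distance(fx=5000, fy=4000, fxy=3000, n=10**9)
far = normalized_google_distance(fx=5000, fy=4000, fxy=10, n=10**9)
```

A smaller value indicates terms that tend to appear together; a value of 0 means the two terms always co-occur.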



Capability Language Processing (CLP), Capability modeling, Text mining, Document classification, Formal thesaurus, Semantic similarity


Zandbiglari, K. (2022). <i>Capability Language Processing (CLP): Classification and ranking of manufacturing suppliers based on unstructured capability data</i> (Unpublished thesis). Texas State University, San Marcos, Texas.


