A Computational Approach to Identifying Engineering Design Problems

Chijioke C. Obieke, first author
Systems Realization Laboratory, Division of Industrial Design, School of Engineering
University of Liverpool,
Brownlow Hill, Liverpool L69 3GH, United Kingdom
chijioke.obieke@liverpool.ac.uk

Jelena Milisavljevic-Syed, second author
Systems Realization Laboratory, Division of Industrial Design, School of Engineering
University of Liverpool,
Brownlow Hill, Liverpool L69 3GH, United Kingdom
j.milisavljevic-syed@liverpool.ac.uk
Sustainable Manufacturing Systems Centre
School of Aerospace, Transportation and Manufacturing (SATM)
Cranfield University, Cranfield, Bedfordshire MK43 0AL, United Kingdom
j.milisavljevicsyed@cranfield.ac.uk
ASME Membership No. 000102058626

Arlindo Silva, third author
Singapore University of Technology and Design
8 Somapah Rd, Singapore 487372
arlindo_silva@sutd.edu.sg
ASME Membership No. 102035333

Ji Han, fourth author
INDEX, Business School,
University of Exeter
Tintagel House, 92 Albert Embankment, London SE1 7TY, United Kingdom
j.han2@exeter.ac.uk


INTRODUCTION
One of the main tasks of a design engineer at early-stage product design and development is to provide an Engineering Design Solution (EDS) to a societal problem using personal knowledge, experience, and background [1,2]. Another main task is identifying or conceptualising a new Engineering Design Problem (EDP) [3,4]. The EDP [...] duplication recognition, and Python programming language. A case study is conducted in three parts with a sample size of 43 participants drawn worldwide from the engineering design community in academia and industry. In the first part of the case study, the aim is to obtain empirical evidence on the lack of focus on EDPE within the engineering design community. The aim of the second part of the case study is to test how closely a computationally framed EDP matches one framed naturally by a human. In the last part of the case study, the value of the computational EDPE support tool, Pro-Explora V1 (Pro-Explora), presented for the first time in this paper, is evaluated.

The following section presents the natural EDPE process, including possible determinants of its scarce practice. Section 3 is on the methodology used in this paper to come up with a data-driven computational EDPE framework. Section 4 is about the case study data collection methods. Qualitative and quantitative results of the case study are presented in Section 5 and discussed in Section 6. The paper is concluded in Section 7.

2 LITERATURE REVIEW

Models and phenomena related to the natural EDPE approach
The EDPE process is characterized by divergent thinking and decision-making for a new EDP. The "rational" model and "garbage can" model relate to the natural EDPE process. The "rational" model is a formal model of science which postulates that careful analyses of previous problems and theories underpin the discovery of a new problem [28]. It holds that a new problem is identified progressively or logically based on gaps in previous problems and theories. The "garbage can" model postulates that a new problem emerges stochastically rather than logically [29,30]. It holds that a new problem comes up from a stochastic synthesis of previous problems that may not be related. A new problem based on the "garbage can" model is considered more creative than one based on the "rational" model [31]. The "garbage can" model relates to connectionism, a cognitive science concept that likens the connections in computer Artificial Neural Networks (ANN) to natural cognitive ability [32]. The computer ANN contains stochastic and complex interconnected nodes that distribute information for ML.

The "rational" and "garbage can" models describe the natural process through which new opportunities, ideas, or concepts are produced. Specifically, they are used to describe the process of coming up with research topics or titles. For example, Alter and Dennis [28] state that: "As faculty, we tend to teach our students a formal "rational model" of science in which research activity is driven by a solid understanding of prior work. Under this approach, research topics emerge from a careful analysis of prior research and theory." Also, project advice to students is to begin the "search for a suitable problem as soon as possible" [33]. In discussing where a new project is found and how a project is identified, selected, developed, and refined, Dennis and Valacich [29] state that the garbage can model is "a more useful model of how research projects are typically developed" where "the key elements of the project are thrown together into a garbage can, mixed together, and out comes the project." Also, Martin [30] states that an organization looking for a problem could be imagined as a garbage can where, as "members of the organization generate problems and solutions, they dump these into the garbage can" from which a new problem emerges.

Serendipity is a cognitive phenomenon related to EDPE or discovering something new and valuable by chance [34,35]. It is described as one of the mechanisms of innovation [36], and "connotes the profound ability of finding out valuable things different from those who have been exploring by spending a lot of time or for years" [37]. Serendipity occurs when observation by a design engineer triggers unexplored possibilities. Usually, this is based on coincidence with the design engineer's interest, passion, experience, knowledge, cultural background, and so on. It is reported that the "ground effect" in aircraft is a serendipitous discovery [38]. Apophenia is another cognitive phenomenon related to the discovery of a new EDP. It is a natural tendency to see or make meaningful, valuable, invisible connections between unrelated or random data [39]. Apophenia is related to the "garbage can" model and ANN. It could lead to an "invention: creating new, previously unimaginable meanings through accident" [40]. EDPE is considered challenging, and findings on some determinants are presented next.

2.2 Challenges with the natural EDPE

Memory limitation in the natural EDPE process
Studies show that the average amount of information the short-term memory can retain and process when exposed to a new concept is 7 ± 2 [41,42]. The number approaches the minimum with an increased number of syllables in a word during processing of a sequence of words [43]. EDPE, as a cognitive activity, inherently involves processing word sequences, as the "rational" and "garbage can" models suggest. Hence, despite the complexity of the brain, its information-processing capacity is limited [...]

[...] [56,57]. BERT, like LSTM, uses long-term dependencies (depending on previous states) in its network to predict the next state in a sequence. For example, if the state represented with the ellipsis is missing from the sequence "Engineering design is a noble…", the contextual word embeddings of BERT can be used to predict (natural language inference, NLI) the next or missing state (masked word) as "profession". The next state is inferred relative to the previous states. BERT can also combine the context of both previous and next states (bidirectional) to predict a state in a sequence. This could be described as forward and backward determinism.

The persistence of previous states in BERT [...]

Theoretical framework for computational EDPE
Concerning RQ2, a theoretical data-driven computational framework, shown in Fig. 1, is presented to support EDPE as a challenging activity. The framework is based on the information presented in Table 1 on the natural EDPE approach and its computational equivalence. The natural EDPE approach in Table 1 is based on findings discussed in Section 2 and supported by most intellectual property offices [68].

2) Make an automated search for prior existence in relevant databases using duplication recognition.

The computational EDPE framework shown in Fig. 1 comprises System A and System B. The input to System A is a collection or corpus of engineering design project titles extracted online from Compendex, Scopus, journals and conferences databases, and findaphd.com using the Python data extraction tool "Scrapy". The output from System A is processed data which feeds System B to produce a new EDP as output. The project titles used as input to System A are important lines of words that represent EDPs in engineering design projects [69]. A "project title should provide information about the topic being studied, and may consist of the actual problem statement" [70]. According to Martin [30], "the researcher should critically review the literature on a given topic in order to find an important issue which previous research has failed to resolve successfully". This 'important issue' is usually formulated as a title: the problem solved, being solved, or yet to be solved. Hence, the extracted titles in this paper are previous EDPs. As an example, in the project title "Design of an Automatic Sprinkler Fire Fighting System", the EDP of fighting fire using an automatic sprinkler system is described. The underlying EDP is: "What is a better way, among the existing alternatives, to fight fire?" The design project is conceived as a solution to fire fighting. It is believed that such a solution was not available when the project was conceived. This makes the project an EDP that needs to be addressed because there is a potential benefit in doing so. The EDP addressed in the project is described with the title. Although the title appears as an EDS, it is an EDP if that EDS is unavailable or yet to be realized. It remains an EDP until it is solved. Hence, computationally exploring and identifying a new EDP could lead to an invention or innovation.
The increasing volume of titles, continuously collected for EDPE, could be regarded as structured big data of previous EDPs. This is the concept of the 'Big data' indicated as input to the model in Fig. 1. To mimic the natural EDPE, the framework in Fig. 1 is created to use or learn from only the natural EDPs to come up with a computational equivalent. The output from System B (which contains the MM in Fig. 2) is a unique EDP distinct from the input. The preprocessing and processing of the corpus in System A for input in System B to produce a unique EDP are presented next.

Preprocessing of corpus for EDPE
The corpus extracted online is preprocessed in System A (Fig. 1) using NLP and ML.
The corpus is first prepared in "tab-separated value" format, with each line in the corpus ending with a period. On inspection, some of the extracted titles in the corpus appear too vague to describe an EDP. This necessitates an ML classification model to flag subsequently extracted titles that do not describe an EDP (non-EDP). The extracted corpus is manually separated by inspection into a dataset of EDP and non-EDP. This is to enable the training of the algorithm for the classification model using supervised ML, an aspect of AI that provides computer systems with the ability to learn from data. The dataset size is 2133 (comprising 1833 EDP and 300 non-EDP), and a 20% test size is used for the ML. The training requires that the dataset is 'cleaned' and 'tokenized' as part of NLP [71, 72]. The 'cleaning' requires the removal of regular-expression metacharacters and other special characters, such as "?", "@", and "$", from the extracted texts. It also requires the removal of stopwords from the dataset, such as "a", "for", and "the", which are insignificant in NLP. Different approaches are tried during the training, including RandomizedSearchCV (the randomized hyperparameter search in Scikit-learn), Naïve Bayes (Gaussian and Multinomial), and Random Forest. These are part of Scikit-learn, a library in Python that provides many unsupervised and supervised learning algorithms. Trying different algorithms to select the best based on performance is a common practice in ML. Two Scikit-learn performance evaluation functions, classification report and confusion matrix [73], are used during the training to evaluate the performance of the algorithms in the classification model. For RandomizedSearchCV, the accuracy calculated using the confusion matrix is 93%. The classification report shows the precision, recall, and F1-score metrics as 94%, 93%, and 93%, respectively.
These metrics suggest that only a few EDPs are wrongly classified as non-EDPs and vice versa.

[...] Markov model (HMM). It has hidden and physically observable states (emissions) [74,75]. What constitutes the hidden states and emissions in this paper is explained in Section 3.3. MC is used as an HMM in many real-life problems, such as handwriting recognition, machine maintenance, and weather forecasting. This is because MC alone does not fully represent the intent in many real-life problems [76]. In this paper, as shown in Fig. 2 [...]

In Eq. (1), as used in this paper, the output $S_{n+1}$ is a function of two arguments, $s_n$ and $\lambda$, such that $s_n \in S_n$ and $\lambda \in \Lambda$. The function $f(s_n, \cdot)$ is a random variable ($S_{n+1}$) for each $s_n \in S_n$, while for each $\lambda \in \Lambda$, $f(\cdot, \lambda)$ is a hidden function between $S_n$ and $S_{n+1}$. This [...]

[...] Fig. 1 is deployed to produce a computational EDPE tool discussed next.
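The Markov property behind Eq. (1), where the next state depends only on the current state and a random input, can be illustrated with a first-order word chain. The sketch below is a minimal illustration under stated assumptions, not the Pro-Explora implementation. It follows the dictionary construction described in the Pro-Explora section: each word is keyed to the list of words that immediately follow it, the first word is drawn from the first words of the corpus titles, and outputs are limited to 6 to 12 words and checked against the corpus for duplicates. The names `build_chain` and `frame_edp`, the `max_tries` parameter, and the choice to end a walk at a word with no successor are assumptions made for this example.

```python
import random
from collections import defaultdict


def build_chain(titles):
    """Return (first_words, successors) for a list of title strings."""
    first_words = []                # candidate initial states (epoch 0)
    successors = defaultdict(list)  # word -> words seen immediately after it
    for title in titles:
        words = title.rstrip(".").split()
        if not words:
            continue
        first_words.append(words[0])
        for current, following in zip(words, words[1:]):
            successors[current].append(following)
    return first_words, successors


def frame_edp(titles, rng, min_words=6, max_words=12, max_tries=1000):
    """Random-walk the chain until a new sentence of 6 to 12 words appears."""
    first_words, successors = build_chain(titles)
    known = {t.rstrip(".").lower() for t in titles}  # for duplicate checking
    for _ in range(max_tries):
        word = rng.choice(first_words)
        walk = [word]
        # The next word depends only on the current word (Markov property).
        while word in successors and len(walk) < max_words:
            word = rng.choice(successors[word])
            walk.append(word)
        sentence = " ".join(walk)
        if min_words <= len(walk) <= max_words and sentence.lower() not in known:
            return sentence + "."
    return None  # no acceptable sentence found within max_tries


# Two example titles taken from the paper's own illustration.
corpus = [
    "Design of a mechanical intrusive force detection device.",
    "A design of an automatic bottle opener.",
]
print(frame_edp(corpus, random.Random(0)))
```

With this two-title corpus the key "of" maps to "a" and "an", so a walk can cross from one title into the other at that word. A real corpus would offer many more crossing points, which is where the stochastic synthesis of unrelated prior problems comes from.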

Pro-Explora: a computational support tool for EDPE
Pro-Explora is a computational support tool for EDPE. In Pro-Explora, the theoretical EDP sequencing model in System B (Fig. 1) is realized by processing the input corpus in System B as a Python dictionary data structure. The corpus is split into single words with each word as the dictionary key. The value list of each key contains all words that come immediately after the key in all occurrences of the key in the corpus. To closely mimic a natural EDP, the initial word/emission at epoch 0 in Fig. 2 is randomly selected from the list of hidden states S10, S20, …, Sm0. The hidden states S10, S20, …, Sm0 comprise the first words of each EDP in the extracted corpus. For example, S10 will be "Design" and S20 will be "A" for a corpus that contains the two EDPs ["Design of a mechanical intrusive force detection device.", "A design of an automatic bottle opener."]. The dictionary key "of" in the corpus will have the values "a" and "an". After the initial emission, the rest of the emissions are constrained to be randomly chosen from the mutually exclusive hidden states at epochs 1, 2, 3, 4, 5, …, n based on the Markov property. These observable states in [...]

The number of words (n) in the new EDP from Pro-Explora is constrained to a minimum of 6 and a maximum of 12 (6 ≤ n ≤ 12). This is based on findings from studies and the result of the EDP word-count analysis on the extracted corpus as shown in Fig. 3. [...] This range is significant in this paper and correlates with the scholarly suggestions that a maximum of 12 words should be used to describe an EDP to inspire thoughts and attract attention [80][81][82][83]. Also, as discussed in Section 2.3.1, the limit of words the brain can process at once is between 5 and 9. Hence, the word count limit of 6 to 12 for the Pro-Explora output is considered appropriate. Using Python code, it is checked that the output from System B (Fig. 1), ending with a period, satisfies the word count limit.

[...] Pro-Explora GUI in Fig. 4. The GUI has two settings that contain some options that could be selected based on preference before EDPE, as explained next.

The Pro-Explora GUI, as shown in Fig. 4, has the "Select explore domain" and "Select number of problems to explore" settings. The "Select explore domain" setting has four domain options: "Engineering design product", "Engineering design research", "Engineering design machine intelligence", and "Engineering design cross-domain". The [...]

As shown in Table 2 [...]

This part of the case study addresses RQ1. The aim is to subtly test the awareness of EDPE practice within the engineering design community. As previously mentioned, creativity correlates strongly with EDPE and EDPS. The participants are given the questions in Table 3 as an online questionnaire to respond to. In Table 3 [...]

[...] Table 4 are presented to each participant. However, each set contains a randomly arranged 5 EDPs framed by a design engineer (naturally framed) and 15 EDPs framed using Pro-Explora (computationally framed). The ratio (1:3) of the EDPs is intentionally not disclosed to the participants. Since the participants are unaware of this ratio, it helps to eliminate bias in their judgements. To let the reader guess, the categories of the EDPs, "naturally" or "computationally" framed, are not indicated here but in Section 5.4.

[...]
4. Designing an interactive interface for collaborative engineering design.
5. A design of an automatic bottle opener.
6. Towards intelligent emotion detection system for video traffic surveillance.
7. Ai-based learning models for video traffic surveillance.
8. Design and material properties to minimize biofilm deposits.
9. Design of human-powered hybrid electric-power shovel for the physically challenged.
10. Design of self-reconfigurable production equipment during operation.
11. Anti riot drone without traffic lights.
12. Investigation of anomaly detection in a critical materials.
13. Design of a self-timing solar seawater desalination machine.
14. Staging co-design for reverse modeling of product development.
15. Detecting aggressive driving behavior using scilab.
16. Design of remote intelligent home finance software.
17. Designing products by artificial intelligence design approach.
18. A computationally efficient real-time vehicle and speed detection using federated learning.
19. Automatic mechanical footstep power tiller machine.
20. Design of production information retrieval system.

Case Study: Part 3 - Evaluating the value of a computational EDPE support tool
This is the last part of the case study and contributes to answering RQ2. It is about evaluating a computational EDPE support tool, Pro-Explora, presented in Section 3.3. Participants use the tool to come up with at least 5 EDPs in about 10 minutes. On a Likert scale of 1 to 10, the participants rate the usefulness of the EDP framed by the tool. They also provide additional information on 1) the reason for the usefulness rating they provide, and 2) whether the EDP inspired or prompted them to think of a different EDP related or unrelated to the originally framed EDP.

Data analysis
Data from the case study is qualitatively and quantitatively analyzed. The qualitative analysis is performed with NVivo 12, software for qualitative data analysis, following the workflow in Fig. 6. [...] various data collected are imported into NVivo and arranged. The data is coded, that is, materials (participants' responses) are gathered by topics or themes. This is followed by querying the data for patterns and connections. The query results are reflected upon and visualized. Although the workflow in Fig. 6 is iterative, the first 6 stages should be sequentially completed before any iterative update can be made to any of them. However, the last stage (Memo) can be referenced from any stage at any time.

5 RESULTS

General understanding of creativity
In Part 1 of the case study (Section 4.1), Questions 'a', 'b', and 'd' in Table 3 are to address RQ1. They are designed to subtly reveal the participants' general understanding of creativity relative to EDPE, which strongly correlates with creativity as mentioned in Section 4.1. Responses to the questions are qualitatively analyzed using the WFA in Fig. 7, Cluster Analysis (CA) in Fig. 8, and text query search (Fig. 9).

Fig. 7 Most frequent words in explaining creativity in engineering design
Presented in Fig. 7 are the 25 most frequent words used by the participants to explain what creativity means to them. It can be seen that the participants mainly associate the word 'problem' with creativity in engineering design. A cluster analysis (CA) is performed to observe the relationship between the 25 words in Fig. 7. The CA result is shown in Fig. 8.

[...] 'solutions', 'solving', and 'problem' can be seen encircled in Fig. 8. To understand the context of the encircled cluster, a text query search is run with the 5 cluster words, and the result is presented in Fig. 9.

The Word Tree (WT) in Fig. 9 shows the root term as 'problem', which is the most frequent word in the WFA in Fig. 7. For a clearer context of the relationships in the cluster words in Fig. 8, five words are allowed on either side of the root term in Fig. 9. As the WT shows, creativity is generally understood to be an EDPS phenomenon. This is likely to be the participants' understanding of creativity from academia and literature. There is no explicit association of creativity to EDPE by the participants. The next result presented is on the teaching of creativity in engineering design in academia.

The participants' responses to Questions 'c' and 'e' of Part 1 of the case study (Table 3) are presented in Fig. 10 and Fig. 11. Questions 'c' and 'e' are designed to reveal the adequacy and focus of creativity teaching in academia. As shown in Fig. 10, most of the participants consider themselves creative, which shows that creativity is a popular skill in engineering design.

Fig. 11 An insight into creativity teaching in academia
Fig. 11 presents the percentages of participants who are either taught creativity in academia or industry and those who are not. It can be seen that not all the participants who consider themselves creative (in Fig. 10) are formally taught creativity. Some design engineers could be naturally creative without being formally taught, as shown in Fig. 11. However, this should not deter effort in teaching creativity techniques and skills formally in academia and industry. Every good natural ability needs formal support. For example, some people are naturally good at playing football, but football academies exist. There is the possibility that those who are creative (C - Yes) but not taught creativity (CT - No) could have been more creative if formally taught creativity in academia. Also, the possibility exists that those who are not creative (C - No) and not taught creativity (CT - No) could have been creative if formally taught. The result presented in Fig. 11 suggests that creativity teaching in academia may be below average, as over 50% of the participants are not formally taught creativity. For the smaller percentage that is formally taught creativity, the focus is on EDPS while EDPE is ignored, as shown in Fig. 9. Following the completion of Part 1 of the case study, Part 2 is commenced and the results are presented next.

Differentiating a computationally and naturally framed EDP
In Part 2 of the case study, as part of answering RQ2, the intent is to test if the participants could differentiate between computationally and naturally framed EDPs. In the set of 20 EDPs presented in Table 4, EDPs 1 to 5 are framed by a design engineer while EDPs 6 to 20 are framed by Pro-Explora. Participants are required to distinguish both categories of EDPs, as explained in Section 4.1. The result of this activity for all the participants (Novice and Experienced) is presented in Fig. 12. The "Novice" and "Experienced" participants are code-named and presented in Table 5 for confidentiality.

Table 5 Novice and Experienced participants
Novice  Experienced
1013  1019  1026  1058  1022  1035  1041  1048
1014  1020  1027  1028  1036  1042  1049
1015  1021  1029  1030  1037  1044  1057
1016  1023  1043  1031  1038  1045  1059
1017  1024  1050  1032  1039  1046  1060
1018  1025  1051  1033  1040  1047

As shown in Fig. 12, some of the participants have zero failures in distinguishing naturally framed EDPs. These participants are "1015", "1016", "1021", "1028", "1029", "1036", "1037", "1046", "1050", "1057", and "1060". It can be seen in Table 5 that some of these participants are "Novice" while some are "Experienced". These zero failures suggest that the "computationally framed EDP" judged by the respective participants as a "naturally framed EDP" appears natural, useful, and meaningful. The correlation between the participants' years of experience and failures in distinguishing a computationally and naturally framed EDP is tested for statistical significance. The results are presented in Table 6. Note that the participants are already categorized based on years of experience in Table 2. Hence, the failures in Table 6 are relative to the participants' (Novice and Experienced) years of experience.

Cosine similarity assessment
The result presented in Fig. 12 indicates a misjudgment of at least one naturally framed EDP or computationally framed EDP by all the participants. This suggests a similarity between the two categories of EDPs. As a further analysis, the naturally framed (EDP1 to EDP5) and computationally framed (EDP6 to EDP10) EDPs in Table 4 are assessed for differences or similarities. The first EDP in Table 4 is named correspondingly as EDP1, the second EDP2, and the tenth EDP10. A random quote (Q*) is added in Table 7 to see how its similarity compares with the EDPs. The result of the assessment is presented in Table 7. The assessment is performed using cosine similarity, which measures similarity between texts by "calculating the cosine of the angle between the two vectors" [90]. A spaCy pipeline trained on web text, en_core_web_lg, is used to compute the cosine similarity. spaCy is an open-source Python library for NLP. Cosine similarity is reported here on a scale of 0 to 1, with 1 indicating 100% similarity. It can be seen in Table 7 that most similarities between the naturally and computationally framed EDPs are above 65%. This helps to explain the misjudgments in Fig. 12. Since Q* in Table 7 is an unrelated quote, its similarity with the EDPs is the lowest across rows.

Table 7 Cosine similarity assessment result
Q* - "Anyone who has never made a mistake has never tried anything new." (Albert Einstein)
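The cosine similarity computation can be sketched in a few lines. The paper computes it with the spaCy en_core_web_lg pipeline (in spaCy, typically through the `similarity` method of two processed documents). To stay runnable without downloading that model, the sketch below swaps in plain word-count vectors, which is an assumption of this example and not the paper's method. The resulting values therefore differ from those in Table 7, and only the formula is the same. The function names `word_counts` and `cosine_similarity` are illustrative.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Turn a text into a bag-of-words vector (word -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    a, b = word_counts(text_a), word_counts(text_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


edp5 = "A design of an automatic bottle opener."
edp13 = "Design of a self-timing solar seawater desalination machine."
quote = "Anyone who has never made a mistake has never tried anything new."

print(round(cosine_similarity(edp5, edp13), 3))  # EDP against EDP
print(round(cosine_similarity(edp5, quote), 3))  # EDP against the quote
```

Even with this crude vectorization, the two EDPs score higher against each other than either does against the quote, which mirrors the pattern reported for Q* in Table 7. Word embeddings such as those in en_core_web_lg additionally capture similarity between different but related words, which word counts cannot.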

The value of computational support tool in EDPE
To address RQ2, in Part 3 of the case study, the participants used Pro-Explora as a support tool to come up with some EDPs. They rated 5 of the EDPs on a Likert scale of 1 to 10 (with 10 being the highest). In Fig. 13, [...]

The overall mean usefulness rating of the participants (Novice and Experienced) shown in Fig. 13 is 7.74. Coincidentally, the separate mean usefulness ratings of the "Novice" and "Experienced" participants are both 7.74. The mean of the scale points on a Likert scale of 1 to 10 is 5.5. Hence, the overall usefulness rating of the participants (7.74) for Pro-Explora generated EDPs is above the scale mean (5.5) [...] "1030" up to participant "1033". This indicates that some of the separate ratings of the participants to the right of "1030" are higher than that of participant "1030". As seen in Table 5, these participants belong to either the "Novice" or "Experienced" category. Some of the participants mention that they are inspired or prompted to think of a different or related EDP based on the EDP framed by Pro-Explora. The correlation between the participants' experience and their usefulness ratings is statistically analyzed and presented in Table 8. The analyses in Table 8 are relative to the participants' years of experience, as indicated in Table 2. As shown in Table 8, the analysis is performed for the overall participants (Novice and Experienced). As a further confirmation, the analysis is also performed separately for only the "Novice" and only the "Experienced" participants. Coincidentally, as shown in Table 8, the p-value for the overall rating is the same as that of the "Experienced" participants.
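The summary statistics behind a plot such as Fig. 13, a mean rating per participant with a ±1 standard error (SE) bar compared against the scale mean of 5.5, can be sketched as follows. The ratings used here are hypothetical values chosen for illustration and are not data from the case study. The function name `mean_and_standard_error` is likewise an assumption of this example.

```python
import math
import statistics


def mean_and_standard_error(ratings):
    """Return the mean and the standard error of the mean for a list of ratings."""
    mean = statistics.mean(ratings)
    # SE = sample standard deviation / sqrt(number of ratings)
    se = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, se


# Mean of the scale points 1..10, used as the reference value.
scale_mean = statistics.mean(range(1, 11))

# Hypothetical usefulness ratings given by one participant to 5 framed EDPs.
ratings = [8, 7, 9, 6, 8]
mean, se = mean_and_standard_error(ratings)

print(f"mean = {mean:.2f}, SE = {se:.2f}, scale mean = {scale_mean}")
print("above scale mean" if mean > scale_mean else "not above scale mean")
```

The SE bar shrinks as a participant rates more EDPs, so comparisons between participants are more reliable when each has rated a similar number of outputs.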

Academic implications
The findings and results contribute to knowledge by providing, for the first time, empirical evidence on 1) the lack of focus on EDPE within the engineering design community and 2) the value of computational support in the EDPE process. The lack of attention on EDPE contrasts with the standard expectation of design engineers in identifying societal EDPs using their experience, knowledge, and background [4,10,13,91]. The natural EDPE process investigated in this paper requires creativity. Over 50% of the participants in the case study indicate that they were not formally taught creativity. This suggests a lack of creativity teaching in academia within engineering design disciplines [92]. Also, the effort in teaching creativity in academia is focused on EDPS while EDPE is ignored. The case study result indicates that the general understanding of creativity within the engineering design community is about EDPS. This understanding is likely from the teachings provided in academia. Hence, effort in teaching creativity in engineering design disciplines should focus equally on EDPS and EDPE.

Industry implications
There are scholarly opinions, as mentioned previously, that EDPE is a challenging activity. However, a paper on why EDPE is challenging is lacking. This paper highlights the possible determinants of the challenges associated with EDPE. This makes it possible to extensively investigate some computational technologies that could support the natural EDPE process, while it was previously indicated that computational EDPE [...] it is considered "entirely reasonable to spend several months or longer thinking about potential problems" to solve [94]. The EDP framed by Pro-Explora is given an average "usefulness" rating of 7.74 out of 10 by both novice and experienced design engineers (Section 5.5). This indicates that design engineers could be computationally and intentionally inspired, prompted, or supported in using their knowledge in EDPE. The inspiration occurs when the Pro-Explora framed EDP coincides with the design engineer's knowledge, experience, and/or background. This is similar to serendipitous discovery (Section 2.1), and some participants agree that the EDP framed by Pro-Explora inspired them to think of a different EDP. Knowledge is infinite, and design engineers cannot measure their knowledge or intentionally recall all they know [95,96]. Hence, computationally prompting the design engineer with an EDP that may be within the domain of their knowledge to solve is advantageous and a rapid way of discovering a new EDP, an invention, or innovation.

Limitations and opportunities
The results are based on the direct responses provided by the participants. No further verification of the information is carried out. For example, the universities attended by the participants who reported that they are not taught creativity are not contacted for verification. Also, being an online activity, it is not certain whether the participants spent more or less than 10 minutes during EDPE with Pro-Explora. However, an instruction to spend 10 minutes on the task is provided. During the case study, the participants used Pro-Explora once for EDPE, rated its outputs above average, and requested access for continued use, which is granted. Further trials would be necessary to monitor the subsequent rating for Pro-Explora and ensure an increased rating. The uniqueness of a Pro-Explora framed EDP is based on a duplication recognition search in the original corpus used in generating the EDP. This search is not extended to Google and patent databases, which are popular for verifying uniqueness. However, during a pilot test, a manual search on Google returned no duplicate for any Pro-Explora framed EDP.

Although BERT and LSTM are potential computational technologies for EDPE (Section 2.3), they have not been compared with the MM used in this paper. Being in its infancy (Version 1), Pro-Explora will be improved further based on the feedback received from the participants. This will include optimizing its outputs and exploring other related NLP technologies, including BERT and LSTM. Data collection for the Pro-Explora database will continue, and its model will be updated continuously.

CONCLUSIONS
In this paper, case study-based evidence is provided to highlight the lack of attention on EDPE, an important aspect of engineering design at early-stage product design and development. Although there are a few studies on the lack of attention on EDPE, a study providing empirical evidence and determinants for it is lacking. The natural approaches related to EDPE are investigated, including the "garbage can" model and the serendipity phenomenon. Some challenges and natural limitations associated with the natural EDPE approach are identified, including cognitive fatigue. This suggests that computational support could be advantageous in the process. In response, a data-driven computational EDPE framework and support tool, Pro-Explora, are presented. The tool is the first-of-its-kind computational technology that mimics the natural EDPE process. It is based on a synergy of the MM and some big data technologies, including ML and NLP.

A case study is conducted with 43 participants, including novice and experienced design engineers. During the case study, the participants could not distinguish EDPs framed by Pro-Explora when presented alongside naturally framed ones. Using Pro-Explora as support, novice and experienced participants come up with at least 5 new EDPs in about 10 minutes. This would be difficult or impossible with the natural EDPE approach. The overall average rating provided by the participants on the usefulness of Pro-Explora framed EDP is 7.74 out of 10. This is promising for accelerated innovations and inventions in the industry. Further, the result shows that over 50% of the participants in the case study did not receive any formal teaching on creativity in academia. This highlights the importance of focusing on teaching creativity in engineering design-related disciplines, which is fundamental in EDPE.
ACKNOWLEDGMENT
All the students and professionals, including academics and inventors, who contributed to this study by participating in the case study are hereby acknowledged.

Figure Caption List
Pro-Explora GUI with some framed design problems
Participants' usefulness rating for Pro-Explora framed EDP (with ±1 SE bar)

Table Caption List
Table 1 Comparison of algorithms for natural and computational EDPE
Table 2 Case study participants' detail
Table 3 Questionnaire for Part 1 of the case study
Table 4 A sample set of 20 EDP for participants
Table 5 Novice and Experienced participants
Table 6 Relationship between experience and distinguishing a computational EDP
Table 7 Cosine similarity assessment result
Table 8 Relationship between experience and rating of a computational EDP

Table 1 Comparison of algorithms for natural and computational EDPE

Step 1
Natural EDPE: Identify an EDP of societal relevance by accident (serendipity), stochastic synthesis ("garbage can" model), logical progression ("rational" model), and/or conceptualization (apophenia).
Computational EDPE: Frame an EDP of societal relevance by stochastic synthesis of big data, computational technologies (data extraction, ML, NLP), coding capabilities, connectionist theory, deterministic chaos, MM, BERT, and/or LSTM.

Step 2
Natural EDPE: Search manually for prior existence in relevant databases using search engines.
Computational EDPE: Make an automated search for prior existence in relevant databases using duplication recognition.

Step 3
Natural and computational EDPE: Decide, subject to acceptance by the society or a relevant authority.
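The first two steps of the computational algorithm in Table 1 can be sketched in Python under stated assumptions. The sketch assumes that MM refers to a word-level Markov model, which this section does not spell out, and it uses a three-sentence hypothetical corpus in place of big data. The names `build_transitions` and `frame_edp` are illustrative and are not part of Pro-Explora. The final decision step is left to a human.

```python
import random
from collections import defaultdict


def build_transitions(sentences):
    """Map each word to the list of words observed to follow it."""
    transitions = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            transitions[current].append(following)
    return transitions


def frame_edp(transitions, start, max_words=12, rng=None):
    """Random-walk the chain from `start` until a dead end or the word limit."""
    rng = rng or random.Random()
    words = [start]
    while len(words) < max_words and transitions.get(words[-1]):
        words.append(rng.choice(transitions[words[-1]]))
    return " ".join(words)


# Hypothetical corpus of problem statements, for illustration only.
corpus = [
    "Design a portable device that purifies water for rural homes",
    "Design a wearable sensor that monitors posture for office workers",
    "Develop a portable sensor that detects leaks for rural pipelines",
]

# Step 1: frame a candidate EDP by stochastic synthesis over the corpus.
transitions = build_transitions(corpus)
candidate = frame_edp(transitions, "Design", rng=random.Random(0))
print(candidate)

# Step 2: automated search for prior existence (exact match, as a simplification).
print("duplicate" if candidate in corpus else "no exact duplicate in corpus")
```

Because each next word is drawn only from words that followed the current one in the corpus, the output recombines fragments of existing statements. This is why a check for prior existence is needed before a candidate is put forward.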