A Data-Mining Approach to a Road Maintenance Support System (Knowledge-Based Software Engineering, 知能ソフトウェア工学)

We developed a road maintenance support system named “Kyoto-Michimorikun” to manage road maintenance operations such as repairing damage and handling accidents and other troubles on roads. Currently, road maintenance tends to be passive, because an operation starts only after a report from inhabitants or road inspection staff. To improve this situation, this paper proposes a new data-mining based method for achieving an active road maintenance scheme. The proposed method adopts a clustering algorithm based on the location and accident type of each case. The pilot evaluation shows that two problems remain: ambiguity in the selection of the case type, and the lack of causal analysis in this operation. Consequently, we propose setting up a “Knowledge-Concierge” to resolve these problems.

Keyword: Road maintenance, Kyoto-Michimorikun, Data-mining technology

Introduction

Recently, extracting knowledge from the data accumulated in everyday operations has received much attention. Various kinds of information systems support operations in business organizations and local governments, and these systems store a huge amount of data every day. If such a system can extract knowledge from these data by using data-mining technology, this knowledge is expected to improve the quality of operations.

We have developed and operated the road maintenance support system named “Kyoto-Michimorikun”. In this paper, the term “road maintenance” is defined as everyday operations, such as repairing damaged facilities and removing illegally dumped waste on roads, based on information from inhabitants or inspections by staff. This operation is one of the most important services for the inhabitants of a local community. As the needs of inhabitants become more diverse, reflecting the diversity of the inhabitants themselves, various kinds of reports and requests are sent to the division in charge of road maintenance.
However, because of shortages of budget and human resources, the road maintenance section cannot fully respond to these requests. Therefore, the operation often becomes (1) passive, meaning that the section deals with a problem only after a request has been sent to it, and (2) symptomatic, meaning that the section often applies only a stopgap treatment to each request, which does not resolve the underlying problem. Such operations may increase both the recurrence of the same requests and the total number of requests. If we can resolve these problems and predict the occurrence of requests, it becomes easier to establish a road repair strategy that depends on the degree of urgency. This preventive maintenance allows well-planned use of budget and human resources and optimizes the whole operation. As a result, preventive maintenance will raise the satisfaction level of inhabitants.

In this paper, we applied data-mining technology to the data accumulated in “Kyoto-Michimorikun”. About 800 requests had been stored over five years, and we clustered them based on the latitude and longitude of each case and the type of request. The clustering could find specific locations where many similar troubles occurred repeatedly. Furthermore, the experiment highlighted the importance of the cause of each case. In order to record appropriate cause information for each case, setting up a “Knowledge-Concierge” is advisable for unified type selection.

Figure.1 Workflow on the road maintenance section

1. Application of Data-Mining Technology to The Current Road Maintenance

In current road maintenance, operations often become passive. Staff often give higher priority to removing unfavorable situations (Figure.1). We have developed and operated “Kyoto-Michimorikun”. This system has realized rapid responses to requests from inhabitants and road inspection staff.
As the next step in enhancing the use of this system, we applied a data-mining method to the data accumulated in “Kyoto-Michimorikun” in order to optimize the whole operation process.

Figure.2 The image on applying the data-mining technology

Figure.2 shows a schematic illustration of the application of data-mining technology to the accumulated data. The top of Figure.2 shows that many similar troubles occur repeatedly in the current operation. For example, many insect troubles occur at a specific location every year. As the middle and bottom of Figure.2 show, data-mining technology can extract the troubles that occurred around a specific location. Consequently, staff can address the essential causes before each case occurs, and this preventive maintenance will improve the satisfaction level of inhabitants. The next chapter describes the algorithm and the results of the data mining in detail.

2. Data-Mining based on the Latitude and Longitude

We applied data mining to the accumulated cases so that the current operation can become the preventive maintenance described in the 1st chapter. This chapter shows the algorithm and the results of the data mining.

2.1. Algorithm of Data-Mining

We applied data mining to about 800 cases that occurred in a city in Kyoto Prefecture. We took the following steps.

2.1.1. Morphological Analysis

The title and content of each request are written in natural language. Therefore, we break the text of each case into single words by morphological analysis. We used “ChaSen” for the morphological analysis. “ChaSen” splits input sentences into single words and adds a word class to each word. We do not consider the addresses of trouble locations, because they cause misclassification.

2.1.2 Counting up the Occurrence of each Word

We counted the occurrences of each word generated by the morphological analysis and extracted the words with a middle degree of occurrence frequency.
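
The counting step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `mid_frequency_words`, the thresholds `min_count` and `max_count`, and the sample token lists are all hypothetical, and the input is assumed to be already split into words (standing in for the output of “ChaSen”).

```python
from collections import Counter


def mid_frequency_words(tokenized_cases, min_count, max_count):
    """Return words whose total occurrence count lies in [min_count, max_count].

    tokenized_cases: one list of words per request (assumed to be the result
    of a morphological analyzer; the thresholds are illustrative assumptions).
    """
    counts = Counter(word for case in tokenized_cases for word in case)
    return {w: c for w, c in counts.items() if min_count <= c <= max_count}


# Hypothetical, already-tokenized request texts (not data from the paper).
cases = [
    ["road", "insect", "bridge"],
    ["road", "pothole"],
    ["road", "insect", "tree"],
    ["road", "guardrail", "insect"],
]

selected = mid_frequency_words(cases, min_count=2, max_count=3)
print(selected)  # {'insect': 3}
```

In this toy input, "road" appears in every request and is dropped as too frequent, while words that appear only once are dropped as too rare.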
We define words that occur too frequently as common words and words that occur too rarely as typical words. In the following evaluation experiment, there was not much data available for applying data-mining technology. Therefore, we also included some of these words among the evaluated candidates. The result is shown in Table.1.

2.1.3 Clustering based on the Latitude and Longitude of the location of Each Case

We apply clustering based on the latitude and longitude of the location of each case. As a result of this experiment, we could extract some clusters, each of which includes at least three cases. The result is displayed on the screen using Web-GIS.

Table.1 the clustering based on the latitude and longitude of each case

2.2 Result of 3 Steps

Figure.3 Specifying the location of the cases with insects

We applied the above-mentioned 3 steps to the 800 accumulated cases and could visually confirm several clusters in which similar troubles occurred continually in the same vicinity. Figure.3 is an example of such a cluster. In this case, the location is on a bridge that is famous as a sightseeing spot. However, staff who did not have practical experience in responding to such troubles could not find their cause. For example, there were cases involving insects. These troubles were found near the bridge. In the vicinity of this bridge there were several trees, and these trees harbored the insects. To respond to these cases, we advised removing the trees, which was one of the essential actions for resolving them.

In this example, the essential resolution needed the following two pieces of knowledge.

1) Insect troubles have occurred cyclically at a specific location.
2) The trees harbored the insects.

1) is the knowledge that identifies the location that staff should repair. This knowledge can be extracted from accumulated data by data mining. 2) is the knowledge that identifies the cause of the cases and how to deal with them.
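
The location-based clustering of Section 2.1.3, which yields the first kind of knowledge, can be sketched as follows. The paper does not name the clustering algorithm, so this sketch swaps in a simple distance-threshold grouping as an assumption: two cases are linked when they lie within `radius_m` metres of each other, linked cases form one group, and only groups with at least three cases are kept. The function names, the radius, and the coordinates are hypothetical.

```python
import math
from collections import defaultdict


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))


def cluster_cases(points, radius_m, min_cases=3):
    """Group case indices linked by chains of neighbours within radius_m.

    Assumption: single-linkage grouping via union-find, not the paper's method.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Pairwise comparison is quadratic, which is fine for a few hundred cases.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if haversine_m(points[i], points[j]) <= radius_m:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(points)):
        groups[find(i)].append(i)
    return sorted(g for g in groups.values() if len(g) >= min_cases)


# Hypothetical coordinates of cases sharing one keyword (not data from the paper).
insect_cases = [
    (34.8890, 135.8077),
    (34.8891, 135.8078),
    (34.8892, 135.8076),
    (34.9500, 135.7500),
    (35.0100, 135.7700),
]

clusters = cluster_cases(insect_cases, radius_m=100.0)
print(clusters)  # [[0, 1, 2]]
```

Running the grouping separately for each selected keyword or request type would give one set of candidate locations per kind of trouble, which could then be plotted on a map.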
However, data mining alone cannot derive the second kind of knowledge. The staff of local governments often move to another section every few years. Therefore, a staff member sometimes has no practical experience of responding to the earlier cases found by data mining. When staff respond to a case such as the insect troubles, they should analyze the cause and document the cause and the action taken for each case.

3. Analyzing the Evaluation Experiment

3.1 Importance of the Cause for each case

In order to realize appropriate road maintenance activities, the organization of a local government should share the causes of cases, such as the fact that the trees harboring the insects were planted in the vicinity of the bridge. At present, the staff of the road maintenance section do not share these causes. In this operation, responding successively to each case reported by inhabitants or road inspection staff is the principal duty. The operation requires staff to process the following sequence promptly: “information from the inhabitants or the staff” → “survey the location of the case and write a report about the case” → “request the contractor of the engineering bureau to repair the location of the trouble”. Therefore, staff are not required to describe the cause, and the causes of cases have not been recorded in the documents shared within the road maintenance section.

In applying data mining to road maintenance operations, the cause is essentially important. Data mining is one of the machine learning technologies. In machine learning, the target concept is represented by extracting some attributes from a huge amount of data. In this operation, however, the target concept cannot be represented without a new attribute that the cases do not hold. As a typical application of data-mining technology, market basket analysis is very popular. In market basket analysis, clusters of “advisable data” for the operations are extracted.
The target concept extracted by data mining can be applied directly to the operation as “useful knowledge”. On the other hand, in the road maintenance operation, clusters of “inadvisable data” are extracted. This target concept cannot directly apply to th