Case: Automotive Customers & Usage of the Web

The first case study concerns a survey of customers of the automotive industry. The feature under focus was the frequency of internet usage for topics related to the industry.

The initial data set comprises 37 questions (attributes, all of a structured textual nature) and answers from 319 customers (instances). The question titles are listed in Table 1.

Table 1: Question titles (attributes)

Table 2 gives further details on the target variable, internet usage, and the possible answers to that question.

Table 2: Target attribute details

Extensive experimentation with advanced filtering techniques identified the most significant questions in terms of their informational value with respect to the target question. Table 3 lists the 10 most important attributes.

Table 3: Questions of most importance
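
The filtering techniques used in the study are not named here. As a minimal sketch of one common way to rank categorical attributes by informational value, the following assumes information gain as the criterion. The function names and the toy answers are hypothetical and are not taken from the survey.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting the rows on one attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def rank_attributes(rows, target):
    """Return (attribute, gain) pairs sorted from most to least informative."""
    attributes = [a for a in rows[0] if a != target]
    scores = {a: information_gain(rows, a, target) for a in attributes}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy answers for illustration only (assumption, not survey data).
toy_rows = [
    {"car_owner": "yes", "age": "18-25", "internet_use": "frequently"},
    {"car_owner": "yes", "age": "26-35", "internet_use": "frequently"},
    {"car_owner": "no", "age": "18-25", "internet_use": "never"},
    {"car_owner": "no", "age": "26-35", "internet_use": "never"},
]

ranking = rank_attributes(toy_rows, "internet_use")
for attribute, gain in ranking:
    print(f"{attribute}: {gain:.3f}")
```

In this toy data car_owner separates the two answers perfectly while age does not separate them at all, so car_owner is ranked first. On the real data set the same ranking would be computed over all 37 questions and the top 10 kept.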

While Table 3 already provides some critical insights into the factors that correlate with customers' web surfing patterns, the real payoff comes from performing data mining on these inputs. Extensive experimentation was conducted with several machine learning algorithms, which were fine-tuned to extract rules and patterns like the ones that follow.

Rule 01: If car_owner=yes and carvalue=35-50k then internet_use=frequently

For example, Rule 01 states that a typical owner of a car valued between 35 and 50 thousand euros is expected to use the internet frequently to search for car-related information.

Rule 02: If 16v=no and age=18-25 and spoiler=yes then internet_use=never

On the other hand, Rule 02 suggests, with 80% certainty, that a young customer (aged 18-25) who knows about spoilers but not about 16v engine features is not expected to search for car-related information on the internet.
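
The text does not say how the certainty of a rule is measured. One plausible reading, assumed here, is the rule's confidence: the share of customers matching the conditions who also give the predicted answer. The sketch below evaluates Rule 02 on hypothetical toy rows built so that the confidence comes out at 80%. The helper names and the data are illustrative only.

```python
def matches(row, conditions):
    """True if the row satisfies every attribute=value condition of the rule."""
    return all(row.get(attr) == value for attr, value in conditions.items())


def evaluate_rule(rows, conditions, target, predicted):
    """Return (support, confidence) of an if-then rule on the given rows.

    support    = fraction of all rows covered by the conditions
    confidence = fraction of covered rows whose target equals the prediction
    """
    covered = [row for row in rows if matches(row, conditions)]
    if not covered:
        return 0.0, 0.0
    correct = sum(1 for row in covered if row[target] == predicted)
    return len(covered) / len(rows), correct / len(covered)


# Conditions of Rule 02 as written above.
rule_02 = {"16v": "no", "age": "18-25", "spoiler": "yes"}

# Hypothetical toy rows (assumption, not survey data):
# 5 customers match the rule, 4 of them answer "never".
match_never = {"16v": "no", "age": "18-25", "spoiler": "yes", "internet_use": "never"}
match_freq = {"16v": "no", "age": "18-25", "spoiler": "yes", "internet_use": "frequently"}
no_match = {"16v": "yes", "age": "18-25", "spoiler": "yes", "internet_use": "frequently"}
toy_rows = [match_never] * 4 + [match_freq] + [no_match] * 5

support, confidence = evaluate_rule(toy_rows, rule_02, "internet_use", "never")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

Reporting support alongside confidence helps show how many customers a rule actually covers, since a high-confidence rule that matches only a handful of instances is weaker evidence.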

The full report, available for download here (.pdf), contains more than 50 such patterns extracted from this data set and can be considered indicative of our services.



