JIM - survey data analysis

Problem 1

In 2012, the so-called JIM survey studied the use of information and media by adolescents between 12 and 19 years of age in Germany. The following table shows a subset of the results for a representative sample of 200 adolescents, 102 of them boys. For four kinds of devices, it gives the number of girls and boys in the sample who possess the respective device.

Device                     Girls   Boys
Smart phone                   42     52
Computer                      77     87
TV set                        54     65
Stationary game console       37     62

  1. Determine the probability that one person chosen at random out of the 200 adolescents is female and does not possess a TV set.

  2. Out of the 200 adolescents, one person possessing a TV set was chosen at random. Find the probability that this person is female.

  3. Justify that the events “One person chosen at random out of 200 adolescents possesses a TV set” and “One person chosen at random out of 200 adolescents is a girl” are not independent.

  4. According to the survey, 55% of the girls of an age between 12 and 19 years possess a TV set. Give the value of the sum

    \[\sum\limits_{i=0}^{12}B(25;0.55;i)\]

    in percent. Explain why this value does not, in general, represent the probability that among 25 girls of a class in 9th grade fewer than half possess a TV set.

Solution of part 1a

There is a total of 98 girls in the group, 54 of whom own a TV set. Accordingly, 44 girls do not own a TV set. The probability of choosing a girl who does not own a TV set is therefore

\[\frac{44}{200}=22\%.\]

We check the result by generating a list of 200 adolescents either being a girl with or without a TV set or a boy with or without a TV set. Then we draw at random out of this list and compile the number of persons in each category.

Now we can determine the probability of finding a girl without a TV set.
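The check described above can be sketched in plain Python, used here as a stand-in for a Sage cell. The category labels, the number of draws, and the seed are assumptions chosen for illustration; the counts come from the table.

```python
import random
from collections import Counter

# Build the list of 200 adolescents from the table (98 girls, 102 boys).
population = (
    ["girl with TV"] * 54
    + ["girl without TV"] * (98 - 54)
    + ["boy with TV"] * 65
    + ["boy without TV"] * (102 - 65)
)

def simulate(draws, seed=0):
    """Draw with replacement and count how often each category occurs."""
    rng = random.Random(seed)
    return Counter(rng.choice(population) for _ in range(draws))

draws = 100_000
counts = simulate(draws)

# Relative frequency of the category "girl without TV".
estimate = counts["girl without TV"] / draws
print(f"estimated P(girl without TV) = {estimate:.3f}, exact value = {44 / 200:.3f}")
```

With this many draws, the relative frequency should lie close to the exact value of 22%.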

Solution of part 1b

It is stated in the problem text that the randomly chosen person owns a TV set and thus is either one of the 65 boys owning a TV set or one of the 54 girls with a TV set. The total number of persons owning a TV set thus amounts to 119. The probability that this person is a girl then is found as

\[\frac{54}{119}\approx 45.4\%\]

We make use of the simulation of part a) in order to empirically check this result.
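As a sketch of this empirical check, the following plain Python snippet (again standing in for a Sage cell, with an assumed seed and number of draws) restricts the random draws to persons owning a TV set and determines the share of girls among them.

```python
import random

# Each person is a pair (sex, owns_tv); the counts are taken from the table.
population = (
    [("girl", True)] * 54
    + [("girl", False)] * 44
    + [("boy", True)] * 65
    + [("boy", False)] * 37
)

rng = random.Random(1)
sample = [rng.choice(population) for _ in range(100_000)]

# Keep only the draws in which the person owns a TV set.
tv_owners = [sex for sex, owns_tv in sample if owns_tv]

estimate_girl_given_tv = tv_owners.count("girl") / len(tv_owners)
print(f"estimated P(girl | TV) = {estimate_girl_given_tv:.3f}, exact value = {54 / 119:.3f}")
```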

Solution of part 1c

The two events \(A\) “One person chosen at random out of 200 adolescents possesses a TV set” and \(B\) “One person chosen at random out of 200 adolescents is a girl” are independent if and only if

\[P(B|A) = P(B|\bar{A}) = P(B)\]

holds.

In part b) we already evaluated the probability for a person owning a TV set to be a girl. This value corresponds to \(P(B|A)\). It remains to determine the probability that a person chosen at random is a girl:

\[P(B)=\frac{98}{200} = 49\%.\]

It follows

\[P(B|A) = \frac{54}{119} \neq \frac{49}{100} = P(B)\]

and therefore the events \(A\) and \(B\) are not independent.
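The comparison can also be carried out with exact fractions. The following minimal Python sketch (variable names are assumptions) checks both the conditional probability and the equivalent product criterion \(P(A\cap B) = P(A)P(B)\).

```python
from fractions import Fraction

total = 200
p_tv = Fraction(54 + 65, total)      # P(A): person owns a TV set
p_girl = Fraction(98, total)         # P(B): person is a girl
p_girl_and_tv = Fraction(54, total)  # P(A and B): girl owning a TV set

# Conditional probability P(B|A) from the definition.
p_girl_given_tv = p_girl_and_tv / p_tv

# The events are independent exactly if P(A and B) equals P(A) * P(B).
independent = p_girl_and_tv == p_tv * p_girl

print(p_girl_given_tv, p_girl, independent)
```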

Solution of part 1d

We determine the sum by means of Sage and obtain approximately 30.6%.

Alternatively, the result can be obtained directly as:
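A minimal sketch in plain Python, standing in for a Sage cell, evaluates the sum both term by term and through a small cumulative helper. The function names `binom_pmf` and `binom_cdf` are assumptions introduced for illustration and are not Sage functions.

```python
from math import comb

def binom_pmf(n, p, k):
    """B(n; p; k): probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """Probability of at most k successes in n trials."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

# Term-by-term sum, exactly as written in the problem.
explicit_sum = sum(binom_pmf(25, 0.55, i) for i in range(13))

# The same value obtained in one call to the cumulative helper.
value = binom_cdf(25, 0.55, 12)

print(f"{100 * value:.1f} %")
```

Both variants should reproduce the value of roughly 30.6% quoted above.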

The survey was carried out with adolescents between the ages of 12 and 19. However, it is not known whether it is representative for the 9th grade (about 14 to 15 years of age). Therefore, it is not permissible to use the value of the sum as the probability that out of 25 girls of 9th grade fewer than half own a TV set.

If, however, we assume that indeed 55% of the girls in 9th grade possess a TV set, we can use Sage to empirically check that the sum represents the probability that fewer than half of 25 girls possess a TV set.
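Such an empirical check could look as follows in plain Python, used here in place of a Sage cell. The sketch assumes that each girl owns a TV set independently with probability 0.55; the number of simulated classes and the seed are arbitrary choices.

```python
import random

def fewer_than_half_own_tv(rng, n=25, p=0.55):
    """Simulate one class of n girls and report whether fewer than half own a TV set."""
    owners = sum(rng.random() < p for _ in range(n))
    return owners <= 12  # fewer than half of 25 means at most 12

rng = random.Random(2)
classes = 50_000

# Share of simulated classes in which fewer than half of the girls own a TV set.
share = sum(fewer_than_half_own_tv(rng) for _ in range(classes)) / classes
print(f"simulated share: {100 * share:.1f} %")
```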

Problem 2

According to the JIM survey, considerably less than 90% of the adolescents own a computer. Therefore, the city council of a provincial town is approached to install a workspace with computers in the local youth centre. The city council is only willing to invest the requested funds if less than 90% of the adolescents in the provincial town own a computer.

  1. The decision on the approval of the funds shall be based on an inquiry in the provincial town among 100 randomly chosen adolescents between 12 and 19 years of age. The probability that the funds are mistakenly approved shall be at most 5%. Determine the corresponding decision rule for which at the same time the probability that the funds are mistakenly not approved is minimal.

  2. Determine the probability that exactly 85 among the 100 adolescents interviewed own a computer, provided the percentage of adolescents owning a computer among the adolescents in the provincial town is as large as among the adolescents represented in the table.

Solution of part 2a

For the given hypothesis test, we need to determine up to which number of computer owners among the 100 adolescents interviewed the funds are approved, such that the probability of approving the funds although at least 90% of the adolescents in town own a computer is at most 5%.

We assume that the random variable \(X\) representing the number of adolescents in the sample owning a computer is binomially distributed. Taking the threshold value of 90% as the probability of owning a computer, we determine the largest value \(C\) for a sample size of 100 for which the probability of \(X\leq C\) does not exceed 5%:

\[\sum\limits_{i=0}^C B(100; 0.9; i)\leq 5\%\]

We determine the sum by means of Sage:
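A minimal sketch in plain Python, standing in for a Sage cell, searches for the largest admissible \(C\). The helper name `binom_cdf` is an assumption introduced for illustration.

```python
from math import comb

def binom_cdf(n, p, k):
    """Probability of at most k successes in n trials with success probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p, alpha = 100, 0.9, 0.05

# Largest C for which P(X <= C) does not exceed the significance level.
C = max(c for c in range(n + 1) if binom_cdf(n, p, c) <= alpha)

print(C, binom_cdf(n, p, C), binom_cdf(n, p, C + 1))
```

Printing the cumulative probabilities at \(C\) and \(C+1\) shows that the 5% bound is respected at \(C\) and violated one step later.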

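The sum stays below 5% up to \(C=84\). The funds are therefore approved if at most 84 of the 100 adolescents interviewed own a computer. We can check the limiting value \(C=84\) by means of a simulation.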
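One way to set up such a simulation is sketched below in plain Python instead of Sage. It assumes that exactly 90% of the adolescents own a computer and repeats the inquiry many times; the number of repetitions and the seed are arbitrary choices.

```python
import random

def count_owners(rng, n=100, p=0.9):
    """Simulate one inquiry among n adolescents and return the number of computer owners."""
    return sum(rng.random() < p for _ in range(n))

rng = random.Random(3)
trials = 20_000
results = [count_owners(rng) for _ in range(trials)]

# Relative frequency of mistakenly approving the funds for two candidate limits.
rate_84 = sum(x <= 84 for x in results) / trials
rate_85 = sum(x <= 85 for x in results) / trials

print(f"C = 84: {100 * rate_84:.1f} %, C = 85: {100 * rate_85:.1f} %")
```

The relative frequency for \(C=84\) should stay below 5%, whereas the one for \(C=85\) should exceed it.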

Solution of part 2b

The percentage recorded in the table of adolescents owning a computer is

\[\frac{77+87}{200} = 82\%.\]

At a probability of 82% for owning a computer, the probability that exactly 85 out of 100 adolescents own a computer amounts to

\[P(X=85) = B(100;0.82;85)\]

With Sage we find \(P(X=85) \approx8.1\%\).
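The value can be reproduced with a few lines of plain Python, shown here as a sketch in place of the Sage cell. The variable names are assumptions.

```python
from math import comb

# Share of computer owners among the 200 adolescents in the table.
p_computer = (77 + 87) / 200

# B(100; 0.82; 85): exactly 85 owners among 100 adolescents.
prob_85 = comb(100, 85) * p_computer**85 * (1 - p_computer) ** 15

print(f"P(X = 85) = {100 * prob_85:.1f} %")
```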

Problem 3

It can be assumed that among the adolescents owning a smart phone, the percentage of those owning a stationary game console is larger than among those not owning a smart phone. Determine for the 200 adolescents recorded in the table how large the number of persons owning both a smart phone and a stationary game console must be so that the assumption is valid for the adolescents recorded in the table.

Solution of part 3

This problem is concerned with the dependence of events. For the following, we introduce the events \(A\) “One person chosen at random out of 200 adolescents owns a stationary game console” and \(B\) “One person chosen at random out of 200 adolescents owns a smart phone”.

We demand that the two events are statistically dependent in a way that

\[P(A|B) > P(A|\bar{B})\]

is fulfilled. From the table we obtain \(P(A) = (37+62)/200 = 49.5\%\) and \(P(B) = (42+52)/200 = 47\%\).

By means of

\[P(A|B) = \frac{P(A\cap B)}{P(B)}\]

and

\[P(A\cap B) + P(A \cap \bar{B}) = P(A)\]

the above condition can be transformed into

\[\begin{split}\frac{P(A\cap B)}{P(B)} > \frac{P(A)-P(A\cap B)}{P(\bar{B})}\\ P(A\cap B)P(\bar{B}) > P(A)P(B)-P(A\cap B)P(B)\\ P(A\cap B)[P(\bar{B})+P(B)] > P(A)P(B)\\ P(A\cap B) > P(A)P(B)\\ P(A\cap B) > 0.495\cdot0.47\end{split}\]

Out of the 200 adolescents, more than \(0.495\cdot 0.47 = 23.265\%\) must own both a smart phone and a stationary game console for the assumption formulated in the problem to hold. Since \(0.23265\cdot 200 = 46.53\), this corresponds to at least 47 adolescents.

The limit for \(P(A\cap B)\) beyond which \(A\) and \(B\) depend on each other as requested, can also be determined with the help of Sage by solving a linear system of equations:
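A possible formulation of this linear system is sketched below in plain Python with exact fractions, in place of a Sage cell. The choice of unknowns \(x = P(A\cap B)\) and \(y = P(A\cap\bar{B})\), the limiting condition \(P(A|B) = P(A|\bar{B})\), and the use of Cramer's rule are assumptions made for this illustration.

```python
from fractions import Fraction
from math import floor

P_A = Fraction(37 + 62, 200)  # owns a stationary game console
P_B = Fraction(42 + 52, 200)  # owns a smart phone
P_notB = 1 - P_B

# Limiting case P(A|B) = P(A|not B), written as a linear system in x and y:
#   P_notB * x - P_B * y = 0
#            x +       y = P_A
a11, a12, b1 = P_notB, -P_B, Fraction(0)
a21, a22, b2 = Fraction(1), Fraction(1), P_A

# Solve the 2x2 system with Cramer's rule.
det = a11 * a22 - a12 * a21
x = (b1 * a22 - a12 * b2) / det
y = (a11 * b2 - b1 * a21) / det

# Smallest whole number of adolescents strictly above the limit x * 200.
min_count = floor(x * 200) + 1

print(float(x), min_count)
```

The solution for \(x\) coincides with the product \(P(A)P(B)\) obtained in the derivation above.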