Topic 4 - Statistics and Probability

Question 1

SL Paper 2

A dice manufacturer claims that, for a novelty die he produces, the probabilities of scoring the numbers 1 to 5 are all equal, and the probability of scoring a 6 is two times the probability of scoring any of the other numbers.

To test the manufacturer’s claim, one of the novelty dice is rolled 350 times and the numbers scored on the die are shown in the table below.

A χ² goodness of fit test is to be used with a 5% significance level.

1.

Find the probability of scoring a six when rolling the novelty die.

[3]
2.

Find the probability of scoring more than 2 sixes when this die is rolled 5 times.

[4]
3.

Find the expected frequency for each of the numbers if the manufacturer’s claim is true.

[2]
4.

Write down the null and alternative hypotheses.

[2]
5.

State the degrees of freedom for the test.

[1]
6.

Determine the conclusion of the test, clearly justifying your answer.

[4]
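
The probability parts of Question 1 can be checked with a short calculation. This is a minimal sketch, not a mark scheme: it assumes the claim means faces 1 to 5 each have probability p and the 6 has probability 2p, and the variable names are illustrative only. The observed table is not reproduced here, so the χ² statistic itself is not computed.

```python
from fractions import Fraction
from math import comb

# Assumed reading of the claim: P(1) = ... = P(5) = p and P(6) = 2p,
# so 5p + 2p = 1 and p = 1/7.
p_other = Fraction(1, 7)
p_six = 2 * p_other

# Number of sixes in 5 rolls is modelled as B(5, 2/7).
# "More than 2 sixes" means 3, 4 or 5 sixes.
p_more_than_two = sum(
    comb(5, k) * p_six**k * (1 - p_six) ** (5 - k) for k in range(3, 6)
)

# Expected frequencies for 350 rolls if the claim is true.
expected = {face: 350 * (p_six if face == 6 else p_other) for face in range(1, 7)}

# Six categories with no estimated parameters give 6 - 1 degrees of freedom.
degrees_of_freedom = len(expected) - 1

print(p_six, float(p_more_than_two), expected, degrees_of_freedom)
```

Exact fractions are used so that the expected frequencies come out as whole numbers without rounding noise.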

Question 2

HL Paper 2

In a reforested area of pine trees, heights of trees planted in a specific year seem to follow a normal distribution. A sample of 100 such trees is selected to test the validity of this hypothesis. The results of measuring tree heights, to the nearest centimetre, are recorded in the first two columns of the table below.

Describe what is meant by

1.

a goodness of fit test (a complete explanation required);

[2]
2.

the level of significance of a hypothesis test.

[1]
3.

Find the mean and standard deviation of the sample data in the table above. Show how you arrived at your answers.

[4]
4.

Most of the expected frequencies have been calculated in the third column. (Frequencies have been rounded to the nearest integer, and frequencies in the first and last classes have been extended to include the rest of the data beyond 15 and 225.) Find the values of a, b and c and show how you arrived at your answers.

[4]
5.

In order to test for the goodness of fit, the test statistic was calculated to be 1.0847. Show how this was done.

[3]
6.

State your hypotheses, critical number, decision rule and conclusion (using a 5% level of significance).

[5]

Question 3

SL Paper 2

A survey was conducted on a group of people. The first question asked how many pets they each own. The results are summarized in the following table.

The second question asked each member of the group to state their age and preferred pet. The data obtained is organized in the following table.

A χ² test is carried out at the 10% significance level.

1.

Write down the total number of people, from this group, who are pet owners.

[1]
2.

Write down the modal number of pets.

[1]
3.

For these data, write down the median number of pets.

[1]
4.

For these data, write down the lower quartile.

[1]
5.

For these data, write down the upper quartile.

[1]
6.

Write down the ratio of teenagers to non-teenagers in its simplest form.

[1]
7.

State the null hypothesis.

[1]
8.

State the alternative hypothesis.

[1]
9.

Write down the number of degrees of freedom for this test.

[1]
10.

Calculate the expected number of teenagers that prefer cats.

[2]
11.

State the conclusion for this test. Give a reason for your answer.

[2]

Question 4

HL Paper 2

Long-term experience shows that if it is sunny on a particular day in Vokram, then the probability that it will be sunny the following day is 0.8. If it is not sunny, then the probability that it will be sunny the following day is 0.3.

The transition matrix T is used to model this information, where $T = \begin{pmatrix} 0.8 & 0.3 \\ 0.2 & 0.7 \end{pmatrix}$.

The matrix T can be written as a product of three matrices, $PDP^{-1}$, where D is a diagonal matrix.

1.

It is sunny today. Find the probability that it will be sunny in three days’ time.

[2]
2.

Find the eigenvalues and eigenvectors of T.

[5]
3.

Write down the matrix P.

[1]
4.

Write down the matrix D.

[1]
5.

Hence find the long-term percentage of sunny days in Vokram.

[4]
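
The matrix parts of Question 4 can be checked numerically. This is a sketch under the assumption that T is column-stochastic with states ordered (sunny, not sunny), which is the only arrangement of the four entries whose columns sum to 1. The helper `matmul` and the candidate eigenpairs are illustrative, and the eigenpairs are verified by the code rather than derived by it.

```python
from fractions import Fraction as F

# Assumed layout: column j holds the probabilities of moving FROM state j,
# with state 0 = sunny and state 1 = not sunny.
T = [[F(8, 10), F(3, 10)],
     [F(2, 10), F(7, 10)]]

def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Sunny today: the (0, 0) entry of T^3 is P(sunny in three days' time).
T3 = matmul(matmul(T, T), T)
p_sunny_day3 = T3[0][0]

# Candidate eigenpairs (found by hand from det(T - kI) = 0); check T v = k v.
eigenpairs = [(F(1), [F(3), F(2)]), (F(1, 2), [F(1), F(-1)])]
eigen_ok = all(
    [sum(T[i][j] * v[j] for j in range(2)) for i in range(2)] == [k * x for x in v]
    for k, v in eigenpairs
)

# The eigenvector for eigenvalue 1, scaled to sum to 1, is the steady state.
steady = [x / sum(eigenpairs[0][1]) for x in eigenpairs[0][1]]
long_term_sunny = steady[0]

print(float(p_sunny_day3), eigen_ok, float(long_term_sunny))
```

With these eigenpairs, one possible choice is to take the columns of P as the two eigenvectors and D as the diagonal matrix of the matching eigenvalues; any non-zero rescaling of the eigenvectors works equally well.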

Question 5

HL Paper 3

A firm wishes to review its recruitment processes. This question considers the validity and reliability of the methods used.

Every year an accountancy firm recruits new employees for a trial period of one year from a large group of applicants.

At the start, all applicants are interviewed and given a rating. Those with a rating of either Excellent, Very good or Good are recruited for the trial period. At the end of this period, some of the new employees will stay with the firm.

It is decided to test how valid the interview rating is as a way of predicting which of the new employees will stay with the firm.

Data is collected and recorded in a contingency table.

The next year’s group of applicants are asked to complete a written assessment which is then analysed. From those recruited as new employees, a random sample of size 18 is selected.

The sample is stratified by department. Of the 91 new employees recruited that year, 55 were placed in the national department and 36 in the international department.

At the end of their first year, the level of performance of each of the 18 employees in the sample is assessed by their department manager. They are awarded a score between 1 (low performance) and 10 (high performance).

The marks in the written assessment and the scores given by the managers are shown in both the table and the scatter diagram.

The firm decides to find a Spearman’s rank correlation coefficient, $r_s$, for this data.

The same seven employees are given the written assessment a second time, at the end of the first year, to measure its reliability. Their marks are shown in the table below.

The written assessment is in five sections, numbered 1 to 5. At the end of the year, the employees are also given a score for each of five professional attributes: V, W, X, Y and Z.

The firm decides to test the hypothesis that there is a correlation between the mark in a section and the score for an attribute.

They compare marks in each of the sections with scores for each of the attributes.

1.

Use an appropriate test, at the 5% significance level, to determine whether a new employee staying with the firm is independent of their interview rating. State the null and alternative hypotheses, the p-value and the conclusion of the test.

[6]
2.

Show that 11 employees are selected for the sample from the national department.

[2]
3.

Without calculation, explain why it might not be appropriate to calculate a correlation coefficient for the whole sample of 18 employees.

[2]
4.

Find $r_s$ for the seven employees working in the international department.

[4]
5.

Hence comment on the validity of the written assessment as a measure of the level of performance of employees in this department. Justify your answer.

[2]
6.

State the name of this type of test for reliability.

[1]
7.

For the data in this table, test the null hypothesis, H₀: ρ = 0, against the alternative hypothesis, H₁: ρ > 0, at the 5% significance level. You may assume that all the requirements for carrying out the test have been met.

[4]
8.

Hence comment on the reliability of the written assessment.

[1]
9.

Write down the number of tests they carry out.

[1]
10.

The tests are performed at the 5% significance level.

Assuming that:

  • there is no correlation between the marks in any of the sections and scores in any of the attributes,
  • the outcome of each hypothesis test is independent of the outcome of the other hypothesis tests,

find the probability that at least one of the tests will be significant.

[4]
11.

The firm obtains a significant result when comparing section 2 of the written assessment and attribute X. Interpret this result.

[1]
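
Two parts of Question 5 depend only on numbers stated in the text: the stratified sample sizes and the multiple-testing probability. The sketch below assumes proportional allocation rounded to the nearest whole person, and one test for each section-attribute pairing; the variable names are illustrative. The parts that rely on the contingency table and mark tables are not covered, since those tables are not reproduced here.

```python
# Stratified sample of 18 from 91 recruits (55 national, 36 international),
# assuming allocation in proportion to department size.
sample_size, total = 18, 91
national_share = sample_size * 55 / total        # about 10.88
international_share = sample_size * 36 / total   # about 7.12
national_n = round(national_share)
international_n = round(international_share)

# Five sections compared with five attributes: one test per pairing.
n_tests = 5 * 5

# Under the two stated assumptions, each test is (falsely) significant with
# probability 0.05, independently of the others.
p_at_least_one = 1 - 0.95 ** n_tests

print(national_n, international_n, n_tests, round(p_at_least_one, 4))
```

A probability this large suggests that a single significant pairing among the tests could easily have arisen by chance, which is worth bearing in mind when interpreting the section 2 and attribute X result.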

Question 6

HL Paper 2

Steffi the stray cat often visits Will’s house in search of food. Let X be the discrete random variable “the number of times per day that Steffi visits Will’s house”.

The random variable X can be modelled by a Poisson distribution with mean 2.1.

Let Y be the discrete random variable “the number of times per day that Steffi is fed at Will’s house”. Steffi is only fed on the first four occasions that she visits each day.

1.

Find the probability that on a randomly selected day, Steffi does not visit Will’s house.

[2]
2.

Copy and complete the probability distribution table for Y.

[4]
3.

Hence find the expected number of times per day that Steffi is fed at Will’s house.

[3]
4.

In any given year of 365 days, the probability that Steffi does not visit Will for at most n days in total is 0.5 (to one decimal place). Find the value of n.

[3]
5.

Show that the expected number of occasions per year on which Steffi visits Will’s house and is not fed is at least 30.

[4]
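
Parts 1, 2, 3 and 5 of Question 6 can be sketched as follows. The code assumes Y = min(X, 4), which is one reading of "only fed on the first four occasions", and treats the yearly count of unfed visits as 365 times the daily expectation E(X) − E(Y). The helper `poisson_pmf` is illustrative. Part 4 is not attempted here.

```python
from math import exp, factorial

MEAN_VISITS = 2.1

def poisson_pmf(k, mean):
    """P(X = k) for X ~ Po(mean)."""
    return exp(-mean) * mean**k / factorial(k)

# Part 1: probability of no visit on a given day.
p_no_visit = poisson_pmf(0, MEAN_VISITS)

# Part 2: Y = min(X, 4), so P(Y = k) = P(X = k) for k < 4 and P(Y = 4) = P(X >= 4).
dist_Y = {k: poisson_pmf(k, MEAN_VISITS) for k in range(4)}
dist_Y[4] = 1 - sum(dist_Y.values())

# Part 3: expected number of feeds per day.
expected_Y = sum(k * p for k, p in dist_Y.items())

# Part 5: unfed visits per day are X - Y, with expectation E(X) - E(Y).
unfed_per_year = 365 * (MEAN_VISITS - expected_Y)

print(round(p_no_visit, 4), round(expected_Y, 4), round(unfed_per_year, 1))
```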

Question 7

HL Paper 1

A factory, producing plastic gifts for a fast food restaurant’s Jolly meals, claims that just 1% of the toys produced are faulty.

A restaurant manager wants to test this claim. A box of 200 toys is delivered to the restaurant. The manager checks all the toys in this box and four toys are found to be faulty.

The restaurant manager performs a one-tailed hypothesis test, at the 10% significance level, to determine whether the factory’s claim is reasonable. It is known that faults in the toys occur independently.

1.

Identify the type of sampling used by the restaurant manager.

[1]
2.

Write down the null and alternative hypotheses.

[2]
3.

Find the p-value for the test.

[2]
4.

State the conclusion of the test. Give a reason for your answer.

[2]
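
The p-value in Question 7 can be sketched with an exact binomial tail. This assumes the hypotheses H₀: p = 0.01 against H₁: p > 0.01 and a binomial model for the number of faulty toys in the box of 200; a Poisson approximation would be an alternative and gives a slightly different value. Variable names are illustrative.

```python
from math import comb

n, p0, observed, alpha = 200, 0.01, 4, 0.10

# Under H0 the number of faulty toys is modelled as B(200, 0.01).
# One-tailed p-value: P(X >= 4) = 1 - P(X <= 3).
p_value = 1 - sum(
    comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(observed)
)

# Reject H0 only if the p-value falls below the significance level.
reject_h0 = p_value < alpha

print(round(p_value, 3), reject_h0)
```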

Question 8

HL Paper 2

A Principal would like to compare the students in his school with a national standard. He decides to give a test to eight students made up of four boys and four girls. One of the teachers offers to find the volunteers from his class.

The marks out of 40, for the students who took the test, are:

25, 29, 38, 37, 12, 18, 27, 31.

For the eight students find

The national standard mark is 25.2 out of 40.

Two additional students take the test at a later date and the mean mark for all ten students is 28.1 and the standard deviation is 8.4.

For further analysis, a standardized score out of 100 for the ten students is obtained by multiplying the scores by 2 and adding 20.

For the ten students, find

1.

Name the type of sampling that best describes the method used by the Principal.

[1]
2.

the mean mark.

[2]
3.

the standard deviation of the marks.

[1]
4.

Perform an appropriate test at the 5% significance level to see if the mean marks achieved by the students in the school are higher than the national standard. It can be assumed that the marks come from a normal population.

[5]
5.

State one reason why the test might not be valid.

[1]
6.

their mean standardized score.

[1]
7.

the standard deviation of their standardized score.

[2]
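
The numerical parts of Question 8 can be sketched with the standard library. Several choices here are assumptions rather than anything stated in the question: the test is taken to be a one-tailed one-sample t-test on the eight marks, both the population and sample standard deviations are shown because the question does not say which is wanted, and the critical value 1.895 is the usual table value for 7 degrees of freedom at a one-tailed 5% level.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

marks = [25, 29, 38, 37, 12, 18, 27, 31]
national_mean = 25.2

# Summary statistics for the eight students.
mean_mark = mean(marks)
sd_population = pstdev(marks)   # divides by n
sd_sample = stdev(marks)        # divides by n - 1, used in the t-test

# One-sample t statistic for H0: mu = 25.2 against H1: mu > 25.2.
t_stat = (mean_mark - national_mean) / (sd_sample / sqrt(len(marks)))
T_CRITICAL = 1.895              # assumed table value: 7 df, one-tailed 5%
reject_h0 = t_stat > T_CRITICAL

# Ten students: the score 2x + 20 shifts and scales the mean,
# but only the scale factor affects the standard deviation.
standardized_mean = 2 * 28.1 + 20
standardized_sd = 2 * 8.4

print(mean_mark, round(sd_population, 2), round(t_stat, 3), reject_h0)
print(round(standardized_mean, 1), round(standardized_sd, 1))
```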

Question 9

HL Paper 2

The number of marathons that Audrey runs in any given year can be modelled by a Poisson distribution with mean 1.3.

1.

Calculate the probability that Audrey will run at least two marathons in a particular year.

[2]
2.

Find the probability that she will run at least two marathons in exactly four out of the following five years.

[4]
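
Both parts of Question 9 can be sketched in a few lines. The second part assumes that the five years are independent, so that the number of years with at least two marathons is binomial; that independence is an assumption of this sketch, and the variable names are illustrative.

```python
from math import comb, exp

MEAN_MARATHONS = 1.3

# Part 1: P(X >= 2) = 1 - P(X = 0) - P(X = 1) for X ~ Po(1.3).
p_at_least_two = 1 - exp(-MEAN_MARATHONS) * (1 + MEAN_MARATHONS)

# Part 2: years with at least two marathons, out of five, modelled as
# B(5, p_at_least_two) under the independence assumption.
p_four_of_five = comb(5, 4) * p_at_least_two**4 * (1 - p_at_least_two)

print(round(p_at_least_two, 4), round(p_four_of_five, 4))
```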

Question 10

SL Paper 2

Jim writes a computer program to generate 500 values of a variable Z. He obtains the following table from his results.

In this situation, state briefly what is meant by

1.

Use a chi-squared goodness of fit test to investigate whether or not, at the 5% level of significance, the N(0, 1) distribution can be used to model these results.

[12]
2.

a Type I error.

[2]
3.

a Type II error.

[2]