BSB123 Data Analysis
Take Home Assessment 5 – Hypothesis Testing and Simple Regression
For both questions, please show your working. You may use either tables or Excel for the final calculations; however, we want to see the process that led you there. Information for Question One can be found in the file THA5 (2021-02).xlsx
Question One
You have been hired by RRB Bank, which is interested in the factors that lead to customers being at risk of defaulting on their credit cards. The associated spreadsheet contains information on three variables:
• Gender
• Risk Rating – Internal assessment comparing monthly credit spending to credit level, rated from 0 to 4, with 0 as the lowest risk rating.
• Average Monthly Balance – the amount not paid off at the end of the month.
a. Many banks, including the Grameen Bank, consider that females are better at managing budgets. To determine if this is true, RRB has asked you to conduct a test at the 5% level of significance to see if the average monthly balance of female clients is less than that of male clients. (6 Marks)
b. RRB is also concerned that there may be gender bias in its lending practices and has decided to look at the risk ratings it has assigned to its clients. Conduct a test at the 5% level of significance to determine if there is a difference in the proportion of female and male clients who have a risk rating of 0. (Hint: Count the number of females with a rating of 0 and the number of males with a rating of 0, and convert these to proportions.) (5 Marks)
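As a rough illustration of the calculations behind parts a and b, a minimal Python sketch follows. It is not a prescribed method. It assumes a pooled-variance t-test for part a (an unequal-variance test may be more appropriate if the sample variances differ markedly) and a pooled two-proportion z-test for part b. The function names `pooled_t` and `two_proportion_z` are hypothetical, and all input values are placeholders that are NOT taken from THA5 (2021-02).xlsx.

```python
import math
from statistics import NormalDist, mean, variance


def pooled_t(sample_1, sample_2):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    n1, n2 = len(sample_1), len(sample_2)
    # Pooled variance weights each sample variance by its degrees of freedom.
    sp2 = ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(sample_1) - mean(sample_2)) / se, n1 + n2 - 2


def two_proportion_z(x1, n1, x2, n2):
    """Return (z statistic, two-tailed p-value) for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 the two groups share one proportion, estimated by pooling.
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Placeholder values for illustration only, NOT taken from the spreadsheet.
female_balances = [100.0, 120.0, 140.0]
male_balances = [130.0, 150.0, 170.0]
t_stat, df = pooled_t(female_balances, male_balances)

# Placeholder counts: rating-0 clients and group sizes for each gender.
z_stat, p_value = two_proportion_z(30, 100, 30, 100)
```

Part a is a lower-tailed test, so `t_stat` would be compared with the negative critical value from the t table at 5% with `df` degrees of freedom. Part b is two-tailed, so the null hypothesis would be rejected if `p_value` is below 0.05.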
Question Two
A university lecturer is interested in the extent to which outside commitments affect students' marks. She randomly selects 20 students and asks them to report the average number of hours per week they spend on work, sport, music or other regular commitments. At the end of the semester she compares the reported hours with the final mark in her subject:
Student Mark Hours
1 68 24
2 74 23
3 72 15
4 81 32
5 61 24
6 74 27
7 84 15
8 73 16
9 75 20
10 42 17
11 68 4
12 78 8
13 82 18
14 74 14
15 70 17
16 65 22
17 32 36
18 54 26
19 68 13
20 80 10
1. State the equation to be estimated, including any expectation you have about the relationship between marks and the number of hours committed to external activities.
2. Estimate the relationship between Marks and Hours. Which result do you use to measure the strength of the relationship? Interpret it fully.
3. Conduct a test to determine if the relationship between the two variables is significant.
4. What assumptions did you need to make in order to conduct this test?
5. State the estimated equation and interpret the slope coefficient.
6. What mark would you expect for a student who had an outside commitment of 40 hours per week? What concerns would you have with the prediction? There should be at least two.
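The calculations behind the steps above can be sketched in Python as follows. This is only one possible approach, assuming ordinary least squares with Marks as the dependent variable and Hours as the independent variable. The data are copied from the table above; the variable names are hypothetical. The sketch computes the quantities only and leaves the interpretation, the critical value from tables, and the assumptions to the written working.

```python
import math

# Data copied from the table above (student order 1 to 20).
marks = [68, 74, 72, 81, 61, 74, 84, 73, 75, 42, 68, 78, 82, 74, 70, 65, 32, 54, 68, 80]
hours = [24, 23, 15, 32, 24, 27, 15, 16, 20, 17, 4, 8, 18, 14, 17, 22, 36, 26, 13, 10]

n = len(marks)
x_bar = sum(hours) / n
y_bar = sum(marks) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in marks)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, marks))

# Least-squares estimates for Marks = b0 + b1 * Hours.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Correlation coefficient and coefficient of determination.
r = s_xy / math.sqrt(s_xx * s_yy)
r_squared = r ** 2

# t statistic for H0: slope = 0, with n - 2 degrees of freedom.
sse = s_yy - b1 * s_xy
s_e = math.sqrt(sse / (n - 2))
t_slope = b1 / (s_e / math.sqrt(s_xx))

# Point prediction at 40 hours, flagged if it lies outside the observed range.
x_new = 40
predicted = b0 + b1 * x_new
is_extrapolation = not (min(hours) <= x_new <= max(hours))

print(f"b0={b0:.3f}, b1={b1:.3f}, r={r:.3f}, r^2={r_squared:.3f}, t={t_slope:.3f}")
print(f"predicted mark at {x_new} hours: {predicted:.1f} (extrapolation: {is_extrapolation})")
```

The value of `t_slope` would be compared with the critical value from the t table with n - 2 = 18 degrees of freedom. The `is_extrapolation` flag is included because a prediction outside the observed range of Hours relies on the fitted line holding beyond the data.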