Monday, February 27, 2012

Human Computer Interaction CAP5100
Report – User Study on VOIP Interface Design
Study Design
1. Hypothesis: People who use an interface design that places the in-call features to the right of the number pad will take less time to perform VOIP tasks than people who use one that places the in-call features above the number pad.
NULL Hypothesis: Assume that an interface design with in-call feature buttons beside the number pad takes at least as much time for tasks as one with in-call feature buttons above the number pad.
2. Hypothesis: Adding different colors for different features of the interface makes it easier for the user to understand than an interface which doesn’t have different colors.
NULL Hypothesis: Assume that an interface without different colors (for different features) receives same mean score on a Likert scale as one with different colors.
3. Hypothesis: Using a single control for changing date and time is more pleasing than using separate controls/drop-down boxes for year, month, day, hour, minutes and seconds.
NULL Hypothesis: Assume that using a single control for changing date and time receives the same or a lower mean score on a Likert scale than using separate controls.
4. Hypothesis: An interface design with a backspace button is more useful than one without a backspace button.
NULL Hypothesis: Assume that an interface design without a backspace button receives same mean score on a Likert scale (usability) as an interface with a backspace button.
Metrics
The metrics used for this user study are:
• Time taken to perform the same set of tasks on the user interface and the TA’s interface.
• A user survey on a Likert scale measuring ease of use, satisfaction and aesthetics.
• Free responses from users describing difficulties faced and suggestions for improvement.
Procedure
• The participants of the user study are 20 fellow graduate students.
• The participants are provided 4 links: the user interface with a sample task list, the user interface with the main task list, the TA’s interface with the same main task list, and a survey link containing Likert scales for usability, satisfaction and aesthetic design, plus a free response field.
• In the first link, the user is given a sample task list that helps the user get familiar with the new interface. As tasks are completed successfully, the system advances to the following task. No errors or task times are recorded here; it is assumed that the user attempted only valid tasks. Once comfortable with the interface, the user can proceed to the main task list.
• The second link has the main task list. The time taken to complete the entire task list is recorded. After the entire task list is completed, the TA’s interface is displayed.
• The third link has the TA’s interface with the same task list. Time and errors are recorded.
• The fourth link has a survey on the created interface, with 7-point Likert scales for Ease of Use, Satisfaction and Aesthetic Design, and a free response field. The user completes the survey and emails it, along with the times and errors recorded for both interfaces.
Data Analysis
To accept a hypothesis (i.e., to reject the NULL hypothesis), the T-value needs to be above 1.96 (roughly the critical value for 95% confidence at these sample sizes), and the P-value should be sufficiently small, say below 0.1, indicating a negligible probability of obtaining the observed difference by chance alone.
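As a rough check, T- and P-values of this kind can be recomputed from the summary statistics reported for Hypothesis 1 below. The sketch that follows is mine, not part of the original study: it computes a Welch-style two-sample t-value and approximates the two-sided P-value with a normal distribution (the exact Student-t CDF is not in the Python standard library). The results come close to, but do not exactly match, the reported values, which may have used a slightly different test variant or rounding.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    # Welch-style standard error from summary statistics
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t = abs(mean1 - mean2) / se
    # Two-sided P-value via a large-sample normal approximation
    # (the exact Student-t CDF is not in the Python standard library)
    p = 2 * (1 - NormalDist().cdf(t))
    return t, p

# Summary statistics for Hypothesis 1 (time taken, in seconds)
t, p = two_sample_t(109.9, 34.92834771, 20, 111.05, 35.41702683, 20)
print(round(t, 3), round(p, 3))  # t ≈ 0.103, p ≈ 0.918
```

With equal sample sizes the Welch and pooled-variance forms of the standard error coincide, so the choice between them does not matter for Hypothesis 1.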
Hypothesis 1:
Data: Time Taken + Free Comments
Interface   Size   Mean     Std. Dev.      T-Value    P-Value
User        20     109.9    34.92834771    0.106077   0.918196564
TA          20     111.05   35.41702683
Based on the data above, the P-value is 0.918, which shows a high probability that the observed difference is due to chance. Although the average time taken for the task set on the User interface is less than on the TA’s interface, the difference is not significant given the high P-value. So, next we consider the free comments section: 4 comments on the survey stated that the in-call features being far from the display made it more time consuming to check the display to see whether they were working. Thus we fail to reject the NULL hypothesis, and the given hypothesis is rejected.
Hypothesis 2:
Data: Aesthetics Likert Scale + Free Comments
Interface   Size   Mean   Std. Dev.   T-Value    P-Value
User        20     5.5    1.05        0.910071   0.32442
TA          50     5.76   0.751597
The P-value of 0.32 means the difference in mean scores is not statistically significant on its own. Going through the free comments sections of both interfaces, there are 7 comments in the TA’s survey mentioning the lack of color, and 2 comments on the user interface noting that the colors are aesthetically pleasing. On this qualitative evidence, the NULL hypothesis is rejected and the hypothesis is accepted.
Hypothesis 3:
Data: Satisfaction Likert Scale + Free Comments
Interface   Size   Mean   Std. Dev.   T-Value    P-Value
User        20     5.45   0.6475      1.096136   0.20017
TA          50     5.74   0.876216
Here, the P-value is 0.2, lower than the previous values but still not significant. The mean score is higher for the TA interface, suggesting that satisfaction is greater with the TA’s interface than with the user interface. In addition, 3 user comments on the user interface mention that a single date/time control was hard to use. The data are thus consistent with the NULL hypothesis, so the proposed hypothesis is rejected.
Hypothesis 4:
Data: Ease of Use Likert Scale + Free Comments
Interface   Size   Mean   Std. Dev.   T-Value    P-Value
User        20     5.3    0.732695    3.651484   0.00018
TA          50     6.08   0.751597
The P-value here is nearly zero, indicating that the difference is statistically significant; the high T-value supports the same conclusion. The TA interface is easier to use, but this cannot be taken as the only metric: there were 7 free comments asking for a backspace button, so we can safely conclude that an interface with a backspace button is more useful than one without.
Conclusions
Bad Design Decisions
1. I had added a backspace button right next to the display (to delete wrongly typed numbers) with the image of an arrow pointing right rather than left, because I felt it was more intuitive to show the direction in which deletion would take place. This didn’t work as anticipated because:
a. Users were confused, as they were used to seeing backspace buttons with the arrow pointing to the left.
b. While voicemail statuses such as “Playing message 1 of 4” were displayed, they assumed the arrow was for navigating through voicemail.
2. I had placed the dial button below the number pad rather than above. I thought it would take less time to finish dialing a number and move the mouse down rather than up, but this was not a good decision: some users expected the dial button to be above the pad and had trouble finding it, despite its large size and color.
Bias and Confounds
1. The task list for evaluating my user interface joined multiple tasks together in certain instances, which might have contributed to the difference in time taken between the user interface and the TA interface.
2. Since the users had been testing other interfaces as well, the majority of which had in-call controls placed right above the number pad, my user interface might have confused them and caused them to spend more time on the task list. The graph on the following page, showing the total time taken on the TA’s interface and the user interface, clearly shows this variation: approximately half the users performed better on the TA’s interface and the remaining on my interface.
Design Lessons Learnt
1. Grouping elements that logically belong together helps make the user experience more satisfying (e.g., the in-call features above the number pad).
2. Fields which display important information to the user should be larger than the surrounding elements.
3. Adding color and icons to buttons and other elements should be done appropriately, to avoid confusing the user.
4. A good interface design needs to separate elements into their logical parts rather than clumping them up together.
5. To improve aesthetic appeal to users, similar elements should be sized similarly, to provide visual cues.

Graphs Comparing User Interface with TA Interface
[Graphs omitted from this text version.]
a) Ease of Use – Likert Scale
b) Satisfaction – Likert Scale
c) Aesthetics – Likert Scale
d) Total time taken on task list – Seconds
Legend
Blue – Evaluation on User Interface
Red – Evaluation on TA’s Interface

Time Spent on Other Interfaces
S.No. Name Time spent (seconds)
1 Sushma Gopalakrishnan 161
2 Chirag Gupta 214
3 Keerthi Gurijala 97.467
4 Guangyan Hu 60
5 Milan Jape 119
6 Arundhati Jaswal 103
7 Arjun Kantamneni 211
8 Vaishnavi Krishnan 103
9 Ashok Rajendran 59
10 Neeraj Rao 136
11 Nachiketa Roy 127.147
12 Vikrant Sagar 115
13 Tejas Shah 54
14 Vaarij Shah 68
15 Amin Shams 93
16 Sivagaminathan Sivasankaran 153
17 Arpit Tripathi 108
18 Neha Uppal 90.603
19 Shangqing Wang 129
20 Tianbo Xue 160
Time Spent on My Interface
S.No. Name Time spent (seconds)
1 Navina Ramesh 215
2 Vaarji Shah 145
3 Amin Shams 139
4 Doaa Elsheikh 133
5 Yashwant Bisht 131
6 Arpit Tripathi 124
7 Kretika Gupta 120
8 Neha Uppal 119
9 Shangqing Wang 118
10 Harpreet Arora 110
11 Jose Dunia 108
12 Tejas Shah 105
13 Vikrant Sagar 93
14 Giridhar Vijaykumar 89
15 Sivagaminathan Sivasankaran 86
16 Dongsan Yoon 81
17 Rahul Bhoopalam 78
18 Andrew Cordar 75
19 Robert Edmondson 72
20 Tianbo Xue 57
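As a sanity check on the summary statistics used in the Data Analysis section, the mean and sample standard deviation for the User row of Hypothesis 1 can be recomputed from the raw times listed above. This short Python sketch is my addition, not part of the original study:

```python
from statistics import mean, stdev

# Task-completion times (seconds) on my interface, from the table above
times = [215, 145, 139, 133, 131, 124, 120, 119, 118, 110,
         108, 105, 93, 89, 86, 81, 78, 75, 72, 57]

print(mean(times))             # 109.9, matching the reported mean
print(round(stdev(times), 2))  # 34.93, matching the reported 34.92834771
```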
