Friday, 24 February 2012

Google+ evaluation results

On 20 September 2011, after a test period of about three months, Google released Google+, its social networking site, to the general public. It introduces some new concepts, such as Circles to organise contacts and Hangouts for video chat. To get an idea of what users think of the application, we conducted a small evaluation. This article presents a summary of our results. A more extensive document can be found here.

Approach and setting

Our evaluation is based on the QUIS 5.0 questionnaire (Questionnaire for User Interaction Satisfaction), with some additional questions about age, IT experience, social network experience, Google+ experience and Google experience in general. We chose this approach because it is an established method to gather user satisfaction information conveniently within a short time span, and because its questions are carefully formulated and scoped. Other questionnaires or heuristics are sometimes too high-level (e.g. NAU, ASQ) or too long (e.g. PUTQ). Moreover, QUIS groups its questions into categories, allowing us to judge the application at different levels of abstraction. Other questionnaires do this as well (e.g. USE, CSUQ) and mostly cover similar issues, so we chose QUIS as a standard.

As the questionnaire yields rather general information, we also interviewed some additional users. For this, we used lightweight 'think aloud' interviews: lightweight in the sense that each user was monitored by a single person, although audio and screen were also captured for further study. We chose this dual approach to obtain both general and specific information.

With this information, we want to draw some conclusions about user satisfaction for the various points of interest in the QUIS questionnaire. This can guide developers in their general design. On the other hand, we will highlight some specific flaws noticed by several users, so that more targeted corrections are possible as well.

Participants

Because questionnaires have become very common, people usually delay their response or don't answer at all unless asked personally. A time span of one week is rather short as well. These two effects led to a rather low response of 11 people, nearly all in the 20-25 age class. The chart below shows some other user characteristics.


IT experience is spread evenly. Time spent on social networks is rather high, which might be due to the age class the participants belong to. Whereas people have used Google products quite often, Google+ seems to have a smaller audience so far.

For the interviews, we asked four inexperienced users to perform some specific tasks. Their profiles are consistent with those of the questionnaire participants.


Evaluation method

We evaluated the questionnaire graphically with boxplots. On the one hand, we made a plot containing all of the questions; on the other hand, we made a per-category plot. For each category, and for each question within it, we then reasoned about how the users were distributed over the scores and which opinions were outliers. We also referred to the positive and negative comments we received.
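To illustrate the procedure, here is a minimal sketch of how such a per-category boxplot can be produced. The category names follow QUIS, but the data layout and all scores below are hypothetical placeholders, not our actual responses:

```python
# Minimal sketch of the per-category boxplot analysis (hypothetical data).
import matplotlib.pyplot as plt

# Hypothetical QUIS responses: per category, one list of scores per question,
# on a 9-point scale. Real QUIS has more questions per category than shown.
responses = {
    "Overall reaction":    [[6, 7, 5, 8, 6], [7, 6, 6, 5, 7]],
    "Screen":              [[5, 6, 7, 6, 5], [6, 5, 6, 7, 6]],
    "Terminology":         [[5, 5, 6, 4, 6], [6, 5, 5, 6, 5]],
    "Learning":            [[7, 6, 6, 7, 5], [6, 7, 6, 6, 7]],
    "System capabilities": [[8, 7, 8, 7, 8], [7, 8, 7, 8, 7]],
}

# Per-category plot: pool all answers within a category into one box.
pooled = {cat: [s for q in qs for s in q] for cat, qs in responses.items()}
plt.boxplot(list(pooled.values()), labels=list(pooled.keys()))
plt.ylabel("QUIS score (9-point scale)")
plt.xticks(rotation=30, ha="right")
plt.tight_layout()
plt.savefig("quis_categories.png")
```

The all-questions plot works the same way, with one box per question instead of one per category.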

The user interviews were compared horizontally, that is, we looked for difficulties experienced by multiple users performing the same task. We also tried to match these findings with the results from the questionnaire.

Evaluation results

The chart below shows the results of the QUIS questions. The numbers refer to the questions as shown on this questionnaire.



The graph below shows the time (min:s) it took users to complete each task, giving a general quantified overview of the user interviews.

*estimated timings.

Users 3 and 4 were significantly slower to create an account because they had to create a Google account as well. Signing out was the easiest task to figure out. Deleting a profile was not an obvious thing to do; one user eventually resorted to Google search to find the option.
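As an aside, aggregating such min:s timings is straightforward. Below is a small sketch of the kind of summary we derived from them; the timing strings are placeholders, not our actual measurements:

```python
# Hypothetical task timings (min:s) for users 1-4; placeholder values only.
timings = {
    "Create account": ["1:05", "0:58", "3:40", "4:10"],
    "Sign out":       ["0:12", "0:10", "0:15", "0:09"],
    "Delete profile": ["2:30", "1:45", "3:05", "2:50"],
}

def to_seconds(mmss: str) -> int:
    """Convert a 'min:s' string such as '2:30' into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Print mean and maximum completion time per task.
for task, times in timings.items():
    secs = [to_seconds(t) for t in times]
    print(f"{task:15} mean {sum(secs) / len(secs):6.1f}s  max {max(secs)}s")
```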

A more detailed analysis is provided in our more extensive document.

Conclusions

The representation on the screen was deemed slightly better than average. The use of highlighting could be improved. Two interviewees experienced the same problem: they didn't see they could create an account with their existing Google account, so they started from scratch. Highlighting could solve this.

Terminology was rated similarly. Although terms are used consistently throughout the system, they represent unfamiliar concepts and therefore seem unrelated. Typical examples are Circles and Hangouts: all interviewees had trouble starting a 'video chat' because they didn't link it to Hangouts. Still, the pictorial representations serve as a useful clue.

Learning was also rated better than average. The availability of supplementary material could be improved: we know some material exists, but it is hard to find. Users also thought the application was easy to learn by trial and error, as they are used to this kind of layout from other social network sites. In the interviews, people stated that even Hangouts were not much of an issue once they knew the term.

The capabilities of the system were considered its best part. We can conclude that the system is fast and reliable. Still, we wonder whether, for instance, the positive rating for noise is based on a purely auditory interpretation, whereas we rather read it as a measure of how intrusive the system is: many people complained about clutter and irritating pop-ups.

Overall, the reaction to the software was rather positive. This conclusion is backed up by the impressions of the interviewees.

14 comments:

  1. There seems to be something wrong with your pdf (at least when I try to open it), as it only shows the graphs and weird grayscale-gradient bars instead of text. The link to the QUIS questions also results in a 404...

  2. You performed both the questionnaire approach and the think aloud approach? Which one, in your experience, would be most useful for our own software project?

    Replies
    1. I think the questionnaire is a great tool if you want to extract some general thoughts about the system, while the think aloud approach is better suited to finding direct problems or issues.

    2. It is indeed a great tool, but maybe too general.
      We shared our link through Facebook and someone commented that it is a lame questionnaire, etc.
      So we answered that it was at the recommendation of prof. Duval (=D) and that it is a popular questionnaire in this domain.

  3. Personally, I think of advertisements when you ask about noise and clutter on Google+. Maybe that is what other people were thinking about when answering your question, because there are no ads on Google+ (yet).

  4. We also did an evaluation using the QUIS questionnaire and obtained similar results. Our test subjects also thought that there wasn't enough reference material available. I find this quite odd, because there is material available; people just don't seem to find it for some reason.

    Replies
    1. Good remark. As you also reported on our blog, our QUIS results (team Chimaera) regarding the available reference material were different: the participants in our test indicated that there was enough reference (help) material available. This probably has to do with the sample size and the different background (IT experience) of the test population.

  5. Indeed, I don't think the problem is that there's not enough reference material; people just don't seem to find it that easily.

  6. Very nice report. Our group evaluated Google+ using a usability lab. The results of starting a video chat are quite interesting. We didn't have this in our assignments (though one of our subjects did start a hangout when he was asked to comment on a message). The term hangout isn't very clear indeed, but perhaps it is for native English speakers?

  7. The problem with the terminology showed in our results as well. I actually find it a pity, because they clearly put much effort into making an intuitive design, putting options where you want them to be, organizing the flow of screens... But then users cannot find the option or start the flow because they don't understand the name on the button...

  8. Good idea to use the QUIS questionnaire (subjective) in combination with an objective parameter such as the time it took users to complete certain tasks. However, how did you control for other factors that can influence this 'time' parameter in your test (e.g. environmental factors such as noise, light, the presence of other persons, etc.)?

    Replies
    1. I noticed that, for instance, Team Sjiek took some measures in their test: they placed the test computer facing a window with closed curtains, to avoid external distraction that could influence the results. If confounding factors are not measured and considered, the study may lead to wrong conclusions. The use of a real lab could lead to a more accurate evaluation, something we can consider in future evaluations.

    2. What you say is true: to get better results we should use professional methods like the ones you suggested, but in our case we didn't make it as complicated as some other teams did. It surely does change the results to some degree, though.
