Numbers vs Words

Presenting Your Rating Scales - Numbered versus Worded Lists

Creating proper rating scales is a skill that separates the survey design experts from the amateurs. Rating scales allow the researcher to measure the opinions and behaviours of respondents in a quantitative manner. Without the proper survey scaling, collected information runs the risk of containing bias and negatively impacting the survey results.

Last week, we discussed the different effects of even and odd question scales. I thought it would be great to continue the conversation on proper survey scaling by discussing the difference between using a number rating scale (e.g., 1-10) and a word rating scale (e.g., Very Unlikely-Very Likely). Both are incredibly useful in their own ways and have become commonplace in today’s online surveys. In this article, we will go over the strengths and weaknesses of using each in your research projects.

Pick a Number! Any Number!

The number scale is a universally accepted form of survey measurement. In a numbered scale question, the number selected indicates the strength of the respondent’s opinion.


The greatest strength of a number scale is its simplicity. When conducting a survey internationally or with populations that have lower levels of education, it is sometimes hard to predict how a question could be misinterpreted. However, almost all cultures around the world are familiar with the standard number system and have seen a 1-10 rating scale before.

Another great strength of a number scale is the ease of conducting statistical analyses. With options that are simply numbered, each category’s label is also its score, so there is no need to code the information before crunching numbers. Conversely, a word scale requires the researcher to code the responses into score values, run the statistical analysis, and finally interpret what the resulting numbers mean in terms of the word scale.

Moreover, the numbered scale gives the researcher the ability to ask for a more precise answer. Word scales tend to become overwhelming when there are more than seven categories to choose from. With a number list, a scale can be much longer without confusing participants. (Tip: Use the slider question type to best portray a number scale with more than 10 intervals.)


Unfortunately, there is a major drawback to the number scale: its meaning is very subjective. Depending on the person, a 5 on a scale from 1-10 could mean anything from good to barely a pass. Beyond this, some people find it much more difficult than others to justify selecting a category on the extreme end of the scale. As a result, respondents with the same opinion may select different categories, creating a source of response error in the survey. This error will, in turn, make it difficult to qualify what opinion the resulting data truly represents.

To combat this problem, it is useful to add descriptors either in the wording of the question or on the ends of your numbered scale. This will give respondents a better idea of what the scale represents, and also allow the researcher to more accurately define the meaning of the data they are collecting.

Using Your Words

Similar to the number scale, the word scale provides a list of scored categories for the respondent to select from. However, instead of each category being identified by its score value, the word scale uses a description that indicates what each category represents.

Scoring values are customizable, but would typically be as follows: Very Satisfied = 5, Satisfied = 4, Neutral = 3, Dissatisfied = 2, Very Dissatisfied = 1, N/A = Void Response.
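The coding step for a word scale can be sketched in a few lines of Python. This is only an illustration, assuming the typical scoring listed above; the function name code_responses and the sample responses are hypothetical, and N/A answers are simply dropped as void responses before averaging.

```python
from statistics import mean

# Typical scoring from the article. "N/A" maps to None so that it is
# treated as a void response rather than as a score.
SCORES = {
    "Very Satisfied": 5,
    "Satisfied": 4,
    "Neutral": 3,
    "Dissatisfied": 2,
    "Very Dissatisfied": 1,
    "N/A": None,
}


def code_responses(responses):
    """Convert word labels to scores, dropping void (N/A) responses."""
    coded = [SCORES[label] for label in responses]
    return [score for score in coded if score is not None]


# Hypothetical sample data, for illustration only.
sample = ["Very Satisfied", "Satisfied", "N/A", "Neutral", "Satisfied"]
coded = code_responses(sample)
average = mean(coded)

print(coded)    # scores for the four non-void responses
print(average)  # mean score across the non-void responses
```

Keeping N/A out of the average matters: scoring it as 0 or 3 would quietly pull the result toward dissatisfied or neutral.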


The largest strength of a word scale is the description it provides the respondent for each category in the question. Respondents can internalize their own feelings on the subject and decide which label reflects their opinion best. The word labels also allow the respondent to know exactly how their answers will be interpreted.

Not only does the word scale help describe each category, it also allows researchers to present findings in the respondents’ own terms. For example, if 20% of respondents answered 5 on a scale of 1-5, the researcher would still not be able to say in words what that score means. On the other hand, if 20% of people answered ‘Very Satisfied,’ the researcher can safely report that 20% of people in the study were very satisfied. Of course, there will always be subjectivity based on how generous respondents are on each word scale, but at the very least, it provides a direct connection between the respondent and researcher on the meaning of each score.

Another strength of a word scale is the flexibility in its scoring. With worded labels, the researcher has the freedom to score and label categories however they see fit without confusing the respondent.

Scaling can also be unbalanced, and the scoring could look something like this: Extremely Good = 10, Very Good = 8, Good = 6, Not Bad = 5, Bad = 3, The Worst = 0. Even though this is a complex scoring system, the respondent will decide what category to select based solely on each option’s wording, allowing the researcher free rein on scoring without fear of creating confusion.
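To make the unbalanced scoring concrete, here is a small Python sketch, assuming the example scores above. The helper step_sizes and the sample answers are hypothetical. It shows that the gaps between neighbouring categories are uneven, even though the respondent only ever sees the labels.

```python
from statistics import mean

# Unbalanced scoring taken from the example in the article.
UNBALANCED = {
    "Extremely Good": 10,
    "Very Good": 8,
    "Good": 6,
    "Not Bad": 5,
    "Bad": 3,
    "The Worst": 0,
}


def step_sizes(scoring):
    """Return the score gap between adjacent categories, best to worst."""
    values = sorted(scoring.values(), reverse=True)
    return [high - low for high, low in zip(values, values[1:])]


gaps = step_sizes(UNBALANCED)

# Hypothetical answers; respondents choose labels, never the scores.
answers = ["Good", "Not Bad", "The Worst", "Extremely Good"]
average = mean(UNBALANCED[label] for label in answers)

print(gaps)     # uneven steps between neighbouring categories
print(average)  # mean score under the unbalanced scoring
```

Because the steps are uneven, an average computed this way depends on the researcher’s scoring choices, so it is worth reporting the scoring alongside the results.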


Of course, the downside of a word scale is that respondents who are not fluent in the language may have difficulty understanding it. Beyond this, the worded labels force respondents to fit into the researcher’s categories instead of expressing their own opinion. This potential for survey bias is best combatted through pretests designed to ensure respondents are comfortable with answering your scale, as well as the addition of an opt-out category like ‘N/A’, ‘Not Sure,’ or ‘Don’t Know.’

Moreover, a word scale is limited in the number of categories it can include. As mentioned earlier, a word scale with more than seven categories tends to confuse and overwhelm many respondents. This limits the level of precision achievable by a word scale as compared to a number scale.

We Want to Hear from You!

FluidSurveys wants to know whether you’re a number scale or a word scale person and why. How have you taken advantage of each rating question type in your research projects? What are your rating scale pet peeves? Give us your take on the topic in the comments box below!


  • Confused customer says:

    I am not sure if you are familiar with Chrysler’s customer service surveys, which ask customers of dealership service departments to rate their experience on a number scale of 1-10. When reporting data back to the dealer, Chrysler only counts 9s and 10s. Anything less than a 9 is basically the same as a zero. This is how Chrysler gets a customer service score for a dealer. Does this make sense? As a customer, I would think an 8 is pretty good, but I found out that I am really giving the service department a 0. I felt terrible.

    • RickPenwarden says:

      Hi confused customer!
      In the business world, it is quite common for companies to use 10-point rating scales with a cutoff at a certain number. That number is usually based on past surveys showing that customers who were incredibly dissatisfied would actually give a response of 4-6 instead of 1 or 2, and therefore anything lower than a 6 shouldn’t have any value.
      In fact, the question type that is all the rage right now is the Net Promoter Score. Check out the Wikipedia page to see how it works:
      The short version is that the Net Promoter question has a predetermined segmentation of your respondents based on a loyalty rating scale from 0-10: 0-6 represents detractors, 7-8 represents passives, and 9-10 represents promoters.
      Chrysler is probably trying to measure how many of their customers are loyal promoters. A response of 8 is good, but chances are you won’t be shouting from the rooftops about how great an experience it was 🙂
      Remember, number scales are very abstract. We don’t know what these numbers truly mean to people who select them. Is 5 average or barely passing? After a few surveys you can start to see patterns in your data and attribute better values for each number.
      Hopefully this helps, it’s always fun to talk about survey research theory!
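
The segmentation described in the reply above can be sketched in Python. This is a minimal illustration, not any company’s actual method. It assumes the 0-6, 7-8, and 9-10 cutoffs from the reply and the standard definition of the Net Promoter Score as the percentage of promoters minus the percentage of detractors. The function names and sample ratings are hypothetical.

```python
def segment(score):
    """Classify a 0-10 loyalty rating into a Net Promoter segment."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def net_promoter_score(scores):
    """Percentage of promoters minus percentage of detractors."""
    segments = [segment(s) for s in scores]
    promoters = segments.count("promoter")
    detractors = segments.count("detractor")
    return 100 * (promoters - detractors) / len(segments)


# Hypothetical ratings, for illustration only.
sample = [10, 9, 8, 8, 7, 6, 3, 10, 9, 5]
nps = net_promoter_score(sample)

print(segment(8))  # an 8 counts as a passive, not a promoter
print(nps)         # net score for the sample ratings
```

Note that a rating of 8 lands in the passive group, so it neither adds to nor subtracts from the net score.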
