Project Overview

Measuring aspects of the multilingual web

Faculty Sponsor

Joel Sommers (jsommers@colgate.edu)

Department(s)

Computer Science

Abstract

Many websites can display content in more than one language. Websites are frequently available in the lesser-used (or non-dominant) languages of a region (e.g., Welsh within Wales, Maori within New Zealand, Spanish within the USA), yet bilingual speakers of those languages do not always use the non-dominant language services, preferring instead the dominant language of the region (e.g., English over Welsh, Maori, and Spanish). There are many reasons why speakers of non-dominant languages may not use web-based services available in their language. Despite the relatively low reported usage, however, these users state that they want services in the non-dominant language and say that they will use them. So why don't they?

The goal of this summer research project is to carry out measurement experiments to better understand the mechanisms by which multilingual and bilingual websites decide which language to display content in. Several mechanisms are available to website developers, but it is presently unknown how they are used in practice or how prevalent each one is. A server may use a "hardcoded" default (i.e., all clients see the same language, typically a "dominant" language such as English); it may perform geolocation on the client's IP address and select a language based on that location; it may act (or not) on a language preference that the client sends as part of a web request; or it may infer the language from other protocol header information.

Students on this project will design experiments to measure characteristics of multilingual and bilingual websites, create software to carry out these experiments, and collect and analyze the resulting data.
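
As a rough illustration of one of the mechanisms described above (a language preference sent by the client), the following Python sketch shows one way a server might interpret an HTTP Accept-Language header. The function names parse_accept_language and choose_language, the fallback to the primary language subtag, and the default of "en" are assumptions made for illustration only; real servers may apply different matching rules or ignore the header entirely.

```python
def parse_accept_language(header):
    """Parse an Accept-Language header value into (tag, q) pairs,
    ordered from most to least preferred."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a tag without an explicit weight counts as q=1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # assumption: treat a malformed weight as "not acceptable"
        prefs.append((tag.strip().lower(), q, position))
    # Highest weight first; ties keep the order in which the client listed them.
    prefs.sort(key=lambda p: (-p[1], p[2]))
    return [(tag, q) for tag, q, _ in prefs]


def choose_language(header, available, default="en"):
    """Hypothetical server-side choice: return the first client preference
    that the site offers, otherwise the hardcoded default."""
    offered = {lang.lower(): lang for lang in available}
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue  # q=0 means the client explicitly rejects this language
        if tag in offered:
            return offered[tag]
        primary = tag.split("-")[0]  # e.g. "mi-NZ" falls back to "mi"
        if primary in offered:
            return offered[primary]
    return default
```

For example, choose_language("cy, en;q=0.8", ["en", "cy"]) selects "cy", while a header naming only languages the site does not offer falls back to the default. A measurement client could send requests that differ only in this header and compare the language of each response to see whether a given site honors the preference.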

Student Qualifications

Basic knowledge of HTML and web protocols is helpful but not required. Students must have taken at least COSC102 (or the equivalent) and be strongly proficient in Python and/or Ruby, as well as in object-oriented programming and basic data structures. A desire to learn new languages and tools through this work is essential!

Number of Student Researchers

2 students

Project Length

8 weeks


Applications open on 01/15/2017 and close on 02/07/2017


If you have questions, please contact Karyn Belanger (kgbelanger@colgate.edu).