Less than half of the world uses the basic Latin alphabet in their native languages, so it’s somewhat surprising that most of the world can only use basic characters in their email addresses and domain names. Part of the blame lies with ISPs, part with the largest email providers, and part with the way that the internet has evolved. Google has taken positive steps in the past month to bring international address functionality to the Gmail product line, and will be introducing the feature into Google Calendar in the coming months.
International Email: What Is It, and What Are the Challenges?
When international email is mentioned, the concern is actually with email addresses that use international character sets. These characters could come from languages with extended Latin alphabets, or even completely different alphabets such as Cyrillic.
Take these examples,
Both addresses include symbols from an alphabet outside of the ASCII character set. Up until now, large providers like Gmail, Microsoft, Yahoo, and the majority of internet service providers in the United States have supported only email headers encoded in ASCII.
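As a rough illustration of the distinction, the sketch below reports which part of an address falls outside ASCII. The function name and the sample addresses are hypothetical and are not the examples referred to in this article.

```python
def non_ascii_parts(address: str) -> dict:
    """Report whether the local part or the domain contains non-ASCII characters."""
    # Split on the last "@" so the domain is whatever follows it.
    local, _, domain = address.rpartition("@")
    return {"local": not local.isascii(), "domain": not domain.isascii()}


# Hypothetical sample addresses, for illustration only.
for addr in ["user@example.com", "zo\u00eb@example.com"]:
    print(addr, non_ascii_parts(addr))
```

An address for which both values are False fits in ASCII-only headers. Any other address needs a mail system that accepts Unicode headers.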
Gmail has implemented a system based on an RFC put forward by the Internet Engineering Task Force (IETF). The system offers a way for mail providers to allow their systems to accept emails originating from addresses using full Unicode character sets. This includes addresses like the examples shown above, as well as extended Latin characters using accents such as ë, ö, ù, etc.
The encoding involved, UTF-8, is a variable-width scheme that represents each character with one to four 8-bit bytes and can encode every Unicode character in existence.
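To make the byte-level picture concrete, here is a minimal sketch that prints how many bytes UTF-8 needs for a few characters. The sample characters and the helper name are chosen for illustration only.

```python
def utf8_length(ch: str) -> int:
    """Return the number of 8-bit bytes UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


# Illustrative characters: basic Latin, accented Latin, Cyrillic, CJK, emoji.
samples = ["a", "\u00eb", "\u0434", "\u6b66", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), hex {encoded.hex()}")
```

Plain ASCII characters keep their single-byte form. That is why ASCII-only addresses continue to work unchanged in a UTF-8 system.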
How Does It Work?
Whenever an email is sent, it is relayed between mail servers over SMTP and then retrieved by the recipient’s mail client from a POP or IMAP server. Conventional mail servers, mail clients, and online email services like Gmail and Outlook Mail have traditionally handled only message headers containing ASCII characters.
Users trying to receive email from senders whose addresses use extended character sets could not do so in a conventional mail client or through a web-based mail system. Previously, a mail system might try to re-encode data in a non-native way, leaving the recipient with incomplete or corrupted headers and other information within an email. This happened because the mail system would first take a message header and convert it to the character set native to that system, a conversion that is not always successful.
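The kind of damage described above can be reproduced in a few lines. This is only a sketch of the general failure mode, using a made-up header, and does not model any particular mail system.

```python
# Hypothetical header containing the character U+00EB (e with diaeresis).
header = "From: Zo\u00eb <zo\u00eb@example.com>"

# The sender's system emits the header as UTF-8 bytes.
raw = header.encode("utf-8")

# Case 1: a system that assumes Latin-1 misreads each multi-byte character.
garbled = raw.decode("latin-1")

# Case 2: a system that forces ASCII replaces what it cannot represent.
lossy = header.encode("ascii", errors="replace").decode("ascii")

print(garbled)
print(lossy)
```

In the first case each two-byte character is shown as two unrelated symbols. In the second it collapses to a question mark, so the original address can no longer be recovered for a reply.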
Google has implemented a system that lets Gmail receive, and reply to, emails originating from sources that use full UTF-8 character encoding. RFC 6855, 6856, 6857, and 6858 are IETF specifications covering UTF-8 support in IMAP and POP3 and the ‘downgrading’ of internationalized messages for clients that can only handle ASCII headers. They give Google a basis for adjusting its mail protocols to handle email headers encoded in UTF-8.
What Does This Mean for Gmail Users?
Gmail users won’t need to change anything in their accounts. Users will notice the improved support when communicating with their colleagues, friends, and associates.
At this stage Gmail users cannot create email addresses that use extended character sets, but this is something that Google is considering for future updates.
Widespread Implications and Problems
Although this is a significant step for Google, it doesn’t reflect the position of the rest of the world’s largest email providers. The specifications are there for anyone to make these changes, but it will require widespread adoption before email across the internet becomes truly internationalized. Even some simple processes such as on-site email verification don’t support the full ASCII character set, let alone full UTF-8 encoding.
Google has shown initiative by being one of the first major service providers to internationalize its email systems. The speed at which the rest of the industry removes similar barriers will determine how soon the internet realizes its potential as a truly global communication tool.
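The verification problem can be sketched as follows. The regular expression is a hypothetical example of the restrictive patterns often found in sign-up forms, and the permissive check is a deliberately loose alternative rather than a full validator.

```python
import re

# Hypothetical restrictive pattern: a limited subset of ASCII only.
ASCII_ONLY = re.compile(r"^[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


def strict_check(address: str) -> bool:
    """Accept only addresses that match the restrictive ASCII pattern."""
    return ASCII_ONLY.match(address) is not None


def permissive_check(address: str) -> bool:
    """Loose check: an '@', a non-empty local part, a dotted domain, no whitespace."""
    local, sep, domain = address.rpartition("@")
    has_space = any(c.isspace() for c in address)
    return bool(sep and local and "." in domain and not has_space)


# Hypothetical addresses: plain ASCII, ASCII with '+', and non-ASCII.
for addr in ["user@example.com", "user+tag@example.com", "zo\u00eb@example.com"]:
    print(addr, strict_check(addr), permissive_check(addr))
```

The restrictive pattern rejects the address containing a plus sign even though that character is plain ASCII, and it rejects the internationalized address as well. The loose check accepts all three and leaves final validation to an actual delivery attempt.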