SMS Language

When using SMS for business, you may have noticed that if your text contains characters specific to your country’s alphabet, the number of remaining SMS characters can drop significantly. This can cause longer messages to be split into two or more separate messages, which can double or even triple the cost of reaching your customer.

By using the standard encoding for GSM messages, the 7-bit default alphabet, you can fit 160 characters in a single SMS message. The allowed characters are shown below as the Basic Character Set (left) and the Basic Character Set Extension (right).

Character set

If you include even a single character that is not supported in the default alphabet, all characters in the message are encoded using a different standard (Unicode), which drops the maximum to only 70 characters per message! A message longer than 70 characters is split into parts, and every part is limited to at most 70 characters, even if a later part contains only characters from the basic GSM alphabet. The same rule applies to every subsequent message part.
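The effect of a single unsupported character can be estimated with a short script. The following is a minimal sketch, not part of the API: the function name estimate_sms is hypothetical, the character tables follow the standard GSM 7-bit alphabet, and the per-part limits of 153 (GSM) and 67 (Unicode) for split messages are the usual concatenation figures rather than numbers taken from this tutorial. It also ignores the edge case where a two-unit extension character falls exactly on a part boundary.

```python
import math

# GSM 7-bit basic character set: each character costs one unit.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension characters are sent as an escape plus a code, so they cost two units.
GSM_EXTENSION = "^{}\\[~]|€\f"


def estimate_sms(text):
    """Return (encoding, units, parts) as a rough estimate for a message text."""
    if all(ch in GSM_BASIC or ch in GSM_EXTENSION for ch in text):
        encoding = "GSM-7"
        units = sum(2 if ch in GSM_EXTENSION else 1 for ch in text)
        single_limit, part_limit = 160, 153  # part limit is an assumption
    else:
        encoding = "UCS-2"
        # Unicode messages are counted in 16-bit units, so emoji count twice.
        units = len(text.encode("utf-16-be")) // 2
        single_limit, part_limit = 70, 67  # part limit is an assumption
    parts = 1 if units <= single_limit else math.ceil(units / part_limit)
    return encoding, units, parts


print(estimate_sms("a" * 160))        # fits in one GSM message
print(estimate_sms("a" * 159 + "ı"))  # one Turkish character forces Unicode
```

The second call shows the cost jump described above: the same length of text needs several parts once a single character outside the default alphabet is present.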

MESSAGE ENCODING

We will not get into message encoding details in this tutorial. If you wish to learn more about the subject, visit this page.

If you need to handle special characters in your SMS messages, there are two main approaches that bring character capacity closer to the standard SMS size:

  • Transliteration
  • National Language Identifier (NLI)

National Language Identifier 

National Language Identifier (NLI) is an encoding technology that allows an SMS containing language-specific characters, which would usually be treated as 16-bit Unicode, to be delivered as the original text while deducting only 5 characters from the maximum SMS length, leaving 155 characters. The 5 deducted characters are used in the background to tell the receiver’s device which language was selected and how to display it properly on screen.

To send language-specific characters, send a fully featured text message and set the languageCode parameter. The supported languages are:

Language code    Language
TR               Turkish
ES               Spanish
PT               Portuguese

In this example, a message containing Turkish characters is sent.

POST /sms/1/text/advanced HTTP/1.1
Host: api.infobip.com
Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
Content-Type: application/json

{
   "messages":[
      {
         "from":"InfoSMS",
         "destinations":[
            {
               "to":"41793026727"
            }
         ],
         "text":"Artık Ulusal Dil Tanımlayıcısı ile Türkçe karakterli smslerinizi rahatlıkla iletebilirsiniz.",
         "language":{
            "languageCode":"TR"
         }
      }
   ]
}
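The request body above can also be assembled programmatically. The following is a minimal sketch that only builds and serializes the JSON; the helper name build_nli_payload and the validation step are assumptions for illustration, and the field names are copied from the example request.

```python
import json

# Language codes listed in the table of supported languages.
SUPPORTED_LANGUAGE_CODES = {"TR", "ES", "PT"}


def build_nli_payload(sender, to, text, language_code):
    """Build the body for an NLI message (hypothetical helper)."""
    if language_code not in SUPPORTED_LANGUAGE_CODES:
        raise ValueError(f"Unsupported language code: {language_code!r}")
    return {
        "messages": [
            {
                "from": sender,
                "destinations": [{"to": to}],
                "text": text,
                "language": {"languageCode": language_code},
            }
        ]
    }


payload = build_nli_payload(
    "InfoSMS",
    "41793026727",
    "Artık Ulusal Dil Tanımlayıcısı ile Türkçe karakterli smslerinizi "
    "rahatlıkla iletebilirsiniz.",
    "TR",
)
# ensure_ascii=False keeps the Turkish characters readable; the body is sent as UTF-8.
body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
print(body.decode("utf-8"))
```

The resulting bytes would be posted to /sms/1/text/advanced with the Authorization and Content-Type headers shown in the request.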

Here is a list of supported characters for each of the supported languages:

Supported languages: Turkish, Portuguese, Spanish

PREVIEW MESSAGES BEFORE SENDING!

Nonstandard characters may cause messages to be encoded in Unicode, which can considerably reduce the number of available characters per message. We recommend using the SMS preview method to explore all options before sending.

IMPORTANT

Some networks may not support the Language feature, so we cannot guarantee that this functionality will work for all destinations. For example, if a message with the Turkish language is sent over a Chinese provider, it might not display properly on the recipient’s device.

SMS transliteration

Sending messages with special characters can be quite expensive in terms of maximum characters per message. If even one unsupported character is included, message capacity drops from 160 to only 70 characters! This can cause a message to be split into multiple parts, increasing the price significantly.

Transliteration is a method of replacing special (unsupported) characters with similar or related characters that are part of the default alphabet. This ensures that up to 160 characters can still be used in a message, instead of only 70 under the Unicode encoding. The downside of this approach is that the delivered message will look slightly different.

With this method, you can write messages in your preferred alphabet and they will be automatically converted into the appropriately transliterated script. This way you can use the full capacity of the message text without sending any Unicode characters.

Supported alphabets:

  • TURKISH
  • GREEK
  • CYRILLIC
  • SERBIAN_CYRILLIC
  • PORTUGUESE
  • SPANISH
  • BALTIC
  • NON_UNICODE

The chosen alphabet affects the result: some unsupported characters are converted differently, depending on which replacement is the most appropriate for the selected language.

WARNING

With any of the available alphabets, transliteration is applied to the characters recognized by the selected language, and other characters are left untouched. Any character that is neither recognized by the selected language nor part of the default alphabet will be replaced by a dot (.).
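The replacement rule above can be illustrated with a small sketch. The mapping below is a hypothetical, partial Turkish table chosen for the example, not the actual conversion table used by the service, and the function name transliterate is likewise an assumption. The set of default-alphabet characters follows the standard GSM 7-bit alphabet.

```python
# Characters of the GSM 7-bit default alphabet (basic set plus extension).
GSM_CHARS = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
    "^{}\\[~]|€\f"
)

# Hypothetical, partial mapping for illustration only.
TURKISH_MAP = {
    "ı": "i", "İ": "I",
    "ş": "s", "Ş": "S",
    "ğ": "g", "Ğ": "G",
    "ç": "c",
}


def transliterate(text, mapping):
    """Apply the three-way rule: map, keep, or replace with a dot."""
    out = []
    for ch in text:
        if ch in mapping:        # recognized by the selected language
            out.append(mapping[ch])
        elif ch in GSM_CHARS:    # already in the default alphabet: untouched
            out.append(ch)
        else:                    # neither: replaced by a dot
            out.append(".")
    return "".join(out)


print(transliterate("Artık Türkçe ©", TURKISH_MAP))
```

In this sketch the letters ö and ü pass through unchanged because they already belong to the default alphabet, while the copyright sign becomes a dot.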

NOTE

With "NON_UNICODE" transliteration, the message text is converted from Unicode to the GSM character set using the conversions of all available alphabets, and unmatched characters are replaced with dots (e.g. "©™ø- ˆ¨л- ˙˚λ- ∆ƒ∂" will become "..ø- ..l- ..A- ...").

The example below shows how to send a transliterated message. Just put one of the supported alphabets in the transliteration parameter.

POST /sms/1/text/advanced HTTP/1.1
Host: api.infobip.com
Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
Content-Type: application/json

{
    "messages":[
        {
            "from":"InfoSMS",
            "destinations":[
                {
                    "to":"41793026727"
                }
            ],
            "text":"Ως Μεγαρικό ψήφισμα είναι γνωστή η απόφαση της Εκκλησίας του δήμου των Αθηναίων (πιθανόν γύρω στο 433/2 π.Χ.) να επιβάλει αυστηρό και καθολικό εμπάργκο στα",
            "transliteration":"GREEK"
        }
    ]
}

Text sent:

Ως Μεγαρικό ψήφισμα είναι γνωστή η απόφαση της Εκκλησίας του δήμου των Αθηναίων (πιθανόν γύρω στο 433/2 π.Χ.) να επιβάλει αυστηρό και καθολικό εμπάργκο στα

Text received by the recipient of the message:

ΩΣ MEΓAPIKO ΨHΦIΣMA EINAI ΓNΩΣTH H AΠOΦAΣH THΣ EKKΛHΣIAΣ TOY ΔHMOY TΩN AΘHNAIΩN (ΠIΘANON ΓYPΩ ΣTO 433/2 Π.X.) NA EΠIBAΛEI AYΣTHPO KAI KAΘOΛIKO EMΠAPΓKO ΣTA

With transliteration, the Greek lowercase letters, which are not supported in the default alphabet, were converted to uppercase letters, which are supported, as you can see in the table below.
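A conversion of this kind can be approximated locally. The following is a minimal sketch that is consistent with the example output above but is not the service's actual implementation: it strips accents, uppercases the text, and swaps Greek capitals that look like Latin letters for their Latin counterparts, since the default alphabet only contains the Greek capitals that have no Latin lookalike. The function name transliterate_greek is hypothetical.

```python
import unicodedata

# Greek capitals with a Latin lookalike: Α Β Ε Ζ Η Ι Κ Μ Ν Ο Ρ Τ Υ Χ.
# Written as escapes so they cannot be confused with the Latin letters.
GREEK_CAPS = (
    "\u0391\u0392\u0395\u0396\u0397\u0399\u039a"
    "\u039c\u039d\u039f\u03a1\u03a4\u03a5\u03a7"
)
LATIN_LOOKALIKES = "ABEZHIKMNOPTYX"
GREEK_TO_LATIN = str.maketrans(GREEK_CAPS, LATIN_LOOKALIKES)


def transliterate_greek(text):
    """Approximate Greek transliteration into the default alphabet."""
    # Decompose accented letters and drop the combining accent marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Uppercase (final sigma becomes Σ), then replace lookalike capitals.
    return stripped.upper().translate(GREEK_TO_LATIN)


print(transliterate_greek("Ως Μεγαρικό ψήφισμα"))
```

The remaining Greek capitals (Γ, Δ, Θ, Λ, Ξ, Π, Σ, Φ, Ψ, Ω) are left as they are because they are part of the default alphabet.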

 

 
PREVIEW MESSAGES BEFORE SENDING!

Transliteration may produce unexpected message text. We recommend using the SMS preview method to explore all options before sending.