Title: An Introduction to Unicode in C++
Proposer: James McNellis
Type: Tutorial
Duration: 90 mins
Description:
At first glance, text encoding and text processing appear to be rather dull, mundane subjects. But in modern software, where we often need to handle text from different languages (and even different alphabets and character sets), text processing is anything but mundane. Some programming languages have nice, simple facilities that make most text processing operations fairly straightforward. Unfortunately, C++ is not one of those languages. C++ lacks good, built-in support for Unicode, though the situation is starting to improve.
This session will cover:
- A brief introduction to text encoding and the problems that prompted the creation of Unicode
- The initial design of Unicode (Unicode 1.0) and the UCS-2 encoding
- “Modern” Unicode encodings (UTF-8, UTF-16, UTF-32)
- Dynamic composition, normalization, string equality and ordering, string length, and basic text operations
- Unicode support in the C++ language and the C++ Standard Library
- The open-source International Components for Unicode (ICU) library and Boost.Locale
- Possible future directions for better built-in support for Unicode in C++
Although the last part of this session is specific to C++ and most of the examples use C++, the bulk of the content is programming-language neutral.
JJ: James add… (A note to reviewers: I have previously spoken on this subject at C++Now 2014, where my talk won the Best Tutorial Session award, and at CppCon 2014, where my talk was similarly well received.)