std::codecvt

Defined in header `<locale>`
`template< class InternT, class ExternT, class State > class codecvt;`

Class std::codecvt encapsulates the conversion of character strings, including wide and multibyte strings, from one encoding to another. All file I/O operations performed through std::basic_fstream<CharT> use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.

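Inheritance diagram (image not reproduced): std::codecvt derives from std::locale::facet and std::codecvt_base.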

The standard library provides the following standalone (locale-independent) specializations:

Defined in header `<locale>`
`std::codecvt<char, char, std::mbstate_t>`	identity conversion
`std::codecvt<char16_t, char, std::mbstate_t>`	conversion between UTF-16 and UTF-8 (since C++11)(deprecated in C++20)
`std::codecvt<char16_t, char8_t, std::mbstate_t>`	conversion between UTF-16 and UTF-8 (since C++20)
`std::codecvt<char32_t, char, std::mbstate_t>`	conversion between UTF-32 and UTF-8 (since C++11)(deprecated in C++20)
`std::codecvt<char32_t, char8_t, std::mbstate_t>`	conversion between UTF-32 and UTF-8 (since C++20)
`std::codecvt<wchar_t, char, std::mbstate_t>`	conversion between the system's native wide character set and the single-byte narrow character set

In addition, every locale object constructed in a C++ program implements its own (locale-specific) versions of these specializations.
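As a minimal sketch of the statement above, the helpers below check that a locale carries the `char` and `wchar_t` facets and that the narrow one is an identity conversion. The alias and function names (`NarrowCodecvt`, `WideCodecvt`, `has_standard_codecvt_facets`, `narrow_facet_is_identity`) are illustrative assumptions, not library names.

```cpp
#include <cwchar>
#include <locale>

// Facet type consulted by narrow file streams (std::basic_fstream<char>).
using NarrowCodecvt = std::codecvt<char, char, std::mbstate_t>;
// Facet type consulted by wide file streams (std::basic_fstream<wchar_t>).
using WideCodecvt = std::codecvt<wchar_t, char, std::mbstate_t>;

// Hypothetical helper: true if the locale provides both facets.
bool has_standard_codecvt_facets(const std::locale& loc)
{
    return std::has_facet<NarrowCodecvt>(loc) && std::has_facet<WideCodecvt>(loc);
}

// Hypothetical helper: true if the char-to-char facet never converts.
bool narrow_facet_is_identity(const std::locale& loc)
{
    return std::use_facet<NarrowCodecvt>(loc).always_noconv();
}
```

Passing `std::locale::classic()` or a default-constructed `std::locale` to either helper should return `true`; a stream's own locale can be obtained with `getloc()`.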

Member types

Member type	Definition
`intern_type`	`InternT`
`extern_type`	`ExternT`
`state_type`	`State`

Member functions

(constructor)	constructs a new codecvt facet (public member function)
(destructor)	destroys a codecvt facet (protected member function)
out	invokes `do_out` (public member function)
in	invokes `do_in` (public member function)
unshift	invokes `do_unshift` (public member function)
encoding	invokes `do_encoding` (public member function)
always_noconv	invokes `do_always_noconv` (public member function)
length	invokes `do_length` (public member function)
max_length	invokes `do_max_length` (public member function)
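The public members can also be called directly, without going through a stream. The sketch below decodes UTF-8 into UTF-32 with `in`, taking the facet from the classic locale. It assumes the `std::codecvt<char32_t, char, std::mbstate_t>` specialization, which is deprecated since C++20 and may produce compiler warnings. The function name `utf8_to_utf32` is hypothetical.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes a complete UTF-8 string into UTF-32 code points.
std::u32string utf8_to_utf32(const std::string& bytes)
{
    using Facet = std::codecvt<char32_t, char, std::mbstate_t>; // deprecated in C++20
    const Facet& cvt = std::use_facet<Facet>(std::locale::classic());

    std::mbstate_t state{};
    // A UTF-8 string never holds more code points than bytes.
    std::u32string result(bytes.size(), U'\0');
    const char* from_next = nullptr;
    char32_t* to_next = nullptr;

    const std::codecvt_base::result status = cvt.in(
        state,
        bytes.data(), bytes.data() + bytes.size(), from_next,
        &result[0], &result[0] + result.size(), to_next);

    if (status != std::codecvt_base::ok)
        throw std::runtime_error("invalid or truncated UTF-8 input");

    // Keep only the code points that were actually written.
    result.resize(static_cast<std::size_t>(to_next - &result[0]));
    return result;
}
```

For instance, `utf8_to_utf32("z\xc3\x9f")` should yield the two code points U+007A and U+00DF. The reverse direction follows the same pattern with `out`, where `max_length()` bounds the number of bytes needed per code point.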

Member objects

Member name	Type
`id` (static)	std::locale::id

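Protected member functions

do_out [virtual]	converts a string from InternT to ExternT, such as when writing to a file (virtual protected member function)
do_in [virtual]	converts a string from ExternT to InternT, such as when reading from a file (virtual protected member function)
do_unshift [virtual]	generates the termination character sequence of ExternT characters for an incomplete conversion (virtual protected member function)
do_encoding [virtual]	returns the number of ExternT characters necessary to produce one InternT character, if constant (virtual protected member function)
do_always_noconv [virtual]	tests whether the facet encodes an identity conversion for all valid argument values (virtual protected member function)
do_length [virtual]	calculates the length of the ExternT string that would be consumed by conversion into a given InternT buffer (virtual protected member function)
do_max_length [virtual]	returns the maximum number of ExternT characters that could be converted into a single InternT character (virtual protected member function)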
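A custom conversion is written by deriving from a specialization and overriding these virtual functions. The following sketch is a hypothetical facet, `upper_on_write`, that upper-cases ASCII letters on output. Only `do_out` and `do_always_noconv` are overridden to keep it short; a facet meant for real file streams would likely need consistent `do_in`, `do_length` and related overrides as well. The helper `apply_out` is also an assumption made for illustration.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical facet for illustration: upper-cases ASCII letters when writing.
class upper_on_write : public std::codecvt<char, char, std::mbstate_t>
{
    using base = std::codecvt<char, char, std::mbstate_t>;

public:
    explicit upper_on_write(std::size_t refs = 0) : base(refs) {}

protected:
    // Report that out() really transforms its input.
    bool do_always_noconv() const noexcept override { return false; }

    result do_out(std::mbstate_t&,
                  const char* from, const char* from_end, const char*& from_next,
                  char* to, char* to_end, char*& to_next) const override
    {
        for (; from != from_end && to != to_end; ++from, ++to)
        {
            const char c = *from;
            *to = (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
        }
        from_next = from;
        to_next = to;
        // partial means the output buffer filled up before the input ended.
        return from == from_end ? ok : partial;
    }
};

// Hypothetical helper: runs `text` through the facet's public out() member.
std::string apply_out(const std::codecvt<char, char, std::mbstate_t>& cvt,
                      const std::string& text)
{
    std::mbstate_t state{};
    std::string result(text.size(), '\0');
    const char* from_next = nullptr;
    char* to_next = nullptr;
    cvt.out(state, text.data(), text.data() + text.size(), from_next,
            &result[0], &result[0] + result.size(), to_next);
    result.resize(static_cast<std::size_t>(to_next - &result[0]));
    return result;
}
```

Calling `apply_out(upper_on_write(), "abc 123")` should produce `"ABC 123"`. The public `out` member is the entry point; it forwards to the overridden `do_out`, which is the pattern every member in the table follows.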

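Inherited from std::codecvt_base

Member type	Definition
`enum result { ok, partial, error, noconv };`	unscoped enumeration type

Enumeration constant	Definition
`ok`	conversion was completed with no error
`partial`	not all source characters were converted
`error`	encountered an invalid character
`noconv`	no conversion required, input and output types are the same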
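To make the four status codes concrete, the sketch below returns only the `result` of a call to `in`. The helper names `decode_status` and `identity_status` are hypothetical, and the UTF-8 helper relies on the `std::codecvt<char32_t, char, std::mbstate_t>` specialization (deprecated since C++20). The exact point at which an implementation reports `partial` or `error` is assumed to match common standard libraries such as libstdc++ and libc++.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: decodes UTF-8 into a buffer holding `capacity`
// code points and reports only the status code returned by in().
std::codecvt_base::result decode_status(const std::string& bytes, std::size_t capacity)
{
    using Facet = std::codecvt<char32_t, char, std::mbstate_t>; // deprecated in C++20
    const Facet& cvt = std::use_facet<Facet>(std::locale::classic());

    std::mbstate_t state{};
    std::vector<char32_t> buffer(capacity);
    const char* from_next = nullptr;
    char32_t* to_next = nullptr;
    return cvt.in(state, bytes.data(), bytes.data() + bytes.size(), from_next,
                  buffer.data(), buffer.data() + buffer.size(), to_next);
}

// Hypothetical helper: the char-to-char facet performs no conversion at all.
std::codecvt_base::result identity_status(const std::string& text)
{
    using Facet = std::codecvt<char, char, std::mbstate_t>;
    const Facet& cvt = std::use_facet<Facet>(std::locale::classic());

    std::mbstate_t state{};
    std::string buffer(text.size(), '\0');
    const char* from_next = nullptr;
    char* to_next = nullptr;
    return cvt.in(state, text.data(), text.data() + text.size(), from_next,
                  &buffer[0], &buffer[0] + buffer.size(), to_next);
}
```

Under those assumptions, `decode_status("abc", 8)` reports `ok`, `decode_status("abc", 2)` reports `partial` because the output buffer is too small, `decode_status("\xff", 8)` reports `error` because 0xFF never appears in UTF-8, and `identity_status("abc")` reports `noconv`. Note that after `noconv` the output buffer is left untouched, so callers are expected to use the input as is.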

Example

The following example reads a UTF-8 file using a locale that implements UTF-8 conversion in codecvt<wchar_t, char, mbstate_t>, and converts a UTF-8 string to UTF-16 using one of the standard specializations of std::codecvt.

#include <iostream>
#include <fstream>
#include <string>
#include <locale>
#include <iomanip>
#include <codecvt>
#include <utility> // std::forward

// Utility wrapper to adapt locale-bound facets for wstring/wbuffer_convert.
template<class Facet>
struct deletable_facet : Facet
{
    template<class ...Args>
    deletable_facet(Args&& ...args) : Facet(std::forward<Args>(args)...) {}
    ~deletable_facet() {}
};

int main()
{
    // UTF-8 narrow multibyte encoding.
    std::string data = reinterpret_cast<const char*>(+u8"z\u00df\u6c34\U0001f34c");
                       // or reinterpret_cast<const char*>(+u8"zß水🍌")
                       // or "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9f\x8d\x8c"

    std::ofstream("text.txt") << data;

    // Using the system-supplied locale's codecvt facet.
    std::wifstream fin("text.txt");
    // Reading from a wifstream uses codecvt<wchar_t, char, mbstate_t>.
    // This locale's codecvt converts UTF-8 to UCS4 (on systems such as Linux).
    fin.imbue(std::locale("en_US.UTF-8"));
    std::cout << "The UTF-8 file contains the following UCS4 code points: \n";
    // Cast to an integer: streaming wchar_t to a narrow stream is deleted in C++20.
    for (wchar_t c; fin >> c; )
        std::cout << "U+" << std::hex << std::setw(4) << std::setfill('0')
                  << static_cast<unsigned long>(c) << '\n';

    // Using a standard (locale-independent) codecvt facet.
    std::wstring_convert<
        deletable_facet<std::codecvt<char16_t, char, std::mbstate_t>>, char16_t> conv16;
    std::u16string str16 = conv16.from_bytes(data);

    std::cout << "The UTF-8 file contains the following UTF-16 code points: \n";
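    // Cast to an integer: streaming char16_t to a narrow stream is deleted in C++20.
    for (char16_t c : str16)
        std::cout << "U+" << std::hex << std::setw(4) << std::setfill('0')
                  << static_cast<unsigned>(c) << '\n';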
}

Output:

The UTF-8 file contains the following UCS4 code points:
U+007a
U+00df
U+6c34
U+1f34c
The UTF-8 file contains the following UTF-16 code points:
U+007a
U+00df
U+6c34
U+d83c
U+df4c

See also

Character conversions	locale-defined multibyte (UTF-8, GB18030)	UTF-8	UTF-16
UTF-16	`mbrtoc16` / `c16rtomb` (with C11's DR488)	`codecvt<char16_t, char, mbstate_t>`, `codecvt_utf8_utf16<char16_t>`, `codecvt_utf8_utf16<char32_t>`, `codecvt_utf8_utf16<wchar_t>`	N/A
UCS2	`c16rtomb` (without C11's DR488)	`codecvt_utf8<char16_t>`, `codecvt_utf8<wchar_t>` (Windows)	`codecvt_utf16<char16_t>`, `codecvt_utf16<wchar_t>` (Windows)
UTF-32	`mbrtoc32` / `c32rtomb`	`codecvt<char32_t, char, mbstate_t>`, `codecvt_utf8<char32_t>`, `codecvt_utf8<wchar_t>` (non-Windows)	`codecvt_utf16<char32_t>`, `codecvt_utf16<wchar_t>` (non-Windows)
System wide: UTF-32 (non-Windows), UCS2 (Windows)	`mbsrtowcs` / `wcsrtombs`, `use_facet<codecvt<wchar_t, char, mbstate_t>>(locale)`	No	No

codecvt_base	defines character conversion errors (class)
codecvt_byname	creates a codecvt facet for the named locale (class template)
codecvt_utf8 (C++11)(deprecated in C++17)	converts between UTF-8 and UCS2/UCS4 (class template)
codecvt_utf16 (C++11)(deprecated in C++17)	converts between UTF-16 and UCS2/UCS4 (class template)
codecvt_utf8_utf16 (C++11)(deprecated in C++17)	converts between UTF-8 and UTF-16 (class template)
