
core.internal.utf

Encode and decode UTF-8, UTF-16, and UTF-32 strings.

On Win32 systems, the C wchar_t type is UTF-16 and corresponds to the D wchar type. On Posix systems, the C wchar_t type is UTF-32 and corresponds to the D dchar type.
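The platform mapping can be checked at compile time. The sketch below assumes the `wchar_t` alias exported by druntime's `core.stdc.stddef`, which this page does not mention; `wcharTBits` is a hypothetical name introduced only for illustration.

```d
import core.stdc.stddef : wchar_t;

// Hypothetical helper (not part of core.internal.utf): width in bits of
// C's wchar_t on the platform being compiled for.
enum wcharTBits = wchar_t.sizeof * 8;

version (Windows)
    static assert(wcharTBits == 16); // same width as D's wchar (UTF-16)
else version (Posix)
    static assert(wcharTBits == 32); // same width as D's dchar (UTF-32)

void main() {}
```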

UTF character support is restricted to (\u0000 <= character <= \U0010FFFF).

License:

Boost License 1.0.

Authors:

Walter Bright, Sean Kelly

Source: core/internal/utf.d

pure nothrow @nogc @safe bool isValidDchar(dchar c);: Tests whether c is a valid UTF-32 character.
\uFFFE and \uFFFF are considered valid by this function: they are permitted for internal use by an application, although the Unicode standard does not allow them for interchange.

Returns:
true if c is valid, false otherwise.
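A minimal sketch of how isValidDchar might be exercised. It assumes that `core.internal.utf` (an internal druntime module) is importable from your compiler installation, and that surrogate code points are rejected, which matches my reading of the druntime source but is not stated on this page.

```d
import core.internal.utf : isValidDchar;

void main()
{
    assert(isValidDchar('A'));                   // plain ASCII
    assert(isValidDchar(cast(dchar) 0x10FFFF));  // upper bound of the supported range
    assert(isValidDchar(cast(dchar) 0xFFFE));    // accepted, as described above
    assert(isValidDchar(cast(dchar) 0xFFFF));    // accepted, as described above

    assert(!isValidDchar(cast(dchar) 0xD800));   // assumption: surrogates are rejected
    assert(!isValidDchar(cast(dchar) 0x110000)); // beyond \U0010FFFF
}
```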
pure nothrow @nogc @safe uint stride(scope const char[] s, size_t i);: stride() returns the length of the UTF-8 sequence starting at index i in string s.
Returns:
The number of bytes in the UTF-8 sequence, or 0xFF if s[i] is not the start of a UTF-8 sequence.
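The following sketch illustrates the UTF-8 overload on a string whose code points occupy 1, 2, 3, and 4 bytes. It assumes `core.internal.utf` is importable; the sample string is chosen only for illustration.

```d
import core.internal.utf : stride;

void main()
{
    // U+0061, U+00E9, U+20AC, U+1F600: 1 + 2 + 3 + 4 bytes in UTF-8.
    string s = "a\u00E9\u20AC\U0001F600";
    assert(s.length == 10);

    assert(stride(s, 0) == 1); // 'a'
    assert(stride(s, 1) == 2); // U+00E9 starts at byte 1
    assert(stride(s, 3) == 3); // U+20AC starts at byte 3
    assert(stride(s, 6) == 4); // U+1F600 starts at byte 6

    // Byte 2 is a continuation byte, so it cannot start a sequence.
    assert(stride(s, 2) == 0xFF);
}
```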
pure nothrow @nogc @safe uint stride(scope const wchar[] s, size_t i);: stride() returns the length of the UTF-16 sequence starting at index i in string s.
pure nothrow @nogc @safe uint stride(scope const dchar[] s, size_t i);: stride() returns the length of the UTF-32 sequence starting at index i in string s.
Returns:
The return value is always 1.
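For comparison, a short sketch of the UTF-16 and UTF-32 overloads of stride, again assuming `core.internal.utf` is importable. The code point U+1F600 is used because it needs a surrogate pair in UTF-16.

```d
import core.internal.utf : stride;

void main()
{
    // UTF-16: 'a' is one code unit, U+1F600 is a surrogate pair.
    wstring w = "a\U0001F600"w;
    assert(w.length == 3);
    assert(stride(w, 0) == 1);
    assert(stride(w, 1) == 2); // a high surrogate starts a two-unit sequence

    // UTF-32: every code point is exactly one code unit.
    dstring d = "a\U0001F600"d;
    assert(d.length == 2);
    assert(stride(d, 0) == 1);
    assert(stride(d, 1) == 1);
}
```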
pure @safe size_t toUCSindex(scope const char[] s, size_t i); pure @safe size_t toUCSindex(scope const wchar[] s, size_t i); pure nothrow @nogc @safe size_t toUCSindex(scope const dchar[] s, size_t i);: Given an index i into an array of characters s[], and assuming that index i is at the start of a UTF character, determines the number of UCS characters up to that index i.
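A small sketch of the code-unit-to-code-point conversion, assuming `core.internal.utf` is importable. Every index passed in lies on a sequence boundary, as toUCSindex expects.

```d
import core.internal.utf : toUCSindex;

void main()
{
    // UTF-8 layout: 'a' at byte 0, U+00E9 at bytes 1-2, U+20AC at bytes 3-5.
    string s = "a\u00E9\u20AC";
    assert(s.length == 6);

    assert(toUCSindex(s, 0) == 0);
    assert(toUCSindex(s, 1) == 1);
    assert(toUCSindex(s, 3) == 2);
    assert(toUCSindex(s, s.length) == 3); // total number of code points

    // UTF-16 layout: 'a' at 0, surrogate pair at 1-2, 'b' at 3.
    wstring w = "a\U0001F600b"w;
    assert(toUCSindex(w, 3) == 2);
}
```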
pure @safe size_t toUTFindex(scope const char[] s, size_t n); pure nothrow @nogc @safe size_t toUTFindex(scope const wchar[] s, size_t n); pure nothrow @nogc @safe size_t toUTFindex(scope const dchar[] s, size_t n);: Given a UCS index n into an array of characters s[], returns the UTF index.
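The reverse conversion can be sketched as follows, assuming `core.internal.utf` is importable. One plausible use is slicing a string at a code point boundary, shown in the last UTF-8 assertion.

```d
import core.internal.utf : toUTFindex;

void main()
{
    // UTF-8 layout: 'a' at byte 0, U+00E9 at bytes 1-2, U+20AC at bytes 3-5.
    string s = "a\u00E9\u20AC";

    assert(toUTFindex(s, 0) == 0);
    assert(toUTFindex(s, 1) == 1);
    assert(toUTFindex(s, 2) == 3);

    // Slice starting at the third code point (UCS index 2).
    assert(s[toUTFindex(s, 2) .. $] == "\u20AC");

    // UTF-16 layout: 'a' at 0, surrogate pair at 1-2, 'b' at 3.
    wstring w = "a\U0001F600b"w;
    assert(toUTFindex(w, 2) == 3);
}
```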
pure @safe dchar decode(scope const char[] s, ref size_t idx); pure @safe dchar decode(scope const wchar[] s, ref size_t idx); pure @safe dchar decode(scope const dchar[] s, ref size_t idx);: Decodes and returns the character starting at s[idx]. idx is advanced past the decoded character. If the character is not well formed, a UtfException is thrown and idx remains unchanged.
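A minimal sketch of decoding, assuming `core.internal.utf` is importable. `decodeAll` is a hypothetical helper written for this example, and the error case catches the general `Exception` type rather than relying on the exact exception class name.

```d
import core.internal.utf : decode;

// Hypothetical helper (not part of core.internal.utf): decode every
// code point of a UTF-8 string into a dchar array.
dchar[] decodeAll(const char[] s)
{
    dchar[] result;
    for (size_t i = 0; i < s.length; )
        result ~= decode(s, i); // decode advances i past the sequence
    return result;
}

void main()
{
    string s = "a\u00E9\u20AC";

    size_t idx = 1;
    dchar c = decode(s, idx);
    assert(c == '\u00E9');
    assert(idx == 3); // advanced past the two-byte sequence

    assert(decodeAll(s) == "a\u00E9\u20AC"d);

    // A lone continuation byte is malformed input.
    const char[] bad = [cast(char) 0xA9];
    size_t j = 0;
    bool threw = false;
    try
        decode(bad, j);
    catch (Exception e)
        threw = true;
    assert(threw);
    assert(j == 0); // the index is left unchanged on failure
}
```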
pure nothrow @safe void encode(ref char[] s, dchar c); pure nothrow @safe void encode(ref wchar[] s, dchar c); pure nothrow @safe void encode(ref dchar[] s, dchar c);: Encodes character c and appends it to array s[].
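The sketch below appends code points to buffers of each element type, assuming `core.internal.utf` is importable. The buffer names are illustrative only.

```d
import core.internal.utf : encode;

void main()
{
    char[] utf8;
    encode(utf8, 'a');
    encode(utf8, '\u20AC');       // three code units in UTF-8
    assert(utf8 == "a\u20AC");
    assert(utf8.length == 4);

    wchar[] utf16;
    encode(utf16, '\U0001F600');  // surrogate pair in UTF-16
    assert(utf16 == "\U0001F600"w);
    assert(utf16.length == 2);

    dchar[] utf32;
    encode(utf32, '\U0001F600');  // always one code unit in UTF-32
    assert(utf32.length == 1);
}
```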
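pure nothrow @nogc @safe ubyte codeLength(C)(dchar c);: Returns the code length of c in the encoding that uses C as its code unit type. The length is returned as a count of characters of type C, not as a count of bytes.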
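A short sketch of codeLength for each code unit type, assuming `core.internal.utf` is importable. The last assertion shows that the result counts code units rather than bytes.

```d
import core.internal.utf : codeLength;

void main()
{
    // UTF-8: 1 to 4 code units per code point.
    assert(codeLength!char('a') == 1);
    assert(codeLength!char('\u00E9') == 2);
    assert(codeLength!char('\u20AC') == 3);
    assert(codeLength!char('\U0001F600') == 4);

    // UTF-16: 1 code unit, or 2 for code points above U+FFFF.
    assert(codeLength!wchar('\u20AC') == 1);
    assert(codeLength!wchar('\U0001F600') == 2);

    // UTF-32: one code unit, although that unit occupies four bytes.
    assert(codeLength!dchar('\U0001F600') == 1);
}
```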
pure nothrow @safe bool isValidString(S)(scope const S s);: Checks whether the string s is well formed. S can be an array of char, wchar, or dchar. Returns false if it is not well formed. Use it to check all untrusted input for correctness.
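A minimal sketch of validating input before further processing, assuming `core.internal.utf` is importable. The truncated byte sequence is constructed by hand to stand in for untrusted data.

```d
import core.internal.utf : isValidString;

void main()
{
    string good = "a\u20AC";
    assert(isValidString(good));

    wstring goodW = "a\U0001F600"w;
    assert(isValidString(goodW));

    // U+20AC is E2 82 AC in UTF-8; dropping the last byte leaves a
    // truncated, malformed sequence.
    const char[] truncated = [cast(char) 0xE2, cast(char) 0x82];
    assert(!isValidString(truncated));
}
```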
pure nothrow @safe string toUTF8(return scope string s); pure @trusted string toUTF8(scope const wchar[] s); pure @trusted string toUTF8(scope const dchar[] s);: Encodes string s into UTF-8 and returns the encoded string.
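The conversion to UTF-8 can be sketched as follows, assuming `core.internal.utf` is importable. Typed variables are used so that the intended overload is selected explicitly.

```d
import core.internal.utf : toUTF8;

void main()
{
    string expected = "a\u20AC\U0001F600";
    wstring w = "a\u20AC\U0001F600"w;
    dstring d = "a\u20AC\U0001F600"d;

    assert(toUTF8(w) == expected);        // UTF-16 to UTF-8
    assert(toUTF8(d) == expected);        // UTF-32 to UTF-8
    assert(toUTF8(expected) == expected); // input that is already UTF-8
}
```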
pure @trusted wstring toUTF16(scope const char[] s); pure @safe wptr toUTF16z(scope const char[] s); pure nothrow @safe wstring toUTF16(return scope wstring s); pure nothrow @trusted wstring toUTF16(scope const dchar[] s);: Encodes string s into UTF-16 and returns the encoded string. toUTF16z() is suitable for calling the "W" functions in the Win32 API.
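A sketch of the UTF-16 conversions, assuming `core.internal.utf` is importable and that `wptr` is a pointer to constant wchar with a terminating zero appended, which matches my reading of the druntime source. No actual Win32 call is made, so the example stays portable.

```d
import core.internal.utf : toUTF16, toUTF16z;

void main()
{
    string s = "a\U0001F600";
    dstring d = "a\U0001F600"d;

    wstring w = toUTF16(s);
    assert(w == "a\U0001F600"w);
    assert(w.length == 3);     // 'a' plus a surrogate pair
    assert(toUTF16(d) == w);   // UTF-32 to UTF-16

    // Zero-terminated form, as a "W" function would expect.
    const(wchar)* p = toUTF16z(s);
    assert(p[0] == 'a');
    assert(p[3] == 0);         // assumption: terminator follows the 3 code units
}
```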
pure @trusted dstring toUTF32(scope const char[] s); pure @trusted dstring toUTF32(scope const wchar[] s); pure nothrow @safe dstring toUTF32(return scope dstring s);: Encodes string s into UTF-32 and returns the encoded string.
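A matching sketch for UTF-32, assuming `core.internal.utf` is importable. After conversion the array length equals the number of code points.

```d
import core.internal.utf : toUTF32;

void main()
{
    string s = "a\u20AC\U0001F600";
    wstring w = "a\u20AC\U0001F600"w;
    dstring expected = "a\u20AC\U0001F600"d;

    assert(toUTF32(s) == expected); // UTF-8 to UTF-32
    assert(toUTF32(w) == expected); // UTF-16 to UTF-32
    assert(toUTF32(s).length == 3); // one code unit per code point
}
```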

Translated with the DeepL API, with corrections in places.
dmd version at translation time: 2.108.0
dmd version of the documentation: 2.109.1
Translation date: 2024-04-13 03:10:31+09:00
HTML generated: 2025-01-09 08:11:17+09:00
Editor: dokutoku
