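MongoDB $strLenBytes

In MongoDB, the $strLenBytes aggregation pipeline operator returns the number of UTF-8 encoded bytes in the specified string.

Each character in a string can occupy a different number of bytes, depending on the character. The $strLenBytes operator accounts for the number of bytes in each character and returns the correct total for the whole string.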
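
To make the idea of "UTF-8 encoded bytes" concrete outside the database, here is a minimal JavaScript sketch. The helper name utf8ByteLength is hypothetical and not part of MongoDB; it assumes a runtime that provides the standard TextEncoder global (for example, a recent Node.js). It illustrates the same counting rule, not how the server implements it.

```javascript
// Hypothetical helper (not a MongoDB API): counts the bytes a string
// occupies once it is encoded as UTF-8.
function utf8ByteLength(str) {
  // TextEncoder always encodes to UTF-8 and returns a Uint8Array.
  return new TextEncoder().encode(str).length;
}

console.log(utf8ByteLength("Maimuang")); // 8 (eight ASCII letters, 1 byte each)
console.log(utf8ByteLength("\u00e9"));   // 2 (é, U+00E9)
console.log(utf8ByteLength("\u0e44"));   // 3 (Thai ไ, U+0E44)
```

Note that JavaScript's own str.length counts UTF-16 code units, not UTF-8 bytes, so it is not a substitute for this calculation.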

Example

Suppose we have a collection called english with the following documents:

{ "_id" : 1, "data" : "Maimuang" }
{ "_id" : 2, "data" : "M" }
{ "_id" : 3, "data" : "a" }
{ "_id" : 4, "data" : "i" }
{ "_id" : 5, "data" : "m" }
{ "_id" : 6, "data" : "u" }
{ "_id" : 7, "data" : "a" }
{ "_id" : 8, "data" : "n" }
{ "_id" : 9, "data" : "g" }

We can apply $strLenBytes to the data field of those documents:

db.english.aggregate(
   [
     {
       $project:
          {
            _id: 0,
            data: 1,
            result: { $strLenBytes: "$data" }
          }
     }
   ]
)

Result:

{ "data" : "Maimuang", "result" : 8 }
{ "data" : "M", "result" : 1 }
{ "data" : "a", "result" : 1 }
{ "data" : "i", "result" : 1 }
{ "data" : "m", "result" : 1 }
{ "data" : "u", "result" : 1 }
{ "data" : "a", "result" : 1 }
{ "data" : "n", "result" : 1 }
{ "data" : "g", "result" : 1 }

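We can see that the whole word is 8 bytes and that each character is 1 byte.

Thai Characters

Here is an example that uses Thai characters, which are 3 bytes each.

Suppose we have a collection called thai with the following documents: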

{ "_id" : 1, "data" : "ไม้เมือง" }
{ "_id" : 2, "data" : "ไ" }
{ "_id" : 3, "data" : "ม้" }
{ "_id" : 4, "data" : "เ" }
{ "_id" : 5, "data" : "มื" }
{ "_id" : 6, "data" : "อ" }
{ "_id" : 7, "data" : "ง" }

Here is what happens when we apply $strLenBytes to those documents:

db.thai.aggregate(
   [
     {
       $project:
          {
            _id: 0,
            data: 1,
            result: { $strLenBytes: "$data" }
          }
     }
   ]
)

Result:

{ "data" : "ไม้เมือง", "result" : 24 }
{ "data" : "ไ", "result" : 3 }
{ "data" : "ม้", "result" : 6 }
{ "data" : "เ", "result" : 3 }
{ "data" : "มื", "result" : 6 }
{ "data" : "อ", "result" : 3 }
{ "data" : "ง", "result" : 3 }

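Two of these values consist of a base character modified by a diacritic. Each is therefore made up of two 3-byte code points, so 6 bytes are returned.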
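
The 6-byte results can be checked by splitting one of the modified values into its code points. The following JavaScript sketch is only an illustration, assuming a runtime with the standard TextEncoder global; the variable names are made up for this example.

```javascript
// "ม้" written with escapes: the consonant U+0E21 followed by the
// combining tone mark U+0E49.
const modified = "\u0e21\u0e49";

// Array.from iterates a string by code point, not by UTF-16 code unit.
const codePoints = Array.from(modified, (ch) =>
  "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
);

// UTF-8 byte count of each code point on its own.
const encoder = new TextEncoder();
const bytesPerCodePoint = Array.from(modified, (ch) => encoder.encode(ch).length);

console.log(codePoints);        // [ 'U+0E21', 'U+0E49' ]
console.log(bytesPerCodePoint); // [ 3, 3 ]
console.log(encoder.encode(modified).length); // 6
```

Although the value is displayed as a single glyph, it is stored as two code points, which is why the byte count doubles.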

Other Characters

Suppose we have a collection called other with the following documents:

{ "_id" : 1, "data" : "é" }
{ "_id" : 2, "data" : "©" }
{ "_id" : 3, "data" : "℘" }

Let's apply $strLenBytes to those documents:

db.other.aggregate(
   [
     { $match: { _id: { $in: [ 1, 2, 3 ] } } },
     {
       $project:
          {
            _id: 0,
            data: 1,
            result: { $strLenBytes: "$data" }
          }
     }
   ]
)

Result:

{ "data" : "é", "result" : 2 }
{ "data" : "©", "result" : 2 }
{ "data" : "℘", "result" : 3 }

The first two characters are 2 bytes each, and the third is 3 bytes. The number of bytes depends on the character, and some characters use 4 bytes.
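
The byte count of a character follows from its Unicode code point, using the standard UTF-8 ranges. The sketch below is a plain JavaScript illustration of that rule, not MongoDB's implementation. The function names are hypothetical, and the 4-byte sample (U+1F600, an emoji) is an assumption added here because the collection above contains no 4-byte character.

```javascript
// Hypothetical helper: UTF-8 byte count for a single Unicode code point,
// based on the standard UTF-8 encoding ranges.
function utf8BytesForCodePoint(codePoint) {
  if (codePoint < 0x80) return 1;    // U+0000..U+007F (ASCII)
  if (codePoint < 0x800) return 2;   // U+0080..U+07FF
  if (codePoint < 0x10000) return 3; // U+0800..U+FFFF
  return 4;                          // U+10000..U+10FFFF
}

// Sums the per-code-point byte counts over a whole string.
function utf8ByteLengthByRange(str) {
  let total = 0;
  for (const ch of str) { // for...of iterates by code point
    total += utf8BytesForCodePoint(ch.codePointAt(0));
  }
  return total;
}

console.log(utf8ByteLengthByRange("\u00e9"));    // 2 (é)
console.log(utf8ByteLengthByRange("\u00a9"));    // 2 (©)
console.log(utf8ByteLengthByRange("\u2118"));    // 3 (℘)
console.log(utf8ByteLengthByRange("\u{1F600}")); // 4 (assumed 4-byte sample)
```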

The space character uses one byte, so two space characters use 2 bytes.

Suppose we have the following documents:

{ "_id" : 4, "data" : " " }
{ "_id" : 5, "data" : "  " }

And we apply $strLenBytes to those documents:

db.other.aggregate(
   [
     { $match: { _id: { $in: [ 4, 5 ] } } },
     {
       $project:
          {
            _id: 0,
            data: 1,
            result: { $strLenBytes: "$data" }
          }
     }
   ]
)

Result:

{ "data" : " ", "result" : 1 }
{ "data" : "  ", "result" : 2 }

Empty Strings

An empty string returns 0.

Here is a document with an empty string:

{ "_id" : 6, "data" : "" }

Here is what happens when we apply $strLenBytes to that document:

db.other.aggregate(
   [
     { $match: { _id: { $in: [ 6 ] } } },
     {
       $project:
          {
            _id: 0,
            data: 1,
            result: { $strLenBytes: "$data" }
          }
     }
   ]
)

Result:

{ "data" : "", "result" : 0 }

Wrong Data Type

Passing the wrong data type results in an error.

Suppose we have the following document:

{ "_id" : 7, "data" : 123 }

The data field contains a number.

Let's apply $strLenBytes to that document:

db.other.aggregate(
   [
     { $match: { _id: { $in: [ 7 ] } } },
     {
       $project:
          {
            _id: 0,
            data: 1,
            result: { $strLenBytes: "$data" }
          }
     }
   ]
)

Result:

Error: command failed: {
	"ok" : 0,
	"errmsg" : "$strLenBytes requires a string argument, found: double",
	"code" : 34473,
	"codeName" : "Location34473"
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:639:17
assert.commandWorked@src/mongo/shell/assert.js:729:16
DB.prototype._runAggregate@src/mongo/shell/db.js:266:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1058:12
@(shell):1:1

Null Values

Providing null also results in an error.

Suppose we have the following document:

{ "_id" : 8, "data" : null }

The data field contains null.

Let's apply $strLenBytes to that document:

db.other.aggregate(
   [
     { $match: { _id: { $in: [ 8 ] } } },
     {
       $project:
          {
            _id: 0,
            data: 1,
            result: { $strLenBytes: "$data" }
          }
     }
   ]
)

Result:

uncaught exception: Error: command failed: {
	"ok" : 0,
	"errmsg" : "$strLenBytes requires a string argument, found: null",
	"code" : 34473,
	"codeName" : "Location34473"
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:639:17
assert.commandWorked@src/mongo/shell/assert.js:729:16
DB.prototype._runAggregate@src/mongo/shell/db.js:266:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1058:12
@(shell):1:1

Missing Fields

Continuing the theme of errors, specifying a field that does not exist also results in an error.

Document:

{ "_id" : 9 }

Apply $strLenBytes:

db.other.aggregate(
   [
     { $match: { _id: { $in: [ 9 ] } } },
     {
       $project:
          {
            _id: 0,
            data: 1,
            result: { $strLenBytes: "$data" }
          }
     }
   ]
)

Result:

Error: command failed: {
	"ok" : 0,
	"errmsg" : "$strLenBytes requires a string argument, found: missing",
	"code" : 34473,
	"codeName" : "Location34473"
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:639:17
assert.commandWorked@src/mongo/shell/assert.js:729:16
DB.prototype._runAggregate@src/mongo/shell/db.js:266:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1058:12
@(shell):1:1
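
All three errors above (a number, null, and a missing field) come from passing a non-string value to $strLenBytes. One possible way to avoid them is to check the type first. The following is a sketch only, not something shown in the examples above: it assumes the $type and $cond aggregation operators are available on your server, and the variable name guardedPipeline and the choice of null as the fallback value are illustrative.

```javascript
// Sketch: only call $strLenBytes when the field is actually a string;
// otherwise return null (an assumed fallback) instead of raising an error.
const guardedPipeline = [
  { $match: { _id: { $in: [7, 8, 9] } } },
  {
    $project: {
      _id: 1,
      result: {
        $cond: [
          // $type yields a type name such as "string", "double", "null" or "missing"
          { $eq: [{ $type: "$data" }, "string"] },
          { $strLenBytes: "$data" },
          null
        ]
      }
    }
  }
];

// In the mongo shell, this would be run as:
// db.other.aggregate(guardedPipeline)
```

Whether a null fallback is appropriate depends on how downstream stages treat missing lengths. Another option would be to filter out non-string documents with a $match stage before projecting.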