MongoDB $sample

In MongoDB, the $sample aggregation pipeline stage randomly selects the specified number of documents from its input.

Example

Suppose we have a collection called employees with the following documents:

{ "_id" : 1, "name" : "Bob", "salary" : 55000 }
{ "_id" : 2, "name" : "Sarah", "salary" : 128000 }
{ "_id" : 3, "name" : "Fritz", "salary" : 25000 }
{ "_id" : 4, "name" : "Christopher", "salary" : 45000 }
{ "_id" : 5, "name" : "Beck", "salary" : 82000 }
{ "_id" : 6, "name" : "Homer", "salary" : 1 }
{ "_id" : 7, "name" : "Bartholomew", "salary" : 1582000 }
{ "_id" : 8, "name" : "Zoro", "salary" : 300000 }
{ "_id" : 9, "name" : "Xena", "salary" : 382000 }

We can use the $sample stage to randomly select a specified number of documents from that collection.

Example:

db.employees.aggregate(
   [
      { 
        $sample: { size: 3 } 
      }
   ]
)

Result:

{ "_id" : 7, "name" : "Bartholomew", "salary" : 1582000 }
{ "_id" : 3, "name" : "Fritz", "salary" : 25000 }
{ "_id" : 2, "name" : "Sarah", "salary" : 128000 }

In this case, we specified a sample size of 3. We can see that three documents were returned in random order.

Here's the result when we run the same code again:

{ "_id" : 1, "name" : "Bob", "salary" : 55000 }
{ "_id" : 2, "name" : "Sarah", "salary" : 128000 }
{ "_id" : 9, "name" : "Xena", "salary" : 382000 }

This time, a different selection of documents was returned.

We can increase the sample size by specifying a larger number.

Example:

db.employees.aggregate(
   [
      { 
        $sample: { size: 5 } 
      }
   ]
)

Result:

{ "_id" : 9, "name" : "Xena", "salary" : 382000 }
{ "_id" : 3, "name" : "Fritz", "salary" : 25000 }
{ "_id" : 4, "name" : "Christopher", "salary" : 45000 }
{ "_id" : 8, "name" : "Zoro", "salary" : 300000 }
{ "_id" : 5, "name" : "Beck", "salary" : 82000 }

Return All Documents in Random Order

If the requested sample size matches or exceeds the number of documents in the collection, all documents are returned in random order.

Example:

db.employees.aggregate(
   [
      { 
        $sample: { size: 100 } 
      }
   ]
)

Result:

{ "_id" : 4, "name" : "Christopher", "salary" : 45000 }
{ "_id" : 8, "name" : "Zoro", "salary" : 300000 }
{ "_id" : 5, "name" : "Beck", "salary" : 82000 }
{ "_id" : 2, "name" : "Sarah", "salary" : 128000 }
{ "_id" : 6, "name" : "Homer", "salary" : 1 }
{ "_id" : 9, "name" : "Xena", "salary" : 382000 }
{ "_id" : 3, "name" : "Fritz", "salary" : 25000 }
{ "_id" : 7, "name" : "Bartholomew", "salary" : 1582000 }
{ "_id" : 1, "name" : "Bob", "salary" : 55000 }
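The behavior in this section can be modeled outside the database. The following is a minimal sketch in plain JavaScript, not MongoDB's implementation: the hypothetical helper `sampleDocs` shuffles a copy of the input with a Fisher-Yates shuffle and returns at most `size` documents, so a `size` of 100 against nine documents yields all nine in random order.

```javascript
// Illustrative model only: this is NOT how MongoDB implements $sample.
// sampleDocs is a hypothetical helper name introduced for this sketch.
function sampleDocs(docs, size) {
  // Work on a copy so the caller's array is left untouched.
  const shuffled = docs.slice();
  // Fisher-Yates shuffle: swap each position with a random earlier one.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  // slice() caps the result at the input length when size is larger.
  return shuffled.slice(0, size);
}

const employees = [
  { _id: 1, name: "Bob", salary: 55000 },
  { _id: 2, name: "Sarah", salary: 128000 },
  { _id: 3, name: "Fritz", salary: 25000 },
  { _id: 4, name: "Christopher", salary: 45000 },
  { _id: 5, name: "Beck", salary: 82000 },
  { _id: 6, name: "Homer", salary: 1 },
  { _id: 7, name: "Bartholomew", salary: 1582000 },
  { _id: 8, name: "Zoro", salary: 300000 },
  { _id: 9, name: "Xena", salary: 382000 }
];

// Requesting more documents than exist returns every document once.
const everyone = sampleDocs(employees, 100);
console.log(everyone.length);
```

Unlike this model, the real $sample stage runs on the server as part of the aggregation pipeline.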

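How `$sample` Calculates its Results

The $sample stage uses one of two methods to produce its results. The method actually used depends on the scenario.

The following table outlines the method used in each scenario.

Scenario	Method used to produce results
All of the following conditions are met: (1) `$sample` is the first stage of the pipeline; (2) the specified sample size is less than 5% of the total documents in the collection; (3) the collection contains more than 100 documents.	`$sample` uses a pseudo-random cursor to select the documents.
Not all of the above conditions are met.	`$sample` performs a collection scan, followed by a random sort, to select the specified number of documents.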
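The conditions in the table can be restated as a small decision function. This is only a sketch of the rule as described above, written in plain JavaScript. `describeSampleMethod` and its parameter names are hypothetical, and MongoDB makes this decision internally rather than exposing such a function.

```javascript
// Restates the table's rule; hypothetical helper, not a MongoDB API.
function describeSampleMethod({ isFirstStage, sampleSize, totalDocs }) {
  const usesPseudoRandomCursor =
    isFirstStage &&                   // $sample is the first pipeline stage
    sampleSize < totalDocs * 0.05 &&  // size is less than 5% of the collection
    totalDocs > 100;                  // collection has more than 100 documents
  return usesPseudoRandomCursor
    ? "pseudo-random cursor"
    : "collection scan + random sort";
}

// The nine-document employees collection never meets the third condition.
console.log(describeSampleMethod({ isFirstStage: true, sampleSize: 3, totalDocs: 9 }));
```

Under this rule, every example in this article falls into the second row of the table, because the employees collection holds only nine documents.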

Duplicates

The MongoDB documentation warns that $sample may output the same document more than once in its result set.
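If repeated documents would be a problem, one option is to filter them out on the client after the results arrive. The sketch below is one possible approach, not a recommendation from the MongoDB documentation. `dedupeById` is a hypothetical helper, and it assumes primitive `_id` values such as the numbers used in the employees collection.

```javascript
// Hypothetical client-side helper: keeps the first occurrence of each _id.
// Assumes _id values are primitives (numbers or strings); object-valued ids
// such as ObjectId would need to be converted to a string key first.
function dedupeById(docs) {
  const seen = new Map();
  for (const doc of docs) {
    if (!seen.has(doc._id)) {
      seen.set(doc._id, doc);
    }
  }
  // Map preserves insertion order, so the original ordering is kept.
  return Array.from(seen.values());
}

// Made-up sample output containing a repeated document.
const sampled = [
  { _id: 7, name: "Bartholomew", salary: 1582000 },
  { _id: 3, name: "Fritz", salary: 25000 },
  { _id: 7, name: "Bartholomew", salary: 1582000 }
];
console.log(dedupeById(sampled).length);
```

Note that removing duplicates this way can leave fewer documents than the requested sample size.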
