PyMongo - cursor iteration

Have you considered an approach like this:

for line in file:
    value = line[a:b]
    cursor = collection.find({"field": value})
    entries = list(cursor)  # pull every document out of the cursor into a plain list
    # then process entries as a list, either one at a time or in batches

Or, alternatively, something like this:

entries = {}
# same loop start
    entries[value] = list(cursor)
# after the loop, every cursor has been fully consumed, so none is left open
for value in entries:
    # process entries[value], either one at a time or in batches
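
For reference, here is a minimal runnable sketch of that second variant. It assumes `collection` is a PyMongo collection (anything with a compatible `find` method will do). The names `collect_entries`, `start`, and `end` are made up for illustration, and skipping values that were already queried is my own addition, not something the pseudocode above does:

```python
def collect_entries(lines, collection, start, end, field="field"):
    """Map each distinct slice value to the full list of matching documents.

    `collection` is assumed to expose find(query) returning an iterable,
    as a PyMongo collection does.
    """
    entries = {}
    for line in lines:
        value = line[start:end]
        if value in entries:
            continue  # already fetched this value; skip the duplicate query
        # list() consumes the whole cursor, so all documents end up in RAM
        entries[value] = list(collection.find({field: value}))
    return entries
```

After this returns, `entries` holds only plain Python lists, so the processing step no longer touches the database.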

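Basically, as long as you have enough RAM to hold the result sets, you should be able to pull them off the cursors and keep them around before processing. This is unlikely to be significantly faster overall, but it takes any cursor-specific slowdown out of the processing step, and it frees you to process the data in parallel if you are set up for that.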
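
If you do want to try the parallel route, something along these lines might work once the results are in memory. This is only a sketch: `process_batch` is a hypothetical stand-in for whatever your real processing is, and `process_in_parallel` and `max_workers=4` are names and values I picked for illustration:

```python
from concurrent.futures import ThreadPoolExecutor


def process_batch(docs):
    """Hypothetical placeholder for the real per-batch work."""
    return len(docs)


def process_in_parallel(entries, worker=process_batch, max_workers=4):
    """Run worker over each materialized result list; return {value: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit one task per value; the lists are already in memory
        futures = {value: pool.submit(worker, docs)
                   for value, docs in entries.items()}
        # result() blocks until that task is done and re-raises its exceptions
        return {value: future.result() for value, future in futures.items()}
```

Threads mostly help when the processing is I/O-bound. For CPU-heavy work, `ProcessPoolExecutor` from the same module is the usual substitute, provided the documents and the worker function can be pickled.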