MongoDB Wildcard Indexes: Index Signature and Usage Restrictions

2024-08-10 12:53 · Database · Author: 威赞

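Learn MongoDB and get to know every detail of using it; welcome to 威赞's articles. This is the 98th MongoDB technical article published by 威赞, and you are welcome to browse the other articles in this column. If you find this article helpful or it solves your problem, please leave a like below or follow 威赞. Thank you. 威赞's articles are translated and compiled from the official documentation, with each point carefully thought through and tested in practice, and the harder parts explained in simple, easy-to-understand terms.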

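MongoDB wildcard indexes are convenient for flexible data structures, but what restrictions apply when using them? Based on the official MongoDB documentation, this article summarizes how wildcard indexes are used and where their limits lie.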

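Index signature

Starting in MongoDB 5.0, the wildcardProjection option of a wildcard index is included in the index signature. The index signature is what identifies an index as unique, and it is made up of the parameters used to build the index. Because wildcardProjection is part of the signature, you can create several wildcard indexes that share the same key pattern but differ in wildcardProjection. For example, create two wildcard indexes on the books collection: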

db.books.createIndex({"$**": 1},{
    wildcardProjection: {
        "author.name": 1,
        "author.website": 1
    },
    name: "authorWildcard"
})
db.books.createIndex({"$**": 1},{
    wildcardProjection: {
        "publisher.name": 1
    },
    name: "publisherWildcard"
})

List the indexes:

db.books.getIndexes()
[
  {
    "v": 2,
    "key": {
      "_id": 1
    },
    "name": "_id_"
  },
  {
    "v": 2,
    "key": {
      "$**": 1
    },
    "name": "authorWildcard",
    "wildcardProjection": {
      "author.name": 1,
      "author.website": 1
    }
  },
  {
    "v": 2,
    "key": {
      "$**": 1
    },
    "name": "publisherWildcard",
    "wildcardProjection": {
      "publisher.name": 1
    }
  }
]

Both indexes are created successfully.
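
To make the idea of an index signature more concrete, the sketch below models it in plain JavaScript. The indexSignature helper is hypothetical and deliberately simplified: it assumes that only the key pattern and wildcardProjection make up the signature, whereas the real server takes further options into account. It is an illustration, not MongoDB's implementation.

```javascript
// Simplified, hypothetical model of an index signature: only the key pattern
// and wildcardProjection are considered here.
function indexSignature(spec) {
  return JSON.stringify({
    key: spec.key,
    wildcardProjection: spec.wildcardProjection || null
  });
}

const authorWildcard = {
  key: { "$**": 1 },
  wildcardProjection: { "author.name": 1, "author.website": 1 }
};
const publisherWildcard = {
  key: { "$**": 1 },
  wildcardProjection: { "publisher.name": 1 }
};

// Same key pattern, different projections: the modeled signatures differ,
// which is why both indexes can coexist from MongoDB 5.0 on.
const signaturesDiffer =
  indexSignature(authorWildcard) !== indexSignature(publisherWildcard);
console.log(signaturesDiffer); // true
```

In this model, two specifications with the same key pattern and the same wildcardProjection produce the same signature and would therefore count as the same index.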

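Wildcard index restrictions

Compound wildcard index restrictions

A compound wildcard index can contain only one wildcard term. For example, the following key pattern is not allowed: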

{userID: 1, "object1.$**":1, "object2.$**":1}
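
The one-wildcard-term rule can be checked before calling createIndex. The following is a minimal sketch under the assumption that a wildcard term is any key whose field name is "$**" or ends in ".$**"; countWildcardTerms and hasAtMostOneWildcardTerm are hypothetical helper names, not part of MongoDB or its drivers.

```javascript
// Hypothetical helper: counts the wildcard terms in an index key pattern.
// A term counts as a wildcard when its field name is "$**" or ends in ".$**".
function countWildcardTerms(keyPattern) {
  return Object.keys(keyPattern).filter(
    (field) => field === "$**" || field.endsWith(".$**")
  ).length;
}

// Returns true when the key pattern has at most one wildcard term.
function hasAtMostOneWildcardTerm(keyPattern) {
  return countWildcardTerms(keyPattern) <= 1;
}

console.log(hasAtMostOneWildcardTerm({ userID: 1, "object1.$**": 1 })); // true
console.log(
  hasAtMostOneWildcardTerm({ userID: 1, "object1.$**": 1, "object2.$**": 1 })
); // false
```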

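In a compound wildcard index, the non-wildcard terms cannot be multikey index keys.
When the wildcardProjection option is used, the wildcard key must be exactly $**; a wildcard expression with a specific path is not allowed. The following specification is valid: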

{
  key: { "$**": 1 },
  name: "index_all_with_projection",
  wildcardProjection: {
    "someFields.name": 1,
    "otherFields.values": 1
  }
}

A key pattern that includes a field path, on the other hand, is not valid:

{
  key: { "someFields.$**": 1 },
  name: "index_all_with_projection",
  wildcardProjection: {
    "someFields.name": 1,
    "otherFields.values": 1
  }
}

The _id field is not included in a wildcard index by default. If the wildcard index needs to cover _id, include it explicitly with wildcardProjection:

db.studentGrades.createIndex({"$**": 1}, {
    wildcardProjection: {
        "grades": 1,
        "_id": 1
    }
})

Unique indexes and TTL

A wildcard index cannot be created as a unique index or with an expiration time (TTL).
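
As an illustration, a small pre-flight check can flag these options before an index specification is sent to the server. This is only a sketch: findUnsupportedWildcardOptions is a hypothetical helper, and it assumes the restriction is expressed through the standard createIndex options unique and expireAfterSeconds.

```javascript
// Hypothetical pre-flight check, not part of any MongoDB driver.
// Returns the names of options that a wildcard index cannot be created with.
function findUnsupportedWildcardOptions(keyPattern, options = {}) {
  const isWildcard = Object.keys(keyPattern).some(
    (field) => field === "$**" || field.endsWith(".$**")
  );
  if (!isWildcard) return []; // the rule only applies to wildcard indexes

  const problems = [];
  if (options.unique) problems.push("unique");
  if (options.expireAfterSeconds !== undefined) problems.push("expireAfterSeconds");
  return problems;
}

console.log(
  findUnsupportedWildcardOptions({ "$**": 1 }, { unique: true, expireAfterSeconds: 3600 })
); // [ 'unique', 'expireAfterSeconds' ]
console.log(findUnsupportedWildcardOptions({ createdAt: 1 }, { expireAfterSeconds: 3600 })); // []
```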

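Geospatial and hashed indexes

A wildcard index cannot be combined with a geospatial or hashed index type.

Sharded collections

A wildcard index cannot be used as a shard key index.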

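Queries not supported by wildcard indexes

Array field not equal to null

Suppose the inventory collection has a wildcard index on the product_attributes field. That index cannot support a query that checks whether an array field is not equal to null. For the following queries, MongoDB does not use the wildcard index when it builds the query plan: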

db.inventory.find({"product_attributes.tags": {$ne: null}})
db.inventory.aggregate([
  {
    $match: {"product_attributes.tags": {$ne: null}}
  }
])

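Exact match on embedded documents and arrays

When it builds a wildcard index, MongoDB traverses embedded documents and arrays and adds the primitive values it finds, together with their field paths, to the index. The embedded documents and arrays themselves are not stored in the index structure. For this reason a wildcard index cannot support exact-match queries on an embedded document or an array. For the following queries against the inventory collection, MongoDB does not choose the wildcard index when it builds the query plan: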

db.inventory.find(
  {
    "product_attributes": {"price": 29.99}
  }
)
db.inventory.find(
  {
    "product_attributes.tags": ["waterproof", "fireproof"]
  }
)

Likewise, a wildcard index cannot support inequality queries on embedded documents or arrays.
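
The traversal described in this section can be sketched in plain JavaScript. The wildcardEntries function below is a simplified, hypothetical model that only handles JSON-like values; the server's actual handling of cases such as empty or nested arrays is more involved. It shows that only scalar leaves and their paths are recorded, so there is no entry holding a whole embedded document or a whole array for an exact match to hit.

```javascript
// Simplified model of how a wildcard index sees a document: embedded documents
// and arrays are traversed, and only scalar leaf values are recorded together
// with their dotted field path.
function wildcardEntries(value, path = "") {
  if (Array.isArray(value)) {
    // Array elements are recorded under the path of the array field itself.
    return value.flatMap((item) => wildcardEntries(item, path));
  }
  if (value !== null && typeof value === "object") {
    return Object.entries(value).flatMap(([field, child]) =>
      wildcardEntries(child, path ? `${path}.${field}` : field)
    );
  }
  return [[path, value]];
}

const doc = {
  product_attributes: {
    price: 29.99,
    tags: ["waterproof", "fireproof"]
  }
};

const entries = wildcardEntries(doc);
console.log(entries);
// entries holds three [path, value] pairs:
//   product_attributes.price -> 29.99
//   product_attributes.tags  -> "waterproof"
//   product_attributes.tags  -> "fireproof"
```

None of the recorded values is the object {price: 29.99} or the array ["waterproof", "fireproof"], which matches the behavior described above.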

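Checking whether a field exists

Wildcard indexes are sparse. When an indexed field does not exist in a document, no entry for that document is added to the wildcard index. As a result, a wildcard index cannot support queries for documents in which a field does not exist.

For example, a wildcard index cannot support the following queries: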

db.inventory.find(
  {
    "product_attributes":  {$exists: false}
  }
)
db.inventory.aggregate([
  {
    $match:  {
      "product_attributes": { $exists: false}
    }
  }
])

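Multi-field queries

MongoDB cannot use a non-wildcard index to support one part of a query predicate and a wildcard index to support another part.

MongoDB cannot use multiple wildcard indexes to support different predicates in the same query.

When a single wildcard index could support several of the queried fields, MongoDB can use it for only one of them. MongoDB picks that field automatically based on the wildcard index paths.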

db.inventory.find(
  {
    "product_attributes.price": {$gt: 20},
    "product_attributes.material": "silk",
    "product_attributes.size": "large"
  }
)

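The wildcard index can support only one of the conditions in this query. Which condition is chosen depends on the wildcard index paths.

The execution plan for the query above: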

{
  "explainVersion": "2",
  "queryPlanner": {
    "namespace": "test.inventory",
    "indexFilterSet": false,
    "parsedQuery": {
      "$and": [
        {
          "product_attributes.material": {
            "$eq": "silk"
          }
        },
        {
          "product_attributes.size": {
            "$eq": "large"
          }
        },
        {
          "product_attributes.price": {
            "$gt": 20
          }
        }
      ]
    },
    "queryHash": "03951C4C",
    "planCacheKey": "BC3202F5",
    "maxIndexedOrSolutionsReached": false,
    "maxIndexedAndSolutionsReached": false,
    "maxScansToExplodeReached": false,
    "winningPlan": {
      "queryPlan": {
        "stage": "FETCH",
        "planNodeId": 2,
        "filter": {
          "$and": [
            {
              "product_attributes.price": {
                "$gt": 20
              }
            },
            {
              "product_attributes.size": {
                "$eq": "large"
              }
            }
          ]
        },
        "inputStage": {
          "stage": "IXSCAN",
          "planNodeId": 1,
          "keyPattern": {
            "$_path": 1,
            "product_attributes.material": 1
          },
          "indexName": "product_attributes.$**_1",
          "isMultiKey": false,
          "multiKeyPaths": {
            "$_path": [],
            "product_attributes.material": []
          },
          "isUnique": false,
          "isSparse": true,
          "isPartial": false,
          "indexVersion": 2,
          "direction": "forward",
          "indexBounds": {
            "$_path": [
              "[\"product_attributes.material\", \"product_attributes.material\"]"
            ],
            "product_attributes.material": [
              "[\"silk\", \"silk\"]"
            ]
          }
        }
      },
      "slotBasedPlan": {
        "slots": "$$RESULT=s11 env: { s14 = 20, s1 = TimeZoneDatabase(America/Argentina/La_Rioja...Asia/Ashkhabad) (timeZoneDB), s10 = {\"$_path\" : 1, \"product_attributes.material\" : 1}, s6 = KS(3C70726F647563745F617474726962757465732E6D6174657269616C003C73696C6B00FE04), s15 = \"large\", s3 = 1721879566202 (NOW), s2 = Nothing (SEARCH_META), s5 = KS(3C70726F647563745F617474726962757465732E6D6174657269616C003C73696C6B000104) }",
        "stages": "[2] filter {(traverseF(s13, lambda(l1.0) { traverseF(getField(l1.0, \"price\"), lambda(l2.0) { ((l2.0 > s14) ?: false) }, false) }, false) && traverseF(s13, lambda(l3.0) { traverseF(getField(l3.0, \"size\"), lambda(l4.0) { ((l4.0 == s15) ?: false) }, false) }, false))} \n[2] nlj inner [] [s4, s7, s8, s9, s10] \n    left \n        [1] cfilter {(exists(s5) && exists(s6))} \n        [1] ixseek s5 s6 s9 s4 s7 s8 [] @\"259baef3-1faf-4703-8a12-870b2c7e1f55\" @\"product_attributes.$**_1\" true \n    right \n        [2] limit 1 \n        [2] seek s4 s11 s12 s7 s8 s9 s10 [s13 = product_attributes] @\"259baef3-1faf-4703-8a12-870b2c7e1f55\" true false \n"
      }
    },
    "rejectedPlans": [
      {
        "queryPlan": {
          "stage": "FETCH",
          "planNodeId": 2,
          "filter": {
            "$and": [
              {
                "product_attributes.material": {
                  "$eq": "silk"
                }
              },
              {
                "product_attributes.size": {
                  "$eq": "large"
                }
              }
            ]
          },
          "inputStage": {
            "stage": "IXSCAN",
            "planNodeId": 1,
            "keyPattern": {
              "$_path": 1,
              "product_attributes.price": 1
            },
            "indexName": "product_attributes.$**_1",
            "isMultiKey": false,
            "multiKeyPaths": {
              "$_path": [],
              "product_attributes.price": []
            },
            "isUnique": false,
            "isSparse": true,
            "isPartial": false,
            "indexVersion": 2,
            "direction": "forward",
            "indexBounds": {
              "$_path": [
                "[\"product_attributes.price\", \"product_attributes.price\"]"
              ],
              "product_attributes.price": [
                "(20, inf.0]"
              ]
            }
          }
        },
        "slotBasedPlan": {
          "slots": "$$RESULT=s11 env: { s14 = \"silk\", s10 = {\"$_path\" : 1, \"product_attributes.price\" : 1}, s1 = TimeZoneDatabase(America/Argentina/La_Rioja...Asia/Ashkhabad) (timeZoneDB), s15 = \"large\", s6 = KS(3C70726F647563745F617474726962757465732E70726963650033FFFFFFFFFFFFFFFFFE04), s3 = 1721879566202 (NOW), s5 = KS(3C70726F647563745F617474726962757465732E7072696365002B28FE04), s2 = Nothing (SEARCH_META) }",
          "stages": "[2] filter {(traverseF(s13, lambda(l1.0) { traverseF(getField(l1.0, \"material\"), lambda(l2.0) { ((l2.0 == s14) ?: false) }, false) }, false) && traverseF(s13, lambda(l3.0) { traverseF(getField(l3.0, \"size\"), lambda(l4.0) { ((l4.0 == s15) ?: false) }, false) }, false))} \n[2] nlj inner [] [s4, s7, s8, s9, s10] \n    left \n        [1] cfilter {(exists(s5) && exists(s6))} \n        [1] ixseek s5 s6 s9 s4 s7 s8 [] @\"259baef3-1faf-4703-8a12-870b2c7e1f55\" @\"product_attributes.$**_1\" true \n    right \n        [2] limit 1 \n        [2] seek s4 s11 s12 s7 s8 s9 s10 [s13 = product_attributes] @\"259baef3-1faf-4703-8a12-870b2c7e1f55\" true false \n"
        }
      },
      {
        "queryPlan": {
          "stage": "FETCH",
          "planNodeId": 2,
          "filter": {
            "$and": [
              {
                "product_attributes.material": {
                  "$eq": "silk"
                }
              },
              {
                "product_attributes.price": {
                  "$gt": 20
                }
              }
            ]
          },
          "inputStage": {
            "stage": "IXSCAN",
            "planNodeId": 1,
            "keyPattern": {
              "$_path": 1,
              "product_attributes.size": 1
            },
            "indexName": "product_attributes.$**_1",
            "isMultiKey": false,
            "multiKeyPaths": {
              "$_path": [],
              "product_attributes.size": []
            },
            "isUnique": false,
            "isSparse": true,
            "isPartial": false,
            "indexVersion": 2,
            "direction": "forward",
            "indexBounds": {
              "$_path": [
                "[\"product_attributes.size\", \"product_attributes.size\"]"
              ],
              "product_attributes.size": [
                "[\"large\", \"large\"]"
              ]
            }
          }
        },
        "slotBasedPlan": {
          "slots": "$$RESULT=s11 env: { s14 = \"silk\", s10 = {\"$_path\" : 1, \"product_attributes.size\" : 1}, s1 = TimeZoneDatabase(America/Argentina/La_Rioja...Asia/Ashkhabad) (timeZoneDB), s15 = 20, s6 = KS(3C70726F647563745F617474726962757465732E73697A65003C6C6172676500FE04), s3 = 1721879566202 (NOW), s5 = KS(3C70726F647563745F617474726962757465732E73697A65003C6C61726765000104), s2 = Nothing (SEARCH_META) }",
          "stages": "[2] filter {(traverseF(s13, lambda(l1.0) { traverseF(getField(l1.0, \"material\"), lambda(l2.0) { ((l2.0 == s14) ?: false) }, false) }, false) && traverseF(s13, lambda(l3.0) { traverseF(getField(l3.0, \"price\"), lambda(l4.0) { ((l4.0 > s15) ?: false) }, false) }, false))} \n[2] nlj inner [] [s4, s7, s8, s9, s10] \n    left \n        [1] cfilter {(exists(s5) && exists(s6))} \n        [1] ixseek s5 s6 s9 s4 s7 s8 [] @\"259baef3-1faf-4703-8a12-870b2c7e1f55\" @\"product_attributes.$**_1\" true \n    right \n        [2] limit 1 \n        [2] seek s4 s11 s12 s7 s8 s9 s10 [s13 = product_attributes] @\"259baef3-1faf-4703-8a12-870b2c7e1f55\" true false \n"
        }
      }
    ]
  },
  "command": {
    "find": "inventory",
    "filter": {
      "product_attributes.price": {
        "$gt": 20
      },
      "product_attributes.material": "silk",
      "product_attributes.size": "large"
    },
    "$db": "test"
  },
  "serverInfo": {
    "host": "TEST-W11",
    "port": 27017,
    "version": "7.0.4",
    "gitVersion": "38f3e37057a43d2e9f41a39142681a76062d582e"
  },
  "serverParameters": {
    "internalQueryFacetBufferSizeBytes": 104857600,
    "internalQueryFacetMaxOutputDocSizeBytes": 104857600,
    "internalLookupStageIntermediateDocumentMaxSizeBytes": 104857600,
    "internalDocumentSourceGroupMaxMemoryBytes": 104857600,
    "internalQueryMaxblockingSortMemoryUsageBytes": 104857600,
    "internalQueryProhibitBlockingMergeOnMongoS": 0,
    "internalQueryMaxAddToSetBytes": 104857600,
    "internalDocumentSourceSetWindowFieldsMaxMemoryBytes": 104857600,
    "internalQueryFrameworkControl": "trySbeEngine"
  },
  "ok": 1
}
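
Reading a plan of this size by eye is tedious, so it can help to pull out just the index scan. The sketch below is an assumption-laden illustration: findIndexScan is a hypothetical helper, and winningPlan is a hand-trimmed copy of the queryPlanner.winningPlan.queryPlan document from the output above, not a live explain result.

```javascript
// Walks a queryPlan tree and returns the first IXSCAN stage, or null if the
// plan contains no index scan.
function findIndexScan(stage) {
  if (!stage || typeof stage !== "object") return null;
  if (stage.stage === "IXSCAN") return stage;

  const children = [];
  if (stage.inputStage) children.push(stage.inputStage);
  if (Array.isArray(stage.inputStages)) children.push(...stage.inputStages);

  for (const child of children) {
    const found = findIndexScan(child);
    if (found) return found;
  }
  return null;
}

// Trimmed copy of the winning plan shown above.
const winningPlan = {
  stage: "FETCH",
  inputStage: {
    stage: "IXSCAN",
    keyPattern: { "$_path": 1, "product_attributes.material": 1 },
    indexName: "product_attributes.$**_1"
  }
};

const scan = findIndexScan(winningPlan);
// The field served by the wildcard index is the key pattern entry other than
// the internal "$_path" entry.
const indexedFields = Object.keys(scan.keyPattern).filter((f) => f !== "$_path");
console.log(scan.indexName, indexedFields);
```

For the plan above this reports a single field, product_attributes.material, while the price and size conditions appear only in the FETCH stage's filter.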

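Sorting

A wildcard index can support a sort only when the query planner selects that index to satisfy the query predicate and the sort specifies only the predicate field. In addition, the sort field must never be an array.

For example, build a wildcard index on product_attributes in the products collection: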

db.products.createIndex({"product_attributes.$**": 1})

As long as the price field is never an array, the wildcard index can support the following sorted query:

db.products.find(
  {"product_attributes.price": { $gt: 10.00}} 
).sort({"product_attributes.price": 1} )
