今天发现查询mongo时容易慢查询，现在将整个排查及优化过程,总结记录如下。

Database Profiler介绍

类似于mysql,Profiler可以实现慢查询日志的功能，它把数据收集在system.profile集合里。该集合是一个capped collection（有上限的集合）。Profiler默认是关闭的，可以在一个database或者一个instance上设置开启，设置不同的profiling levels.
具体描述见Database Profiler。

慢查询分析流程

慢查询日志一般作为优化步骤里的第一步。通过慢查询日志，定位每一条语句的查询时间。比如超过了200ms，那么查询超过200ms的语句需要优化。然后它通过 .explain() 解析影响行数是不是过大，所以导致查询语句超过200ms。
所以优化步骤一般就是：

用慢查询日志（system.profile）找到超过200ms的语句
然后再通过.explain()解析影响行数，分析为什么超过200ms
决定是不是需要添加索引

开启慢查询

Profiling Levels

Level	Description
0	The profiler is off and does not collect any data. This is the default profiler level.
1	The profiler collects data for operations that take longer than the value of slowms.
2	The profiler collects data for all operations.

Enable and Configure Database Profiling

方式1 - from the mongo shell

# 查看状态：级别和时间
uhost:PRIMARY> db.getProfilingStatus()
{ "was" : 0, "slowms" : 100 }
# 设置Profiling Level，返回的是之前设置的Level,返回"ok":1表示设置成功
uhost:PRIMARY> db.setProfilingLevel(1)
{ "was" : 0, "slowms" : 100, "ok" : 1 }
uhost:PRIMARY> db.getProfilingStatus()
{ "was" : 1, "slowms" : 100 }
# 设置Profiling Level和slowms
uhost:PRIMARY> db.setProfilingLevel(1,200)
{ "was" : 1, "slowms" : 100, "ok" : 1 }
uhost:PRIMARY> db.getProfilingStatus()
{ "was" : 1, "slowms" : 200 }

# New in version 3.6，支持设置sampleRate。
# 默认是1，以下设置0.42，表示采样42%的慢查询
db.setProfilingLevel(1, { sampleRate: 0.42 })

注意：

每次设置之后，返回的是之前的状态
每次设置仅对当前所在的database有效

方式2 - 在mongo启动时设置

作为命令行参数传入：

1	mongod --profile 1 --slowms 15 --slowOpSampleRate 0.5

还可以在配置文件里设置

具体设置参考profile

关闭Profiling

1 2	uhost:PRIMARY> db.setProfilingLevel(0) { "was" : 1, "slowms" : 200, "ok" : 1 }

Profiling的效率问题

这是我能搜到的说法，但是具体原理和数据我还没找到，官网也没看到介绍。

从其他地方复制过来的说法，待论证。”Profiling 功能肯定是会影响效率的，但是不太严重，原因是他使用的是system.profile 来记录，而system.profile 是一个capped collection 这种collection 在操作上有一些限制和特点，但是效率更高“

慢查询结果分析

各项参数含义

这里用的MongoDB版本是2.6, 具体可以参考 https://docs.mongodb.com/v2.6/reference/database-profiler/

uhost:PRIMARY> db.system.profile.find({"ns" : "uhost.DiskAttachInfo","ts":{$gt:new ISODate("2019-07-23T10:11:00Z")}}).pretty()

{
	"op" : "query", #操作类型，有insert、query、update、remove、getmore、command
	"ns" : "uhost.DiskAttachInfo", #操作的collection
	"query" : {
		"backend_types" : {
			"$in" : [
				"ssd"
			]
		}
	},
	"ntoreturn" : 0, #返回的数量
	"ntoskip" : 0, #The number of documents the skip() method specified to skip.
	"nscanned" : 520144, #MongoDB在索引中扫描以执行操作的文档数。 通常，如果nscanned远高于nredurned，则数据库会扫描许多对象以查找目标对象。 考虑创建一个索引来改善这一点。
	"nscannedObjects" : 520144, # The number of documents that MongoDB scans from the collection in order to carry out the operation.
	"keyUpdates" : 0,
	"numYield" : 6,
	"lockStats" : {
		"timeLockedMicros" : {
			"r" : NumberLong(1536711),
			"w" : NumberLong(0)
		},
		"timeAcquiringMicros" : {
			"r" : NumberLong(913),
			"w" : NumberLong(4)
		}
	},
	"nreturned" : 0, // 返回的文档数量
	"responseLength" : 20, // 返回字节长度，如果这个数字很大，考虑值返回所需字段
	"millis" : 820, #消耗的时间（毫秒）
	"execStats" : {
		"type" : "COLLSCAN", #”COLLSCAN“：整个集合扫描，效果最差；”IXSCAN“：索引扫描；"Fetch": ?
		"works" : 520146,
		"yields" : 4063,
		"unyields" : 4063,
		"invalidates" : 0,
		"advanced" : 0,
		"needTime" : 520145,
		"needFetch" : 0,
		"isEOF" : 1,
		"docsTested" : 520144,
		"children" : [ ]
	},
	"ts" : ISODate("2019-07-23T10:11:54.468Z"), #该命令在何时执行
	"client" : "127.0.0.1", #链接ip或则主机
	"allUsers" : [ ],
	"user" : ""
}

分析

如果发现 millis 值比较大，那么就需要作优化。

1 如果nscanned数很大，或者接近记录总数（文档数），那么可能没有用到索引查询，而是全表扫描。

2 如果nscanned值高于nreturned的值，说明数据库为了找到目标文档扫描了很多文档。这时可以考虑创建索引来提高效率。

system.profile补充(这部分待验证)

‘type’的返回参数说明：

COLLSCAN #全表扫描
IXSCAN #索引扫描
FETCH #根据索引去检索指定document
SHARD_MERGE #将各个分片返回数据进行merge
SORT #表明在内存中进行了排序（与老版本的scanAndOrder:true一致）
LIMIT #使用limit限制返回数
SKIP #使用skip进行跳过
IDHACK #针对_id进行查询
SHARDING_FILTER #通过mongos对分片数据进行查询
COUNT #利用db.coll.explain().count()之类进行count运算
COUNTSCAN #count不使用Index进行count时的stage返回
COUNT_SCAN #count使用了Index进行count时的stage返回
SUBPLA #未使用到索引的$or查询的stage返回
TEXT #使用全文索引进行查询时候的stage返回
PROJECTION #限定返回字段时候stage的返回

对于普通查询，我们最希望看到的组合有这些：

Fetch+IDHACK
Fetch+ixscan
Limit+（Fetch+ixscan）
PROJECTION+ixscan
SHARDING_FILTER+ixscan

不希望看到包含如下的type：

COLLSCAN（全表扫）
SORT（使用sort但是无index）
不合理的SKIP，SUBPLA（未用到index的$or）

对于count查询，希望看到的有:COUNT_SCAN

不希望看到的有:COUNTSCAN

常见的慢日志(system.profile)查询语句

返回最近的10条记录：

1	db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()

返回所有的操作，除command类型的

1	db.system.profile.find( { op: { $ne : ‘command‘ } }).pretty()

返回特定集合

1	db.system.profile.find( { ns: "uhost.DiskAttachInfo"} ).pretty()

To return operations slower than 5 milliseconds, run a query similar to the following:

1	db.system.profile.find( { millis : { $gt : 5 } } ).pretty()

To return information from a certain time range, run a query similar to the following:

db.system.profile.find({
  ts : {
    $gt: new ISODate("2012-12-09T03:00:00Z"),
    $lt: new ISODate("2012-12-09T03:40:00Z")
  }
}).pretty()

The following example looks at the time range, suppresses the user field from the output to make it easier to read, and sorts the results by how long each operation took to run:

db.system.profile.find({
  ts : {
    $gt: new ISODate("2011-07-12T03:00:00Z"),
    $lt: new ISODate("2011-07-12T03:40:00Z")
  }
}, { user: 0 }).sort( { millis: -1 } )

Show the Five Most Recent Events

On a database that has profiling enabled, the show profile helper in the mongo shell displays the 5 most recent operations that took at least 1 millisecond to execute. Issue show profile from the mongo shell, as follows:

1	show profile

explain分析

uhost:PRIMARY> db.getCollection('DiskAttachInfo').find({"backend_types":{$in:["ssd"]}}).explain()
{
	"cursor" : "BasicCursor", #返回游标类型，有BasicCursor和BtreeCursor，后者意味着使用了索引。
    "isMultiKey" : false,
	"isMultiKey" : false,
	"n" : 0,
	"nscannedObjects" : 522072, #the total number of documents scanned.
	"nscanned" : 522072,  #the total number of index entries scanned (or documents for a collection scan).
	"nscannedObjectsAllPlans" : 522072,
	"nscannedAllPlans" : 522072,
	"scanAndOrder" : false, #MongoDB是否在内存中对结果集进行了排序
	"indexOnly" : false, #MongoDB是否只使用索引就能完成此次查询
	"nYields" : 4078,  #为了让写入请求能够顺利执行，本次查询暂停暂停的次数。如果有写入请求需求处理，查询会周期性的释放他们的锁，以便写入能够顺利执行
	"nChunkSkips" : 0,
	"millis" : 918, #数据库执行本次查询所耗费的毫秒数。这个数字越小，说明效率越高
	"server" : "172-18-38-130:27017",
	"filterSet" : false
}

详细见cursor.explain()

添加索引

在以上的分析中，”nscanned” 过大，且”type” : “COLLSCAN”为整个集合扫描，”indexOnly” : false表示没有用到索引。考虑创建一个索引来改善这一点。

查看当前索引，在backend_type上并没有索引。

uhost:PRIMARY> db.getCollection('DiskAttachInfo').getIndexes()
[
	{
		"v" : 1,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "uhost.DiskAttachInfo"
	},
	{
		"v" : 1,
		"key" : {
			"uhost_id" : 1
		},
		"name" : "uhost_id_1",
		"ns" : "uhost.DiskAttachInfo",
		"background" : true,
		"safe" : null
	},
	{
		"v" : 1,
		"key" : {
			"attached" : 1
		},
		"name" : "attached_1",
		"ns" : "uhost.DiskAttachInfo",
		"background" : true,
		"safe" : null
	},
	{
		"v" : 1,
		"key" : {
			"disk_id" : 1
		},
		"name" : "disk_id_1",
		"ns" : "uhost.DiskAttachInfo",
		"background" : true,
		"safe" : null
	}
]

添加index

uhost:PRIMARY> db.DiskAttachInfo.createIndex({"backend_type":1},{background: true, safe: null})
{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 4,
	"numIndexesAfter" : 5,
	"ok" : 1
}

添加索引后测试效果

uhost:PRIMARY> db.getCollection('DiskAttachInfo').find({"backend_type":{$in:["ssd"]}}).explain()
{
	"cursor" : "BtreeCursor backend_type_1",
	"isMultiKey" : false,
	"n" : 98023,
	"nscannedObjects" : 98023,
	"nscanned" : 98023,
	"nscannedObjectsAllPlans" : 98023,
	"nscannedAllPlans" : 98023,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 765,
	"nChunkSkips" : 0,
	"millis" : 169,
	"indexBounds" : {
		"backend_type" : [
			[
				"ssd",
				"ssd"
			]
		]
	},
	"server" : "172-18-38-130:27017",
	"filterSet" : false
}

“nscanned” 由原来的52w降低到9w多。
时间（”millis” ）由828ms降低到169ms。

一次mongo慢查询的优化过程