Splunk is a powerful tool, but with so many available functions and hit-and-miss coverage on forums, it can sometimes take some trial and error to get queries right.

Some days ago, one of my colleagues told me: "if you want to delete duplicates in your search, using stats count by yourfield is more efficient than using dedup yourfield, because stats doesn't have to compare ALL the elements of the search while dedup does." He didn't give me any demonstration of it, though. I've been digging for days on the internet, but I can't find an official answer, just some well-argued approaches. Somebody even says that stats dc(yourfield) is faster than a simple stats. For me that makes complete sense, because it's easier to count (or distinct-count) elements by one unique field than to check whether each element already exists within ALL the data sets. I'm just looking to improve my queries as best I can. So, what do you guys think? Is there any REAL performance improvement in using stats over dedup? Is there any official answer to this question?

Assuming you want a list of all values of a field in an index, both of these searches would give you that:

index=a | dedup field | fields field
index=a | stats count by field | fields - count

Fundamentally, both searches have to do the same work: extract, alias, calculate, or look up whatever is needed to produce the field; produce a deduplicated list on each indexer (prestats/prededup in remoteSearch in the job inspector); and merge those lists into one on the search head. Assuming both commands are built well, there will not be a huge difference in performance. You can verify this by looking at the big numbers in the job inspector: both searches should show similarly small amounts of data returned to the search head. When looking at run time, make sure you do several executions to get a good average and iron out other activity on the system. A less smart use of dedup may cause more data to be carried around; for example, dedup should not allow batch-mode searches, since it requires event ordering, and it may therefore not allow parallel search pipelines (I didn't verify this).

One related gotcha: I tried stats count as Total followed by search Total > 2, and it works, but not as I expected. stats count as Total counts the number of occurrences per group (e.g. 2, 1, 1), and search Total > 2 then filters on that value. With counts of 2, 1 and 1, the query should not display any event, because no group has a Total greater than 2.
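The Total > 2 behaviour above is just strict comparison on the aggregated count. A minimal Python sketch (the field values are hypothetical, chosen to produce counts of 2, 1, 1 like the example):

```python
from collections import Counter

# Hypothetical raw field values that yield counts of 2, 1, 1,
# mimicking `... | stats count as Total by field`.
values = ["x", "x", "y", "z"]
totals = Counter(values)  # x: 2, y: 1, z: 1

# `| search Total > 2` keeps only rows whose count is STRICTLY
# greater than 2, so a count of exactly 2 does not match.
rows = [(v, c) for v, c in totals.items() if c > 2]
print(rows)  # [] -> no events, since no count exceeds 2
```

This matches the expectation in the post: with no group above 2, the filtered result is empty.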
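The distributed-work argument earlier in the post (a partial dedup on each indexer, merged on the search head) can be sketched in Python. This is an illustration of the general map-reduce shape, not Splunk's actual implementation: counting by field is a commutative aggregation that merges cheaply in any order, while a first-occurrence dedup depends on event ordering. The event data and function names here are hypothetical.

```python
from collections import Counter

def indexer_prestats(events, field):
    """Per-indexer phase, analogous to a partial `stats count by field`:
    only (value, count) pairs travel to the search head."""
    return Counter(e[field] for e in events if field in e)

def search_head_merge(partials):
    """Merge phase: counters are commutative, so partial results from
    any number of indexers can be combined in any order."""
    total = Counter()
    for p in partials:
        total.update(p)
    return total

def dedup_first_occurrence(events, field):
    """A first-occurrence dedup: the output depends on event order,
    which is why it is harder to parallelise freely."""
    seen, out = set(), []
    for e in events:
        v = e.get(field)
        if v is not None and v not in seen:
            seen.add(v)
            out.append(e)
    return out

# Hypothetical events spread over two indexers.
idx1 = [{"host": "a"}, {"host": "b"}, {"host": "a"}]
idx2 = [{"host": "b"}, {"host": "c"}]

merged = search_head_merge([indexer_prestats(idx1, "host"),
                            indexer_prestats(idx2, "host")])
print(sorted(merged))  # ['a', 'b', 'c'] -- duplicates removed
print(len(dedup_first_occurrence(idx1 + idx2, "host")))  # 3
```

Both paths end up with the same distinct values, which is consistent with the claim that well-built dedup and stats searches perform similarly; the difference shows up only when ordering constraints prevent the dedup from being pushed down and merged like the counters here.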