Splunk Tips and Tricks Part 1: Search and Reporting Tips
Over the years I have collected some basic (and advanced) tips and tricks for using Splunk, both from the operational standpoint of server maintenance and from the standpoint of alert, report, and dashboard creation.
Below are some of the most basic search tips that I have; additional posts will follow with more advanced tips and tricks.
- Easy Epoch time conversion to a readable date-time format
This one is pretty basic, but something people may not know. Instead of using eval with the strftime function to convert an Epoch time to a human-readable date (assuming you don’t need a very particular format), you can simply use the | convert command.
index=linuxlogs | convert ctime(starttime)
This is a basic example, but the above converts a field “starttime” from Epoch to the default standard time format.
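For comparison, the longer eval/strftime route gives you full control over the output format; the format string below is just an illustrative choice:
index=linuxlogs | eval starttime=strftime(starttime, "%Y-%m-%d %H:%M:%S")
The convert ctime() one-liner is simpler when the default format is good enough; reach for strftime only when you need a specific layout.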
- View current logged in users
The command below runs a REST API query on the search head to see who is currently logged into the server (Search Head Clusters need to check the audit logs to see currently logged-in users).
|rest /services/authentication/users splunk_server=local | search [| rest /services/authentication/current-context splunk_server=local | rename username as title | fields title] | fields title
- Pushing Values Together Within a Table
Sometimes when you append data, each appended result lands on its own row with the other columns empty (values here are illustrative):
|Column A||Column B Field||Column C Field|
|value1||valueB|| |
|value1|| ||valueC|
But you want the rows pushed together, like this:
|Column A||Column B Field||Column C Field|
|value1||valueB||valueC|
The trick is to add the following after your stats/chart command:
| stats first(*) as *
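As a fuller sketch, assume a hypothetical pair of searches joined on a shared host field (the index and field names here are illustrative, not from the original post). The append leaves status and uptime on separate rows per host, and the final stats pushes them together:
index=weblogs | stats latest(status) as status by host | append [ search index=linuxlogs | stats latest(uptime) as uptime by host ] | stats first(*) as * by host
When you are merging per-key rows rather than collapsing everything to a single row, add the key to the by clause as shown.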
- Fixing Multiple Multi-Value Columns without Duplicating Data
Sometimes your data ends up with multiple multivalue columns, for example a single row where LoginTime and LogoutTime each contain several values, when what you want is one row per LoginTime/LogoutTime pair. However, a simple MVEXPAND on one column will duplicate data, AND the column you don’t MVEXPAND will stay multivalue. So you need to do this:
| eval reading=mvzip(LogoutTime,LoginTime) | mvexpand reading | makemv reading delim="," | eval LogoutTime=mvindex(reading, 0) | eval LoginTime=mvindex(reading,1)
Obviously, the field names will change, and you can do this for more than two fields. For each additional field, add another eval like this:
| eval reading=mvzip(reading,<<NEWFIELD>>)
Everything else stays the same, except that the mvindex for each new field uses -1 (the last element):
| eval <<NEWFIELD>>=mvindex(reading,-1)
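Putting the pieces above together, a sketch with a hypothetical third field SessionID (your field names will differ) would look like:
| eval reading=mvzip(LogoutTime,LoginTime) | eval reading=mvzip(reading,SessionID) | mvexpand reading | makemv reading delim="," | eval LogoutTime=mvindex(reading,0) | eval LoginTime=mvindex(reading,1) | eval SessionID=mvindex(reading,-1)
Each mvzip folds one more field into the combined reading value, so after the mvexpand every row carries one aligned tuple of values.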
Keep in mind the mvexpand command is very resource-intensive, so if you run this on a small server against tens of thousands of results, you will hit resource constraints and the results will be truncated. Be sure to filter down as much as possible before running this trick!
- Calculating CPU Seconds Per Indexer/Searches Per Indexer Per Minute
To explain this search, let’s start by noting that every search run from a search head is given a dedicated core on each indexer that contains the data. This means that if you have many searches running simultaneously (scheduled or otherwise), searches can be skipped (dropped entirely) or delayed, depending on each search’s settings.
It is therefore important to track the average CPU seconds per indexer and the number of searches per indexer per minute. This lets you see whether you are nearing (or exceeding) the maximum number of concurrent searches your current infrastructure can support.
For example, if you have an indexer cluster of 4 servers that each have 4 cores and no replication factor, you can run approximately 4*4 = 16 concurrent searches.
Before you run this search, it is a good idea to have either a lookup table of your Splunk indexer names or, if your Splunk indexers all follow the same naming convention, to do what this search does and simply filter on the host field.
index=_introspection host=splunkIDX* sourcetype=splunk_resource_usage component=PerProcess data.search_props.sid::*
| bin _time span=10s
| eval ELAPSED='data.elapsed'
| stats max(ELAPSED) as ELAPSED by data.search_props.sid data.search_props.type _time
| streamstats current=t global=f window=2 earliest(ELAPSED) as prev_ELAPSED latest(ELAPSED) as curr_ELAPSED by data.search_props.sid data.search_props.type
| eval collection_interval = 10
| eval delta_ELAPSED = curr_ELAPSED - prev_ELAPSED
| eval runtime = if(delta_ELAPSED = 0, min(curr_ELAPSED, collection_interval), delta_ELAPSED)
| timechart span=1m partial=f sum(runtime) as cpu_secs dc(data.search_props.sid) as search_count
- Discovering Non-Executing Searches
Related to calculating CPU seconds, finding Splunk’s non-executing searches is valuable when determining why a report is missing or whether you have performance issues or constraints on your infrastructure. The query below lists all searches that failed to execute within the search time frame, along with the reason each search failed, the user, and the user’s limit (in case it is a user limit rather than a system limit).
index=_internal sourcetype=splunkd component=SHPMaster "Search not executed"
| rex "Search not executed:\s(?<reason>.+?)\.\s+(?<limits>(?:current=(?<current>\d+)\s+maximum=(?<maximum>\d+)|usage=(?<usage>\d+)\s+quota=(?<quota>\d+)\s+user=(?<user>.+?)\.))\s+for each:\s(?<search_user>[^;]+);(?<search_context>[^;]+);(?<search_name>.+)"
| eval limit=coalesce(maximum,quota,0), actual=coalesce(current,usage,0)
| fillnull value=_na_ user
| stats count by search_context, search_name, reason, user, search_user, limit
If anything is confusing or has any issues, please let me know. I will post another batch of tips and tricks soon!