header
header Register : : Login header
header
divider
menuleft
menuright
submenu
left

[August 25th, 2008] Check the home page regarding PowerShell related news from a brand new sponsor: Idera

how can I use get-content to analyze sql server logfile ?
Last Post 28 Oct 2008 08:18 AM by ranmel. 14 Replies.
Printer Friendly
Sort:
PrevPrev NextNext
You are not authorized to post a reply.
Author Messages
ranmelUser is Offline
New Member
New Member
Posts:7
Avatar

--
23 Oct 2008 11:07 PM  

my problem is that SQL server log file looks like this

2008-07-16 09:51:51.40 spid3 "OK" 
2008-07-16 09:51:52.04 spid3 "OK" 
2008-07-16 09:51:52.04 spid5 "OK" 
2008-07-16 09:51:52.04 spid3 "OK" 
2008-07-16 09:51:52.04 spid5 "ERROR:" 
2008-07-16 09:51:52.04 spid3 "OK" 
2008-07-16 09:51:52.04 spid5 "ERROR:" 
2008-07-17 09:51:52.04 spid5 "ERROR:" 
2008-07-17 09:51:52.04 spid3 "OK" 
2008-07-18 09:51:52.04 spid5 "ERROR:" 
2008-07-18 09:51:52.04 spid5 "OK" 
2008-07-18 09:51:52.04 spid5 "ERROR:" 

 

I want to be be able to search for errors
(see also "ERROR:" only when the date is yesterday or today .
(for this example assume that today is 2008-07-18


I was able to filter the errors but I need to filter only the spexific dates
since I do not want to see old errors


note that in SQL Server the location of dates are always the same .
I assume I need to use the substring with
select-String but I am not sure and I do not know how.

this my code
Get-Content c:\sample_ERRORLOG.txt | Select-String "ERROR:" 


thanks in advance
Ran

halr9000User is Offline
PowerShell MVP, Site Admin
Basic Member
Basic Member
Posts:334
Avatar

--
24 Oct 2008 12:40 AM  
PS > $l = '2008-07-18 09:51:52.04 spid5 "ERROR:"'
PS > $l.Split(" ")
2008-07-18
09:51:52.04
spid5
"ERROR:"
PS > $a = $l.Split(" ")
PS > $a[0]
2008-07-18
PS > $l.Split(" ")[3]
"ERROR:"


Does that help? :) I'll be coy, let us know how it goes.
Community Director, PowerShellCommunity.org
Co-host, PowerScripting Podcast
Author, TechProsaic
halr9000User is Offline
PowerShell MVP, Site Admin
Basic Member
Basic Member
Posts:334
Avatar

--
24 Oct 2008 12:43 AM  
Argh, I forgot about the forum bug preventing [ number ] from formatting correctly. We are working on that...
Community Director, PowerShellCommunity.org
Co-host, PowerScripting Podcast
Author, TechProsaic
ranmelUser is Offline
New Member
New Member
Posts:7
Avatar

--
24 Oct 2008 11:30 AM  
I was looking for a way to this in a singlr line using Select-String "ERROR:" with substring fucntion to check the dates
ranmelUser is Offline
New Member
New Member
Posts:7
Avatar

--
24 Oct 2008 12:09 PM  
this an example how I would like to solve the problem ( any other solution is welcome as well)

it does not work but I think you understand what I mean

Get-Content c:\sample_ERRORLOG.txt | Select-String "ERROR:" | where {$_.substring(1,10) -eq Get-Date -format yyyy-M-d }



ran
ShayUser is Offline
Basic Member
Basic Member
Posts:272
Avatar

--
24 Oct 2008 01:33 PM  
This will break the file into two column objects, it gets only rows that contain errors.

PS > get-content err.txt | where {$_.contains("ERROR:")} | select @{n="Date";e={[datetime]$_.substring(0,19)}},
@{n="SPID";e={$_.substring(23,5)}}

Date SPID
---- ----
7/16/2008 09:51:52 spid5
7/16/2008 09:51:52 spid5
7/17/2008 09:51:52 spid5
7/18/2008 09:51:52 spid5
7/18/2008 09:51:52 spid5
Shay Levy
Windows PowerShell MVP
http://blogs.microsoft.co.il/blogs/ScriptFanatic
ranmelUser is Offline
New Member
New Member
Posts:7
Avatar

--
24 Oct 2008 07:52 PM  
first of all thanks, this is the kind of solution I was looking for .
but still I need to be able to filter the lines contains errors (see also :{$_.contains("ERROR:")} )
with another filter.
the last filter I need is to make sure that the lines contains error were created at the last 2 days.
since I do not want to see old errors.


thanks agains

Ran

ShayUser is Offline
Basic Member
Basic Member
Posts:272
Avatar

--
25 Oct 2008 12:34 AM  

You can add a second where-object (at the end of the pipe) to filter just the 2 days old:

PS > $old = (get-date).addDays(-2)
PS > get-content err.txt | where {$_.contains("ERROR:")} | select @{n="Date";e={[datetime]$_.substring(0,19)}},
@{n="SPID";e={$_.substring(23,5)}} | where { $_.date -gt $old }


Or filter them on the fly:

PS > $old = (get-date).addDays(-2)
PS > get-content err.txt | where {$_.contains("ERROR:") -and ([datetime]$_.substring(0,19) -gt $old) } | PS select @{n="Date";e={[datetime]$_.substring(0,19)}},@{n="SPID";e={$_.substring(23,5)}}

Shay Levy
Windows PowerShell MVP
http://blogs.microsoft.co.il/blogs/ScriptFanatic
ShayUser is Offline
Basic Member
Basic Member
Posts:272
Avatar

--
25 Oct 2008 12:49 AM  
BTW, my spid column parsing is not perfect, spid's like 'spid51' will show like 'spid5', this will fix it:

@{n="SPID";e={$_.substring(23,$_.lastIndexOf(" ")-23)}}
Shay Levy
Windows PowerShell MVP
http://blogs.microsoft.co.il/blogs/ScriptFanatic
glnsizeUser is Offline
Basic Member
Basic Member
Posts:101

--
25 Oct 2008 03:05 AM  

hmmm how about a one-liner...

[regex]::matches((get-content err.txt), '(\d{4}\-\d{2}\-\d{2})\s(\S+)\s(\S+)\s\"ERROR:"') | `
?{[datetime]$_.groups[ 1 ].value -gt ((get-date).addDays(-2))} | select value

 ~Glenn

 

ranmelUser is Offline
New Member
New Member
Posts:7
Avatar

--
25 Oct 2008 04:40 PM  
first of all it works .
but I have 2 questions and 1 problem.

[regex]::matches((get-content c:\sample_ERRORLOG.txt), '(\d{4}\-\d{2}\-\d{2})\s(\S+)\s(\S+)\s\"ERROR:"') | `?{[datetime]$_.groups[ 1 ].value -gt ((get-date).addDays(-2))} | select value

how it works it looks very complicated
can you please explain me the regular expression and the second part that deals with comparing dates.

and I have problem .
if i had something to the lines with errors

for example instead of
2008-07-16 09:51:52.04 spid5 "ERROR:"
I write
2008-07-16 09:51:52.04 spid5 "ERROR:" SQL server failed

I still want to see all the line but when I use the code you you send me I keep getting only the "ERROR:" without the
the rest of the line .

and one last question
can I use the same technique without regular expression ?


thanks in advance

Ran
glnsizeUser is Offline
Basic Member
Basic Member
Posts:101

--
25 Oct 2008 06:16 PM  
np... I usually explain any code I post, but the last couple of times it was taken incorrectly. The ONE TIME I don't... j/k

[regex]::matches((get-content err.txt), '(\d{4}\-\d{2}\-\d{2})\s(\S+)\s(\S+)\s\"ERROR.+') | ` 
?{[datetime]$_.groups[ 1 ].value -gt ((get-date).addDays(-2))} | select value

So how does it work? 

[regex]::Matches() uses the [regex] accelerator to call the Matches() method.

Matches() takes two args a string[], and a regular expression. The regular expression in this case:  '(\d{4}\-\d{2}\-\d{2})\s(\S+)\s(\S+)\s\"ERROR*+')

A quick bit of background info.
\d = digit, \s = white space, \S = non-white space, \ = escape, + = greedy 

Greedy is a not so fancy way of saying match as much as possible. When dealing with regular expressions they are processed one character at a time right to left. Therefore you have to account for every character that could be in a string you want to match. i.e. if you know a digit should be first \d, one or two digits \d{1,2}, or as with our example four digits \d{4}...

Knowing that our regex reads

 Match (4 digits - 2 digits - 2 digits) white space (any non-whitespace ) white space (any non-whitespace ) white space and finally the exact text "ERROR followed by anything

That how we turn '2008-07-16 09:51:52.04 spid5 "ERROR:"' into:
2008-10-16 09:51:52.04 spid5 "ERROR:"  #group 0
2008-10-16                                                       #group 1
09:51:52.04                                                       #group 2
spid5                                                                  #group 3

placing portions of a regex within () will denote a group. Hence the groups [0..3] .   Any line that matches our REGEX will be passed down the pipeline, anything that doesn’t match is dropped. For me this is why I use REGEX when possible/practical. Its simple physics the less data you pass down the pipeline the faster your command will run. If performance isn
t an issue i.e. if your script runs fine doesnt worry about it, but if you’re trying to speed up a slow script that parses text REGEX is your man.

Okay, so how does the date stuff work? [datetime] is the .net accelerator for system.datetime by casting our group 1 into system.datetime i.e.

 [datetime]$_.groups[ 1 ].value

We convert the String object "2008-10-16" into a datetime object set to Oct 16th 2008 12:00:00 AM.  Why is that important?

Well PowerShell does ALOT of really cool translation for us. Just the fact that it cast like that is frigin' awesome, but it also very useful. .Net is very strict language meaning only [objx] can interact with [objx], or [datetime] to [datetime].  PowerShell translates all of this on the fly to make our code work. In our case PowerShell translates "2008-10-16" into a datetime object, and then compares that against get-date minus 2 days. (get-date with no params is the same as [datetime]::now ) I believe it accomplishes this by creating a time span and just performing a bool, but I'm not positive...

Any object that matches our time span. Will return true to where-Object, and is passed down the pipeline. Finally, since we never really manipulated the original data. We pipe to (select-object value) to clean up the output.

I know it looks complicated but that's what happens when you compress 15 readable lines of code into one. I hope that took away the veil of complexity ... It's actually Very simple!

Sorry for the confusion,
~Glenn
ranmelUser is Offline
New Member
New Member
Posts:7
Avatar

--
25 Oct 2008 07:31 PM  
well it looks very cool and I think I got it ,
I am impressed with the power of the RegExp that actually creates groups.

the only problem I have is that if I have more data after the "ERROR:"
I can not see it even when piping "select value"

for example instead of
2008-07-16 09:51:52.04 spid5 "ERROR:"
I write
2008-07-16 09:51:52.04 spid5 "ERROR:" SQL server failed
and I want to see the all line .


I belive this have to do with the RegExp but I am not sure.

if you could solve this problem then we're done.

by the way you can learn a lot from these tricks ,
I know I can do it in 15 lines code but your technique open my eyes to new possibilities.

Ran
glnsizeUser is Offline
Basic Member
Basic Member
Posts:101

--
26 Oct 2008 12:52 AM  

ahhh... missed that in your previous post.  Well now that kinda changes everything.  Get-content reads the whole file, into an array. Normally this isn't a problem.  However in this case it is, because GC strips off the line breaks \r\n.  Without those line breaks we can't do a blind match. Therefor my answer is null and void, well kinda...  First your question.

I went to re-write this using foreach and -match, and then /\/\o\/\/ popped in my goggle reader.  It seams your question caught the attention of the Powershell guy himself!  http://thepowershellguy.com/blogs/posh/archive/2008/10/24/hey-powershell-guy-how-can-i-use-get-content-to-analyze-sql-server-logfile.aspx 

Back to me... the problem is that [regex]::Match() matches against the whole string, and when you access the information from get-content without the pipeline. It reads the whole file in only this time there are no line breaks.  See the problem here... the only way to perform a blind match in regex is with "." DOT.  DOT essentially says match the next anything I don't care(with the only exception being line breaks).  Therefore .+ DOT.PLUS will match anything to the end until it encounters a line break. 

Why didn't I catch that before? That was due to an oversight on my part.  When I tested that one liner I never used an actual file. Instead I simulated one...

$log = @"
2008-07-16 09:51:51.40 spid3 "OK" 
2008-07-16 09:51:52.04 spid3 "OK" 
2008-07-16 09:51:52.04 spid5 "OK" 
2008-07-16 09:51:52.04 spid3 "OK" 
2008-07-16 09:51:52.04 spid5 "ERROR:" 
2008-07-16 09:51:52.04 spid3 "OK" 
2008-07-16 09:51:52.04 spid5 "ERROR:" 
2008-07-17 09:51:52.04 spid5 "ERROR:" 
2008-07-17 09:51:52.04 spid3 "OK" 
2008-07-18 09:51:52.04 spid5 "ERROR:" 
2008-07-18 09:51:52.04 spid5 "OK" 
2008-07-18 09:51:52.04 spid5 "ERROR:" 
"@

 [regex]::matches($log, '(\d{4}\-\d{2}\-\d{2})\s(\S+)\s(\S+)\s\"ERROR.+') | ` 
?{[datetime]$_.groups[ 1 ].value -gt ((get-date).addDays(-2))} | select value

The key there, LINE BREAKS!  Now if you absolutely wanted/needed to you could reinsert those line breaks with the following.
 $log = [string]::join([environment]::NewLine, (get-content err.txt))

Thanks for such an interesting question... had fun playing with that. 

Now let me go type all this in my blog :)
~Glenn


UPDATE: I just re-read my post, and want to clarify something.  The problem I ran into was in my implementation not regex itself.  whether you use -match, [regex]::match(), etc they all use the same subsystem.  They just maintain different use cases.

ranmelUser is Offline
New Member
New Member
Posts:7
Avatar

--
28 Oct 2008 08:18 AM  
sorry for the delay.

thank you and to all the other guys on the forum.

I will look into the final solution you will write in your blog (the small script you are planning to write) .

Ran
You are not authorized to post a reply.

Active Forums 4.1
right
   
footer Sponsored by Quest Software • SAPIEN Technologies • ShellTools, LLC • Microsoft Windows Server 2008 footer
footer