header
header Register : : Login header
header
divider
menuleft
menuright
submenu
left

[August 25th, 2008] Check the home page regarding PowerShell related news from a brand new sponsor: Idera

Help parsing text file
Last Post 05 Jun 2008 11:59 PM by NathanW. 8 Replies.
Printer Friendly
Sort:
PrevPrev NextNext
You are not authorized to post a reply.
Author Messages
halr9000User is Offline
Basic Member
Basic Member
Posts:303

--
02 Jun 2008 02:31 PM  

(Reposting due to an error with the forums which i'm tshooting.  Original post from NathanW.)

I was wondering if someone could help me find a way to parse a text file
that I have.  I use this file every month to do a report and the program only
outputs plain txt format, I would like to be able to parse the file and do
some operations like sum and max on the data.  Below is a cut down version of
the file:

#FILE#
Datasets:
Site: {Site}
Direction: 1 - North bound, A hit first., Lane: 0
Survey Duration: 10:30 Wednesday, 12 June 2002 => 12:08 Thursday, 25 July 2002
File: {File Path}
Identifier: 
Algorithm: 
Data type: 

Profile:
Filter time: 10:30 Wednesday, 12 June 2002 => 12:08 Thursday, 25 July 2002
Included classes: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
Speed range: 10 - 160 km/h.
Direction: North, East, South, West (bound)
Separation: All - (Headway)
Name: Factory default profile
Scheme:
Units: Metric (meter, kilometer, m/s, km/h, kg, tonne)

Column Legend:
 0  [Time]  24-hour time (0000 - 2359)
 1  [Total]  Number in time step
 2  [Vbin]  Speed bin totals
 3  [Mean]  Average speed
 4  [Vpp]  Percentile speed

 

* Wednesday, 12 June 2002
 Time  Total  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Mean   Vpp
                10    20    30    40    50    60    70    80    90   100   110   120   130   140   150          85
                20    30    40    50    60    70    80    90   100   110   120   130   140   150   160            
 1030      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1045      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1100      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1115      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1130      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1145      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1200      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1215      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1230      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1245      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1300      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1315      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1330      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1345      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1400      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -


* Thursday, 13 June 2002
 Time  Total  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Mean   Vpp
                10    20    30    40    50    60    70    80    90   100   110   120   130   140   150                                    85
                20    30    40    50    60    70    80    90   100   110   120   130   140   150   160            
 0000      3     0     0     0     0     1     2     0     0     0     0     0     0     0     0     0                                               59.2     -
 0015      1     0     0     0     0     1     0     0     0     0     0     0     0     0     0     0                                               55.6     -
 0030      1     0     0     0     0     0     1     0     0     0     0     0     0     0     0     0                                               60.2     -
 0045      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0                                                 -     -
 0100      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0                                                -     -
 0115      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0  
 0130      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0   
 0145      1     0     0     0     0     1     0     0     0     0     0     0     0     0     0     0                                                59.4     -
 0200      1     0     0     0     0     1     0     0     0     0     0     0     0     0     0     0                                                52.1     -
 0215      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0    
0230      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0   
 0245      2     0     1     0     0     1     0     0     0     0     0     0     0     0     0     0                                                39.9     -
 0300      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0    
 0315      1     0     0     0     0     0     1     0     0     0     0     0     0     0     0     0                                                63.8     -
 0330      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0    
 0345      1     0     0     0     0     1     0     0     0     0     0     0     0     0     0     0                                                52.0     -
 0400      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0    
 0415      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0  
 0430      1     0     1     0     0     0     0     0     0     0     0     0     0     0     0     0                                                28.0     -
 0445      3     0     0     0     1     0     1     1     0     0     0     0     0     0     0     0                                                64.1     -
 0500      4     0     0     0     1     2     1     0     0     0     0     0     0     0     0     0                                                54.2     -
 0515      7     0     0     0     1     3     3     0     0     0     0     0     0     0     0     0                                                56.9     -
 0530      4     0     0     0     1     1     2     0     0     0     0     0     0     0     0     0                                                57.3     -
 0545      8     0     0     1     1     3     3     0     0     0     0     0     0     0     0     0                                                55.5     -
 0600      8     0     0     0     2     2     2     2     0     0     0     0     0     0     0     0                                                61.8     -
 0615      4     0     0     0     1     2     0     1     0     0     0     0     0     0     0     0                                                55.5     -
 0630     13     0     0     0     4     5     3     1     0     0     0     0     0     0     0     0                                              55.6  60.1
 0645     11     0     0     0     0     5     3     3     0     0     0     0     0     0     0     0                                              62.9  70.2

#END FILE#

The data that I really need is all the stuff in between the dates, I would really like to be able to make an object with properties for each column heading so that I can sum and average the data.

I have attached a more complete file to the message

Thanks
Nathan

ShayUser is Offline
Basic Member
Basic Member
Posts:214

--
03 Jun 2008 01:38 PM  
How do you want the end result to look? Can you post a sample?

-Shay
bsonposhUser is Offline
Basic Member
Basic Member
Posts:388

--
03 Jun 2008 01:55 PM  
I think a simple solution would something like

get-content $file | ?{$_ -match "^\s*\d\d\d\d"} | %{$_.Split(" ", [System.StringSplitOptions]::RemoveEmptyEntries)}
dpendergUser is Offline
New Member
New Member
Posts:12

--
04 Jun 2008 01:11 AM  

Nice - very clean.  Loops through the rows until it finds a data match - then splits the matched line into invdividual objects.  So how would you then load those objects into an array for processing?

On a side note, I cut and pasted the whole thing into excel and used the excel text-to-columns function and the data mapped in very nicely.  So I was off in Export-csv land trying to write the text file to excle and use an excel function to process the data.  Unfortunately that doesn't seem much easier.

bsonposhUser is Offline
Basic Member
Basic Member
Posts:388

--
04 Jun 2008 01:23 AM  
They are already arrays, but if you mean arrays as a whole

$myobjCol = @()
foreach($item in (get-content $file | ?{$_ -match "^\s*\d\d\d\d"}))
{
   $myobj = $item | %{$_.Split(" ", [System.StringSplitOptions]::RemoveEmptyEntries)}
   $myobjCol += $myobj
}
$myobjCol
bsonposhUser is Offline
Basic Member
Basic Member
Posts:388

--
04 Jun 2008 01:53 AM  
I like this one better :)


$myobj = @{}
foreach($item in (get-content $file | ?{$_ -match "^\s*\d\d\d\d"}))
{
   $name = $item.Split() | ?{$_ -match "\d\d\d\d"}
   $myobj."$name" = $item | %{$_.Split(" ", [System.StringSplitOptions]::RemoveEmptyEntries)}
}
$myobj
bsonposhUser is Offline
Basic Member
Basic Member
Posts:388

--
04 Jun 2008 03:05 AM  
Sorry... couldnt let go :)

Here is some code that will make a custom object collection and fill it with child objects.

The child object has two properties Date and Data.

Date = The Date line starting with *
Data = Custom object with two properties: Time, Vals

Have fun and let me know how it works for you.


$file = "<your file here>"

# Master object to store custom object collection
$myobjcol = @()

switch -regex -file $file
{
    "^\*"           {
                        # Matching Date and adding it as Date Value to a custom object
                        $myobj = "" | Select-Object Date,Data
                        $myobj.Date = $_
                        $myobj.Data = @()
                        continue
                    }
    "^\s*\d\d\d\d"  {
                        # Matching Data String and adding to a data custom object
                        $dataObj = "" | Select-Object Time,Vals
                        $dataObj.Time = $_.Split() | ?{$_ -match "\d\d\d\d"}
                        $dataObj.Vals = $_.Split(" ", [System.StringSplitOptions]::RemoveEmptyEntries)
                        $myobj.Data += $dataObj
                        continue
                    }
    default         {
                        # checking if Data has value and commiting to Master collection if value is found
                        if($myobj.Data)
                        {
                            Write-Host "Saving Object"
                            $myobjCol += $myobj
                            $myobj = $null
                        }
                        continue
                    }
}

# outputting Master Object
$myobjcol
JaykulUser is Offline
New Member
New Member
Posts:31

--
05 Jun 2008 03:46 AM  
Hey, I'm not going to re-solve the whole thing, 'cause it looks like you've got it sorted, but ... y'all really gotta start using stuff from the Script Repository :) There's an Import-Delimited script on there that can import the "table" data and create custom objects ... http://powershellcentral.com/scripts/195 ... of course, you'd still have to sort out just the table part.
NathanWUser is Offline
New Member
New Member
Posts:3

--
05 Jun 2008 11:59 PM  

 

Hi All,

Thanks for all the help.  I have had a bit of a play around with some of the ideas.

I found that at the bottum of each of the sections there is a total section which looks like this:

07-19    130     0     1     5    30    61    28     5     0     0     0     0     0     0     0     0  54.8  64.4
06-22    141     0     1     5    32    68    30     5     0     0     0     0     0     0     0     0  54.9  64.4
06-00    142     0     1     5    32    69    30     5     0     0     0     0     0     0     0     0  54.9  64.4
00-00    142     0     1     5    32    69    30     5     0     0     0     0     0     0     0     0  54.9  64.4

So each section would look like this:

Time  Total  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Vbin  Mean   Vpp
                10    20    30    40    50    60    70    80    90   100   110   120   130   140   150          85
                20    30    40    50    60    70    80    90   100   110   120   130   140   150   160            
 1030      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1045      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1100      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1115      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1130      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1145      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1200      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1215      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
 1230      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     -     -
07-19    130     0     1     5    30    61    28     5     0     0     0     0     0     0     0     0  54.8  64.4
06-22    141     0     1     5    32    68    30     5     0     0     0     0     0     0     0     0  54.9  64.4
06-00    142     0     1     5    32    69    30     5     0     0     0     0     0     0     0     0  54.9  64.4
00-00    142     0     1     5    32    69    30     5     0     0     0     0     0     0     0     0  54.9  64.4

I wasn't 100% sure if it was a total at the start so I didn't worry about reading it but then I asked the people who made the program and they said that it is a total of the above data.  So Insteed of having to read all the data between the two dates I could just read the summry for each section.
I have come up with the following code to parse the file and grab the totals for each section and then calc an average on the last column(Vpp):

$Summery_Found

= $FALSE
cd
C:\Temp
$list = @()
[
regex]::split($(${C:SpeedData.txt} -join "`n"), "`n") |
% {
     #Match the pattern for the date
     if($_ -match '^\*\s*\w+,\s*\d{1,2}\s*\w+\s*\d{4}$'){
          $Record = "" | Select Date,Time,Data
          $Record.Date = $_
          #Set the flag to false to look for the summury info.
          $Summery_Found = $FALSE
        }
      #Match the pattern for the summery of the data section.
      if($_ -match "[0-9]+[0-9]+\-+[0-9]+[0-9]"){
         if(-not $Summery_Found){
            $values = [regex]::split($_, "\s+")
            $Record.Time = $values[0]
            $Record.Data = $values[18]
            $Summery_Found = $TRUE
            $list += $Record}
           }
}
| Format-Table 
#Print the Average
$ave
= $list | ? {$_.Data -ne "-"} | Select @{Name="Filtered";Expression={[double]$_.Data}} | Measure-Object Filtered -Average | Select Average
"85th Average:" + $ave.average

#Print all the info to the console
$list

I was wondering was if there was a better way of doing the above code.  It seems to work well.

Thanks
Nathan.

You are not authorized to post a reply.

Active Forums 4.1
right
   
footer Sponsored by Quest Software • SAPIEN Technologies • ShellTools, LLC • Microsoft Windows Server 2008 footer
footer