header
header Register : : Login header
header
divider
menuleft
menuright
submenu
left

[August 25th, 2008] Check the home page regarding PowerShell related news from a brand new sponsor: Idera

Comparing CSV
Last Post 10 Dec 2008 09:04 PM by doubleplay1. 8 Replies.
Printer Friendly
Sort:
PrevPrev NextNext
You are not authorized to post a reply.
Author Messages
doubleplay1User is Offline
New Member
New Member
Posts:4
Avatar

--
13 Nov 2008 06:45 AM  

Hi,  I am fairly new to Powershell, but I am loving it. I am stuck at the end of my little script and am having problems comparing/parsing through two csv files and exporting the difference in the same format. 

Ok, so below you will find so far what kind of compare string has been put together so far with the help of other powershell users.

$Master = import-csv c:\scripts\master.csv
$New = import-csv c:\scripts\new.csv
Compare-Object $Master $New|
Where {$._SideIndicator -eq '=>'}|
ForEach-Object {$_.InputObject}|
Export-Csv c:\scripts\finalnew.csv

Basically, I want it to read each line in the master csv (grabbing all the information from each line) and parse through every line on the new.csv file to make sure none of that information is already on the master.  e.g : If line 1 is on master and also on new, do not include it on the finalnew.csv or viceversa; whatever can make the script better would be cool. I basically will be using this as a master list for approved users and new as the new users and make sure that the difference is the output.  The odd part is the compare-object seems to compare somewhat and output a file, but it is almost a hit and miss, it is doing the task that I want it to do.  Again, this might be something simple and it is way over my head at this moment.



I will also post below the code that I am using to export the information from AD and to create the new.csv file I reference above.
Thanks for all your help and Any help/suggestions are welcomed and appreciated!

Get-QADUser -enabled -sizelimit 0 |
Select-Object SamAccountName,@{n="LastLogonTimeStamp";e={$_.LastLogonTimeStamp.value}},WhenCreated,PasswordNeverExpires,AccountExpires,Description|
Sort-Object LastLogonTimeStamp |
Export-Csv c:\scripts\new.csv

smurawskiUser is Offline
New Member
New Member
Posts:46

--
18 Nov 2008 01:09 PM  
A possible issue with the "hit and miss" nature of your results might be the sync window. Compare-Object has a parameter called syncWindow, which by default is 5. This means the comparision will look in a five objects range for a match. Increasing this size will allow it to search further into your collection, but also increases the processing time (since it looks through a greater number of objects).
In your case, I would try increasing the syncWindow and see if your results get a bit better.
Steven Murawski
Co-Host - Mind of Root (www.mindofroot.com)
Host - PowerShell Basics (powershell-basics.com)
PoshoholicUser is Online
PowerShell MVP
New Member
New Member
Posts:38
Avatar

--
18 Nov 2008 03:16 PM  

Something tells me Compare-Object isn't going to do the trick for you here, but since this is PowerShell, there are a lot of options available. You could do this:

$Master = import-csv c:\scripts\master.csv
$New = import-csv c:\scripts\new.csv
$Shared = @(($Master + $New) | Group-Object -Property Name | Where-Object {$_.Count -eq 2} | ForEach-Object {$_.Group[ 0 ]})

At this point, $Shared will contain an array of any items with the same 'Name' property that are in both $Master and $New. If you only want to identify the actual names and not the whole objects (which I suspect is what you're after since you're talking about new users), change that last line to this:

$Shared = @(($Master + $New) | Group-Object -Property Name | Where-Object {$_.Count -eq 2} | ForEach-Object {$_.Group[ 0 ].Name})

Now you have an array of strings where each string is the 'Name' property of an item (a user in your case) in new.csv that is already in master.csv. Removing the duplicates using this list is then very easy.

$New | Where-Object {$Shared -notcontains $_.Name} | Export-Csv c:\scripts\finalnew.csv

Let us know how that works for you.

--
Kirk Munro [MVP]
Poshoholic
http://poshoholic.com

doubleplay1User is Offline
New Member
New Member
Posts:4
Avatar

--
18 Nov 2008 04:13 PM  
Ok guys, first off thanks for your replies!
smurawski : the sync window still did not resolve the issue, I tried a real world scenario and it still missed.
Posh: I tried yours with the slight difference of changing .name to .SamAccountName and it didnt work.
Please let me know if this is ok to do since I am not exporting Name , rather SamAccountName.

I tried it and the final is exactly the same file as the new . I did this by using UltraCompare and seeing there are no differences whatsoever.

Any ideas why it is not even doing any comparing at all?

THanks!
smurawskiUser is Offline
New Member
New Member
Posts:46

--
18 Nov 2008 04:24 PM  

I've run into a similiar problem with Compare-Object when trying to compare the content of DataRows being returned from a database.

What I did there was break the objects down into objects that represented each property and compared those using Compare-Object.

You could try
 


$Master = import-csv c:\scripts\master.csv | foreach { $_.psobject.properties} | select name, value
$New = import-csv c:\scripts\new.csv | foreach { $_.psobject.properties} | select name, value
Compare-Object $Master $New|
Where {$._SideIndicator -eq '=>'}|
ForEach-Object {$_.InputObject}|
Export-Csv c:\scripts\finalnew.csv



That will give you any of the properties (like SamAccountName etc..) that do not exist in the Master.csv file that are present in the new.csv

You will have to rebuild your objects (the export-csv at the end will have a different format ) if you want the same csv output.  I'll try to work that up on my lunch break.

Steven Murawski
Co-Host - Mind of Root (www.mindofroot.com)
Host - PowerShell Basics (powershell-basics.com)
doubleplay1User is Offline
New Member
New Member
Posts:4
Avatar

--
18 Nov 2008 08:53 PM  
Smurawski : Man you lost me, I am gonna take a look at this after work :p
I am fairly new and rebuilding my objects is a little over my head right now, but I do appreciate it .

Now on another note, am I the only that thinks such a useful process should be pretty much out of the package from powershell? Its a common task to parse and compare, i mean csv's, and yet I have been dealing with more than 8 people and a true powershell solution still has not been resolved :)

I will definitely post a full powershell solution when I find one :)

Thanks again!


PoshoholicUser is Online
PowerShell MVP
New Member
New Member
Posts:38
Avatar

--
19 Nov 2008 05:20 AM  
So with SamAccountName, your script looks like this then?

$Master = import-csv c:\scripts\master.csv
$New = import-csv c:\scripts\new.csv
$Shared = @(($Master + $New) | Group-Object -Property SamAccountName | Where-Object {$_.Count -eq 2} | ForEach-Object {$_.Group[ 0 ].SamAccountName})
$New | Where-Object {$Shared -notcontains $_.SamAccountName} | Export-Csv c:\scripts\finalnew.csv

I'm just verifying that you made the three required changes: one in the Group-Object cmdlet, one in the ForEach-Object cmdlet, and one in the Where-Object cmdlet. Please confirm, because this should work.

Also, two more tips to help you iron this out.
1. Download and use PowerGUI (http://powergui.org). It comes with a free debugger so that you can really figure out what is going wrong by stepping through your script.
2. Insert the following statement before the last line and comment out the last line:
$Shared
That will output the SamAccountNames that this script thinks are shared between the two files, if any.

--
Kirk Munro [MVP]
Poshoholic
http://poshoholic.com
PoshoholicUser is Online
PowerShell MVP
New Member
New Member
Posts:38
Avatar

--
19 Nov 2008 05:57 AM  
Be careful with Compare-Object because the name of that cmdlet makes it sound grand, but the default behaviour is nothing more than a simple string comparison that doesn't result in what you want in many if not most cases.

The best way to illustrate this is with an example. Take a look at this transcript where I try to compare to process objects:

[198] PS↑ C:\> $p1 = get-process -Id 820
[199] PS↑ C:\> $p2 = get-process -Id 5584
[200] PS↑ C:\> $p1

Handles NPM(K) PM(K) WS(K) VM(M) CPU(s) Id ProcessName
------- ------ ----- ----- ----- ------ -- -----------
185 12 9952 23108 95 8,52 820 notepad


[201] PS↑ C:\> $p2

Handles NPM(K) PM(K) WS(K) VM(M) CPU(s) Id ProcessName
------- ------ ----- ----- ----- ------ -- -----------
66 6 3216 6796 66 3,46 5584 notepad


[202] PS↑ C:\> Compare-Object $p1 $p2 -IncludeEqual

InputObject SideIndicator
----------- -------------
System.Diagnostics.Process (notepad) ==


[203] PS↑ C:\> $p1.ToString()
System.Diagnostics.Process (notepad)
[204] PS↑ C:\> $p2.ToString()
System.Diagnostics.Process (notepad)
[205] PS↑ C:\> Compare-Object $p1 $p2 -Property Id,ProcessName -IncludeEqual

Id ProcessName SideIndicator
-- ----------- -------------
5584 notepad =>
820 notepad <=

First I retrieve two processes by their id, one at a time. Both processes are notepad.exe. If I were to compare these objects using Compare-Object, I would expect them to be different because they are two different processes. Yet if I use Compare-Object with no other parameters to instruct the cmdlet about how I want to do the comparison, the processes are treated as equal. Why? Because if you call ToString() on each of the processes, you get the same string back. Since that matches, the processes are considered equal. So what do you do to change this? Use the -Property parameter. With -Property, you can specify which property you want to use in the comparison. In your case, you want to use SamAccountName. For the process example, I specify two properties, name and id. Now suddenly Compare-Object properly recognizes that the two processes are not the same at all.

One more example to make sure this is clear. What if you wanted to find any Windows services whose state has changed between two points in time. Here's a transcript to show what that might look like:

[216] PS↑ C:\> $s1 = Get-Service
[217] PS↑ C:\> Stop-Service wuauserv
[218] PS↑ C:\> $s2 = Get-Service
[219] PS↑ C:\> Compare-Object $s1 $s2
[220] PS↑ C:\> Compare-Object $s1 $s2 -Property Name,Status

Name Status SideIndicator
---- ------ -------------
wuauserv Stopped =>
wuauserv Running <=

First I get the service data. Then, after stopping the Windows Update service, I refetch the service data. Finally, I compare the two using Compare-Object with the default settings. This returns nothing, indicating that the two objects are equal. But that's not the case. It does this because the ToString() method on the service objects returns the same string for every service: 'System.ServiceProcess.ServiceController'. So it's like I compare two arrays the exact same size where every value is the exact same. Not what I want. But if I use -Property and specify which properties I care about (Name and Status), I get exactly the information I was looking for.

Now that I've said all that, I'm going to backpedal on what I said in my first reply to this post. Compare-Object can indeed do the comparisons you want it to do. You just need to know how to tell it to do those comparisons. I've never paid much attention to Compare-Object because I could do what I needed to without it. After thinking about this some more, and then digging in to the details, I'll start using it a little more often in my scripts. So it's up to you if you want to try to use Compare-Object and tinker with the SyncWindow parameter if you don't quite get what you need, or you can do your comparison without it like I've shown here.

--
Kirk Munro [MVP]
Poshoholic
http://poshoholic.com
doubleplay1User is Offline
New Member
New Member
Posts:4
Avatar

--
10 Dec 2008 09:04 PM  
Kirk, I am sorry for the late reply but I was away for a while .
After so many tries to correctly compare via powershell, it was suggested to go with Logparser after gathering the data ; This did the job perfectly and extremely fast!
So basically the data gathering was fine, and the compare was done with logparser.
I still will try and get a final one via powershell, and If I do I will post it.
Thanks again and email me if you find a way or have any comments!
You are not authorized to post a reply.

Active Forums 4.1
right
   
footer Sponsored by Quest Software • SAPIEN Technologies • ShellTools, LLC • Microsoft Windows Server 2008 footer
footer