PoshCode PowerShell Code Repository

Select-Random v2.2 by Joel Bennett 6 years ago (modification of post by Joel Bennett)

Select a user-defined number of random elements from the collection … which can be passed as a parameter or input via the pipeline. An improvement over http://www.powershellcentral.com/scripts/60 that lets you select more than one item and also offers the option to collect the pipeline into RAM, gaining speed at the cost of memory use (replaces 81 and 83).

# ---------------------------------------------------------------------------
### <Script>
### <Author>
### Joel "Jaykul" Bennett
### </Author>
### <Description>
### Selects one or more random elements from a collection either passed as a parameter or input via the pipeline.
### If the collection is passed in as an argument, we simply pick a random index between 0 and (collection size)-1
### for each element you want to return, but when processing pipeline input we want to keep memory use
### to a minimum, so we use a "reservoir sampling" algorithm[1].
###
### [1] http://gregable.com/2007/10/reservoir-sampling.html
###
### The script stores $count elements (the eventual result) at all times. It continues processing
### elements until it reaches the end of the input. For each input element $n (the count of the inputs
### so far) there is a $count/$n chance that it becomes part of the result.
### * Each previously selected element had a $count/($n-1) chance of being selected after $n-1 inputs
### * For the ones selected, there's a ($count/$n * 1/$count = 1/$n) chance of it being replaced, so a
###   ($n-1)/$n chance of it remaining ... thus, its cumulative probability of being among the selected
###   elements after the nth input is processed is $count/($n-1) * ($n-1)/$n = $count/$n, as it should be.
###
### </Description>
### <Usage>
### $arr = 1..5; Select-Random -inputObject $arr
### 1..10 | Select-Random -Count 2
### </Usage>
### <Version>2.2.0.0</Version>
### <History>
### <V id="2.0.0.0">Rewrote using the reservoir sampling technique</V>
### <V id="2.1.0.0">Fixed a bug in 2.0 which inverted the probability and resulted in the last n items being selected with VERY high probability</V>
### <V id="2.2.0.0">Use more efficient direct random sampling if the collection is passed as an argument</V>
### </History>
### </Script>
# ---------------------------------------------------------------------------
param([int]$count=1, [switch]$collectionMethod, [array]$inputObject=$null)

BEGIN {
   if ($args -eq '-?') {
@"
Usage: Select-Random [[-Count] <int>] [-collectionMethod] [-inputObject <array>] [-?]

Parameters:
-Count            : The number of elements to select (default 1).
-inputObject      : The collection from which to select random elements (may be piped in instead).
-collectionMethod : Collect the pipeline input in memory instead of using reservoir sampling.
-?                : Display this usage information and exit.

Examples:
PS> $arr = 1..5; Select-Random -inputObject $arr
PS> 1..10 | Select-Random -Count 2

"@
exit
   }
   else
   {
      $rand = new-object Random
      if ($inputObject)
      {
         # Collection passed as an argument: nothing to set up, END samples it directly by index.
      }
      elseif($collectionMethod)
      {
         Write-Verbose "Collecting from the pipeline "
         [Collections.ArrayList]$inputObject = new-object Collections.ArrayList
      }
      else
      {
         # Reservoir mode: $seen counts the inputs so far, $selected holds the current sample.
         $seen = 0
         $selected = new-object object[] $count
      }
   }
}
PROCESS {
   # Compare against $null so that falsy inputs such as 0, $false or '' are not skipped.
   if($null -ne $_)
   {
      if($collectionMethod)
      {
         $inputObject.Add($_) | out-null
      } else {
         $seen++
         ## The first $count inputs fill the reservoir (-le, so the last slot is filled too).
         if($seen -le $count) {
            $selected[$seen-1] = $_
         } ## For each later input element $n there is a $count/$n chance that it becomes part of the result.
         elseif($rand.NextDouble() -lt ($count/$seen))
         {
            ## For the ones previously selected, there's a 1/$n chance of it being replaced
            $selected[$rand.Next(0,$count)] = $_
         }
      }
   }
}
END {
   if (-not $inputObject)
   {  ## Reservoir mode (neither -inputObject nor -collectionMethod): emit the sample.
      Write-Verbose "Selected $count of $seen elements."
      Write-Output $selected
      # foreach($el in $selected) { Write-Output $el }
   }
   else
   {
      ## Direct sampling by random index. Picks are independent, so an element can be returned more than once.
      Write-Verbose ("{0} elements, selecting {1}." -f $inputObject.Count, $Count)
      foreach($i in 1..$Count) {
         Write-Output $inputObject[$rand.Next(0,$inputObject.Count)]
      }
   }
}
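
The reservoir step described in the script's header comment can be isolated into a small standalone function. What follows is only a minimal sketch, not the author's code: the name Get-ReservoirSample and the -Seed parameter are hypothetical and exist purely for illustration. It assumes the input arrives via the pipeline and that returning fewer than -Count items is acceptable when the input is shorter than that.

```powershell
# Hypothetical helper, not part of the posted script: reservoir sampling
# over pipeline input, holding at most $Count items in memory.
function Get-ReservoirSample {
    param(
        [int]$Count = 1,
        [int]$Seed = -1   # assumption: a negative value means "no fixed seed"
    )
    begin {
        if ($Seed -ge 0) { $rand = New-Object System.Random $Seed }
        else             { $rand = New-Object System.Random }
        $seen = 0
        $reservoir = New-Object object[] $Count
    }
    process {
        $seen++
        if ($seen -le $Count) {
            # The first $Count inputs fill the reservoir.
            $reservoir[$seen - 1] = $_
        }
        elseif ($rand.NextDouble() -lt ($Count / $seen)) {
            # Input number $seen is kept with probability $Count/$seen and
            # evicts one uniformly chosen resident.
            $reservoir[$rand.Next(0, $Count)] = $_
        }
    }
    end {
        # Emit only the filled slots when fewer than $Count inputs arrived.
        $filled = [Math]::Min($seen, $Count)
        for ($i = 0; $i -lt $filled; $i++) { $reservoir[$i] }
    }
}

# Draw 3 values from 1..100; the values differ from run to run.
$sample = 1..100 | Get-ReservoirSample -Count 3
```

Because each input is placed into the reservoir at most once, a sample drawn this way never contains the same input twice, unlike independent picks by random index.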
