
Get-FileEncoding.ps1 by Lee Holmes 4 years ago

From Windows PowerShell Cookbook (O’Reilly) by Lee Holmes

##############################################################################
##
## Get-FileEncoding
##
## From Windows PowerShell Cookbook (O'Reilly)
## by Lee Holmes (http://www.leeholmes.com/guide)
##
##############################################################################

<#

.SYNOPSIS

Gets the encoding of a file

.EXAMPLE

Get-FileEncoding.ps1 .\UnicodeScript.ps1

BodyName          : unicodeFFFE
EncodingName      : Unicode (Big-Endian)
HeaderName        : unicodeFFFE
WebName           : unicodeFFFE
WindowsCodePage   : 1200
IsBrowserDisplay  : False
IsBrowserSave     : False
IsMailNewsDisplay : False
IsMailNewsSave    : False
IsSingleByte      : False
EncoderFallback   : System.Text.EncoderReplacementFallback
DecoderFallback   : System.Text.DecoderReplacementFallback
IsReadOnly        : True
CodePage          : 1201

#>

param(
    ## The path of the file to get the encoding of.
    $Path
)

Set-StrictMode -Version Latest

## The hashtable used to store our mapping of encoding bytes to their
## name. For example, "255-254 = Unicode"
$encodings = @{}

## Find all of the encodings understood by the .NET Framework. For each,
## determine the bytes at the start of the file (the preamble) that the .NET
## Framework uses to identify that encoding.
$encodingMembers = [System.Text.Encoding] |
    Get-Member -Static -MemberType Property

$encodingMembers | ForEach-Object {
    $encodingBytes = [System.Text.Encoding]::($_.Name).GetPreamble() -join '-'
    $encodings[$encodingBytes] = $_.Name
}

## Find out the lengths of all of the preambles. Encodings without a
## preamble produce an empty key, which Where-Object filters out.
$encodingLengths = $encodings.Keys | Where-Object { $_ } |
    ForEach-Object { ($_ -split "-").Count }

## Assume the encoding is UTF7 by default
$result = "UTF7"

## Go through each of the possible preamble lengths, read that many
## bytes from the file, and then see if it matches one of the encodings
## we know about. Longer preambles are tried first.
## Note: "-Encoding Byte" is Windows PowerShell syntax; PowerShell 6 and
## later use the -AsByteStream switch instead.
foreach($encodingLength in $encodingLengths | Sort-Object -Descending)
{
    $bytes = (Get-Content -Encoding Byte -ReadCount $encodingLength $Path)[0]
    $encoding = $encodings[$bytes -join '-']

    ## If we found an encoding that had the same preamble bytes,
    ## save that output and break.
    if($encoding)
    {
        $result = $encoding
        break
    }
}

## Finally, output the encoding.
[System.Text.Encoding]::$result
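
The preamble-matching idea in the script can also be sketched without Get-Content, which may help on PowerShell versions where "-Encoding Byte" is unavailable. The function below is a hypothetical illustration and not part of the Cookbook: the name Get-PreambleEncodingName, the fixed list of four encodings, and the choice to return $null when no preamble matches (rather than defaulting to UTF7) are all assumptions made for this sketch.

```powershell
## Hypothetical helper (not from the Cookbook): compare a file's leading
## bytes against the preambles of a few well-known .NET encodings.
function Get-PreambleEncodingName
{
    param(
        [Parameter(Mandatory = $true)]
        [string] $Path
    )

    ## Read at most four bytes, the longest preamble checked here (UTF-32).
    ## Resolve the path first, because .NET does not track the PowerShell
    ## current location.
    $buffer = New-Object byte[] 4
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $stream = [System.IO.File]::OpenRead($fullPath)
    try { $read = $stream.Read($buffer, 0, $buffer.Length) }
    finally { $stream.Dispose() }

    ## Check longer preambles first: the UTF-32 preamble (FF FE 00 00)
    ## begins with the UTF-16 little-endian preamble (FF FE).
    foreach($name in 'UTF32', 'UTF8', 'Unicode', 'BigEndianUnicode')
    {
        $preamble = ([System.Text.Encoding]::$name).GetPreamble()
        if($read -lt $preamble.Length) { continue }

        $head = $buffer[0..($preamble.Length - 1)]
        if(($head -join '-') -eq ($preamble -join '-')) { return $name }
    }

    ## No recognized preamble was found.
    return $null
}
```

Assuming the function has been defined in the current session, it could be called the same way as the script:

```powershell
Get-PreambleEncodingName .\UnicodeScript.ps1
```

Opening the file as a stream reads only the first few bytes, so the cost does not depend on the file size.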
