What Data Deduplication Does
Data deduplication optimizes the file data on a volume by performing the following steps:
- Segment the data in each file into small variable-sized chunks.
- Identify duplicate chunks.
- Maintain a single copy of each chunk.
- Compress the chunks.
- Replace redundant copies of each chunk with a reference to a single copy.
- Replace each file with a reparse point containing references to its data chunks.
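The steps above can be sketched as a toy script. This is only an illustration of the idea, not how the Windows feature is implemented: it assumes fixed-size chunks and SHA-256 hashes (the real feature uses variable-sized chunks), it skips compression, and the function name Get-ChunkReferences and the variable names are hypothetical.

```powershell
# Toy sketch: fixed-size chunks and SHA-256 are assumptions made for brevity.
# Windows Data Deduplication itself uses variable-sized chunks and also compresses them.
function Get-ChunkReferences {
    param(
        [byte[]]$Data,           # contents of one "file"
        [int]$ChunkSize,         # fixed chunk size in bytes (assumption)
        [hashtable]$ChunkStore   # shared store: hash -> single copy of the chunk
    )
    $sha  = [System.Security.Cryptography.SHA256]::Create()
    $refs = @()
    for ($offset = 0; $offset -lt $Data.Length; $offset += $ChunkSize) {
        # Segment the data into chunks.
        $len   = [Math]::Min($ChunkSize, $Data.Length - $offset)
        $chunk = New-Object byte[] $len
        [Array]::Copy($Data, $offset, $chunk, 0, $len)

        # Identify duplicates by hash and keep a single copy of each chunk.
        $hash = [BitConverter]::ToString($sha.ComputeHash($chunk))
        if (-not $ChunkStore.ContainsKey($hash)) {
            $ChunkStore[$hash] = $chunk
        }

        # The "file" becomes a list of references to stored chunks.
        $refs += $hash
    }
    $sha.Dispose()
    return $refs
}

$store = @{}
$fileA = [System.Text.Encoding]::ASCII.GetBytes('AAAABBBBCCCC')
$fileB = [System.Text.Encoding]::ASCII.GetBytes('AAAABBBBDDDD')
$refsA = Get-ChunkReferences -Data $fileA -ChunkSize 4 -ChunkStore $store
$refsB = Get-ChunkReferences -Data $fileB -ChunkSize 4 -ChunkStore $store
"Chunks referenced: {0}, chunks stored: {1}" -f ($refsA.Count + $refsB.Count), $store.Count
```

The two sample files share the chunks AAAA and BBBB, so six chunk references are backed by only four stored chunks.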
PowerShell commands for Windows Server 2012 R2 Data Deduplication
To install deduplication components on the server
Import-Module ServerManager
Add-WindowsFeature -name FS-Data-Deduplication
Import-Module Deduplication
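To confirm that the installation succeeded, one option is to query the feature state. This is a small sketch using the ServerManager module's Get-WindowsFeature cmdlet; the choice of properties to display is just a suggestion.

```powershell
# Show the install state of the Data Deduplication role service.
Get-WindowsFeature -Name FS-Data-Deduplication | Format-List Name, InstallState
```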
To enable data deduplication on a volume
Enable-DedupVolume E: -UsageType HyperV
or
Enable-DedupVolume E: -UsageType Default
UsageType HyperV: select this if you are configuring deduplication for a volume that hosts running virtual machines.
UsageType Default: select this if you are configuring deduplication for general data files.
Set the minimum number of days that must pass before a file is deduplicated
Set-DedupVolume E: -MinimumFileAgeDays 20
To return a list of the volumes that have been enabled for data deduplication
Get-DedupVolume
or
Get-DedupVolume | Format-List
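To check a single volume and show only the settings changed above, the output can be narrowed with Select-Object. This is a sketch that assumes the volume is E: and that the listed property names match what Get-DedupVolume reports on your server; compare them against the Format-List output before relying on them.

```powershell
# Show only selected fields for volume E: (property names assumed; verify with Format-List).
Get-DedupVolume -Volume E: |
    Select-Object Volume, Enabled, MinimumFileAgeDays, SavedSpace, SavingsRate
```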
Start a deduplication job manually
Optimization job:
Start-DedupJob -Volume E: -Type Optimization
Garbage collection job, which processes deleted or modified data on the volume and cleans up data chunks that are no longer referenced:
Start-DedupJob -Volume E: -Type GarbageCollection
Data integrity scrubbing job:
Start-DedupJob -Volume E: -Type Scrubbing
Get the status of deduplication jobs
Get-DedupJob
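When a script needs to run a job and continue only after it finishes, Start-DedupJob can block with its -Wait switch instead of polling Get-DedupJob. The sketch below assumes volume E:, and the property names passed to Select-Object are assumptions to check against the default Get-DedupJob output.

```powershell
# Start an optimization job and block until it completes.
Start-DedupJob -Volume E: -Type Optimization -Wait

# Alternatively, inspect queued or running jobs for the volume.
Get-DedupJob -Volume E: | Select-Object Type, State, Progress, Volume
```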
Query the key status statistics
Get-DedupStatus
or
Get-DedupStatus E: | Format-List
Get deduplication metadata information
Get-DedupMetadata
or
Get-DedupMetadata E: