Thursday, November 7, 2013

Deduplication in Windows 8


Data deduplication is a technique to reduce storage needs by eliminating redundant data in your backup environment. Only one copy of the data is retained on storage media, and redundant data is replaced with a pointer to the unique data copy.
The goal is to store more data in less space by segmenting files into small variable-sized chunks (32–128 KB), identifying duplicate chunks, and maintaining a single copy of each chunk. Redundant copies of the chunk are replaced by a reference to the single copy. The chunks are compressed and then organized into special container files in the System Volume Information folder.
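The chunk-and-reference idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm Windows uses: the rolling hash, the `GEAR` table, the `MASK` value and the function names `split_chunks` and `deduplicate` are all hypothetical. Only the 32–128 KB bounds come from the text above, and compression and container files are left out.

```python
import hashlib

MIN_CHUNK = 32 * 1024    # lower bound quoted in the text
MAX_CHUNK = 128 * 1024   # upper bound quoted in the text
MASK = (1 << 15) - 1     # hypothetical: controls how often a content-defined cut occurs
# Hypothetical per-byte table for the rolling hash, derived deterministically.
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:8], "big") for b in range(256)]


def split_chunks(data):
    """Cut data into variable-sized chunks at content-defined boundaries."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Rolling hash: older bytes shift out of the low bits as new ones arrive.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFFFFFFFFFF
        size = i - start + 1
        # Cut when the hash matches past the minimum size, or when the maximum is hit.
        if (size >= MIN_CHUNK and (h & MASK) == 0) or size >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing chunk may be shorter than MIN_CHUNK
    return chunks


def deduplicate(files):
    """Return (store, stubs): one copy per unique chunk, plus per-file reference lists."""
    store = {}   # chunk digest -> chunk bytes, stored once
    stubs = {}   # file name -> ordered list of chunk digests
    for name, data in files.items():
        refs = []
        for chunk in split_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # keep only the first copy of a chunk
            refs.append(digest)
        stubs[name] = refs
    return store, stubs
```

Passing two files with identical content to `deduplicate` would give both files the same reference list, so the store holds each chunk only once.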



The result is an on-disk transformation of each file, as shown in Figure 1. After deduplication, files are no longer stored as independent streams of data; they are replaced with stubs that point to data blocks stored in a common chunk store. Because these files share blocks, each block is stored only once, which reduces the disk space needed to store all the files. During file access, the correct blocks are transparently assembled to serve the data, without the calling application or the user having any knowledge of the on-disk transformation of the file. This lets administrators apply deduplication to files without worrying about changes in application behavior or any impact on the users who access those files.
Figure 1 On-disk transformation of files
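To make the stub mechanism concrete, here is a toy model in Python of a chunk store that reassembles files on read. It is a sketch under assumptions, not how Windows implements it: the `ChunkStore` class, its method names and the tiny fixed `CHUNK_SIZE` are hypothetical and chosen only to keep the example short (the real feature uses much larger, variable-sized chunks).

```python
import hashlib
import zlib

CHUNK_SIZE = 4  # hypothetical and deliberately tiny so the example is easy to follow


class ChunkStore:
    """Toy chunk store: compressed unique chunks plus per-file stubs."""

    def __init__(self):
        self.chunks = {}  # digest -> compressed chunk bytes
        self.stubs = {}   # file name -> ordered list of digests

    def write_file(self, name, data):
        """Replace the file's data with a stub that references shared chunks."""
        refs = []
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = zlib.compress(chunk)
            refs.append(digest)
        self.stubs[name] = refs

    def read_file(self, name):
        """Reassemble the original bytes; the caller never sees the chunks."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in self.stubs[name])


store = ChunkStore()
store.write_file("a.txt", b"ABCDABCDEFGH")
store.write_file("b.txt", b"EFGHABCD")
print(store.read_file("a.txt"), len(store.chunks))  # b'ABCDABCDEFGH' 2
```

Both files read back unchanged even though only two distinct chunks (`ABCD` and `EFGH`) are stored.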



Data deduplication is a Windows Server 2012 feature. Because Windows 8 shares the same architecture as Windows Server 2012, we can install the Windows Server 2012 .cab packages on Windows 8 and enable data deduplication there.


Get the .cab files from here.
Open PowerShell as an administrator and change directory to c:\dedupefiles:

cd c:\dedupefiles\

Then run the following command in the same elevated PowerShell window:

dism /online /add-package /source:c:\dedupefiles `
  /packagepath:Microsoft-Windows-VdsInterop-Package~31bf3856ad364e35~amd64~~6.2.9200.16384.cab `
  /packagepath:Microsoft-Windows-VdsInterop-Package~31bf3856ad364e35~amd64~en-US~6.2.9200.16384.cab `
  /packagepath:Microsoft-Windows-FileServer-Package~31bf3856ad364e35~amd64~~6.2.9200.16384.cab `
  /packagepath:Microsoft-Windows-FileServer-Package~31bf3856ad364e35~amd64~en-US~6.2.9200.16384.cab `
  /packagepath:Microsoft-Windows-Dedup-Package~31bf3856ad364e35~amd64~~6.2.9200.16384.cab `
  /packagepath:Microsoft-Windows-Dedup-Package~31bf3856ad364e35~amd64~en-US~6.2.9200.16384.cab
  


Now run this command to enable the feature:

dism /online /enable-feature /featurename:Dedup-Core /all

Then enable deduplication on your chosen volume. Here I am enabling it on my D: drive.
NOTE: It will not work on a boot volume.


 Enable-DedupVolume d:

Now let's start the first optimisation run on volume D:

   
Start-DedupJob -Volume d: -Type Optimization

To see how far the job has progressed, run the command below. Do not expect this to be quick, especially on the first run; you will need to rerun the command several times to follow the progress. If it returns no results, the job is complete!

Get-DedupJob

Once the job has completed, we can see how much space has been saved with the following command.

   
Get-DedupStatus
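As a rough illustration of what "space saved" means, the arithmetic can be written out in Python. This is a sketch with made-up numbers, and the function and parameter names are hypothetical; they are not the property names returned by the cmdlet.

```python
def dedup_savings(logical_bytes, stored_bytes):
    """Return (bytes saved, fraction saved) for a deduplicated volume.

    logical_bytes: total size of the files as applications see them.
    stored_bytes:  space actually occupied on disk after optimisation.
    """
    if logical_bytes <= 0:
        return 0, 0.0
    saved = logical_bytes - stored_bytes
    return saved, saved / logical_bytes


GIB = 1024 ** 3
# Made-up figures: 100 GiB of files occupying 40 GiB after deduplication.
saved, rate = dedup_savings(logical_bytes=100 * GIB, stored_bytes=40 * GIB)
print(f"saved {saved / GIB:.0f} GiB ({rate:.0%})")  # saved 60 GiB (60%)
```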

Deduplication is not an inline process: data is written to disk normally and optimised afterwards by scheduled jobs. These jobs can be viewed in Task Scheduler, but it's just as easy in PowerShell.

  
Get-DedupSchedule

Optimisation will run in the background; a couple of other tasks also run: garbage collection and scrubbing.
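Conceptually, garbage collection reclaims chunks that no file references any more, and scrubbing checks stored chunks for corruption. A minimal Python sketch of both ideas follows, assuming a chunk store modelled as a dictionary keyed by SHA-256 digest; the function names and data layout are hypothetical and say nothing about how the Windows jobs are implemented.

```python
import hashlib


def digest_of(data):
    """Hex SHA-256 digest used as the chunk's key in this toy store."""
    return hashlib.sha256(data).hexdigest()


def garbage_collect(chunks, stubs):
    """Delete chunks that no stub references; return how many were removed."""
    live = {d for refs in stubs.values() for d in refs}
    dead = [d for d in chunks if d not in live]
    for d in dead:
        del chunks[d]
    return len(dead)


def scrub(chunks):
    """Return the digests of chunks whose bytes no longer match their key."""
    return [d for d, data in chunks.items() if digest_of(data) != d]


chunks = {digest_of(c): c for c in (b"alpha", b"beta", b"gamma")}
stubs = {
    "a.txt": [digest_of(b"alpha"), digest_of(b"beta")],
    "b.txt": [digest_of(b"beta"), digest_of(b"gamma")],
}
del stubs["b.txt"]                     # deleting a file only removes its stub
print(garbage_collect(chunks, stubs))  # 1 (only "gamma" was unreferenced)
print(scrub(chunks))                   # [] (nothing is corrupted)
```

Deleting `b.txt` frees only the `gamma` chunk, because `beta` is still referenced by `a.txt`.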

There's a bit more you can do, but now you should have a volume with some space saved!

Microsoft's website documents the full set of commands: http://technet.microsoft.com/en-us/library/hh831434.aspx

All the Best ........ :)
Cheers!

