Tutorial: Scanning a drive with drilldowns using C#?



Question:

I'm trying to create an application which scans a drive. The tricky part, though, is that my drive contains a set of folders that have folders within folders, and those contain documents. I'm trying to scan the drive, take a "snapshot" of all documents and folders, and dump it into a .txt file.
The first time I run this app, the output will be a text file listing all the folders and files.
The second time I run it, it will take the two text files (the one produced by the second run and the one from the first run) and compare them, reporting what has been moved, overwritten, or deleted.

Does anybody have any code for this? I'm a newbie at this C# stuff and any help would be greatly appreciated.

Thanks in advance.


Solution:1

One thing that we learned in the '80s was that it's really tempting to use recursion for file system walking, but the moment you do that, someone will create a file system nested deeply enough to overflow your stack. It's far better to use heap-based walking of the file system.

Here is a class I knocked together which does just that. It's not super pretty, but it does the job quite well:

using System;
using System.IO;
using System.Collections.Generic;

namespace DirectoryWalker
{
    public class DirectoryWalker : IEnumerable<string>
    {
        private string _seedPath;
        // Optional filters; null means "accept everything".
        Func<string, bool> _directoryFilter, _fileFilter;

        public DirectoryWalker(string seedPath) : this(seedPath, null, null)
        {
        }

        public DirectoryWalker(string seedPath, Func<string, bool> directoryFilter, Func<string, bool> fileFilter)
        {
            if (seedPath == null)
                throw new ArgumentNullException(nameof(seedPath));
            _seedPath = seedPath;
            _directoryFilter = directoryFilter;
            _fileFilter = fileFilter;
        }

        public IEnumerator<string> GetEnumerator()
        {
            // Both queues live on the heap, so deep nesting cannot overflow the call stack.
            Queue<string> directories = new Queue<string>();
            directories.Enqueue(_seedPath);
            Queue<string> files = new Queue<string>();
            while (files.Count > 0 || directories.Count > 0)
            {
                if (files.Count > 0)
                {
                    yield return files.Dequeue();
                }

                if (directories.Count > 0)
                {
                    string dir = directories.Dequeue();
                    string[] newDirectories = Directory.GetDirectories(dir);
                    string[] newFiles = Directory.GetFiles(dir);
                    foreach (string path in newDirectories)
                    {
                        if (_directoryFilter == null || _directoryFilter(path))
                            directories.Enqueue(path);
                    }
                    foreach (string path in newFiles)
                    {
                        if (_fileFilter == null || _fileFilter(path))
                            files.Enqueue(path);
                    }
                }
            }
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
}

Typical usage is this:

DirectoryWalker walker = new DirectoryWalker(@"C:\pathToSource\src", null, (x => x.EndsWith(".cs")));
foreach (string s in walker)
{
    Console.WriteLine(s);
}

This lists every file under the source folder, at any depth, whose name ends in ".cs".
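For the "snapshot to a .txt file" part of the question, the same heap-based idea can be sketched roughly as follows. This is only an illustration: the names SnapshotWriter and WriteSnapshot, the "D"/"F" line prefixes, and the choice to silently skip unreadable folders are all assumptions, not something the answer above prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration only.
public static class SnapshotWriter
{
    // Writes one line per folder ("D") and file ("F") found under root.
    // Returns the number of lines written.
    public static int WriteSnapshot(string root, string outputPath)
    {
        // Explicit stack on the heap instead of recursive calls.
        var pending = new Stack<string>();
        pending.Push(root);
        int count = 0;

        using (var writer = new StreamWriter(outputPath))
        {
            while (pending.Count > 0)
            {
                string dir = pending.Pop();
                string[] subDirs, files;
                try
                {
                    subDirs = Directory.GetDirectories(dir);
                    files = Directory.GetFiles(dir);
                }
                catch (UnauthorizedAccessException)
                {
                    // Assumption: folders we may not read are skipped.
                    continue;
                }

                foreach (string sub in subDirs)
                {
                    writer.WriteLine("D\t" + sub);
                    pending.Push(sub);
                    count++;
                }
                foreach (string file in files)
                {
                    writer.WriteLine("F\t" + file);
                    count++;
                }
            }
        }
        return count;
    }
}
```

Keep the output file outside the folder being scanned, otherwise the snapshot will list itself.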


Solution:2

A better approach than your text file comparisons would be to use the FileSystemWatcher Class.

Its documentation describes it as a class that "listens to the file system change notifications and raises events when a directory, or file in a directory, changes."

You could log the changes and then generate your reports as needed from that log.
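A minimal sketch of that logging idea, assuming a tab-separated log line per event. ChangeLogger, FormatEntry, and the log format are hypothetical names chosen for the example; the FileSystemWatcher members used (IncludeSubdirectories, NotifyFilter, the Created/Changed/Deleted/Renamed/Error events, EnableRaisingEvents) are the standard ones.

```csharp
using System;
using System.IO;

// Hypothetical logger for illustration only.
public sealed class ChangeLogger : IDisposable
{
    private readonly FileSystemWatcher _watcher;
    private readonly string _logPath;
    private readonly object _gate = new object();

    public ChangeLogger(string folder, string logPath)
    {
        _logPath = logPath;
        _watcher = new FileSystemWatcher(folder);
        _watcher.IncludeSubdirectories = true;
        _watcher.NotifyFilter = NotifyFilters.FileName
                              | NotifyFilters.DirectoryName
                              | NotifyFilters.LastWrite;

        _watcher.Created += (s, e) => Append(FormatEntry(e.ChangeType.ToString(), e.FullPath));
        _watcher.Changed += (s, e) => Append(FormatEntry(e.ChangeType.ToString(), e.FullPath));
        _watcher.Deleted += (s, e) => Append(FormatEntry(e.ChangeType.ToString(), e.FullPath));
        _watcher.Renamed += (s, e) => Append(FormatEntry("Renamed", e.OldFullPath + " -> " + e.FullPath));
        // The watcher reports internal problems (such as a full event buffer) here.
        _watcher.Error += (s, e) => Append(FormatEntry("Error", e.GetException().Message));

        _watcher.EnableRaisingEvents = true;
    }

    // One log line: event kind, a tab, then the path or message.
    public static string FormatEntry(string kind, string detail)
    {
        return kind + "\t" + detail;
    }

    private void Append(string line)
    {
        // Events can arrive on different threads, so serialize the writes.
        lock (_gate)
        {
            File.AppendAllText(_logPath, line + Environment.NewLine);
        }
    }

    public void Dispose()
    {
        _watcher.Dispose();
    }
}
```

Two caveats: the watcher only sees changes made while the process is running, and the log file should live outside the watched folder so that writing to it does not raise further events.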


Solution:3

You can easily use the DirectoryInfo/FileInfo classes for this.

Basically, create an instance of the DirectoryInfo class pointing at the c:\ folder, then use the objects it returns to walk the folder structure.

http://msdn.microsoft.com/en-us/library/system.io.directoryinfo.aspx has code that could quite easily be translated.
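As a rough sketch of that approach (not a translation of the MSDN sample), the walk below uses DirectoryInfo.GetDirectories and DirectoryInfo.GetFiles and records each file's size and last write time alongside its path. InfoWalker, Describe, and the tab-separated line layout are assumptions made for the example, and error handling for unreadable folders is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration only.
public static class InfoWalker
{
    // Returns one line per file: full path, size in bytes, last write time (UTC).
    public static List<string> Describe(string rootPath)
    {
        var lines = new List<string>();
        var pending = new Queue<DirectoryInfo>();
        pending.Enqueue(new DirectoryInfo(rootPath));

        while (pending.Count > 0)
        {
            DirectoryInfo dir = pending.Dequeue();

            // Queue the subfolders so they are visited later.
            foreach (DirectoryInfo sub in dir.GetDirectories())
            {
                pending.Enqueue(sub);
            }

            foreach (FileInfo file in dir.GetFiles())
            {
                lines.Add(file.FullName + "\t" + file.Length + "\t"
                          + file.LastWriteTimeUtc.ToString("o"));
            }
        }
        return lines;
    }
}
```

Recording the size and timestamp is one way a later comparison could notice that a file was overwritten in place rather than moved or deleted.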

Now, the other part of your question is insanity. You can find the differences between the two files relatively easily, but translating that into what has been moved, deleted, etc. will take some fairly advanced logic. After all, if I have two files, both named myfile.dat, and one is found at c:\foo and the other at c:\notfoo, how would the one at c:\notfoo be reported if I deleted the one at c:\foo? Another example: if I have a file myfile2.dat and copy it from c:\bar to c:\notbar, is that considered a move? What happens if I copy it on Tuesday, and then on Thursday I delete c:\bar\myfile2.dat: is that a move or a delete? And would the answer change if I ran the program every Monday as opposed to daily?

There's a whole host of such questions, each with corresponding logic that you'd need to think through and code for in order to build that functionality. Even then it would not be 100% correct, because it's not monitoring the file system as changes occur. There will always be the possibility of a scenario that does not get reported correctly due to timing, logic structure, processing time, when the app runs, or just the sheer perversity of computers.

Additionally, with a naive comparison the processing time would grow quadratically with the number of files on your drive. After all, you'd need to check every file against every other file to determine its state as opposed to its previous state. I'd hate to have to run this against my 600+GB drive at home, let alone the 40TB drives I have on servers at work.
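The part described above as relatively easy, finding the raw differences between two snapshot files, might look something like this. It is only a sketch under stated assumptions: each snapshot holds one path per line, paths are compared case-insensitively (as on a typical Windows volume), and SnapshotDiff, Missing, and Report are hypothetical names. It deliberately does not try to decide what counts as a move.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration only.
public static class SnapshotDiff
{
    // Returns the entries of 'from' that do not appear in 'comparedTo', sorted.
    public static List<string> Missing(IEnumerable<string> from, IEnumerable<string> comparedTo)
    {
        // Assumption: paths differing only by case are the same path.
        var set = new HashSet<string>(from, StringComparer.OrdinalIgnoreCase);
        set.ExceptWith(comparedTo);

        var result = new List<string>(set);
        result.Sort(StringComparer.OrdinalIgnoreCase);
        return result;
    }

    // Reads two snapshot files and writes the raw differences to 'output'.
    public static void Report(string oldSnapshotPath, string newSnapshotPath, TextWriter output)
    {
        string[] older = File.ReadAllLines(oldSnapshotPath);
        string[] newer = File.ReadAllLines(newSnapshotPath);

        foreach (string path in Missing(older, newer))
        {
            output.WriteLine("Removed: " + path);
        }
        foreach (string path in Missing(newer, older))
        {
            output.WriteLine("Added:   " + path);
        }
    }
}
```

Loading each snapshot into a hash set keeps this step roughly linear in the number of entries rather than comparing every pair. A moved file simply shows up as one "Removed" line and one "Added" line; deciding whether that pair really is a move is exactly the ambiguous part discussed above.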

