C# - Parallel file search across multiple drives

Published on 12/1/2009
(2 ratings)
Searches for files and/or for specific content within files. Any number of search sets can be passed in. They are grouped by drive letter. All groups are searched in parallel. Multiple search requests on the same drive are processed sequentially, because a single physical drive cannot serve them truly in parallel. This keeps every drive busy without making requests on the same drive compete with each other.
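
The grouping idea can be sketched independently of the class below. This is a minimal illustration, not the framework's code: the names DriveGroupingSketch and GroupByDrive are made up for this example, it uses Task.Run (.NET 4.5 or later) instead of the thread pool calls in the original, and it assumes Windows-style paths so that Path.GetPathRoot returns the drive root.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of DotNetExpansions.
static class DriveGroupingSketch
{
    // Groups paths by their drive root. The key is upper-cased so that
    // "c:\" and "C:\" end up in the same group.
    public static Dictionary<string, List<string>> GroupByDrive(IEnumerable<string> paths)
    {
        return paths
            .GroupBy(p => Path.GetPathRoot(p).ToUpperInvariant())
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        var paths = new[] { @"C:\Temp", @"c:\Windows", @"D:\Data" };
        var groups = GroupByDrive(paths);

        // One task per drive group; the folders inside a group are
        // visited one after another by that task.
        var tasks = groups.Select(g => Task.Run(() =>
        {
            foreach (var folder in g.Value)
                Console.WriteLine("{0} -> {1}", g.Key, folder);
        })).ToArray();

        Task.WaitAll(tasks);
    }
}
```

The order of the printed lines can differ between runs, because the groups run concurrently; only the order within one drive is fixed.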

ParallelFileSearch is part of my DotNetExpansions framework (replace the placeholder in the URL with the latest release number):
http://cyrons.beanstalkapp.com/general/browse/DotNetExpansions/tags/(neueste Release Nummer)/

Besides the framework, the package contains a detailed help file in CHM format.

More information about the DotNetExpansions framework is available here:
http://dotnet-forum.de/blogs/rainerhilmer/archive/2009/09/28/dotnet-expansions-framework.aspx

P.S.: ParallelFileSearch was tested with Microsoft CHESS.
http://msdn.microsoft.com/en-us/devlabs/cc950526.aspx

A demo:

using System;
using System.Collections.Generic;
using System.IO;
using DotNetExpansions.ParallelSearch;


namespace ParallelSearchDemo
{
   class Program
   {
      public static void Main()
      {
         const bool ignoreExceptions = true;
         var parallelFileSearcher = new ParallelFileSearcher(ignoreExceptions);
         parallelFileSearcher.FoundFile += FileFoundHandler;
         parallelFileSearcher.CallOutUnauthorizedAccess += ShowCurrentlyDeniedAccess;

         // Make a container for the search sets.
         List<SearchSet> searchSets = new List<SearchSet>();

         // Generate a search set...
         var searchSet = new SearchSet(
            new DirectoryInfo(@"C:\System Volume Information"),
            SearchOption.TopDirectoryOnly);
         // ...and add it to the container.
         searchSets.Add(searchSet);

         // Generate another search set...
         searchSet = new SearchSet(
            new DirectoryInfo(@"I:\Developer\VS2008\Projects\Cyrons\Demos"),
            SearchOption.AllDirectories);
         // ...and add it to the container.
         searchSets.Add(searchSet);

         // Define search-parameters.
         string filenamePattern = "*.cs";
         string contentToSearchFor = ""; // Could also be null.

         // Start the search with those parameters.
         Console.WriteLine("Search in progress. Please wait...");
         List<FileInfo> fileInfos =
            parallelFileSearcher.FindFiles(
            searchSets, filenamePattern, contentToSearchFor, FindMode.FindAll);
         // Do something with fileInfos like, for instance, delete those files.

         // Keeps the console window from closing on its own.
         Console.WriteLine("\nPress any key to terminate the program.");
         Console.ReadKey();
      }

      private static void ShowCurrentlyDeniedAccess(string fullName)
      {
         Console.WriteLine("Access denied on " + fullName);
      }

      private static void FileFoundHandler(string fullFileName)
      {
         Console.WriteLine(fullFileName);
      }
   }
}

//=========================================================

namespace DotNetExpansions.ParallelSearch
{
   /// <summary>
   /// Specifies whether the search stops after the first match
   /// or continues until all paths have been searched.
   /// </summary>
   public enum FindMode
   {
      /// <summary>
      /// Stops the search as soon as the first file matching the search criterion
      /// has been found in one of the given folders.
      /// </summary>
      FindOne,
      /// <summary>
      /// Finds all files in all given folders that match the search criterion.
      /// </summary>
      FindAll
   }
}

//=========================================================

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Permissions;
using System.Threading;

namespace DotNetExpansions.ParallelSearch
{
   /// <summary>
   /// Searches for files and/or for specific content within files.
   /// Any number of search sets can be passed in.
   /// They are grouped by drive letter.
   /// All groups are searched in parallel,
   /// while requests on the same drive are processed sequentially.
   /// </summary>
   /// <example>
   /// <code>
   /// <![CDATA[
   ///using System;
   ///using System.Collections.Generic;
   ///using System.IO;
   ///using DotNetExpansions.ParallelSearch;
   ///
   ///namespace ParallelSearchDemo
   ///{
   ///   class Program
   ///   {
   ///      public static void Main()
   ///      {
   ///         const bool ignoreExceptions = true;
   ///         var parallelFileSearcher = new ParallelFileSearcher(ignoreExceptions);
   ///         parallelFileSearcher.FoundFile += FileFoundHandler;
   ///         parallelFileSearcher.CallOutUnauthorizedAccess += ShowCurrentlyDeniedAccess;
   ///
   ///         // Make a container for the search sets.
   ///         List<SearchSet> searchSets = new List<SearchSet>();
   ///
   ///         // Generate a search set...
   ///         var searchSet = new SearchSet(
   ///            new DirectoryInfo(@"C:\System Volume Information"),
   ///            SearchOption.TopDirectoryOnly);
   ///         // ...and add it to the container.
   ///         searchSets.Add(searchSet);
   ///
   ///         // Generate another search set...
   ///         searchSet = new SearchSet(
   ///            new DirectoryInfo(@"I:\Developer\VS2008\Projects\DotNetExpansions\Demos"),
   ///            SearchOption.AllDirectories);
   ///         // ...and add it to the container.
   ///         searchSets.Add(searchSet);
   ///
   ///         // Define search-parameters.
   ///         string filenamePattern = "*.cs";
   ///         string contentToSearchFor = ""; // Could also be null.
   ///
   ///         // Start the search with those parameters.
   ///         Console.WriteLine("Search in progress. Please wait...");
   ///         List<FileInfo> fileInfos =
   ///            parallelFileSearcher.FindFiles(
   ///            searchSets, filenamePattern, contentToSearchFor, FindMode.FindAll);
   ///         // Do something with fileInfos like, for instance, delete those files.
   ///
   ///         // Keeps the console window from closing on its own.
   ///         Console.WriteLine("\nPress any key to terminate the program.");
   ///         Console.ReadKey();
   ///      }
   ///
   ///      private static void ShowCurrentlyDeniedAccess(string fullName)
   ///      {
   ///         Console.WriteLine("Access denied on " + fullName);
   ///      }
   ///
   ///      private static void FileFoundHandler(string fullFileName)
   ///      {
   ///         Console.WriteLine(fullFileName);
   ///      }
   ///   }
   ///}
   ///
   /// /* Sample Output:
   /// g:\projektinterkosmos\interkosmos\codecontractstrials\codecontractstrials\Program.cs
   /// g:\projektinterkosmos\interkosmos\codecontractstrials\codecontractstrials\SomeClass.cs
   /// 
   /// h:\vs2008\lab\decoratorversuch\decoratorversuch\Program.cs
   /// 
   /// i:\developer\vs2008\projects\demos\eigene\collectiveeventcontract\eineventvieleklassen\CollectiveEventArgs.cs
   /// i:\developer\vs2008\projects\demos\eigene\collectiveeventcontract\eineventvieleklassen\CollectiveEventContract.cs
   /// i:\developer\vs2008\projects\demos\eigene\collectiveeventcontract\eineventvieleklassen\Program.cs
   /// 
   /// Press any key to terminate the program.
   /// */
   /// ]]>
   /// </code>
   /// </example>
   public sealed class ParallelFileSearcher
   {
      #region Delegates

      /// <summary>
      /// Encapsulates a method that is called when a matching file has been found.
      /// </summary>
      public delegate void FoundFileHandler(string fullFileName);

      /// <summary>
      /// Encapsulates a method that is called
      /// when an attempt was made to access a directory
      /// for which no access permission exists.
      /// </summary>
      public delegate void CurrentAccessDeniedHandler(string fullName);

      #endregion
      #region Events

      /// <summary>
      /// Occurs when a matching file has been found.
      /// </summary>
      public event FoundFileHandler FoundFile;

      /// <summary>
      /// Occurs when there is no permission to access a directory or a file.
      /// </summary>
      public event CurrentAccessDeniedHandler CallOutUnauthorizedAccess;

      #endregion
      #region Fields

      private List<string> fileList = new List<string>();

      private volatile string content;
      private volatile bool found;
      private FindMode mode;
      private volatile string namePattern;

      #endregion Fields
      #region Constructors

      /// <summary>
      /// Initializes a new instance of the <see cref="ParallelFileSearcher"/> class.
      /// </summary>
      /// <param name="suppressExceptions">If set to <c>true</c>,
      /// I/O exceptions are ignored.</param>
      public ParallelFileSearcher(bool suppressExceptions)
      {
         SuppressExceptions = suppressExceptions;
      }

      #endregion Constructors
      #region Properties

      /// <summary>
      /// Gets a value indicating whether exceptions are ignored.
      /// </summary>
      /// <value>
      /// <c>true</c> if exceptions are ignored; otherwise <c>false</c>.
      /// </value>
      public bool SuppressExceptions { get; private set; }

      #endregion Properties
      #region Methods
      #region Public Methods

      /// <summary>
      /// Finds files and/or specific content within files.
      /// </summary>
      /// <param name="searchSets">A list of search sets (instances of the
      /// <see cref="SearchSet"/> class).</param>
      /// <param name="filenamePattern">A file name pattern (e.g. *.cs).</param>
      /// <param name="contentToSearchFor">The text content to search for.</param>
      /// <param name="findMode">The search mode given by the
      /// <see cref="FindMode"/> enumeration.</param>
      /// <returns>A list of <see cref="FileInfo"/> instances
      /// for the files that match the search criteria.</returns>
      /// <example>
      /// <code>
      /// <![CDATA[
      /// using System;
      /// using System.Collections.Generic;
      /// using System.IO;
      /// using DotNetExpansions.ParallelSearch;
      /// 
      /// namespace ParallelSearchDemo
      /// {
      ///    class Program
      ///    {
      ///       public static void Main()
      ///       {
      ///          const bool ignoreExceptions = true;
      ///          var parallelFileSearcher = new ParallelFileSearcher(ignoreExceptions);
      ///          parallelFileSearcher.FoundFile += FileFoundHandler;
      ///          parallelFileSearcher.CallOutUnauthorizedAccess += ShowCurrentlyDeniedAccess;
      /// 
      ///          // Make a container for the search sets.
      ///          List<SearchSet> searchSets = new List<SearchSet>();
      /// 
      ///          // Generate a search set...
      ///          var searchSet = new SearchSet(
      ///             new DirectoryInfo(@"C:\System Volume Information"),
      ///             SearchOption.TopDirectoryOnly);
      ///          // ...and add it to the container.
      ///          searchSets.Add(searchSet);
      /// 
      ///          // Generate another search set...
      ///          searchSet = new SearchSet(
      ///             new DirectoryInfo(@"I:\Developer\VS2008\Projects\DotNetExpansions\Demos"),
      ///             SearchOption.AllDirectories);
      ///          // ...and add it to the container.
      ///          searchSets.Add(searchSet);
      /// 
      ///          // Define search-parameters.
      ///          string filenamePattern = "*.cs";
      ///          string contentToSearchFor = ""; // Could also be null.
      /// 
      ///          // Start the search with those parameters.
      ///          Console.WriteLine("Search in progress. Please wait...");
      ///          List<FileInfo> fileInfos =
      ///             parallelFileSearcher.FindFiles(
      ///             searchSets, filenamePattern, contentToSearchFor, FindMode.FindAll);
      ///          // Do something with fileInfos like, for instance, delete those files.
      ///       }
      /// 
      ///       private static void ShowCurrentlyDeniedAccess(string fullName)
      ///       {
      ///          Console.WriteLine("Access denied on " + fullName);
      ///       }
      /// 
      ///       private static void FileFoundHandler(string fullFileName)
      ///       {
      ///          Console.WriteLine(fullFileName);
      ///       }
      ///    }
      /// }]]>
      /// </code>
      /// </example>
      public List<FileInfo> FindFiles(
         List<SearchSet> searchSets, string filenamePattern,
         string contentToSearchFor, FindMode findMode)
      {
         var permission = new FileIOPermission(PermissionState.Unrestricted);
         permission.AllFiles = FileIOPermissionAccess.AllAccess;
         // Reset the results of an earlier call on this instance. A stale
         // 'found' flag would otherwise suppress the new search.
         found = false;
         lock(fileList)
         {
            fileList.Clear();
         }
         SetInstanceParameters(filenamePattern, contentToSearchFor, findMode);
         List<SearchSet> drivesAndFolders = GetDrivesAndFolders(searchSets);
         IEnumerable<IGrouping<string, SearchSet>> fileGroups = GroupDriveItems(drivesAndFolders);
         List<string> internalList = InvokeSearchBase(fileGroups);
         var fileInfos = new List<FileInfo>();
         foreach(var item in internalList)
         {
            fileInfos.Add(new FileInfo(item));
         }
         return fileInfos;
      }

      #endregion
      #region Private methods

      private static List<SearchSet> GetDrivesAndFolders(
         IEnumerable<SearchSet> searchSets)
      {
         var drivesAndFolders = new List<SearchSet>();
         foreach(var item in searchSets)
         {
            drivesAndFolders.Add(new SearchSet(item.DirectoryInformation, item.Recursive));
         }
         return drivesAndFolders;
      }

      private void GetFiles(IEnumerable<FileInfo> files)
      {
         // Without a content argument the file contents are not searched.
         if(!string.IsNullOrEmpty(content))
         {
            SearchFileContents(files);
         }
         else
         {
            foreach(var file in files)
            {
               found = true;
               // Several search threads add results; List<T> is not thread-safe.
               lock(fileList)
               {
                  fileList.Add(file.FullName);
               }
               if(FoundFile != null)
                  FoundFile(file.FullName);
               // In FindOne mode the first match is enough.
               if(mode == FindMode.FindOne)
                  break;
            }
         }
      }

      private void SearchFileContents(IEnumerable<FileInfo> files)
      {
         foreach(var file in files)
         {
            try
            {
               string fileContent;
               // The using block closes the reader on every path, including exceptions.
               using(StreamReader stream = file.OpenText())
               {
                  fileContent = stream.ReadToEnd();
               }
               if(fileContent.Contains(content))
               {
                  found = true;
                  // Several search threads add results; List<T> is not thread-safe.
                  lock(fileList)
                  {
                     fileList.Add(file.FullName);
                  }
                  if(FoundFile != null)
                     FoundFile(file.FullName);
                  if(mode == FindMode.FindOne)
                     break;
               }
            }
            catch(IOException)
            {
               if(!SuppressExceptions)
                  throw;
            }
            catch(UnauthorizedAccessException)
            {
               if(!SuppressExceptions)
                  throw;
               if(CallOutUnauthorizedAccess != null)
                  CallOutUnauthorizedAccess(file.FullName);
            }
         }
      }

      private void GetFileInfos(SearchSet container, IEnumerable<DirectoryInfo> directories)
      {
         foreach(var directory in directories)
         {
            // In FindOne mode there is no need to scan further directories after a match.
            if(found && mode == FindMode.FindOne)
               break;
            try
            {
               FileInfo[] files = directory.GetFiles(namePattern, container.Recursive);
               if(files.Length > 0)
                  GetFiles(files);
            }
            catch(UnauthorizedAccessException)
            {
               if(!SuppressExceptions)
                  throw;
               if(CallOutUnauthorizedAccess != null)
                  CallOutUnauthorizedAccess(directory.FullName);
            }
            catch(IOException)
            {
               if(!SuppressExceptions)
                  throw;
            }
         }
      }

      private static IEnumerable<IGrouping<string, SearchSet>> GroupDriveItems(
         IEnumerable<SearchSet> folders)
      {
         // Upper-case the root so that "c:\" and "C:\" land in the same group.
         IEnumerable<IGrouping<string, SearchSet>> query =
            folders.GroupBy(
               drive => drive.DirectoryInformation.Root.FullName.ToUpperInvariant(),
               container => container);
         return query;
      }

      private void InvokeSearchByFileGroups(
         IEnumerable<IGrouping<string, SearchSet>> fileGroups,
         AutoResetEvent[] threadReadyEvents)
      {
         int counter = 0;
         foreach(IGrouping<string, SearchSet> fileGroup in fileGroups)
         {
            threadReadyEvents[counter] = new AutoResetEvent(false);
            IEnumerable<SearchSet> containers =
               from items in fileGroup
               select items;
            object parameters =
               new SearchTaskParameters(containers, threadReadyEvents[counter]);
            // Skip a group only in FindOne mode after a match. Signal its event anyway,
            // otherwise WaitAll would block on a work item that was never queued.
            if(found && mode == FindMode.FindOne)
               threadReadyEvents[counter].Set();
            else
               ThreadPool.QueueUserWorkItem(Search, parameters);
            counter++;
         }
      }

      private List<string> InvokeSearchBase(
         IEnumerable<IGrouping<string, SearchSet>> fileGroups)
      {
         var threadReadyEvents = new AutoResetEvent[fileGroups.Count()];
         InvokeSearchByFileGroups(fileGroups, threadReadyEvents);
         WaitForThreads(threadReadyEvents);
         // Return a snapshot: in FindOne mode other threads may still be adding results.
         lock(fileList)
         {
            return new List<string>(fileList);
         }
      }

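      private void Search(object parameters)
      {
         // Unpack the parameter container.
         IEnumerable<SearchSet> folderGroup;
         AutoResetEvent doneEvent;
         UnpackParameterContainer(parameters, out folderGroup, out doneEvent);
         try
         {
            // Loop over the folders within one drive.
            foreach(SearchSet container in folderGroup)
            {
               if(found && mode == FindMode.FindOne)
                  break;
               var dir = new DirectoryInfo(container.DirectoryInformation.FullName);
               // Skip a missing folder, but keep searching the remaining ones.
               if(!dir.Exists)
                  continue;
               var directories = new DirectoryInfo[] { };
               try
               {
                  // Probing the ACL inside the try block routes a denied access
                  // to the handler below instead of letting it escape.
                  dir.GetAccessControl();
                  // Note: only the immediate subdirectories are handed to GetFileInfos;
                  // files located directly in the search root itself are not examined.
                  directories = dir.GetDirectories();
               }
               catch(UnauthorizedAccessException)
               {
                  if(!SuppressExceptions)
                     throw;
                  if(CallOutUnauthorizedAccess != null)
                     CallOutUnauthorizedAccess(container.DirectoryInformation.FullName);
               }
               GetFileInfos(container, directories);
            }
         }
         finally
         {
            // Always signal completion, otherwise the waiting caller could block forever.
            doneEvent.Set();
         }
      }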

      private void SetInstanceParameters(
         string filenamePattern, string contentToSearchFor, FindMode findMode)
      {
         if(string.IsNullOrEmpty(filenamePattern))
            throw new ArgumentException("filenamePattern is missing.", "filenamePattern");
         namePattern = filenamePattern;
         content = contentToSearchFor;
         mode = findMode;
      }

      private static void UnpackParameterContainer(
         object parameters, out IEnumerable<SearchSet> folderGroup,
         out AutoResetEvent doneEvent)
      {
         folderGroup = ((SearchTaskParameters)parameters).Containers;
         doneEvent = ((SearchTaskParameters)parameters).DoneEvent;
      }

      private void WaitForThreads(AutoResetEvent[] threadReadyEvents)
      {
         // Neither WaitAll nor WaitAny accepts an empty array;
         // without search groups there is nothing to wait for.
         if(threadReadyEvents.Length == 0)
            return;
         if(mode == FindMode.FindAll)
            WaitHandle.WaitAll(threadReadyEvents);
         else
            WaitHandle.WaitAny(threadReadyEvents);
      }

      #endregion Private methods
      #endregion Methods
   }
}

//=========================================================

using System.IO;

namespace DotNetExpansions.ParallelSearch
{
   /// <summary>
   /// Represents a search set consisting of the path to search
   /// and a flag indicating whether this path is searched recursively (including subfolders).
   /// </summary>
   public sealed class SearchSet
   {
      #region Constructors

      /// <summary>
      /// Initializes a new instance of the <see cref="SearchSet"/> class.
      /// </summary>
      /// <param name="directoryInfo">The path, given as a
      /// <see cref="DirectoryInfo"/> instance.</param>
      /// <param name="recursive">A <see cref="SearchOption"/> value
      /// that indicates whether the path is searched recursively.</param>
      public SearchSet(DirectoryInfo directoryInfo, SearchOption recursive)
      {
         DirectoryInformation = directoryInfo;
         Recursive = recursive;
      }

      #endregion Constructors
      #region Properties

      /// <summary>
      /// Gets the path that this search set searches.
      /// </summary>
      /// <value>An instance of the <see cref="DirectoryInfo"/> class.</value>
      public DirectoryInfo DirectoryInformation { get; private set; }
      
      /// <summary>
      /// Gets the search option of this search set.
      /// </summary>
      /// <value>One of the <see cref="SearchOption"/> values.</value>
      public SearchOption Recursive { get; private set; }

      #endregion Properties
   }
}

//=========================================================

using System.Collections.Generic;
using System.Threading;

namespace DotNetExpansions.ParallelSearch
{
   /// <summary>
   /// Holds the parameters that are passed to the search threads.
   /// </summary>
   /// <remarks>
   /// A method started through the thread pool
   /// may only have a single parameter of type Object.
   /// This class works around that restriction by bundling all values into one object.
   /// </remarks>
   internal class SearchTaskParameters
   {
      #region Constructors

      /// <summary>
      /// Initializes a new instance of the <see cref="SearchTaskParameters"/> class.
      /// </summary>
      /// <param name="containers">The search sets.</param>
      /// <param name="doneEvent">The <see cref="AutoResetEvent"/>
      /// that the thread signals when it has finished.</param>
      internal SearchTaskParameters(
         IEnumerable<SearchSet> containers, AutoResetEvent doneEvent)
      {
         Containers = containers;
         DoneEvent = doneEvent;
      }

      #endregion Constructors
      #region Properties

      internal AutoResetEvent DoneEvent { get; private set; }
      internal IEnumerable<SearchSet> Containers { get; private set; }

      #endregion Properties
   }
}
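
The parameter-container pattern used by SearchTaskParameters can also be shown in isolation. The following is a minimal sketch under assumptions, not the framework's code: WorkItemArgs and QueueSketch are hypothetical names, and ManualResetEvent stands in for the AutoResetEvent used above. It bundles the values for one work item into a single object, passes it to ThreadPool.QueueUserWorkItem, and waits until every work item has signalled its event.

```csharp
using System;
using System.Threading;

// Hypothetical parameter container: bundles everything one work item needs,
// because a thread pool callback receives a single object argument.
sealed class WorkItemArgs
{
    public WorkItemArgs(string name, ManualResetEvent done)
    {
        Name = name;
        Done = done;
    }

    public string Name { get; private set; }
    public ManualResetEvent Done { get; private set; }
}

static class QueueSketch
{
    private static int completed;

    private static void Work(object state)
    {
        var args = (WorkItemArgs)state;
        try
        {
            // Stand-in for the real work on args.Name.
            Interlocked.Increment(ref completed);
        }
        finally
        {
            // Signal in finally so that the waiting thread cannot hang.
            args.Done.Set();
        }
    }

    public static int RunAll(string[] names)
    {
        completed = 0;
        var events = new ManualResetEvent[names.Length];
        for (int i = 0; i < names.Length; i++)
        {
            events[i] = new ManualResetEvent(false);
            ThreadPool.QueueUserWorkItem(Work, new WorkItemArgs(names[i], events[i]));
        }
        // WaitAll rejects an empty array, so guard against it.
        if (events.Length > 0)
            WaitHandle.WaitAll(events);
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(RunAll(new[] { @"C:\", @"D:\" })); // prints 2
    }
}
```

Setting the event in a finally block is the important design choice here: a work item that throws would otherwise leave the caller blocked in WaitAll.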
Filed under parallel, suche, dateisuche, datei.
