C# - Removing stop words from a text (English)

Published on 30.07.2006
(3 ratings)
In full-text search, words such as "most" or "each" often get in the way. This function removes them from a text.

Required namespaces:
System.Collections.Generic
System.Text
System.Text.RegularExpressions

/// <summary>
/// Removes all English stop words from the given string.
/// </summary>
/// <param name="orginaltext">the original text</param>
/// <returns>the original text without stop words, punctuation, or special characters</returns>
private string StoppwörterEntfernen(string orginaltext)
{
    // this array contains all stop words
    string[] arrStopwörter = { "about", "above", "across", "after", "afterwards", "again", "against", "albeit", "all", "almost", "alone", "along", "already", "also", "although", "always", "among", "amongst", "and", "another", "any", "anyhow", "anyone", "anything", "anywhere", "are", "around", "became", "because", "become", "becomes", "becoming", "been", "before", "beforehand", "behind", "being", "below", "beside", "besides", "between", "beyond", "both", "but", "cannot", "comprises", "corresponding", "could", "described", "desired", "does", "down", "during", "each", "either", "else", "elsewhere", "enough", "etc", "even", "ever", "every", "everyone", "everything", "everywhere", "except", "few", "first", "for", "former", "formerly", "from", "further", "generally", "had", "has", "have", "having", "hence", "her", "here", "hereafter", "hereby", "herein", "hereupon", "hers", "herself", "him", "himself", "his", "how", "however", "indeed", "into", "its", "itself", "last", "latter", "latterly", "least", "less", "many", "may", "means", "meanwhile", "might", "more", "moreover", "most", "mostly", "much", "must", "myself", "namely", "neither", "never", "nevertheless", "next", "nobody", "none", "noone", "nor", "not", "nothing", "now", "nowhere", "off", "often", "once", "one", "only", "onto", "other", "others", "otherwise", "our", "ours", "ourselves", "out", "over", "own", "particularly", "per", "perhaps", "preferably", "preferred", "present", "rather", "relatively", "respectively", "said", "same", "seem", "seemed", "seeming", "seems", "several", "she", "should", "since", "some", "somehow", "someone", "something", "sometime", "sometimes", "somewhere", "still", "such", "suitable", "than", "that", "the", "their", "them", "themselves", "then", "thence", "there", "thereafter", "thereby", "therefor", "therefore", "therein", "thereof", "thereto", "thereupon", "these", "they", "this", "those", "though", "through", "throughout", "thru", "thus", "together", "too", "toward", "towards",
"under", "until", "upon", "use", "various", "very", "was", "well", "were", "what", "whatever", "whatsoever", "when", "whence", "whenever", "whensoever", "where", "whereafter", "whereas", "whereat", "whereby", "wherefrom", "wherein", "whereinto", "whereof", "whereon", "whereto", "whereunto", "whereupon", "wherever", "wherewith", "whether", "which", "whichever", "whichsoever", "while", "whilst", "whither", "who", "whoever", "whole", "whom", "whomever", "whomsoever", "whose", "whosoever", "why", "will", "with", "within", "without", "would", "yet", "you", "your", "yours", "yourself", "yourselves" };

    // create the dictionary used for fast stop-word lookups
    Dictionary<string, bool> dicStöppwörter = new Dictionary<string, bool>();
    
    // fill the dictionary
    foreach (string stoppwort in arrStopwörter)
        dicStöppwörter.Add(stoppwort, true);
 
    // split the original text into a string array; all non-word characters are removed
    string[] arrOrginaltext = Regex.Split(orginaltext, @"\W+");

    // create the StringBuilder for the return value
    StringBuilder rückgabe = new StringBuilder();

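    // append only the non-stop words to the result; skip the empty entries that
    // Regex.Split yields when the text starts or ends with a non-word character
    foreach (string wort in arrOrginaltext)
        if (wort.Length > 0 && !dicStöppwörter.ContainsKey(wort.ToLowerInvariant()))
            rückgabe.Append(wort).Append(' ');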

    // drop the trailing space left behind by the last appended word
    return rückgabe.ToString().TrimEnd();
}
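
The same idea can be sketched more compactly on newer framework versions. The following is only an illustrative variant, not the snippet author's code: it assumes .NET 4 or later (for HashSet<string> and the string.Join overload used here), the class name StopWordFilter and the method name RemoveStopWords are hypothetical, and the five-word list is a shortened sample standing in for the full array above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StopWordFilter
{
    // Hypothetical shortened sample list; the full array from the snippet
    // above would go here. The comparer makes lookups case-insensitive,
    // so no explicit lower-casing of each word is needed.
    private static readonly HashSet<string> StopWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "most", "each", "the", "and", "with"
        };

    public static string RemoveStopWords(string text)
    {
        List<string> kept = new List<string>();

        // Split on runs of non-word characters, as the original snippet does.
        foreach (string word in Regex.Split(text, @"\W+"))
        {
            // Regex.Split yields empty strings when the text starts or ends
            // with punctuation; skip those along with the stop words.
            if (word.Length > 0 && !StopWords.Contains(word))
                kept.Add(word);
        }

        // Join puts separators only between words, so there is no trailing space.
        return string.Join(" ", kept);
    }
}
```

With this sample list, calling StopWordFilter.RemoveStopWords("Each user, and the admin: most files!") should return "user admin files". Holding the set in a static field also avoids rebuilding the lookup structure on every call, which the original function does each time it runs.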
Filed under: stop word, stop words, search, full-text search.
