C# - Removing stop words from a text (English)

Published on 30.07.2006
(3 ratings)
In full-text search, common words such as "most" or "each" often get in the way; this function removes them.

Required namespaces:
System.Collections.Generic
System.Text
System.Text.RegularExpressions

/// <summary>
/// Removes all English stop words from the given string.
/// </summary>
/// <param name="orginaltext">the original text</param>
/// <returns>the original text without stop words and without punctuation or special characters</returns>
private string StoppwörterEntfernen(string orginaltext)
{
    // this array contains all stop words
    string[] arrStopwörter = { "about", "above", "across", "after", "afterwards", "again", "against", "albeit", "all", "almost", "alone", "along", "already", "also", "although", "always", "among", "amongst", "and", "another", "any", "anyhow", "anyone", "anything", "anywhere", "are", "around", "became", "because", "become", "becomes", "becoming", "been", "before", "beforehand", "behind", "being", "below", "beside", "besides", "between", "beyond", "both", "but", "cannot", "comprises", "corresponding", "could", "described", "desired", "does", "down", "during", "each", "either", "else", "elsewhere", "enough", "etc", "even", "ever", "every", "everyone", "everything", "everywhere", "except", "few", "first", "for", "former", "formerly", "from", "further", "generally", "had", "has", "have", "having", "hence", "her", "here", "hereafter", "hereby", "herein", "hereupon", "hers", "herself", "him", "himself", "his", "how", "however", "indeed", "into", "its", "itself", "last", "latter", "latterly", "least", "less", "many", "may", "means", "meanwhile", "might", "more", "moreover", "most", "mostly", "much", "must", "myself", "namely", "neither", "never", "nevertheless", "next", "nobody", "none", "noone", "nor", "not", "nothing", "now", "nowhere", "off", "often", "once", "one", "only", "onto", "other", "others", "otherwise", "our", "ours", "ourselves", "out", "over", "own", "particularly", "per", "perhaps", "preferably", "preferred", "present", "rather", "relatively", "respectively", "said", "same", "seem", "seemed", "seeming", "seems", "several", "she", "should", "since", "some", "somehow", "someone", "something", "sometime", "sometimes", "somewhere", "still", "such", "suitable", "than", "that", "the", "their", "them", "themselves", "then", "thence", "there", "thereafter", "thereby", "therefor", "therefore", "therein", "thereof", "thereto", "thereupon", "these", "they", "this", "those", "though", "through", "throughout", "thru", "thus", "together", "too", "toward", "towards",
"under", "until", "upon", "use", "various", "very", "was", "well", "were", "what", "whatever", "whatsoever", "when", "whence", "whenever", "whensoever", "where", "whereafter", "whereas", "whereat", "whereby", "wherefrom", "wherein", "whereinto", "whereof", "whereon", "whereto", "whereunto", "whereupon", "wherever", "wherewith", "whether", "which", "whichever", "whichsoever", "while", "whilst", "whither", "who", "whoever", "whole", "whom", "whomever", "whomsoever", "whose", "whosoever", "why", "will", "with", "within", "without", "would", "yet", "you", "your", "yours", "yourself", "yourselves" };

    // create the dictionary used for fast stop word lookups
    Dictionary<string, bool> dicStöppwörter = new Dictionary<string, bool>();

    // fill the dictionary
    foreach (string stoppwort in arrStopwörter)
        dicStöppwörter.Add(stoppwort, true);

    // split the original text into an array of words; all non-word characters are dropped
    string[] arrOrginaltext = Regex.Split(orginaltext, @"\W+");

    // create a StringBuilder for the return value
    StringBuilder rückgabe = new StringBuilder();

    // append only the words that are not stop words (compared in lower case);
    // Regex.Split yields empty strings when the text starts or ends with a
    // non-word character, so those are skipped as well
    foreach (string wort in arrOrginaltext)
        if (wort.Length > 0 && !dicStöppwörter.ContainsKey(wort.ToLowerInvariant()))
            rückgabe.Append(wort + " ");

    return rückgabe.ToString();
}
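
The same idea can be sketched more compactly with a HashSet that compares case-insensitively. This is only an illustration, assuming .NET 4 or later; the class name StopWordFilter, the method name Remove, and the shortened word list are hypothetical and not part of the original snippet.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StopWordFilter
{
    // Shortened list for illustration only (assumption); the snippet above uses a much longer one.
    private static readonly HashSet<string> StopWords = new HashSet<string>(
        new[] { "the", "each", "most", "and", "are" },
        StringComparer.OrdinalIgnoreCase);

    public static string Remove(string text)
    {
        var kept = new List<string>();

        // Split on runs of non-word characters, then keep every
        // non-empty token that is not a stop word.
        foreach (string word in Regex.Split(text, @"\W+"))
        {
            if (word.Length > 0 && !StopWords.Contains(word))
                kept.Add(word);
        }

        // Joining avoids the trailing space that appending "word + space" leaves behind.
        return string.Join(" ", kept);
    }

    public static void Main()
    {
        Console.WriteLine(Remove("Most of the cats are asleep.")); // prints "of cats asleep"
    }
}
```

Because the set is built once in a static field, repeated calls do not rebuild the lookup structure, which may matter when many documents are indexed.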
Filed under stop word, stop words, search, full-text search.
