Maison > Questions et réponses > le corps du texte
P粉8850351142023-08-26 00:50:59
J'en ai besoin pour un projet C#, voici donc le Code Python mentionné ci-dessus. Assurez-vous d'inclure using System.Text.RegularExpressions;
dans le fichier source.
private string GetIndefiniteArticle(string noun_phrase) { string word = null; var m = Regex.Match(noun_phrase, @"\w+"); if (m.Success) word = m.Groups[0].Value; else return "an"; var wordi = word.ToLower(); foreach (string anword in new string[] { "euler", "heir", "honest", "hono" }) if (wordi.StartsWith(anword)) return "an"; if (wordi.StartsWith("hour") && !wordi.StartsWith("houri")) return "an"; var char_list = new char[] { 'a', 'e', 'd', 'h', 'i', 'l', 'm', 'n', 'o', 'r', 's', 'x' }; if (wordi.Length == 1) { if (wordi.IndexOfAny(char_list) == 0) return "an"; else return "a"; } if (Regex.Match(word, "(?!FJO|[HLMNS]Y.|RY[EO]|SQU|(F[LR]?|[HL]|MN?|N|RH?|S[CHKLMNPTVW]?|X(YL)?)[AEIOU])[FHLMNRSX][A-Z]").Success) return "an"; foreach (string regex in new string[] { "^e[uw]", "^onc?e\b", "^uni([^nmd]|mo)", "^u[bcfhjkqrst][aeiou]" }) { if (Regex.IsMatch(wordi, regex)) return "a"; } if (Regex.IsMatch(word, "^U[NK][AIEO]")) return "a"; else if (word == word.ToUpper()) { if (wordi.IndexOfAny(char_list) == 0) return "an"; else return "a"; } if (wordi.IndexOfAny(new char[] { 'a', 'e', 'i', 'o', 'u' }) == 0) return "an"; if (Regex.IsMatch(wordi, "^y(b[lor]|cl[ea]|fere|gg|p[ios]|rou|tt)")) return "an"; return "a"; }
P粉9330033502023-08-26 00:45:17
Ce que vous voulez, c'est identifier l'article indéfini approprié. Lingua::EN:: Inflect
est un module Perl qui fonctionne bien. J'ai extrait le code correspondant et l'ai collé ci-dessous. Il ne s'agit que d'un tas de cas et de quelques expressions régulières, donc le portage vers PHP ne devrait pas être difficile. Un ami l'a porté sur Python Si quelqu'un est intéressé, cliquez ici.
# 2. INDEFINITE ARTICLES # THIS PATTERN MATCHES STRINGS OF CAPITALS STARTING WITH A "VOWEL-SOUND" # CONSONANT FOLLOWED BY ANOTHER CONSONANT, AND WHICH ARE NOT LIKELY # TO BE REAL WORDS (OH, ALL RIGHT THEN, IT'S JUST MAGIC!) my $A_abbrev = q{ (?! FJO | [HLMNS]Y. | RY[EO] | SQU | ( F[LR]? | [HL] | MN? | N | RH? | S[CHKLMNPTVW]? | X(YL)?) [AEIOU]) [FHLMNRSX][A-Z] }; # THIS PATTERN CODES THE BEGINNINGS OF ALL ENGLISH WORDS BEGINING WITH A # 'y' FOLLOWED BY A CONSONANT. ANY OTHER Y-CONSONANT PREFIX THEREFORE # IMPLIES AN ABBREVIATION. my $A_y_cons = 'y(b[lor]|cl[ea]|fere|gg|p[ios]|rou|tt)'; # EXCEPTIONS TO EXCEPTIONS my $A_explicit_an = enclose join '|', ( "euler", "hour(?!i)", "heir", "honest", "hono", ); my $A_ordinal_an = enclose join '|', ( "[aefhilmnorsx]-?th", ); my $A_ordinal_a = enclose join '|', ( "[bcdgjkpqtuvwyz]-?th", ); sub A { my ($str, $count) = @_; my ($pre, $word, $post) = ( $str =~ m/\A(\s*)(?:an?\s+)?(.+?)(\s*)\Z/i ); return $str unless $word; my $result = _indef_article($word,$count); return $pre.$result.$post; } sub AN { goto &A } sub _indef_article { my ( $word, $count ) = @_; $count = $persistent_count if !defined($count) && defined($persistent_count); return "$count $word" if defined $count && $count!~/^($PL_count_one)$/io; # HANDLE USER-DEFINED VARIANTS my $value; return "$value $word" if defined($value = ud_match($word, @A_a_user_defined)); # HANDLE ORDINAL FORMS $word =~ /^($A_ordinal_a)/i and return "a $word"; $word =~ /^($A_ordinal_an)/i and return "an $word"; # HANDLE SPECIAL CASES $word =~ /^($A_explicit_an)/i and return "an $word"; $word =~ /^[aefhilmnorsx]$/i and return "an $word"; $word =~ /^[bcdgjkpqtuvwyz]$/i and return "a $word"; # HANDLE ABBREVIATIONS $word =~ /^($A_abbrev)/ox and return "an $word"; $word =~ /^[aefhilmnorsx][.-]/i and return "an $word"; $word =~ /^[a-z][.-]/i and return "a $word"; # HANDLE CONSONANTS $word =~ /^[^aeiouy]/i and return "a $word"; # HANDLE SPECIAL VOWEL-FORMS $word =~ /^e[uw]/i and return "a $word"; $word =~ /^onc?e\b/i and return "a $word"; $word =~ /^uni([^nmd]|mo)/i and return "a $word"; $word =~ /^ut[th]/i and return "an $word"; $word =~ /^u[bcfhjkqrst][aeiou]/i and return "a $word"; # HANDLE SPECIAL CAPITALS $word =~ /^U[NK][AIEO]?/ and return "a $word"; # HANDLE VOWELS $word =~ /^[aeiou]/i and return "an $word"; # HANDLE y... (BEFORE CERTAIN CONSONANTS IMPLIES (UNNATURALIZED) "i.." SOUND) $word =~ /^($A_y_cons)/io and return "an $word"; # OTHERWISE, GUESS "a" return "a $word"; }