The hyphen must not be in the “Remove the following characters” list, but it can still be removed selectively from within author names without affecting the separator between different names. Here is a regex that does it.
[authors:regex("\W*(\w+?)(?=[A-Z])","$1-")]_[shortyear]
Explanation:
- Find 0 or more non-word characters
- followed by 1 word-character
- and as few additional word characters as needed (a lazy match, so it stops as early as possible)
- until the next character coming up is an uppercase letter in [A-Z] (the uppercase character itself is not part of the match).
- Drop non-word characters, if any, from the beginning of the match (to remove hyphens and any other non-word characters within names).
- Keep the remaining word characters (the author name) and add “-” at the end of each match (to delineate author names).
- Add “_” and shortyear after the list of delineated names.
Example: Nazzal, Sharif Q. and Al-Dubai, Mohammed and Mounir, Ragia and Ali, Sherif and Mounir, Mohamed
becomes
Nazzal-AlDubai-Mounir-Ali-Mounir_21
Notice that “Al-Dubai” has changed to “AlDubai”, making it impossible to mistake the hyphenated name for two different author names, while new dashes have been inserted between the author names.
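If you want to try the expression outside JabRef, here is a minimal Python sketch of the same replacement. It assumes that [authors] expands to the last names run together with nothing in between (so "NazzalAl-DubaiMounirAliMounir" for the example above); the function name join_names is made up for illustration, and Python writes the group reference as \1 where the citation key pattern uses $1.

```python
import re

# Pattern from the post, unchanged.
PATTERN = r"\W*(\w+?)(?=[A-Z])"


def join_names(concatenated_last_names: str) -> str:
    """Drop leading non-word characters from each match and append '-' to it."""
    return re.sub(PATTERN, r"\1-", concatenated_last_names)


# Assumption: this is what [authors] hands to the regex modifier.
authors = "NazzalAl-DubaiMounirAliMounir"
key = join_names(authors) + "_21"  # "_21" stands in for _[shortyear]
print(key)  # Nazzal-AlDubai-Mounir-Ali-Mounir_21
```

The last name in the string is never matched, because no uppercase letter follows it, which is why there is no trailing dash before the underscore.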
Note that the regex is still incomplete, because the set [A-Z] does not match accented characters such as Ü, É and Ô.
@Yunfan_He, you can add ranges of additional characters, if you are familiar with them. Alternatively, you can add individual characters this way as you discover imperfections in the citation key expression:
[authors:regex("\W*(\w+?)(?=[A-ZÜÉÔ])","$1-")]_[shortyear]
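To see what the extended set changes, here is a rough Python sketch comparing [A-Z], the individual characters from the expression above, and a range-based set. The last names are invented for illustration, and the ranges À-Ö and Ø-Þ (the Latin-1 uppercase letters) are my own assumption of what "ranges of additional characters" could look like, not something tested in JabRef. One caveat: Python treats accented letters as word characters in \w, while Java's regex engine does so only when Unicode character classes are enabled, so JabRef may treat a leading Ü differently than this sketch does.

```python
import re


def join_names(names: str, uppercase: str = "A-Z") -> str:
    """Apply the post's pattern with a configurable uppercase character set."""
    pattern = r"\W*(\w+?)(?=[" + uppercase + r"])"
    return re.sub(pattern, r"\1-", names)


# Hypothetical last names, concatenated the way [authors] is assumed to do it.
names = "SmithÜnalÉmond"

plain = join_names(names)                 # [A-Z] only: no boundary is found
listed = join_names(names, "A-ZÜÉÔ")      # individual characters, as in the post
ranged = join_names(names, "A-ZÀ-ÖØ-Þ")   # Latin-1 uppercase ranges (assumption)

print(plain)   # SmithÜnalÉmond
print(listed)  # Smith-Ünal-Émond
print(ranged)  # Smith-Ünal-Émond
```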
Edit: Note that if a name contains more than one run of non-word characters followed by word characters, the expression will only replace the last one. The “Remove the following characters” option will take care of non-hyphens, but the expression will only remove the last hyphen in a multi-hyphenated name. Mixed-case names also pose a challenge. Here are some examples.
O'Brien-MacDonald, Alice and Double-Hyphen-Me, Bob and O'Connor-Callahan, Cameron
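As a rough check of these cases, the following Python sketch runs the original expression over the three last names above. It assumes [authors] passes them to the regex concatenated and otherwise unchanged (apostrophes included), which may not be exactly what JabRef does, so treat the output as what the regex itself produces rather than as a confirmed citation key.

```python
import re

# Pattern from the post, unchanged.
PATTERN = r"\W*(\w+?)(?=[A-Z])"

# Assumption: the three last names arrive concatenated and unmodified.
names = "O'Brien-MacDonald" + "Double-Hyphen-Me" + "O'Connor-Callahan"

result = re.sub(PATTERN, r"\1-", names)
print(result)  # O'BrienMac-Donald-Double-HyphenMe-O'Connor-Callahan
```

In this output “MacDonald” is split into “Mac-Donald” because of the uppercase D, only the last hyphen of “Double-Hyphen-Me” is removed, and the hyphen in the final name stays because no uppercase letter follows it.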