API Reference
Table of Contents | |
---|---|
Functions: | |
emojize() | Replace emoji names with Unicode codes |
demojize() | Replace Unicode emoji with emoji shortcodes |
analyze() | Find Unicode emoji in a string |
replace_emoji() | Replace Unicode emoji with a customizable string |
emoji_list() | Location of all emoji in a string |
distinct_emoji_list() | Distinct list of emojis in the string |
emoji_count() | Number of emojis in a string |
is_emoji() | Check if a string/character is a single emoji |
purely_emoji() | Check if a string contains only emojis |
version() | Find Unicode/Emoji version of an emoji |
Module variables: | |
EMOJI_DATA | Dict of all emoji |
STATUS | Dict of Unicode/Emoji status |
config | Module wide configuration |
Classes: | |
EmojiMatch | |
EmojiMatchZWJ | |
EmojiMatchZWJNonRGI | |
Token | |
- class emoji.EmojiMatch(emoji: str, start: int, end: int, data: Dict[str, Any] | None)
Represents a match of a “recommended for general interchange” (RGI) emoji in a string.
- data
The entry from EMOJI_DATA for this emoji, or None if the emoji is non-RGI
- data_copy() -> Dict[str, Any]
Returns a copy of the data from EMOJI_DATA for this match with the additional keys match_start and match_end.
- emoji
The emoji substring
- end
The end index of the match in the string
- is_zwj() -> bool
Checks if this is a ZWJ-emoji.
- Returns:
True if this is a ZWJ-emoji, False otherwise
- split() -> EmojiMatchZWJ | EmojiMatch
Splits a ZWJ-emoji into its constituents.
- Returns:
An EmojiMatchZWJ containing the “sub-emoji” if this is a ZWJ-emoji, otherwise self
- start
The start index of the match in the string
- class emoji.EmojiMatchZWJ(match: EmojiMatch)
Represents a match of multiple emoji in a string that were joined by zero-width-joiners (ZWJ/\u200D).
- emojis: List[EmojiMatch]
List of sub emoji as EmojiMatch objects
- is_zwj() -> bool
Checks if this is a ZWJ-emoji.
- Returns:
True if this is a ZWJ-emoji, False otherwise
- split() -> EmojiMatchZWJ
Splits a ZWJ-emoji into its constituents.
- Returns:
An EmojiMatchZWJ containing the “sub-emoji” if this is a ZWJ-emoji, otherwise self
- class emoji.EmojiMatchZWJNonRGI(first_emoji_match: EmojiMatch, second_emoji_match: EmojiMatch)
Represents a match of multiple emoji in a string that were joined by zero-width-joiners (ZWJ/\u200D). This class is only used for emoji that are not “recommended for general interchange” (non-RGI) by Unicode.org. The data property of this class is always None.
- emojis: List[EmojiMatch]
List of sub emoji as EmojiMatch objects
- class emoji.Token(chars: str, value: str | EmojiMatch)
A named tuple containing the matched string and either its EmojiMatch object, if it is an emoji, or the character itself, if it is a single character that is not a Unicode emoji.
- chars: str
Alias for field number 0
- value: str | EmojiMatch
Alias for field number 1
- emoji.analyze(string: str, non_emoji: bool = False, join_emoji: bool = True) -> Iterator[Token]
Find Unicode emoji in a string. Yield each emoji as a named tuple Token(chars, EmojiMatch) or Token(chars, EmojiMatchZWJNonRGI). If non_emoji is True, also yield all other characters as Token(char, char).
- Parameters:
string – String to analyze
non_emoji – If True, also yield all non-emoji characters as Token(char, char)
join_emoji – If True, multiple EmojiMatch are merged into a single EmojiMatchZWJNonRGI if they are separated only by a ZWJ.
- class emoji.config
Module-wide configuration
- demojize_keep_zwj = True
Change the behavior of emoji.demojize() regarding zero-width-joiners (ZWJ/\u200D) in emoji that are not “recommended for general interchange” (non-RGI). It has no effect on RGI emoji.
For example, this family emoji with different skin tones “👨👩🏿👧🏻👦🏾” contains four person emoji that are joined together by three ZWJ characters:
👨\u200D👩🏿\u200D👧🏻\u200D👦🏾
If True, the zero-width-joiners will be kept and emoji.emojize() can reverse the emoji.demojize() operation:
emoji.emojize(emoji.demojize(s)) == s
The example emoji would be converted to
:man:\u200d:woman_dark_skin_tone:\u200d:girl_light_skin_tone:\u200d:boy_medium-dark_skin_tone:
If False, the zero-width-joiners will be removed and emoji.emojize() can only reverse the individual emoji:
emoji.emojize(emoji.demojize(s)) != s
The example emoji would be converted to
:man::woman_dark_skin_tone::girl_light_skin_tone::boy_medium-dark_skin_tone:
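To make the structure of such a sequence concrete, this sketch builds the example family emoji from its parts using only the Python standard library. It does not call the emoji package, and the names ZWJ, people and family are illustrative.

```python
ZWJ = "\u200d"  # zero-width joiner

# Man, woman (dark skin tone), girl (light skin tone), boy (medium-dark skin tone).
people = [
    "\U0001F468",
    "\U0001F469\U0001F3FF",
    "\U0001F467\U0001F3FB",
    "\U0001F466\U0001F3FE",
]

# Joining four person emoji needs three joiners.
family = ZWJ.join(people)

print(family.count(ZWJ))
print(family.split(ZWJ) == people)
```

Removing the joiners, as demojize_keep_zwj = False does, leaves four independent emoji, which is why the original sequence can no longer be reconstructed.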
- static load_language(language: List[str] | str | None = None)
Load one or multiple languages into memory. If no language is specified, all languages will be loaded.
This makes language data accessible in the EMOJI_DATA dict. For example, to access a French emoji name, first load French with
emoji.config.load_language('fr')
and then access it with
emoji.EMOJI_DATA['🏄']['fr']
Available languages are listed in LANGUAGES
- replace_emoji_keep_zwj = False
Change the behavior of emoji.replace_emoji() regarding zero-width-joiners (ZWJ/\u200D) in emoji that are not “recommended for general interchange” (non-RGI). It has no effect on RGI emoji.
See config.demojize_keep_zwj for more information.
- emoji.demojize(string: str, delimiters: Tuple[str, str] = (':', ':'), language: str = 'en', version: float | None = None, handle_version: str | Callable[[str, Dict[str, str]], str] | None = None) -> str
Replace Unicode emoji in a string with emoji shortcodes. Useful for storage.
>>> import emoji
>>> print(emoji.emojize("Python is fun :thumbs_up:"))
Python is fun 👍
>>> print(emoji.demojize("Python is fun 👍"))
Python is fun :thumbs_up:
>>> print(emoji.demojize("Unicode is tricky 😯", delimiters=("__", "__")))
Unicode is tricky __hushed_face__
- Parameters:
string – String containing Unicode characters. MUST BE UNICODE.
delimiters – (optional) Use delimiters other than _DEFAULT_DELIMITER
language – Choose language of emoji name: language code ‘es’, ‘de’, etc. or ‘alias’ to use English aliases
version – (optional) Max version. If set to an Emoji Version, all emoji above this version will be removed.
handle_version – (optional) Replace the emoji above version instead of removing it. handle_version can be either a string or a callable handle_version(emj: str, data: dict) -> str. If it is a callable, it’s passed the Unicode emoji and the data dict from EMOJI_DATA and must return a replacement string to be used. The passed data is in the form of:
handle_version('\U0001F6EB', {
    'en' : ':airplane_departure:',
    'status' : fully_qualified,
    'E' : 1,
    'alias' : [':flight_departure:'],
    'de': ':abflug:',
    'es': ':avión_despegando:',
    ...
})
- emoji.distinct_emoji_list(string: str) -> List[str]
Returns a distinct list of emojis from the string.
- emoji.emoji_count(string: str, unique: bool = False) -> int
Returns the count of emojis in a string.
- Parameters:
unique – (optional) If True, count only unique emojis
- emoji.emoji_list(string: str) -> List[_EmojiListReturn]
Returns the location of each emoji in the string, together with the emoji, as a list of dicts.
>>> emoji.emoji_list("Hi, I am fine. 😁")
[{'match_start': 15, 'match_end': 16, 'emoji': '😁'}]
- emoji.emojize(string: str, delimiters: Tuple[str, str] = (':', ':'), variant: Literal['text_type', 'emoji_type'] | None = None, language: str = 'en', version: float | None = None, handle_version: str | Callable[[str, Dict[str, str]], str] | None = None) -> str
Replace emoji names in a string with Unicode codes.
>>> import emoji
>>> print(emoji.emojize("Python is fun :thumbsup:", language='alias'))
Python is fun 👍
>>> print(emoji.emojize("Python is fun :thumbs_up:"))
Python is fun 👍
>>> print(emoji.emojize("Python is fun {thumbs_up}", delimiters = ("{", "}")))
Python is fun 👍
>>> print(emoji.emojize("Python is fun :red_heart:", variant="text_type"))
Python is fun ❤
>>> print(emoji.emojize("Python is fun :red_heart:", variant="emoji_type"))
Python is fun ❤️ # red heart, not black heart
- Parameters:
string – String containing emoji names.
delimiters – (optional) Use delimiters other than _DEFAULT_DELIMITER. Each delimiter should contain at least one character that is not part of a-zA-Z0-9 and _-&.()!?#*+,. See emoji.core._EMOJI_NAME_PATTERN for the regular expression of unsafe characters.
variant – (optional) Choose variation selector between “base” (None), VS-15 (“text_type”) and VS-16 (“emoji_type”)
language – Choose language of emoji name: language code ‘es’, ‘de’, etc. or ‘alias’ to use English aliases
version – (optional) Max version. If set to an Emoji Version, all emoji above this version will be ignored.
handle_version – (optional) Replace the emoji above version instead of ignoring it. handle_version can be either a string or a callable. If it is a callable, it’s passed the Unicode emoji and the data dict from EMOJI_DATA and must return a replacement string to be used:
handle_version('\U0001F6EB', {
    'en' : ':airplane_departure:',
    'status' : fully_qualified,
    'E' : 1,
    'alias' : [':flight_departure:'],
    'de': ':abflug:',
    'es': ':avión_despegando:',
    ...
})
- Raises:
ValueError – if variant is neither None, ‘text_type’ nor ‘emoji_type’
- emoji.is_emoji(string: str) -> bool
Returns True if the string is a single emoji, and it is “recommended for general interchange” by Unicode.org.
- emoji.purely_emoji(string: str) -> bool
Returns True if the string contains only emojis. This does not imply that is_emoji() is True for every character in the string, for example if the string contains variation selectors.
- emoji.replace_emoji(string: str, replace: str | Callable[[str, Dict[str, str]], str] = '', version: float = -1) -> str
Replace Unicode emoji in a string with a customizable string.
- Parameters:
string – String containing Unicode characters. MUST BE UNICODE.
replace – (optional) replace can be either a string or a callable. If it is a callable, it’s passed the Unicode emoji and the data dict from EMOJI_DATA and must return a replacement string to be used. replace(str, dict) -> str
version – (optional) Max version. If set to an Emoji Version, only emoji above this version will be replaced.
- emoji.version(string: str) -> float
Returns the Emoji Version of the emoji.
See https://www.unicode.org/reports/tr51/#Versioning for more information.
>>> emoji.version("😁")
0.6
>>> emoji.version(":butterfly:")
3
- Parameters:
string – An emoji or a text containing an emoji
- Raises:
ValueError – if string does not contain an emoji
EMOJI_DATA
- emoji.EMOJI_DATA: dict
Contains all emoji as keys, with their names, Unicode version and status as values.
The data is stored in JSON files: https://github.com/carpedm20/emoji/tree/master/emoji/unicode_codes
The names in languages other than English are not loaded by default. They can be loaded with the config.load_language() function.
EMOJI_DATA = {
    '🥇': {
        'en' : ':1st_place_medal:',
        'status' : emoji.STATUS["fully_qualified"],
        'E' : 3
    },
    ...
}

# After config.load_language() to load more languages:
EMOJI_DATA = {
    '🥇': {
        'en' : ':1st_place_medal:',
        'status' : emoji.STATUS["fully_qualified"],
        'E' : 3,
        'de': ':goldmedaille:',
        'es': ':medalla_de_oro:',
        'fr': ':médaille_d’or:',
        'pt': ':medalha_de_ouro:',
        'it': ':medaglia_d’oro:'
    },
    ...
}
Emoji status
- emoji.STATUS: dict
The status values that are used in emoji.EMOJI_DATA.
For more information on the meaning of these values see http://www.unicode.org/reports/tr51/#Emoji_Implementation_Notes
component = 1
fully_qualified = 2
minimally_qualified = 3
unqualified = 4

STATUS: Dict[str, int] = {
    'component': component,
    'fully_qualified': fully_qualified,
    'minimally_qualified': minimally_qualified,
    'unqualified': unqualified,
}
- emoji.LANGUAGES: list
All available languages that can be used as the language parameter in emojize() and demojize(). (Additionally, the special "alias" language can be used in emojize() and demojize().)
LANGUAGES: List[str] = [
    'en', 'es', 'ja', 'ko', 'pt', 'it', 'fr',
    'de', 'fa', 'id', 'zh', 'ru', 'tr', 'ar',
]
Emoji version
Every emoji in emoji.EMOJI_DATA has a version number. The number refers to the release of that emoji in the Unicode Standard. It is stored in the key 'E'. For example, the emoji 🥇 :1st_place_medal: is version E3.0, that is Emoji 3.0 or Unicode 9.0:
>>> emoji.EMOJI_DATA['🥇']['E']
3
For more information see http://www.unicode.org/reports/tr51/#Versioning
The following table lists all versions. The number that is used in emoji.EMOJI_DATA is shown in the “Data File Comment” column:
---------------+-------------+------------------+-------------------+
Emoji Version | Date | Unicode Version | Data File Comment |
---------------+-------------+------------------+-------------------+
N/A | 2010-10-11 | Unicode 6.0 | E0.6 |
N/A | 2014-06-16 | Unicode 7.0 | E0.7 |
Emoji 1.0 | 2015-06-09 | Unicode 8.0 | E1.0 |
Emoji 2.0 | 2015-11-12 | Unicode 8.0 | E2.0 |
Emoji 3.0 | 2016-06-03 | Unicode 9.0 | E3.0 |
Emoji 4.0 | 2016-11-22 | Unicode 9.0 | E4.0 |
Emoji 5.0 | 2017-06-20 | Unicode 10.0 | E5.0 |
Emoji 11.0 | 2018-05-21 | Unicode 11.0 | E11.0 |
Emoji 12.0 | 2019-03-05 | Unicode 12.0 | E12.0 |
Emoji 12.1 | 2019-10-21 | Unicode 12.1 | E12.1 |
Emoji 13.0 | 2020-03-10 | Unicode 13.0 | E13.0 |
Emoji 13.1 | 2020-09-15 | Unicode 13.0 | E13.1 |
Emoji 14.0 | 2021-09-14 | Unicode 14.0 | E14.0 |
Emoji 15.0 | 2022-09-13 | Unicode 15.0 | E15.0 |
Emoji 15.1 | 2023-09-12 | Unicode 15.1 | E15.1 |
Emoji 16.0 | 2024-09-10 | Unicode 16.0 | E16.0 |
http://www.unicode.org/reports/tr51/#Versioning
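If the Unicode version is needed for a given 'E' value, the table can be transcribed into a lookup. The sketch below uses only the standard library; the names UNICODE_VERSION_BY_E and unicode_version are hypothetical and not part of the emoji package.

```python
# Rows copied from the table above: value of 'E' -> Unicode version.
UNICODE_VERSION_BY_E = {
    0.6: "6.0", 0.7: "7.0", 1.0: "8.0", 2.0: "8.0",
    3.0: "9.0", 4.0: "9.0", 5.0: "10.0", 11.0: "11.0",
    12.0: "12.0", 12.1: "12.1", 13.0: "13.0", 13.1: "13.0",
    14.0: "14.0", 15.0: "15.0", 15.1: "15.1", 16.0: "16.0",
}

def unicode_version(e_value: float) -> str:
    """Return the Unicode version for an 'E' value (hypothetical helper)."""
    # float() lets integer values such as 3 match the key 3.0.
    return UNICODE_VERSION_BY_E[float(e_value)]

print(unicode_version(3))     # 'E' value of 🥇
print(unicode_version(13.1))
```

Note that several Emoji versions share one Unicode version, so the mapping cannot be inverted.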