API Reference¶

Table of Contents
Functions:
`emojize()`	Replace emoji names with Unicode codes
`demojize()`	Replace Unicode emoji with emoji shortcodes
`analyze()`	Find Unicode emoji in a string
`replace_emoji()`	Replace Unicode emoji with a customizable string
`emoji_list()`	Location of all emoji in a string
`distinct_emoji_list()`	Distinct list of emojis in the string
`emoji_count()`	Number of emojis in a string
`is_emoji()`	Check if a string/character is a single emoji
`purely_emoji()`	Check if a string contains only emojis
`version()`	Find Unicode/Emoji version of an emoji
Module variables:
`EMOJI_DATA`	Dict of all emoji
`STATUS`	Dict of Unicode/Emoji status
`config`	Module wide configuration
Classes:
`EmojiMatch`
`EmojiMatchZWJ`
`EmojiMatchZWJNonRGI`
`Token`

class emoji.EmojiMatch(emoji: str, start: int, end: int, data: Dict[str, Any] | None)[source]¶

Represents a match of a “recommended for general interchange” (RGI) emoji in a string.

data¶: The entry from EMOJI_DATA for this emoji or None if the emoji is non-RGI

data_copy() → Dict[str, Any][source]¶: Returns a copy of the data from EMOJI_DATA for this match with the additional keys match_start and match_end.

emoji¶: The emoji substring

end¶: The end index of the match in the string

is_zwj() → bool[source]¶

Checks if this is a ZWJ-emoji.

Returns:: True if this is a ZWJ-emoji, False otherwise

split() → EmojiMatchZWJ | EmojiMatch[source]¶

Splits a ZWJ-emoji into its constituents.

Returns:: An EmojiMatchZWJ containing the “sub-emoji” if this is a ZWJ-emoji, otherwise self

start¶: The start index of the match in the string

class emoji.EmojiMatchZWJ(match: EmojiMatch)[source]¶

Represents a match of multiple emoji in a string that were joined by zero-width-joiners (ZWJ/\u200D).

emojis: List[EmojiMatch]¶: List of sub emoji as EmojiMatch objects

is_zwj() → bool[source]¶

Checks if this is a ZWJ-emoji.

Returns:: True if this is a ZWJ-emoji, False otherwise

join() → str[source]¶: Joins a ZWJ-emoji into a string

split() → EmojiMatchZWJ[source]¶

Splits a ZWJ-emoji into its constituents.

Returns:: An EmojiMatchZWJ containing the “sub-emoji” if this is a ZWJ-emoji, otherwise self

class emoji.EmojiMatchZWJNonRGI(first_emoji_match: EmojiMatch, second_emoji_match: EmojiMatch)[source]¶

Represents a match of multiple emoji in a string that were joined by zero-width-joiners (ZWJ/\u200D). This class is only used for emoji that are not “recommended for general interchange” (non-RGI) by Unicode.org. The data property of this class is always None.

emojis: List[EmojiMatch]¶: List of sub emoji as EmojiMatch objects

class emoji.Token(chars: str, value: str | EmojiMatch)[source]¶

A named tuple containing the matched string and its value: an EmojiMatch object if the match is an emoji, or the character itself if it is a single character that is not a Unicode emoji.

chars: str¶: Alias for field number 0

value: str | EmojiMatch¶: Alias for field number 1

emoji.analyze(string: str, non_emoji: bool = False, join_emoji: bool = True) → Iterator[Token][source]¶

Find Unicode emoji in a string. Yield each emoji as a named tuple Token(chars, EmojiMatch) or Token(chars, EmojiMatchZWJNonRGI). If non_emoji is True, also yield all other characters as Token(char, char).

Parameters:

string – String to analyze
non_emoji – If True also yield all non-emoji characters as Token(char, char)
join_emoji – If True, multiple EmojiMatch are merged into a single EmojiMatchZWJNonRGI if they are separated only by a ZWJ.

class emoji.config[source]¶

Module-wide configuration

demojize_keep_zwj = True¶

Change the behavior of emoji.demojize() regarding zero-width-joiners (ZWJ/\u200D) in emoji that are not “recommended for general interchange” (non-RGI). It has no effect on RGI emoji.

For example this family emoji with different skin tones “👨‍👩🏿‍👧🏻‍👦🏾” contains four person emoji that are joined together by three ZWJ characters: 👨\u200D👩🏿\u200D👧🏻\u200D👦🏾

If True, the zero-width-joiners will be kept and emoji.emojize() can reverse the emoji.demojize() operation: emoji.emojize(emoji.demojize(s)) == s

The example emoji would be converted to :man:\u200d:woman_dark_skin_tone:\u200d:girl_light_skin_tone:\u200d:boy_medium-dark_skin_tone:

If False, the zero-width-joiners will be removed and emoji.emojize() can only reverse the individual emoji: emoji.emojize(emoji.demojize(s)) != s

The example emoji would be converted to :man::woman_dark_skin_tone::girl_light_skin_tone::boy_medium-dark_skin_tone:

static load_language(language: List[str] | str | None = None)[source]¶

Load one or multiple languages into memory. If no language is specified, all languages will be loaded.

This makes language data accessible in the EMOJI_DATA dict. For example to access a French emoji name, first load French with

emoji.config.load_language('fr')

and then access it with

emoji.EMOJI_DATA['🏄']['fr']

Available languages are listed in LANGUAGES

replace_emoji_keep_zwj = False¶

Change the behavior of emoji.replace_emoji() regarding zero-width-joiners (ZWJ/\u200D) in emoji that are not “recommended for general interchange” (non-RGI). It has no effect on RGI emoji.

See config.demojize_keep_zwj for more information.

emoji.demojize(string: str, delimiters: Tuple[str, str] = (':', ':'), language: str = 'en', version: float | None = None, handle_version: str | Callable[[str, Dict[str, str]], str] | None = None) → str[source]¶

Replace Unicode emoji in a string with emoji shortcodes. Useful for storage.

>>> import emoji
>>> print(emoji.emojize("Python is fun :thumbs_up:"))
Python is fun 👍
>>> print(emoji.demojize("Python is fun 👍"))
Python is fun :thumbs_up:
>>> print(emoji.demojize("Unicode is tricky 😯", delimiters=("__", "__")))
Unicode is tricky __hushed_face__

Parameters:

string – String containing Unicode characters. MUST BE UNICODE.
delimiters – (optional) User delimiters other than _DEFAULT_DELIMITER
language – Choose language of emoji name: language code ‘es’, ‘de’, etc. or ‘alias’ to use English aliases
version – (optional) Max version. If set to an Emoji Version, all emoji above this version will be removed.
handle_version –
(optional) Replace the emoji above version instead of removing it. handle_version can be either a string or a callable handle_version(emj: str, data: dict) -> str. If it is a callable, it is passed the Unicode emoji and the data dict from EMOJI_DATA and must return the replacement string. The data is passed in this form:
```
handle_version('\U0001F6EB', {
    'en' : ':airplane_departure:',
    'status' : fully_qualified,
    'E' : 1,
    'alias' : [':flight_departure:'],
    'de': ':abflug:',
    'es': ':avión_despegando:',
    ...
})
```

emoji.distinct_emoji_list(string: str) → List[str][source]¶: Returns distinct list of emojis from the string.

emoji.emoji_count(string: str, unique: bool = False) → int[source]¶

Returns the count of emojis in a string.

Parameters:: unique – (optional) If True, count only unique emojis

emoji.emoji_list(string: str) → List[_EmojiListReturn][source]¶

Returns the location and emoji of each match as a list of dicts.

>>> emoji.emoji_list("Hi, I am fine. 😁")
[{'match_start': 15, 'match_end': 16, 'emoji': '😁'}]

emoji.emojize(string: str, delimiters: Tuple[str, str] = (':', ':'), variant: Literal['text_type', 'emoji_type'] | None = None, language: str = 'en', version: float | None = None, handle_version: str | Callable[[str, Dict[str, str]], str] | None = None) → str[source]¶

Replace emoji names in a string with Unicode codes.

>>> import emoji
>>> print(emoji.emojize("Python is fun :thumbsup:", language='alias'))
Python is fun 👍
>>> print(emoji.emojize("Python is fun :thumbs_up:"))
Python is fun 👍
>>> print(emoji.emojize("Python is fun {thumbs_up}", delimiters = ("{", "}")))
Python is fun 👍
>>> print(emoji.emojize("Python is fun :red_heart:", variant="text_type"))
Python is fun ❤
>>> print(emoji.emojize("Python is fun :red_heart:", variant="emoji_type"))
Python is fun ❤️ # red heart, not black heart

Parameters:

string – String containing emoji names.
delimiters – (optional) Use delimiters other than _DEFAULT_DELIMITER. Each delimiter should contain at least one character that is not part of a-zA-Z0-9 and _-&.()!?#*+,. See emoji.core._EMOJI_NAME_PATTERN for the regular expression of unsafe characters.
variant – (optional) Choose variation selector between “base”(None), VS-15 (“text_type”) and VS-16 (“emoji_type”)
language – Choose language of emoji name: language code ‘es’, ‘de’, etc. or ‘alias’ to use English aliases
version – (optional) Max version. If set to an Emoji Version, all emoji above this version will be ignored.
handle_version –
(optional) Replace the emoji above version instead of ignoring it. handle_version can be either a string or a callable. If it is a callable, it is passed the Unicode emoji and the data dict from EMOJI_DATA and must return the replacement string:
```
handle_version('\U0001F6EB', {
    'en' : ':airplane_departure:',
    'status' : fully_qualified,
    'E' : 1,
    'alias' : [':flight_departure:'],
    'de': ':abflug:',
    'es': ':avión_despegando:',
    ...
})
```

Raises:

ValueError – if variant is not None, ‘text_type’ or ‘emoji_type’

emoji.is_emoji(string: str) → bool[source]¶: Returns True if the string is a single emoji, and it is “recommended for general interchange” by Unicode.org.

emoji.purely_emoji(string: str) → bool[source]¶: Returns True if the string contains only emojis. This does not imply that is_emoji() is True for every character in the string, for example if the string contains variation selectors.

emoji.replace_emoji(string: str, replace: str | Callable[[str, Dict[str, str]], str] = '', version: float = -1) → str[source]¶

Replace Unicode emoji with a customizable string.

Parameters:

string – String containing Unicode characters. MUST BE UNICODE.
replace – (optional) replace can be either a string or a callable; If it is a callable, it’s passed the Unicode emoji and the data dict from EMOJI_DATA and must return a replacement string to be used. replace(str, dict) -> str
version – (optional) Max version. If set to an Emoji Version, only emoji above this version will be replaced.

emoji.version(string: str) → float[source]¶

Returns the Emoji Version of the emoji.

See https://www.unicode.org/reports/tr51/#Versioning for more information.

>>> emoji.version("😁")
0.6
>>> emoji.version(":butterfly:")
3

Parameters:: string – An emoji or a text containing an emoji
Raises:: ValueError – if string does not contain an emoji

EMOJI_DATA¶

emoji.EMOJI_DATA: dict¶

Contains all emoji as keys. Each value is a dict holding the emoji's names, its Emoji version and its status.

The data is stored in JSON files: https://github.com/carpedm20/emoji/tree/master/emoji/unicode_codes

Names in languages other than English are not loaded by default. They can be loaded with the config.load_language() function.

EMOJI_DATA = {
  '🥇': {
      'en' : ':1st_place_medal:',
      'status' : emoji.STATUS["fully_qualified"],
      'E' : 3
  },
  ...
}

# After config.load_language() to load more languages:

EMOJI_DATA = {
  '🥇': {
      'en' : ':1st_place_medal:',
      'status' : emoji.STATUS["fully_qualified"],
      'E' : 3,
      'de': ':goldmedaille:',
      'es': ':medalla_de_oro:',
      'fr': ':médaille_d’or:',
      'pt': ':medalha_de_ouro:',
      'it': ':medaglia_d’oro:'
  },
  ...
}

Emoji status¶

emoji.STATUS: dict¶

The status values that are used in emoji.EMOJI_DATA.

For more information on the meaning of these values see http://www.unicode.org/reports/tr51/#Emoji_Implementation_Notes

emoji/unicode_codes/data_dict.py¶

component = 1
fully_qualified = 2
minimally_qualified = 3
unqualified = 4

STATUS: Dict[str, int] = {
    'component': component,
    'fully_qualified': fully_qualified,
    'minimally_qualified': minimally_qualified,
    'unqualified': unqualified,
}

emoji.LANGUAGES: list¶

All available languages that can be used as the language parameter in emojize() and demojize(). Additionally, the special "alias" language can be used in emojize() and demojize().

emoji/unicode_codes/data_dict.py¶

LANGUAGES: List[str] = [
    'en',
    'es',
    'ja',
    'ko',
    'pt',
    'it',
    'fr',
    'de',
    'fa',
    'id',
    'zh',
    'ru',
    'tr',
    'ar',
]

Emoji version¶

Every emoji in emoji.EMOJI_DATA has a version number. The number refers to the release in which that emoji was added to the Unicode Standard. It is stored in the key 'E'. For example, the emoji 🥇 :1st_place_medal: has version E3.0, which corresponds to Emoji 3.0 and Unicode 9.0:

>>> emoji.EMOJI_DATA['🥇']['E']
3

For more information see http://www.unicode.org/reports/tr51/#Versioning

The following table lists all versions. The number used in emoji.EMOJI_DATA is shown in the “Data File Comment” column:

Unicode/Emoji Version (emoji/unicode_codes/data_dict.py)¶

---------------+-------------+------------------+-------------------+
Emoji Version  |    Date     | Unicode Version  | Data File Comment |
---------------+-------------+------------------+-------------------+
N/A            | 2010-10-11  | Unicode 6.0      | E0.6              |
N/A            | 2014-06-16  | Unicode 7.0      | E0.7              |
Emoji 1.0      | 2015-06-09  | Unicode 8.0      | E1.0              |
Emoji 2.0      | 2015-11-12  | Unicode 8.0      | E2.0              |
Emoji 3.0      | 2016-06-03  | Unicode 9.0      | E3.0              |
Emoji 4.0      | 2016-11-22  | Unicode 9.0      | E4.0              |
Emoji 5.0      | 2017-06-20  | Unicode 10.0     | E5.0              |
Emoji 11.0     | 2018-05-21  | Unicode 11.0     | E11.0             |
Emoji 12.0     | 2019-03-05  | Unicode 12.0     | E12.0             |
Emoji 12.1     | 2019-10-21  | Unicode 12.1     | E12.1             |
Emoji 13.0     | 2020-03-10  | Unicode 13.0     | E13.0             |
Emoji 13.1     | 2020-09-15  | Unicode 13.0     | E13.1             |
Emoji 14.0     | 2021-09-14  | Unicode 14.0     | E14.0             |
Emoji 15.0     | 2022-09-13  | Unicode 15.0     | E15.0             |
Emoji 15.1     | 2023-09-12  | Unicode 15.1     | E15.1             |
Emoji 16.0     | 2024-09-10  | Unicode 16.0     | E16.0             |

             http://www.unicode.org/reports/tr51/#Versioning
