API Reference

Table of Contents

Functions:

emojize()

Replace emoji names with Unicode codes

demojize()

Replace Unicode emoji with emoji shortcodes

analyze()

Find Unicode emoji in a string

replace_emoji()

Replace Unicode emoji with a customizable string

emoji_list()

Location of all emoji in a string

distinct_emoji_list()

Distinct list of emojis in the string

emoji_count()

Number of emojis in a string

is_emoji()

Check if a string/character is a single emoji

purely_emoji()

Check if a string contains only emojis

version()

Find Unicode/Emoji version of an emoji

Module variables:

EMOJI_DATA

Dict of all emoji

STATUS

Dict of Unicode/Emoji status

config

Module wide configuration

Classes:

EmojiMatch

EmojiMatchZWJ

EmojiMatchZWJNonRGI

Token

class emoji.EmojiMatch(emoji: str, start: int, end: int, data: Dict[str, Any] | None)[source]

Represents a match of a “recommended for general interchange” (RGI) emoji in a string.

data

The entry from EMOJI_DATA for this emoji or None if the emoji is non-RGI

data_copy() Dict[str, Any][source]

Returns a copy of the data from EMOJI_DATA for this match with the additional keys match_start and match_end.

emoji

The emoji substring

end

The end index of the match in the string

is_zwj() bool[source]

Checks if this is a ZWJ-emoji.

Returns:

True if this is a ZWJ-emoji, False otherwise

split() EmojiMatchZWJ | EmojiMatch[source]

Splits a ZWJ-emoji into its constituents.

Returns:

An EmojiMatchZWJ containing the “sub-emoji” if this is a ZWJ-emoji, otherwise self

start

The start index of the match in the string

class emoji.EmojiMatchZWJ(match: EmojiMatch)[source]

Represents a match of multiple emoji in a string that were joined by zero-width-joiners (ZWJ/\u200D).

emojis: List[EmojiMatch]

List of sub emoji as EmojiMatch objects

is_zwj() bool[source]

Checks if this is a ZWJ-emoji.

Returns:

True if this is a ZWJ-emoji, False otherwise

join() str[source]

Joins a ZWJ-emoji into a string

split() EmojiMatchZWJ[source]

Splits a ZWJ-emoji into its constituents.

Returns:

An EmojiMatchZWJ containing the “sub-emoji” if this is a ZWJ-emoji, otherwise self

class emoji.EmojiMatchZWJNonRGI(first_emoji_match: EmojiMatch, second_emoji_match: EmojiMatch)[source]

Represents a match of multiple emoji in a string that were joined by zero-width-joiners (ZWJ/\u200D). This class is only used for emoji that are not “recommended for general interchange” (non-RGI) by Unicode.org. The data property of this class is always None.

emojis: List[EmojiMatch]

List of sub emoji as EmojiMatch objects

class emoji.Token(chars: str, value: str | EmojiMatch)[source]

A named tuple containing either a matched emoji string and its EmojiMatch object, or a single character that is not a Unicode emoji (in which case both fields hold that character).

chars: str

Alias for field number 0

value: str | EmojiMatch

Alias for field number 1

emoji.analyze(string: str, non_emoji: bool = False, join_emoji: bool = True) Iterator[Token][source]

Find Unicode emoji in a string. Yields each emoji as a named tuple Token(chars, EmojiMatch) or Token(chars, EmojiMatchZWJNonRGI). If non_emoji is True, also yields all other characters as Token(char, char).

Parameters:
  • string – String to analyze

  • non_emoji – If True also yield all non-emoji characters as Token(char, char)

  • join_emoji – If True, multiple EmojiMatch are merged into a single EmojiMatchZWJNonRGI if they are separated only by a ZWJ.

class emoji.config[source]

Module-wide configuration

demojize_keep_zwj = True

Change the behavior of emoji.demojize() regarding zero-width-joiners (ZWJ/\u200D) in emoji that are not “recommended for general interchange” (non-RGI). It has no effect on RGI emoji.

For example, this family emoji with different skin tones “👨‍👩🏿‍👧🏻‍👦🏾” contains four person emoji that are joined together by three ZWJ characters: 👨\u200D👩🏿\u200D👧🏻\u200D👦🏾

If True, the zero-width-joiners will be kept and emoji.emojize() can reverse the emoji.demojize() operation: emoji.emojize(emoji.demojize(s)) == s

The example emoji would be converted to :man:\u200d:woman_dark_skin_tone:\u200d:girl_light_skin_tone:\u200d:boy_medium-dark_skin_tone:

If False, the zero-width-joiners will be removed and emoji.emojize() can only reverse the individual emoji: emoji.emojize(emoji.demojize(s)) != s

The example emoji would be converted to :man::woman_dark_skin_tone::girl_light_skin_tone::boy_medium-dark_skin_tone:

static load_language(language: List[str] | str | None = None)[source]

Load one or multiple languages into memory. If no language is specified, all languages will be loaded.

This makes language data accessible in the EMOJI_DATA dict. For example to access a French emoji name, first load French with

emoji.config.load_language('fr')

and then access it with

emoji.EMOJI_DATA['🏄']['fr']

Available languages are listed in LANGUAGES

replace_emoji_keep_zwj = False

Change the behavior of emoji.replace_emoji() regarding zero-width-joiners (ZWJ/\u200D) in emoji that are not “recommended for general interchange” (non-RGI). It has no effect on RGI emoji.

See config.demojize_keep_zwj for more information.

emoji.demojize(string: str, delimiters: Tuple[str, str] = (':', ':'), language: str = 'en', version: float | None = None, handle_version: str | Callable[[str, Dict[str, str]], str] | None = None) str[source]
Replace Unicode emoji in a string with emoji shortcodes. Useful for storage.
>>> import emoji
>>> print(emoji.emojize("Python is fun :thumbs_up:"))
Python is fun 👍
>>> print(emoji.demojize("Python is fun 👍"))
Python is fun :thumbs_up:
>>> print(emoji.demojize("Unicode is tricky 😯", delimiters=("__", "__")))
Unicode is tricky __hushed_face__
Parameters:
  • string – String containing Unicode characters. Must be a Unicode string.

  • delimiters – (optional) Use delimiters other than _DEFAULT_DELIMITER

  • language – Choose language of emoji name: language code ‘es’, ‘de’, etc. or ‘alias’ to use English aliases

  • version – (optional) Max version. If set to an Emoji Version, all emoji above this version will be removed.

  • handle_version

    (optional) Replace the emoji above version instead of removing it. handle_version can be either a string or a callable handle_version(emj: str, data: dict) -> str. If it is a callable, it is passed the Unicode emoji and the data dict from EMOJI_DATA and must return a replacement string. The passed data has the form:

    handle_version('\U0001F6EB', {
        'en' : ':airplane_departure:',
        'status' : fully_qualified,
        'E' : 1,
        'alias' : [':flight_departure:'],
        'de': ':abflug:',
        'es': ':avión_despegando:',
        ...
    })
    

emoji.distinct_emoji_list(string: str) List[str][source]

Returns a list of the distinct emoji in the string.

emoji.emoji_count(string: str, unique: bool = False) int[source]

Returns the count of emojis in a string.

Parameters:

unique – (optional) If True, count only unique emoji

emoji.emoji_list(string: str) List[_EmojiListReturn][source]
Returns the emoji in the string and their locations as a list of dicts.
>>> emoji.emoji_list("Hi, I am fine. 😁")
[{'match_start': 15, 'match_end': 16, 'emoji': '😁'}]
emoji.emojize(string: str, delimiters: Tuple[str, str] = (':', ':'), variant: Literal['text_type', 'emoji_type'] | None = None, language: str = 'en', version: float | None = None, handle_version: str | Callable[[str, Dict[str, str]], str] | None = None) str[source]
Replace emoji names in a string with Unicode codes.
>>> import emoji
>>> print(emoji.emojize("Python is fun :thumbsup:", language='alias'))
Python is fun 👍
>>> print(emoji.emojize("Python is fun :thumbs_up:"))
Python is fun 👍
>>> print(emoji.emojize("Python is fun {thumbs_up}", delimiters = ("{", "}")))
Python is fun 👍
>>> print(emoji.emojize("Python is fun :red_heart:", variant="text_type"))
Python is fun ❤
>>> print(emoji.emojize("Python is fun :red_heart:", variant="emoji_type"))
Python is fun ❤️ # red heart, not black heart
Parameters:
  • string – String containing emoji names.

  • delimiters – (optional) Use delimiters other than _DEFAULT_DELIMITER. Each delimiter should contain at least one character that is not part of a-zA-Z0-9 and _-&.()!?#*+,. See emoji.core._EMOJI_NAME_PATTERN for the regular expression of unsafe characters.

  • variant – (optional) Choose variation selector between “base”(None), VS-15 (“text_type”) and VS-16 (“emoji_type”)

  • language – Choose language of emoji name: language code ‘es’, ‘de’, etc. or ‘alias’ to use English aliases

  • version – (optional) Max version. If set to an Emoji Version, all emoji above this version will be ignored.

  • handle_version

    (optional) Replace the emoji above version instead of ignoring it. handle_version can be either a string or a callable. If it is a callable, it is passed the Unicode emoji and the data dict from EMOJI_DATA and must return a replacement string:

    handle_version('\U0001F6EB', {
        'en' : ':airplane_departure:',
        'status' : fully_qualified,
        'E' : 1,
        'alias' : [':flight_departure:'],
        'de': ':abflug:',
        'es': ':avión_despegando:',
        ...
    })
    

Raises:

ValueError – if variant is not None, ‘text_type’ or ‘emoji_type’

emoji.is_emoji(string: str) bool[source]

Returns True if the string is a single emoji, and it is “recommended for general interchange” by Unicode.org.

emoji.purely_emoji(string: str) bool[source]

Returns True if the string contains only emoji. This does not imply that is_emoji() is True for every character in the string; for example, the string may contain variation selectors.

emoji.replace_emoji(string: str, replace: str | Callable[[str, Dict[str, str]], str] = '', version: float = -1) str[source]

Replace Unicode emoji in a string with a customizable string.

Parameters:
  • string – String containing Unicode characters. Must be a Unicode string.

  • replace – (optional) replace can be either a string or a callable; If it is a callable, it’s passed the Unicode emoji and the data dict from EMOJI_DATA and must return a replacement string to be used. replace(str, dict) -> str

  • version – (optional) Max version. If set to an Emoji Version, only emoji above this version will be replaced.

emoji.version(string: str) float[source]

Returns the Emoji Version of the emoji.

See https://www.unicode.org/reports/tr51/#Versioning for more information.
>>> emoji.version("😁")
0.6
>>> emoji.version(":butterfly:")
3
Parameters:

string – An emoji or a text containing an emoji

Raises:

ValueError – if string does not contain an emoji

EMOJI_DATA

emoji.EMOJI_DATA: dict

Contains all emoji as keys and their names, Unicode version and status.

The data is stored in JSON files: https://github.com/carpedm20/emoji/tree/master/emoji/unicode_codes

Names in languages other than English are not loaded by default. They can be loaded with the config.load_language() function.

EMOJI_DATA = {
  '🥇': {
      'en' : ':1st_place_medal:',
      'status' : emoji.STATUS["fully_qualified"],
      'E' : 3
  },
  ...
}

# After config.load_language() to load more languages:

EMOJI_DATA = {
  '🥇': {
      'en' : ':1st_place_medal:',
      'status' : emoji.STATUS["fully_qualified"],
      'E' : 3,
      'de': ':goldmedaille:',
      'es': ':medalla_de_oro:',
      'fr': ':médaille_d’or:',
      'pt': ':medalha_de_ouro:',
      'it': ':medaglia_d’oro:'
  },
  ...
}

Emoji status

emoji.STATUS: dict

The status values that are used in emoji.EMOJI_DATA.

For more information on the meaning of these values see http://www.unicode.org/reports/tr51/#Emoji_Implementation_Notes

emoji/unicode_codes/data_dict.py
component = 1
fully_qualified = 2
minimally_qualified = 3
unqualified = 4

STATUS: Dict[str, int] = {
    'component': component,
    'fully_qualified': fully_qualified,
    'minimally_qualified': minimally_qualified,
    'unqualified': unqualified,
}

emoji.LANGUAGES: list

All available languages that can be used as the language parameter in emojize() and demojize(). Additionally, the special "alias" language can be used in emojize() and demojize().

emoji/unicode_codes/data_dict.py
LANGUAGES: List[str] = [
    'en',
    'es',
    'ja',
    'ko',
    'pt',
    'it',
    'fr',
    'de',
    'fa',
    'id',
    'zh',
    'ru',
    'tr',
    'ar',
]


Emoji version

Every emoji in emoji.EMOJI_DATA has a version number, stored under the key 'E'. The number refers to the release in which the emoji was added to the Unicode Standard. For example, the emoji 🥇 :1st_place_medal: has version E3.0, that is Emoji 3.0, which corresponds to Unicode 9.0:

>>> emoji.EMOJI_DATA['🥇']['E']
3

For more information see http://www.unicode.org/reports/tr51/#Versioning

The following table lists all versions. The number used in emoji.EMOJI_DATA appears in the “Data File Comment” column:

Unicode/Emoji Version (emoji/unicode_codes/data_dict.py)

---------------+-------------+------------------+-------------------+
Emoji Version  |    Date     | Unicode Version  | Data File Comment |
---------------+-------------+------------------+-------------------+
N/A            | 2010-10-11  | Unicode 6.0      | E0.6              |
N/A            | 2014-06-16  | Unicode 7.0      | E0.7              |
Emoji 1.0      | 2015-06-09  | Unicode 8.0      | E1.0              |
Emoji 2.0      | 2015-11-12  | Unicode 8.0      | E2.0              |
Emoji 3.0      | 2016-06-03  | Unicode 9.0      | E3.0              |
Emoji 4.0      | 2016-11-22  | Unicode 9.0      | E4.0              |
Emoji 5.0      | 2017-06-20  | Unicode 10.0     | E5.0              |
Emoji 11.0     | 2018-05-21  | Unicode 11.0     | E11.0             |
Emoji 12.0     | 2019-03-05  | Unicode 12.0     | E12.0             |
Emoji 12.1     | 2019-10-21  | Unicode 12.1     | E12.1             |
Emoji 13.0     | 2020-03-10  | Unicode 13.0     | E13.0             |
Emoji 13.1     | 2020-09-15  | Unicode 13.0     | E13.1             |
Emoji 14.0     | 2021-09-14  | Unicode 14.0     | E14.0             |
Emoji 15.0     | 2022-09-13  | Unicode 15.0     | E15.0             |
Emoji 15.1     | 2023-09-12  | Unicode 15.1     | E15.1             |
Emoji 16.0     | 2024-09-10  | Unicode 16.0     | E16.0             |

             http://www.unicode.org/reports/tr51/#Versioning