extract_msg.attachments Package

Package contents

class extract_msg.attachments.Attachment(msg: MSGFile, dir_: str, propStore: PropertiesStore)[source]

Bases: AttachmentBase

A standard data attachment of an MSG file.

Parameters:
  • msg – The MSGFile instance that the attachment belongs to.

  • dir_ – The directory inside the MSG file where the attachment is located.

  • propStore – The PropertiesStore instance for the attachment.

getFilename(**kwargs) str[source]

Returns the filename to use for the attachment.

Parameters:
  • contentId – Use the contentId, if available.

  • customFilename – A custom name to use for the file.

If the filename starts with “UnknownFilename” then there is no guarantee that the files will have exactly the same filename.

regenerateRandomName() str[source]

Used to regenerate the random filename used if the attachment cannot find a usable filename.

save(**kwargs) Tuple[SaveType, List[str] | str | None][source]

Saves the attachment data.

The name of the file is chosen in the following order. If customFilename is provided, that name is used. Otherwise, if contentId is True and the attachment has a content ID, the content ID is used. If the content ID is not found or contentId is False, the long filename is used. If the long filename is not found, the short one is used. If no usable filename has been found after all of this, a random one is used (accessible from the randomFilename property). Once the name has been determined, it is shortened so that it is no longer than the value of maxNameLength.

To change the directory that the attachment is saved to, set the value of customPath when calling this function. The default save directory is the working directory.

If you want to save the contents into a ZipFile or similar object, either pass a path to where you want to create one or pass an instance to zip. If zip is an instance, customPath will refer to a location inside the zip file.

Parameters:
  • extractEmbedded – If True, causes the attachment, should it be an embedded MSG file, to save as a .msg file instead of calling its save function.

  • skipEmbedded – If True, skips saving this attachment if it is an embedded MSG file.
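The naming precedence described for save() can be sketched as a plain function. This is an illustration of the documented order only, not the library's implementation: the function name, its parameters, the default length of 256, and the plain truncation at the end are all assumptions made for the example.

```python
from typing import Optional


def choose_filename(
    custom_filename: Optional[str],
    use_content_id: bool,
    content_id: Optional[str],
    long_filename: Optional[str],
    short_filename: Optional[str],
    random_filename: str,
    max_name_length: int = 256,  # assumed default, for illustration only
) -> str:
    """Pick a name using the precedence documented for save()."""
    if custom_filename:
        name = custom_filename
    elif use_content_id and content_id:
        name = content_id
    elif long_filename:
        name = long_filename
    elif short_filename:
        name = short_filename
    else:
        name = random_filename
    # Assumption: plain truncation. The library may shorten names
    # differently, for example by preserving the extension.
    return name[:max_name_length]
```

Under these assumptions, a call with no custom name, contentId disabled, and only a short filename available returns the short filename.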

property data: bytes

The bytes making up the attachment data.

property randomFilename: str

The random filename to be used by this attachment.

property type: AttachmentType

An enum value that identifies the type of attachment.

class extract_msg.attachments.AttachmentBase(msg: MSGFile, dir_: str, propStore: PropertiesStore)[source]

Bases: ABC

The base class for all standard Attachments used by the module.

Parameters:
  • msg – The MSGFile instance that the attachment belongs to.

  • dir_ – The directory inside the MSG file where the attachment is located.

  • propStore – The PropertiesStore instance for the attachment.

exists(filename: str | List[str] | Tuple[str]) bool[source]

Checks if stream exists inside the attachment folder.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.

sExists(filename: str | List[str] | Tuple[str]) bool[source]

Checks if the string stream exists inside the attachment folder.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.

existsTypedProperty(id, _type=None) Tuple[bool, int][source]

Determines if the stream with the provided id exists.

Returns two values: a bool indicating whether anything was found, and the number of matches that were found.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.

abstract getFilename(**kwargs) str[source]

Returns the filename to use for the attachment.

Parameters:
  • contentId – Use the contentId, if available.

  • customFilename – A custom name to use for the file.

If the filename starts with “UnknownFilename” then there is no guarantee that the files will have exactly the same filename.

getMultipleBinary(filename: str | List[str] | Tuple[str]) List[bytes] | None[source]

Gets a multiple binary property as a list of bytes objects.

Like getStringStream(), the 4 character type suffix should be omitted. So if you want the stream “__substg1.0_00011102” then the filename would simply be “__substg1.0_0001”.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.

getMultipleString(filename: str | List[str] | Tuple[str]) List[str] | None[source]

Gets a multiple string property as a list of str objects.

Like getStringStream(), the 4 character type suffix should be omitted. So if you want the stream “__substg1.0_0001101F” then the filename would simply be “__substg1.0_0001”.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.

getNamedAs(propertyName: str, guid: str, overrideClass: Type[_T] | Callable[[Any], _T]) _T | None[source]

Returns the named property, setting the class if specified.

Parameters:

overrideClass – Class/function to use to morph the data that was read. The data will be the first argument to the class’s __init__ method or the function itself, if that is what is provided. If the value is None, this function is not called. If you want it to be called regardless, you should handle the data directly.
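The overrideClass behavior described above (call the class or function on the data, but skip the call when the value is None) can be sketched as follows. The helper name morph is hypothetical and exists only for this illustration.

```python
from typing import Any, Callable, Optional, TypeVar

_T = TypeVar("_T")


def morph(value: Any, override_class: Callable[[Any], _T]) -> Optional[_T]:
    """Apply override_class to value, unless value is None."""
    if value is None:
        # Mirrors the documented rule: the class/function is not called
        # when there is no value to convert.
        return None
    return override_class(value)


print(morph("42", int))  # 42
print(morph(None, int))  # None
```

If the conversion must run even for missing values, the documented advice is to fetch the raw data and handle it directly rather than rely on this shortcut.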

getNamedProp(propertyName: str, guid: str, default: _T | None = None) Any | _T[source]

instance.namedProperties.get((propertyName, guid), default)

Can be overridden to create new behavior.

getPropertyAs(propertyName: int | str, overrideClass: Type[_T] | Callable[[Any], _T]) _T | None[source]

Returns the property, setting the class if found.

Parameters:

overrideClass – Class/function to use to morph the data that was read. The data will be the first argument to the class’s __init__ method or the function itself, if that is what is provided. If the value is None, this function is not called. If you want it to be called regardless, you should handle the data directly.

getPropertyVal(name: int | str, default: _T | None = None) Any | _T[source]

instance.props.getValue(name, default)

Can be overridden to create new behavior.

getSingleOrMultipleBinary(filename: str | List[str] | Tuple[str]) List[bytes] | bytes | None[source]

A combination of getStream() and getMultipleBinary().

Checks to see if a single binary stream exists to return, otherwise tries to return the multiple binary stream of the same ID.

Like getStringStream(), the 4 character type suffix should be omitted. So if you want the stream “__substg1.0_00010102” then the filename would simply be “__substg1.0_0001”.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.

getSingleOrMultipleString(filename: str | List[str] | Tuple[str]) str | List[str] | None[source]

A combination of getStringStream() and getMultipleString().

Checks to see if a single string stream exists to return, otherwise tries to return the multiple string stream of the same ID.

Like getStringStream(), the 4 character type suffix should be omitted. So if you want the stream “__substg1.0_0001001F” then the filename would simply be “__substg1.0_0001”.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.

getStream(filename: str | List[str] | Tuple[str]) bytes | None[source]

Gets a binary representation of the requested stream.

This should ALWAYS return a bytes object if it was found, otherwise returns None.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.

getStreamAs(streamID: str | List[str] | Tuple[str], overrideClass: Type[_T] | Callable[[Any], _T]) _T | None[source]

Returns the specified stream, modifying it to the specified class if it is found.

Parameters:

overrideClass – Class/function to use to morph the data that was read. The data will be the first argument to the class’s __init__ method or the function itself, if that is what is provided. If the value is None, this function is not called. If you want it to be called regardless, you should handle the data directly.

getStringStream(filename: str | List[str] | Tuple[str]) str | None[source]

Gets a string representation of the requested stream.

Rather than the full filename, you should only feed this function the filename sans the type. So if the full name is “__substg1.0_001A001F”, the filename this function should receive should be “__substg1.0_001A”.

This should ALWAYS return a string if it was found, otherwise returns None.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.
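As a small illustration of the naming rule above, the helper below strips the 4-character type suffix from a full property stream name. The function name is hypothetical, and the validation is deliberately minimal.

```python
PREFIX = "__substg1.0_"


def stream_name_without_type(full_name: str) -> str:
    """Drop the trailing 4-character property type from a stream name."""
    if not full_name.startswith(PREFIX):
        raise ValueError(f"not a property stream name: {full_name!r}")
    suffix = full_name[len(PREFIX):]
    # A full name carries 4 hex digits of property ID and 4 of type.
    if len(suffix) != 8:
        raise ValueError(f"expected 8 hex digits after the prefix: {full_name!r}")
    return PREFIX + suffix[:4]


print(stream_name_without_type("__substg1.0_001A001F"))  # __substg1.0_001A
```

The result is the form that getStringStream() and the related multiple-value getters expect.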

getStringStreamAs(streamID: str | List[str] | Tuple[str], overrideClass: Type[_T] | Callable[[Any], _T]) _T | None[source]

Returns the specified string stream, modifying it to the specified class if it is found.

Parameters:

overrideClass – Class/function to use to morph the data that was read. The data will be the first argument to the class’s __init__ method or the function itself, if that is what is provided. If the value is None, this function is not called. If you want it to be called regardless, you should handle the data directly.

listDir(streams: bool = True, storages: bool = False) List[List[str]][source]

Lists the streams and/or storages that exist in the attachment directory.

Returns the paths excluding the attachment directory, allowing the paths to be directly used for accessing a stream.

slistDir(streams: bool = True, storages: bool = False) List[str][source]

Like listDir, except it returns the paths as strings.
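The relationship between the two return shapes can be sketched as below. The "/" separator is an assumption made for this example, and the helper name is hypothetical; consult the source before relying on the exact string form.

```python
from typing import List


def paths_as_strings(paths: List[List[str]]) -> List[str]:
    """Join each path's components into a single string."""
    # Assumption: components are joined with "/".
    return ["/".join(parts) for parts in paths]


print(paths_as_strings([["a", "b"], ["c"]]))  # ['a/b', 'c']
```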

abstract save(**kwargs) Tuple[SaveType, List[str] | str | None][source]

Saves the attachment data.

The name of the file is determined by the logic of getFilename(). If you are a developer, ensure that you use this behavior.

To change the directory that the attachment is saved to, set the value of customPath when calling this function. The default save directory is the working directory.

If you want to save the contents into a ZipFile or similar object, either pass a path to where you want to create one or pass an instance to zip. If zip is an instance, customPath will refer to a location inside the zip file.

Parameters:
  • extractEmbedded – If True, causes the attachment, should it be an embedded MSG file, to save as a .msg file instead of calling its save function.

  • skipEmbedded – If True, skips saving this attachment if it is an embedded MSG file.

Returns:

A tuple that specifies how the data was saved. The value of the first item specifies what the second value will be.
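Since the second item of the returned tuple may be None, a single string, or a list of strings, callers often want to normalize it. The sketch below does so by inspecting the second item's Python type rather than the SaveType value; the helper name is hypothetical, and the strings used in place of SaveType members in the example calls are stand-ins.

```python
from typing import List, Tuple, Union


def saved_paths(result: Tuple[object, Union[List[str], str, None]]) -> List[str]:
    """Normalize the result of save() into a list of paths."""
    _save_type, value = result
    if value is None:
        return []  # nothing was saved
    if isinstance(value, str):
        return [value]  # a single path
    return list(value)  # several paths


# "FILE" and "NONE" stand in for real SaveType members here.
print(saved_paths(("FILE", "out/report.pdf")))  # ['out/report.pdf']
print(saved_paths(("NONE", None)))  # []
```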

property attachmentEncoding: bytes | None

The encoding information about the attachment object.

Will return b'\x2A\x86\x48\x86\xf7\x14\x03\x0b\x01' if encoded in MacBinary format, otherwise it is unset.

property additionalInformation: str | None

The additional information about the attachment.

This property MUST be an empty string if attachmentEncoding is not set. Otherwise it MUST be set to a string of the format “:CREA:TYPE” where “:CREA” is the four-letter Macintosh file creator code and “:TYPE” is a four-letter Macintosh type code.
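A minimal parser for the “:CREA:TYPE” format might look like the following. The function name is hypothetical, the creator and type codes in the example are sample values only, and the sketch assumes that neither code contains a colon.

```python
from typing import Optional, Tuple


def parse_additional_information(value: str) -> Optional[Tuple[str, str]]:
    """Split ":CREA:TYPE" into (creator code, type code)."""
    if value == "":
        return None  # attachmentEncoding is not set
    # ":CREA:TYPE".split(":") gives ["", "CREA", "TYPE"].
    parts = value.split(":")
    if len(parts) != 3 or parts[0] != "":
        raise ValueError(f"unexpected format: {value!r}")
    creator, type_code = parts[1], parts[2]
    if len(creator) != 4 or len(type_code) != 4:
        raise ValueError(f"codes must be four characters: {value!r}")
    return creator, type_code


print(parse_additional_information(":MSWD:W8BN"))  # ('MSWD', 'W8BN')
```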

property cid: str | None

Returns the Content ID of the attachment, if it exists.

property clsid: str

Returns the CLSID for the data stream/storage of the attachment.

property contentId: str | None

Alias of cid.

property createdAt: datetime | None

Alias of creationTime.

property creationTime: datetime | None

The time the attachment was created.

abstract property data: object | None

The attachment data, if any.

Returns None if there is no data to save.

property dataType: Type[object] | None

The class that the data type will use, if it can be retrieved.

This is a safe way to do type checking on data before knowing if it will raise an exception. Returns None if no data will be returned or if an exception will be raised.
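One way to use dataType is to branch on it before touching data at all. The sketch below works on any object exposing dataType and data attributes; the function name is hypothetical, and SimpleNamespace stands in for a real attachment so the example runs without an MSG file.

```python
from types import SimpleNamespace


def describe_payload(attachment) -> str:
    """Summarize an attachment's payload without risking an exception."""
    data_type = attachment.dataType
    if data_type is None:
        # No data will be returned, or accessing data would raise.
        return "no data"
    if issubclass(data_type, bytes):
        return f"{len(attachment.data)} bytes"
    return f"object of type {data_type.__name__}"


# Stand-in objects, not real attachment instances.
print(describe_payload(SimpleNamespace(dataType=bytes, data=b"abc")))  # 3 bytes
print(describe_payload(SimpleNamespace(dataType=None)))  # no data
```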

property dir: str

Returns the directory inside the MSG file where the attachment is located.

property displayName: str | None

Returns the display name of the attachment.

property exceptionReplaceTime: datetime | None

The original date and time at which the instance in the recurrence pattern would have occurred if it were not an exception.

Only applicable if the attachment is an Exception object.

property extension: str | None

The reported extension for the file.

property hidden: bool

Indicates whether an Attachment object is hidden from the end user.

property isAttachmentContactPhoto: bool

Whether the attachment is a contact photo for a Contact object.

property lastModificationTime: datetime | None

The last time the attachment was modified.

property longFilename: str | None

Returns the long file name of the attachment, if it exists.

property longPathname: str | None

The fully qualified path and file name with extension.

property mimetype: str | None

The content-type mime header of the attachment, if specified.

property modifiedAt: datetime | None

Alias of lastModificationTime.

property msg: MSGFile

Returns the MSGFile instance the attachment belongs to.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.
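The ReferenceError above is what one would expect if the attachment holds only a weak reference to its MSGFile. The sketch below reproduces that pattern with stand-in classes; Owner and Child are hypothetical names, and this is an assumption about the mechanism rather than the library's code.

```python
import weakref


class Owner:
    """Stand-in for the object that owns the child (e.g. an MSG file)."""


class Child:
    """Stand-in for an attachment that weakly references its owner."""

    def __init__(self, owner: Owner) -> None:
        self._owner_ref = weakref.ref(owner)

    @property
    def owner(self) -> Owner:
        owner = self._owner_ref()
        if owner is None:
            raise ReferenceError("The owner instance has been garbage collected.")
        return owner
```

In practice this means keeping a reference to the MSGFile instance alive for as long as its attachments are in use.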

property name: str | None

The best name available for the file.

Uses long filename before short.

property namedProperties: NamedProperties

The NamedProperties instance for this attachment.

property payloadClass: str | None

The class name of an object that can display the contents of the message.

property props: PropertiesStore

Returns the PropertiesStore instance of the attachment.

property renderingPosition: int | None

The offset, in rendered characters, to use when rendering the attachment within the main message text.

A value of 0xFFFFFFFF indicates a hidden attachment that is not to be rendered.
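A check for the hidden sentinel could be written as below. The constant and function names are hypothetical, and treating a missing value (None) as "not rendered inline" is an assumption of this sketch.

```python
from typing import Optional

HIDDEN_RENDERING_POSITION = 0xFFFFFFFF


def is_rendered_inline(rendering_position: Optional[int]) -> bool:
    """Return True if the attachment has a usable rendering offset."""
    if rendering_position is None:
        return False  # assumption: no position means nothing to render
    return rendering_position != HIDDEN_RENDERING_POSITION


print(is_rendered_inline(12))  # True
print(is_rendered_inline(0xFFFFFFFF))  # False
```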

property shortFilename: str | None

The short file name of the attachment, if it exists.

property treePath: List[weakref.ReferenceType[Any]]

A path, as a list of weak references to instances, needed to get to this instance through the MSGFile-Attachment tree.

abstract property type: AttachmentType

An enum value that identifies the type of attachment.

class extract_msg.attachments.BrokenAttachment(msg: MSGFile, dir_: str, propStore: PropertiesStore)[source]

Bases: AttachmentBase

An attachment that has suffered a fatal error.

Is not generated in response to a NotImplementedError exception.

Parameters:
  • msg – The MSGFile instance that the attachment belongs to.

  • dir_ – The directory inside the MSG file where the attachment is located.

  • propStore – The PropertiesStore instance for the attachment.

getFilename(**_) str[source]

Returns the filename to use for the attachment.

Parameters:
  • contentId – Use the contentId, if available.

  • customFilename – A custom name to use for the file.

If the filename starts with “UnknownFilename” then there is no guarantee that the files will have exactly the same filename.

save(**kwargs) Tuple[SaveType, List[str] | str | None][source]

Raises a NotImplementedError unless skipNotImplemented is set to True.

If it is, returns a value that indicates no data was saved.
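Callers that iterate over mixed attachment types can either pass skipNotImplemented=True or catch the exception themselves. The sketch below shows the second option. The stub class exists only so the example runs without an MSG file; its return value is a placeholder, not the library's actual "nothing saved" value.

```python
class UnsavableStub:
    """Stand-in that mimics the documented save() contract."""

    def save(self, **kwargs):
        if kwargs.get("skipNotImplemented", False):
            return ("NONE", None)  # placeholder for "no data was saved"
        raise NotImplementedError("This attachment cannot be saved.")


def try_save(attachment, **kwargs):
    """Call save(), returning None if the attachment cannot be saved."""
    try:
        return attachment.save(**kwargs)
    except NotImplementedError:
        return None


print(try_save(UnsavableStub()))  # None
print(try_save(UnsavableStub(), skipNotImplemented=True))  # ('NONE', None)
```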

property data: None

Broken attachments have no data.

property type: AttachmentType

An enum value that identifies the type of attachment.

class extract_msg.attachments.CustomAttachmentHandler(attachment: AttachmentBase)[source]

Bases: ABC

A class designed to help with custom attachments that may require parsing in special ways that are completely different from one another.

getStream(filename: str | List[str] | Tuple[str]) bytes | None[source]

Gets a stream from the custom data directory.

getStreamAs(streamID: str | List[str] | Tuple[str], overrideClass: Type[_T] | Callable[[Any], _T]) _T | None[source]

Returns the specified stream, modifying it to the specified class if it is found.

Parameters:

overrideClass – Class/function to use to morph the data that was read. The data will be the first argument to the class’s __init__ function or the function itself, if that is what is provided. If the value is None, this function is not called. If you want it to be called regardless, you should handle the data directly.

abstract classmethod isCorrectHandler(attachment: AttachmentBase) bool[source]

Checks if this is the correct handler for the attachment.

abstract generateRtf() bytes | None[source]

Generates the RTF to inject in place of the objattph tag.

If this function should do nothing, returns None.

property attachment: AttachmentBase

The attachment this handler is associated with.

abstract property data: bytes | None

Gets the data for the attachment.

If an attachment should do nothing when saving, returns None.

abstract property name: str | None

Returns the name to be used when saving the attachment.

abstract property obj: object | None

Returns an object representing the data.

May return the same value as the data property.

If there is no object to represent the custom attachment, including bytes, returns None.

property objInfo: ODTStruct | None

The structure representing the stream “\x03ObjInfo”, if it exists.

property ole: OleStreamStruct | None

The structure representing the stream “\x01Ole”, if it exists.

property presentationObjs: Dict[int, OLEPresentationStream] | None

Returns a dict of all presentation streams, as bytes.

class extract_msg.attachments.EmbeddedMsgAttachment(msg: MSGFile, dir_: str, propStore: PropertiesStore)[source]

Bases: AttachmentBase

The attachment entry for an Embedded MSG file.

Parameters:
  • msg – The MSGFile instance that the attachment belongs to.

  • dir_ – The directory inside the MSG file where the attachment is located.

  • propStore – The PropertiesStore instance for the attachment.

getFilename(**kwargs) str[source]

Returns the filename to use for the attachment.

Parameters:
  • contentId – Use the contentId, if available.

  • customFilename – A custom name to use for the file.

If the filename starts with “UnknownFilename” then there is no guarantee that the files will have exactly the same filename.

save(**kwargs) Tuple[SaveType, List[str] | str | None][source]

Saves the attachment data.

The name of the file is determined by the logic of getFilename(). If you are a developer, ensure that you use this behavior.

To change the directory that the attachment is saved to, set the value of customPath when calling this function. The default save directory is the working directory.

If you want to save the contents into a ZipFile or similar object, either pass a path to where you want to create one or pass an instance to zip. If zip is an instance, customPath will refer to a location inside the zip file.

Parameters:
  • extractEmbedded – If True, causes the attachment, should it be an embedded MSG file, to save as a .msg file instead of calling its save function.

  • skipEmbedded – If True, skips saving this attachment if it is an embedded MSG file.

Returns:

A tuple that specifies how the data was saved. The value of the first item specifies what the second value will be.

property data: MSGFile

Returns the attachment data.

property type: AttachmentType

An enum value that identifies the type of attachment.

class extract_msg.attachments.SignedAttachment(msg, data: bytes, name: str, mimetype: str, node: Message)[source]

Bases: object

Parameters:
  • msg – The MSGFile instance this attachment is associated with.

  • data – The bytes that compose this attachment.

  • name – The reported name of the attachment.

  • mimetype – The reported mimetype of the attachment.

  • node – The email Message instance for this node.

save(**kwargs) Tuple[SaveType, List[str] | str | None][source]

Saves the attachment data.

The name of the file is chosen as follows. If customFilename is provided, that name is used. Otherwise, the value of the name property is used. Once the name has been determined, it is shortened so that it is no longer than the value of maxNameLength.

To change the directory that the attachment is saved to, set the value of customPath when calling this function. The default save directory is the working directory.

If you want to save the contents into a ZipFile or similar object, either pass a path to where you want to create one or pass an instance to zip. If zip is an instance, customPath will refer to a location inside the zip file.

saveEmbededMessage(**kwargs) Tuple[SaveType, List[str] | str | None][source]

Separate function from save to allow it to easily be overridden by a subclass.

property asBytes: bytes

property data: bytes | MSGFile

The bytes that compose this attachment.

property dataType: Type[type] | None

The class that the data type will use, if it can be retrieved.

This is a safe way to do type checking on data before knowing if it will raise an exception. Returns None if no data will be returned or if an exception will be raised.

property emailMessage: Message

The email Message instance that is the source for this attachment.

property mimetype: str

The reported mimetype of the attachment.

property msg: MSGFile

The MSGFile instance this attachment belongs to.

Raises:

ReferenceError – The associated MSGFile instance has been garbage collected.

property name: str

The reported name of this attachment.

property longFilename: str

The reported name of this attachment.

property shortFilename: str

The reported name of this attachment.

property treePath: List[weakref]

A path, as a list of weak references to instances, needed to get to this instance through the MSGFile-Attachment tree.

property type: AttachmentType

class extract_msg.attachments.UnsupportedAttachment(msg: MSGFile, dir_: str, propStore: PropertiesStore)[source]

Bases: AttachmentBase

An attachment whose type is not currently supported.

Parameters:
  • msg – The MSGFile instance that the attachment belongs to.

  • dir_ – The directory inside the MSG file where the attachment is located.

  • propStore – The PropertiesStore instance for the attachment.

getFilename(**_) str[source]

Returns the filename to use for the attachment.

Parameters:
  • contentId – Use the contentId, if available.

  • customFilename – A custom name to use for the file.

If the filename starts with “UnknownFilename” then there is no guarantee that the files will have exactly the same filename.

save(**kwargs) Tuple[SaveType, List[str] | str | None][source]

Raises a NotImplementedError unless skipNotImplemented is set to True.

If it is, returns a value that indicates no data was saved.

property data: None

Unsupported attachments have no data.

property type: AttachmentType

An enum value that identifies the type of attachment.

class extract_msg.attachments.WebAttachment(msg: MSGFile, dir_: str, propStore: PropertiesStore)[source]

Bases: AttachmentBase

An attachment that exists on the internet and is not attached to the MSG file directly.

Parameters:
  • msg – The MSGFile instance that the attachment belongs to.

  • dir_ – The directory inside the MSG file where the attachment is located.

  • propStore – The PropertiesStore instance for the attachment.

getFilename() str[source]

Returns the filename to use for the attachment.

Parameters:
  • contentId – Use the contentId, if available.

  • customFilename – A custom name to use for the file.

If the filename starts with “UnknownFilename” then there is no guarantee that the files will have exactly the same filename.

save(**kwargs) Tuple[SaveType, List[str] | str | None][source]

Raises a NotImplementedError unless skipNotImplemented is set to True.

If it is, returns a value that indicates no data was saved.

property data: None

The bytes making up the attachment data.

property originalPermissionType: AttachmentPermissionType | None

The permission type data associated with a web reference attachment.

property permissionType: AttachmentPermissionType | None

The permission type data associated with a web reference attachment.

property providerName: str | None

The type of web service manipulating the attachment.

property type: AttachmentType

An enum value that identifies the type of attachment.

property url: str | None

The url for the web attachment. If this is not set, that is probably an error.

extract_msg.attachments.initStandardAttachment(msg: MSGFile, dir_: str) AttachmentBase[source]

Returns an instance of AttachmentBase for the attachment in the MSG file at the specified internal directory.

Parameters:

errorBehavior – Used to tell the function what to do on errors.

extract_msg.attachments.registerHandler(handler: Type[CustomAttachmentHandler]) None[source]

Registers the CustomAttachmentHandler subclass as a handler.

Raises:

TypeError – The handler was not a subclass of CustomAttachmentHandler.